Skip to content

[QDP] Pr1 phase kernel opt#1386

Open
aloha1357 wants to merge 3 commits into
apache:mainfrom
aloha1357:pr1-phase-kernel-opt
Open

[QDP] Pr1 phase kernel opt#1386
aloha1357 wants to merge 3 commits into
apache:mainfrom
aloha1357:pr1-phase-kernel-opt

Conversation

@aloha1357

@aloha1357 aloha1357 commented Jun 7, 2026

Copy link
Copy Markdown

Related Issues

related #1385

Changes

  • Bug fix
  • New feature
  • Refactoring
  • Documentation
  • Test
  • CI/CD pipeline
  • Other

Why

The original phase encoding and IQP encoding kernels suffered from GPU thread divergence due to conditional branching (if (val != 0.0) or if ((x >> i) & 1U)). Furthermore, the normalization factor (norm_factor) was being redundantly calculated inside the GPU kernel, consuming extra cycles. Eliminating these inefficiencies significantly improves the kernel's execution speed on the GPU.

How

  • Replaced Conditional Branching: In both phase.cu and iqp.cu, the if conditions checking bit states were replaced with boolean arithmetic casting and multiplication (e.g., phases[bit] * (double)((idx >> bit) & 1U)). This ensures that all threads in a warp follow the exact same instruction path, eliminating warp divergence.
  • Host-side Pre-calculation: Moved the norm_factor calculation to the host (CPU) before launching the kernel in phase.cu, passing the result as an immutable parameter.
  • Added Explanatory Comments: Included inline documentation near the bitwise arithmetic lines to aid code reviewers in understanding the optimizations.

Benchmark Results

Environment: Dev Machine (NVIDIA RTX 4060)
Configuration: Qubits (N): 14, Batch Size: 128, Iterations: 5

Implementation Execution Time (ms) Notes
GPU phase (Before PR1) 1.26 ms (per sample) / 161.91 ms (total) Strict checkout of unoptimized phase.cu
GPU phase (This PR) 1.16 ms (per sample) / 147.94 ms (total) ~9.4% Performance Gain with zero divergence.

Checklist

  • Added or updated unit tests for all changes (Verified passing against existing CI test suite)
  • Added or updated documentation for all changes (Added explanatory inline comments for PR)

@aloha1357 aloha1357 changed the title Pr1 phase kernel opt [QDP] Pr1 phase kernel opt Jun 7, 2026

@ryankert01 ryankert01 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need you to do four things:

  1. show the benchmark of this changes (before & after)
  2. enhance our unit tests for this function
  3. cleanup the code & comments from coding agents
  4. read CONTRIBUTING.md

Thanks!

@ryankert01 ryankert01 force-pushed the pr1-phase-kernel-opt branch from ca90282 to 29ecb19 Compare June 8, 2026 07:17
@aloha1357 aloha1357 requested a review from guan404ming as a code owner June 8, 2026 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants