Fix: recover missing orchestrator phases in L2 swimlane level 4#971
Fix: recover missing orchestrator phases in L2 swimlane level 4#971indigo1973 wants to merge 1 commit into
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR refactors L2 swimlane phase profiling to decouple ready-queue ownership from host lane indexing. It replaces a unified flush API with per-kind variants, threads ChangesL2 Swimlane Phase Profiling Refactor
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the L2 swimlane phase profiling to treat orchestrator phase records as a single instance (pool ordinal 0) instead of dynamically sizing them based on the AICPU thread count. It splits the phase buffer flushing into separate scheduler and orchestrator functions, passes the total AICPU thread count to the collector initialization, and optimizes buffer allocation and JSON export for the orchestrator lane. Feedback recommends adding a null check for shm_host_ and defensively clamping the orchestrator lane count retrieved from shared memory to prevent potential undefined behavior or memory corruption.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/a2a3/platform/shared/host/l2_swimlane_collector.cpp (1)
79-97:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
aicpu_thread_numparameter.The parameter is stored without validation. At line 648, it's used as
aicpu_thread_num_ - 1to derive the orchestrator thread index. Ifaicpu_thread_numis 0 or negative, this will produce an invalid thread index (-1 or worse).🛡️ Proposed fix to add validation
num_aicore_ = num_aicore; + if (aicpu_thread_num <= 0) { + LOG_ERROR("Invalid aicpu_thread_num: %d (must be > 0)", aicpu_thread_num); + return -1; + } aicpu_thread_num_ = aicpu_thread_num; l2_swimlane_level_ = l2_swimlane_level_;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/a2a3/platform/shared/host/l2_swimlane_collector.cpp` around lines 79 - 97, Validate the aicpu_thread_num parameter in L2SwimlaneCollector::initialize: ensure aicpu_thread_num > 0 (and optionally <= a sensible max if applicable) before assigning to aicpu_thread_num_; if invalid, log an error via LOG_ERROR with a clear message and return -1 so downstream code that uses aicpu_thread_num_ - 1 (e.g., computing the orchestrator thread index) cannot produce a negative index. Include the function name L2SwimlaneCollector::initialize and the member aicpu_thread_num_ when locating where to add this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/a2a3/platform/shared/host/l2_swimlane_collector.cpp`:
- Around line 79-97: Validate the aicpu_thread_num parameter in
L2SwimlaneCollector::initialize: ensure aicpu_thread_num > 0 (and optionally <=
a sensible max if applicable) before assigning to aicpu_thread_num_; if invalid,
log an error via LOG_ERROR with a clear message and return -1 so downstream code
that uses aicpu_thread_num_ - 1 (e.g., computing the orchestrator thread index)
cannot produce a negative index. Include the function name
L2SwimlaneCollector::initialize and the member aicpu_thread_num_ when locating
where to add this check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d6cd863e-5819-48ee-90f6-9677f59bc5e3
📒 Files selected for processing (11)
src/a2a3/platform/include/aicpu/l2_swimlane_collector_aicpu.hsrc/a2a3/platform/include/common/l2_swimlane_profiling.hsrc/a2a3/platform/include/host/l2_swimlane_collector.hsrc/a2a3/platform/onboard/host/device_runner.cppsrc/a2a3/platform/onboard/host/device_runner.hsrc/a2a3/platform/shared/aicpu/l2_swimlane_collector_aicpu.cppsrc/a2a3/platform/shared/host/l2_swimlane_collector.cppsrc/a2a3/platform/sim/host/device_runner.cppsrc/a2a3/platform/sim/host/device_runner.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_cold_path.cppsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_dispatch.cpp
--enable-l2-swimlane level 4 produced an empty aicpu_orchestrator_phases section — every orchestrator-phase record was dropped. Two causes: - The orch-phase pool was indexed by the orchestrator's AICPU thread id (the last thread), but only lower indices were primed/drained, so record_orch_phase hit a null pool and dropped the record at write. - flush read the orch buffer's `count` through a SchedPhaseBuffer cast. `count` follows records[N], whose stride differs (sched 40B vs orch 32B), so the read landed past the buffer, saw 0, and skipped the flush. Make orch a single instance at pool ordinal 0 (matching dep_gen / scope_stats): the orchestrator records and flushes pool 0 and uses its thread index only to select its own ready queue; entries are tagged with ordinal 0. Read/write `count` through the matching buffer type. - Split l2_swimlane_aicpu_flush_phase_buffers into flush_sched_phase_buffer (per scheduler thread) and flush_orch_phase_buffer (orchestrator only); gate the orch flush on >= ORCH_PHASES to match where orch records are actually emitted. - Name the phase-enqueue params for their now-distinct roles: thread_idx (selects the caller's own ready queue) vs pool_idx (host filing/recycle ordinal) — they coincide for sched but differ for the single orch instance (pool 0). - Host allocates buffers for one orch pool by actual usage (~24MB less); unused pool states are still zeroed for safe reconcile iteration. - Pass aicpu_thread_num to the host collector so the phase-metadata log reports the orchestrator's real AICPU thread number; bound the JSON export to the single orch lane, and guard its shm_host_ deref.
--enable-l2-swimlane level 4 produced an empty aicpu_orchestrator_phases section — every orchestrator-phase record was dropped. Two causes:
countthrough a SchedPhaseBuffer cast.countfollows records[N], whose stride differs (sched 40B vs orch 32B), so the read landed past the buffer, saw 0, and skipped the flush.Make orch a single instance at pool ordinal 0 (matching dep_gen / scope_stats): the orchestrator records and flushes pool 0 and uses its thread index only to select its own ready queue; entries are tagged with ordinal 0. Read/write
countthrough the matching buffer type.