Add: add scalar float type data dump support.#966
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR extends tensor dump infrastructure to preserve and correctly represent float and int32 scalar argument types throughout the runtime stack, replacing the previous uint64-only approach with per-scalar dtype tracking and specialized export logic. ChangesScalar dtype tracking and export
Sequence Diagram(s)sequenceDiagram
participant Client as User Code
participant Arg as Arg
participant Payload as PTO2TaskPayload
participant Dump as tensor_dump_collector
Client->>Arg: add_scalar(1.5f)
Arg->>Arg: record scalar value + dtype_of<float>()
Note over Arg: scalar stored as uint64 bits, dtype=FLOAT32
Arg->>Payload: init(args)
Payload->>Payload: copy scalars[] + scalar_dtypes[]
Note over Payload: dtype bytes preserved in cold data
Dump->>Payload: read scalar_dtypes[i]
alt dtype is FLOAT32
Dump->>Dump: reinterpret uint64 as float
Dump->>Dump: format 1.5 for JSON
else other dtype
Dump->>Dump: emit raw numeric value
end
Dump->>Dump: output value + correct dtype
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for preserving and exporting scalar data types and values throughout the tensor dump pipeline, updating the payload structures, dump collector, and viewer tool. The code review identifies two critical issues: first, signed integer scalars are printed as large positive numbers in the JSON manifest because they are not cast back from uint64_t based on their DataType; second, mapping double to DataType::FLOAT32 causes a bit-pattern mismatch during packing/unpacking, which requires explicit overloads to convert double to float safely.
| if (dt.dtype == static_cast<uint8_t>(DataType::FLOAT32)) { | ||
| float f; | ||
| memcpy(&f, &dt.scalar_value, sizeof(float)); | ||
| std::ostringstream val_ss; | ||
| val_ss << f; | ||
| std::string val_str = val_ss.str(); | ||
| if (val_str.find('.') == std::string::npos && val_str.find('e') == std::string::npos) { | ||
| val_str += ".0"; | ||
| } | ||
| json << ", \"value\": " << val_str; | ||
| } else { | ||
| json << ", \"value\": " << dt.scalar_value; | ||
| } |
There was a problem hiding this comment.
Negative values for signed integer types (e.g., INT32, INT16, INT8, INT64) are printed as large positive numbers in the JSON manifest because dt.scalar_value is a uint64_t and is printed directly without casting back to its original signed type. We should decode and cast the scalar value based on dt.dtype before printing it to the JSON manifest. Additionally, ensure 64-bit unsigned integers (like UINT64) are represented as strings in the JSON output to prevent precision loss in JavaScript-based parsers.
if (dt.dtype == static_cast<uint8_t>(DataType::FLOAT32)) {
float f;
memcpy(&f, &dt.scalar_value, sizeof(float));
std::ostringstream val_ss;
val_ss << f;
std::string val_str = val_ss.str();
if (val_str.find('.') == std::string::npos && val_str.find('e') == std::string::npos) {
val_str += ".0";
}
json << ", \"value\": " << val_str;
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT32)) {
int32_t val;
memcpy(&val, &dt.scalar_value, sizeof(int32_t));
json << ", \"value\": " << val;
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT16)) {
int16_t val;
memcpy(&val, &dt.scalar_value, sizeof(int16_t));
json << ", \"value\": " << val;
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT8)) {
int8_t val;
memcpy(&val, &dt.scalar_value, sizeof(int8_t));
json << ", \"value\": " << static_cast<int32_t>(val);
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT64)) {
int64_t val;
memcpy(&val, &dt.scalar_value, sizeof(int64_t));
json << ", \"value\": " << val;
} else if (dt.dtype == static_cast<uint8_t>(DataType::UINT32)) {
uint32_t val;
memcpy(&val, &dt.scalar_value, sizeof(uint32_t));
json << ", \"value\": " << val;
} else if (dt.dtype == static_cast<uint8_t>(DataType::UINT16)) {
uint16_t val;
memcpy(&val, &dt.scalar_value, sizeof(uint16_t));
json << ", \"value\": " << val;
} else if (dt.dtype == static_cast<uint8_t>(DataType::UINT8)) {
uint8_t val;
memcpy(&val, &dt.scalar_value, sizeof(uint8_t));
json << ", \"value\": " << static_cast<uint32_t>(val);
} else if (dt.dtype == static_cast<uint8_t>(DataType::UINT64)) {
uint64_t val = dt.scalar_value;
json << ", \"value\": \"" << val << "\"";
} else {
json << ", \"value\": \"" << dt.scalar_value << "\"";
}References
- When serializing 64-bit unsigned integers (such as task_id, tensor_id, or hardware addresses) to JSON, represent them as strings to prevent precision loss in JavaScript-based parsers that are limited to 2^53 - 1.
There was a problem hiding this comment.
已修复。在 tensor_dump_collector.cpp 中添加了 INT32 显式解码路径,通过 memcpy 将 scalar_value 按 int32_t 重新解释后输出,保证负值正确显示(如 -1)而非大整数(如 4294967295)。
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT32)) {
int32_t val;
memcpy(&val, &dt.scalar_value, sizeof(int32_t));
json << ", \"value\": " << val;
}There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/common/platform/shared/host/tensor_dump_collector.cpp (1)
566-578: ⚡ Quick winINT32 scalars are not sign-extended, causing negative values to appear as large positive numbers.
The current implementation only decodes FLOAT32 scalars. INT32 values stored via
add_scalars_i32are zero-extended (e.g.,int32_t(-1)becomes0x00000000FFFFFFFF), so the JSON export outputs4294967295instead of-1.While the PR objectives focus on FLOAT32 support, the infrastructure already tracks INT32 dtype. For consistency and completeness, consider decoding INT32 as well:
} else if (dt.dtype == static_cast<uint8_t>(DataType::INT32)) { int32_t val; memcpy(&val, &dt.scalar_value, sizeof(int32_t)); json << ", \"value\": " << val; } else { json << ", \"value\": " << dt.scalar_value; }If INT32 decoding is deferred by design, consider adding a TODO comment to track it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/platform/shared/host/tensor_dump_collector.cpp` around lines 566 - 578, The JSON export treats INT32 scalar bytes as unsigned/zero-extended, so negative values (added via add_scalars_i32) appear as large positives; update the scalar handling in tensor_dump_collector.cpp where dt.dtype is inspected (the dt.dtype / dt.scalar_value branch) to explicitly decode INT32 (DataType::INT32) by memcpy’ing into an int32_t and outputting that signed value instead of the raw scalar_value; reference the existing FLOAT32 handling as the pattern to follow and if you intentionally defer this change, add a TODO noting INT32 sign-extension must be fixed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@simpler_setup/tools/dump_viewer.py`:
- Line 190: The separator print currently uses print("-" * 110) which misaligns
the table; change it to match the table width (print("-" * 101)) or, better,
compute dynamically from the header string (e.g., print("-" * len(header)) where
header is the table header being printed) so the separator always equals the
actual header/data width.
In `@src/common/platform/shared/host/tensor_dump_collector.cpp`:
- Around line 566-578: The FLOAT32 branch that converts dt.scalar_value to a
float (using float f and std::ostringstream val_ss) must guard against
non-finite values: if std::isnan(f) or std::isinf(f) produce a JSON string
(e.g., "NaN", "Infinity", "-Infinity") or null instead of emitting the raw
token, otherwise JSON becomes invalid; update the block handling
DataType::FLOAT32 (where dt.scalar_value is memcpy'd into f and written to json)
to check std::isnan/std::isinf and emit the appropriate quoted string (or null)
for those cases, falling back to the existing finite-value formatting for normal
floats.
In `@src/common/task_interface/data_type.h`:
- Around line 152-153: The mapping for double in the type-to-DataType branch
(the else if constexpr (std::is_same_v<T, double>) clause) is wrong: change its
return from DataType::FLOAT32 to DataType::UINT64 so 8-byte doubles stored via
to_u64(double_val) are decoded as 8-byte values (matching
tensor_dump_collector.cpp's scalar_value handling) rather than being truncated
by the FLOAT32 decoder; update that branch only to return UINT64 to preserve
full 8-byte representation.
---
Nitpick comments:
In `@src/common/platform/shared/host/tensor_dump_collector.cpp`:
- Around line 566-578: The JSON export treats INT32 scalar bytes as
unsigned/zero-extended, so negative values (added via add_scalars_i32) appear as
large positives; update the scalar handling in tensor_dump_collector.cpp where
dt.dtype is inspected (the dt.dtype / dt.scalar_value branch) to explicitly
decode INT32 (DataType::INT32) by memcpy’ing into an int32_t and outputting that
signed value instead of the raw scalar_value; reference the existing FLOAT32
handling as the pattern to follow and if you intentionally defer this change,
add a TODO noting INT32 sign-extension must be fixed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cf9e5e34-e04d-459a-a913-0531086b2382
📒 Files selected for processing (9)
simpler_setup/tools/dump_viewer.pysrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.hsrc/common/platform/include/aicpu/tensor_dump_aicpu.hsrc/common/platform/shared/host/tensor_dump_collector.cppsrc/common/task_interface/data_type.htests/st/a2a3/tensormap_and_ringbuffer/dfx/tensor_dump/test_tensor_dump.py
| f" {'arg':>3} {'kind':>6} {'dtype':>8} {'shape/val':<22} {'bytes':>10}" | ||
| ) | ||
| print("-" * 100) | ||
| print("-" * 110) |
There was a problem hiding this comment.
Fix separator width to match table header.
The separator line is 110 characters, but the table header and data rows are 101 characters wide (6+2+18+2+7+2+5+2+3+2+6+2+8+2+22+2+10 = 101), causing visual misalignment.
🎨 Proposed fix
- print("-" * 110)
+ print("-" * 101)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print("-" * 110) | |
| print("-" * 101) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@simpler_setup/tools/dump_viewer.py` at line 190, The separator print
currently uses print("-" * 110) which misaligns the table; change it to match
the table width (print("-" * 101)) or, better, compute dynamically from the
header string (e.g., print("-" * len(header)) where header is the table header
being printed) so the separator always equals the actual header/data width.
| } else if constexpr (std::is_same_v<T, double>) { | ||
| return static_cast<uint8_t>(DataType::FLOAT32); |
There was a problem hiding this comment.
Incorrect dtype mapping for double causes decoding errors.
Mapping double to FLOAT32 will cause incorrect decoding in tensor_dump_collector.cpp (lines 566-578). When a double is stored via to_u64(double_val), all 8 bytes are preserved in the uint64_t storage. However, the FLOAT32 decoder only reads the lower 4 bytes via memcpy(&f, &dt.scalar_value, sizeof(float)), producing a corrupted value.
Recommendation: Map double to DataType::UINT64 (fallback for unsupported types) instead of FLOAT32, since the DataType enum does not include FLOAT64 and the decoder cannot correctly interpret 8-byte floating-point values.
Proposed fix
if constexpr (std::is_same_v<T, float>) {
return static_cast<uint8_t>(DataType::FLOAT32);
} else if constexpr (std::is_same_v<T, double>) {
- return static_cast<uint8_t>(DataType::FLOAT32);
+ return static_cast<uint8_t>(DataType::UINT64);
} else if constexpr (std::is_same_v<T, int32_t>) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } else if constexpr (std::is_same_v<T, double>) { | |
| return static_cast<uint8_t>(DataType::FLOAT32); | |
| } else if constexpr (std::is_same_v<T, double>) { | |
| return static_cast<uint8_t>(DataType::UINT64); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/common/task_interface/data_type.h` around lines 152 - 153, The mapping
for double in the type-to-DataType branch (the else if constexpr
(std::is_same_v<T, double>) clause) is wrong: change its return from
DataType::FLOAT32 to DataType::UINT64 so 8-byte doubles stored via
to_u64(double_val) are decoded as 8-byte values (matching
tensor_dump_collector.cpp's scalar_value handling) rather than being truncated
by the FLOAT32 decoder; update that branch only to return UINT64 to preserve
full 8-byte representation.
There was a problem hiding this comment.
感谢指出。经确认,scalar 接口(Arg::add_scalar)的类型模板参数仅推导 float / int32_t / uint32_t,double 不在 scalar 类型范围内,因此 dtype_of<double>() 的映射问题不影响实际使用场景。
如后续有需要支持 double scalar,可单独处理。
There was a problem hiding this comment.
@vegetabledoww, 感谢说明!确认理解——Arg::add_scalar 的模板参数当前仅实例化 float、int32_t、uint32_t,因此 dtype_of<double>() 的分支在实践中不会被触发,当前映射不影响实际行为。
如有需要,我可以:
- 帮您在
dtype_of<double>()分支处添加一条注释,说明 double 当前不受支持、实际不可达,以便后续维护者理解意图。 - 或者为未来的 double scalar 支持创建一个 follow-up issue 进行追踪。
请问需要我执行上述任一操作吗?另外,如果此 review comment 已无实际阻塞,我可以帮您将其 resolve。
(ᵔ◡ᵔ)っ🐇 感谢确认,已记录到 long-term memory 中以备后续 review 参考。
✏️ Learnings added
Learnt from: vegetabledoww
Repo: hw-native-sys/simpler PR: 966
File: src/common/task_interface/data_type.h:152-153
Timestamp: 2026-06-02T06:53:46.210Z
Learning: In the `hw-native-sys/simpler` repository, `Arg::add_scalar()` (in both `src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h` and `src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.h`) only accepts `float`, `int32_t`, and `uint32_t` as scalar type arguments. `double` is explicitly out of scope for the scalar interface, so `dtype_of<double>()` in `src/common/task_interface/data_type.h` is never instantiated in practice. Future `double` scalar support, if needed, would require a dedicated PR.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: hw-native-sys-bot
Repo: hw-native-sys/simpler PR: 930
File: src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/scheduler_types.h:63-65
Timestamp: 2026-05-31T06:36:21.921Z
Learning: When reviewing native-code changes in this repository, do not introduce new configuration knobs or environment-variable overrides (e.g., scheduling timeout env vars like PTO2_SCHED_TIMEOUT_MS) speculatively. Any new configurable surface must be backed by a concrete, measured workload that demonstrates the need. Prefer hardcoded constants validated on a measurable reproducer over premature tunability. If future tuning is required, require a separate PR that includes its own measured baseline before accepting new knobs.
e6c2322 to
27b90af
Compare
…tagging Scalar dtype is now stored in the upper 8 bits of each uint64_t scalar slot. PTO2TaskPayload stays at 2880B baseline (+0B vs +64B before this change). Changes - data_type.h: add pack_scalar/unpack_scalar_dtype/unpack_scalar_value/from_u64_masked - pto_types.h (a2a3/a5): pack dtype on add_scalar/add_scalars/add_scalars_i32; remove scalar_dtypes_[] member - pto_runtime2_types.h (a2a3/a5): remove scalar_dtypes[] field; restore sizeof 2880B - scheduler_dispatch.cpp (a2a3/a5): unmask dtype bits before dispatch - scheduler_context.h (a2a3/a5): include data_type.h for unpack_scalar_value - tensor_dump_aicpu.h: read dtype via unpack_scalar_dtype(pl.scalars[i]) - tensor_dump_collector.cpp: mask before memcpy decode; support BOOL as true/false in JSON Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Summary
Tensor dump (--dump-tensor) now supports typed scalar values beyond the original uint64-only path. When a user passes float, int32_t, or uint32_t scalars via Arg::add_scalar(), the dump correctly decodes and displays them with the proper dtype instead of showing the raw uint64 bit pattern as UINT64.
Changes
JSON schema
Before (uint64-only):
{"kind": "scalar", "dtype": "UINT64", "value": 1065353216, "raw_value": 1065353216}
After:
{"kind": "scalar", "dtype": "FLOAT32", "value": 1.0}
Testing
Fixes [Feature] Tensor dump: support float/int32 scalar types besides uint64 #965