perf(multitude): add arc_array allocation/timing benchmark #507
martintmk wants to merge 1 commit into
Add `criterion_arc_array` bench comparing construction of an `Arc<[Arc<[u8]>]>` (8 x 16B properties) via the global allocator vs the multitude arena, using the `alloc_tracker` crate for per-iteration allocation volume. Two strategies per path: build through a growable vec (`*`) and build directly from a pre-created slice (`*_from_slice`).

The bench surfaces a pathology in the arena vec-then-freeze path: under a warmed, reset-and-reused arena it re-grabs a full 64 KiB chunk every iteration and runs ~35x slower than the global allocator. The direct `alloc_slice_clone_arc` path reuses chunks correctly (0 bytes/op).

Results (criterion, warm-up 1s / measure 3s):

| variant | time | bytes/op | allocs/op |
|---|---|---|---|
| global | ~562 ns | 528 | 10 |
| arena | ~20.0 us | 65535 | 1 |
| global_from_slice | ~143 ns | 144 | 1 |
| arena_from_slice | ~146 ns | 0 | 0 |

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
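
The two global-allocator rows can be cross-checked with a little arithmetic. The sketch below is one accounting that reproduces the reported 528 and 144 bytes/op. It assumes a 64-bit target, a 16-byte reference-count header in front of each `std::sync::Arc` payload, and a vec pre-sized to 8 elements; none of these are stated in the PR, and the constant and function names are illustrative only.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Number of properties and payload size, taken from the PR description (8 x 16B).
pub const PROPERTIES: usize = 8;
pub const PAYLOAD_LEN: usize = 16;

/// Assumption: std's Arc stores a strong and a weak counter (one usize each)
/// in the same heap block as the data.
pub const ARC_HEADER: usize = 2 * size_of::<usize>();

/// Heap bytes for one `Arc<[u8]>` property: header plus payload.
pub fn property_bytes() -> usize {
    ARC_HEADER + PAYLOAD_LEN
}

/// Heap bytes for the intermediate vec buffer, assuming capacity == PROPERTIES.
pub fn vec_buffer_bytes() -> usize {
    PROPERTIES * size_of::<Arc<[u8]>>()
}

/// Heap bytes for the outer `Arc<[Arc<[u8]>]>`: header plus 8 fat pointers.
pub fn outer_arc_bytes() -> usize {
    ARC_HEADER + PROPERTIES * size_of::<Arc<[u8]>>()
}

/// `global`: 8 property allocations + 1 vec buffer + 1 outer Arc = 10 allocations.
pub fn global_bytes() -> usize {
    PROPERTIES * property_bytes() + vec_buffer_bytes() + outer_arc_bytes()
}

/// `global_from_slice`: properties already exist, so only the outer Arc is allocated.
pub fn global_from_slice_bytes() -> usize {
    outer_arc_bytes()
}
```

Under those assumptions the totals are 8 x 32 + 128 + 144 = 528 bytes in 10 allocations for `global`, and 144 bytes in 1 allocation for `global_from_slice`, which matches the table.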
    for _ in 0..PROPERTIES {
        properties.push(arena.alloc_slice_copy_arc(payload));
    }
    properties.try_into_arc().unwrap()
This is ugly; we should have `into_arc` here.
⚠️ Not ready to approve
The new benchmark’s module-level documentation misdescribes what the arena “from_slice” path actually allocates, which can mislead interpretation of the results.
Pull request overview
Adds a new Criterion benchmark to the multitude crate to compare building an Arc<[Arc<[u8]>]> via the global allocator vs the multitude arena, including per-iteration allocation tracking with alloc_tracker, to highlight a warm-arena reuse pathology in the vec-then-freeze path.
Changes:
- Add `alloc_tracker` as a `dev-dependency` for `crates/multitude` and register a new `criterion_arc_array` benchmark target.
- Introduce `criterion_arc_array` bench that times two construction strategies (vec-then-freeze vs from-slice) and prints allocation volume via `alloc_tracker`.
- Update `Cargo.lock` to include the new dev dependency.
File summaries
| File | Description |
|---|---|
| crates/multitude/Cargo.toml | Adds alloc_tracker dev-dep and registers the new Criterion bench. |
| crates/multitude/benches/criterion_arc_array.rs | New benchmark measuring time + allocation volume for global vs arena construction paths. |
| Cargo.lock | Locks alloc_tracker as a dependency of multitude (dev). |
Copilot's findings
- Files reviewed: 2/3 changed files
- Comments generated: 1
    //! Builds an `Arc<[Arc<[u8]>]>` of `PROPERTIES` binary blobs two ways and
    //! compares them: `std::sync::Arc` (global allocator) vs `multitude::Arc`
    //! (arena, `reset` and reused between iterations).
    //!
    //! Each is benchmarked with two strategies:
    //!
    //! - `*`: push freshly allocated properties through a growable vec, then
    //!   freeze it into the `Arc`.
    //! - `*_from_slice`: build directly from a pre-created slice of properties,
    //!   with no intermediate vec.
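
For readers unfamiliar with the two strategies named in the doc comment, here is a minimal std-only sketch of the global-allocator side. It is not the benchmark code from the PR: `build_via_vec` and `build_from_slice` are hypothetical names, and the arena variants are omitted because the `multitude` API is not shown in full here.

```rust
use std::sync::Arc;

/// Property count from the PR description.
pub const PROPERTIES: usize = 8;

/// Vec-then-freeze: allocate each property, push it into a growable vec,
/// then convert the vec into an `Arc<[_]>` (which allocates the final block).
pub fn build_via_vec(payload: &[u8]) -> Arc<[Arc<[u8]>]> {
    let mut properties: Vec<Arc<[u8]>> = Vec::with_capacity(PROPERTIES);
    for _ in 0..PROPERTIES {
        properties.push(Arc::<[u8]>::from(payload));
    }
    Arc::from(properties)
}

/// From-slice: the properties already exist, so each one is cloned
/// (a reference-count bump) straight into the newly allocated `Arc<[_]>`.
pub fn build_from_slice(properties: &[Arc<[u8]>]) -> Arc<[Arc<[u8]>]> {
    Arc::from(properties)
}
```

In the from-slice path the inner blobs are shared rather than copied, so the only new allocation is the outer slice; that is consistent with the single allocation reported for `global_from_slice`.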
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff (main vs #507):

| | main | #507 |
|---|---|---|
| Coverage | 100.0% | 100.0% |
| Files | 343 | 343 |
| Lines | 26121 | 26121 |
| Hits | 26121 | 26121 |