Skip to content

Cranelift: use a BTreeMap for JITModule code ranges#13710

Open
mkschreder wants to merge 1 commit into
bytecodealliance:mainfrom
swedishembedded:cranelift-jit-finalize-btreemap
Open

Cranelift: use a BTreeMap for JITModule code ranges#13710
mkschreder wants to merge 1 commit into
bytecodealliance:mainfrom
swedishembedded:cranelift-jit-finalize-btreemap

Conversation

@mkschreder

Copy link
Copy Markdown

Let's make cranelift a little faster when booting JITed linux :)

Currently JITModule resolves a PC back to its defining function (for exception unwinding) via a Vec<(start, end, FuncId)> that was re-sorted in full on every finalize_definitions call. Code that defines and finalizes functions one at a time therefore paid an O(n) sort per finalize, i.e. O(n^2) overall.

I replace the vector with a BTreeMap keyed on the start address. Inserts stay sorted incrementally (O(log n)) and the lookup becomes an O(log n) range query, so finalizing is linear in the number of functions with no sort.

Noticible speedup is observed without any (obvious) problems. I think the change makes sense and I don't see why it wouldn't simply just get rid of O(n^2) complexity without any other issues.

`JITModule` resolves a PC back to its defining function (for exception
unwinding) via a `Vec<(start, end, FuncId)>` that was re-sorted in full on
every `finalize_definitions` call. Code that defines and finalizes functions
one at a time therefore paid an O(n) sort per finalize, i.e. O(n^2) overall.

Replace the vector with a `BTreeMap` keyed on the start address. Inserts stay
sorted incrementally (O(log n)) and the lookup becomes an O(log n) range
query, so finalizing is linear in the number of functions with no sort.
@mkschreder mkschreder requested a review from a team as a code owner June 22, 2026 20:44
@mkschreder mkschreder requested review from cfallin and removed request for a team June 22, 2026 20:44
@github-actions github-actions Bot added the cranelift Issues related to the Cranelift code generator label Jun 22, 2026

@cfallin cfallin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable change and looks good -- thanks! (FWIW this is more or less how Wasmtime's address map data structures work so it's neat to see cranelift-jit have convergent evolution here.)

Just some style notes re: the comments below, then happy to merge.

// resolve a PC back to its function for exception unwinding. A `BTreeMap`
// keeps this sorted incrementally (O(log n) insert, O(log n) range lookup),
// avoiding an O(n) re-sort of the whole table on every `finalize_definitions`
// call - which made finalizing many functions one at a time quadratic.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the detailed explainer here -- makes sense when justifying the change, but doesn't have to live in the source forever. Just the format note (map from start addr to end addr and function) is fine.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback. Don't you think though that the comment makes this choice much more obvious and potentially leads to similar choices being made more often in the future? I have the belief that the better code decisions are documented in place the better culture it creates over time because people (or machines) look at the comment like that and then realize that somewhere in a different place the same solution makes sense as well. Without the comment, it becomes an opaque decision burried in a git commit. Happy to remove the comments but I just think they do more good than bad.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no, I don't think we need this level of detail -- it describes the past implementation and its impact ("re-sort of the whole table ... which made ..."). Level of commenting is always a subjective call, and I am all for detailed comments where things are tricky, but a BTreeMap sorted by address is common enough throughout all of Wasmtime+Cranelift that we don't need this; rather, overly verbose comments like this look (in context of our local style) like an indication of something non-obvious, and take time to read and digest, which is a net negative for a straightforward idiom, IMHO. The comment describing what they keys and values are is enough.

Part of my thinking here is informed by recent LLM-isms too: they have a tendency to be extraordinarily verbose, and dump point-in-time thoughts in comments everywhere, which over time makes the code extremely difficult to read, ironically. So I think we need to apply some judgment and be a bit judicious in exchange.

self.code_ranges
.sort_unstable_by_key(|(start, _end, _)| *start);
// `code_ranges` is a `BTreeMap` kept sorted on insert, so there is no
// sort to perform here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here -- comment only makes sense in context of the diff, no need to keep this forever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants