Cranelift: use a BTreeMap for JITModule code ranges#13710
Conversation
`JITModule` resolves a PC back to its defining function (for exception unwinding) via a `Vec<(start, end, FuncId)>` that was re-sorted in full on every `finalize_definitions` call. Code that defines and finalizes functions one at a time therefore paid an O(n) sort per finalize, i.e. O(n^2) overall. Replace the vector with a `BTreeMap` keyed on the start address. Inserts stay sorted incrementally (O(log n)) and the lookup becomes an O(log n) range query, so finalizing is linear in the number of functions with no sort.
cfallin
left a comment
There was a problem hiding this comment.
Reasonable change and looks good -- thanks! (FWIW this is more or less how Wasmtime's address map data structures work so it's neat to see cranelift-jit have convergent evolution here.)
Just some style notes re: the comments below, then happy to merge.
| // resolve a PC back to its function for exception unwinding. A `BTreeMap` | ||
| // keeps this sorted incrementally (O(log n) insert, O(log n) range lookup), | ||
| // avoiding an O(n) re-sort of the whole table on every `finalize_definitions` | ||
| // call - which made finalizing many functions one at a time quadratic. |
There was a problem hiding this comment.
No need for the detailed explainer here -- makes sense when justifying the change, but doesn't have to live in the source forever. Just the format note (map from start addr to end addr and function) is fine.
There was a problem hiding this comment.
Thank you for your feedback. Don't you think though that the comment makes this choice much more obvious and potentially leads to similar choices being made more often in the future? I have the belief that the better code decisions are documented in place the better culture it creates over time because people (or machines) look at the comment like that and then realize that somewhere in a different place the same solution makes sense as well. Without the comment, it becomes an opaque decision burried in a git commit. Happy to remove the comments but I just think they do more good than bad.
There was a problem hiding this comment.
Well, no, I don't think we need this level of detail -- it describes the past implementation and its impact ("re-sort of the whole table ... which made ..."). Level of commenting is always a subjective call, and I am all for detailed comments where things are tricky, but a BTreeMap sorted by address is common enough throughout all of Wasmtime+Cranelift that we don't need this; rather, overly verbose comments like this look (in context of our local style) like an indication of something non-obvious, and take time to read and digest, which is a net negative for a straightforward idiom, IMHO. The comment describing what they keys and values are is enough.
Part of my thinking here is informed by recent LLM-isms too: they have a tendency to be extraordinarily verbose, and dump point-in-time thoughts in comments everywhere, which over time makes the code extremely difficult to read, ironically. So I think we need to apply some judgment and be a bit judicious in exchange.
| self.code_ranges | ||
| .sort_unstable_by_key(|(start, _end, _)| *start); | ||
| // `code_ranges` is a `BTreeMap` kept sorted on insert, so there is no | ||
| // sort to perform here. |
There was a problem hiding this comment.
Likewise here -- comment only makes sense in context of the diff, no need to keep this forever.
Let's make cranelift a little faster when booting JITed linux :)
Currently
JITModuleresolves a PC back to its defining function (for exception unwinding) via aVec<(start, end, FuncId)>that was re-sorted in full on everyfinalize_definitionscall. Code that defines and finalizes functions one at a time therefore paid an O(n) sort per finalize, i.e. O(n^2) overall.I replace the vector with a
BTreeMapkeyed on the start address. Inserts stay sorted incrementally (O(log n)) and the lookup becomes an O(log n) range query, so finalizing is linear in the number of functions with no sort.Noticible speedup is observed without any (obvious) problems. I think the change makes sense and I don't see why it wouldn't simply just get rid of O(n^2) complexity without any other issues.