emscripten: fix pthread type sizes for wasm64 (MEMORY64) #5156
Open
gergelyvagujhelyi wants to merge 1 commit into
On wasm64-unknown-emscripten the pthread types are larger than the hardcoded wasm32 sizes, so pthread_attr_init overruns the pthread_attr_t that std stack-allocates in Thread::new and corrupts the spawn.

Sizes from sizeof() via emcc / emcc -sMEMORY64=1:

- pthread_attr_t  : 44 -> 88 (11 pointer-width words -> [usize; 11])
- pthread_mutex_t : 24 -> 40
- pthread_rwlock_t: 32 -> 56
- pthread_cond_t  : 48 (unchanged)

pthread_t = c_ulong (4/8) is already correct.
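As a quick sanity check on the `pthread_attr_t` figures, the following sketch shows why 11 pointer-width words reproduces both sizes. It is not part of the patch, and `words_to_bytes` is a hypothetical helper introduced only for illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: size in bytes of `words` pointer-width words
/// on a target whose pointers are `pointer_bytes` wide.
const fn words_to_bytes(words: usize, pointer_bytes: usize) -> usize {
    words * pointer_bytes
}

fn main() {
    // The old definition is 44 bytes on every target.
    println!("[u32; 11]   = {} bytes", size_of::<[u32; 11]>());
    // The new definition scales with the pointer width of the build target.
    println!("[usize; 11] = {} bytes", size_of::<[usize; 11]>());
    // The two emscripten cases from the list above.
    println!("wasm32: {} bytes", words_to_bytes(11, 4));
    println!("wasm64: {} bytes", words_to_bytes(11, 8));
}
```

On a 32-bit target the two array types have the same size, which is why the change should be a no-op for wasm32.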
gergelyvagujhelyi added a commit to nobodywho-ooo/nobodywho that referenced this pull request on Jun 8, 2026
… fix

On wasm64-unknown-emscripten (MEMORY64), libc hardcodes wasm32 pthread type sizes for `*-emscripten`: pthread_attr_t is 44 bytes, but on wasm64 it's 88. std's `Thread::new` stack-allocates a pthread_attr_t and calls pthread_attr_init, which overruns the buffer, corrupts the pthread_create result, and makes std::thread::spawn fail with a spurious "failed to spawn thread", so every worker (Encoder/Chat/CrossEncoder) dies on startup.

rust-lang/libc#5156 makes those sizes pointer-width-aware (__size [u32;11]→[usize;11]; __SIZEOF_PTHREAD_{RWLOCK,MUTEX}_T split by target_pointer_width). The workspace [patch.crates-io] redirects the app's libc to a local clone carrying that fix.

That alone is NOT sufficient: `-Zbuild-std` recompiles std from rust-src and resolves the sysroot's libc separately, so the workspace [patch] never reaches it (verified with `cargo build --unit-graph`: std/unwind/panic_unwind kept the registry libc). Since the crash is in std's Thread::new, the fix must also be injected into the nightly's rust-src `library/Cargo.toml`, a one-time local-dev step until a nightly's std bumps to a libc that includes #5156. The clone is pinned to 0.2.185 (the version std's library/Cargo.lock locks) so both patches apply as a clean source-swap.

- Cargo.toml/Cargo.lock: workspace [patch] libc -> ../../libc-wasm64.
- js/README.md: document the rust-src step + how to build the clone.
- build-pkg-emscripten-wasm64.sh: abort with instructions if rust-src lacks the libc patch (or the clone lacks the fix), so the manual step can't be silently skipped into a runtime thread-spawn failure.

Verified end-to-end: multi-threaded Encoder.encode on wasm64 returns a finite 384-dim embedding (bge-small). The unwinder side (rust#156573, target_family="wasm") is already in nightly 2026-06-07. Drop both libc patches once std bumps to a libc with #5156.
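The overrun described in this commit message can be modelled in safe Rust. The sketch below is a toy, not the actual std or emscripten code: `neighbour_after_init` is a hypothetical function that treats a `Vec<u8>` as a stack frame, reserves 44 bytes for the attribute object, and lets an init routine fill either 44 or 88 bytes.

```rust
/// Toy model of a stack frame: `reserved` bytes set aside for an attribute
/// object, immediately followed by one byte belonging to a neighbouring value.
/// Returns the neighbour's value after an init routine fills `written` bytes.
fn neighbour_after_init(reserved: usize, written: usize) -> u8 {
    // Large enough that the simulated overrun stays in bounds (safe Rust).
    let mut frame = vec![0u8; reserved.max(written) + 1];
    let neighbour = reserved; // index just past the reserved buffer
    frame[neighbour] = 0x7F; // the neighbour holds a known value

    // The init routine believes the object is `written` bytes long
    // and zeroes that many bytes from the start of the buffer.
    for byte in &mut frame[..written] {
        *byte = 0;
    }
    frame[neighbour]
}

fn main() {
    // Sizes agree: the neighbour keeps its value.
    println!("44 reserved, 44 written: {:#04x}", neighbour_after_init(44, 44));
    // Caller reserves the wasm32 size, callee writes the wasm64 size:
    // the neighbour is overwritten.
    println!("44 reserved, 88 written: {:#04x}", neighbour_after_init(44, 88));
}
```

In the real failure the clobbered memory is whatever the compiler placed next to the attribute object, so the symptom depends on frame layout; the model only shows that a size mismatch silently damages adjacent data.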
gergelyvagujhelyi added a commit to nobodywho-ooo/nobodywho that referenced this pull request on Jun 8, 2026
The wasm64 build pinned its forks to local sibling clones (../../wasm-bindgen, ../../llama-cpp-rs, ../../libc-wasm64), which don't exist on CI runners, so every cargo invocation (lint, clippy, maturin) died at dependency resolution:

failed to read `/home/runner/work/nobodywho/wasm-bindgen/.../Cargo.toml`

Point them at fetchable git sources instead:

- wasm-bindgen / -futures / js-sys / web-sys: workspace [patch.crates-io] -> git nobodywho-ooo/wasm-bindgen branch wasm64-emscripten.
- llama-cpp-2 / llama-cpp-sys-2: bump the branch in core/Cargo.toml to wasm64-emscripten and drop the workspace [patch] (cargo rejects patching a git source to the same repo's other branch).
- libc: drop the app-side [patch] entirely. The app's only libc use is getuid(); the wasm64 pthread_attr_t fix (rust-lang/libc#5156) is needed only by std, which -Zbuild-std resolves separately and gets via the rust-src patch (a documented local-dev step). Native uses stock libc.

The wasm64 build.rs / js-sys / cli-support changes are all cfg-gated to wasm64, so native and wasm32 builds are unaffected. `cargo metadata` resolves cleanly.
gergelyvagujhelyi added a commit to nobodywho-ooo/nobodywho that referenced this pull request on Jun 8, 2026
…uild-std

The wasm64 leg builds with -Zbuild-std, which compiles std's libc from rust-src. Without the rust-lang/libc#5156 pthread-size fix, std's Thread::new overruns pthread_attr_t on wasm64 (44 vs 88 bytes) and std::thread::spawn fails, so the artifact can't run. The app-workspace [patch] doesn't reach the separately-resolved sysroot libc, so the fix has to be injected into rust-src.

Add js/scripts/patch-rust-src-libc.sh: reads the exact libc version std locks, downloads that source from crates.io, applies the #5156 fix (the three pthread-size edits), and adds [patch.crates-io] libc = { path = ... } to the nightly's rust-src library/Cargo.toml. Self-contained (no fork), idempotent, version-robust; the fix step no-ops once a nightly's std bumps to a libc that already includes #5156.

- js_ci.yml: run the script in the build job just before the wasm64 build (after rust-src + python3 are set up). The build script's existing guard then passes and tests-wasm64 can run green.
- README: replace the manual rust-src steps with the one-line script (CI runs the same one).

Still full_ci-gated: the wasm64 leg builds std + llama.cpp from source.
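The first step of that script, reading the libc version that std locks, could look roughly like the sketch below. The script's own contents are not shown in this thread, so this is an illustration written in Rust rather than shell; `locked_version` is a hypothetical function and the sample lockfile text is made up apart from the 0.2.185 version mentioned earlier in the thread.

```rust
/// Returns the version recorded for `package` in Cargo.lock-style text.
/// This is a minimal line scan, not a TOML parser, and it returns the
/// first matching entry only.
fn locked_version<'a>(lockfile: &'a str, package: &str) -> Option<&'a str> {
    let name_line = format!("name = \"{}\"", package);
    let mut lines = lockfile.lines().map(str::trim);
    while let Some(line) = lines.next() {
        if line == name_line {
            // Within a [[package]] entry, `version` follows `name`.
            for next in lines.by_ref() {
                if next.starts_with("[[") {
                    break; // reached the next entry without finding a version
                }
                if let Some(rest) = next.strip_prefix("version = \"") {
                    return rest.strip_suffix('"');
                }
            }
        }
    }
    None
}

fn main() {
    // Illustrative lockfile fragment (assumed shape, not copied from rust-src).
    let lock = r#"
[[package]]
name = "cfg-if"
version = "1.0.0"

[[package]]
name = "libc"
version = "0.2.185"
"#;
    match locked_version(lock, "libc") {
        Some(v) => println!("std locks libc {}", v),
        None => println!("libc not found in lockfile"),
    }
}
```

Keying the download on the locked version is what lets the patched copy stand in for the registry crate as an exact source swap.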
On `wasm64-unknown-emscripten` (MEMORY64) several pthread types are larger than the hardcoded wasm32 sizes used here, so `pthread_attr_init` overruns the `pthread_attr_t` that `std` stack-allocates in `Thread::new`, corrupting the result and making `std::thread::spawn` fail with a spurious error. This makes the affected sizes pointer-width-aware.

The numbers are `sizeof()` from emscripten's own headers, compiled both ways:

| type | wasm32 | wasm64 |
| --- | --- | --- |
| `pthread_attr_t` | 44 | 88 |
| `pthread_mutex_t` | 24 | 40 |
| `pthread_rwlock_t` | 32 | 56 |
| `pthread_cond_t` | 48 | 48 |

- `pthread_attr_t` is exactly 11 pointer-width words (44 = 11·4, 88 = 11·8), so `[usize; 11]` matches both with no `cfg`, and is identical to the previous `[u32; 11]` on wasm32.
- `pthread_mutex_t` / `pthread_rwlock_t` aren't a fixed multiple of the pointer width (musl sizes them as `int __i[N]` with a different `N` per width), so they need explicit per-width values.
- `pthread_cond_t` (48) and `pthread_t = c_ulong` (4/8 bytes) are already correct.

Found while bringing up `wasm64-unknown-emscripten` downstream (custom target spec + `-Zbuild-std`): with the wasm32 sizes, `pthread_create` reports success at the emscripten layer but `std` reads a corrupted result and panics with `failed to spawn thread`, so threads can't be spawned at all.

There's no `wasm64-unknown-emscripten` job in CI (it isn't a built-in target), so this isn't exercised by `libc-test`; the sizes above are reproducible by compiling a `sizeof()` check with `emcc` and `emcc -sMEMORY64=1`.