[nanvix] E: Drop libm.a from math/cmath/_statistics .so link #7
Closed
esaurez wants to merge 2 commits into
Phase 1B of the .a -> .so migration (see nanvix-todo/cpython-static-to-shared-migration.md section 5). Builds on Phase 1A (#5) by promoting the remaining 5 Tier-1 "math + memory" stdlib extension modules from statically linked into python.elf to dlopen-loaded shared objects under lib/python3.12/lib-dynload/.

Modules moved to *shared* in Modules/Setup.local generation (.nanvix/docker.py):
- math, cmath, _statistics (libm consumers).
- mmap, _contextvars (libc-only).

None of the five reference external libraries beyond libc / libm (verified by grepping for #include <zlib|expat|openssl|sqlite|mpdec|bzlib|lzma|hacl>). libm is already linked into python.elf via the existing --whole-archive --export-dynamic chain, so the math / cmath / _statistics .so files resolve sin/cos/etc. at dlopen time.

Note on .so size: cpython's Makefile attaches libm.a to the link of math / cmath / _statistics via MODULE_*_LDFLAGS, so the resulting math.so (~532 KB) and cmath.so (~180 KB) statically contain a copy of the libm objects they use. This is functionally correct (--allow-multiple-definition handles the duplicate symbols at dlopen) but wastes ramfs space. Optimizing the link to drop libm.a from the .so command line and let the dynamic loader resolve UND symbols against python.elf is a follow-up cleanup, not a Phase 1B blocker.

Test coverage (.nanvix/test.py):
- New phase1b_snippet imports each module, asserts it is NOT in sys.builtin_module_names, exercises one trivial API call to confirm dlopen + PyInit_<name> succeeded, and prints the resolved __file__ path. Phase 1A probe retained.

Validation on local toolchain (phase0-llfix):
- All 5 new .so files produced and installed under lib-dynload/ (sizes: math 532K, cmath 180K, _statistics 28K, mmap 54K, _contextvars 11K).
- nm python.elf no longer shows PyInit_<name> for any of the 5.
- python.elf size: 19.31 MB (Phase 1A) -> 19.18 MB (Phase 1B), ~130 KB further reduction (smaller than the .so total because libm objects are double-counted between python.elf's --whole-archive copy and the math / cmath / _statistics .so).
- Hello + Phase 1A probe + Phase 1B probe + lxml + HTTP smoke + full regrtest 160/160 PASS in standalone mode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
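The import probe described under Test coverage could look roughly like the sketch below. It is not the actual phase1b_snippet from .nanvix/test.py: the function name `probe_shared_module` and the per-module smoke calls are assumptions made for illustration. On a host CPython some of these modules may legitimately be built in, so the sketch reports that status instead of asserting it.

```python
import importlib
import sys

# Hypothetical per-module smoke calls; the real snippet's calls are not shown in the PR.
SMOKE_CALLS = {
    "math": lambda m: m.sqrt(4.0) == 2.0,
    "cmath": lambda m: m.sqrt(-1) == 1j,
    "_statistics": lambda m: hasattr(m, "_normal_dist_inv_cdf"),
    "mmap": lambda m: m.PAGESIZE > 0,
    "_contextvars": lambda m: m.ContextVar("probe").get(None) is None,
}


def probe_shared_module(name):
    """Import `name` and report whether it was loaded from a shared object."""
    module = importlib.import_module(name)
    return {
        "name": name,
        # On the Nanvix build described here this is expected to be False.
        "builtin": name in sys.builtin_module_names,
        "smoke_ok": bool(SMOKE_CALLS[name](module)),
        # Built-in modules have no __file__, so fall back to None.
        "file": getattr(module, "__file__", None),
    }
```

On the target, calling `probe_shared_module("math")` would be expected to report `builtin` as False and a `file` path under lib-dynload/.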
Phase 1B follow-up. Now that nanvix/nanvix fixes the libm visibility merge in libposix.a (esaurez/nanvix#26, which localizes the 28 libm wrapper shadows emitted by Rust's compiler_builtins), python.elf's `.dynsym` correctly exposes `sqrt`, `cbrt`, `fma`, `ceil`, `fabs`, `fmod`, etc. as `GLOBAL DEFAULT`. dlopen'd `.so` modules can now resolve those symbols at runtime against the main executable, exactly like Linux dlopen'd modules resolve them against `libm.so.6`.

Pass `--with-libm=` (empty) to CPython's configure so the math / cmath / _statistics module link commands no longer include libm.a. Each `.so` ends up with `sqrt` / `cbrt` / etc. as undefined references that the dynamic loader resolves at dlopen time.

Size impact:
- math.cpython-312.so: 532 KB -> 381 KB (-28%)
- cmath.cpython-312.so: 180 KB -> 91 KB (-50%)
- _statistics.cpython-312.so: 28 KB -> 18 KB (-36%)

python.elf size is unchanged (libm.a was already linked via --whole-archive into the main binary; that hasn't changed). Total ramfs reduction: ~240 KB across the three modules.

Validation (on patched toolchain image that bundles esaurez/nanvix#26's fix to libposix.a / libnvx_crt0.a):
- nm python.elf shows sqrt / cbrt / fma as GLOBAL DEFAULT in .dynsym.
- math.sqrt(4.0)==2.0, cmath.sqrt(-1)==1j round-trip via dlopen.
- Phase 1B import probe: all 5 modules load via dlopen.
- Full CPython regrtest 160/160 modules PASS.
- Hello + lxml + HTTP smoke tests all pass.

Prerequisite: esaurez/nanvix#26 must be merged and the ghcr.io/nanvix/toolchain-python image rebuilt to carry the patched libposix.a before this PR can build with the change. Without the prerequisite, math.so dlopen would fail with "symbol not found" on sqrt, which is exactly the failure mode that motivated #26.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
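The size figures in this message can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded KB values quoted above; because those inputs are rounded, the recomputed percentages and total can differ slightly from the quoted ones.

```python
# Rounded sizes in KB as quoted in the commit message: (before, after).
SIZES_KB = {
    "math.cpython-312.so": (532, 381),
    "cmath.cpython-312.so": (180, 91),
    "_statistics.cpython-312.so": (28, 18),
}


def reduction(before, after):
    """Return (saved_kb, percent_saved) for one file."""
    saved = before - after
    return saved, 100.0 * saved / before


total_saved = 0
for name, (before, after) in SIZES_KB.items():
    saved, pct = reduction(before, after)
    total_saved += saved
    print(f"{name}: -{saved} KB ({pct:.1f}%)")

print(f"total: -{total_saved} KB")
```

With these rounded inputs the per-file savings are 151, 89 and 10 KB, for a total of 250 KB. That is close to the quoted ~240 KB; the gap is presumably rounding in the per-file figures.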
This was referenced Jun 3, 2026
esaurez pushed a commit that referenced this pull request Jun 3, 2026
Phase 3 of the .a -> .so migration (see nanvix-todo/cpython-static-to-shared-migration.md section 7). Promotes 7 modules with external Nanvix-ported .a dependencies from statically linked into python.elf to dlopen-loaded shared objects.

Modules moved to *shared* in Modules/Setup.local generation (.nanvix/docker.py):
- _bz2 (libbz2) --- unbundled, see Group A below
- _lzma (liblzma) --- unbundled, see Group A below
- zlib (libz) --- unbundled, see Group A below
- _sqlite3 (libsqlite3) --- unbundled, see Group A below
- _ssl (libssl + libcrypto) --- still bundled, see Group B below
- _hashlib (libcrypto) --- still bundled, see Group B below
- _ctypes (libffi) --- still bundled, see Group B below

==============================================================
Group A: 4 sysroot libs unbundled in this PR
==============================================================

For zlib, _bz2, _lzma, _sqlite3 (and binascii which uses libz for CRC32 via the same ZLIB_LIBS chain), this PR does the full Linux-style split: the underlying library lives once in python.elf via --whole-archive --export-dynamic, and each .so resolves its lib symbols at dlopen time against python.elf's .dynsym. Same architectural pattern as Phase 1B-drop-libm (#7) established for libm.

Changes in Makefile.nanvix CONFIGURE_ENV:
1. python.elf's LIBS is extended with a --whole-archive ... --no-whole-archive group containing libsqlite3.a, libz.a, libbz2.a, liblzma.a. The four libs now live exactly once in python.elf with every public symbol exported via --export-dynamic.
2. The per-module LIBS env vars used by cpython's configure to inject -l<lib> into individual .so links are cleared: ZLIB_LIBS, BZIP2_LIBS, LIBLZMA_LIBS, LIBSQLITE3_LIBS. The .so files now reference compress2 / BZ2_bzCompress / lzma_code / sqlite3_exec etc. as UND symbols; the dynamic loader resolves them at dlopen time.

Size impact for Group A:
  zlib.so       236 KB ->  181 KB  (-55 KB)
  _bz2.so       128 KB ->   68 KB  (-60 KB)
  _lzma.so      212 KB ->  120 KB  (-92 KB)
  _sqlite3.so  1448 KB ->  473 KB  (-975 KB, biggest .so win)
  python.elf   8.48 MB -> 9.50 MB  (+1.02 MB; the 4 libs now baked in via --whole-archive)

Net ramfs: ~150 KB savings. Architectural win: each of these 4 sysroot libs now exists exactly once in the image (canonical Linux pattern).

==============================================================
Group B: 3 modules with still-bundled libs (deferred)
==============================================================

For _ssl, _hashlib, _ctypes, this PR ships the same v1 approach that Phase 2 used: each .so embeds its own copy of the underlying library via the existing cpython per-module LDFLAGS mechanism. The .so files are larger than ideal but functionally complete:
- _ssl.so: 6398 KB (libssl + libcrypto baked in)
- _hashlib.so: 4795 KB (libcrypto baked in; duplicates _ssl's copy)
- _ctypes.so: 744 KB (libffi baked in)

These three are blocked by pre-existing Nanvix sysroot bugs that prevent --whole-archive wrapping:
1. libcrypto.a (used by _ssl, _hashlib) contains bss_log.o which references unimplemented POSIX openlog / syslog / closelog (Nanvix's newlib + libposix do not implement these; Nanvix logs through a different syslog kcall, not the POSIX client API). Under normal archive selection nothing references BIO_s_log so bss_log.o is never pulled; under --whole-archive every member gets pulled and the link fails. Fix needed: either strip bss_log.o from libcrypto.a (post-build), provide stub openlog/syslog/closelog, or recompile libcrypto without syslog support.
2. libffi.a (used by _ctypes) contains nested archives (libposix.a, libc.a, libm.a as ar members). ld --whole-archive tries to include every member as an object and fails on the nested .a's. Fix needed: either fix the Nanvix libffi build (probably a libtool convenience-library mishap) or post-process libffi.a to delete the nested members.

Group B and the cpython-internal-libs Group C (libmpdec, libexpat, libHacl_Hash_SHA2 for _decimal, pyexpat, _sha2) are tracked in nanvix-todo/phase2-3-unbundle-bundled-libs.md.

==============================================================
Correctness vs the libm case
==============================================================

Unlike the libm case that required nanvix#26 (Rust compiler_builtins shadows libm's WEAK HIDDEN symbols, breaking visibility merge), the Phase 3 bundled-lib duplication is a pure size issue:
- All four Group A libs (libz, libbz2, liblzma, libsqlite3) and the three Group B libs (libssl, libcrypto, libffi) are pure C with no Rust compiler_builtins shadows.
- None carry shared mutable per-process state in the way libm does (FP environment, signgam). The Group B libs DO carry init state (OpenSSL algorithm registry, libffi closure caches) but each .so copy initializes its own copy: wasteful but not incorrect.

For OpenSSL specifically, each .so copy initializes its own OpenSSL state. Observable side-effect today: _hashlib.openssl_sha256() called BEFORE any other OpenSSL init returns "unsupported hash type sha256"; calls from regrtest's test_hashlib succeed because OpenSSL has been initialized via _ssl by then. The deferred Group B unbundling fixes this for free by ensuring exactly one libcrypto init runs in python.elf before any module dlopens.

==============================================================
Validation
==============================================================

Tested on local toolchain (phase0-llfix overlay containing the patched libposix.a from esaurez/nanvix#26):
- All 7 .so files produced and installed under lib-dynload/.
- nm python.elf no longer shows PyInit_<name> for any of the 7.
- python.elf size: 16.67 MB (Phase 2) -> 9.50 MB (Phase 3), -7.17 MB. Largest single-phase reduction so far.
- Hello + Phase 1A + 1B + 1C + 2 + 3 import probes + lxml + HTTP smoke + full regrtest 160/160 PASS in standalone mode.
- test_hashlib, test_ssl, test_zlib, test_bz2, test_lzma, test_sqlite3, test_ctypes all included in regrtest's 160 and pass.

Phase 1 + Phase 2 + Phase 3 cumulative: 47 of 47 Tier-1/2/3 modules moved from static to .so. python.elf has dropped from ~20 MB at Phase 0 baseline to 9.50 MB now.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
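The nested-archive problem in libffi.a described under Group B could be detected before linking with a small scan of the ar container. The sketch below is a hypothetical helper, not part of the Nanvix tooling: it assumes the common GNU ar layout (8-byte global magic, 60-byte member headers, GNU `//` long-name table) and flags any member whose payload itself starts with the ar magic. `build_ar` exists only to make the example self-contained.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def iter_ar_members(data):
    """Yield (name, payload) for each regular member of a GNU-style ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    long_names = b""
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        raw_name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        payload = data[pos + HEADER_LEN:pos + HEADER_LEN + size]
        if raw_name == "//":
            long_names = payload              # GNU long-name table
        elif raw_name in ("/", "/SYM64/"):
            pass                              # symbol index, not a real member
        elif raw_name.startswith("/") and raw_name[1:].isdigit():
            start = int(raw_name[1:])         # offset into the long-name table
            end = long_names.index(b"/\n", start)
            yield long_names[start:end].decode("ascii"), payload
        else:
            yield raw_name.rstrip("/"), payload
        pos += HEADER_LEN + size + (size % 2)  # members are 2-byte aligned


def nested_archives(data):
    """Return names of members that are themselves ar archives."""
    return [name for name, payload in iter_ar_members(data)
            if payload.startswith(AR_MAGIC)]


def build_ar(members):
    """Build a tiny in-memory archive from (name, payload) pairs (short names only)."""
    out = bytearray(AR_MAGIC)
    for name, payload in members:
        header = (f"{name + '/':<16}{0:<12}{0:<6}{0:<6}{'644':<8}"
                  f"{len(payload):<10}`\n")
        out += header.encode("ascii") + payload
        if len(payload) % 2:
            out += b"\n"                      # pad to even length
    return bytes(out)


inner = build_ar([("sin.o", b"\x7fELF")])
outer = build_ar([("ffi.o", b"\x7fELFobj"), ("libm.a", inner)])
print(nested_archives(outer))  # ['libm.a']
```

On a real build the bytes would come from reading libffi.a from the sysroot. Removing the flagged members is deliberately left out, since the message above does not settle on which fix to apply.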
daa0c44 to b91f9d8
Folded into PR #6 (Phase 1B) per esaurez's review preference: it is easier to review the .so move and the libm unbundling together as one cohesive change. The squashed Phase 1B commit on eat/phase1b-tier1-mathmem-shared now contains both.
esaurez pushed a commit that referenced this pull request Jun 3, 2026
Phase 1C of the .a -> .so migration (see nanvix-todo/cpython-static-to-shared-migration.md section 5). Builds on Phase 1B (#6, #7) by promoting the remaining 8 Tier-1 "text codec" stdlib extension modules from statically linked into python.elf to dlopen-loaded shared objects under lib/python3.12/lib-dynload/.

Modules moved to *shared* in Modules/Setup.local generation (.nanvix/docker.py):
- unicodedata: Unicode database lookups (the big one: 1.2 MB of unicode data tables).
- _multibytecodec: shared CJK codec infrastructure.
- _codecs_cn / _codecs_hk / _codecs_iso2022 / _codecs_jp / _codecs_kr / _codecs_tw: per-region CJK codec tables.

None of the eight reference external libraries; they are pure C with embedded data tables. They link against the same -lc / runtime symbols that the rest of the Phase 1 modules use.

Test coverage (.nanvix/test.py):
- New phase1c_snippet imports each module, asserts it is NOT in sys.builtin_module_names, exercises one trivial API call to confirm dlopen + PyInit_<name> succeeded (unicodedata.lookup, _multibytecodec.__create_codec, _codecs_<region>.getcodec), and prints the resolved __file__ path. Phase 1A/1B probes retained.

Validation on local toolchain (phase0-llfix):
- All 8 new .so files produced and installed under lib-dynload/ (unicodedata 1193K, _codecs_jp 262K, _codecs_hk 168K, _codecs_cn 155K, _codecs_kr 145K, _multibytecodec 147K, _codecs_tw 115K, _codecs_iso2022 76K; total ~2.2 MB across the eight files).
- nm python.elf no longer shows PyInit_<name> for any of the 8.
- python.elf size: 19.18 MB (Phase 1B) -> 17.48 MB (Phase 1C), -1.70 MB. Biggest single-phase reduction so far because the CJK codec tables and the Unicode database are large.
- Hello + Phase 1A + Phase 1B + Phase 1C import probes + lxml + HTTP smoke + full regrtest 160/160 PASS in standalone mode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
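The trivial API calls named for the Phase 1C probe could be exercised roughly as below. This is a sketch rather than the actual phase1c_snippet; the character name and the `shift_jis` codec are arbitrary choices made for illustration.

```python
import unicodedata
import _codecs_jp

# unicodedata.lookup maps a Unicode character name to the character itself.
ch = unicodedata.lookup("HIRAGANA LETTER A")

# getcodec returns a codec object built on top of _multibytecodec;
# its encode/decode methods return (result, length_consumed) tuples.
codec = _codecs_jp.getcodec("shift_jis")
encoded, _ = codec.encode(ch)
decoded, _ = codec.decode(encoded)

print(repr(ch), encoded, decoded == ch)
```

A successful round trip shows that both the Unicode database and the codec tables were reachable after import, which on the target implies the corresponding .so files loaded.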
Summary

Phase 1B follow-up. Now that esaurez/nanvix#26 fixes the libm visibility merge in `libposix.a` (it localizes the 28 libm wrapper shadows emitted by Rust's `compiler_builtins`), `python.elf`'s `.dynsym` correctly exposes `sqrt`, `cbrt`, `fma`, `ceil`, `fabs`, `fmod`, etc. as `GLOBAL DEFAULT`. dlopen'd `.so` modules can now resolve those symbols at runtime against the main executable, exactly like Linux dlopen'd modules resolve them against `libm.so.6`.

This PR passes `--with-libm=` (empty) to CPython's `configure` so the `math` / `cmath` / `_statistics` module link commands no longer include `libm.a`. Each `.so` ends up with `sqrt` / `cbrt` / etc. as undefined references that the dynamic loader resolves at dlopen time.

Size impact

- `math.cpython-312.so`: 532 KB -> 381 KB (-28%)
- `cmath.cpython-312.so`: 180 KB -> 91 KB (-50%)
- `_statistics.cpython-312.so`: 28 KB -> 18 KB (-36%)

`python.elf` size is unchanged: libm.a was already linked via `--whole-archive` into the main binary, and that hasn't changed. Net ramfs reduction: ~240 KB across the three modules. The remaining bulk in `math.so` is CPython wrapper code, not libm.

Architectural pattern

This is the canonical pattern for any Nanvix-on-static-libm CPython extension that uses libm:

- Each `.so` references `sqrt` / `cbrt` / etc. as `UND` symbols.
- The dynamic loader resolves them against `python.elf`'s `.dynsym`, which has them as `GLOBAL DEFAULT` because libm.a is in `python.elf`'s `--whole-archive` LIBS and the libposix visibility-merge contamination has been fixed at the Nanvix level (esaurez/nanvix#26).

This matches how Linux dlopen'd CPython math modules resolve libm: through the main executable's `.dynsym` (Linux uses `libm.so.6` instead of the static link, but the resolution mechanism is the same). Future Phase 2/3 modules with libm dependencies (`_datetime`, `_decimal`, etc.) automatically benefit from this without needing libm.a baked into each `.so`.

Validation

Tested locally on a toolchain image (`local-nanvix/toolchain-python:phase0-llfix`) bundling esaurez/nanvix#26's patched `libposix.a` / `libnvx_crt0.a`:

- `readelf --dyn-syms python.elf | grep sqrt` shows `GLOBAL DEFAULT sqrt` at `0x40415bd0`.
- `math.sqrt(4.0) == 2.0` and `cmath.sqrt(-1) == 1j` round-trip via dlopen.
- Phase 1B import probe (`math`, `cmath`, `_statistics`, `mmap`, `_contextvars`): all 5 modules load via dlopen.

Prerequisites

- esaurez/nanvix#26 must be merged.
- The `ghcr.io/nanvix/toolchain-python` image must be rebuilt to carry the patched `libposix.a`.

Without these, `math.so` dlopen would fail with `ImportError: symbol not found` on `sqrt`, which is exactly the failure mode that motivated #26.

Note on the 11 newlib-internal `__math_*` helpers

`__math_invalid`, `__math_oflow`, `__math_uflow`, `__math_divzero`, etc. are GLOBAL HIDDEN at source in `newlib/libm/common/math_config.h` (ported from ARM optimized-routines, same as glibc and musl). They are deliberately library-private and do not appear in `python.elf`'s `.dynsym` even with this PR, and that is correct, not a bug. They are called only from inside libm's own internals (e.g. `__ieee754_sqrt` → `__math_invalid`), which live inside `python.elf` alongside the helpers themselves; the PC-relative call between them is resolved at static-link time and never touches `.dynsym`. dlopen'd modules don't need them: they call public names like `sqrt`, which dispatches to `python.elf`'s `sqrt`, which internally calls `__math_invalid` if needed. This is identical to how Linux ships `libm.so.6` with the same HIDDEN attribute on the same helpers; dlopen'd Python extensions on Linux never need to look them up either.

Risk

One-line change. Trivially revertable if needed (set `--with-libm="$(LIBM)"` back). The functional behavior of math/cmath/_statistics is identical to before: same C source, same APIs, same numeric semantics. Only the location of libm code changes (now `python.elf`-only instead of duplicated into each `.so`).
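As a complement to the `readelf` check in the Validation section, symbol visibility could also be probed from inside the running interpreter. The sketch below is an assumption-laden illustration, not something the PR uses: `main_program_exports` is a hypothetical helper, it relies on `ctypes.CDLL(None)` mapping to `dlopen(NULL)`, and it needs `_ctypes` to be loadable on the target. On Linux that handle also searches the main program's already-loaded dependencies, so a True result there does not by itself prove the symbol lives in the executable.

```python
import ctypes


def main_program_exports(symbol):
    """Return True if `symbol` resolves through the process-wide handle."""
    try:
        # CDLL(None) asks the loader for the main program's handle (dlopen(NULL)).
        handle = ctypes.CDLL(None)
    except (OSError, TypeError):
        return False  # platform without dlopen(NULL) semantics
    # Attribute access performs the symbol lookup; a failed lookup
    # raises AttributeError, which hasattr turns into False.
    return hasattr(handle, symbol)


for name in ("sqrt", "cbrt", "fma"):
    print(name, main_program_exports(name))
```

On the Nanvix build described here, where libm is linked into `python.elf` with its public symbols exported, each of these lookups would be expected to succeed.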