[nanvix] E: Phase 2 - build 15 Tier-2 stdlib extensions as .so #9
Open
esaurez wants to merge 1 commit into
d882641 to 0866b55
Phase 2 of the .a -> .so migration (see nanvix-todo/cpython-static-to-shared-migration.md section 6). Promotes 15 modules with bundled-in-cpython C dependencies from statically linked into python.elf to dlopen-loaded shared objects.

Modules moved to *shared* in Modules/Setup.local generation (.nanvix/docker.py):
- _asyncio, _datetime (async + time)
- _decimal, pyexpat, _elementtree (decimal + XML)
- _md5, _sha1, _sha2, _sha3, _blake2 (hashing)
- select, _socket, _posixsubprocess, fcntl, termios (net + IO + posix)

For the 4 modules with bundled C dep archives (_decimal -> libmpdec.a, pyexpat / _elementtree -> libexpat.a, _sha2 -> libHacl_Hash_SHA2.a), this PR ships them with the bundled archive embedded in the .so itself via the existing cpython per-module LDFLAGS mechanism. The .so files are larger than they could be (libmpdec adds ~1.5 MB to _decimal.so, libHacl_Hash_SHA2 adds ~1.7 MB to _sha2.so, libexpat adds ~0.7 MB to pyexpat.so) but functionally complete.

Optimizing those four to resolve bundled-lib symbols via python.elf's .dynsym at dlopen time is a follow-up PR (analogous to Phase 1B -> Phase 1B-drop-libm) that requires either a two-pass configure or a linker-script-based approach to make the bundled .a available to conftest at configure time. The visibility-merge mechanism that required esaurez/nanvix#26 for libm does NOT apply here: mpdec, expat, and HACL are pure C with no compiler_builtins shadows, so the follow-up is a pure size optimization, not a correctness fix.

For the other 11 modules with no bundled deps, .so size is what it should be (CPython glue only).

Validation on local toolchain (phase0-llfix):
- All 15 .so files produced and installed under lib-dynload/. Sizes: _asyncio 352K, _datetime 517K, _decimal 1593K, pyexpat 731K, _elementtree 347K, _md5 86K, _sha1 57K, _sha2 1766K, _sha3 119K, _blake2 257K, select 57K, _socket 238K, _posixsubprocess 135K, fcntl 36K, termios 46K.
- nm python.elf no longer shows PyInit_<name> for any of the 15.
- python.elf size: 17.48 MB (Phase 1C) -> 16.67 MB (Phase 2), -0.81 MB.
- Hello + Phase 1A + 1B + 1C + 2 import probes + lxml + HTTP smoke + full regrtest 160/160 PASS in standalone mode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
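The nm check listed under validation could be scripted roughly as follows. This is a minimal sketch, not the check used in this PR: the function name `leftover_init_symbols` and the constant `PHASE2_MODULES` are hypothetical, and the parser assumes the default `nm` output layout in which the symbol name is the last whitespace-separated field on each line.

```python
# Hypothetical helper: report which Phase 2 modules still export PyInit_<name>
# from the main binary. The names below are illustrative, not from the PR.
PHASE2_MODULES = [
    "_asyncio", "_datetime", "_decimal", "pyexpat", "_elementtree",
    "_md5", "_sha1", "_sha2", "_sha3", "_blake2",
    "select", "_socket", "_posixsubprocess", "fcntl", "termios",
]


def leftover_init_symbols(nm_output, modules=PHASE2_MODULES):
    """Return the modules whose PyInit_<name> symbol appears in nm output text."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if fields:
            # Assumption: the symbol name is the last field on the line.
            symbols.add(fields[-1])
    return [name for name in modules if f"PyInit_{name}" in symbols]
```

One way to feed it is `subprocess.run(["nm", "python.elf"], capture_output=True, text=True, check=True).stdout`; an empty result list would correspond to the "no longer shows PyInit_<name>" observation above.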
f5a09f8 to 643a0fd
Summary
Phase 2 of the .a → .so migration (see nanvix-todo/cpython-static-to-shared-migration.md section 6). Promotes 15 modules with bundled-in-cpython C dependencies from statically linked into python.elf to dlopen-loaded shared objects under lib/python3.12/lib-dynload/.

Modules moved to *shared*
- _asyncio, _datetime
- _decimal, pyexpat, _elementtree
- _md5, _sha1, _sha2, _sha3, _blake2
- select, _socket, _posixsubprocess, fcntl, termios

Bundled-lib trade-off in this PR
Four of these modules carry bundled C library dependencies:
- _decimal → libmpdec.a
- pyexpat + _elementtree → libexpat.a
- _sha2 → libHacl_Hash_SHA2.a

This PR ships them with the bundled archive embedded in the .so itself via the existing CPython per-module LDFLAGS mechanism. The result is functionally complete, but the .so files are larger than ideal:
- _decimal.so is 1593 KB (libmpdec dominates)
- _sha2.so is 1766 KB (libHacl_Hash_SHA2 dominates)
- pyexpat.so is 731 KB (libexpat dominates)

Optimizing these four to resolve bundled-lib symbols at dlopen time against python.elf's .dynsym (analogous to the Phase 1B → Phase 1B-drop-libm follow-up) is deferred to a separate PR. The optimization requires either a two-pass configure (build the bundled libs first so they exist when conftest runs) or a linker-script-based approach, neither of which is trivial. Importantly: the visibility-merge mechanism that required esaurez/nanvix#26 for libm does NOT apply here, because libmpdec / libexpat / libHacl_Hash_SHA2 are pure C with no Rust compiler_builtins shadows. The deferred follow-up is a pure size optimization, not a correctness fix.

For the other 11 modules with no bundled deps (_asyncio, _datetime, _md5, _sha1, _sha3, _blake2, select, _socket, _posixsubprocess, fcntl, termios), .so size is what it should be: CPython glue only.

Size impact
python.elf: 17.48 MB (Phase 1C) → 16.67 MB (Phase 2), −0.81 MB.

Test coverage
New phase2_snippet in .nanvix/test.py imports each module, asserts that it is not a builtin, exercises one trivial API call to confirm that dlopen + PyInit_<name> succeeded (_asyncio.Future, _decimal.Decimal('1.1') + Decimal('2.2') == Decimal('3.3'), pyexpat.ParserCreate, hash construction, socket creation, etc.), and prints the resolved __file__ path. The Phase 1A/1B/1C probes are retained.

Validation
Tested on phase0-llfix toolchain overlay:
- All 15 .so files installed under lib-dynload/.
- nm python.elf no longer shows PyInit_<name> for any of the 15.
- python.elf size: 17.48 MB (Phase 1C) → 16.67 MB (Phase 2), −0.81 MB.
- _decimal.Decimal('1.1') + Decimal('2.2') == Decimal('3.3') works (validates libmpdec embed).
- pyexpat.ParserCreate() works (validates libexpat embed).
- _sha2.sha256() works (validates libHacl_Hash_SHA2 embed).

Prerequisites
Stacked on Phase 1C (esaurez/cpython#8). No additional nanvix / newlib / gcc PRs needed beyond what Phase 1 already requires.
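For reviewers who want a feel for the import probe described under Test coverage, here is a minimal sketch of that style of check. It is not the actual phase2_snippet from .nanvix/test.py: the names `probe`, `PROBES`, `run_probes`, and the `check_*` helpers are hypothetical, and "non-builtin" is approximated by testing membership in `sys.builtin_module_names`.

```python
import importlib
import sys


def probe(name, exercise, require_shared=True):
    """Import `name`, optionally require that it is not built in, run one call."""
    module = importlib.import_module(name)
    builtin = name in sys.builtin_module_names
    if require_shared:
        assert not builtin, f"{name} is still compiled into the interpreter"
    exercise(module)  # raises if the loaded module is not functional
    return {"name": name, "builtin": builtin,
            "file": getattr(module, "__file__", None)}


def check_decimal(mod):
    # Exercises the bundled libmpdec through exact decimal arithmetic.
    assert mod.Decimal("1.1") + mod.Decimal("2.2") == mod.Decimal("3.3")


def check_pyexpat(mod):
    # Exercises the bundled libexpat by constructing a parser.
    mod.ParserCreate()


def check_sha2(mod):
    # Exercises the bundled HACL SHA-2 code; _sha2 exists in CPython 3.12+.
    mod.sha256(b"abc").hexdigest()


# Illustrative subset; the real snippet covers all 15 modules.
PROBES = {"_decimal": check_decimal, "pyexpat": check_pyexpat, "_sha2": check_sha2}


def run_probes(probes=PROBES):
    """Probe each module and print where it was loaded from."""
    for name, exercise in probes.items():
        result = probe(name, exercise)
        print(f"{name}: {result['file']}")
```

On a build where these modules are shared, `run_probes()` would print a path under lib-dynload/ for each entry; on a build where one is still static, the assertion in `probe` fails first.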
Risk
Mechanical configuration change: 15 entries added to the *shared* block of Setup.local, same pattern as Phase 1A/1B/1C. The bundled-lib .so size cost is documented and a follow-up optimization is identified.
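To make the "entries added to the *shared* block" change concrete, the sketch below renders a Setup-style block in which a `*shared*` marker line is followed by one `module source...` line per extension. It is an illustration only and not the code in .nanvix/docker.py: `render_shared_block` and `EXAMPLE_ENTRIES` are hypothetical names, only four of the 15 modules are shown, and the source file names are assumed to match upstream CPython's Modules/Setup.stdlib.in.

```python
def render_shared_block(entries):
    """Render Setup-style text that marks the listed modules as shared.

    `entries` is an iterable of (module_name, [source_files]) pairs.
    """
    lines = ["*shared*"]  # modules listed after this marker are built as .so
    for name, sources in entries:
        lines.append(" ".join([name, *sources]))
    return "\n".join(lines) + "\n"


# Illustrative subset; source names are assumed from upstream CPython.
EXAMPLE_ENTRIES = [
    ("_asyncio", ["_asynciomodule.c"]),
    ("select", ["selectmodule.c"]),
    ("fcntl", ["fcntlmodule.c"]),
    ("termios", ["termios.c"]),
]
```

Calling `render_shared_block(EXAMPLE_ENTRIES)` yields text that could be appended to a generated Setup.local. Modules with bundled archives would additionally need their per-module link flags, which this sketch leaves out.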