[build] Freeze startup encoding modules into CPython binary #317

Merged: ppenna merged 1 commit into nanvix/v3.12.3 from enhancement/freeze-encodings on Apr 5, 2026

@ppenna ppenna commented Mar 30, 2026

Summary

Freeze the five encoding modules imported during every interpreter startup into
the CPython binary, eliminating filesystem I/O that is prohibitively slow on
Nanvix ramfs/microvm.

Motivation

During CPython startup, 25 modules are imported. Prior to this change, 20 were
frozen/builtin but the encodings package and its codecs required filesystem reads
from ramfs. On Nanvix each file open/read/close involves VM exits, adding
measurable latency to startup.
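
One way to see which startup modules still come from the filesystem is to inspect the import spec of everything already loaded. The sketch below is not part of the PR; it assumes a stock CPython 3.12, where frozen modules report a spec origin of "frozen" and built-in modules report "built-in". The helper name classify_loaded_modules is hypothetical.

```python
import sys


def classify_loaded_modules():
    """Map each loaded module name to 'frozen', 'builtin', 'file', or 'other'."""
    result = {}
    # Copy the items first so the dict cannot change size while we iterate.
    for name, module in list(sys.modules.items()):
        spec = getattr(module, "__spec__", None)
        origin = getattr(spec, "origin", None)
        if origin == "frozen":
            kind = "frozen"
        elif origin == "built-in":
            kind = "builtin"
        elif isinstance(origin, str):
            kind = "file"  # loaded from a path, so it needed filesystem reads
        else:
            kind = "other"  # no spec or no origin (for example __main__)
        result[name] = kind
    return result


if __name__ == "__main__":
    kinds = classify_loaded_modules()
    for name in ("encodings", "encodings.aliases", "encodings.ascii",
                 "encodings.latin_1", "encodings.utf_8"):
        print(f"{name}: {kinds.get(name, 'not loaded')}")
```

Running it with python -S keeps the loaded set close to the "startup, without site" case. Which of the five encoding modules actually appear depends on the locale and stream encodings in effect.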

Changes

  • Tools/build/freeze_modules.py: Add encodings package + 4 codec modules to FROZEN list
  • Python/frozen.c: Auto-regenerated with 5 new frozen module entries
  • Makefile.pre.in: Auto-regenerated with freeze targets for encoding modules
  • PCbuild/_freeze_module.vcxproj{,.filters}: Auto-regenerated for Windows builds
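
The diff itself is not reproduced on this page. As a rough sketch only, the hand-edited part of the change might look like the excerpt below, assuming the upstream spec syntax in which angle brackets mark a package and assuming the neighbouring entries (abc, codecs, io) match upstream CPython 3.12. STARTUP_SECTION, module_name, and ENCODING_MODULES are hypothetical names for this illustration.

```python
# Hypothetical excerpt modeled on one section of the FROZEN list in
# Tools/build/freeze_modules.py. The real list has more sections.
STARTUP_SECTION = (
    "stdlib - startup, without site (python -S)",
    [
        "abc",
        "codecs",
        "<encodings>",          # angle brackets: freeze the package __init__
        "encodings.aliases",
        "encodings.ascii",
        "encodings.latin_1",
        "encodings.utf_8",
        "io",
    ],
)


def module_name(spec):
    """Strip the package marker to get the plain dotted module name."""
    return spec.strip("<>")


# The five entries this PR describes adding.
ENCODING_MODULES = [
    module_name(spec)
    for spec in STARTUP_SECTION[1]
    if module_name(spec).split(".")[0] == "encodings"
]
```

The remaining files in the list above are described as regenerated from this one, so only the module list would need manual review.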

Design Decision

Only the 5 startup-critical modules are frozen (not all ~100 encoding modules):
encodings, encodings.aliases, encodings.ascii, encodings.latin_1, and
encodings.utf_8. Binary growth is ~92 KB.

Benchmark

Platform: microvm/standalone, 128 MB, Nanvix v0.12.364
Method: 20 A/B interleaved pairs on WSL2 (i9-12900H, KVM)

Metric        Baseline   Frozen    Delta
Mean          79.8 ms    78.0 ms   -1.8 ms (2.3%)
Median        79.0 ms    70.5 ms   -8.5 ms (10.8%)
P10           64 ms      58 ms     -6 ms
P90           109 ms     119 ms    +10 ms
Binary size   15.8 MB    15.9 MB   +92 KB
Frozen wins   -          -         13/20 pairs

The improvement is modest on a fast KVM host (~80 ms total startup).
On production Nanvix deployments with higher VM-exit latency, the saving
is expected to scale roughly in proportion.
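
The harness behind these numbers is not shown in the PR. A minimal sketch of an interleaved A/B measurement and the summary statistics could look like the following. It assumes both builds can be launched as ordinary host processes, which may not match how the microvm runs were driven, and time_startup, run_pairs, and summarize are hypothetical names.

```python
import statistics
import subprocess
import time


def time_startup(python_exe):
    """Return wall-clock milliseconds for one 'python -S -c pass' launch."""
    start = time.perf_counter()
    subprocess.run([python_exe, "-S", "-c", "pass"], check=True)
    return (time.perf_counter() - start) * 1000.0


def run_pairs(baseline_exe, frozen_exe, pairs=20):
    """Alternate baseline and frozen runs so drift affects both equally."""
    results = []
    for _ in range(pairs):
        baseline_ms = time_startup(baseline_exe)
        frozen_ms = time_startup(frozen_exe)
        results.append((baseline_ms, frozen_ms))
    return results


def summarize(pairs):
    """Summarize (baseline_ms, frozen_ms) pairs; negative deltas favor frozen."""
    baseline = [b for b, _ in pairs]
    frozen = [f for _, f in pairs]
    return {
        "mean_delta": statistics.mean(frozen) - statistics.mean(baseline),
        "median_delta": statistics.median(frozen) - statistics.median(baseline),
        "frozen_wins": sum(f < b for b, f in pairs),
        "pairs": len(pairs),
    }
```

A call such as summarize(run_pairs("./python-baseline", "./python-frozen")) would produce the deltas and the win count; both executable paths are placeholders. Reporting the win count alongside the mean and median helps when, as in the table above, the tail (P90) moves in the opposite direction from the median.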

Supersedes the earlier revision of #317, which included pre-compiled .h
headers in the commit. This version relies on the build system to
generate them, matching upstream CPython conventions.

Copilot AI left a comment

Pull request overview

This PR aims to freeze the full encodings.* package into the CPython binary to reduce startup time by eliminating filesystem I/O during interpreter initialization (especially impactful in the Nanvix/ramfs environment).

Changes:

  • Enable freezing of all encodings.* modules via Tools/build/freeze_modules.py.
  • Extend the frozen module registry (Python/frozen.c) and build rules (Makefile.pre.in, PCbuild/_freeze_module.*) to include the encodings modules.
  • Add a guard in Python/import.c to avoid crashing when a deepfrozen-only module has no marshalled byte buffer in subinterpreters.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

File                                     Description
Tools/build/freeze_modules.py            Enables freezing of encodings.* via the module selection list.
Python/import.c                          Prevents NULL deref/crash when unmarshalling deepfrozen-only modules in subinterpreters.
Python/frozen.c                          Adds encodings frozen entries + os.path platform mapping; updates frozen module tables.
Makefile.pre.in                          Adds freeze/deepfreeze inputs/outputs and per-module freeze targets for encodings on POSIX builds.
PCbuild/_freeze_module.vcxproj           Adds Windows freeze targets for encodings modules.
PCbuild/_freeze_module.vcxproj.filters   Adds encodings sources into the Visual Studio filter list.

Comment thread Python/frozen.c Outdated
Comment thread Python/import.c Outdated
@ppenna ppenna self-assigned this Mar 30, 2026
@ppenna ppenna force-pushed the enhancement/freeze-encodings branch from 247b9fe to c128c47 Compare April 4, 2026 23:30
@ppenna ppenna changed the title from "build: freeze all encoding modules into CPython binary" to "build: freeze startup encoding modules into CPython binary" Apr 4, 2026
Freeze the five encoding modules imported during every interpreter
startup into the CPython binary, eliminating filesystem I/O that is
prohibitively slow on Nanvix ramfs/microvm.

During CPython startup, 25 modules are imported.  Prior to this change,
20 were frozen/builtin but the encodings package and its codecs required
filesystem reads from ramfs.  On Nanvix each file open/read/close
involves VM exits, adding measurable latency to startup.

Only the 5 startup-critical modules are frozen (not all ~100 encoding
modules): encodings, encodings.aliases, encodings.ascii,
encodings.latin_1, and encodings.utf_8.  Binary growth is ~92 KB.

Benchmark (microvm/standalone, 128 MB, 20 A/B interleaved pairs):

  Metric           Baseline    Frozen     Delta
  Mean (ms)          79.8      78.0      -1.8
  Median (ms)        79.0      70.5      -8.5 (10.8%)
  Frozen wins     13/20 pairs

Changes:
- Tools/build/freeze_modules.py: add 5 encoding modules to FROZEN list
- Python/frozen.c: auto-regenerated (includes, externs, table entries)
- Makefile.pre.in: auto-regenerated (FROZEN_FILES_IN/OUT, build rules,
  deepfreeze deps)
- PCbuild/_freeze_module.vcxproj{,.filters}: auto-regenerated

Supersedes #317.
@ppenna ppenna force-pushed the enhancement/freeze-encodings branch from c128c47 to 1bf0ad5 Compare April 5, 2026 23:20
@ppenna ppenna requested a review from Copilot April 5, 2026 23:36

Copilot AI left a comment

Pull request overview

This PR reduces CPython interpreter startup latency on Nanvix by freezing the small set of encoding-related modules that are imported on every startup into the binary, avoiding slow ramfs filesystem I/O during initialization.

Changes:

  • Add encodings (package) and four startup-critical codec modules to the frozen stdlib startup set.
  • Regenerate frozen-module artifacts (Python/frozen.c, Makefile.pre.in, and Windows freeze project files) to include the new modules.

File                                     Description
Tools/build/freeze_modules.py            Adds encodings + key codec modules to the “startup, without site” frozen module list.
Python/frozen.c                          Registers the new frozen modules (includes, externs, and stdlib_modules[] entries).
Makefile.pre.in                          Adds the new inputs/outputs and per-module freeze targets for POSIX builds and deepfreeze dependency lists.
PCbuild/_freeze_module.vcxproj           Adds the new modules to the Windows frozen-header generation and deepfreeze command list.
PCbuild/_freeze_module.vcxproj.filters   Adds the new Python source entries to the Visual Studio filters list.

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 0 new

@ppenna ppenna changed the title from "build: freeze startup encoding modules into CPython binary" to "[build] Freeze startup encoding modules into CPython binary" Apr 5, 2026
@ppenna ppenna merged commit bbd55c4 into nanvix/v3.12.3 Apr 5, 2026
12 checks passed
@ppenna ppenna deleted the enhancement/freeze-encodings branch April 5, 2026 23:38
2 participants