build: optimize startup performance for microvm/standalone mode #365

Closed
ppenna wants to merge 1 commit into nanvix/v3.12.3 from enhancement/startup-performance-optimizations

@ppenna ppenna commented Apr 5, 2026

Summary

Combine performance optimizations from PRs #361, #362, #363, and #318 to reduce
cold-boot hello-world execution time from ~600ms to ~450ms, with VM snapshot
support enabling sub-200ms restore-boot.

Changes

1. Non-PIE static linking (PR #361, ~20ms saved)

  • Switch LDFLAGS from -Wl,-pie -Wl,--export-dynamic to -no-pie
  • Eliminates 100,383 R_386_RELATIVE relocations processed at startup
  • Binary reduced by ~700KB
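
To make the cost of those relocations concrete, here is a minimal sketch of what a loader does for each REL-style relative relocation on i386: it reads the word stored at the relocation offset (the implicit addend) and adds the load base to it. The function name `apply_relative_relocs` and its parameters are illustrative assumptions, not the Nanvix loader's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only: applies "base + addend" fixups to a loaded image.
 * For REL-style entries the addend is the 32-bit word already stored at
 * each offset, so every entry costs one read, one add, and one write.
 */
void apply_relative_relocs(uint8_t *image, uint32_t load_base,
                           const uint32_t *offsets, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word;
        /* memcpy avoids unaligned-access and aliasing problems. */
        memcpy(&word, image + offsets[i], sizeof word);
        word += load_base;
        memcpy(image + offsets[i], &word, sizeof word);
    }
}
```

A non-PIE binary linked at a fixed address carries no such entries, so this loop has nothing to do at startup.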

2. Compiler flag optimizations (PR #362, ~70ms saved)

  • -O3: Aggressive optimization (~60ms faster than -Os on microvm)
  • -fomit-frame-pointer: Frees EBP register on i686 (~5ms)
  • -fno-unwind-tables -fno-asynchronous-unwind-tables: Shrinks binary
  • Remove --with-lto: LTO causes I-cache pressure on constrained microvm
  • --without-doc-strings: Reduces .rodata size
  • --with-computed-gotos: Faster bytecode dispatch for cross builds
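
For readers unfamiliar with computed gotos, the sketch below shows the general technique on a toy interpreter: each opcode handler jumps straight to the next handler through a table of label addresses (a GCC/Clang extension) instead of returning to a central `switch`. The opcodes and the function `run_toy_vm` are invented for illustration and are not CPython's evaluation loop.

```c
#include <stddef.h>

/* Toy opcodes for illustration only; these are not CPython opcodes. */
enum { OP_PUSH1, OP_ADD, OP_DOUBLE, OP_HALT };

/* Runs a tiny accumulator program using "labels as values". */
int run_toy_vm(const unsigned char *code)
{
    /* One label address per opcode, indexed by opcode number. */
    static void *dispatch[] = {
        &&do_push1, &&do_add, &&do_double, &&do_halt
    };
    int acc = 0;   /* running result */
    int top = 0;   /* single operand slot */
    size_t pc = 0; /* index of the next opcode */

/* Fetch the next opcode and jump directly to its handler. */
#define NEXT() goto *dispatch[code[pc++]]

    NEXT();

do_push1:
    top = 1;
    NEXT();
do_add:
    acc += top;
    NEXT();
do_double:
    acc *= 2;
    NEXT();
do_halt:
    return acc;

#undef NEXT
}
```

Because every handler ends in its own indirect jump, the branch predictor sees one jump site per opcode rather than a single shared one, which is the usual explanation for the dispatch speedup.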

3. Test harness optimizations (PR #363, ~40ms saved)

  • --strip-all instead of --strip-debug for smallest binary footprint
  • Move host binaries out of ramfs tree (~15MB ramfs reduction)
  • Build minimal ramfs with only test script and encodings/ package
  • -S flag: Skip site.py import (~10ms)
  • PYTHONHASHSEED=0: Fixed hash seed, avoids entropy overhead
  • Remove timeout wrapper (avoids extra process, ~5ms)

4. VM snapshot support (PR #318)

  • Modules/nanvix_snapshot.S: Assembly syscall helper (int $0x80, nr 35)
  • Call nanvix_snapshot() after initialization when NANVIX_SNAPSHOT=1 is set
  • Enables restore-boot in ~35ms (requires snapshot-capable nanvixd)
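
The environment gating described above could look roughly like the following sketch. It assumes the helper takes no arguments and returns nothing, and that only the exact value `1` enables it; neither detail is confirmed by this description. `snapshot_requested`, `maybe_snapshot`, and `fake_snapshot` are hypothetical names, and the snapshot call is passed as a function pointer so the sketch builds outside Nanvix.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 only when NANVIX_SNAPSHOT is set to exactly "1", else 0. */
int snapshot_requested(void)
{
    const char *value = getenv("NANVIX_SNAPSHOT");
    return value != NULL && strcmp(value, "1") == 0;
}

/* Stand-in for the real assembly helper; it only counts invocations. */
int fake_snapshot_calls = 0;
void fake_snapshot(void)
{
    fake_snapshot_calls++;
}

/*
 * Hypothetical call site: invoke the snapshot helper once interpreter
 * initialization is done, but only if the environment opts in.
 * Returns 1 if the helper was called, 0 otherwise.
 */
int maybe_snapshot(void (*snapshot_fn)(void))
{
    if (!snapshot_requested())
        return 0;
    snapshot_fn();
    return 1;
}
```

Keeping the check behind an environment variable means a build that includes the helper still behaves as before on a nanvixd without snapshot support, as long as the variable is unset.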

Performance

| Metric | Before | After (cold) | After (snapshot restore) |
|---|---|---|---|
| Hello world | ~600ms | ~450ms | ~35ms |
| Relocations | 100,383 | 0 | 0 |
| Ramfs (test) | ~43MB | ~2.8MB | ~2.8MB |

Note

The 200ms target is achievable via VM snapshot restore (NANVIX_SNAPSHOT=1),
which requires a snapshot-capable nanvixd. Cold-boot optimizations alone bring
startup from ~600ms to ~450ms.
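
As a quick consistency check, the per-PR estimates can be summed and compared with the measured cold-boot delta. The sketch only restates the approximate figures quoted in this description; the function names are illustrative.

```c
/* All inputs are the approximate ("~") figures from this PR description. */
int itemized_savings_ms(void)
{
    const int non_pie = 20;        /* PR #361 */
    const int compiler_flags = 70; /* PR #362 */
    const int test_harness = 40;   /* PR #363 */
    return non_pie + compiler_flags + test_harness;
}

int measured_cold_savings_ms(void)
{
    const int before = 600;
    const int after_cold = 450;
    return before - after_cold;
}
```

The itemized estimates sum to about 130ms against a measured delta of about 150ms. Since every input is approximate, the roughly 20ms difference may simply reflect rounding.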

Combine multiple performance optimizations to reduce cold-boot hello-world
execution time from ~600ms to ~450ms, with VM snapshot support enabling
sub-200ms restore-boot.

Changes from PR #361 - Non-PIE static linking (~20ms saved):
- Switch LDFLAGS from -Wl,-pie -Wl,--export-dynamic to -no-pie
- Eliminates 100K R_386_RELATIVE relocations at startup

Changes from PR #362 - Compiler flag optimizations (~70ms saved):
- Add -O3 -fomit-frame-pointer -fno-unwind-tables
  -fno-asynchronous-unwind-tables to CFLAGS
- Remove --with-lto (causes I-cache pressure on constrained microvm)
- Add --without-doc-strings and --with-computed-gotos

Changes from PR #363 - Test harness optimizations (~40ms saved):
- Use --strip-all instead of --strip-debug for smallest binary
- Move host binaries (nanvixd, kernel, mkramfs) out of ramfs tree
- Build minimal ramfs with only test script and encodings/
- Add -S flag (skip site.py) and PYTHONHASHSEED=0
- Remove timeout wrapper to reduce measurement overhead

Changes from PR #318 - VM snapshot support:
- Add Modules/nanvix_snapshot.S: assembly syscall helper (int 0x80, nr 35)
- Call nanvix_snapshot() after Py_Initialize when NANVIX_SNAPSHOT=1 is set
- Enables restore-boot in ~35ms (requires snapshot-capable nanvixd)
- Gated by NANVIX_SNAPSHOT=1 env var for backward compatibility
Copilot AI review requested due to automatic review settings April 5, 2026 04:37

Copilot AI left a comment

Pull request overview

This PR consolidates multiple Nanvix microvm/standalone startup optimizations and adds optional VM snapshot triggering to significantly reduce cold-boot and restore-boot times.

Changes:

  • Adjust Nanvix build flags for faster startup (non-PIE static linking, -O3, omit frame pointer, disable unwind tables, remove LTO, drop docstrings, enable computed gotos).
  • Optimize the standalone hello-world test harness to reduce ramfs size and interpreter startup overhead (minimal ramfs content, -S, PYTHONHASHSEED=0, --strip-all, host-binary relocation).
  • Add Nanvix VM snapshot trigger support (assembly syscall helper + env-gated call after initialization).

| File | Description |
|---|---|
| Modules/nanvix_snapshot.S | Adds an i686 int 0x80 syscall helper to trigger a Nanvix VM snapshot. |
| Modules/main.c | Calls nanvix_snapshot() after initialization when NANVIX_SNAPSHOT=1 on Nanvix. |
| Makefile.nanvix | Updates compile/link/configure flags, injects snapshot object into build, and streamlines the standalone test ramfs + invocation. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 1

Comment thread Makefile.nanvix
Comment on lines +318 to 322
  ./bin/nanvixd.elf \
  -bin-dir ./bin -ramfs /tmp/rootfs.img \
  -- ./bin/python3.12 \
- "-B /test_hello.py;PYTHONHOME=/ PYTHONDONTWRITEBYTECODE=1" \
+ "-B -S /test_hello.py;PYTHONHOME=/ PYTHONDONTWRITEBYTECODE=1 PYTHONHASHSEED=0" \
  < /dev/null > /tmp/cpython_test.log 2>&1; \

Copilot AI Apr 5, 2026

Removing the timeout 120 wrapper means the standalone hello test can now hang indefinitely if nanvixd.elf or the guest stalls. This is a reliability regression for make test/CI runs; consider keeping a timeout (even if implemented differently) or updating nanvixd invocation to enforce a max runtime. Also, the later failure message still mentions "timed out" even though the wrapper was removed, which can mislead debugging.
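
One way to keep a bound on runtime "implemented differently" is a small launcher that forks, execs the command, and polls for completion until a deadline. The sketch below is an assumption-laden illustration, not part of this PR: `run_with_deadline`, `RUN_TIMED_OUT`, and `RUN_ERROR` are hypothetical names. It also reintroduces a parent process, as `timeout` did, so it addresses the hang risk rather than the measurement overhead.

```c
#define _POSIX_C_SOURCE 200809L /* clock_gettime, kill, nanosleep */

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define RUN_ERROR (-1)     /* fork/wait failed or child died by signal */
#define RUN_TIMED_OUT (-2) /* child was killed at the deadline */

/* Monotonic clock in seconds, unaffected by wall-clock adjustments. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Runs argv[0] with the given arguments and waits at most timeout_sec.
 * Returns the child's exit status, RUN_TIMED_OUT, or RUN_ERROR.
 */
int run_with_deadline(char *const argv[], int timeout_sec)
{
    pid_t pid = fork();
    if (pid < 0)
        return RUN_ERROR;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    const double deadline = now_seconds() + (double)timeout_sec;

    for (;;) {
        int status;
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : RUN_ERROR;
        if (done < 0 && errno != EINTR)
            return RUN_ERROR;
        if (now_seconds() >= deadline) {
            kill(pid, SIGKILL);       /* stop the stalled child */
            waitpid(pid, &status, 0); /* reap it to avoid a zombie */
            return RUN_TIMED_OUT;
        }
        nanosleep(&tick, NULL);
    }
}
```

A caller can then report "timed out" only when the function actually returns `RUN_TIMED_OUT`, which keeps the failure message accurate.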

@ppenna ppenna closed this May 5, 2026
@ppenna ppenna deleted the enhancement/startup-performance-optimizations branch May 5, 2026 17:41