EPIC: Zero-copy mmap read path for the File backend (large-object read RSS) #171

@27Bslash6

Summary

The "zero-copy memory-mapped reads" advertised for compression=none do not exist: there is no pa.memory_map call anywhere, the File backend calls os.read on the whole file and then slices the result (two full heap copies), and the CK frame unwrap copies the payload again. Setting compression=none today only inflates the payload, with no read-RSS benefit. This epic builds the real path (measured target: ~0.32x read RSS).
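To make the copy problem concrete, here is a minimal sketch of the difference between slicing bytes and slicing a memoryview. It is illustrative only and is not the project's code: the buffer contents are made up, and only the 14-byte header length comes from this issue.

```python
HEADER_SIZE = 14  # header length quoted in this issue; contents below are dummy data

# Stand-in for what a whole-file os.read would return: header + payload.
raw = b"H" * HEADER_SIZE + b"payload-bytes"

# Slicing bytes allocates a second buffer and copies the payload into it.
copied = raw[HEADER_SIZE:]

# Slicing a memoryview only adjusts offset/length; it still points at raw's buffer.
view = memoryview(raw)[HEADER_SIZE:]

shares_buffer = view.obj is raw   # the view is backed by the original object
same_content = view == copied     # content comparison, no copy kept around
same_length = view.nbytes == len(copied)
```

Any consumer that accepts the buffer protocol can take `view` directly, which is why returning a memoryview from the unwrap step is a win even before mmap is involved.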

Sequenced sub-tasks

  1. SerializationWrapper.unwrap copies the full payload on every read (incl. every L1 hit) #162: SerializationWrapper.unwrap returns a memoryview (prerequisite; standalone win)
  2. File backend mmap read: FileBackend.get() returns a buffer backed by pa.memory_map/mmap + memoryview slice past the 14-byte header, instead of os.read+slice (backends/file/backend.py:134,167)
  3. Lazy / per-batch checksum — the eager full-body xxHash3 (arrow_serializer.py:279) faults every page and erases the mmap RSS win; make it optional/streaming for the mmap path
  4. compression=none default for the File backend (plaintext-scoped) while wire backends keep zstd — wire the coupling at create_cache_wrapper (decorators/wrapper.py:375 where config.backend is known) and extend the name-keyed serializer cache key (serializers/__init__.py:96) so File and Redis decorators don't share one zstd instance
  5. Raise File max_value_mb (currently rejects the ~300 MB motivating payload, file/config.py:75)
  6. Use the dead File header FLAGS field (backend.py:235, hardcoded 0) to record "uncompressed Arrow, mmappable"
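As a rough illustration of sub-task 2, the sketch below maps a file read-only with the standard-library mmap module and returns a memoryview that starts past the 14-byte header. The function name mmap_read, the dummy header bytes, and the demo payload are hypothetical; the real FileBackend.get() signature and header layout are not shown in this issue, and pa.memory_map could be used in place of mmap.

```python
import mmap
import os
import tempfile

HEADER_SIZE = 14  # header length from this issue; the field layout is not assumed


def mmap_read(path: str) -> tuple[mmap.mmap, memoryview]:
    """Map the file read-only; return (mapping, view of the bytes after the header).

    Hypothetical helper. The caller must release the view before closing the mapping.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after f is closed.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mapped, memoryview(mapped)[HEADER_SIZE:]


# Demo with a throwaway file: a dummy 14-byte header followed by a payload.
payload = b"arrow-ipc-bytes-would-go-here"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * HEADER_SIZE + payload)
    demo_path = tmp.name

mapped, body = mmap_read(demo_path)
try:
    body_matches = bytes(body) == payload  # bytes() copies; done here only to compare
    body_is_readonly = body.readonly
finally:
    body.release()  # release the exported buffer first, or close() raises BufferError
    mapped.close()
    os.remove(demo_path)
```

Pages of the mapping are only read in when touched, so the RSS benefit depends on downstream code not scanning the whole body eagerly, which is the concern sub-task 3 raises about the full-body checksum.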

Constraints

Metadata

Labels: enhancement, performance, python
