SerializationWrapper.unwrap copies the full payload on every read (incl. every L1 hit) #162

@27Bslash6

Summary

unwrap() does payload = bytes(mv[header_end:]), which copies the full payload out of the CK frame on every deserialize, including every L1 hit (L1 stores the framed envelope). For a 300 MB DataFrame that is a 300 MB memcpy per read, before the downstream Arrow zero-copy path even starts.
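
A minimal sketch of the difference, for illustration only. The header bytes, frame layout and payload size below are made up and are not the real CK frame format; only bytes(mv[header_end:]) versus mv[header_end:] mirrors the line in question.

```python
import sys

# Hypothetical frame: a short made-up header followed by 1 MB of payload.
header = b"HDR!"
frame = header + bytes(1_000_000)
header_end = len(header)

mv = memoryview(frame)

copied = bytes(mv[header_end:])  # allocates a new bytes object and copies the payload
view = mv[header_end:]           # no copy: a window onto frame's existing buffer

same_content = view == copied    # memoryview compares by content
shares_buffer = view.obj is frame

print(sys.getsizeof(copied), sys.getsizeof(view))
```

The view is a small fixed-size object regardless of payload size, while the copy grows with the payload.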

Evidence

  • src/cachekit/serializers/wrapper.py:122: payload = bytes(mv[header_end:])
  • L1 stores framed bytes: decorators/wrapper.py:1160, :1015; L1-hit deserialize at :656
  • ArrowSerializer.deserialize already does memoryview(data) at arrow_serializer.py:273, so it accepts a buffer

Impact

This is the real read-path RSS lever that exists today (mmap does not; see #171). It is one avoidable full-payload copy on the hot path, currently masked by the regression test, which starts tracemalloc only after the copy has already happened.
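
To illustrate the masking effect, here is a rough sketch rather than the actual regression test. unwrap_copy is a hypothetical stand-in for the copying unwrap(), and the 8 MiB payload is arbitrary. tracemalloc only sees allocations made while tracing is active, so a window opened after the copy reports almost nothing.

```python
import tracemalloc

PAYLOAD_SIZE = 8 * 1024 * 1024  # arbitrary size for the demonstration


def unwrap_copy(frame, header_end):
    """Hypothetical stand-in for the copying unwrap()."""
    return bytes(memoryview(frame)[header_end:])


frame = b"HDR!" + bytes(PAYLOAD_SIZE)

# Window opened after the copy: the payload allocation is not traced.
payload = unwrap_copy(frame, 4)
tracemalloc.start()
_, peak_after = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Window that includes the copy: the payload allocation shows up in the peak.
tracemalloc.start()
payload = unwrap_copy(frame, 4)
_, peak_inclusive = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(peak_after, peak_inclusive)
```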

Fix

Return mv[header_end:] (a memoryview slice) instead of bytes(...), and propagate the buffer through cache_handler.deserialize_data. Note: the slice pins the frame buffer for its lifetime, so verify L2 retention semantics. Add a tracemalloc-inclusive assertion.
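
A sketch of what the pinning caveat means in practice, under stated assumptions: unwrap_view is a hypothetical zero-copy variant, not the existing function, and a bytearray stands in for whatever buffer type actually backs the frame. While the slice is alive the underlying buffer cannot be resized, and the whole frame stays in memory.

```python
def unwrap_view(frame, header_end):
    """Hypothetical zero-copy variant: return a memoryview slice of the payload."""
    return memoryview(frame)[header_end:]


# Made-up frame; a bytearray is used so the pinning is observable.
frame = bytearray(b"HDR!") + bytearray(b"payload-bytes")
payload = unwrap_view(frame, 4)
payload_bytes = bytes(payload)  # copy taken only to inspect the content

try:
    frame.extend(b"more")       # resizing is refused while a view is exported
    resized_while_pinned = True
except BufferError:
    resized_while_pinned = False

payload.release()               # dropping the view unpins the frame
frame.extend(b"more")           # now allowed
```

The same lifetime coupling applies to immutable frames: the slice holds a reference to the whole frame, so the header and any trailing bytes stay resident until the slice is released or garbage collected.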

Labels: performance (Performance improvements), python (Python library)