Summary
unwrap() does payload = bytes(mv[header_end:]), which copies the full payload out of the CK frame on every deserialize, including every L1 hit (L1 stores the framed envelope). For a 300 MB DataFrame that is a 300 MB memcpy per read before the downstream Arrow zero-copy path even starts.
Evidence
- src/cachekit/serializers/wrapper.py:122 has payload = bytes(mv[header_end:])
- L1 stores framed bytes: decorators/wrapper.py:1160, :1015; L1-hit deserialize at :656
- ArrowSerializer.deserialize already does memoryview(data) at arrow_serializer.py:273, so it accepts a buffer
Impact
This is the read-path RSS lever that actually exists today (mmap does not; see #171). It is one avoidable full-payload copy on the hot path, and it is currently masked by the regression test, which starts tracemalloc only after the copy has been made.
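To illustrate how the placement of tracemalloc.start() can hide the copy, here is a minimal sketch. The 4-byte HEADER, unwrap_copy and consume are hypothetical stand-ins, not the real CK frame layout or cachekit functions, and the 8 MiB payload is just a small proxy for a large DataFrame.

```python
import tracemalloc

HEADER = b"CK\x00\x01"          # hypothetical header, not the real CK frame layout
PAYLOAD_SIZE = 8 * 1024 * 1024  # 8 MiB stand-in for a large payload


def unwrap_copy(frame: bytes) -> bytes:
    """Mimics the current behaviour: copy the payload out of the frame."""
    mv = memoryview(frame)
    return bytes(mv[len(HEADER):])


def consume(payload) -> int:
    """Stand-in for the downstream deserializer; allocates nothing large."""
    return len(payload)


frame = HEADER + bytes(PAYLOAD_SIZE)

# Masked measurement: tracing starts after the copy already exists.
payload = unwrap_copy(frame)
tracemalloc.start()
consume(payload)
_, masked_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
del payload

# Inclusive measurement: tracing starts before unwrap, so the copy is counted.
tracemalloc.start()
consume(unwrap_copy(frame))
_, inclusive_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
```

In this sketch the masked peak stays tiny while the inclusive peak is at least the payload size, which is the gap a tracemalloc-inclusive test would expose.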
Fix
Return mv[header_end:] (a memoryview slice) instead of bytes(...), and propagate the buffer type through cache_handler.deserialize_data. Note that the slice pins the whole frame buffer for its lifetime, so verify L2 retention semantics. Add a regression assertion that starts tracemalloc before unwrap() so the copy is counted.
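A minimal sketch of the proposed change and its pinning side effect, under stated assumptions: HEADER_LEN, unwrap_view and traced_peak are hypothetical names for illustration, and the fixed 4-byte header is not the real CK frame layout. It shows that the slice shares memory with the frame, that it holds a reference to the whole frame, and what an inclusive assertion might look like.

```python
import tracemalloc

HEADER_LEN = 4  # hypothetical fixed header size; the real CK header may differ


def unwrap_view(frame) -> memoryview:
    """Return the payload as a zero-copy view into the frame."""
    mv = memoryview(frame)
    return mv[HEADER_LEN:]


def traced_peak(fn, *args) -> int:
    """Peak traced bytes while fn runs, with tracing started before the call."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


payload_size = 8 * 1024 * 1024
frame = b"CK\x00\x01" + bytes(payload_size)

view = unwrap_view(frame)
pins_frame = view.obj is frame        # the slice references the whole frame
payload_bytes = view.nbytes           # payload length, header excluded

# Inclusive comparison: the bytes() path allocates the payload, the view does not.
copy_peak = traced_peak(lambda f: bytes(memoryview(f)[HEADER_LEN:]), frame)
view_peak = traced_peak(unwrap_view, frame)

view.release()  # drop the export explicitly once the consumer is done
```

Because the view references the frame object, the frame (header included) cannot be freed until every view over it is released or garbage collected. That is the retention behaviour to check for L2 before switching the return type.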