
GH-3493: Optimize PlainValuesReader with direct ByteBuffer reads and batch methods#3560

Open
iemejia wants to merge 2 commits into apache:master from iemejia:perf-plain-bulk-batch

Conversation

@iemejia
Member

@iemejia iemejia commented May 13, 2026

Summary

  • Replace LittleEndianDataInputStream wrapper with direct ByteBuffer reads using LITTLE_ENDIAN byte order in PlainValuesReader, eliminating per-value virtual dispatch overhead (4 in.read() calls + manual bit shifts → single ByteBuffer.get*() JVM intrinsic).
  • Add batch read methods (readIntegers, readFloats, readLongs, readDoubles) that use bulk typed-buffer view reads (e.g. buffer.asIntBuffer().get(dest, offset, count)) to bypass per-value bounds checks and position updates.
  • Page data is obtained as a single contiguous ByteBuffer via ByteBufferInputStream.slice(available), which handles both single-buffer (zero-copy view) and multi-buffer (copy into contiguous buffer) cases transparently.
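As a rough sketch (not the actual PlainValuesReader code), the per-value change amounts to replacing stream-style byte assembly with a single get on a LITTLE_ENDIAN-ordered buffer, which HotSpot can compile to one intrinsic load:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianReadSketch {
  public static void main(String[] args) {
    // Little-endian encoding of 0x12345678.
    byte[] page = {0x78, 0x56, 0x34, 0x12};

    // Before: four byte reads plus manual shifts, as a
    // LittleEndianDataInputStream-style wrapper would do.
    int manual = (page[0] & 0xFF)
        | ((page[1] & 0xFF) << 8)
        | ((page[2] & 0xFF) << 16)
        | ((page[3] & 0xFF) << 24);

    // After: one getInt() on a LITTLE_ENDIAN buffer.
    ByteBuffer buf = ByteBuffer.wrap(page).order(ByteOrder.LITTLE_ENDIAN);
    int direct = buf.getInt();

    System.out.println(manual == direct && direct == 0x12345678); // true
  }
}
```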

Benchmark Results

Per-value read optimization (100k INT32 values, JMH):

  Pattern           Before (ops/s)   After (ops/s)   Speedup
  SEQUENTIAL        427,630,411   5,397,298,681     12.6x
  RANDOM            431,052,072   5,437,926,758     12.6x
  LOW_CARDINALITY   423,443,685   5,477,810,011     12.9x
  HIGH_CARDINALITY  426,405,891   5,485,493,740     12.9x

Batch read methods (PlainDecodingBenchmark, 100K values, pre-allocated arrays):

  Type    Per-value (ops/s)  Batch (ops/s)  Speedup
  INT32        5,454M          28,256M       +418%
  FLOAT        5,407M          25,798M       +377%
  INT64        5,408M           8,088M        +50%
  DOUBLE       7,404M           7,965M         +8%

All 573 parquet-column tests pass.

iemejia added 2 commits May 13, 2026 14:11
Replace the LittleEndianDataInputStream wrapper with direct ByteBuffer
access using LITTLE_ENDIAN byte order in PlainValuesReader. Each
read{Integer,Long,Float,Double}() previously dispatched through 4
in.read() calls per value and assembled the result with manual bit
shifts; it now compiles to a single ByteBuffer get*() JVM intrinsic.

In initFromPage, the page data is obtained as a single contiguous
ByteBuffer via ByteBufferInputStream.slice(available). The
ByteBufferInputStream.slice() method handles both single-buffer
(zero-copy view) and multi-buffer (copy into contiguous buffer) cases
transparently. In practice page data is almost always a single
contiguous buffer.
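The single- vs multi-buffer behavior described above can be sketched with plain NIO. This is a simplification, not the actual ByteBufferInputStream.slice implementation, and the helper name `contiguous` is hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.List;

public class SliceSketch {
  // Hypothetical helper: return a contiguous buffer over the page bytes.
  // One backing buffer -> zero-copy slice; several -> copy into one buffer.
  static ByteBuffer contiguous(List<ByteBuffer> buffers, int length) {
    if (buffers.size() == 1) {
      ByteBuffer only = buffers.get(0).duplicate(); // zero-copy view
      only.limit(only.position() + length);
      return only.slice();
    }
    ByteBuffer copy = ByteBuffer.allocate(length);
    for (ByteBuffer b : buffers) {
      copy.put(b.duplicate());
    }
    copy.flip();
    return copy;
  }

  public static void main(String[] args) {
    ByteBuffer a = ByteBuffer.wrap(new byte[] {1, 2});
    ByteBuffer b = ByteBuffer.wrap(new byte[] {3, 4});
    ByteBuffer merged = contiguous(List.of(a, b), 4);
    System.out.println(merged.remaining()); // 4
  }
}
```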

Benchmark (IntEncodingBenchmark.decodePlain, 100k INT32 values per
invocation, JMH -wi 3 -i 5 -f 1):

  Pattern           Before (ops/s)   After (ops/s)   Speedup
  SEQUENTIAL        427,630,411   5,397,298,681     12.6x
  RANDOM            431,052,072   5,437,926,758     12.6x
  LOW_CARDINALITY   423,443,685   5,477,810,011     12.9x
  HIGH_CARDINALITY  426,405,891   5,485,493,740     12.9x

The improvement is consistent regardless of data distribution because
the bottleneck was entirely in the dispatch overhead. All four numeric
plain reader types (int, long, float, double) benefit equally.

All 573 parquet-column tests pass.
… reads

Add readIntegers/readFloats/readLongs/readDoubles batch methods to all
PlainValuesReader inner classes. All four types use bulk typed-buffer
view reads (e.g. buffer.asIntBuffer().get(dest, offset, count)) which
bypass per-value bounds checks and position updates.
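A minimal sketch of that bulk pattern, assuming a page buffer already positioned at the start of the values (standalone example, not the PlainValuesReader code itself):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class BatchReadSketch {
  public static void main(String[] args) {
    // Hypothetical page holding four little-endian INT32 values.
    ByteBuffer page = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
    for (int i = 1; i <= 4; i++) {
      page.putInt(i * 10);
    }
    page.flip();

    int[] dest = new int[4];
    // Bulk copy through a typed view: one bounds check for the whole
    // transfer instead of one per value.
    page.asIntBuffer().get(dest, 0, dest.length);
    // The IntBuffer view has its own position, so advance the byte
    // buffer manually if later reads continue from the same buffer.
    page.position(page.position() + dest.length * Integer.BYTES);

    System.out.println(Arrays.toString(dest)); // [10, 20, 30, 40]
  }
}
```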

Benchmark results (PlainDecodingBenchmark, 100K values, pre-allocated arrays):

  Type    Per-value (ops/s)  Batch (ops/s)  Speedup
  INT32        5,454M          28,256M       +418%
  FLOAT        5,407M          25,798M       +377%
  INT64        5,408M           8,088M        +50%
  DOUBLE       7,404M           7,965M         +8%