feat: rename iaa_fallback_igzip to igzip_fallback, extend to QAT, default=1#57
Open
asonje wants to merge 26 commits into
Open
feat: rename iaa_fallback_igzip to igzip_fallback, extend to QAT, default=1#57asonje wants to merge 26 commits into
asonje wants to merge 26 commits into
Conversation
ISAL's stateful inflate pre-loads input in 8-byte word chunks into a 64-bit shift register (read_in). After reaching ISAL_BLOCK_FINISH for raw deflate (window_bits < 0), read_in_length still holds the bit count of bytes fetched beyond the true stream end. Shifting right by 3 gives the exact over-consumed byte count for every avail_in scenario: avail_in in [1,7] : previous avail_in<8 heuristic approximated this avail_in == 0 : previous heuristic was blind; now handled avail_in == 8 : gap in both old heuristics; now handled avail_in > 8 : multi-frame continuation; now handled The corrected consumed count is reported back to the caller via *input_length so that avail_in / next_in on the zlib stream are advanced accurately, preventing downstream compressed-size and CRC mismatches (reproduces as ZipException in Java ZipInputStream). Boundary guard updated: skip the raw-boundary fallback when *tofixed==1 because the correction has already accounted for any remaining bytes, so non-zero remaining is legitimate trailing data rather than un-corrected over-consumption. BLOCK_INPUT_DONE path (output-buffer-limited mid-stream) retains the existing Z_DATA_ERROR->zlib-fallback behaviour unchanged. Previously attempted stateless-probe approach removed entirely; the read_in_length field is simpler, exact, and has no multi-frame failure mode. Validated: 8/8 IGZIPInflateRegressionTest pass; full make run passes with only the pre-existing ordering-sensitive case 37961; JAR repro (250 entries, scan + build-init) passes clean.
…emove stale post-reset pre-set
The old name 'trailer_overconsumption_fixed' was a holdover from the
avail_in < 8 heuristic era. Since EXP-6f the fix is driven entirely by
isal_inflate.read_in_length >> 3; the flag only signals that a correction
was applied *within the current inflate call*.
Rename the field, its parameter aliases, and all local variables to
read_in_correction_applied to match the actual semantics.
Also remove the stale block in inflateReset:
if (was_igzip_path) {
inflate_settings->trailer_overconsumption_fixed = 1;
}
This block predated the read_in_length fix. It pre-armed the flag so
that the old avail_in < 8 boundary guard would not fire on the first
call after a reset. That logic was correct under the old heuristic but
is actively wrong now: ResetUncompressIGZIP already zeros the flag;
re-arming it to 1 would suppress the boundary guard on exactly the call
where it is needed most (first call after reset on a raw stream).
Remove was_igzip_path (now unused after the block removal) and inline
the INPUT_DONE log expression to eliminate an unused-variable warning
when DEBUG_LOG=OFF.
No behaviour change for correct streams; test suite: 8/8 IGZIP inflate
regressions PASS, only pre-existing 37961 flaky failure remains.
…STICS build flag
Statistics previously tracked QAT and IAA paths but left IGZIP with
commented-out placeholders. Two problems with those placeholders: the
enum values DEFLATE_IGZIP_COUNT, DEFLATE_IGZIP_ERROR_COUNT,
INFLATE_IGZIP_COUNT, INFLATE_IGZIP_ERROR_COUNT did not exist, and the
stat_names array had no entries for them.
Add the four missing Statistic enum values (after the IAA entries,
before ZLIB, matching the QAT/IAA grouping pattern) and their
corresponding stat_names strings. Uncomment the INCREMENT_STAT calls
in the deflate and inflate IGZIP dispatch blocks. The error condition
mirrors IAA/QAT: unconditional IGZIP_COUNT on every dispatch, and
IGZIP_ERROR_COUNT when ret != 0 (fallback or hard error).
Enable ENABLE_STATISTICS=ON in cmake.txt so the counters compile in by
default.
Validation (OpenSearch opensearch-core-3.3.1.jar, 250-entry JAR repro,
use_igzip_uncompress=1, log_stats_samples=50):
Thread (main): inflate_count=750 igzip=749 igzip_errors=0 zlib=0
— 99.9% of inflate calls served by IGZIP with zero errors or
zlib fallbacks; one call is the initial probe before path selection.
When IAA inflate or deflate fails (ret != 0), and use_igzip_uncompress / use_igzip_compress is enabled, the new iaa_fallback_igzip config flag (default 0) routes the retry to IGZIP before allowing the call to fall through to software zlib. Motivation: IAA is stateless and leaves input unconsumed on failure, so IGZIP can retry from the same position at no extra cost. This keeps hardware-accelerated decompression in play for inputs that IAA rejects but IGZIP can handle (e.g. streams that fail IsIAADecompressible or exceed IAA's single-call constraint). Implementation: - New ConfigOption IAA_FALLBACK_IGZIP added between IGNORE_ZLIB_DICTIONARY and LOG_LEVEL; default 0 (opt-in), range [0,1]. - inflate(): after IAA dispatch block, if path_selected==IAA && ret!=0 && configs[IAA_FALLBACK_IGZIP] && igzip_available, reset input_len/output_len (IAA may have modified them on failure), clear read_in_correction_applied, and run IGZIPRunInflateAndSelectPathAction with the same path-action handling as the normal IGZIP dispatch. Falls through to zlib if IGZIP also fails. - deflate(): identical pattern; initialises isal_strm via InitCompressIGZIP if needed (matching normal IGZIP deflate path). - Both fallback blocks are wrapped in #ifdef USE_IGZIP and guarded by igzip_available so they compile away and are unreachable without IGZIP. - Existing INFLATE/DEFLATE_IGZIP_COUNT stats capture fallback calls. Test suite: 44756/44757 PASS (37961 pre-existing flaky only).
Signed-off-by: Olasoji <olasoji.denloye@intel.com>
- SupportedOptionsIGZIPCompress: always returns true (ISA-L natively supports NO_FLUSH/SYNC_FLUSH/FULL_FLUSH/FINISH) - SupportedOptionsIGZIPUncompress: always returns true (remove dead zero-input new-stream guard) - IGZIPShouldFallbackDeflate + IsIGZIPSyncFlush: removed entirely; both streaming_flush_with_input and empty_sync_flush_reentry cases were artificial constraints inherited from one-shot IAA/QAT model - CompressIGZIP: add ZSTATE_NEW_HDR guard for empty SYNC_FLUSH calls; ISA-L emits sync bytes unconditionally, guard returns 0 progress so caller sees Z_BUF_ERROR matching zlib semantics - deflateSetDictionary: reject mid-stream calls when an accelerator path is active; underlying zlib stream is unadvanced so orig_deflateSetDictionary would incorrectly return Z_OK - inflate active-stream no-input block: remove dead - tests: remove 6 stale path==ZLIB assertions from 3 regression tests that assumed IGZIP would fall back on streaming flushes
When IAA compress fails and IGZIP is used as a fallback, path_selected remained IAA. The return-code block below then used the IAA/QAT one-shot logic (avail_in==0 -> Z_STREAM_END) instead of IGZIP streaming logic (IsIGZIPDeflateFinished). ISA-L fills its internal level buffer and returns after consuming all input but producing only partial output; avail_in reaches 0 while the stream is not yet at ZSTATE_END. The result was Z_STREAM_END returned with ~48KB of compressed data still buffered internally, corrupting every document written during indexing. Fix: set path_selected = IGZIP immediately after the fallback completes so that the return-code logic correctly calls IsIGZIPDeflateFinished and returns Z_OK for partial output, allowing the caller to drain the stream.
When iaa_prepend_empty_block=0 (the default), the old code returned true for all raw deflate and gzip inputs unconditionally, causing QPL's overconsumption bug to surface for Java ZipInputStream callers. JAR loading feeds inflate() with 512-byte chunks where avail_in > actual compressed size; QPL reports total_in == available_in, causing the Java-level ZipException that kills OpenSearch at launch. Fix (ported from exp-8): replace the IAA_PREPEND_EMPTY_BLOCK=0 branch with a 512-byte threshold guard. ZipInputStream always feeds <=512-byte chunks; Lucene stored-field reads always provide the exact compressed size which is typically well above 512 bytes. The few Lucene entries below the threshold fall back to IGZIP, which is also correct. Also removes PrependedEmptyBlockPresent() which is no longer called from the decompression path. The compress-side marker write in CompressIAA is unaffected (still controlled by iaa_prepend_empty_block).
Add USE_IGZIP (ON/OFF) and ISAL_PATH to the cmake options list, and add a Requirements for IGZIP section with a link to ISA-L. Signed-off-by: Olasoji <olasoji.denloye@intel.com>
pre_avail_in is only referenced inside #ifdef USE_IGZIP blocks. Without this guard, builds with USE_IGZIP=OFF trigger an -Wunused-variable error under -Werror, breaking CI. Signed-off-by: Olasoji <olasoji.denloye@intel.com>
Signed-off-by: Olasoji <olasoji.denloye@intel.com>
cmake.txt contained a local build command with hardcoded absolute paths (/home/sdp/...). Remove it from the repo and add to .gitignore. deflateSetDictionary: deflate_stream_settings.Get(strm) can return nullptr if called without a prior deflateInit. Guard the mid-stream rejection check to avoid a null dereference in that case, returning Z_STREAM_ERROR consistent with zlib behaviour. Addresses review comments on PR intel#54 (matt-welch, Copilot). Signed-off-by: Olasoji <olasoji.denloye@intel.com>
SupportedOptionsIGZIPCompress, SupportedOptionsIGZIPUncompress, and IGZIPShouldFallbackDeflate unconditionally returned fixed values with all parameters voided. Remove them entirely and inline the fixed values at all 8 call sites in zlib_accel.cpp. Addresses review comment on PR intel#54 (matt-welch). Signed-off-by: Olasoji <olasoji.denloye@intel.com>
The flag is per-stream-session (set once when the read_in_length over-consumption correction fires; cleared only on inflateReset), not per-call as the comment incorrectly stated. Fix the struct comment to reflect this, and remove the erroneous reset to 0 on the IAA->IGZIP fallback path which was inconsistent with the primary IGZIP path and the inflateReset behavior. Addresses review comment on PR intel#54 (Copilot, matt-welch). Signed-off-by: Olasoji <olasoji.denloye@intel.com>
The empty stored-block marker approach was abandoned in favour of a 512-byte minimum input length threshold in IsIAADecompressible. The config option is retained for backward compatibility but has no effect on decompression and will be removed in a future release. Rationale for the threshold approach: QPL hardware always consumes all available_in bytes regardless of where the BFINAL=1 token falls (overconsumption bug). Java ZipInputStream feeds <=512-byte chunks when csize is unknown, triggering overconsumption errors. Lucene stored-field reads always supply the exact compressed size (>512 bytes), where consuming all input is correct. Document the deprecation in README and add a comment in iaa.cpp explaining the history and the replacement mechanism. Addresses review comment on PR intel#54 (matt-welch). Signed-off-by: $(git config user.name) <$(git config user.email)>
Add IAAFallbackIGZIPTest with three tests exercising the fallback
path for both deflate and inflate:
- DeflateUsesIGZIPWhenIAAFailsAndFallbackEnabled: configures
USE_IAA_COMPRESS+USE_IGZIP_COMPRESS+IAA_FALLBACK_IGZIP=1 and
asserts the stream lands on IGZIP (or IAA if hardware present).
- DeflateDoesNotUseIGZIPWhenFallbackDisabled: same IAA+IGZIP config
but IAA_FALLBACK_IGZIP=0; asserts IGZIP is never selected.
- InflateUsesIGZIPWhenIAAFailsAndFallbackEnabled: data compressed
with IGZIP, decompressed via USE_IAA_UNCOMPRESS+IAA_FALLBACK_IGZIP=1;
asserts round-trip correctness and IGZIP (or IAA) execution path.
On machines without IAA hardware the QPL init fails (non-zero return),
which naturally exercises the fallback branch without mocking. On
machines that do have IAA hardware and succeed, the assertions accept
either IAA or IGZIP so the tests remain valid in both environments.
Tests are guarded by #ifdef USE_IAA and #ifdef USE_IGZIP (both are
in scope here via the outer USE_IGZIP guard around the regression
block).
Addresses should-fix intel#6 from PR intel#54 review (matt-welch).
Signed-off-by: Olasoji <olasoji.denloye@intel.com>
Three IGZIPDeflateRegressionTest tests originally asserted ASSERT_EQ(path, ZLIB) because IGZIPShouldFallbackDeflate redirected SYNC_FLUSH calls to the zlib path. Commit f7ee0ec (lift Z_FINISH-only restriction) removed those assertions when IGZIP gained native SYNC_FLUSH support, but left no replacement. Now that IGZIPShouldFallbackDeflate is removed entirely, restore the path assertions with the new expected behavior: IGZIP must remain on the IGZIP path through Z_NO_FLUSH, Z_SYNC_FLUSH, and Z_FINISH calls. Changes: - ResetMustNotStallSyncFlushOnSameStream: assert IGZIP after Z_NO_FLUSH and Z_SYNC_FLUSH on each cycle. - RepeatedEmptySyncFlushMustEventuallyReportNoProgress: assert IGZIP after Z_NO_FLUSH and on every Z_SYNC_FLUSH iteration. - SyncFlushWithInputMustFallbackToZlibForStreamSafety: renamed to SyncFlushWithInputMustStayOnIGZIPPath (the old name described the removed fallback behavior); assert IGZIP after Z_SYNC_FLUSH and on every Z_FINISH iteration. Addresses review comment intel#7 on PR intel#54 (matt-welch). Signed-off-by: Olasoji <olasoji.denloye@intel.com>
…oupling intel#8 — Explain why read_in_correction_applied is cleared on inflateReset: isal_inflate_reset clears the internal read_in/read_in_length buffer, so any over-consumption correction from the previous stream session no longer applies. The flag must be reset so the new session can fire the correction fresh if needed. This is intentional; keeping it set across a reset would incorrectly suppress the correction on data that has not yet over-consumed. intel#9 — Document ISA-L version for ZSTATE_NEW_HDR coupling: The SYNC_FLUSH no-progress guard directly accesses isal_strm->internal_state.state == ZSTATE_NEW_HDR. Note that this was validated against ISA-L v2.32.0 (commit c196241) so future ISA-L updates can be audited for state machine changes. Addresses review comments intel#8 and intel#9 on PR intel#54 (matt-welch). Signed-off-by: Olasoji <olasoji.denloye@intel.com>
…ats tests, inflate fallback=0 test - README.md: document iaa_fallback_igzip config option - igzip.h: fix read_in_correction_applied comment to reflect per-stream-session semantics (not per-call) - tests/zlib_accel_test.cpp: add DEFLATE_IGZIP_COUNT/INFLATE_IGZIP_COUNT branches to CompressDecompress stats verification - tests/zlib_accel_test.cpp: add InflateDoesNotUseIGZIPWhenFallbackDisabled as companion to the existing inflate=1 test
…corruption) isal_deflate_reset preserves gzip_flag. After the first chunk, ISA-L internally changes gzip_flag from IGZIP_ZLIB (3) to IGZIP_ZLIB_NO_HDR (4) to suppress the header on continuation calls. Without a reset, the reused stream produced headerless chunks, causing Java's Inflater (nowrap=false) to report 'incorrect header check' / 'unknown compression method' on every subsequent SSTable chunk — resulting in CorruptSSTableException at read time. Fix: add ResetCompressIGZIP() which wraps isal_deflate_reset + calls ConfigureDeflateWindow to restore gzip_flag and hist_bits from window_bits. Replace the inline isal_deflate_reset call in deflateReset() with this. Regression test: ResetMustRestoreZlibHeaderForSubsequentChunks compresses 5 independent chunks via a reused IGZIP z_stream (deflateReset between each) and decompresses each with a fresh zlib inflater — would have failed before this fix on chunk 2 onward. Reproduced via: Cassandra 5 mix workload (80% reads, 20% updates) with IGZIP compress enabled, writing to a zlib-format SSTable.
- Rename IAA_FALLBACK_IGZIP -> IGZIP_FALLBACK (single generic key) - Extend fallback condition in deflate() and inflate() to cover QAT: (path_selected == IAA || path_selected == QAT) - Move failed_accelerator declaration before IGZIP retry call in both blocks - Log messages updated to identify the failing accelerator (IAA or QAT) - Flip default from 0 -> 1 (fallback is now on by default when IGZIP enabled) - Update README: key name, description, default - Add QATFallbackIGZIPTest suite (4 tests mirroring IAAFallbackIGZIPTest) Full make run: 51006 tests, 44766 passed, remainder expected skips.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR generalizes the “accelerator → IGZIP” fallback so it applies to both IAA and QAT, renames the configuration knob to igzip_fallback, and changes its default to enabled.
Changes:
- Rename
IAA_FALLBACK_IGZIP/iaa_fallback_igziptoIGZIP_FALLBACK/igzip_fallback, and flip the default to1. - Extend deflate/inflate fallback logic to trigger when either IAA or QAT fails, and adjust messages to identify which accelerator failed.
- Add a QAT fallback test suite mirroring the existing IAA coverage, and update README configuration docs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
zlib_accel.cpp |
Extend fallback condition to cover QAT and update selection/fallback reasons/messages. |
config/config.h |
Rename config enum key to IGZIP_FALLBACK. |
config/config.cpp |
Update default value and config-file key name to igzip_fallback. |
tests/zlib_accel_test.cpp |
Update IAA fallback tests to new config key; add QAT fallback tests. |
README.md |
Document renamed key and new default. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+441
to
+443
| SetDeflatePath( | ||
| deflate_settings, strm, IGZIP, | ||
| std::string(failed_accelerator) + " failed, IGZIP fallback"); |
Comment on lines
+778
to
+780
| SetInflatePath( | ||
| inflate_settings, strm, ZLIB, | ||
| std::string(failed_accelerator) + "->IGZIP fallback: Z_NEED_DICT"); |
Comment on lines
+782
to
+784
| SetInflatePath( | ||
| inflate_settings, strm, ZLIB, | ||
| std::string(failed_accelerator) + "->IGZIP fallback: raw trailer"); |
Comment on lines
+786
to
+788
| SetInflatePath( | ||
| inflate_settings, strm, ZLIB, | ||
| std::string(failed_accelerator) + "->IGZIP fallback: raw boundary"); |
Comment on lines
+791
to
+793
| SetInflatePath(inflate_settings, strm, IGZIP, | ||
| "IAA failed, IGZIP fallback succeeded"); | ||
| std::string(failed_accelerator) + | ||
| " failed, IGZIP fallback succeeded"); |
SetDeflatePath/SetInflatePath take const char*, but commit 8120b29 passed std::string temporaries (string concatenation) which is a hard compile error. The reason parameter is (void)'d in both functions, so replace the broken concatenations with plain string literals and remove the now-unused failed_accelerator variables. The error was masked because the affected code is inside #ifdef USE_IGZIP blocks that are only reached when both an accelerator (IAA or QAT) and IGZIP are compiled in together. Builds with fewer accelerators enabled did not trigger it.
- Add ASSERT_EQ(ret, Z_STREAM_END) for deflate/inflate calls in QATFallbackIGZIPTest::DeflateDoesNotUseIGZIPWhenFallbackDisabled and InflateDoesNotUseIGZIPWhenFallbackDisabled so unchecked return codes are caught as test failures. - Guard QATFallbackIGZIPTest block with #if defined(USE_QAT) && defined(USE_IGZIP) since these tests exercise the QAT->IGZIP fallback path which requires both backends to be compiled in; USE_QAT alone is insufficient.
Comment on lines
+440
to
+441
| SetDeflatePath(deflate_settings, strm, IGZIP, | ||
| "IAA failed, IGZIP fallback"); | ||
| "accelerator failed, IGZIP fallback"); |
Comment on lines
770
to
+774
| if (path_action == IGZIP_INFLATE_PATH_FALLBACK_NEED_DICT) { | ||
| Log(LogLevel::LOG_ERROR, " strm=", static_cast<void*>(strm), | ||
| " source=igzip (iaa fallback)", " total_in=", strm->total_in, | ||
| " total_out=", strm->total_out, " adler=", strm->adler, "\n"); | ||
| " source=igzip (accelerator fallback)", | ||
| " total_in=", strm->total_in, " total_out=", strm->total_out, | ||
| " adler=", strm->adler, "\n"); |
Comment on lines
2459
to
+2462
| // With fallback disabled, IGZIP must not be selected for an IAA-configured | ||
| // stream; it should fall through to zlib. | ||
| const ExecutionPath path = GetDeflateExecutionPath(&stream); | ||
| EXPECT_NE(path, IGZIP) << "IGZIP must not be used when iaa_fallback_igzip=0"; | ||
| EXPECT_NE(path, IGZIP) << "IGZIP must not be used when igzip_fallback=0"; |
Comment on lines
2545
to
+2548
| // With fallback disabled, IGZIP must not be selected for an IAA-configured | ||
| // inflate stream; it should fall through to zlib. | ||
| const ExecutionPath path = GetInflateExecutionPath(&dstream); | ||
| EXPECT_NE(path, IGZIP) << "IGZIP must not be used when iaa_fallback_igzip=0"; | ||
| EXPECT_NE(path, IGZIP) << "IGZIP must not be used when igzip_fallback=0"; |
Comment on lines
2389
to
+2393
| #ifdef USE_IAA | ||
| // IAA->IGZIP fallback tests. | ||
| // On machines without IAA hardware, CompressIAA/UncompressIAA return non-zero, | ||
| // which naturally triggers the fallback path. These tests verify: | ||
| // - When iaa_fallback_igzip=1: the stream lands on IGZIP after IAA fails. | ||
| // - When iaa_fallback_igzip=0: the stream falls through to zlib, not IGZIP. | ||
| // - When igzip_fallback=1: the stream lands on IGZIP after IAA fails. |
| SetConfig(IGZIP_FALLBACK, 0); | ||
| DestroyBlock(input); | ||
| } | ||
| #endif // USE_IAA |
…reasons Address third-round Copilot review comments on PR intel#57: - SetDeflatePath: use path_selected ternary to say "IAA failed" vs "QAT failed" instead of the generic "accelerator failed" literal. - SetInflatePath (accelerator->IGZIP fallback block): declare failed_accelerator const char* from path_selected, then use ternaries in all four path-action branches so log and reason strings name the failing accelerator (IAA/QAT). - IAAFallbackIGZIPTest.DeflateDoesNotUseIGZIPWhenFallbackDisabled: capture and assert Z_STREAM_END from deflate() so compression failures are caught. All 44766 tests pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Extends the IGZIP fallback mechanism to cover QAT in addition to IAA,
and renames the config key from
iaa_fallback_igzipto the genericigzip_fallback.Changes:
IAA_FALLBACK_IGZIP→IGZIP_FALLBACK(single key covers both accelerators)deflate()andinflate()to trigger when either IAA or QAT failsQATFallbackIGZIPTestsuite (4 tests mirroring the existingIAAFallbackIGZIPTest)No behavior change for existing IAA-only deployments with
igzip_fallback=1in their config file. Deployments that relied on the old default of 0 and want to preserve strict hardware-only behavior should setigzip_fallback=0explicitly.