Skip to content

fuzz: cover deferred writing in chanmon_consistency#4465

Open
joostjager wants to merge 4 commits into
lightningdevkit:mainfrom
joostjager:chain-mon-internal-deferred-writes-with-fuzz
Open

fuzz: cover deferred writing in chanmon_consistency#4465
joostjager wants to merge 4 commits into
lightningdevkit:mainfrom
joostjager:chain-mon-internal-deferred-writes-with-fuzz

Conversation

@joostjager
Copy link
Copy Markdown
Contributor

@joostjager joostjager commented Mar 6, 2026

Adds fuzz coverage for #4351

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Mar 6, 2026

👋 I see @wpaulino was un-assigned.
If you'd like another reviewer assignment, please click here.

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.12%. Comparing base (75ac90a) to head (e0c5818).
⚠️ Report is 16 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4465      +/-   ##
==========================================
- Coverage   86.12%   86.12%   -0.01%     
==========================================
  Files         157      157              
  Lines      108824   108944     +120     
  Branches   108824   108944     +120     
==========================================
+ Hits        93727    93826      +99     
- Misses      12480    12499      +19     
- Partials     2617     2619       +2     
Flag Coverage Δ
tests 86.12% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from 51afc25 to 0aebb10 Compare March 19, 2026 07:51
@joostjager
Copy link
Copy Markdown
Contributor Author

Rebased

@joostjager
Copy link
Copy Markdown
Contributor Author

Will hold off on this until pre-existing fuzz failures #4496 and #4472 are in

@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from 0aebb10 to 6274ba0 Compare March 20, 2026 17:52
@joostjager
Copy link
Copy Markdown
Contributor Author

Prerequisites are in, rebased.

@TheBlueMatt
Copy link
Copy Markdown
Collaborator

This LGTM, why is it draft?

@joostjager
Copy link
Copy Markdown
Contributor Author

I first wanted to do a serious local run, but then it turned out there are so many pre-existing fuzz failures that it is hard to see what's new. I've bisected the failures to the various PRs that introduced them.

@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented Mar 22, 2026

Although for this PR I could just see if anything pops up that doesnt repro on main. Will do that.

@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented Mar 23, 2026

Several newly introduced fuzz failures to address:

7084a32219ffa61b1919192119197084a32219ff4400001d228012ffab
10fa0040801170704040198040111911191f1f1170bbff1170ffb3b2b6
10000150804211be707040198011191f1119401f401f0aa97210b6ff25
10000150804211707040198011191f11191f1a4040a91f7210b6ff25

@joostjager
Copy link
Copy Markdown
Contributor Author

Zoomed in on one of those sequences, and it seems it is reproducible with another string without deferred mode too.

0270801109191109191f1f10b6ff

@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch 2 times, most recently from c6da16e to d4bf3e0 Compare April 15, 2026 04:36
@joostjager
Copy link
Copy Markdown
Contributor Author

Dependency: #4520

@joostjager joostjager self-assigned this Apr 16, 2026
@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from d4bf3e0 to a798d74 Compare May 6, 2026 11:42
@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented May 7, 2026

Interestingly the non-deferred mode reproducer 0270801109191109191f1f10b6ff doesn't fail on main anymore, with #4520 not yet merged. Bisected the fixing PR to #4529 (@tankyleo).

It seems the increased headroom made the bug unobservable to the fuzzer?

I ran the fuzzer on this branch overnight, and no failures were found.

@joostjager
Copy link
Copy Markdown
Contributor Author

I'll try to add a new invariant to the fuzzer so it catches the problem in a more robust way.

@joostjager
Copy link
Copy Markdown
Contributor Author

Invariant addition: #4601

@joostjager joostjager marked this pull request as ready for review May 7, 2026 13:23
@ldk-reviews-bot ldk-reviews-bot requested a review from wpaulino May 7, 2026 13:23
@joostjager
Copy link
Copy Markdown
Contributor Author

Will rebase this after #4571

@joostjager joostjager removed the request for review from wpaulino May 7, 2026 13:23
Comment thread fuzz/src/chanmon_consistency.rs Outdated
Comment thread fuzz/src/chanmon_consistency.rs Outdated
@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented May 7, 2026

I've carefully re-reviewed the entire PR diff, cross-referencing with the ChainMonitor internals, the flush mechanism, the watch_channel_internal/update_channel_internal paths, and the fuzz loop convergence logic.

All four previously flagged issues remain unaddressed in the current code. No code changes have occurred since the prior review pass.

No new issues found beyond the prior review.

Summary

Previously flagged issues (still open):

  1. fuzz/src/chanmon_consistency.rs:367 — Docstring describes snapshot-then-flush pattern, but the implementation encodes the ChannelManager before flushing, so the fuzzer always tests post-flush manager state rather than the more interesting mid-flush crash scenario.

  2. fuzz/src/chanmon_consistency.rs:386track_monitor_update with Completed status calls mark_persisted which drains pending_monitors for the channel. This relies on the invariant that update_ret never changes during a node's lifetime. The invariant holds today but is not documented and is fragile.

  3. fuzz/src/chanmon_consistency.rs:421 — Previously confirmed as a false positive. pending_monitors and pending_monitor_completions are independent lists; pruning restart candidates does not affect completion obligations.

  4. fuzz/src/chanmon_consistency.rs:1179 — In reload, the new HarnessPersister is created with self.persistence_style at line 1103, meaning deferred+InProgress nodes get InProgress for startup watch_channel persists during the flush at line 1172. This creates unnecessary completion obligations from startup that don't correspond to a realistic post-restart state.

Cross-cutting observations:

  • The settle loop convergence changes (tracking made_progress from checkpoint_manager_persistences and complete_all_monitor_updates, plus the had_events = true fix in process_events) are correct and improve convergence detection for deferred mode.
  • The make_channel refactor to use split_at_mut is correct — the assert!(source_idx < dest_idx) guard ensures the split is valid, and the added checkpoint_manager_persistence calls before complete_all_pending_monitor_updates are necessary for deferred mode.
  • The separation of pending_monitors (restart candidates) and pending_monitor_completions (ChainMonitor ack obligations) is sound — the two lists intentionally diverge and are pruned independently.

@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from a798d74 to 20275f6 Compare May 8, 2026 14:30
Comment thread fuzz/src/chanmon_consistency.rs Outdated
@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch 2 times, most recently from af7e8ee to 53cd7c9 Compare May 11, 2026 12:58
Comment thread fuzz/src/chanmon_consistency.rs Outdated
@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from 53cd7c9 to e72ec58 Compare May 11, 2026 13:50
Replace the chanmon consistency harness' Watch wrapper with a Persist
implementation backed by HarnessPersister. Monitor writes now flow
through the real ChainMonitor persistence hooks.

Track restart candidates separately from monitor completion callbacks. A
monitor can stop being a valid reload candidate once a newer baseline is
durable, while its callback may still be needed to unblock the live
ChainMonitor.

On reload, choose the durable baseline, first pending snapshot, or last
pending snapshot. Startup monitor registration completes immediately
before the configured persistence style is restored.
@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from e72ec58 to 5eb96ab Compare May 11, 2026 14:19
@joostjager
Copy link
Copy Markdown
Contributor Author

I reworked this PR and based it on a preparatory commit that makes the fuzzer's persistence model much more understandable, in my opinion. The prep work was taken from the force-close fuzzing PR #4381 that is still WIP, so it should be useful there as well.

@joostjager
Copy link
Copy Markdown
Contributor Author

Fuzzed this overnight, no failures other than what is pre-existing in main.

@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from 5eb96ab to eb061da Compare May 13, 2026 08:28
Treat HTLC-forward processing and monitor completion as real progress
in the chanmon harness.

This keeps the settle loop running after passes that only unblock
follow-up work instead of stopping before the next event or message
batch.
Build the replacement persister with the configured monitor update
status during reload.

This keeps non-deferred restart behavior aligned with the active
persistence-style matrix.
Track deferred monitor writes in the harness and checkpoint the
ChannelManager state before flushing them to the persister.

This extends setup, reload, and settle paths to model deferred
ChainMonitor persistence ordering.
@joostjager joostjager force-pushed the chain-mon-internal-deferred-writes-with-fuzz branch from eb061da to e0c5818 Compare May 13, 2026 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants