162 changes: 162 additions & 0 deletions fc-crashes.md
@@ -0,0 +1,162 @@
# Force-close fuzzer LDK crashes

Minimized crash sequences found by the chanmon_consistency fuzzer with
force-close support. All crashes are `debug_assert` or `panic!` inside
LDK, not in the fuzzer harness. Byte 0 encodes monitor styles (bits
0-2) and channel type (bits 3-4: 0=Legacy, 1=KeyedAnchors).
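
The byte 0 layout described above can be decoded as in the following sketch. The type and field names are illustrative and are not taken from the fuzzer harness. Bits 5-7 and channel-type values 2 and 3 are not described in these notes, so the sketch keeps them raw instead of interpreting them.

```rust
/// Channel type selected by bits 3-4 of byte 0 (values per the notes above).
#[derive(Debug, PartialEq, Eq)]
pub enum ChannelType {
    Legacy,
    KeyedAnchors,
    /// Values 2 and 3 are not described in these notes.
    Other(u8),
}

/// Decoded view of byte 0. Names are hypothetical, not harness identifiers.
#[derive(Debug, PartialEq, Eq)]
pub struct ConfigByte {
    /// Bits 0-2: raw monitor-style field.
    pub monitor_style_bits: u8,
    /// Bits 3-4: channel type.
    pub channel_type: ChannelType,
    /// Bits 5-7: meaning not described in these notes, kept raw.
    pub high_bits: u8,
}

pub fn decode_config_byte(b: u8) -> ConfigByte {
    let channel_type = match (b >> 3) & 0b11 {
        0 => ChannelType::Legacy,
        1 => ChannelType::KeyedAnchors,
        other => ChannelType::Other(other),
    };
    ConfigByte {
        monitor_style_bits: b & 0b111,
        channel_type,
        high_bits: b >> 5,
    }
}
```

Running `decode_config_byte` over the first byte of each minimized input below is a quick way to cross-check the per-crash "Byte 0 = ..." descriptions.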

## 1. channelmonitor.rs:2727 - HTLC input not found in transaction

```
debug_assert!(htlc_input_idx_opt.is_some());
```

When resolving an HTLC spend, the monitor searches for the HTLC
outpoint in the spending transaction's inputs but doesn't find it.
In release mode, where the `debug_assert` is compiled out, the code
falls back to index 0, which would produce incorrect tracking.
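
A minimal sketch of this lookup-with-fallback pattern, assuming a simplified `OutPoint` type. The type and function names are hypothetical and this is not the code in channelmonitor.rs; it only illustrates why a missing input silently becomes index 0 once the assertion is compiled out.

```rust
/// Hypothetical stand-in for a transaction outpoint (txid plus output index).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OutPoint {
    pub txid: [u8; 32],
    pub vout: u32,
}

/// Returns the position of `htlc` among the spending transaction's inputs.
pub fn find_htlc_input(inputs: &[OutPoint], htlc: &OutPoint) -> Option<usize> {
    inputs.iter().position(|input| input == htlc)
}

/// Mirrors the pattern described above: assert in debug builds, fall back
/// to index 0 in release builds when the input is missing.
pub fn htlc_input_index(inputs: &[OutPoint], htlc: &OutPoint) -> usize {
    let idx = find_htlc_input(inputs, htlc);
    debug_assert!(idx.is_some(), "HTLC input not found in transaction");
    // In release builds a missing input maps to 0, which may point at an
    // unrelated input.
    idx.unwrap_or(0)
}
```

Returning the `Option` to the caller, as `find_htlc_input` does, makes the missing-input case explicit instead of hiding it behind a default index.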

Minimized (17 bytes):
```
0x40 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xdc 0xde 0xff
```

Byte 0 = 0x40: Legacy channels, no async monitors (bits 0-4 are all
zero; only bit 6 is set). The sequence is mostly 0xff (settlement)
repeated, with height advances (0xdc, 0xde) near the end. This suggests
the crash happens during settlement when processing on-chain HTLC
spends after repeated settlement attempts.

## 2. onchaintx.rs:913 - Duplicate claim ID in pending requests

```
debug_assert!(self.pending_claim_requests.get(&claim_id).is_none());
```

The OnchainTxHandler registers a claim event with a claim_id that
already exists in the pending_claim_requests map.
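
The invariant behind this assertion is that a claim ID is registered at most once. A minimal sketch of a map that enforces it explicitly, using hypothetical names (`PendingClaims`, `register`) that are not part of LDK:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical 32-byte claim identifier.
pub type ClaimId = [u8; 32];

/// Illustrative pending-claim map that rejects duplicate claim IDs.
pub struct PendingClaims<T> {
    requests: HashMap<ClaimId, T>,
}

impl<T> PendingClaims<T> {
    pub fn new() -> Self {
        PendingClaims { requests: HashMap::new() }
    }

    /// Registers `req` under `id`. If `id` is already pending, the map is
    /// left unchanged and the rejected request is handed back in `Err`.
    pub fn register(&mut self, id: ClaimId, req: T) -> Result<(), T> {
        match self.requests.entry(id) {
            Entry::Occupied(_) => Err(req),
            Entry::Vacant(slot) => {
                slot.insert(req);
                Ok(())
            }
        }
    }

    pub fn len(&self) -> usize {
        self.requests.len()
    }
}
```

A plain `HashMap::insert` would overwrite the earlier request without any signal; the `Entry` match surfaces the duplicate so the caller can decide what to do with it.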

Minimized (10 bytes):
```
0x08 0xd2 0x70 0x70 0x71 0x70 0x10 0x19 0xde 0xff
```

Byte 0 = 0x08: KeyedAnchors channels, no async monitors.
- 0xd2: B force-closes the A-B channel
- 0x70/0x71: disconnect/reconnect peers
- 0x10, 0x19: process messages on nodes A and B
- 0xde: advance chain 200 blocks
- 0xff: settle

B force-closes, peers disconnect and reconnect, messages are exchanged,
then height advances and settlement triggers the duplicate claim.

## 3. onchaintx.rs:1025 - Inconsistent internal maps

```
panic!("Inconsistencies between pending_claim_requests map and claimable_outpoints map");
```

The OnchainTxHandler detects that its `pending_claim_requests` and
`claimable_outpoints` maps are out of sync.

Minimized (14 bytes):
```
0x00 0x3c 0x11 0x19 0xd0 0xde 0xff 0xff 0x19 0x21 0x19 0xde 0x26 0xff
```

Byte 0 = 0x00: Legacy channels, all monitors completed.
- 0x3c: send hop payment A->B->C (1M msat)
- 0x11, 0x19: process messages to commit HTLC on A-B
- 0xd0: A force-closes A-B
- 0xde: advance 200 blocks
- 0xff: settle (first round)
- 0xff: settle again (second round, processes more messages)
- 0x19, 0x21, 0x19: continue processing B and C messages
- 0xde: advance 200 more blocks
- 0x26: process events on node C
- 0xff: settle (third round)

A hop payment is partially committed, then A force-closes. Multiple
settlement rounds with continued message processing in between trigger
the internal map inconsistency.

## 4. test_channel_signer.rs:395 - Signing revoked commitment

```
panic!("can only sign the next two unrevoked commitment numbers, revoked={} vs requested={}")
```

The test channel signer is asked to sign an HTLC transaction for a
commitment number that has already been revoked.

Minimized (18 bytes):
```
0x22 0x71 0x71 0x71 0x71 0x71 0x71 0x71 0xff 0xff 0xff 0xff 0xff 0xff 0xde 0xde 0xb5 0xff
```

Byte 0 = 0x22: Legacy channels, async monitors on node B.
- 0x71: disconnect B-C peers (repeated, only first effective)
- 0xff: settle (repeated 6 times)
- 0xde 0xde: advance 400 blocks
- 0xb5: restart node B with alternate monitor state
- 0xff: settle

Async monitors on B with peer disconnection, repeated settlements,
height advances, and a node restart with a different monitor state.
The stale monitor combined with the restart puts B's signer in a state
where it's asked to sign for an already-revoked commitment.

## 5. channelmanager.rs:9836 - Payment blocker not found

```
debug_assert!(found_blocker);
```

During payment processing, the ChannelManager expects to find a
specific blocker entry for an in-flight payment but it's missing.

Minimized (13 bytes):
```
0x00 0x3c 0x11 0x19 0x11 0x1f 0x19 0x21 0x19 0x27 0x27 0xde 0xff
```

Byte 0 = 0x00: Legacy channels, all monitors completed.
- 0x3c: send hop A->B->C (1M msat)
- 0x11, 0x19, 0x11: commit HTLC on A-B
- 0x1f: B processes events (forwards HTLC to C)
- 0x19, 0x21, 0x19: commit HTLC on B-C
- 0x27, 0x27: C processes events (claims payment)
- 0xde: advance 200 blocks
- 0xff: settle

A straightforward A->B->C hop payment that completes normally (C
claims), followed by a height advance and settlement. No force-close
in this sequence, so the height advance before settlement may cause
HTLC timeout processing that conflicts with the claim path.

## 6. channelmanager.rs:19484 - Monitor update ID ordering violation

```
debug_assert!(update.update_id >= pending_update.update_id);
```

A ChannelMonitorUpdate has an update_id that is less than a pending
update's id, violating the expected monotonic ordering.
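
The ordering expectation can be expressed as a check over a sequence of update IDs. This is a sketch with a hypothetical helper name, not LDK code. It treats equal IDs as acceptable because the assertion uses `>=`.

```rust
/// Returns the index of the first update whose ID is lower than the ID
/// immediately before it, or `None` if the IDs never decrease.
pub fn first_ordering_violation(update_ids: &[u64]) -> Option<usize> {
    update_ids
        .windows(2)
        .position(|pair| pair[1] < pair[0])
        .map(|i| i + 1)
}
```

For example, `first_ordering_violation(&[1, 3, 2])` points at index 2, the update that arrived with an ID lower than its predecessor.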

Minimized (10 bytes):
```
0x84 0x70 0x11 0x19 0x11 0x1f 0xd0 0x11 0x1f 0xba
```

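Byte 0 = 0x84: Legacy channels (bits 3-4 = 0), with bits 2 and 7 set.
Bit 2 lies in the monitor-style field (bits 0-2), so this input likely
enables async monitors on one node (node C, if the bit-per-node mapping
seen in crash 4 holds) rather than none.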
- 0x70: disconnect A-B peers
- 0x11, 0x19, 0x11: process messages (likely reestablish after setup)
- 0x1f: process B events
- 0xd0: A force-closes A-B channel
- 0x11: process A messages
- 0x1f: process B events
- 0xba: restart node B with alternate monitor state

A force-close followed by continued message/event processing and a
node B restart triggers a monitor update with an out-of-order ID.
1 change: 1 addition & 0 deletions fuzz/.gitignore
@@ -2,3 +2,4 @@ hfuzz_target
target
hfuzz_workspace
corpus
artifacts
107 changes: 107 additions & 0 deletions fuzz/FC-INFO.md
@@ -0,0 +1,107 @@
# Force-Close Fuzzing Notes

This file records the current contract for `chanmon_consistency` force-close
coverage. It is intentionally short. Keep branch history and one-off debugging
notes elsewhere.

## Goal

Force-close fuzzing here should:

- exercise realistic off-chain to on-chain transitions
- keep force-close from changing the eventual outcome of claimed payments
- only allow claimed-payment sender failures when force-close dust touched a
used payment path
- allow unclaimed HTLCs to resolve by CLTV timeout
- drive the harness far enough that it observes real terminal outcomes
- avoid manufacturing timeout wins by starving message delivery or claim
propagation

## Hard-Mode Invariant

The current hard mode is:

- once the harness calls `claim_funds`, that HTLC must eventually produce
`PaymentClaimed` at the receiver
- after that claim, the sender must eventually produce a terminal outcome,
`PaymentSent` or `PaymentFailed`
- if the sender produces `PaymentFailed` for a claimed payment, some used
force-close path for that payment must have been dust-trimmed
- force-close dust on a used path is not, by itself, enough to require
`PaymentFailed`; the payment may still end in `PaymentSent`
- if no used force-close path for the claimed payment was dust-trimmed, the
sender must eventually produce `PaymentSent`
- going on-chain does not create any broader exception than that dust case
- unclaimed HTLCs may still fail by CLTV expiry
- CSV waits on force-close outputs are normal and expected; they are not
payment outcome changes
- a payment disappearing from `list_recent_payments()` is not enough; the
  harness must observe or drive the terminal outcome directly

In this mode, the following are harness failures:

- `HTLCHandlingFailed::Receive` after we already chose to claim the HTLC
- a receiver-side claim without the receiver later getting `PaymentClaimed`
- a claimed HTLC without any sender-side terminal event
- a claimed HTLC getting `PaymentFailed` without any dust-trimmed used
force-close path
- a claimed HTLC that should fulfill resolving by CLTV timeout instead
- cleanup stopping while live balances or other pending work still show that
more progress is possible
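
The per-payment part of this invariant can be sketched as an end-of-run check. All names below (`ClaimedPayment`, `SenderOutcome`, `Violation`) are hypothetical and are not taken from the harness; the sketch covers only the claimed-payment rules, not the cleanup or `HTLCHandlingFailed` conditions.

```rust
/// Sender-side state observed for a payment by the end of the run.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SenderOutcome {
    Pending,
    Sent,
    Failed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Violation {
    MissingPaymentClaimed,
    MissingSenderTerminal,
    FailedWithoutDust,
}

/// Hypothetical bookkeeping for one payment the harness chose to claim.
pub struct ClaimedPayment {
    pub receiver_saw_payment_claimed: bool,
    pub sender_outcome: SenderOutcome,
    /// True if some used force-close path for this payment was dust-trimmed.
    pub dust_trimmed_on_used_path: bool,
}

pub fn check_claimed_payment(p: &ClaimedPayment) -> Result<(), Violation> {
    if !p.receiver_saw_payment_claimed {
        return Err(Violation::MissingPaymentClaimed);
    }
    match p.sender_outcome {
        SenderOutcome::Pending => Err(Violation::MissingSenderTerminal),
        // `PaymentFailed` is tolerated only when dust touched a used path.
        SenderOutcome::Failed if !p.dust_trimmed_on_used_path => {
            Err(Violation::FailedWithoutDust)
        }
        // Dust on a used path permits failure but does not require it.
        SenderOutcome::Sent | SenderOutcome::Failed => Ok(()),
    }
}
```

Note that `Sent` passes regardless of the dust flag, matching the rule that dust on a used path is not by itself enough to require `PaymentFailed`.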

## Timeouts

Do not conflate CSV and CLTV:

- CSV is normal force-close settlement latency
- CLTV expiry changes the HTLC outcome

The harness should keep driving through CSV waits. It should only protect
claimed HTLCs that should still fulfill from CLTV-expiry resolution.

## Harness Rules

The main rules for preserving the invariant are:

- advance large height jumps one block at a time, with bounded draining before
and after each block
- process queued messages and events before confirming newly broadcast
transactions, so preimages can propagate before timeout paths win
- keep sender-side payment bookkeeping independent of
`list_recent_payments()`
- track which channels each payment actually used, and when force-closing,
snapshot which used payment paths become dust-blocked on the closer's
commitment
- keep driving while `ClaimableOnChannelClose`, HTLC-related claimable balances,
queued messages, pending monitor updates, or pending broadcasts still show
unresolved work
- only stop before a CLTV boundary when crossing it would let a claimed HTLC
that has not yet reached a sender terminal event expire instead
- do not hide pending-payment state behind unrelated auto-driving before an
explicit force-close opcode; a bounded pre-close drain is acceptable when it
is only making already-queued work visible
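
The first rule and the CLTV-boundary rule can be sketched together as a block-by-block advance loop. The function and its callback parameters are hypothetical, not harness APIs. The sketch assumes that connecting a block at or above a protected expiry height counts as crossing the boundary; the real harness may define the boundary differently.

```rust
/// Advances up to `blocks` blocks from `start_height`, draining before and
/// after each block. Stops early if the next block would reach the expiry
/// of a claimed HTLC that still lacks a sender terminal event.
///
/// - `protected_expiry`: lowest such CLTV expiry, re-evaluated each block.
/// - `drain`: bounded processing of queued messages and events.
/// - `connect_block`: confirms one block at the given height.
pub fn advance_blocks(
    start_height: u32,
    blocks: u32,
    mut protected_expiry: impl FnMut() -> Option<u32>,
    mut drain: impl FnMut(),
    mut connect_block: impl FnMut(u32),
) -> u32 {
    let mut height = start_height;
    for _ in 0..blocks {
        // Let preimages and queued work propagate before the chain moves.
        drain();
        let next = height + 1;
        if let Some(expiry) = protected_expiry() {
            if next >= expiry {
                break;
            }
        }
        connect_block(next);
        height = next;
        drain();
    }
    height
}
```

Because `protected_expiry` is re-evaluated after each drain, a payment that reaches its sender terminal event mid-jump stops blocking further height advances.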

## Review Checklist

When changing this harness, verify:

- claimed HTLCs still require `PaymentClaimed`
- claimed HTLCs still require a sender-side terminal event
- claimed HTLCs only allow `PaymentFailed` when some used force-close path was
dust-trimmed
- claimed HTLCs without dust-trimmed used force-close paths still require
`PaymentSent`
- unclaimed HTLCs may still time out on-chain
- force-close opcodes still act on the currently pending state
- large synthetic height jumps do not become blind timeout buttons again
- sender-side obligations are not reconciled away through local caches

## Verification

The standard check is:

```bash
~/repo/rl-tools/run_fuzz_runner.sh --timeout-secs 20
```

Re-run the full corpus after any meaningful force-close harness change.