htlcswitch: add link state machine fuzz harness #1
Closed
MPins wants to merge 11 commits into
brunoerg
reviewed
Mar 12, 2026
brunoerg
reviewed
Mar 12, 2026
Coverage Report for CI Build 26690713350 (Coveralls)
Coverage decreased (-0.08%) to 62.2%
No coverage regressions found.
001781c to
1400d9a
Compare
1b219d1 to
ca4be0a
Compare
fcb4bb0 to
6e4610c
Compare
78af141 to
7a5aa99
Compare
@Crypt-iQ when you have time, could you take a look?
Sure, I will take a look.
b28e3d8 to
5f570d5
Compare
Expose the `invoiceRegistry` field in `singleLinkTestHarness` so tests can register and look up invoices directly. Add `generateSingleHopHtlc`, a test helper that builds a single-hop `UpdateAddHTLC` with a random preimage, intended for use in unit and fuzz tests.
Add a no-op MailBox implementation and a no-op ticker for use in the channelLink FSM fuzz harness.
22b07c0 to
7b757e2
Compare
7e3c7cc to
de396f6
Compare
Replace createChannelLinkWithPeer (which required a Switch and spawned the htlcManager goroutine) with newFuzzLink, a minimal link factory that:
- accepts dependencies directly (registry, preimage cache, circuit map, bestHeight) instead of a mockServer, so no Switch or background goroutines are created at all
- sets link.upstream directly to a buffered channel controlled by the caller, bypassing the mailbox entirely
- attaches a mockMailBox so mailBox.ResetPackets() in resumeLink succeeds
Add a failReason string field to channelLink that is populated by failf alongside the existing failed flag. This gives fuzz and unit tests direct access to the human-readable failure reason without requiring a dedicated OnChannelFailure callback or log scraping.
Introduce a one-shot nextOnionFailMode flag on mockIteratorDecoder
and matching payloadFail / extractFail fields on mockHopIterator so
that fuzz and unit tests can deterministically exercise the three
error branches of channelLink.processRemoteAdds:
- onionFailDecode → DecodeHopIterator returns a non-CodeNone
failcode (CodeTemporaryChannelFailure).
- onionFailPayload → HopPayload returns hop.ErrInvalidPayload.
- onionFailExtract → ExtractErrorEncrypter returns a non-CodeNone
failcode (CodeInvalidOnionVersion).
The flag is consumed and cleared on each DecodeHopIterator call so
it affects exactly one HTLC. Default behaviour is unchanged when no
mode is armed, so existing callers see no difference.
newMockHopIterator now returns *mockHopIterator (instead of
hop.Iterator) so the decoder can set the per-iterator failure flags
after construction; the concrete type still satisfies hop.Iterator
and the only external caller in test_utils.go is unaffected.
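The one-shot behaviour described above can be sketched as follows. This is a simplified illustration, not the code from the commit: `sketchDecoder`, `sketchIterator`, and the three stand-in errors are hypothetical names, and plain `error` values replace the failure codes and `hop.ErrInvalidPayload` that the real mocks return.

```go
package main

import (
	"errors"
	"fmt"
)

// onionFailMode selects which stage of onion processing fails next.
type onionFailMode uint8

const (
	onionFailNone onionFailMode = iota
	onionFailDecode
	onionFailPayload
	onionFailExtract
)

// Stand-in errors (assumption): the real mocks return failure codes and
// hop.ErrInvalidPayload instead of these values.
var (
	errDecode  = errors.New("decode failed")
	errPayload = errors.New("invalid payload")
	errExtract = errors.New("extract failed")
)

// sketchIterator carries the per-iterator failure flags.
type sketchIterator struct {
	payloadFail bool
	extractFail bool
}

// HopPayload fails only if the decoder armed payloadFail.
func (it *sketchIterator) HopPayload() error {
	if it.payloadFail {
		return errPayload
	}
	return nil
}

// ExtractErrorEncrypter fails only if the decoder armed extractFail.
func (it *sketchIterator) ExtractErrorEncrypter() error {
	if it.extractFail {
		return errExtract
	}
	return nil
}

// sketchDecoder holds the one-shot flag.
type sketchDecoder struct {
	nextOnionFailMode onionFailMode
}

// DecodeHopIterator consumes and clears the flag, so an armed mode
// affects exactly one call.
func (d *sketchDecoder) DecodeHopIterator() (*sketchIterator, error) {
	mode := d.nextOnionFailMode
	d.nextOnionFailMode = onionFailNone

	it := &sketchIterator{}
	switch mode {
	case onionFailDecode:
		return nil, errDecode
	case onionFailPayload:
		it.payloadFail = true
	case onionFailExtract:
		it.extractFail = true
	}
	return it, nil
}

func main() {
	d := &sketchDecoder{nextOnionFailMode: onionFailPayload}
	first, _ := d.DecodeHopIterator()
	second, _ := d.DecodeHopIterator()
	// prints: true false
	fmt.Println(first.HopPayload() != nil, second.HopPayload() != nil)
}
```

Returning the concrete iterator type from the decoder is what lets it set the flags after construction, which matches the reason given for changing the return type of newMockHopIterator.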
0f1c1b3 to
a5df5f9
Compare
Introduce `fuzz_link_test.go` with a model-based fuzzer that drives the Alice-Bob channel link through arbitrary sequences of protocol events and checks key invariants after each step.
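As a rough illustration of the model-based approach, the sketch below decodes each input byte into an event, applies it to a toy system, and checks an invariant against an independently tracked count after every step. Everything here is hypothetical (`toyLink`, `runScenario`, the event set, and the invariant); the actual harness drives real channel links and checks its own invariants.

```go
package main

import "fmt"

// event is one step of a scenario, decoded from a single fuzz input byte.
type event uint8

const (
	evAdd event = iota
	evCommit
	evSettle
	numEvents
)

// toyLink is a stand-in for the system under test.
type toyLink struct {
	pending   int // added but not yet committed
	committed int // committed but not yet settled
	settled   int
}

// apply advances the toy system by one event.
func (l *toyLink) apply(e event) {
	switch e {
	case evAdd:
		l.pending++
	case evCommit:
		l.committed += l.pending
		l.pending = 0
	case evSettle:
		if l.committed > 0 {
			l.committed--
			l.settled++
		}
	}
}

// runScenario interprets data as an event sequence and checks, after each
// step, that every HTLC the model has added is in exactly one bucket.
func runScenario(data []byte) error {
	var link toyLink
	added := 0 // the model: how many HTLCs should exist in total

	for i, b := range data {
		e := event(b % uint8(numEvents))
		if e == evAdd {
			added++
		}
		link.apply(e)

		total := link.pending + link.committed + link.settled
		if total != added {
			return fmt.Errorf("step %d: link tracks %d HTLCs, model expects %d",
				i, total, added)
		}
	}
	return nil
}

func main() {
	// add, add, commit, settle, settle, settle (the last settle is a no-op)
	fmt.Println(runScenario([]byte{0, 0, 1, 2, 2, 2}))
}
```

In a Go fuzz target, a function like `runScenario` would typically be called from inside `f.Fuzz(func(t *testing.T, data []byte) { ... })`, with `t.Fatal` on a non-nil error.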
Introduce fuzzSigner and fuzzSigVerifier in the fuzz harness, along with the SigVerifier hook in LightningChannel (WithSigVerifier, verifySig) and a matching SigPool extension (VerifyFunc field) so the harness can bypass secp256k1 verification end-to-end. Also refactors createTestChannel to accept functional options (testChannelOpt) so the signer and channel options can be injected from tests.
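The functional-option hook can be pictured roughly as below. This is a sketch under assumptions: `sketchChannel`, `channelOpt`, `withSigVerifier`, and `defaultVerifier` are hypothetical stand-ins, and the real WithSigVerifier signature and verifier type in LightningChannel may differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// sigVerifier reports whether sig is valid for msg.
type sigVerifier func(msg, sig []byte) bool

// sketchChannel stands in for a channel type with a pluggable verifier.
type sketchChannel struct {
	verifySig sigVerifier
}

// channelOpt is a functional option that mutates a channel at construction.
type channelOpt func(*sketchChannel)

// withSigVerifier overrides the default verification function.
func withSigVerifier(v sigVerifier) channelOpt {
	return func(c *sketchChannel) {
		c.verifySig = v
	}
}

// defaultVerifier is a placeholder for real signature verification; it only
// compares bytes so that the sketch stays dependency-free.
func defaultVerifier(msg, sig []byte) bool {
	return bytes.Equal(msg, sig)
}

// newSketchChannel applies options on top of the defaults, so callers that
// pass no options keep the default behaviour.
func newSketchChannel(opts ...channelOpt) *sketchChannel {
	c := &sketchChannel{verifySig: defaultVerifier}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	strict := newSketchChannel()
	fuzz := newSketchChannel(withSigVerifier(func(_, _ []byte) bool {
		return true // accept everything: skips the expensive check
	}))

	msg, badSig := []byte("commitment"), []byte("garbage")
	// prints: false true
	fmt.Println(strict.verifySig(msg, badSig), fuzz.verifySig(msg, badSig))
}
```

Because the default is set before the options run, existing callers are unaffected, which is the same property the refactored createTestChannel needs.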
Introduce CommitKeyDeriverFunc and WithCommitKeyDeriver to allow LightningChannel to bypass the secp256k1-based DeriveCommitmentKeys on every commit round. All internal call sites are migrated to lc.deriveCommitmentKeys. The fuzz harness injects fuzzCommitKeyDeriver, a trivial identity deriver that avoids scalar-multiplication overhead.
createTestChannel started alicePool and bobPool but never stopped them. During fuzzing this caused their worker goroutines to leak. Register t.Cleanup handlers to call Stop() on both pools so all workers are torn down when the test ends.
newMockRegistry started an InvoiceRegistry but never stopped it. InvoiceRegistry internally starts two background goroutines — invoiceEventLoop and the InvoiceExpiryWatcher mainLoop — that run for the lifetime of the registry. Without a matching Stop() call both goroutines leaked for every test that called newMockRegistry, accumulating thousands of goroutines during fuzzing. Register a t.Cleanup to call registry.Stop() so both loops are torn down when the test ends.
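The cleanup pattern used in the two commits above looks roughly like this. The sketch uses hypothetical names (`cleanupTB`, `sketchRegistry`, `fakeTB`) and a `Stop` method without a return value; `*testing.T` satisfies the small interface, and a fake is included only so the example runs outside `go test`.

```go
package main

import "fmt"

// cleanupTB is the subset of *testing.T this sketch needs.
type cleanupTB interface {
	Cleanup(func())
}

// sketchRegistry stands in for a component that owns background goroutines.
type sketchRegistry struct {
	running bool
}

func (r *sketchRegistry) Start() { r.running = true }
func (r *sketchRegistry) Stop()  { r.running = false }

// newSketchRegistry starts the registry and ties its shutdown to the test's
// lifetime, so callers cannot forget to stop it.
func newSketchRegistry(t cleanupTB) *sketchRegistry {
	r := &sketchRegistry{}
	r.Start()
	t.Cleanup(r.Stop)
	return r
}

// fakeTB records cleanups and runs them last-in, first-out, which is the
// order testing.T uses.
type fakeTB struct {
	cleanups []func()
}

func (f *fakeTB) Cleanup(fn func()) {
	f.cleanups = append(f.cleanups, fn)
}

func (f *fakeTB) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	tb := &fakeTB{}
	reg := newSketchRegistry(tb)
	fmt.Println("running during test:", reg.running)

	tb.finish() // what the testing package does when the test ends
	fmt.Println("running after test:", reg.running)
}
```

Registering the cleanup inside the constructor keeps the fix in one place: every test that builds the component gets the teardown without any change at the call site.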
f58c5a1 to
5204c28
Compare
Superseded by upstream PR lightningnetwork#10865. Further review and development will continue in the upstream pull request.
This PR adds a coverage-guided fuzz harness that exercises the link state machine by randomly interleaving HTLC additions, commits, revocations, settlements, failures, link restarts, and transitions into and out of quiescence mode from both Alice and Bob.
Making each iteration cheap enough to fuzz
A coverage-guided fuzzer typically needs to execute hundreds of iterations per second to be able to find interesting states. The dominant per-iteration costs in the normal link path are secp256k1 arithmetic and disk I/O, neither of which is the focus of this harness. To keep the state machine the bottleneck rather than crypto and the filesystem, the harness introduces three substitutions:
* channeldb is backed by bbolt files created under t.TempDir(). To avoid the disk I/O bottleneck, the harness redirects TMPDIR to /dev/shm (tmpfs) so the bbolt files live in RAM. Because /dev/shm is Linux-specific, the harness fails fast on non-Linux hosts.
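A minimal sketch of the platform check behind that redirect is shown below. The helper name `tmpfsTempDir` and its error text are assumptions, and the PR does not state exactly how the harness sets the variable or reports the failure.

```go
package main

import (
	"fmt"
	"runtime"
)

// tmpfsDir is the RAM-backed directory named in the PR description.
const tmpfsDir = "/dev/shm"

// tmpfsTempDir returns the directory TMPDIR should point at, or an error on
// platforms where /dev/shm cannot be assumed to exist.
func tmpfsTempDir(goos string) (string, error) {
	if goos != "linux" {
		return "", fmt.Errorf("tmpfs-backed TMPDIR requires linux, got %s", goos)
	}
	return tmpfsDir, nil
}

func main() {
	dir, err := tmpfsTempDir(runtime.GOOS)
	if err != nil {
		fmt.Println("fail fast:", err)
		return
	}
	fmt.Println("TMPDIR would be set to", dir)
}
```

Inside a test or fuzz target, the returned directory could be applied with `t.Setenv("TMPDIR", dir)` before the first `t.TempDir()` call, since `t.TempDir()` creates its directories under `os.TempDir()`, which honours `$TMPDIR` on Unix systems.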
go test ./htlcswitch/ -run TestChannelLinkFSMScenarios -v
go test ./htlcswitch -run=^$ -fuzz=FuzzChannelLinkFSM -fuzztime=1m