Skip to content

htlcswitch: separate onchain and offchain intercpeted HTLCs cleanly#10895

Open
ziggie1984 wants to merge 9 commits into
lightningnetwork:masterfrom
ziggie1984:fix-onchain-interceptor-held-htlc
Open

htlcswitch: separate onchain and offchain intercpeted HTLCs cleanly#10895
ziggie1984 wants to merge 9 commits into
lightningnetwork:masterfrom
ziggie1984:fix-onchain-interceptor-held-htlc

Conversation

@ziggie1984

@ziggie1984 ziggie1984 commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Change Description

Fixes #10892

Look at the first commit which shows via itests the failure cases of the old Interceptor Implementation

The second commit introduces a new interface and distingishes between onchain and offchain HTLC for the interceptor. It does not change any public interface.

@ziggie1984 ziggie1984 self-assigned this Jun 11, 2026
@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label Jun 11, 2026
@github-actions

Copy link
Copy Markdown

PR Severity: CRITICAL

Highest-severity file determines level | 4 non-test files | 467 lines changed

Critical (2 files):

  • htlcswitch/held_htlc_set.go - HTLC forwarding state machine
  • htlcswitch/interceptable_switch.go - HTLC forwarding/payment routing interceptable switch

Medium (1 file):

  • witness_beacon.go - Root-level Go file

Low (4 files - excluded from severity bump counts):

  • docs/release-notes/release-notes-0.21.1.md
  • htlcswitch/held_htlc_set_test.go
  • itest/list_on_test.go
  • itest/lnd_forward_interceptor_test.go

Analysis

This PR modifies core htlcswitch package files. Changes to held_htlc_set.go and interceptable_switch.go affect the HTLC forwarding and payment routing state machine, warranting expert review.

Bump check: 4 non-test files (threshold >20), 467 lines changed (threshold >500), 1 critical package. No bump applied.

To override, add a severity-override-{critical,high,medium,low} label.
<!-- pr-severity-bot -->

@ziggie1984 ziggie1984 added this to the v0.21.1 milestone Jun 11, 2026
@ziggie1984 ziggie1984 added this to v0.21 Jun 11, 2026
@gemini-code-assist

Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the HTLC interception mechanism to accurately manage HTLCs that transition from off-chain forwarding to on-chain resolution, particularly in scenarios involving channel force-closures. By introducing distinct handling for off-chain and on-chain held HTLCs, the changes prevent incorrect auto-failing or misdirection of settlement attempts, thereby enhancing the robustness and reliability of HTLC management within the system.

Highlights

  • Differentiated HTLC Holding: The heldHtlcSet now distinguishes between offChainHeld and onChainHeld entries, allowing for different behaviors based on whether an HTLC is in the off-chain forwarding flow or the on-chain contractcourt flow.
  • Correct On-Chain HTLC Resolution: On-chain held HTLCs are now settle-only and expire by pruning local interceptor state, preventing issues like premature auto-failing after a restart or incorrect routing of settlement messages to the link mailbox instead of the witness beacon.
  • Dynamic HTLC State Transition: If an HTLC is initially held off-chain and then re-offered on-chain (e.g., due to a channel force-close), the stored entry in the heldHtlcSet is replaced to ensure future settlement attempts correctly reach the witness beacon.
  • Robust Resolution Logic: HTLC resolution now removes an entry from the held set only after the selected action (settle, fail, resume) successfully completes, improving atomicity and reliability.
  • Integration Test Coverage: New integration tests have been added to cover critical scenarios, including settling an on-chain intercepted HTLC after a node restart and settling an HTLC that transitions from off-chain to on-chain without a restart.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the heldHtlcSet to distinguish between off-chain and on-chain held HTLCs using a new heldEntry interface, ensuring proper handling of resolutions and expirations for both flows. It also updates the InterceptableSwitch and witness_beacon to support this, and adds comprehensive unit and integration tests. A critical deadlock risk was identified in witness_beacon.go where SubscribeUpdates holds a lock while making a blocking call to the interceptor, which can conflict with the switch's event loop attempting to acquire the same lock during resolution; releasing the lock early is recommended.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread witness_beacon.go Outdated
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch 2 times, most recently from cf2e20c to 02347b0 Compare June 11, 2026 01:35
@ziggie1984

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the heldHtlcSet to track held HTLCs as either off-chain or on-chain entries using a new heldEntry interface. This change ensures that on-chain re-offers can replace old off-chain holds, allowing settlements to reach the witness beacon after an incoming channel force-closes. Additionally, integration tests are added to verify on-chain settlement scenarios, and minor fixes are made to the witness beacon subscription flow. The reviewer's feedback correctly points out a style guide violation where the newHeldHtlcSet function lacks a documentation comment.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread htlcswitch/held_htlc_set.go
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from 02347b0 to 6495243 Compare June 11, 2026 01:50
@saubyk saubyk removed this from v0.21 Jun 11, 2026
@saubyk saubyk added this to lnd v0.22 Jun 11, 2026
@saubyk saubyk removed this from the v0.21.1 milestone Jun 11, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in lnd v0.22 Jun 11, 2026
@saubyk saubyk added this to the v0.22.0 milestone Jun 11, 2026
@saubyk saubyk moved this from Backlog to In progress in lnd v0.22 Jun 11, 2026
@ziggie1984 ziggie1984 modified the milestones: v0.22.0, v0.21.1 Jun 11, 2026
@saubyk saubyk added this to v0.21 Jun 11, 2026
@saubyk saubyk removed this from lnd v0.22 Jun 11, 2026
@saubyk saubyk moved this to In progress in v0.21 Jun 11, 2026
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch 4 times, most recently from 5e3c3c3 to 424d1a8 Compare June 11, 2026 20:35
@ziggie1984

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the heldHtlcSet to support both off-chain and on-chain held HTLCs via a new heldEntry interface, resolving an issue where on-chain forward interceptor settlements failed to reach the witness beacon after a channel force-close. The changes also include updates to InterceptableSwitch and witness_beacon.go, comprehensive unit and integration tests, and updated documentation. Feedback is provided to add a documentation comment to newHeldHtlcSet to align with the repository's style guide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread htlcswitch/held_htlc_set.go
@ziggie1984 ziggie1984 requested a review from yyforyongyu June 11, 2026 20:51
@ziggie1984 ziggie1984 changed the title htlcswitch: track on-chain held HTLCs htlcswitch: separate onchain and offchain intercpeted HTLCs cleanly Jun 11, 2026
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from 424d1a8 to ebe03ed Compare June 11, 2026 21:07
@github-actions github-actions Bot added severity-critical Requires expert review - security/consensus critical and removed severity-critical Requires expert review - security/consensus critical labels Jun 11, 2026
Comment thread htlcswitch/held_htlc_set.go Outdated
Comment thread htlcswitch/held_htlc_set.go Outdated
Comment thread htlcswitch/held_htlc_set.go Outdated
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch 3 times, most recently from d16ea33 to c16a193 Compare June 12, 2026 14:39
@ziggie1984 ziggie1984 requested a review from yyforyongyu June 12, 2026 14:50
@ziggie1984

Copy link
Copy Markdown
Collaborator Author

For clarity, here is the bug/failure-mode breakdown this PR fixes:

  1. Post-restart on-chain hold eviction
    The restart itest covers the case where Bob loses the in-memory held set, contractcourt re-offers the HTLC through the witness beacon, and the on-chain entry could previously be evicted on the next block before the interceptor settled it.

  2. Stale off-chain hold shadowing the on-chain re-offer
    The no-restart itest covers the case where the original off-chain held entry stayed around after the incoming channel force closed, so a later SETTLE could be routed to the stale off-chain path instead of reaching the on-chain resolver.

  3. Interceptor disconnect/reconnect handling for on-chain holds
    On-chain held HTLCs must survive interceptor stream disconnects. They cannot be released/resumed like off-chain holds, because there is no live link flow to resume. The PR now keeps on-chain holds until settlement or on-chain expiry/pruning.

  4. Invalid non-settle actions for on-chain holds
    Once a held HTLC has moved on-chain, the meaningful interceptor action is SETTLE with the preimage. Non-settle actions such as FAIL/RESUME should not tear down the on-chain held entry or crash the handling path.

  5. Resolve-then-lose-on-error ordering
    The old resolve flow removed the held entry before executing the action. If the action returned an error, the entry was already gone and the HTLC could be stranded. The new flow only removes the entry after the action succeeds.

  6. Witness beacon error-path deadlock/subscriber leak
    The witness beacon interceptor error path could hold the lock while cancellation reacquired it, and could also leave the subscriber registered. The PR avoids that deadlock path and cleans up the subscription correctly.

So the short version is: the two itests cover the two main incident paths, and the follow-up commits harden the surrounding on-chain held HTLC lifecycle so disconnects, invalid actions, action errors, and witness-beacon error handling do not regress the same settlement path.

@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from c16a193 to 8b7ee4a Compare June 12, 2026 15:28
@ziggie1984 ziggie1984 added the backport-v0.21.x-branch This label triggers a backport to branch `v0.21.x-branch ` label Jun 12, 2026
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from 8b7ee4a to 7546618 Compare June 13, 2026 02:59
Add coverage for held forwards that move on chain after the
incoming channel force closes.

The restart case exercises the path where Bob loses the in-memory
held set and contractcourt re-offers the HTLC through the witness
beacon. The no-restart case keeps the original off-chain hold and
proves that settlement must still reach the on-chain resolver.
Store held forwards as off-chain or on-chain entries instead of a raw
InterceptedForward map. Off-chain entries keep the existing resume, fail,
settle and auto-fail behavior. On-chain entries are settle-only and
expire by pruning local interceptor state.

When contractcourt re-offers a circuit that is already held off-chain,
replace the stored entry with the on-chain forward so a later SETTLE
reaches the witness beacon instead of the old link mailbox path.

Also set the on-chain interceptor deadline to the HTLC refund timeout.
This keeps the public interceptor deadline populated while ensuring only
off-chain held entries use that value to fail back.
Release the preimage beacon lock before invoking the on-chain
interceptor. The interceptor path can block on the htlcswitch event
loop, while resolution of another held on-chain HTLC can call back
into the beacon to add a preimage.

If interceptor delivery fails after the subscriber was registered,
cancel the subscription before returning the error.
Only off-chain held HTLCs can be released when an optional interceptor
disconnects, because they can resume into the link forwarding flow.

On-chain held HTLCs have no link flow to resume. Keep them in the held
set so a reconnecting interceptor can replay and settle them while
contractcourt waits for the preimage or on-chain expiry.
When an off-chain held HTLC is promoted to on-chain, the interceptor is
not notified again because it already has the circuit key. A client may
therefore still answer with an off-chain decision such as resume or fail.

Those actions cannot be applied to the on-chain resolver, but returning an
error tears down the whole interceptor stream. Log and ignore non-settle
actions instead, keeping the entry available for a later settle or expiry.
On-chain held entries are replay handles for the interceptor while
contractcourt waits for a preimage or on-chain expiry. Once the resolver
tears down, keeping the handle until the refund timeout can replay a stale
HTLC to a reconnecting interceptor.

Thread a dedicated cleanup signal from the witness subscription cancel path
back through the interceptable switch event loop. The held set only removes
on-chain entries for that signal, leaving off-chain entries under the link
flow lifecycle.
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from 7546618 to 6e385f6 Compare June 13, 2026 12:42
@ziggie1984 ziggie1984 moved this from In progress to In review in v0.21 Jun 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-v0.21.x-branch This label triggers a backport to branch `v0.21.x-branch ` bug fix severity-critical Requires expert review - security/consensus critical

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

[bug]: on-chain intercepts evicted after one block due to unset AutoFailHeight

3 participants