fix: make MergeRun wg.Wait() and CnServerMessageHandler connection wait cancelable #25035
Draft
ck89119 wants to merge 2 commits into
…it cancelable (matrixorigin#25025)

Three fixes to prevent distributed query deadlock when cross-CN data stream stalls:

1. MergeRun defer wg.Wait(): use goroutine+select with ctx.Done() to avoid blocking forever when sub-routines fail to call wg.Done().
2. sendNotifyMessage closeWithError: use select for Ch2 send to avoid blocking when pipeline consumer has stopped.
3. CnServerMessageHandler connection wait: observe messageCtx.Done() in addition to connectionCtx.Done() so killed queries don't leave handlers blocked waiting for TCP close.

Previously KILL/cancel had no effect on these blocking points, making the query unkillable and table locks unreleasable without CN restart.

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Fix a distributed INSERT...SELECT deadlock in which a stalled cross-CN data stream causes Scope.MergeRun to block forever in wg.Wait(), making the query unkillable and table locks unreleasable without a CN restart.
Root Cause
Three blocking points form a cascade that ignores KILL/context cancellation:
1. reg.Ch2 <- signal blocks when the pipeline consumer has stopped.
2. wg.Wait() never returns because sub-routines can't call wg.Done().
3. <-receiver.connectionCtx.Done() waits for TCP close only and is not cancellable.
Changes
1. reg.Ch2 <- changed to select with ctx.Done().
2. wg.Wait() wrapped in goroutine + select with ctx.Done().
Tests
3 new unit tests + 3 existing tests all pass.
Issue
Fixes #25025
🤖 Generated with Claude Code