feat(#442484): suppress TaskCanceledException APM noise during pod graceful shutdown #113

Merged: benspeth merged 1 commit into master from teams/core/209/us/442484 on Jun 5, 2026

Conversation

@ecofrankie
Collaborator

Summary

Adds a guard clause to ReceiverWrapper.OnExceptionOccured that detects OperationCanceledException / TaskCanceledException raised during pod graceful shutdown and logs them at Debug instead of Error.

Root cause: When a pod shuts down, CloseAsync cancels all concurrent receive loops simultaneously. Each loop raises an OperationCanceledException with IsCancellationRequested == true and calls ProcessErrorAsync, which previously logged every one at Error level — producing ~250 spurious APM error entries per day in ExternalCommEmailing.

Fix: In OnExceptionOccured, check exceptionEvent.Exception is OperationCanceledException oce && oce.CancellationToken.IsCancellationRequested. If true → log at Debug and return early. Real transport errors (non-cancelled token) continue to log at Error.
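The guard described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the code in ReceiverWrapper.cs: `GuardClauseSketch` and its in-memory `Log` list are hypothetical stand-ins for the real wrapper and its logger, and the method here takes the exception directly instead of an event-args object.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for ReceiverWrapper; not the library's actual class.
public sealed class GuardClauseSketch
{
    // Records (level, message) pairs so the chosen log level is observable.
    public List<(string Level, string Message)> Log { get; } = new();

    public void OnExceptionOccured(Exception exception)
    {
        // TaskCanceledException derives from OperationCanceledException,
        // so one type pattern covers both exception types.
        if (exception is OperationCanceledException oce
            && oce.CancellationToken.IsCancellationRequested)
        {
            Log.Add(("Debug", "Receive loop cancelled during shutdown."));
            return; // early return: nothing is logged at Error
        }

        // Anything else is treated as a genuine failure.
        Log.Add(("Error", exception.Message));
    }
}

public static class Program
{
    public static void Main()
    {
        var wrapper = new GuardClauseSketch();

        // Cancellation carrying an already-cancelled token: Debug path.
        wrapper.OnExceptionOccured(
            new OperationCanceledException(new CancellationToken(canceled: true)));

        // Unrelated failure: Error path.
        wrapper.OnExceptionOccured(new TimeoutException("transport timeout"));

        foreach (var (level, message) in wrapper.Log)
            Console.WriteLine($"{level}: {message}");
    }
}
```

One consequence of checking the token rather than only the exception type: an OperationCanceledException created without a token carries `CancellationToken.None`, whose `IsCancellationRequested` is false, so it would still take the Error path.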

Changes

  • src/Ev.ServiceBus/Management/Wrappers/ReceiverWrapper.cs — guard clause + initialise _onExceptionReceivedHandler to non-nullable no-op default
  • tests/Ev.ServiceBus.UnitTests/ReceiverWrapperTests.cs — 2 new unit tests covering both paths
  • docs/CHANGELOG.md — 5.7.2 entry

Test plan

  • OnExceptionOccured_WithCancelledToken_DoesNotLogError — verifies no LogError call when IsCancellationRequested == true
  • OnExceptionOccured_WithNonCancelledToken_LogsError — verifies LogError is still called for genuine transport errors
  • All existing tests pass
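As a rough picture of how the two new tests can reach OnExceptionOccured without registering a message handler, the sketch below combines a non-nullable no-op default handler with a test-only subclass. Everything here is assumed for illustration: `ReceiverWrapperSketch`, `ErrorCount`, `RegisterExceptionHandler`, `InvokeOnExceptionOccured`, the `protected` visibility, and the handler being invoked after the error is counted are not taken from the actual source, and a plain counter stands in for verifying LogError calls.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ReceiverWrapper.
public class ReceiverWrapperSketch
{
    // Non-nullable no-op default: safe to invoke before any handler is registered.
    private Func<Exception, Task> _onExceptionReceivedHandler = _ => Task.CompletedTask;

    // Stands in for "LogError was called" in this sketch.
    public int ErrorCount { get; private set; }

    public void RegisterExceptionHandler(Func<Exception, Task> handler) =>
        _onExceptionReceivedHandler = handler;

    protected async Task OnExceptionOccured(Exception exception)
    {
        if (exception is OperationCanceledException oce
            && oce.CancellationToken.IsCancellationRequested)
        {
            return; // shutdown cancellation: no error recorded
        }

        ErrorCount++;
        await _onExceptionReceivedHandler(exception);
    }
}

// Test-only subclass that exposes the protected method.
public sealed class TestableReceiverWrapper : ReceiverWrapperSketch
{
    public Task InvokeOnExceptionOccured(Exception exception) =>
        OnExceptionOccured(exception);
}

public static class Program
{
    public static async Task Main()
    {
        // Mirrors OnExceptionOccured_WithCancelledToken_DoesNotLogError.
        var cancelled = new TestableReceiverWrapper();
        await cancelled.InvokeOnExceptionOccured(
            new OperationCanceledException(new CancellationToken(canceled: true)));
        if (cancelled.ErrorCount != 0) throw new Exception("unexpected error log");

        // Mirrors OnExceptionOccured_WithNonCancelledToken_LogsError.
        var genuine = new TestableReceiverWrapper();
        await genuine.InvokeOnExceptionOccured(
            new OperationCanceledException(CancellationToken.None));
        if (genuine.ErrorCount != 1) throw new Exception("error log missing");

        Console.WriteLine("Both paths behave as described.");
    }
}
```

With the no-op default in place, neither test needs to register a handler first, which is the property the commit message relies on.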

…aceful shutdown

Add guard clause to ReceiverWrapper.OnExceptionOccured: when the exception is
an OperationCanceledException with IsCancellationRequested=true, log at Debug
and return early instead of logging at Error. This eliminates ~250 spurious APM
error entries per day caused by all concurrent receivers being cancelled
simultaneously during pod graceful shutdown via CloseAsync.

Real transport errors (non-cancelled token) continue to be logged at Error,
preserving the original behaviour for genuine failures.

Also initialise _onExceptionReceivedHandler to a no-op default (non-nullable)
so OnExceptionOccured is safe to call without RegisterMessageHandler, which
enables direct unit testing via a TestableReceiverWrapper subclass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ecofrankie force-pushed the teams/core/209/us/442484 branch from e4a4b5c to c0a43b9 on June 5, 2026 08:32
benspeth self-assigned this on Jun 5, 2026
benspeth added the bug label on Jun 5, 2026
benspeth merged commit de4367d into master on Jun 5, 2026
1 check passed
