Surface still-running extensions during MTP shutdown (refs #5345)#8582
Surface still-running extensions during MTP shutdown (refs #5345)#8582Evangelink wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a shutdown-progress tracking/reporting service to Microsoft.Testing.Platform (MTP) so that after Ctrl+C cancellation, the host can periodically warn which extensions/consumers are still blocking shutdown.
Changes:
- Introduces
IShutdownProgressReporter+ defaultShutdownProgressReporterwatchdog that emits “Still waiting for …” warnings after a quiet window once cancellation is requested. - Wraps key shutdown await sites with tracking scopes (session finishing handlers + async consumer drain).
- Adds unit tests and localized resource strings for the warning prefix.
Show a summary per file
| File | Description |
|---|---|
| test/UnitTests/Microsoft.Testing.Platform.UnitTests/Services/ShutdownProgressReporterTests.cs | Adds unit tests covering tracking/snapshot and watchdog emission behavior. |
| src/Platform/Microsoft.Testing.Platform/Services/IShutdownProgressReporter.cs | Introduces internal tracking interface for shutdown in-flight work. |
| src/Platform/Microsoft.Testing.Platform/Services/ShutdownProgressReporter.cs | Implements watchdog that reports still-running shutdown work via output device + logging. |
| src/Platform/Microsoft.Testing.Platform/Hosts/TestHostBuilder.CommonServices.cs | Registers ShutdownProgressReporter in the common service provider (after output device). |
| src/Platform/Microsoft.Testing.Platform/Hosts/CommonTestHost.cs | Tracks OnTestSessionFinishingAsync awaits (non-consumer + consumer passes). |
| src/Platform/Microsoft.Testing.Platform/Messages/AsynchronousMessageBus.cs | Tracks per-consumer DrainDataAsync awaits; adds ctor overload to accept reporter. |
| src/Platform/Microsoft.Testing.Platform/Hosts/TestHostBuilder.Framework.cs | Passes IShutdownProgressReporter into AsynchronousMessageBus construction. |
| src/Platform/Microsoft.Testing.Platform/Hosts/TestHostControllersTestHost.cs | Passes IShutdownProgressReporter into AsynchronousMessageBus construction. |
| src/Platform/Microsoft.Testing.Platform/Resources/PlatformResources.resx | Adds localized resource key for “Still waiting for: ” prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.cs.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.de.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.es.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.fr.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.it.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.ja.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.ko.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.pl.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.pt-BR.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.ru.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.tr.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.zh-Hans.xlf | Adds new trans-unit for the shutdown progress prefix. |
| src/Platform/Microsoft.Testing.Platform/Resources/xlf/PlatformResources.zh-Hant.xlf | Adds new trans-unit for the shutdown progress prefix. |
Copilot's findings
- Files reviewed: 22/22 changed files
- Comments generated: 0
Self-review summaryI went through the PR end-to-end to weigh the added complexity against the value. My take: it's worth landing as an incremental improvement, with the minor catch-clause tightening I just pushed in 5cf2741. Value vs. complexity
Caveats already disclosed in the PR body
CodeQL bot comments
Verification
|
When the test-application cancellation token is signalled and shutdown takes longer than expected, MTP now periodically prints a 'Still waiting for: ...' warning listing the extensions and consumers that have not yet returned. This makes a hanging Ctrl+C observable without users having to inspect the process state. Adds a new internal IShutdownProgressReporter service plus a default ShutdownProgressReporter implementation. The reporter wraps the three known blocking await sites: * ITestSessionLifetimeHandler.OnTestSessionFinishingAsync (both non-consumer and consumer passes in CommonTestHost) * IAsyncConsumerDataProcessor.DrainDataAsync per consumer in AsynchronousMessageBus The watchdog only starts after the cancellation token fires, waits a quiet window (3s) to avoid noise on clean shutdowns, then polls every second. Output goes through IOutputDevice as a WarningMessageOutputDeviceData. Refs #5345. Complementary to #8580 - the same trackers will feed the eventual `force-killed because X didn't drain` message once the phased-shutdown RFC lands. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses three CodeQL findings about generic catch clauses in the watchdog and ReportAsync of ShutdownProgressReporter. Behavior is intentionally preserved -- these are best-effort during shutdown -- so the only change is making the swallow explicit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5cf2741 to
0df243d
Compare
Capture the watchdog stop token synchronously in OnCancellationRequested and thread it through RunWatchdogAsync into ReportAsync, so a concurrent Dispose() (which both cancels and disposes the CTS) cannot turn the DisplayAsync call site into an ObjectDisposedException path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| IReadOnlyList<TrackedWork> snapshot = Snapshot(); | ||
| if (snapshot.Count == 0) | ||
| { | ||
| return; | ||
| } |
There was a problem hiding this comment.
Good catch — addressed in 5dbd63e by keeping the watchdog loop alive when the snapshot is empty and only emitting when there is something to report. Late-tracked work after the quiet window is now reported on the next poll instead of being silently missed.
| [TestMethod] | ||
| public async Task Watchdog_StopsEmitting_OnceAllTrackersDisposed() | ||
| { | ||
| using ShutdownProgressReporter reporter = CreateReporter(quietWindow: TimeSpan.FromMilliseconds(20), pollInterval: TimeSpan.FromMilliseconds(20)); | ||
| IDisposable handle = reporter.Track("uid-1", "Display 1", "Phase 1"); | ||
|
|
||
| CancelTokenSource(); | ||
| await WaitForMessageAsync(TimeSpan.FromSeconds(5)); | ||
|
|
||
| handle.Dispose(); | ||
| int countAfterDispose = _outputDevice.Messages.Count; | ||
|
|
||
| await Task.Delay(200, TestContext.CancellationToken); | ||
|
|
||
| // Allow at most one extra in-flight emission that was already running when we disposed. | ||
| Assert.IsLessThanOrEqualTo(countAfterDispose + 1, _outputDevice.Messages.Count); | ||
| } |
There was a problem hiding this comment.
Added in 5dbd63e as Watchdog_EmitsForLateTracking_WhenQuietWindowElapsedWithNoWork: cancels first, waits past the quiet window with zero tracked work, then begins tracking and asserts the late work is reported. Verified passing on net8.0, net9.0, and net462.
If cancellation happens before any shutdown work is tracked (e.g. extensions only start their shutdown handlers after the quiet window has elapsed), the watchdog must keep polling instead of exiting silently. Otherwise late tracking is never reported. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes part of #5345 (incremental improvement, complementary to the phased-shutdown RFC in #8580).
Problem
Today when a user hits
Ctrl+Con a hanging MTP test run, the cancellation token gets signalled but the user has no visibility into which extension or consumer is blocking the shutdown. The process just appears stuck.Approach
Add a small
IShutdownProgressReporterservice that:using (reporter.Track(uid, displayName, phase)) { await ... }.Still waiting for: <Name1> (<Phase1>, <Ns>); <Name2> ...warning throughIOutputDevice.The three known blocking await sites are now wrapped:
ITestSessionLifetimeHandler.OnTestSessionFinishingAsync(non-consumer pass) —CommonTestHostITestSessionLifetimeHandler.OnTestSessionFinishingAsync(consumer pass) —CommonTestHostIAsyncConsumerDataProcessor.DrainDataAsyncper consumer —AsynchronousMessageBusThis is intentionally complementary to #8580. The RFC there proposes a full phased-shutdown protocol with a graceful timer and forced abort. The same trackers introduced here will feed the eventual
Force-killed because X didn't drainmessage in that bigger work — but in the meantime this PR ships value independently with zero behavior change on the happy path.Limitations
Uid/DisplayName).WarningMessageOutputDeviceData(yellow text); coordination with the live progress line is best-effort.PlatformResources.resx/ regenerated.xlffiles.Tests
7 new unit tests in
ShutdownProgressReporterTestscovering snapshot semantics, idempotent dispose, quiet-window behavior, post-cancellation emission, no-emit-if-all-disposed, and post-disposalTrack. All pass on net9.0 and net462.Out of scope