fix(e2e): chromedriver proxy host/origin, docker exec output, host-access skip, devtoolsproxy timeout#282

Merged
rgarcia merged 4 commits into main from fix/chromedriver-proxy-host-header
Jun 11, 2026
Conversation

@rgarcia rgarcia commented Jun 10, 2026

Contributor

Several related e2e/server fixes: the shared, upstream-appropriate pieces of the hypeman-e2e work, plus a fix for a pre-existing CI flake.

1. ChromeDriver proxy forwards a non-loopback Host/Origin (HTTP 500)

ChromeDriver (Chrome 111+) rejects requests whose Host/Origin isn't localhost/an IP ("Host header or origin header ... not whitelisted or localhost", HTTP 500). The :9224 proxy did r.Out.Host = r.In.Host, forwarding the inbound host; over a direct 127.0.0.1 hit (docker) that's loopback so it works, but behind an ingress it 500s every request (looks like a slow/never-ready chromedriver). Fix: keep SetURL's loopback host and strip Origin, in both the reverse-proxy path and handleCreateSession. Adds a regression test.
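A minimal sketch of the reverse-proxy side of this fix, using only the Go standard library. It is not the PR's code: `newLoopbackProxy`, `observe`, and the `instance.example.test` hostname are hypothetical names chosen for illustration, and a local `httptest` server stands in for ChromeDriver. It shows that when `Rewrite` calls `SetURL` and does not copy `r.In.Host` back, the upstream sees the target's loopback host and no Origin.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newLoopbackProxy is a hypothetical constructor (not from the PR). It keeps
// the Host that SetURL derives from target and drops the Origin header.
func newLoopbackProxy(target *url.URL) *httputil.ReverseProxy {
	return &httputil.ReverseProxy{
		Rewrite: func(r *httputil.ProxyRequest) {
			// SetURL routes to target and rewrites the outbound Host to match it.
			r.SetURL(target)
			// Deliberately no `r.Out.Host = r.In.Host` here.
			r.Out.Header.Del("Origin")
		},
	}
}

// seen records what the stand-in upstream received.
type seen struct{ host, origin string }

// observe sends one request through the proxy with an ingress-style Host and
// Origin, then reports the Host and Origin the upstream actually received,
// plus the upstream's own host:port for comparison.
func observe(ingressHost string) (gotHost, gotOrigin, upstreamHost string) {
	ch := make(chan seen, 1)
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ch <- seen{r.Host, r.Header.Get("Origin")}
	}))
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		panic(err)
	}
	proxy := httptest.NewServer(newLoopbackProxy(target))
	defer proxy.Close()

	req, err := http.NewRequest(http.MethodGet, proxy.URL+"/status", nil)
	if err != nil {
		panic(err)
	}
	req.Host = ingressHost // what an ingress would forward
	req.Header.Set("Origin", "https://"+ingressHost)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	s := <-ch
	return s.host, s.origin, target.Host
}

func main() {
	host, origin, want := observe("instance.example.test:9224")
	fmt.Printf("upstream Host=%q (target %q), Origin=%q\n", host, want, origin)
}
```

Adding `r.Out.Host = r.In.Host` back into `Rewrite` would make the upstream see the ingress hostname again, which is the behavior ChromeDriver rejects.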

2. Docker exec output includes stream-multiplexing headers

dockerBackend.Exec returned the raw multiplexed Docker stream. Use testcontainers exec.Multiplexed() for clean combined stdout+stderr.

3. Backend.SupportsHostAccess + skip host-access tests on backends without it

Adds Backend.SupportsHostAccess() (docker=true, hypeman=false); TestContainer.Start t.Skips when a test requests HostAccess on a backend that can't bridge to the runner's loopback. Keyed on the request, so a test that doesn't ask for HostAccess is never skipped.

4. devtoolsproxy: raise UpstreamManager detect timeout 20s -> 60s

TestUpstreamManagerDetectsChromiumAndRestart is a pre-existing flake on main: it launches a real browser and waits for the "DevTools listening on ws://..." line, but chromium cold-start has a long tail on shared CI runners (recent runs: ~6s warm, 15-17s contended, occasionally >20s, failing at exactly ~20.15s, which indicates a timeout rather than a missing line). Raise the wait to 60s. (Supersedes #283, now closed; the browser-binary pinning explored there was unnecessary because the timeout is the real fix.)

🤖 Generated with Claude Code

…omeDriver accepts proxied requests

The ChromeDriver proxy fronts the internal `chromedriver --port=9225` on
:9224. ChromeDriver (Chrome 111+) rejects any request whose Host/Origin
header is not localhost or an IP, returning HTTP 500 with body "Host
header or origin header is specified and is not whitelisted or localhost"
(DNS-rebinding protection).

The reverse-proxy Rewrite did `r.Out.Host = r.In.Host`, which overrode
the loopback host that SetURL had set, forwarding the inbound Host to
ChromeDriver. Over a direct 127.0.0.1 connection (docker e2e) the inbound
Host is already loopback so it works, but behind an ingress (e.g.
{instance}.dev-yul-hypeman-1.kernel.sh:9224, the hypeman e2e path) the
real hostname gets forwarded and ChromeDriver 500s every request. This
looked like a slow/never-ready chromedriver because WaitChromeDriver
polls /status for a 200 that never comes.

handleCreateSession had the same class of bug: it copied all client
headers (including Origin) to the upstream.

Fix: drop the `r.Out.Host = r.In.Host` line so the upstream sees the
loopback Host, strip the Origin header in the rewrite, and skip the
Origin header in handleCreateSession's header-copy loop. Adds a
regression test (TestHandler_RewritesHostAndStripsOrigin) exercising both
the reverse-proxy path and POST /session with an ingress Host/Origin.
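The header-copy part of the fix can be sketched as below. This is an illustration under assumptions, not the actual handleCreateSession code: `copyHeadersWithoutOrigin` is a hypothetical helper name, and the real loop may filter other headers as well.

```go
package main

import (
	"fmt"
	"net/http"
)

// copyHeadersWithoutOrigin (hypothetical name) copies every client header to
// the upstream request headers except Origin.
func copyHeadersWithoutOrigin(dst, src http.Header) {
	for name, values := range src {
		if http.CanonicalHeaderKey(name) == "Origin" {
			continue // ChromeDriver rejects non-local Origin values
		}
		for _, v := range values {
			dst.Add(name, v)
		}
	}
}

func main() {
	client := http.Header{}
	client.Set("Content-Type", "application/json")
	client.Set("Origin", "https://instance.example.test:9224")

	upstream := http.Header{}
	copyHeadersWithoutOrigin(upstream, client)

	// prints: application/json true
	fmt.Println(upstream.Get("Content-Type"), upstream.Get("Origin") == "")
}
```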

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@firetiger-agent


Created a monitoring plan for this PR.

What this PR does: Fixes Selenium/WebDriver session creation inside browser VMs when the VM is accessed via an ingress hostname rather than a direct loopback connection — unblocking the hypeman e2e test path and any production customer using the WebDriver protocol behind an ingress.

Intended effect:

  • Synthetic browser test pass rate: baseline ~96–120 tests/hr, 0–9 failures/hr; confirmed if this rate holds steady after new VM images are provisioned.
  • WaitChromeDriver timeout: baseline — this call hung indefinitely behind an ingress (the bug). Confirmed if e2e tests using the hypeman backend complete without timeout.
  • CI regression test TestHandler_RewritesHostAndStripsOrigin: baseline — test did not exist pre-PR. Confirmed if CI passes green on this PR.

Risks:

  • Origin strip too broad: the Origin header is now removed unconditionally for all proxy paths; if any upstream WebDriver flow legitimately required Origin forwarding, sessions could fail differently. Signal: synthetic test failures > 10/hr for 2+ consecutive hours; alert if sustained.
  • WebSocket BiDi path not covered: proxyWebSocket has separate origin handling (OriginPatterns: ["*"]); if BiDi WebSocket sessions were also broken behind an ingress, they remain untested by this fix. Signal: any BiDi connection errors in e2e test output.
  • Host header dependency: removal of r.Out.Host = r.In.Host means the upstream Host depends entirely on SetURL; if URL resolution breaks, ChromeDriver could receive a wrong Host. Signal: any WaitChromeDriver timeout or ChromeDriver 500 in test logs.

Status updates will be posted automatically on this PR as monitoring progresses.

@rgarcia rgarcia requested a review from tnsardesai June 10, 2026 14:57
dockerBackend.Exec returned the raw multiplexed Docker stream, so callers
that parse Exec output saw Docker's 8-byte frame headers interleaved with
the data. Use testcontainers exec.Multiplexed() to demultiplex into clean
combined stdout+stderr, matching what tests expect.
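To make the 8-byte frame headers concrete, here is a standard-library sketch of the multiplexed stream layout and a hand-written demultiplexer. The PR itself relies on testcontainers' exec.Multiplexed() rather than code like this; `frame` and `demux` are hypothetical helpers. The layout assumed is Docker's documented one: byte 0 is the stream type (1 for stdout, 2 for stderr), bytes 1-3 are zero, and bytes 4-7 hold the payload length as a big-endian uint32.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frame builds one multiplexed frame: an 8-byte header followed by the payload.
func frame(stream byte, payload string) []byte {
	hdr := make([]byte, 8)
	hdr[0] = stream // 1 = stdout, 2 = stderr
	binary.BigEndian.PutUint32(hdr[4:], uint32(len(payload)))
	return append(hdr, payload...)
}

// demux strips the frame headers and returns stdout and stderr combined in
// arrival order.
func demux(raw []byte) (string, error) {
	var out []byte
	for len(raw) > 0 {
		if len(raw) < 8 {
			return string(out), errors.New("truncated frame header")
		}
		n := int(binary.BigEndian.Uint32(raw[4:8]))
		raw = raw[8:]
		if len(raw) < n {
			return string(out), errors.New("truncated frame payload")
		}
		out = append(out, raw[:n]...)
		raw = raw[n:]
	}
	return string(out), nil
}

func main() {
	raw := append(frame(1, "hello\n"), frame(2, "warn\n")...)
	fmt.Printf("raw:   %q\n", raw) // header bytes are interleaved with the data
	clean, err := demux(raw)
	fmt.Printf("clean: %q err=%v\n", clean, err)
}
```

A caller that parses the raw bytes directly would see the header bytes mixed into the text, which is the symptom described above.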

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rgarcia rgarcia changed the title from "fix(chromedriverproxy): forward loopback Host and strip Origin so ChromeDriver accepts proxied requests" to "fix(e2e): chromedriver proxy host/origin + clean docker exec output" Jun 10, 2026
…ds without it

Mirrors the host-access handling from the private hypeman-e2e work so the two
repos' e2e Backend surface stays identical (no sync drift).

Add Backend.SupportsHostAccess() (docker=true, hypeman=false) and have
TestContainer.Start t.Skip when a test requests ContainerConfig.HostAccess on a
backend that can't bridge the instance to a service on the test host. Keyed on
the request, so a test that doesn't ask for HostAccess is never skipped.
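The skip rule can be sketched as a small predicate. The names Backend, SupportsHostAccess, ContainerConfig.HostAccess, dockerBackend, and the docker=true/hypeman=false values come from the description above; the type shapes, `hypemanBackend`, and the `shouldSkip` helper are assumptions made for illustration. In the real TestContainer.Start the result would presumably gate a call to t.Skip.

```go
package main

import "fmt"

// Backend is reduced here to the one method this sketch needs.
type Backend interface {
	SupportsHostAccess() bool
}

type dockerBackend struct{}

func (dockerBackend) SupportsHostAccess() bool { return true }

type hypemanBackend struct{}

func (hypemanBackend) SupportsHostAccess() bool { return false }

// ContainerConfig is assumed to carry a boolean HostAccess request.
type ContainerConfig struct {
	HostAccess bool
}

// shouldSkip (hypothetical helper) is keyed on the request: a test is skipped
// only if it asks for host access and the backend cannot provide it.
func shouldSkip(b Backend, cfg ContainerConfig) bool {
	return cfg.HostAccess && !b.SupportsHostAccess()
}

func main() {
	for _, c := range []struct {
		name string
		b    Backend
	}{{"docker", dockerBackend{}}, {"hypeman", hypemanBackend{}}} {
		for _, want := range []bool{false, true} {
			fmt.Printf("%-8s HostAccess=%-5v skip=%v\n",
				c.name, want, shouldSkip(c.b, ContainerConfig{HostAccess: want}))
		}
	}
}
```

Only the hypeman backend combined with HostAccess=true yields skip=true; every other combination runs.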

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TestUpstreamManagerDetectsChromiumAndRestart launches a real browser and waits
for UpstreamManager to scrape the "DevTools listening on ws://..." line. On
shared CI runners chromium cold-start has a long tail: recent runs printed the
line in ~6s warm but 15-17s when contended, occasionally exceeding the 20s
budget and failing at exactly ~20.15s — a timeout, not a missing line. Raise
the wait to 60s (still fails fast if the browser truly never starts).
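The distinction between a timeout and a missing line can be sketched as follows. This is not UpstreamManager's implementation: `waitForDevTools`, the sentinel errors, the channel-of-lines interface, and the example ws:// URL are all hypothetical. It only illustrates waiting for the "DevTools listening on" line under a deadline, where the deadline is the value the PR raises from 20s to 60s.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

const devToolsPrefix = "DevTools listening on "

var (
	errTimeout = errors.New("timed out waiting for DevTools line")
	errClosed  = errors.New("output ended without a DevTools line")
)

// waitForDevTools reads browser output lines until one contains the DevTools
// banner, the output ends, or the timeout elapses. It returns the URL that
// follows the banner text.
func waitForDevTools(lines <-chan string, timeout time.Duration) (string, error) {
	deadline := time.After(timeout)
	for {
		select {
		case line, ok := <-lines:
			if !ok {
				return "", errClosed // missing line: the process stopped printing
			}
			if i := strings.Index(line, devToolsPrefix); i >= 0 {
				return strings.TrimSpace(line[i+len(devToolsPrefix):]), nil
			}
		case <-deadline:
			return "", errTimeout // slow start: the line may still have been coming
		}
	}
}

func main() {
	lines := make(chan string, 2)
	lines <- "[startup] launching chromium"
	lines <- "DevTools listening on ws://127.0.0.1:9222/devtools/browser/example"

	url, err := waitForDevTools(lines, 60*time.Second)
	fmt.Println(url, err)
}
```

A failure at almost exactly the budget, as in the ~20.15s runs described above, corresponds to the errTimeout branch rather than errClosed.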

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rgarcia rgarcia changed the title from "fix(e2e): chromedriver proxy host/origin + clean docker exec output" to "fix(e2e): chromedriver proxy host/origin, docker exec output, host-access skip, devtoolsproxy timeout" Jun 11, 2026
@rgarcia rgarcia merged commit 52cd8e3 into main Jun 11, 2026
10 checks passed
@rgarcia rgarcia deleted the fix/chromedriver-proxy-host-header branch June 11, 2026 12:58
2 participants