feat: CORS iframes + closed shadow DOM parity #226

Draft
aryanku-dev wants to merge 15 commits into master from feat/cors-iframes-and-shadow-dom

Summary

Brings percy-selenium-python to parity with the canonical Percy CORS iframe + closed shadow DOM feature set.

Implemented

  • Inlined Python helpers (DEFAULT_MAX_FRAME_DEPTH, clamp_frame_depth, normalize_ignore_selectors, is_unsupported_iframe_src, resolve_max_frame_depth, resolve_ignore_selectors)
  • Nested cross-origin iframe capture (depth-capped, cycle-guarded)
  • data-percy-ignore attribute opt-out
  • ignoreIframeSelectors option
  • Post-switch URL re-check via is_unsupported_iframe_src
  • PercyContextLost recovery merges partial_capture
  • Closed shadow DOM capture via CDP (expose_closed_shadow_roots)

Skipped

  • ElementInternals preflight (Feature 8): N/A — selenium-python has no before-page-load hook.
  • @percy/sdk-utils version bump (Feature 9): not applicable to Python; helpers inlined.

Reference

Mirrored from percy/percy-nightwatch#869 (PER-7292-add-cors-iframe-support); CDP from percy/percy-playwright#609.

Test plan

  • Full repo test suite passed locally
  • Manual smoke: cross-origin iframes
  • Manual smoke: closed shadow roots in Chromium

🤖 Generated with Claude Code via /percy-sdk-sync

aryanku-dev and others added 15 commits May 11, 2026 12:23
…arity

Brings the same helper surface used by percy-nightwatch / percy-webdriverio
into percy-selenium-python directly: DEFAULT_MAX_FRAME_DEPTH, clamp_frame_depth,
normalize_ignore_selectors, is_unsupported_iframe_src, get_origin,
resolve_max_frame_depth, resolve_ignore_selectors, and the PercyContextLost
exception. No SDK version bump — Python doesn't share a utils package with
the JS SDKs, so the helpers ship inline.
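
A minimal sketch of what a few of these inlined helpers might look like. The names match the commit message, but the bodies are assumptions: `HARD_MAX_FRAME_DEPTH`, the accepted input types, and the exact list of unsupported schemes are illustrative and not taken from the PR.

```python
from urllib.parse import urlsplit

DEFAULT_MAX_FRAME_DEPTH = 5   # default stated in this PR
HARD_MAX_FRAME_DEPTH = 10     # hypothetical upper bound, not from the PR

# Assumed scheme list; the PR only names about:blank, about:srcdoc,
# net-error pages, and "another unsupported scheme".
UNSUPPORTED_PREFIXES = ("about:", "javascript:", "data:", "blob:", "chrome-error:")


def clamp_frame_depth(value, default=DEFAULT_MAX_FRAME_DEPTH):
    """Coerce a user-supplied depth to an int in [1, HARD_MAX_FRAME_DEPTH]."""
    try:
        depth = int(value)
    except (TypeError, ValueError, OverflowError):
        return default
    return max(1, min(depth, HARD_MAX_FRAME_DEPTH))


def normalize_ignore_selectors(value):
    """Accept None, one selector string, or an iterable of selector strings."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [s.strip() for s in value if isinstance(s, str) and s.strip()]


def is_unsupported_iframe_src(src):
    """True for empty values and for URLs with a scheme we cannot capture."""
    if not src:
        return True
    return src.strip().lower().startswith(UNSUPPORTED_PREFIXES)


def get_origin(url):
    """Return scheme://host[:port], or None when the URL has no authority."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"
```

Keeping these as pure functions means they can be unit-tested without a browser session.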

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements feature parity with percy-nightwatch / percy-webdriverio for
cross-origin iframe serialization:

- Replaces the flat single-level iframe scan with a recursive
  ``process_frame_tree`` walk bounded by ``DEFAULT_MAX_FRAME_DEPTH``
  (5, overridable via ``maxIframeDepth`` option or
  ``percy.config.snapshot.maxIframeDepth``).
- Adds an ancestor-URL cycle guard so frames that link back to a
  previously-visited URL stop descending instead of recursing forever.
- Adds an ``enumerate_iframes_script`` JS helper that runs inside the
  current frame context and returns metadata for every iframe
  (src, srcdoc, percyElementId, dataPercyIgnore, matchesIgnoreSelector,
  index). Nested-frame discovery now uses this script in the child
  context so nested-frame origin comparisons are against the *immediate*
  parent origin, not the page origin.
- ``data-percy-ignore`` attribute opt-out: any iframe with this attribute
  is dropped before any switch.
- ``ignoreIframeSelectors`` option (and ``ignore_iframe_selectors`` /
  ``percy.config.snapshot.ignoreIframeSelectors``): selectors are baked
  into the in-browser enumeration script so matching iframes are dropped
  before being processed.
- Post-switch URL re-check via ``is_unsupported_iframe_src``: after
  switching into a frame we read ``document.URL`` and bail if the
  loaded document is about:blank, about:srcdoc, a net-error page, or
  another unsupported scheme.
- ``PercyContextLost`` recovery: if ``switch_to.parent_frame()`` fails
  at depth > 1 we raise ``PercyContextLost`` carrying the
  ``partial_capture`` collected so far. The top-level walk merges that
  partial capture into the final ``corsIframes`` payload before
  aborting sibling iteration (whose enumeration was performed in a
  now-lost context).
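
The depth cap, cycle guard, and ignore opt-out described above can be sketched on a plain in-memory model of the frame tree. This is not the real ``process_frame_tree``: ``walk_frame_tree`` and the dict shape (``url``, ``ignore``, ``children``) are hypothetical stand-ins for the Selenium frame switching, and the ``ignore`` flag stands in for both ``data-percy-ignore`` and a selector match.

```python
DEFAULT_MAX_FRAME_DEPTH = 5  # default stated in this PR


def walk_frame_tree(frame, ancestor_urls, depth=1,
                    max_depth=DEFAULT_MAX_FRAME_DEPTH, captured=None):
    """Collect (depth, url) for every frame the walk would serialize."""
    captured = [] if captured is None else captured

    # Depth cap: stop descending once the configured limit is exceeded.
    if depth > max_depth:
        return captured

    # Opt-out: ignored frames are dropped before any switch happens.
    if frame.get("ignore"):
        return captured

    # Cycle guard: a frame whose URL already appears among its ancestors
    # would recurse forever, so it is skipped.
    url = frame["url"]
    if url in ancestor_urls:
        return captured

    captured.append((depth, url))

    # Each child sees its own ancestor chain; siblings do not share state.
    child_ancestors = set(ancestor_urls) | {url}
    for child in frame.get("children", []):
        walk_frame_tree(child, child_ancestors, depth + 1, max_depth, captured)
    return captured
```

Passing a fresh ancestor set down each branch, rather than mutating a shared one, lets the same URL appear in two sibling subtrees without being treated as a cycle.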

All per-frame serialize calls force ``enableJavaScript=True`` to bypass
the standard iframe inlining path inside PercyDOM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ``expose_closed_shadow_roots`` (mirrored from percy-playwright's
``exposeClosedShadowRoots``) so that PercyDOM.serialize can capture
closed-mode shadow DOM that ordinary DOM traversal cannot reach.

Flow:
1. ``DOM.enable`` — gates non-Chromium drivers silently (Firefox/WebKit
   will fail this call and we no-op without touching the page).
2. ``DOM.getDocument`` with ``depth=-1, pierce=True`` — walks the full
   DOM tree including every shadow root.
3. Recurse the tree, collecting (host, shadowRoot) backend-node pairs
   for each ``shadowRootType=='closed'`` entry. Subtrees inside an
   iframe's ``contentDocument`` are skipped — their JS execution
   contexts can't see the page's WeakMap.
4. Create ``window.__percyClosedShadowRoots`` (same key PercyDOM uses).
5. For each pair, ``DOM.resolveNode`` both ends, then
   ``Runtime.callFunctionOn`` to stash the shadow root in the WeakMap
   keyed by its host element.
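
The CDP steps above might look roughly like this through Selenium's ``execute_cdp_cmd``, which is available on Chromium-based drivers. This is a sketch under assumptions, not the PR's implementation: ``expose_closed_shadow_roots_sketch`` and the ``collect_pairs`` callable (step 3, returning ``(host_backend_id, root_backend_id)`` tuples) are hypothetical names, and error handling is reduced to the ``DOM.enable`` gate.

```python
INIT_WEAKMAP_JS = (
    "window.__percyClosedShadowRoots = "
    "window.__percyClosedShadowRoots || new WeakMap();"
)
# Called with `this` bound to the host element and the shadow root as argument.
STASH_FN = "function(root) { window.__percyClosedShadowRoots.set(this, root); }"


def expose_closed_shadow_roots_sketch(driver, collect_pairs):
    """Return how many closed shadow roots were stashed (0 without CDP)."""
    # Step 1: drivers without CDP fail here and we leave the page untouched.
    try:
        driver.execute_cdp_cmd("DOM.enable", {})
    except Exception:  # broad on purpose: any failure means "no CDP"
        return 0

    # Step 2: fetch the whole tree, piercing shadow roots.
    document = driver.execute_cdp_cmd(
        "DOM.getDocument", {"depth": -1, "pierce": True}
    )

    # Step 3: delegated to the caller-supplied walker.
    pairs = collect_pairs(document["root"])

    # Step 4: make sure the WeakMap exists in the page.
    driver.execute_script(INIT_WEAKMAP_JS)

    # Step 5: resolve both ends and store root under its host.
    for host_id, root_id in pairs:
        host = driver.execute_cdp_cmd(
            "DOM.resolveNode", {"backendNodeId": host_id}
        )
        root = driver.execute_cdp_cmd(
            "DOM.resolveNode", {"backendNodeId": root_id}
        )
        driver.execute_cdp_cmd("Runtime.callFunctionOn", {
            "functionDeclaration": STASH_FN,
            "objectId": host["object"]["objectId"],
            "arguments": [{"objectId": root["object"]["objectId"]}],
        })
    return len(pairs)
```

Because the driver is only touched through ``execute_cdp_cmd`` and ``execute_script``, the flow can be exercised in tests with a small fake driver that records the commands it receives.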

Wired into ``percy_snapshot`` immediately after PercyDOM injection and
re-primed after ``driver.refresh()`` inside ``capture_responsive_dom``
(the WeakMap is destroyed on navigation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
capture_responsive_dom was unconditionally writing output_file.json to the
user's CWD on every snapshot, polluting CI workspaces. Only dump when
PERCY_DEBUG is set, and swallow IO errors so a read-only CWD doesn't break
capture.
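
A sketch of the gated dump, assuming a standalone helper; ``maybe_dump_debug`` is a hypothetical name, and the real change lives inside ``capture_responsive_dom``.

```python
import json
import os


def maybe_dump_debug(payload, path="output_file.json"):
    """Write payload as JSON only when PERCY_DEBUG is set.

    Returns True if the file was written, False otherwise.
    """
    if not os.environ.get("PERCY_DEBUG"):
        return False
    try:
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
    except OSError:
        # A read-only or missing directory must not break the snapshot.
        return False
    return True
```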

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The exact 3.10.3 patch isn't on the actions/setup-python@v6 cache that the
GitHub Automatic Dependency Submission workflow uses, so submit-pypi fails
on resolution. Pinning to the 3.10 line resolves to the latest cached
3.10.x without affecting our test matrix (test.yml passes python-version
explicitly). Mirrors the fix applied to percy-playwright-python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A pure-srcdoc iframe has no src attribute. The previous ordering ran the
unsupported-src branch first, so pure-srcdoc iframes were silently logged
as "unsupported src" instead of being routed through the srcdoc-specific
branch (which is the path that downstream consumers / parity tests rely on).

Reorder so srcdoc is checked first; an iframe carrying both srcdoc and a
real src still takes the srcdoc path because srcdoc wins by spec.
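
The reordering can be illustrated with a small classifier. ``classify_iframe`` and its return labels are hypothetical, the scheme list is an assumption, and the sketch treats an empty ``srcdoc`` value as absent, which may differ from the real enumeration metadata.

```python
def classify_iframe(meta):
    """Decide which branch an iframe's metadata should take."""
    # srcdoc is checked first: it wins over src when both are present,
    # and a pure-srcdoc iframe has no src at all.
    if meta.get("srcdoc"):
        return "srcdoc"

    src = (meta.get("src") or "").strip().lower()
    if not src or src.startswith(("about:", "javascript:", "data:")):
        return "unsupported-src"
    return "capture"
```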

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cycle guard at the top of process_frame_tree only compared
iframe_meta['src'] (the static, pre-switch attribute) against ancestor_urls
which was seeded with page_url (the post-resolution URL). For a redirect
chain — src=A loads B via 30x; B carries src=A — the inner B never matched
"A" in ancestors so the cycle wasn't caught until the second hop wasted a
full serialize+enumerate round-trip.

Add the resolved post-switch document.URL to the comparison so the cycle
trips on whichever form happens to match the ancestor chain.
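
In sketch form, the widened comparison amounts to checking both URL forms against the ancestor chain. ``is_cycle`` is a hypothetical helper name; the real check sits inside ``process_frame_tree``.

```python
def is_cycle(static_src, resolved_url, ancestor_urls):
    """True if either the src attribute or the loaded URL is an ancestor."""
    candidates = {url for url in (static_src, resolved_url) if url}
    return not candidates.isdisjoint(ancestor_urls)
```

With ancestors holding the resolved URL B, an inner iframe whose ``src`` is A is not flagged before the switch, but is flagged once its ``document.URL`` is known to be B.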

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The broad-except at the bottom of process_frame_tree was logging every
failure at debug, including top-level (depth==1) failures. That meant a
user-visible iframe silently going missing produced zero visible output
unless PERCY_LOGLEVEL=debug — exactly the case where users would never
realize a capture was incomplete.

Surface depth==1 failures at info so missing top-level iframes are visible
in normal runs. Deeper nested failures stay at debug — chatty pages with
many nested iframes (ad/tracker grids) can produce a lot of those.
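
The depth-based log level could be expressed as below. Both function names and the message format are illustrative assumptions, not the PR's code.

```python
import logging


def failure_log_level(depth):
    """Top-level iframe failures are user-visible; nested ones stay quiet."""
    return logging.INFO if depth == 1 else logging.DEBUG


def log_frame_failure(logger, depth, src, error):
    logger.log(
        failure_log_level(depth),
        "Failed to capture iframe %s at depth %d: %s", src, depth, error,
    )
```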

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous walker was a recursive Python function. CPython's default
recursion limit is ~1000; very deep DOM trees would raise RecursionError
which the outer broad-except silently swallowed — so deep pages lost
closed-shadow exposure with no diagnostic.

Switch to an explicit stack-based walker. Memory now scales with tree
breadth rather than tree depth, and deeply linear chains (3000+ nodes)
complete cleanly. Add pylint disable for too-many-nested-blocks (the
shape of the iterative version trips the default cap of 5).
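
A minimal sketch of a stack-based walker over the node dicts returned by ``DOM.getDocument``. The field names (``backendNodeId``, ``children``, ``shadowRoots``, ``shadowRootType``, ``contentDocument``) are CDP's; ``collect_closed_shadow_pairs`` is a hypothetical name and the PR's walker may differ in detail.

```python
def collect_closed_shadow_pairs(root):
    """Return (host_backend_id, shadow_root_backend_id) for closed roots."""
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()

        for shadow in node.get("shadowRoots", []):
            if shadow.get("shadowRootType") == "closed":
                pairs.append((node["backendNodeId"], shadow["backendNodeId"]))
            # Shadow roots can themselves host nested closed roots.
            stack.append(shadow)

        stack.extend(node.get("children", []))

        # node["contentDocument"] is deliberately not pushed: iframe
        # documents run in a different JS context than the page's WeakMap.
    return pairs
```

An explicit stack sidesteps the interpreter recursion limit, so a linear chain thousands of nodes deep is walked without raising ``RecursionError``.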

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 1.30.9 pin printed an "out of date by 10+ releases" warning on CI and
appears to interact badly with the test harness on python 3.9/3.10 runs
(the test process never gets past the warning banner). 1.31.14 is the
current release line used across the other percy SDKs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Calling the underscore-prefixed helper directly from a regression test is
intentional — the function is the boundary the fix lives at. Disable the
pylint warning class-wide so the lint gate stays green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier run (26355895088) had both Python 3.9 and 3.10 jobs stuck
in_progress while Test (3.8) on the same run completed cleanly in 1m40s.
Empty commit to kick a fresh CI run and supersede the hung jobs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nd-shadow-dom

# Conflicts:
#	package.json
#	percy/snapshot.py