Test/trig type 1#1446
Closed
ruck314 wants to merge 15 commits into
Closed
Conversation
Run only the cocotb tests whose RTL dependencies actually changed on feature-branch pushes, while tags, main, and PRs into main still run the complete suite. The dependency graph is derived from the actual build (build/*.cf + `ghdl --gen-depends`) at CI time, so it cannot drift from the RTL, and every indeterminate case fails safe toward running more tests. - tests/common/dep_map.py (importable core) + `python -m tests.common` CLI: map each cocotb test (by its toplevel) to its transitive RTL sources via build/*.cf and `ghdl --gen-depends`, diff against the merge-base with origin/main, and print the affected pytest targets (or a FORCE_FULL sentinel). Any indeterminate case falls back to always-run or a full run. - Wrapper / IP-integrator resolution: most cocotb tests wrap their DUT in a **/wrappers/ or **/ip_integrator/ entity (cocotb cannot drive VHDL records), and `make import` never compiles those sources -- so their toplevels are absent from the surf library and `ghdl --gen-depends` cannot resolve them. Import those sources into the resolver's library first (with absolute paths, so the .cf entries stay REPO_ROOT-resolvable) so each such test resolves to its real transitive dependency set instead of falling back to always-run. Subsystems whose core RTL is family-gated out of the generic GHDL build (pgp2b, pgp2fc, coaxpress) remain always-run by fail-safe. - Changed-wrapper attribution: a changed **/wrappers/ file selects exactly the test(s) whose toplevel instantiates it (or FORCE_FULL if unattributable) and never fans out as another test's production dependency. - Shared base-package changes (e.g. StdRtlPkg, AxiLitePkg, AxiStreamPkg, or any high-fan-out hub) force a full run, detected by a data-driven fan-out threshold over the built dependency map. - Strict-subset guarantee: a runtime guard intersects the selection with the discovered test universe before emitting it, and a randomized property test asserts the selected set is always a subset of the full suite. - .github/workflows/surf_ci.yml: trigger on push and pull_request into main (preserving concurrency cancel-in-progress); selective on feature-branch pushes, full on tag/main/PR->main; fail open to the full suite on resolver error. The regression job installs GHDL's official LLVM build via ghdl/setup-ghdl (v6.0.0) for --gen-depends only, and runs the cocotb simulations on the apt mcode ghdl. The lint/syntax job is unchanged. - tests/common/test_dep_map.py: unit and property tests for the parser, the wrapper index, wrapper/ip_integrator source discovery and import, selection routing, base-package force-full, and the strict-subset invariant.
18c0927 to
40058ca
Compare
- discover_toplevels() now derives its scan root from the repo_root argument (repo_root / "tests") instead of the module-global TESTS_ROOT, including the not-found error message, so the parameter matches the function's behavior and the resolver can run against alternate checkouts/fixtures. Drop the now-unused TESTS_ROOT import. - surf_ci.yml: gate the official LLVM GHDL install (ghdl/setup-ghdl) to selective runs only. Full runs (tag/main/PR->main) skip the resolver entirely and build/simulate on the apt mcode ghdl, so the llvm install was pure overhead there. Determine run mode is moved ahead of the setup step to drive the gate, and the full-run path now uses a plain mcode `make import` + pytest. - surf_ci.yml: drop `--cov` from the selective-mode pytest invocations. The Codecov step is full-run-only, so coverage collected on feature-branch pushes was never uploaded -- wasted work and artifact churn.
The dependency resolver overrode its ghdl binary via a GHDL env var, but the repo-wide convention is GHDL_CMD (Makefile exports it, and regression_utils.py reads it). Setting only GHDL_CMD -- as `make import` does -- left the resolver on a different ghdl, so it could run --gen-depends on the apt mcode backend (which lacks it) and produce an incompatible .cf/gen-depends analysis. Read GHDL_CMD instead, parsed with shlex.split to match regression_utils.py (tolerates a command with args), and update the CI resolver invocation to export GHDL_CMD rather than GHDL.
tests/common/test_dep_map.py exercises the resolver, but neither the full-suite pytest invocation nor the FORCE_FULL fail-open path included tests/common, so resolver changes were never run in CI. Add tests/common to both pytest target lists. The suite is pure-Python (all GHDL paths are monkeypatched) so it runs on the mcode runner in a few seconds and needs no --gen-depends. Update the DEFAULT_SCAN_DIRS comment to note tests/common is CI-run but not a cocotb scan dir.
discover_test_local_sources() rglob'd the whole repo for wrappers/*.vhd and ip_integrator/*.vhd, so a selective run imported ~40 ethernet/**/wrappers sources despite ethernet being outside DEFAULT_SCAN_DIRS -- pure resolver overhead, since no scanned test resolves to those toplevels. Thread scan_dirs through discover_/import_test_local_sources() (defaulting to DEFAULT_SCAN_DIRS) and scan only repo_root/<scan_dir> subtrees; the CLI passes its resolved scan_dirs. Add a scoping regression test. Also correct two map_python_changes() docstring bullets to match the implementation: a test file added on the branch is discovered on the current checkout and selects itself (ignored only when outside the discovered universe, e.g. tests/ethernet), and any non-test_*.py helper at any depth under tests/<sub>/ -- not just tests/<sub>/*_utils.py -- selects the whole <sub> subtree, keyed on <sub> alone.
parse_wrapper_entity_units() rglob-ed the whole repo for wrappers/*.vhd, pulling in the ~40 ethernet wrapper sources even though DEFAULT_SCAN_DIRS excludes ethernet. Those sources union into cf_units but can never join to a discovered module owner in build_wrapper_index (discover_toplevels is itself scoped to scan_dirs), so the extra reads are pure IO overhead on selective runs. Thread scan_dirs (default DEFAULT_SCAN_DIRS) through and restrict the rglob to repo_root/<scan_dir>/**/wrappers/*.vhd, mirroring the already scoped discover_test_local_sources(). The CLI passes its resolved scan_dirs. Adds test_parse_wrapper_entity_units_scoped_to_scan_dirs.
…strings - tests/common/__main__.py: wrap discover_toplevels() in the same fail-open block as the merge-base and .cf paths. A bad --scan-dir raises FileNotFoundError by design; main() now prints the FORCE_FULL sentinel on stdout and returns 1 instead of letting the exception escape, preserving the single-stdout-line contract that indeterminate resolution fails open. - tests/common/dep_map.py: correct the parse_wrapper_entity_units() and build_wrapper_index() docstrings. They claimed .cf can "never" contain wrapper files, but the resolver itself imports wrapper/ip_integrator sources via import_test_local_sources() (ghdl -i, --work=surf) before parse_cf_units() reads that same surf-obj08.cf, so wrappers can appear in .cf. Reword to note only a make-import-only .cf is wrapper-free; the direct scan remains the reliable source and the join is source-agnostic. - tests/common/test_dep_map.py: add test_cli_bad_scan_dir_fails_open_to_force_full pinning the CLI FORCE_FULL fail-open behavior.
parse_wrapper_entity_units() scanned every line of each wrapper file even after finding its `entity <Name> is` declaration. A wrapper file declares exactly one entity (the DUT it wraps for cocotb), so break at the first match instead of reading the remaining lines. Behavior is unchanged -- each wrapper still maps to its single declared unit name -- and the scan does less IO/CPU per file on selective CI runs. All 45 test_dep_map.py tests still pass.
Both the push and the pull_request(->main) triggers fire the regression job, and their concurrency groups differ (github.ref is refs/heads/<branch> vs refs/pull/N/merge), so a branch with an open PR into main runs the job twice per push -- once selective, once full -- with neither cancelling the other. Document this at the trigger block: it is intentional (per-commit feedback plus the full pre-merge gate), the selective set is a strict subset of the full suite so the overlap is redundant-but-safe, and in this repo feature branches target pre-release, so the double run is confined to release PRs into main rather than routine development.
…rkdir Two fixes for indeterminate/edge cases in the change-aware CI path: - surf_ci.yml: on selective runs, always include tests/common in the pytest targets. A feature-branch push that touches only the resolver (tests/common/dep_map.py) resolves to zero affected cocotb tests, so the old `elif [[ -z resolver_output ]]` "nothing to run" branch executed no tests at all -- leaving resolver changes untested until a full run. Prepend tests/common to the target list (fast: pure-Python, GHDL monkeypatched) and append the selected cocotb tests when the resolver names any. - tests/common/__main__.py: fail open when --workdir does not exist. Every GHDL call runs with cwd=workdir (import_test_local_sources' `ghdl -i` and build_dependency_map's `ghdl --gen-depends`); a missing workdir made subprocess.run raise FileNotFoundError before the parse_cf_units guard could catch it, escaping unhandled and breaking the single-stdout-line FORCE_FULL contract. Guard the workdir up front and emit FORCE_FULL, matching the merge-base, scan-dir, and .cf fail-open paths. - tests/common/test_dep_map.py: add test_cli_bad_workdir_fails_open_to_force_full pinning the new CLI behavior. 46 tests pass.
…ral always_run owners The select_tests docstring claimed a wrapper edit for any always_run/unresolved owner never trips a full run. That holds only for gen-depends-unresolved owners: their literal toplevel names survive in the resolved dict that build_wrapper_index joins against, so their wrappers are attributed precisely. A wrapper whose sole owner is a non-literal-toplevel always_run test is absent from resolved, has no wrapper_index entry, and correctly falls to the unattributable-wrapper FORCE_FULL branch. Reword to distinguish the two cases so readers do not assume all always_run wrappers are handled precisely.
…tatus
- parse_wrapper_entity_units() used read_text().splitlines(), which loads
the whole file up front, so the early `break` at the first `entity ... is`
declaration skipped only regex work, not IO. Iterate the file handle
line-by-line so the break actually stops reading, keeping the early-exit
effective when scanning many wrappers.
- changed_files() documents its status values as {A, M, D} but forwarded
status[0] verbatim for non-rename/non-delete entries, so git codes like
'T' (typechange) could leak through and break the contract. Normalize any
code other than A/M/D-handled to 'M'. select_tests only special-cases 'D'
and treats everything else as a modification, so this tightens the return
type with no behavior change.
- test_dep_map.py: add test_changed_files_normalizes_unknown_status_to_modified.
47 tests pass.
…comment typo - __main__.py: the FORCE_FULL rationale only cited unresolvable RTL, but select_tests() also forces a full run for unattributable wrapper edits and high-fanout hub files (>= FAN_OUT_THRESHOLD). Broaden the message so CI logs explain the actual fail-open cause. - surf_ci.yml: guard the MCODE_PATH grep pipeline with '|| true'. Under the step's default 'bash -eo pipefail', 'grep -vxF' exits 1 when it filters out every PATH entry (e.g. PATH were only the llvm bin dir), which would fail the whole step. - surf_ci.yml: fix comment typo 'broken for on' -> 'broken on'.
a5a9de4 to
d4795fb
Compare
Bring tests/common/{dep_map,__main__,test_dep_map}.py in line with the
final reviewed resolver. surf_ci.yml is unchanged -- the push/PR trigger
routing and selective-vs-full run mode are identical to before.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Details
JIRA
Related