feat: detector precision, monorepo discovery, CI gate + honest docs restructure #22

Merged
decksoftware merged 6 commits into main from feat/precision-monorepo-gate-and-honest-docs
Jun 10, 2026
@decksoftware decksoftware commented Jun 10, 2026

Summary

Engine/CLI hardening and a documentation restructure so the project presents itself as what it is: an orchestration, corroboration, and evidence layer for local security review — complementing Semgrep and the other scanners, never replacing them.

Engine & CLI

  • XML_XXE precision fix: the old .*?(?!...) lookahead was always satisfiable, so ANY mention of an XML parser (including the entity-free browser DOMParser) produced a CRITICAL false positive — empirically proven, then rewritten to fire only on explicitly insecure entity configuration (noent: true, resolve_entities=True, LIBXML_NOENT, DtdProcessing.Parse, ...). Audited all 9 lookahead patterns; the other 8 are positionally correct. Precision corpus extended with safe/insecure XXE pairs.
  • Monorepo discovery: config/dependency/BaaS patterns are now recursive (**/), so apps/*/.env, services/*/Dockerfile, nested package.json/firestore.rules are scanned; one parallel glob walk per category (was ~87 sequential walks); /-normalized paths; frameworks detected from ALL workspace manifests.
  • Strict CLI: arguments are now parsed with node:util parseArgs, so a mistyped flag (--basline) aborts instead of silently running without the baseline. New --version and --fail-on <severity> flags; --fail-on exits 1 when findings at or above the given severity remain, giving CI a gate to pair with the SARIF output.
  • Tool control: --tool-timeout <s> (timeouts reported as timeouts, with the knob named, instead of generic "unavailable") and --semgrep-config <ref> for local/air-gapped rules (adds --metrics=off).
  • Consolidation: single sanitizeAgentName module (was 3 drifting copies), shared extension/language maps (fixes *.pyw never getting Python rules), single content split per file in the detector.
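
To make the XML_XXE bullet concrete, here is a minimal sketch of why a lazy `.*?` followed by a negative lookahead can always be satisfied, next to a rule that fires only on explicit insecure settings. Both patterns and all sample strings are hypothetical stand-ins written for this illustration; the project's actual rule text is not shown in this PR.

```javascript
// Hypothetical stand-in for the old rule shape (NOT the project's real pattern):
// a lazy `.*?` followed by a negative lookahead.
const oldStyle = /XMLParser\(.*?(?!resolve_entities=False)/;

// Hypothetical rewrite: match only the explicit insecure markers named in the
// PR description (noent: true, resolve_entities=True, LIBXML_NOENT,
// DtdProcessing.Parse).
const insecureConfig =
  /\bnoent\s*:\s*true\b|\bresolve_entities\s*=\s*True\b|\bLIBXML_NOENT\b|\bDtdProcessing\.Parse\b/;

// Made-up sample lines for the demonstration.
const samples = {
  safePython: 'parser = etree.XMLParser(resolve_entities=False)',
  insecurePython: 'parser = etree.XMLParser(resolve_entities=True)',
  browser: "new DOMParser().parseFromString(xml, 'text/xml')",
  insecureNode: 'libxmljs.parseXml(xml, { noent: true })',
};

// The lazy quantifier can always consume one more character, which moves the
// lookahead to a position where the forbidden text no longer starts. The
// "safe" line is therefore still flagged by the old shape.
const oldFlagsSafe = oldStyle.test(samples.safePython);

// The rewrite only reacts to the explicit insecure markers.
const results = Object.fromEntries(
  Object.entries(samples).map(([name, code]) => [name, insecureConfig.test(code)])
);

console.log({ oldFlagsSafe, results });
```

In this sketch the old shape flags the safe Python line, while the rewritten rule flags only the two insecure samples and leaves the browser DOMParser line alone.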

CI

  • New dogfood self-scan job: every PR runs CSReview against this repository with Semgrep installed and fails on remaining HIGH/CRITICAL findings — exercising the new gate end to end.
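
The gate this job relies on can be sketched as a severity threshold mapped to an exit code. This is an illustration only: the function name `gateExitCode`, the finding shape, and the `medium` rung of the ladder are assumptions, not the engine's actual code.

```javascript
// Assumed severity ladder. The PR only names low, high, HIGH and CRITICAL;
// "medium" is an assumption made for this sketch.
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];

// Hypothetical helper: returns 1 when any finding is at or above `failOn`.
function gateExitCode(findings, failOn) {
  const threshold = SEVERITY_ORDER.indexOf(String(failOn).toLowerCase());
  if (threshold === -1) {
    throw new Error(`Unknown severity for --fail-on: ${failOn}`);
  }
  // Findings with an unrecognized severity rank as -1 and never block here.
  const blocking = findings.some(
    (f) => SEVERITY_ORDER.indexOf(String(f.severity).toLowerCase()) >= threshold
  );
  return blocking ? 1 : 0;
}

// Made-up finding list with a single LOW finding.
const findings = [{ id: 'EXAMPLE_RULE', severity: 'LOW' }];

console.log(gateExitCode(findings, 'high')); // 0: nothing at HIGH or above
console.log(gateExitCode(findings, 'low'));  // 1: the LOW finding blocks
```

A CI step would pass the returned value to `process.exit`, so the same scan passes with a high threshold and fails with a low one.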

Documentation honesty

  • SKILL.md: 1948 → ~250 lines, engine-first ("run the CLI; never handcraft reports"), consolidated core rules, and an explicit no-fabricated-metrics rule: the old report templates instructed agents to fill in ASVS Coverage %, SLSA Level, and per-article compliance PASS/FAIL that the engine does not compute. DAST-CONFIRMED is documented as a reserved label the built-in probe never emits; dotnet build is no longer listed as a security scanner.
  • Checklists/compliance/tooling/subagent/DAST/report reference moved to csreview/reference/*.md (shipped in the package, loaded on demand — agents stop paying ~2k lines of context per invocation).
  • README rebuilt as an honest landing (pipeline, explicit "Honest limits" section, updated CLI) keeping the verbatim SKILL.md mirror; the doc-honesty contract tests in analysis.test.js now assert across README + SKILL.md + reference/*.

Test plan

  • 201/201 tests (18 new: XXE precision, monorepo scanner, CLI args, tool args/timeouts, agent-name, languages) — all 15+ doc-honesty contracts green
  • npm run lint + npm run typecheck clean
  • Dogfood self-scan: score 95/100, --fail-on high → exit 0; --fail-on low → exit 1 (gate verified in both directions)
  • Self-scan job runs in this PR's own CI
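
The strict-CLI behaviour covered by the new CLI args tests can be illustrated with `parseArgs` from `node:util` in strict mode. The option table and the `parseCli` wrapper below are assumptions for this sketch (in particular, that `--baseline` takes a string value); only `parseArgs` itself and its unknown-option error code come from Node.

```javascript
const { parseArgs } = require('node:util');

// Assumed option table for the sketch; the real CLI may define more options
// or different types.
const options = {
  baseline: { type: 'string' },
  'fail-on': { type: 'string' },
  version: { type: 'boolean' },
};

// Hypothetical wrapper: reports a parse failure instead of throwing, so the
// caller can print the message and abort.
function parseCli(argv) {
  try {
    const { values } = parseArgs({
      args: argv,
      options,
      strict: true, // unknown flags raise an error instead of being ignored
      allowPositionals: true,
    });
    return { ok: true, values };
  } catch (err) {
    return { ok: false, code: err.code, message: err.message };
  }
}

const good = parseCli(['--baseline', 'baseline.json', '--fail-on', 'high']);
const typo = parseCli(['--basline', 'baseline.json']);

console.log(good.ok, good.values.baseline); // true baseline.json
console.log(typo.ok, typo.code);            // false ERR_PARSE_ARGS_UNKNOWN_OPTION
```

With a hand-rolled parser the misspelled `--basline` would typically be skipped and the audit would run without a baseline; in strict mode the run stops before any scanning starts.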

🤖 Generated with Claude Code

dev-ecd-dm and others added 6 commits June 10, 2026 15:04
… CI gate

- XML_XXE now fires only on explicitly insecure entity configuration
  (the old lazy-dot lookahead matched every XML parser mention,
  including the entity-free browser DOMParser); precision corpus
  extended with safe/insecure XXE pairs
- scanner discovers config/dep/BaaS files recursively (monorepo-aware),
  runs one parallel glob walk per category, /-normalizes paths, and
  detects frameworks from ALL workspace manifests
- CLI: strict parseArgs (mistyped flags abort instead of silently
  changing the audit), --version, and a --fail-on <severity> CI gate
- tools: --tool-timeout and --semgrep-config (local rules, metrics
  off); timeouts are reported as timeouts, not generic unavailability
- consolidation: single sanitizeAgentName module (was 3 copies), shared
  extension/language maps (fixes *.pyw missing Python rules), single
  content split per file in the detector

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
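
As an illustration of the monorepo-aware discovery described in this commit, the sketch below walks a tree once, collects files by name at any depth, and returns `/`-normalized relative paths. It is a simplified synchronous walk, not the parallel glob walk the commit mentions; `CONFIG_NAMES`, `toPosix`, `discover`, and the skipped directory names are all hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical "config" category for this sketch.
const CONFIG_NAMES = new Set(['.env', 'Dockerfile', 'package.json', 'firestore.rules']);

// Normalize platform separators to "/" so reports look the same on every OS.
function toPosix(p) {
  return p.split(path.sep).join('/');
}

// One recursive walk that matches file names at any depth (the effect of a
// "**/" prefix). Skipping node_modules and .git is an assumption.
function discover(root, names, dir = root, out = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === 'node_modules' || entry.name === '.git') continue;
      discover(root, names, full, out);
    } else if (names.has(entry.name)) {
      out.push(toPosix(path.relative(root, full)));
    }
  }
  return out;
}

// Build a throwaway monorepo layout to demonstrate the walk.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'monorepo-'));
fs.mkdirSync(path.join(root, 'apps', 'web'), { recursive: true });
fs.mkdirSync(path.join(root, 'services', 'api'), { recursive: true });
fs.writeFileSync(path.join(root, 'apps', 'web', '.env'), 'KEY=value\n');
fs.writeFileSync(path.join(root, 'services', 'api', 'Dockerfile'), 'FROM scratch\n');
fs.writeFileSync(path.join(root, 'package.json'), '{}\n');

const found = discover(root, CONFIG_NAMES).sort();
console.log(found); // [ 'apps/web/.env', 'package.json', 'services/api/Dockerfile' ]

fs.rmSync(root, { recursive: true, force: true });
```

A root-only scan would have reported just `package.json` here; the nested `.env` and `Dockerfile` are what the recursive patterns add.
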
*_security.sarif, *_local-dast-report*.html, *_local-dast-findings*.md,
and *_db-dump-guide.html now match the existing stray-copy patterns.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
SKILL.md drops from 1948 to ~250 lines: engine-first execution workflow
(run the CLI; never handcraft reports), consolidated core rules, and an
explicit no-fabricated-metrics rule (the engine computes no ASVS
percentages, SLSA levels, or per-article compliance verdicts).
DAST-CONFIRMED is documented as a reserved label the built-in probe
never emits, and dotnet build is no longer listed as a scanner.
Checklists, compliance tables, tooling commands, the subagent protocol,
and report anatomy move to csreview/reference/*.md (shipped in the npm
package, loaded on demand).

README is rebuilt as an honest landing page (orchestration +
corroboration + evidence layer, with an explicit limits section) and
keeps the verbatim SKILL.md mirror. The doc-honesty contracts in
analysis.test.js now assert across README + SKILL.md + reference/*.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The exact-version pin broke CI on every release bump (this PR's bump to
0.1.4 included) without protecting anything; the semver-shape assertion
keeps the metadata contract.
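
A shape check of this kind might look like the sketch below. The regular expression is a simplified MAJOR.MINOR.PATCH pattern with optional pre-release and build parts, not the full SemVer grammar, and `hasSemverShape` is a hypothetical name rather than the test's actual helper.

```javascript
// Simplified semver shape: three numeric parts, optional "-prerelease" and
// "+build" suffixes. An assumption for illustration, not the full SemVer spec
// (it does not reject leading zeros, for example).
const SEMVER_SHAPE = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

// Hypothetical helper: true when `version` looks like a semver string.
function hasSemverShape(version) {
  return typeof version === 'string' && SEMVER_SHAPE.test(version);
}

console.log(hasSemverShape('0.1.4'));  // true: the version this PR bumps to
console.log(hasSemverShape('0.1.5'));  // true: a later bump still passes
console.log(hasSemverShape('latest')); // false: not a version number
```

Unlike an exact-version pin, this assertion survives every release bump while still rejecting malformed metadata.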

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Runs CSReview against its own repository on every PR (Semgrep installed,
update check skipped) and fails the build when HIGH/CRITICAL findings
remain — exercising the new CI gate end to end.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@decksoftware decksoftware merged commit 6d1dfbd into main Jun 10, 2026
13 checks passed
@decksoftware decksoftware deleted the feat/precision-monorepo-gate-and-honest-docs branch June 10, 2026 18:31