feat(codegraph): helper extraction v2 — parameterize proposals + external OSS case studies#95

Merged
QodeXcli merged 1 commit into main from feat/helper-extract-v2-case-studies on Jul 2, 2026

QodeXcli (Owner) commented Jul 2, 2026

Two of the roadmap items in one coherent change: Helper Extraction v2 and external OSS case studies (the v2 output is the highlight of the case studies).

v2 — from "these look similar" to "here is the exact consolidation"

proposeParameterizedHelper (PURE) aligns a cluster's bodies token-by-token (literals kept, each function's own name neutralized) and:

  • turns each varying position into a parameter — positions that always vary together collapse into one
  • names params from context: `kind: "min"` → param `kind` (fallback p1/p2)
  • emits the reference body with substitutions + the exact call each original becomes
  • declines honestly (with the reason) when bodies don't align or >4 parts vary; mixed structural variants keep the largest aligned subset and report the rest as dropped
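The steps in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `proposeSketch`, the `Member` and `Proposal` shapes, and the `helper(...)` call text are all hypothetical, bodies are assumed to be pre-tokenized, "align" is simplified to "same token count", and "vary together" is taken to mean "the same column of values across all members".

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the PR's API.
type Member = { name: string; tokens: string[] };
type Proposal =
  | { ok: true; params: string[]; body: string; calls: { [fn: string]: string } }
  | { ok: false; reason: string };

const MAX_PARAMS = 4; // the PR declines when more than 4 parts vary
const IDENT = /^[A-Za-z_$][\w$]*$/;

function proposeSketch(members: Member[]): Proposal {
  if (members.length < 2) return { ok: false, reason: "need at least two members" };
  // Neutralize each function's own name so it never counts as a difference.
  const rows = members.map(m => m.tokens.map(t => (t === m.name ? "<self>" : t)));
  const ref = rows[0];
  if (rows.some(r => r.length !== ref.length)) {
    return { ok: false, reason: "bodies do not align (token counts differ)" };
  }
  // Varying positions with the same column of values co-vary: one parameter.
  const groups: { [column: string]: number[] } = {};
  ref.forEach((_, i) => {
    const column = rows.map(r => r[i]);
    if (column.every(t => t === column[0])) return; // constant position
    const key = JSON.stringify(column);
    (groups[key] = groups[key] || []).push(i);
  });
  const positions = Object.keys(groups).map(k => groups[k]);
  if (positions.length === 0) return { ok: false, reason: "nothing varies" };
  if (positions.length > MAX_PARAMS) {
    return { ok: false, reason: positions.length + " parts vary (limit " + MAX_PARAMS + ")" };
  }
  // Name a parameter from a preceding `key :` pair, else fall back to p1, p2, ...
  const params: string[] = [];
  positions.forEach((group, n) => {
    const i = group[0];
    const hint = ref[i - 1] === ":" && IDENT.test(ref[i - 2] || "") ? ref[i - 2] : "";
    params.push(hint && params.indexOf(hint) < 0 ? hint : "p" + (n + 1));
  });
  // Reference body with each varying position replaced by its parameter.
  const bodyTokens = ref.slice();
  positions.forEach((group, n) => group.forEach(i => { bodyTokens[i] = params[n]; }));
  // The exact call each original becomes.
  const calls: { [fn: string]: string } = {};
  members.forEach((m, r) => {
    calls[m.name] = "helper(" + positions.map(g => rows[r][g[0]]).join(", ") + ")";
  });
  return { ok: true, params, body: bodyTokens.join(" "), calls };
}
```

Returning the decline as a value with a `reason`, rather than throwing, keeps the function pure and lets the caller print why a cluster got no proposal.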

Also fixes a live false positive: an inner function no longer clusters with its own parent (caught on axios: buildPath nested inside formDataToJSON).
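One way to picture the nested-member fix is a containment check on source spans. This is a hedged sketch only: the `FnSpan` shape, the `contains` and `dropNestedPairs` helpers, and the line numbers in the usage note are assumptions for illustration, and the PR does not say how the actual fix is implemented.

```typescript
// Hypothetical sketch: drop candidate pairs where one function encloses the other.
type FnSpan = { name: string; file: string; start: number; end: number };

// True when `inner` lies strictly inside `outer` in the same file.
function contains(outer: FnSpan, inner: FnSpan): boolean {
  const sameSpan = outer.start === inner.start && outer.end === inner.end;
  return (
    outer.file === inner.file &&
    outer.start <= inner.start &&
    inner.end <= outer.end &&
    !sameSpan
  );
}

// Keep only pairs in which neither function is nested in the other.
function dropNestedPairs(pairs: [FnSpan, FnSpan][]): [FnSpan, FnSpan][] {
  return pairs.filter(([a, b]) => !contains(a, b) && !contains(b, a));
}
```

With made-up spans such as `formDataToJSON` at lines 1 to 100 and `buildPath` at lines 10 to 30 of the same file, the pair is filtered out, while two functions in different files are kept.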

Case studies — real repos, real output, commits pinned

| Repo | Result |
|---|---|
| zod 912f0f5 | 1,103 fns → 38 clusters / ~671 lines; the 7-copy positive/negative family comes back as a concrete proposal: helper(kind, inclusive) with per-member call mapping |
| hono b20d422 | importPublicKey/importPrivateKey ~95% (~26 lines) + duplicated getQueryString (~92%) |
| axios e435384 | setFormDataHeaders byte-near-identical across resolveConfig.js + adapters/http.js, verified by eye (only a `\|\| {}` differs) |

Documented in docs/ADOPTION.md with reproduction commands and honest caveats (detection-only; duplication in mature libs can be deliberate — mechanical evidence, human judgment).

+6 tests (param naming, co-varying collapse, honest declines, best-subset with dropped, nested-member regression). Full suite 1514 green, tsc clean.

…rnal OSS case studies

v2 turns near-dupe DETECTION into a concrete CONSOLIDATION proposal, and proves the whole
pipeline on three real open-source repos.

- proposeParameterizedHelper (PURE): aligns cluster bodies token-by-token (literals kept, own
  name neutralized), turns each varying position into a parameter (co-varying positions collapse
  into one), names params from context (`kind: "min"` → param `kind`), and emits the reference
  body with substitutions + the exact call each original becomes. Conservative: declines with the
  reason when bodies don't align or >4 parts vary; mixed structural variants keep the LARGEST
  aligned subset and report the rest as dropped.
- find_similar_helpers appends proposals for the top clusters (still detection-only — no edits).
- Nested-member fix: an inner function no longer clusters with its own parent (caught live on
  axios: buildPath inside formDataToJSON).
- docs/ADOPTION.md external case studies (real tool output, commits pinned, 2026-07-02):
  zod 912f0f5 → 38 clusters / ~671 lines (7-copy positive/negative family + the real
  parameterize proposal); hono b20d422 → importPublicKey/importPrivateKey ~95% + dup
  getQueryString; axios e435384 → setFormDataHeaders byte-near-identical across two files
  (manually verified). With the honest caveats (detection-only; duplication can be deliberate).
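The "largest aligned subset" behaviour for mixed structural variants could look something like the sketch below. It is an assumption-laden illustration: `largestAlignedSubset` and `Member` are hypothetical names, and token count stands in for whatever structural alignment test the real code uses.

```typescript
// Hypothetical sketch: keep the biggest group of same-shaped members, report the rest.
type Member = { name: string; tokens: string[] };

function largestAlignedSubset(members: Member[]): { kept: Member[]; dropped: string[] } {
  // Group members by shape; token count is a simplified stand-in for alignment.
  const groups: Member[][] = [];
  members.forEach(m => {
    const match = groups.filter(g => g[0].tokens.length === m.tokens.length)[0];
    if (match) match.push(m);
    else groups.push([m]);
  });
  // Pick the largest group; on a tie the earliest-seen group wins.
  let kept: Member[] = [];
  groups.forEach(g => { if (g.length > kept.length) kept = g; });
  // Everything outside the kept group is reported by name as dropped.
  const dropped = members.filter(m => kept.indexOf(m) < 0).map(m => m.name);
  return { kept, dropped };
}
```

Reporting the dropped names explicitly, instead of silently ignoring them, matches the PR's stated goal of declining with a reason.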
QodeXcli merged commit 8e22533 into main on Jul 2, 2026
2 checks passed
QodeXcli deleted the feat/helper-extract-v2-case-studies branch on July 2, 2026, 00:50