feat(recall): receipt-aware ranking + timeline — ground-truth-weighted recall #102

Merged
QodeXcli merged 1 commit into main from feat/advanced-recall
Jul 2, 2026

@QodeXcli QodeXcli commented Jul 2, 2026

Owner

Advanced recall, exactly per the plan: search → receipt-aware ranking → visualization (diff + timeline) → dashboard/CLI.

Receipt-aware ranking (the piece recall never had)

  • Episode.verified — stamped true at record time: episodes are only written on the objective-success path (sandbox compiled + merged, verify/completion gates passed). Old episodes stay unknown-neutral.
  • Maintain receipts join the corpus as a new source kind receipt — verified autonomous work is history worth recalling. opened → verified ✓, blocked/failed → ⛔.
  • verifiedBoost in rankApproaches: ✓ +0.08, ⛔ −0.04, unknown 0 — applied to the effective rank only; the displayed score stays honest raw relevance. Composes with recency tilt + MMR diversity + stemming.
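As a rough illustration of the boost described above, here is a minimal sketch. It assumes a simplified approach shape (`ApproachSketch`) and a hypothetical helper `rankByReceipt`; the real `rankApproaches` signature is not shown in this PR, and the recency tilt, MMR diversity, and stemming it composes with are left out. Only the three boost values come from the description.

```typescript
// Hypothetical shape for illustration; the real types are not shown in this PR.
interface ApproachSketch {
  label: string;
  score: number;      // raw relevance in [0, 1]; displayed to the user unchanged
  verified?: boolean; // true = ✓, false = ⛔, undefined = unknown
}

// Boost values as stated in the PR description.
function verifiedBoost(verified?: boolean): number {
  if (verified === true) return 0.08;
  if (verified === false) return -0.04;
  return 0; // unknown stays neutral
}

// Sort by effective rank (raw score + boost) on a copy.
// The `score` field itself is never modified, so the displayed value stays raw.
function rankByReceipt(approaches: ApproachSketch[]): ApproachSketch[] {
  const effective = (a: ApproachSketch) => a.score + verifiedBoost(a.verified);
  return [...approaches].sort((a, b) => effective(b) - effective(a));
}

const ranked = rankByReceipt([
  { label: "session auth (unverified)", score: 0.45 },
  { label: "jwt auth middleware", score: 0.41, verified: true },
  { label: "oauth integration", score: 0.4, verified: false },
]);

for (const a of ranked) {
  console.log(`${Math.round(a.score * 100)}%  ${a.label}`);
}
```

In this sketch the verified 41% entry sorts ahead of the unverified 45% one because its effective rank is 0.49, while the printed percentage remains 41%.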

Timeline + outcome marks

Every label now carries its ground-truth outcome (🎯 task ✓, 🧾 receipt ⛔), and the output ends with the evolution story:

Timeline (oldest → newest):
  ○ 2026-03-01     add basic auth login with sessions
  ○ 2026-05-12 ✓   add jwt auth login middleware to the api
  ○ 2026-05-20 ⛔   tried oauth login integration, rolled back
  ● 2026-06-10 ✓   maintain unused-imports: cleaned auth module

The timeline lists the last 5 dated approaches (per spec) and is omitted when fewer than 2 carry a date. It is appended inside renderApproachDiffs, so the dashboard Recall panel and the CLI tool inherit it with zero extra wiring.
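The timeline rules above (skip undated entries, keep the last 5, order oldest to newest, omit below 2) can be sketched as follows. This is an assumption-laden illustration, not the actual `renderTimeline`: the name `renderTimelineSketch`, the `DatedApproach` shape, and the reading that ● marks the newest entry are inferred from the sample output.

```typescript
// Hypothetical shape for illustration only.
interface DatedApproach {
  label: string;
  date?: string;      // ISO "YYYY-MM-DD"; entries without a date are skipped
  verified?: boolean; // true = ✓, false = ⛔, undefined = no mark
}

function outcomeMark(verified?: boolean): string {
  if (verified === true) return "✓";
  if (verified === false) return "⛔";
  return " ";
}

function renderTimelineSketch(approaches: DatedApproach[], cap = 5): string {
  const dated = approaches
    .filter((a): a is DatedApproach & { date: string } => typeof a.date === "string")
    .sort((a, b) => a.date.localeCompare(b.date)) // ISO dates sort lexicographically
    .slice(-cap);                                 // keep only the newest `cap` entries
  if (dated.length < 2) return "";                // too few dated entries: omit the timeline
  const lines = dated.map((a, i) => {
    const glyph = i === dated.length - 1 ? "●" : "○"; // assumption: ● marks the newest
    return `  ${glyph} ${a.date} ${outcomeMark(a.verified)}   ${a.label}`;
  });
  return ["Timeline (oldest → newest):", ...lines].join("\n");
}

const timeline = renderTimelineSketch([
  { label: "maintain unused-imports: cleaned auth module", date: "2026-06-10", verified: true },
  { label: "add basic auth login with sessions", date: "2026-03-01" },
  { label: "undated note about auth" },
  { label: "tried oauth login integration, rolled back", date: "2026-05-20", verified: false },
  { label: "add jwt auth login middleware to the api", date: "2026-05-12", verified: true },
]);
console.log(timeline);
```

Returning an empty string for the omitted case lets a caller append the result unconditionally, which is one plausible way a shared renderer could carry the timeline to both the dashboard and the CLI.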

The live demo output shows the point: a verified jwt approach at 41% raw relevance correctly outranks an unverified 45% one (on the boost alone, 0.41 + 0.08 = 0.49 effective versus 0.45), and a failed oauth attempt surfaces with ⛔ as a warning, which recall could never say before.

+5 tests (✓>unknown>⛔ ordering with honest displayed score; timeline order/cap/marks/undated-skip/omission; receipt tag). Full suite 1535 green, tsc clean.

…d recall

recall_approach now knows which past approaches PROVABLY worked, ranks with that, and shows the
evolution on a timeline. Same pipeline everywhere (CLI tool + dashboard panel via the shared
renderer).

- Episode.verified: stamped true at record time (episodes are written on the objective-success
  path — sandbox compiled + merged, verify/completion gates passed). Old episodes → unknown.
- New recall source kind 'receipt': maintain runs with receipts join the corpus — verified
  autonomous work IS history worth recalling. opened → verified:true, blocked/failed → false.
- rankApproaches verifiedBoost: ✓ proven +0.08, ⛔ blocked/failed −0.04, unknown neutral —
  applied to the effective rank only; the displayed score stays honest raw relevance. Composes
  with the existing recency tilt + MMR diversity + stemming.
- Visualization: outcome marks on every label (🎯 task ✓ / 🧾 receipt ⛔) + renderTimeline —
  the last 5 dated approaches, oldest → newest with ○/● glyphs and per-entry outcome marks;
  omitted when <2 matches carry a date. Appended by renderApproachDiffs, so the dashboard
  Recall panel and the CLI tool get it with zero extra wiring.
- Both gathers (recall_approach tool + dashboard recall.query) feed episodes' verified flag and
  the maintain-receipt source.
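The receipt mapping from the commit message above (opened → verified:true, blocked/failed → false) could look roughly like this. The `MaintainReceipt` and `RecallSource` shapes, the field names, and `receiptToSource` are hypothetical; only the `receipt` kind and the three statuses appear in this PR, and the real receipt may carry other statuses.

```typescript
// Hypothetical types for illustration; the real receipt format is not shown in this PR.
type ReceiptStatus = "opened" | "blocked" | "failed";

interface MaintainReceipt {
  task: string;          // e.g. "unused-imports"
  status: ReceiptStatus;
  summary: string;
  date?: string;         // ISO "YYYY-MM-DD"
}

interface RecallSource {
  kind: "task" | "receipt";
  text: string;
  verified?: boolean;    // undefined = unknown (e.g. old episodes)
  date?: string;
}

// Map a maintain receipt onto the recall corpus:
// opened → verified: true, blocked/failed → verified: false.
function receiptToSource(r: MaintainReceipt): RecallSource {
  return {
    kind: "receipt",
    text: `maintain ${r.task}: ${r.summary}`,
    verified: r.status === "opened",
    date: r.date,
  };
}

const source = receiptToSource({
  task: "unused-imports",
  status: "opened",
  summary: "cleaned auth module",
  date: "2026-06-10",
});
console.log(source);
```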
@QodeXcli QodeXcli merged commit e1d62b0 into main Jul 2, 2026
2 checks passed
@QodeXcli QodeXcli deleted the feat/advanced-recall branch July 2, 2026 02:17