Add skill-evolver: closed-loop self-improvement system for skills#439
Draft
wzhipan wants to merge 10 commits into
Introduces a closed feedback loop (capture → analyze → propose → review → apply → validate → measure) for evolving skills and AI tools.
- .github/hooks/journal-utils.js: single-writer JSONL friction journal store CLI + module (record, set-active, stats, list) under ~/.skill-evolution/
- .github/hooks/friction-capture.js: PostToolUse/Stop hook that auto-logs tool failures, attributes them to the active skill, clears attribution on Stop
- .github/hooks/orchestrator.json: register PostToolUse + Stop capture hooks
- .github/skills/skill-evolver/: SKILL.md + friction-schema, classification-rubric, edit-safety-rules references
- .github/skill-evolution/: evolution-log changelog + .gitignore for local journal
- .github/copilot-instructions.md: register skill-evolver in the skills table
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
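A minimal sketch of how an append-only JSONL journal with `record`, `list`, and `stats` operations could look. This is not the PR's journal-utils.js: the file name `journal.jsonl`, the event fields (`ts`, `skill`, `message`), and the `(unattributed)` bucket are assumptions made for illustration, and the directory is passed in rather than hard-coded to ~/.skill-evolution/.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical file name; the real store layout is not shown in this PR page.
function journalPath(dir) {
  return path.join(dir, "journal.jsonl");
}

// Append one event as a single JSON line. Field names are illustrative only.
function recordEvent(dir, event) {
  fs.mkdirSync(dir, { recursive: true });
  const entry = { ts: new Date().toISOString(), ...event };
  fs.appendFileSync(journalPath(dir), JSON.stringify(entry) + "\n", "utf8");
  return entry;
}

// Read every event back, skipping blank lines.
function listEvents(dir) {
  const file = journalPath(dir);
  if (!fs.existsSync(file)) return [];
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Count events per attributed skill; events without a skill share one bucket.
function stats(dir) {
  const counts = {};
  for (const e of listEvents(dir)) {
    const key = e.skill || "(unattributed)";
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```

Appending whole lines keeps each write small and lets read paths such as `stats` work on whatever is already in the file, which is one plausible reason for choosing JSONL over a single JSON document.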
Expand the description frontmatter (the skill's activation mechanism) with more natural-language trigger phrases and a clearer proactive cue, so the skill self-activates without the user naming it explicitly. Also broadens scope wording to skills, prompts, and AI tools, and syncs the skills-table row in copilot-instructions.md. Validated at 1019/1024 chars.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address the concern that always-on capture could feel intrusive:
- SKILL_EVOLUTION_DISABLE env var silences all capture (hook + journal
recordEvent become no-ops); read paths (stats/list) still work so past
data stays reviewable.
- friction-capture.js exits early when disabled, still returning
{continue:true} so the tool flow is never blocked.
- journal-utils.js recordEvent no-ops when disabled; CLI `record` reports
it cleanly instead of printing null.
- SKILL.md: add a "Non-intrusiveness & controls" section documenting the
silent/non-blocking capture, the no-mid-task-edits guarantee, the off
switch, and an explicit rule that proactive logging must be one-line and
must never interrupt or question the user mid-task.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
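The off switch described above can be sketched as follows. This is a hedged illustration, not the code in friction-capture.js: the rule for which values of `SKILL_EVOLUTION_DISABLE` count as "disabled", the `handleHookEvent` name, the injected `record` callback, and the choice to swallow journal errors are all assumptions.

```javascript
// Assumption: any non-empty value other than "0" or "false" disables capture.
function captureDisabled(env = process.env) {
  const raw = (env.SKILL_EVOLUTION_DISABLE || "").trim().toLowerCase();
  return raw !== "" && raw !== "0" && raw !== "false";
}

// Hypothetical hook body: `record` stands in for the journal writer.
function handleHookEvent(payload, record, env = process.env) {
  if (!captureDisabled(env)) {
    try {
      record(payload);
    } catch (err) {
      // Assumed policy: capture is best-effort, so a journal failure is ignored.
    }
  }
  // Returned on every path so the tool flow is never blocked.
  return { continue: true };
}
```

Returning `{ continue: true }` on every path, including the disabled and error paths, is what keeps capture from ever interrupting the tool flow.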
Retrospective over the friction journal (3 active-captured events from the build session). Each fix individually approved by the developer.
1. skill-creator: document the PyYAML prerequisite for the validation/packaging scripts (fixes ModuleNotFoundError: No module named 'yaml').
2. skill-evolver: clarify that automatic hook capture is best-effort and active capture is the PRIMARY path (this runtime didn't fire PostToolUse).
3. skill-evolver: state the 1024-char description limit explicitly and add a length-check command in edit-safety-rules (cost 2 retries this session).
Logged all three in evolution-log.md. Both edited skills pass quick_validate.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… hook
The GitHub Copilot CLI runtime has no hooks system, so the Claude Code-style PostToolUse/Stop registrations in orchestrator.json never fired. Per the developer's choice (Option A: Copilot CLI only), stop pretending capture is automatic and make active (agent-driven) capture the primary mechanism.
- orchestrator.json: remove the inert friction-capture.js registrations (PostToolUse, Stop, and the duplicate SubagentStop entry); keep the orchestrator's own subagent hooks.
- friction-capture.js: mark DORMANT with a header banner explaining it is Claude Code-only and how to enable it via .claude/settings.json.
- skill-evolver/SKILL.md: reframe Architecture + Capture so active capture is the primary/only reliable path on this runtime; fix non-intrusiveness and off-switch wording that implied a background hook runs here.
- evolution-log.md: record the change with rollback ref.
Validated: quick_validate passes; orchestrator.json no longer references friction-capture; CLI record/stats still work (active capture intact).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
edit-safety-rules.md Workflow step 1 was ambiguous about which tool to use for branch creation. I used gitkraken-git_checkout, which doesn't support -b, requiring an unnecessary two-step workaround. Verified: git checkout -b works correctly via the powershell tool with native git 2.52.0. Clarified in one line: 'via the powershell tool (not gitkraken-git_checkout, which does not support -b)'.
Retro #2 summary: 7 journal events (4 carried/confirmed-fixed, 3 new). 1 skill defect fixed; 2 environmental (no action). All 4 prior fixes hold.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
User feedback: retrospective proposals showed diffs but didn't make clear which skill each fix targeted. Since skill-evolver evolves many skills, that ambiguity makes per-skill review decisions hard.
SKILL.md section 3 now mandates:
- a per-proposal header: 'Target: <skill> -> <file> . <root-cause> . <severity>'
- a summary table (# . Target skill . File . Root cause . Severity) when proposing multiple fixes
- naming the target skill in per-fix approval questions
quick_validate passes; logged in evolution-log.md.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
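The header and summary-table formats mandated above could be rendered by something like the sketch below. The function names and the proposal object shape (`skill`, `file`, `rootCause`, `severity`) are hypothetical, and the sample root-cause and severity values are invented for illustration; only the header template and the column order come from the commit message.

```javascript
// Render the per-proposal header: 'Target: <skill> -> <file> . <root-cause> . <severity>'.
function proposalHeader(p) {
  return `Target: ${p.skill} -> ${p.file} . ${p.rootCause} . ${p.severity}`;
}

// Render the multi-fix summary table, one row per proposal, numbered from 1.
function summaryTable(proposals) {
  const header = "# . Target skill . File . Root cause . Severity";
  const rows = proposals.map(
    (p, i) => `${i + 1} . ${p.skill} . ${p.file} . ${p.rootCause} . ${p.severity}`
  );
  return [header, ...rows].join("\n");
}

// Illustrative input only; these root-cause and severity labels are made up.
const exampleProposals = [
  { skill: "skill-creator", file: "SKILL.md", rootCause: "missing-prerequisite", severity: "low" },
  { skill: "skill-evolver", file: "edit-safety-rules.md", rootCause: "ambiguous-step", severity: "medium" },
];
```

Calling `proposalHeader(exampleProposals[0])` yields `Target: skill-creator -> SKILL.md . missing-prerequisite . low`, which puts the target skill first so a reviewer can decide per skill.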
…ate + references)
Counters the loop's addition bias so skills don't grow into caveat-soup:
- #2 tripwire: new 'journal-utils.js skill-sizes' command scans every SKILL.md and flags body >400/500 lines and description >900/1024 chars.
- #1 prune: SKILL.md section 4 is now 'Measure & prune' - run skill-sizes each retro; every ~5th retro (or when flagged) propose removals, not just additions.
- #3 + #4: new edit-safety rule 6 (consolidate over append; references over body; don't add to an over-budget skill without pruning).
- New references/bloat-control.md holds budgets + prune procedure, kept out of the always-loaded body (practicing #4).
Validated: quick_validate passes; skill-sizes runs and already flags skill-evolver's own description (1019/1024, DESC_WARN). Body grew only 104->111 lines because detail went into the reference.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
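The budget check behind the tripwire can be sketched as a pure function. This is an assumption-laden illustration rather than the actual skill-sizes command: it reads ">400/500" and ">900/1024" as warn/maximum pairs, and of the flag names only `DESC_WARN` appears in the commit message; `BODY_WARN`, `BODY_OVER`, and `DESC_OVER` are hypothetical.

```javascript
// Budgets from the commit message, interpreted as warn / hard-limit pairs (assumption).
const BUDGETS = {
  body: { warn: 400, max: 500 }, // lines in the SKILL.md body
  description: { warn: 900, max: 1024 }, // characters in the description
};

// Return the list of flags for one skill; an empty list means within budget.
function checkSkillSize(bodyLines, descriptionChars) {
  const flags = [];
  if (bodyLines > BUDGETS.body.max) flags.push("BODY_OVER");
  else if (bodyLines > BUDGETS.body.warn) flags.push("BODY_WARN");
  if (descriptionChars > BUDGETS.description.max) flags.push("DESC_OVER");
  else if (descriptionChars > BUDGETS.description.warn) flags.push("DESC_WARN");
  return flags;
}
```

Under these assumptions, `checkSkillSize(111, 1019)` returns `["DESC_WARN"]`, matching the flag reported for skill-evolver's own description at 111 body lines and 1019 characters.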
The skill-sizes tripwire flagged skill-evolver's own description at 1019/1024 chars. Removed redundant trigger phrasings (overlapping 'didn't work' wording, a duplicate example) and tightened the global-lessons clause; strongest triggers preserved. Now 887 chars (under the 900 warn). quick_validate passes and skill-sizes reports all skills within budget. Demonstrates the anti-bloat loop end to end: tripwire flagged, prune cleared.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… outcome
Integrate the build-time (skill-creator) and run-time (skill-evolver) halves of the skill lifecycle via lightweight cross-references (not a merge), and close a real gap: the evolver had no path to recommend creating a NEW skill.
- creator -> evolver: Step 6 'Iterate' now points to skill-evolver for continuous, evidence-based iteration after a skill is in use (Step 6 still covers immediate in-authoring tweaks).
- evolver -> creator: new 'Needs a new skill' classification outcome for a substantial out-of-scope task, or splitting an over-budget skill that's doing two jobs -> hand off to skill-creator. Added to the rubric table, SKILL.md target-decision list, and bloat-control prune procedure.
Kept separate by design (distinct triggers, freedom levels, 1024-char description ceiling). Both skills pass quick_validate; skill-sizes reports all within budget.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
❌ Work item link check failed. Description does not contain AB#{ID}.
Summary
Adds skill-evolver — a closed-loop self-improvement system for skills and AI tools — and integrates it with the existing skill lifecycle. Built and refined entirely via its own retrospective loop this session.
The loop: capture friction -> analyze -> propose reviewed edits -> validate -> log -> measure.
What's included
New system
- .github/hooks/journal-utils.js: single-writer JSONL friction journal store + CLI (record, stats, skill-sizes, list, attribution markers). Store lives in ~/.skill-evolution/ (gitignored).
- .github/skills/skill-evolver/: SKILL.md + references (friction-schema, classification-rubric, edit-safety-rules, bloat-control).
- .github/skill-evolution/evolution-log.md: auditable changelog of every applied change with rollback refs.
Runtime-honest capture
- friction-capture.js is marked dormant (Claude Code-style hook; the GitHub Copilot CLI runtime has no hooks, so it never fires here). Active, agent-driven capture is the primary mechanism on this runtime; the hook registrations were removed from orchestrator.json.
Anti-bloat guardrails
- skill-sizes tripwire flags any SKILL.md over body/description budget; prune phase + consolidate-over-append + references-over-body rules counter the loop's natural addition bias.
Lifecycle integration with skill-creator
Registration
- copilot-instructions.md skills table updated.
Validation
- quick_validate.py.
- skill-sizes reports all skills within budget.
Notes