fix(skill): dedupe skills with identical SKILL.md content#890
Conversation
- hash SKILL.md content with sha256 during discovery - skip re-registering a skill whose content was already seen - prefer the sub-skill form when a duplicate collides with a parent copy
🦋 Changeset detectedLatest commit: 30d87e2 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
commit: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30d87e23e9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const existing = input.byContent.get(contentHash); | ||
| if (existing !== undefined) { | ||
| const preferCurrent = | ||
| discovered.metadata.isSubSkill === true && existing.metadata.isSubSkill !== true; | ||
| if (!preferCurrent) return undefined; |
There was a problem hiding this comment.
Do not dedupe every identical content hash
When two distinct skills happen to have byte-identical markdown, this returns before onDiscoveredSkill and before adding the second name. That drops legitimate cases such as foo.md and bar.md with the same body but different filename-derived names, and it also prevents SessionSkillRegistry.loadRoots from indexing the second plugin's copy for getPluginSkill when two enabled plugins vendor the same startup SKILL.md; those are not flattened sub-skill duplicates and should remain addressable.
Useful? React with 👍 / 👎.
Related Issue
Problem
When a parent skill bundle has
has-sub-skill: true, the scanner walks its nested sub-skill directories (e.g.outer/child/SKILL.md) and registers qualified names such asouter.child. Some setups also ship the same sub-skill as a flattened copy (e.g.outer__child/SKILL.md) with byte-identicalSKILL.mdcontent. Previously both copies were registered, so the same skill showed up twice — once under the qualified sub-skill name and once under the flattened directory name.What changed
SKILL.mdwith SHA-256 and track seen content during discovery.Checklist
gen-changesetsskill, or this PR needs no changeset.gen-docsskill, or this PR needs no doc update.