Add polygraph skill: behavioral trust grades for MCP servers #477
Open
RubenSousaDinis wants to merge 8 commits into
Polygraph grades MCP servers A–F by connecting like an agent, fingerprinting the exact tool surface, and running three behavioral probes (C-01 tool-output injection, C-02 permission/egress overreach, C-03 sensitive-data leak), then publishing a reproducible grade as an onchain EAS attestation on Base. The skill covers: checking a grade (`npx polygraphso check <server>`), running the open litmus harness locally to grade your own server, why a server got a given grade, and the verify-before-trust pattern for Bankr agents (recompute the live tool-surface fingerprint and require it to match the attestation before executing). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Behavioral grades are now live via `polygraphso check` / `list` (A–F across graded servers), so replace the "rolling out / not yet available" framing and the stale example outputs with the real current CLI output, including the shipped grades. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
"Triggers on:" inside the plain-scalar description made YAML read it as a nested
mapping ("mapping values are not allowed in this context"). Reword to "Triggers
on mentions of" (no colon), matching the zerion skill convention.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
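The failure mode in this commit (a `: ` inside an unquoted YAML value being read as a nested mapping) can be caught before review with a small check. The following is a minimal sketch, not a YAML parser: it assumes frontmatter lines are simple single-line `key: value` pairs, the helper name is hypothetical, and the two `description` strings are made-up stand-ins rather than the skill's real frontmatter.

```python
def plain_scalar_has_mapping_colon(line: str) -> bool:
    """Heuristic: True if an unquoted `key: value` line has another ': ' in its value."""
    key, sep, value = line.partition(": ")
    if not sep:
        return False  # no key/value separator on this line
    value = value.strip()
    if value[:1] in ('"', "'", "|", ">"):
        return False  # quoted or block scalars may contain ': ' safely
    # A second ': ' (or a trailing ':') makes YAML try to start a nested mapping.
    return ": " in value or value.endswith(":")


# Illustrative strings only (assumed, not copied from SKILL.md).
bad = "description: Grades MCP servers. Triggers on: polygraph, trust grade"
good = "description: Grades MCP servers. Triggers on mentions of polygraph, trust grade"

print(plain_scalar_has_mapping_colon(bad))   # True
print(plain_scalar_has_mapping_colon(good))  # False
```

Quoting the whole description would also avoid the error; rewording keeps the value a plain scalar, which is the route this commit takes.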
Address review feedback:
- Scope to MCP servers (drop "AI tools"; the whole harness is MCP-specific).
- Make the remote/Docker-less B-cap explicit and frame it as a property of the measurement, not a knock (a remote B is not "worse than" a local A).
- Stop hardcoding named third-party grades; keep one live first-party A as proof and treat the live set / attestation as the point-in-time source of truth.
- Present the live scale as A/B/D/F; note C/E are not assigned (C reserved).
- Elevate runtime verify-before-trust above the get-graded CTA and surface the evasion caveat at the trust decision.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verified the skill against the now-published package. Remove the `challenge` command and the `check <ref[@Version]>` form — neither exists in the published CLI (commands: litmus/check/list; flags --json/--bearer/--header/--allow-state-changing and env POLYGRAPH_API_URL/LITMUS_BEARER/LITMUS_STDIO_ISOLATION all confirmed present). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add references/ci-gate.md (the polygraphso/litmus@v1 Action that fails a build when an MCP server or an Agent Skill grades D/F) and a 'Gate your CI on grades' section + reference link in SKILL.md. Co-Authored-By: Claude <noreply@anthropic.com>
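The gate rule described in this commit (fail the build when anything grades D or F) reduces to a small decision function. This is a sketch of that rule only: it does not reproduce the inputs, outputs, or internals of the `polygraphso/litmus@v1` Action, and the function names and target names are hypothetical.

```python
def failing_targets(grades: dict[str, str], blocked: tuple[str, ...] = ("D", "F")) -> list[str]:
    """Return the targets whose grade is in the blocked set, in input order."""
    return [target for target, grade in grades.items() if grade in blocked]


def gate_exit_code(grades: dict[str, str]) -> int:
    """Exit code a CI step could return: 1 if any target grades D/F, else 0."""
    return 1 if failing_targets(grades) else 0


# Hypothetical targets: one MCP server and one skill shipped alongside it.
grades = {"my-mcp-server": "A", "my-skill": "D"}
print(failing_targets(grades))  # ['my-skill']
print(gate_exit_code(grades))   # 1
```

A non-zero exit code is what makes a CI step fail, so a wrapper script could pass this value to `sys.exit`.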
…ory, the ci command Bring the skill up to date with the published @polygraphso/litmus: four probe categories (adds C-04 adversarial-input handling), methodology version litmus-v9 in the illustrative outputs, and the ci command in the CLI reference. Co-Authored-By: Claude <noreply@anthropic.com>
Adds the required catalog.json (slug=polygraph, install type bankr) so the skill appears in the Bankr Discover catalog, a square logo.svg, and a README table row. Rebased onto current main.
6738fcb to
f353528
polygraph — behavioral trust grades (A–F) for MCP servers and Agent Skills
Adds a `polygraph/` skill (polygraph.so). Polygraph connects to an MCP server the way an agent
would, fingerprints its exact tool surface, and runs four behavioral probes (C-01 tool-output
injection, C-02 permission/egress overreach, C-03 sensitive-data leak, C-04 adversarial-input
handling), then grades it A–F and can publish a reproducible grade as an onchain EAS attestation
on Base. The harness is open source, so anyone can re-run it and disprove a bad grade.
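The fingerprint-then-compare idea behind the runtime gate can be sketched as follows. This PR does not specify how Polygraph computes its tool-surface fingerprint, so the SHA-256 over canonical JSON used here is an assumption, as are the function names, the sample tool entries, and the choice of A/B as acceptable grades.

```python
import hashlib
import json


def tool_surface_fingerprint(tools: list[dict]) -> str:
    """Hash a canonical JSON form of the tool list (assumed scheme, not Polygraph's)."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),  # order-independent
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def safe_to_execute(live_tools, attested_fingerprint, attested_grade, allowed=("A", "B")) -> bool:
    """Allow execution only if the grade is acceptable AND the live surface matches."""
    return (
        attested_grade in allowed
        and tool_surface_fingerprint(live_tools) == attested_fingerprint
    )


# Hypothetical tool surface, as an agent might see it when listing a server's tools.
tools = [
    {"name": "read_file", "description": "Read a file"},
    {"name": "write_file", "description": "Write a file"},
]
attested = tool_surface_fingerprint(tools)

# A changed description alters the surface, so the attested grade no longer applies.
tampered = [dict(tools[0], description="Read a file and email it"), tools[1]]
print(safe_to_execute(tools, attested, "A"))     # True
print(safe_to_execute(tampered, attested, "A"))  # False
```

The point of comparing fingerprints at call time is that a grade only describes the surface that was measured; if the server changes its tools afterwards, the check fails closed.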
CTA for builders
What the skill covers
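- Check a grade: `npx polygraphso check npm/@modelcontextprotocol/server-filesystem`
- Verify before trust: recompute the live tool-surface fingerprint and require it to match the attestation before letting Bankr execute (the runtime gate)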
- Gate CI on grades: `polygraphso/litmus@v1` (or `npx @polygraphso/litmus ci`) fails a build when an MCP server or a skill it ships grades D/F; see `polygraph/references/ci-gate.md`
Conforms to the contribution guide
- `polygraph/catalog.json`: `slug` equals the folder, `install.type: bankr`, so it appears in the Bankr Discover catalog
- `polygraph/logo.svg` (square mark), `polygraph/SKILL.md` with `name` + `description` frontmatter, supporting docs under `polygraph/references/`
- Rebased onto current `main`
Verified against the published packages
- `polygraphso`: the lookup CLI (`check` / `list`)
- `@polygraphso/litmus`: the open harness, the `ci` gate, and the `polygraphso/litmus@v1` GitHub Action
Happy to adjust naming/scope to match your conventions.