feat(compile): persist derived data as versioned artifacts #256

Draft
jamesdabbs wants to merge 3 commits into main from derived-artifacts

@jamesdabbs

Compile already deduces every space to validate for contradictions, then throws the result away — so every client and SSR request re-derives the same deterministic data. This PR keeps that work and publishes it: alongside the unchanged bundle.json, compile now emits a versioned artifact set (manifest.json, slim core.json, per-space derived-trait shards, text.json) that the viewer will consume in a follow-up, retiring runtime deduction for canonical data.

Format contract, versioning, and determinism guarantees are documented in doc/artifacts.md.

Review notes

Easiest to review commit by commit, "make the change easy" → "make the change":

  1. feat(core): add derived-data artifact schemas and serialization — schemas + canonical serialization + determinism test. The load-bearing property is byte-identical output for the same logical input: prover input order is pinned (artifacts.implications/deduceSpace) and every collection is sorted.
  2. refactor(compile): keep per-space derivations from bundle validation — replaces the FIXME'd check helper; behavior-preserving except deductions are now returned (and the implication index is built once, not per space).
  3. feat(compile): emit the derived-data artifact set alongside bundle.json — the emit itself, plus a viewer parity test pinning artifact deduction ≡ client prover on the same fixture (guards the "users can re-derive published results" property).
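
The byte-identical property from commit 1 can be illustrated with a minimal sketch. The `DerivedTrait` and `Shard` shapes and the `serializeShard` helper are hypothetical stand-ins, not the schemas this PR adds (those are documented in doc/artifacts.md). The idea is only that sorting object keys and collections before `JSON.stringify` removes any dependence on insertion order.

```typescript
// Hypothetical shapes for illustration; the real schemas live in the PR.
type DerivedTrait = { property: string; value: boolean; proof: string[] };
type Shard = { space: string; traits: DerivedTrait[] };

// Rebuild objects with their keys in sorted order so JSON.stringify emits
// them deterministically. Arrays keep their order; callers sort them.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(record).sort()) {
      out[key] = sortKeys(record[key]);
    }
    return out;
  }
  return value;
}

// Locale-independent string comparison, so ordering never varies by machine.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Sort the trait collection by property id, then serialize with sorted keys.
function serializeShard(shard: Shard): string {
  const traits = shard.traits
    .slice()
    .sort((a, b) => compareStrings(a.property, b.property));
  return JSON.stringify(sortKeys({ space: shard.space, traits }));
}
```

Under these assumptions, two shards holding the same traits in a different order, or with keys inserted in a different order, serialize to the same string. A plain code-unit comparison is used rather than `localeCompare` so the result does not depend on the host locale.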

Verified against the live data repo (222 spaces × 244 properties): 49,578 derived traits across 222 shards (max 44 KB), byte-identical across repeated runs, sizes matching the plan's measurements.
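
A check of the "byte-identical across repeated runs" claim could look roughly like the sketch below. It assumes each run's output has already been read into a plain object mapping file name to contents; `ArtifactSet` and `differingFiles` are hypothetical names and are not part of this PR.

```typescript
// Hypothetical model of one compile run: file name -> serialized contents.
type ArtifactSet = Record<string, string>;

// Return the sorted names of files that are missing from one run or whose
// contents differ between the two runs. An empty result means the runs match.
function differingFiles(first: ArtifactSet, second: ArtifactSet): string[] {
  const names = Object.keys({ ...first, ...second }).sort();
  return names.filter((name) => first[name] !== second[name]);
}
```

Reporting the differing file names, rather than a single boolean, makes a determinism failure easier to trace to a specific shard.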

Note for the release that follows: compile at HEAD rejects the data repo's current main. 7 trait files under S000171/S000172 are missing description bodies, which validations.trait requires but the currently-deployed compile release evidently tolerates. Those files need fixing (or the validation needs relaxing) before this ships to the data pipeline.

feat(core): add derived-data artifact schemas and serialization

Define the versioned artifact set (manifest, slim core, per-space derived
shards, text) that compile will publish so consumers can load precomputed
deductions instead of re-running the prover. Serialization is canonical -
prover inputs and all output collections are sorted - so the same logical
input yields byte-identical artifacts, which diffs and sha-keyed caching
will rely on. See doc/artifacts.md for the format contract.

refactor(compile): keep per-space derivations from bundle validation

Bundle validation already deduces every space to check for contradictions,
but previously discarded the result. Return the Derivations keyed by space
uid instead (threading them through load), so a follow-up can emit them as
derived-data artifacts. Deduction now runs through core's artifact entry
points, which pin prover input order for reproducible proofs, and builds the
implication index once instead of per space.

feat(compile): emit the derived-data artifact set alongside bundle.json

Write manifest.json, core.json, text.json and per-space derived-trait shards
(to an 'artifacts' directory by default) next to the unchanged bundle.json.
Adds a viewer parity test - artifact deduction equals what the client prover
derives on the same fixture - guarding the property that users can re-derive
published results locally.

Verified against the live data repo: 49,578 derived traits across 222 shards
(max 44 KB), byte-identical across repeated runs.
@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | pi-base-topology | 5738411 | Commit Preview URL, Branch Preview URL | Jul 03 2026, 12:37 AM |

@cloudflare-workers-and-pages

Deploying topology with Cloudflare Pages

Latest commit: 5738411
Status: ✅  Deploy successful!
Preview URL: https://a6d33f94.topology.pages.dev
Branch Preview URL: https://derived-artifacts.topology.pages.dev

View logs
