docs(deploy): spec + explanation for set_replica_identity (#1447) #180
Two new pages for the dj.deploy.set_replica_identity helper landing in DataJoint 2.3 (datajoint-python #1466):

- src/reference/specs/deploy-operations.md: normative spec for the datajoint.deploy module, with set_replica_identity as the first inhabitant. Includes a Design rationale section explaining the three structural choices: migration-only (no auto-emit at declare time), a new module rather than dj.migrate, and idempotency by default.
- src/explanation/postgresql-cdc-replication.md: explainer covering what REPLICA IDENTITY is at the PostgreSQL level, why CDC consumers care (Databricks Lakehouse Sync mandates FULL and silently skips tables that lack it), cost and compliance considerations, and the representative workflow.

Both pages cross-link to each other. The spec carries the formal API contract; the explainer carries the reasoning and the WAL/compliance tradeoffs that motivate it.

Nav: new "Operations" group under Concepts; new "Deployment" group under Reference > Specifications.

Slated for 2.3 alongside the implementation PR.
MilagrosMarin
left a comment
Verified carefully against datajoint-python#1466:
✅ Signature, return shape, error messages match impl exactly
✅ Schema.list_tables() already excludes ~/~~ tables (schemas.py:525-545) — the CDC-relevant filter
✅ Cross-links resolve; nav placement under Concepts → Operations and Reference → Specifications → Deployment is sensible
✅ The "Design rationale" section is unusually well-argued — the migration-only argument (mixed-state failure mode from a config-flag + utility combo) is the right framing
✅ Databricks Lakehouse Sync "silently skipped" framing accurately motivates the feature
Sequencing note: impl PR #1466 is still open (I requested changes on the assert usage). This docs PR is independently reviewable but ideally lands alongside or after the impl.
Approving — the spec and explainer are a clean pair.
The companion spec page (deploy-operations.md) already carries a version-added admonition. Mirror it on the explainer so a reader landing here from search or a link sees the version context up front rather than only from the spec cross-link.
Summary
Adds two new pages documenting the `dj.deploy.set_replica_identity` helper that lands in DataJoint 2.3 (datajoint-python #1466). Closes the docs side of datajoint-python #1447.
The spec carries the formal API contract; the explainer carries the reasoning and the WAL/compliance tradeoffs that motivate the feature. Both pages cross-link to each other.
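For readers unfamiliar with the underlying PostgreSQL mechanics, here is a minimal sketch of what setting the replica identity idempotently involves at the SQL level. It is not the `datajoint.deploy` implementation: `plan_replica_identity_full` and `quote_ident` are hypothetical names introduced only for illustration, and the `~` prefix filter is an assumption that mirrors the hidden-table exclusion noted in the review. The catalog column `pg_class.relreplident` and the `ALTER TABLE ... REPLICA IDENTITY FULL` statement are standard PostgreSQL.

```python
from __future__ import annotations

# Illustrative only: these helpers are hypothetical, not the datajoint.deploy API.

# PostgreSQL records each table's replica identity in pg_class.relreplident:
# 'd' = DEFAULT, 'n' = NOTHING, 'f' = FULL, 'i' = USING INDEX.
CURRENT_IDENTITY_SQL = """
SELECT c.relname, c.relreplident
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = %s AND c.relkind = 'r'
"""


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def plan_replica_identity_full(schema: str, current: dict[str, str]) -> list[str]:
    """Return ALTER TABLE statements for tables not yet at REPLICA IDENTITY FULL.

    `current` maps table name -> relreplident code, e.g. the rows returned by
    CURRENT_IDENTITY_SQL. Names starting with "~" are skipped (assumed here to
    be hidden bookkeeping tables).
    """
    statements = []
    for table in sorted(current):
        if table.startswith("~"):
            continue  # hidden table: not relevant to CDC consumers
        if current[table] == "f":
            continue  # already FULL: nothing to do, which keeps reruns idempotent
        statements.append(
            f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
            "REPLICA IDENTITY FULL"
        )
    return statements
```

Planning the statements from the current catalog state, rather than issuing them unconditionally, means a second run over an already-configured schema produces an empty plan.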
Why a new module page rather than folding into existing specs
`datajoint.deploy` is the first of an emerging category of operational helpers (publication membership, vacuum/reindex, role grants are plausible siblings). Giving it a dedicated spec page now — with one inhabitant — establishes the boundary against `datajoint.migrate` and provides a home for future helpers without retroactive reorganization. The rationale section in the spec walks through the alternatives that were rejected.
Sequencing
This PR is independently reviewable but should land alongside or after datajoint-python #1466, the PR that implements the function. Whichever order they merge in, no broken links result; both stand alone.
Test plan