docs(#1454): document ~lineage auto-refresh on decoration #181
Docs companion to datajoint-python #1467, which adds an in-memory check at @schema decoration time: when an already-declared table's heading shows any PK attribute with lineage=None, the table's ~lineage rows are auto-refreshed. Healthy schemas pay zero extra DB queries; the refresh only fires when the symptom is present.

- src/explanation/semantic-matching.md: new "The ~lineage table" section explaining how the table is maintained, the in-memory check on decoration (new in 2.3), and explicitly calling out the limitation (stale-but-non-None entries require a manual rebuild). Adds the dj.migrate.rebuild_lineage(schema, dry_run) alternative for users who want a preview before applying.
- src/reference/specs/semantic-matching.md: version-added admonition on "Rebuilding Lineage" updated to describe the in-memory check rather than an unconditional refresh. The "When you still need to call this explicitly" list now leads with the stale-but-non-None case (the primary scenario the auto-heal can't reach), followed by production-mode suppression and cross-schema upstream changes.

Slated for DataJoint 2.3 alongside datajoint-python #1467.
a734046 to de191c7
MilagrosMarin
left a comment
Clean companion to #1467. Verified:
✅ "New in 2.3" admonition accurately describes the in-memory check (not unconditional refresh).
✅ Stale-but-non-None limitation explicitly called out — matches what #1467 doesn't auto-heal.
✅ Production-mode (create_tables=False) suppression claim verifies against impl: schemas.py:303 guards the refresh on create_tables=True. ✓
✅ Cross-schema upstream limitation correctly identified.
✅ Error message quoted in docs matches condition.py:assert_join_compatibility (modulo ... truncation).
The "When you still need to call this explicitly" list (stale-but-non-None / production mode / cross-schema upstream) is precise and complete.
One small wording observation, optional: the spec page calls the in-memory check's miss "Stale-but-non-None rows" and the impl error message says "stale ~lineage entry". The case that actually triggers the error (lineage missing on one side) is more accurately "missing or stale", since both call for the same fix. Not blocking; your wording reads cleanly.
Approving.
Summary
Docs companion to datajoint-python #1467. The python PR adds an in-memory check at `@schema` decoration time — when an already-declared table's heading shows any PK attribute with `lineage=None`, the table's `~lineage` rows are auto-refreshed. Healthy schemas pay zero extra DB queries; the refresh only fires when the symptom is in memory.
This PR documents the new behavior and the limitation (stale-but-non-None entries still require manual `rebuild_lineage()`). Closes the docs side of datajoint-python #1454.
Why the lighter approach
The earlier draft of #1467 refreshed lineage on every decoration unconditionally, which would have added per-table DB queries on every schema activation (~50 queries for a 10-table schema). The current implementation guards the refresh on an in-memory check against the heading's already-loaded lineage values — zero extra cost on healthy schemas, auto-heal only when the bug's symptom (None lineage) is present.
The stale-but-non-None case (DJ version skew that wrote lineage in a different string format) is not auto-detected here. Users hit the improved error message in #1467 — "...Run `schema.rebuild_lineage()`..." — and run the explicit rebuild.
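To make the guard and its blind spot concrete, here is a minimal, self-contained sketch. It is not the code from #1467: `Attribute`, `FakeTable`, `needs_refresh`, and `on_decoration` are hypothetical stand-ins, the lineage strings are made up, and the "refresh" is simulated with a counter instead of real `~lineage` writes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attribute:
    """Hypothetical stand-in for a heading attribute (not DataJoint's class)."""
    name: str
    in_key: bool
    lineage: Optional[str]


@dataclass
class FakeTable:
    """Hypothetical stand-in for an already-declared table."""
    attributes: List[Attribute]
    refresh_count: int = 0  # counts simulated ~lineage refreshes (the DB work)

    def refresh_lineage(self) -> None:
        # Simulated refresh: fill in missing PK lineage with a made-up value.
        self.refresh_count += 1
        for attr in self.attributes:
            if attr.in_key and attr.lineage is None:
                attr.lineage = f"schema.table.{attr.name}"


def needs_refresh(table: FakeTable) -> bool:
    """In-memory check: True if any primary-key attribute has lineage=None."""
    return any(a.in_key and a.lineage is None for a in table.attributes)


def on_decoration(table: FakeTable) -> bool:
    """Refresh only when the symptom is present; report whether it fired."""
    if needs_refresh(table):
        table.refresh_lineage()
        return True
    return False


healthy = FakeTable([Attribute("subject_id", True, "lab.subject.subject_id")])
broken = FakeTable([Attribute("subject_id", True, None)])
# Stale but non-None: the value is wrong, yet the None check cannot see that.
stale = FakeTable([Attribute("subject_id", True, "old-format:subject_id")])

results = {
    name: on_decoration(table)
    for name, table in [("healthy", healthy), ("broken", broken), ("stale", stale)]
}
print(results)  # {'healthy': False, 'broken': True, 'stale': False}
```

Only the `broken` table triggers a refresh. The `healthy` table costs nothing, and the `stale` table slips through untouched, which is why that case still needs an explicit `schema.rebuild_lineage()`.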
Sequencing
Reviewable now. Should land alongside or after datajoint-python #1467 so the docs don't claim auto-heal before the code ships.
Test plan