docs(#1454): document ~lineage auto-refresh on decoration#181

Merged: dimitri-yatsenko merged 2 commits into `main` from `fix/1454-lineage-docs` on Jun 10, 2026
@dimitri-yatsenko dimitri-yatsenko commented Jun 10, 2026

Member

Summary

Docs companion to datajoint-python #1467. The Python PR adds an in-memory check at `@schema` decoration time: when an already-declared table's heading shows any PK attribute with `lineage=None`, the table's `~lineage` rows are auto-refreshed. Healthy schemas pay zero extra DB queries; the refresh fires only when that symptom is present in the already-loaded heading.

This PR documents the new behavior and the limitation (stale-but-non-None entries still require manual `rebuild_lineage()`). Closes the docs side of datajoint-python #1454.
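As a rough illustration of the check described above, here is a minimal sketch in plain Python. It is not the code from datajoint-python #1467: the `Attribute` class, the `needs_lineage_refresh` function, and the lineage strings are hypothetical stand-ins chosen for this example. It only shows the shape of the guard and why a stale-but-non-None entry slips past it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribute:
    """Hypothetical stand-in for one heading attribute (not DataJoint's class)."""
    name: str
    in_key: bool             # True if the attribute is part of the primary key
    lineage: Optional[str]   # lineage value already loaded with the heading


def needs_lineage_refresh(heading: list[Attribute]) -> bool:
    """Return True if any primary-key attribute has no lineage loaded.

    The check reads only values that are already in memory, so it
    issues no database queries of its own.
    """
    return any(attr.in_key and attr.lineage is None for attr in heading)


# Placeholder lineage strings; the real format is not shown in this PR.
healthy = [Attribute("subject_id", True, "placeholder-lineage")]
broken = [Attribute("subject_id", True, None)]
stale = [Attribute("subject_id", True, "placeholder-old-format")]

print(needs_lineage_refresh(healthy))  # False: nothing to heal
print(needs_lineage_refresh(broken))   # True: auto-refresh would fire
print(needs_lineage_refresh(stale))    # False: stale but non-None is not detected
```

The third case is the documented limitation: the value is wrong but present, so a None-only check cannot see it and the explicit rebuild is still required.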

| File | Change |
|---|---|
| `src/explanation/semantic-matching.md` | New "The `~lineage` table" section: how the table is maintained, the in-memory check on decoration (new in 2.3), the auto-heal scope, and explicit call-out of the stale-but-non-None limitation. Adds the `dj.migrate.rebuild_lineage(schema, dry_run)` helper as the alternative path with preview support. |
| `src/reference/specs/semantic-matching.md` | `version-added` admonition on "Rebuilding Lineage" updated to describe the in-memory check (not an unconditional refresh). "When you still need to call this explicitly" list reordered to lead with the stale-but-non-None case. |

Why the lighter approach

The earlier draft of #1467 refreshed lineage on every decoration unconditionally, which would have added per-table DB queries on every schema activation (~50 queries for a 10-table schema). The current implementation guards the refresh on an in-memory check against the heading's already-loaded lineage values — zero extra cost on healthy schemas, auto-heal only when the bug's symptom (None lineage) is present.

The stale-but-non-None case (DJ version skew that wrote lineage in a different string format) is not auto-detected here. Users hit the improved error message in #1467 — "...Run `schema.rebuild_lineage()`..." — and run the explicit rebuild.
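The cost argument can be made concrete with a small back-of-the-envelope model. This is only a sketch: the per-refresh figure of 5 queries is an assumption derived from the rough "~50 queries for a 10-table schema" estimate above, not a measured number, and `extra_queries` is a hypothetical helper.

```python
QUERIES_PER_REFRESH = 5  # assumption: ~50 queries / 10 tables, per the estimate above


def extra_queries(tables: dict[str, list], unconditional: bool) -> int:
    """Count extra DB queries at decoration time under each strategy.

    `tables` maps a table name to the lineage values of its PK attributes
    as already loaded in memory.
    """
    if unconditional:
        # Earlier draft: refresh every table on every decoration.
        return QUERIES_PER_REFRESH * len(tables)
    # Guarded approach: refresh only tables showing the None symptom.
    flagged = [name for name, lineages in tables.items() if None in lineages]
    return QUERIES_PER_REFRESH * len(flagged)


healthy_schema = {f"table_{i}": ["placeholder-lineage"] for i in range(10)}
one_broken = {**healthy_schema, "table_3": [None]}

print(extra_queries(healthy_schema, unconditional=True))   # 50
print(extra_queries(healthy_schema, unconditional=False))  # 0
print(extra_queries(one_broken, unconditional=False))      # 5
```

Under these assumptions the guarded approach costs nothing on a healthy schema and scales with the number of affected tables rather than with schema size.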

Sequencing

Reviewable now. Should land alongside or after datajoint-python #1467 so the docs don't claim auto-heal before the code ships.

Test plan

  • `mkdocs serve` renders the new section under Concepts → Queries → Semantic Matching
  • Wording stays consistent with the error message added in datajoint-python #1467
  • Limitation call-outs are precise (stale-but-non-None case, production mode, cross-schema upstream)

Docs companion to datajoint-python #1467, which adds an in-memory check
at @schema decoration time: when an already-declared table's heading
shows any PK attribute with lineage=None, the table's ~lineage rows are
auto-refreshed. Healthy schemas pay zero extra DB queries; the refresh
only fires when the symptom is present.

- src/explanation/semantic-matching.md: new "The ~lineage table" section
  explaining how the table is maintained, the in-memory check on
  decoration (new in 2.3), and explicitly calling out the limitation
  (stale-but-non-None entries require manual rebuild). Adds the
  dj.migrate.rebuild_lineage(schema, dry_run) alternative for users who
  want a preview before applying.

- src/reference/specs/semantic-matching.md: version-added admonition on
  "Rebuilding Lineage" updated to describe the in-memory check rather
  than an unconditional refresh. The "When you still need to call this
  explicitly" list now leads with the stale-but-non-None case (the
  primary scenario the auto-heal can't reach), production-mode
  suppression, and cross-schema upstream changes.

Slated for DataJoint 2.3 alongside datajoint-python #1467.
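
To make the scope of the auto-heal easier to see, the sketch below combines the two conditions named in this commit message that can be checked locally: production-mode suppression (`create_tables=False`) and the None-only check. The function `auto_refresh_fires` is hypothetical and is not the datajoint-python implementation; cross-schema upstream changes are not modelled here.

```python
from typing import Optional, Sequence


def auto_refresh_fires(create_tables: bool,
                       pk_lineages: Sequence[Optional[str]]) -> bool:
    """Sketch of when the decoration-time refresh would run.

    Assumes the refresh is suppressed when create_tables is False and
    otherwise runs only if some PK attribute has lineage None.
    """
    return create_tables and any(value is None for value in pk_lineages)


# Missing lineage, development mode: auto-heal applies.
print(auto_refresh_fires(True, [None]))                      # True
# Missing lineage, production mode: suppressed, rebuild explicitly.
print(auto_refresh_fires(False, [None]))                     # False
# Stale but non-None: not detected, rebuild explicitly.
print(auto_refresh_fires(True, ["placeholder-old-format"]))  # False
```

Each `False` row with a real problem behind it corresponds to a case where the explicit rebuild is still needed.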

@MilagrosMarin MilagrosMarin left a comment

Collaborator

Clean companion to #1467. Verified:

✅ "New in 2.3" admonition accurately describes the in-memory check (not unconditional refresh).
✅ Stale-but-non-None limitation explicitly called out — matches what #1467 doesn't auto-heal.
✅ Production-mode (create_tables=False) suppression claim checks out against the implementation: schemas.py:303 guards the refresh on create_tables=True.
✅ Cross-schema upstream limitation correctly identified.
✅ Error message quoted in docs matches condition.py:assert_join_compatibility (modulo ... truncation).

The "When you still need to call this explicitly" list (stale-but-non-None / production mode / cross-schema upstream) is precise and complete.

One small wording observation, optional: the spec page calls the in-memory check's miss "Stale-but-non-None rows" and the implementation's error message says "stale ~lineage entry". The case that actually triggers the error (lineage missing on one side) is more accurately "missing or stale", since both call for the same fix. Not blocking; your wording reads cleanly.

Approving.

@dimitri-yatsenko dimitri-yatsenko merged commit 6d4764a into main Jun 10, 2026
2 checks passed
@dimitri-yatsenko dimitri-yatsenko deleted the fix/1454-lineage-docs branch June 10, 2026 23:39