(opt): keep syntax node ids stable across reparses#10105
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
b65f1d0 to
025c2b9
Compare
PR SummaryHigh Risk Overview Canonical roots are created only in the Detached The defs cache no longer serializes green trees for file roots: on load, roots are restored through Reviewed by Cursor Bugbot for commit b44c9c9. Bugbot is set up for automated code reviews on this repo. Configure here. |
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed 5 files and all commit messages, and made 1 comment.
Reviewable status: 5 of 8 files reviewed, 2 unresolved discussions (waiting on eytan-starkware and TomerStarkware).
crates/cairo-lang-parser/src/parser.rs line 206 at r2 (raw file):
/// Parses a file expr, returning the green root. See [Self::parse_file_green]. pub fn parse_file_expr_green(
should some of the non _green fns be removed?
025c2b9 to
e2bb34f
Compare
eytan-starkware
left a comment
There was a problem hiding this comment.
@eytan-starkware made 1 comment.
Reviewable status: 5 of 9 files reviewed, 2 unresolved discussions (waiting on orizi and TomerStarkware).
crates/cairo-lang-parser/src/parser.rs line 206 at r2 (raw file):
Previously, orizi wrote…
should some of the non
_greenfns be removed?
Yes. Done. Also changed a util to rely on the query so it wont use a detached root.
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed 3 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: 8 of 9 files reviewed, 2 unresolved discussions (waiting on eytan-starkware and TomerStarkware).
crates/cairo-lang-defs/src/cache/mod.rs line 1645 at r3 (raw file):
/// A file whose content is not re-derivable at load time (virtual / plugin-generated): its /// green is stored and the root is rebuilt detached, standing on its own. // TODO: give plugin-generated code a canonical source too, so this variant can be removed and
should be solved or have an assignee.
e2bb34f to
061e908
Compare
eytan-starkware
left a comment
There was a problem hiding this comment.
@eytan-starkware made 1 comment.
Reviewable status: 8 of 9 files reviewed, 2 unresolved discussions (waiting on orizi and TomerStarkware).
crates/cairo-lang-defs/src/cache/mod.rs line 1645 at r3 (raw file):
Previously, orizi wrote…
should be solved or have an assignee.
Done.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 061e908. Configure here.
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: 8 of 9 files reviewed, 2 unresolved discussions (waiting on eytan-starkware and TomerStarkware).
crates/cairo-lang-defs/src/cache/mod.rs at r4 (raw file):
awaiting updates discussed f2f.
061e908 to
b2a1d57
Compare
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed 6 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on eytan-starkware and TomerStarkware).
1334cde to
9cb7a3d
Compare
b2a1d57 to
783ecae
Compare
783ecae to
34274a7
Compare
9cb7a3d to
329a5a8
Compare
34274a7 to
d40ef8c
Compare
329a5a8 to
3f5bf0b
Compare
eytan-starkware
left a comment
There was a problem hiding this comment.
@eytan-starkware made 1 comment.
Reviewable status: 9 of 12 files reviewed, 1 unresolved discussion (waiting on orizi and TomerStarkware).
crates/cairo-lang-defs/src/cache/mod.rs at r4 (raw file):
Previously, orizi wrote…
awaiting updates discussed f2f.
Done.
3f5bf0b to
c9b0603
Compare
19ba634 to
b16c4fc
Compare
c9b0603 to
3ce0717
Compare
Seed the canonical per-file root from the file-keyed parse query (file_syntax) instead of a green-keyed detached root, so reparsing a changed file reuses the previous node ids — early cutoff downstream relies on stable stable-ptrs. Split root construction into a canonical constructor (file_syntax only) and a detached one (for standalone/transient trees). The defs cache now records each root by file id alone and re-derives the tree via file_syntax on load, unifying cached stable ptrs with the live tree. External (plugin-generated) files re-derive the same way; their content is served from the cache blob (see the external-file content cache below this in the stack), so re-deriving never cycles back into module data.
b16c4fc to
b44c9c9
Compare
3ce0717 to
0fb9292
Compare
orizi
left a comment
There was a problem hiding this comment.
@orizi reviewed 7 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on TomerStarkware).


Summary
Introduces a distinction between canonical and detached syntax tree roots to make stable pointer node IDs stable across file edits.
Previously, all root nodes were created via
SyntaxNode::new_root/SyntaxNode::new_root_with_offset, both backed by a green-keyed tracked function (new_root_node). Because the key included the green tree, any content change minted fresh node IDs, preventing salsa early cutoff from firing downstream.Now there are two explicit constructors:
SyntaxNode::new_canonical_root— called exclusively from thefile_syntax_dataparse query. The root's tracked struct is seeded by that file-keyed query, so reparsing changed content reuses the same node IDs. This is the only place a canonical root is created; all consumers must reach it viadb.file_syntax(file_id).SyntaxNode::new_detached_root/new_detached_root_with_offset— backed by the green-keyednew_detached_root_nodequery, for transient trees (proc-macro token streams, snippet formatting, tests) that have no per-file parse query to seed from.The defs cache is updated to match: on-disk file roots are reconstructed at load time by re-running
db.file_syntax(so cached stable ptrs land on canonical nodes), while virtual/plugin-generated file roots store their green tree and are rebuilt as detached roots.Type of change
Please check one:
Why is this change needed?
Stable pointers and IDs derived from them are used throughout the compiler for incremental computation. When a file was edited, the green-keyed root constructor minted new node IDs even for unchanged subtrees, defeating salsa early cutoff and causing unnecessary recomputation. By seeding the canonical root from a file-keyed query, node IDs are reused across edits, restoring early cutoff behavior.
What was the behavior or documentation before?
All root nodes were created through a single green-keyed tracked function. Every reparse of a changed file produced new node IDs for the root and all its descendants, even if the content was structurally identical.
What is the behavior or documentation after?
The canonical root for each file is created once inside
file_syntax_dataviaSyntaxNode::new_canonical_root. Reparsing a changed file reuses the same root node ID (salsa reruns the query but the tracked struct identity is preserved), allowing downstream queries that depend on stable ptrs to benefit from early cutoff. Detached roots (transient trees, macro helpers, tests) continue to use the green-keyed path and carry no stability guarantee.Related issue or discussion (if any)
Additional context
The
db_testcomparison for the parser test was updated to compare green trees rather than node identity, since the parsed root (canonical) and the hand-built root (detached) are now distinct nodes even when their content is identical.