refactor(syntax): Group SyntaxKind and unify token/terminal kinds under LexemeKind. #10154
TomerStarkware
left a comment
@TomerStarkware reviewed 23 files and all commit messages, and made 2 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on eytan-starkware and orizi).
crates/cairo-lang-syntax/src/node/kind.rs line 383 at r1 (raw file):
} pub fn is_keyword_token(&self) -> bool { matches!(
if let SyntaxKind::Token(lexemeKind) = self && lexemeKind.is_keyword_token() {
    true
} else {
    false
}
crates/cairo-lang-syntax/src/node/kind.rs line 420 at r1 (raw file):
} pub fn is_keyword_terminal(&self) -> bool { matches!(
if let SyntaxKind::Terminal(lexemeKind) = self && lexemeKind.is_keyword_token() {
    true
} else {
    false
}
f20c3f9 to
2c76ca2
orizi
left a comment
@orizi made 2 comments.
Reviewable status: 21 of 23 files reviewed, 2 unresolved discussions (waiting on eytan-starkware and TomerStarkware).
crates/cairo-lang-syntax/src/node/kind.rs line 383 at r1 (raw file):
Previously, TomerStarkware wrote…
if let SyntaxKind::Token(lexemeKind) = self && lexemeKind.is_keyword_token() {
    true
} else {
    false
}
Done.
crates/cairo-lang-syntax/src/node/kind.rs line 420 at r1 (raw file):
Previously, TomerStarkware wrote…
if let SyntaxKind::Terminal(lexemeKind) = self && lexemeKind.is_keyword_token() {
    true
} else {
    false
}
Done.
…er LexemeKind.
SyntaxKind's flat token/terminal/missing variants are grouped into nested enums. A terminal node and the token that backs it share one lexical identity, so both reuse a single `LexemeKind` instead of duplicated `TokenKind`/`TerminalKind` lists; trivia tokens (whitespace, newlines, comments, skipped) move to `TriviaKind`, and enum missing-variants to `MissingKind`:
    enum SyntaxKind {
        Token(LexemeKind),
        TriviaToken(TriviaKind),
        Terminal(LexemeKind),
        Missing(MissingKind),
        // ...other node kinds stay flat
    }
The parser/lexer layer now operates on `LexemeKind` directly (LexerTerminal, peek().kind, Terminal::KIND, MissingToken, operator-precedence and skip predicates), reserving SyntaxKind for green-tree node kinds. Debug/Display preserve the historical flat names (TerminalIdentifier, TokenWhitespace, ExprMissing) so diagnostics and golden tests are unaffected by the grouping; the lone visible change is the friendlier "Missing token Identifier." (was "TerminalIdentifier"), since kind_to_string now renders a LexemeKind.
colored_printer::set_color matches exhaustively over LexemeKind/TriviaKind, removing its panicking catch-all. All kind enums are generated from the spec via a single classifier, so the duplication can't drift.
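The "single classifier" idea can be sketched roughly as follows. This is only an illustration of deriving both flat lists from one declaration, not the actual codegen: `LexemeSpec`, `is_trivia`, and `flat_kind_names` are hypothetical names, and the assumption that trivia exists only as a token (never as a terminal) is inferred from the `TriviaToken(TriviaKind)` variant above.

```rust
/// Hypothetical spec entry: each lexical kind is declared exactly once.
struct LexemeSpec {
    name: &'static str,
    /// Assumption: trivia (whitespace, comments, ...) has a token form only.
    is_trivia: bool,
}

/// Derives the historical flat names from the single spec list.
/// Because `TokenX` and `TerminalX` come from the same entry,
/// the two lists cannot drift apart.
fn flat_kind_names(spec: &[LexemeSpec]) -> Vec<String> {
    let mut names = Vec::new();
    for entry in spec {
        names.push(format!("Token{}", entry.name));
        if !entry.is_trivia {
            names.push(format!("Terminal{}", entry.name));
        }
    }
    names
}
```

With entries for `Identifier` (not trivia) and `Whitespace` (trivia), this yields `TokenIdentifier`, `TerminalIdentifier`, and `TokenWhitespace`.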
2c76ca2 to
4df3abb

Summary
Restructures the generated `SyntaxKind` enum so that the token and terminal kinds are no longer two parallel, duplicated variant lists.
A terminal node and the token that backs it share one lexical identity, so both now reuse a single `LexemeKind` enum. Trivia tokens (whitespace, newlines, comments, skipped) move to `TriviaKind`, and enum missing-variants to `MissingKind`.
The lexer/parser layer now operates on LexemeKind directly (LexerTerminal, peek().kind, Terminal::KIND, MissingToken, operator-precedence and skip-until predicates); SyntaxKind is reserved for green-tree node kinds. All kind enums are generated from the spec through a single classifier, so the lists can't drift apart again.
Debug/Display preserve the historical flat names (TerminalIdentifier, TokenWhitespace, ExprMissing), so diagnostics and golden tests are unaffected by the grouping. colored_printer::set_color now matches exhaustively over LexemeKind/TriviaKind, removing its panic! catch-all (the long-standing "Can this be made exhaustive?" TODO).
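To illustrate how nested kinds can keep printing the flat names, here is a minimal sketch. The enums are heavily trimmed stand-ins, and the `name` helpers and `missing_token_message` are hypothetical; the generated code in the PR may be structured quite differently.

```rust
use std::fmt;

// Trimmed, hypothetical stand-ins for the generated enums.
#[derive(Clone, Copy)]
enum LexemeKind { Identifier, Plus }
#[derive(Clone, Copy)]
enum TriviaKind { Whitespace }
#[derive(Clone, Copy)]
enum MissingKind { Expr }
#[derive(Clone, Copy)]
enum SyntaxKind {
    Token(LexemeKind),
    TriviaToken(TriviaKind),
    Terminal(LexemeKind),
    Missing(MissingKind),
}

impl LexemeKind {
    fn name(self) -> &'static str {
        match self {
            LexemeKind::Identifier => "Identifier",
            LexemeKind::Plus => "Plus",
        }
    }
}
impl TriviaKind {
    fn name(self) -> &'static str {
        match self { TriviaKind::Whitespace => "Whitespace" }
    }
}
impl MissingKind {
    fn name(self) -> &'static str {
        match self { MissingKind::Expr => "Expr" }
    }
}

// Debug re-assembles the flat name from the group and the inner kind.
impl fmt::Debug for SyntaxKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            SyntaxKind::Token(k) => write!(f, "Token{}", k.name()),
            SyntaxKind::TriviaToken(k) => write!(f, "Token{}", k.name()),
            SyntaxKind::Terminal(k) => write!(f, "Terminal{}", k.name()),
            SyntaxKind::Missing(k) => write!(f, "{}Missing", k.name()),
        }
    }
}

// Hypothetical helper: rendering a LexemeKind directly drops the prefix.
fn missing_token_message(kind: LexemeKind) -> String {
    format!("Missing token {}.", kind.name())
}
```

In this sketch, `SyntaxKind::Terminal(LexemeKind::Identifier)` still debug-prints as `TerminalIdentifier`, while a message built from the bare `LexemeKind` reads "Missing token Identifier.".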
Type of change
None of the listed categories is "refactor"; this is an internal restructuring with no intended functional change (see below for the one user-visible diagnostic wording change).
Why is this change needed?
The flat SyntaxKind duplicated every lexical kind twice, once as TokenX and once as TerminalX (81 paired names), with nothing tying the two lists together; they could (and did) drift. A terminal and its backing token are the same lexical kind and should share one source of truth.
Separately, colored_printer::set_color was a non-exhaustive match with a catch-all panic!, which crashed on perfectly valid token kinds it didn't enumerate. Grouping the kinds lets that match be exhaustive over LexemeKind/TriviaKind, so the compiler now forces a coloring decision for any new token instead of allowing a latent panic.
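A rough sketch of why the exhaustive match helps, using trimmed stand-in enums. The `Color` type, the color choices, and the function names `lexeme_color`/`trivia_color` are hypothetical and are not the real colored_printer API.

```rust
// Trimmed, hypothetical stand-ins for the generated enums.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LexemeKind { Identifier, LiteralNumber, Function, Plus }
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TriviaKind { Whitespace, SingleLineComment }

// Hypothetical color type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Color { Default, Blue, Cyan, Green }

// No `_ => panic!()` arm: adding a LexemeKind variant without
// handling it here becomes a compile error, not a runtime crash.
fn lexeme_color(kind: LexemeKind) -> Color {
    match kind {
        LexemeKind::Function => Color::Blue,
        LexemeKind::LiteralNumber => Color::Cyan,
        LexemeKind::Identifier | LexemeKind::Plus => Color::Default,
    }
}

fn trivia_color(kind: TriviaKind) -> Color {
    match kind {
        TriviaKind::SingleLineComment => Color::Green,
        TriviaKind::Whitespace => Color::Default,
    }
}
```

The design point is that every kind gets an explicit arm, so a newly added token forces a decision at compile time.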
What was the behavior or documentation before?
What is the behavior or documentation after?
Everything else (Debug/Display strings, all other diagnostics, golden test output) is unchanged.
Related issue or discussion (if any)
None.
Additional context