Implement repeat directive + %rowIndex via a hybrid (staged) SQL emitter#34
Implement repeat directive + %rowIndex via a hybrid (staged) SQL emitter#34piotrszul wants to merge 13 commits into
Conversation
* Replace struct emitter with hybrid (staged) emitter (#24) Make the staged/hybrid SQL emitter (SPEC_hybrid) the sole emitter for all non-repeat, non-%rowIndex features, removing the legacy struct emitter. Stage 1 of the migration tracked in #1. What changed: - src/staged-sql-builder.js (new): the SPEC_hybrid walker — collectScope, subtreeHasFork, emitScope (CHAIN vs FORK), emitUnion, unnestFrom, lazy fork keys (rid + WITH ORDINALITY ordinals), root-key modes (natural|uuid). repeat is rejected (later stage); no %rowIndex path. - src/query-builder.js: staged-only; the `backend` flag is gone. The read schema is derived from a navigation-only path expression (view-parser `viewPaths`) instead of the struct flattening transform. - src/view-parser.js: dropped struct flattening (`parseVd` path/tables); kept `validateVd` + `extractPathsFromAst`; added `viewPaths` for read-schema path coverage. - src/ddb-sql-builder.js: kept all navigation/function/literal/comparison leaf cases, `_col`/`_col_collection`/`_invoke`, and `pathsToSchema` (shared by the staged engine); threaded the `rootVar` arg; removed struct-only directive cases (`_forEach`/`_forEachOrNull`/`_unionAll`) and `tablesToSql`. - src/cli.js: removed `--backend`; added `--root-key` (default `natural`); still executes via the duckdb 1.4.1 binding. - templates/{csv,parquet,ndjson,explore,dbt_model}.sql: promoted to the staged source/tail form; duck-macros.js gains `fhir_list`. - tests: spec.staged.test.js is the spec runner (runs every reference suite in both root-key modes); staged-rootkey.test.js added; e2e/fp-custom migrated to the staged engine; removed struct-only spec.test.js + view.test.js and the struct-directive fp-custom cases. Added spec-tests/staged.json fork coverage. - specs/SPEC_hybrid.md added; README documents `--root-key`. Verification: `bun test` 279 pass / 0 fail. End-to-end `bun run src/cli.js` on forEach and nested-fork ViewDefinitions produces correct output on duckdb 1.4.1; generated SQL uses only 1.4-valid syntax (UNNEST column, WITH ORDINALITY). Stays on `duckdb` 1.4.1 (no @duckdb/node-api). repeat and %rowIndex remain later migration stages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Simplify viewPaths to schema-coverage essentials (#24) `viewPaths` was trimmed down from the removed struct `parseVd` and carried vestigial structure that does not affect the extracted read-schema paths: the `isRoot`/`inUnion` parameters and their conditional `_nav(...)` wrapping, the `'e'` literal, and the `_col('name', ...)` wrapper. `extractPathsFromAst` only collects type-resolved `nav` segments and recurses through any function's args regardless of name, so these are inert. Collapse to a uniform inert `_nav(...)` grouping wrapper that solely carries the forEach type context. Produces byte-identical extracted paths; `bun test` 279 pass / 0 fail. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Document natural-key getResourceKey() source + cross-type collision caveat (#26) The natural root-fork key is sourced from getResourceKey() (as-is) so a future fix to that function propagates to the key. getResourceKey() today resolves to the bare id, so the natural key only disambiguates within a single resource type; the mixed-type collision risk is tracked in #26. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Address code-review findings on the staged emitter (#24) Correctness: - Reject duplicate column names in validateVd: a name reused across sibling scopes previously emitted two same-named columns into a CTE, where DuckDB silently keeps the first and drops the other column's values. Now fails up front with a clear message. unionAll branches (which must share names) are still allowed. - Reject `%rowIndex` with a clear "not supported by the staged backend yet" error instead of the misleading generic "define it with --var" hint (a constant --var cannot supply a per-row ordinal). Cleanup (all behaviour-preserving — generated SQL byte-identical across every reference view in both root-key modes): - Memoise subtreeHasFork (WeakMap) so the recursive walk is not re-run on overlapping subtrees at every level. - Compile each fan-out path once via prepareFanout, replacing childElemOf + arrayize which each re-parsed the same path. - Unify UNNEST FROM-clause construction in a single unnestClause helper used by both unnestFrom and the unionAll emitter. - Drop the unused fhir_list macro (forward-looking; belongs with the later repeat stage that will use it). bun test: 283 pass / 0 fail. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add %rowIndex for forEach / forEachOrNull (staged emitter)
Implements the `%rowIndex` FHIRPath environment variable - the 0-based
position of an element within the collection its operator iterates,
scoped to the parent instance (per SPEC_hybrid §11), not a global
ROW_NUMBER over output.
- fhirpath-parser: `%rowIndex` parses to a `{segmentType:"rowIndex"}`
segment (replacing the prior "not supported" rejection).
- ddb-sql-builder: `astToSql` threads a `rowIndexSql` binding; a
`rowIndex` leaf emits it. Defaults to "0" (resource root / no
enclosing iteration).
- staged-sql-builder: every forEach / forEachOrNull / unionAll fan-out
binds a `WITH ORDINALITY` ordinal and sets the child scope's
`rowIndexSql` to `CAST(ord - 1 AS INTEGER)` (the ordinal is BIGINT;
%rowIndex is a FHIR integer). forEachOrNull over an empty collection
yields index 0 (COALESCE the missing ordinal to 1) per the official
fixture. The ordinal resets per driving row automatically.
- tests/spec-tests/row_index.json: official fixture (root, forEach,
forEachOrNull incl. empty, nested forEach, unionAll variants,
surrogate key). The `%rowIndex with repeat` sub-test is omitted -
repeat (and %rowIndex over repeat) are later migration stages.
- spec runner: drop `row_index` from the exclude set; drop the now
obsolete "%rowIndex is rejected" assertion (repeat rejection kept).
Scope: forEach / forEachOrNull / unionAll only. `repeat` stays
rejected (Stage 3) and %rowIndex-over-repeat is Stage 4.
Tests: 298 pass / 0 fail (baseline 283).
Refs #27, #1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Simplify %rowIndex ordinal plumbing (review cleanup)
- Fold bindRowIndex into unnestFrom (single source for chain/fork) and
route the iterating unionAll branch through unnestFrom too, removing
three duplicated bind sites and the cosmetic _b/_u alias split.
- Use the existing name("rn") helper instead of inline `rn${++counter}`,
matching the file's other generated identifiers.
- Drop the redundant `|| "0"` in compilePath/compileColumn; astToSql
already defaults rowIndexSql to "0".
No behaviour change; 298 pass / 0 fail.
Refs #27, #1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Address review: thread rowIndexSql into _invoke; reconcile spec §11
- ddb-sql-builder.js: thread rootVar/rowIndexSql into the _invoke macro-param
astToSql call so %rowIndex resolves correctly if _invoke ever accepts path
arguments (currently guarded by the literal-only validator, so latent).
- ddb-sql-builder.js: drop stray double blank line after the rowIndex case.
- SPEC_hybrid.md §11: rename %rowNumber -> %rowIndex, 1-based -> 0-based, and
document the forEachOrNull empty-row result as 0 (COALESCE(ord,1)-1), matching
the SoF spec and the implementation; repeat windows now subtract 1.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Stage 3: implement repeat operation in the staged emitter (#29, #1) Implement the `repeat` directive on the hybrid (staged) backend per SPEC_hybrid §5/§10, in both CHAIN and FORK modes. - repeat-lowering.js: staged-relevant helpers (assertSimplePath, jsonFold descent, typedSeed, repeatStructure via a self-contained body-paths walk, childElemOf). Dropped the struct-only reduceSource accumulator. - staged-sql-builder.js: remove the repeat rejection; add emitRepeat (WITH RECURSIVE descent over the listed paths only, excluding the seed + a typed from_json bridge so leaf columns / forEach inside the body compile via the existing typed engine), fq_cast_<N> cast-macro pooling, and FORK keying (carry the fork key + recombine via JOIN USING(rid)). A lone repeat stays a keyless CHAIN. Bridge element bound as `el_r` (not `el`) so an array-navigating body column's `list_transform(el -> ...)` lambda does not shadow the bridge struct. - query-builder.js / view-parser.js: forced-JSON repeat seeds in the read schema (applyForcedJson + repeat-aware viewPaths), and the fq_staged_with / fq_staged_macros template vars. - ddb-sql-builder.js: pathsToJsonStruct + forceJson handling (shared leafSqlType). - duck-macros.js: the fhir_list macro (single-value-vs-array bridge). - templates + test-util: WITH RECURSIVE keyword + cast-macro preamble placeholders. - Fixtures: official repeat.json + staged-repeat.test.js (from_json bridge / choice-type / lambda-collision edge cases), wired into the staged spec runner. %rowIndex over a repeat's own descent scope is Stage 4 (omitted here); %rowIndex for forEach/forEachOrNull keeps working, including inside a repeat body. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Simplify: reuse viewPaths, single-compile seed, hoist bridge-var const Apply /simplify findings: - repeat-lowering.js: drop repeatBodyPaths (a verbatim duplicate of viewPaths' parseNode walker) and call viewPaths on the stripped body; typedSeed now arrayizes from its single compilePath result instead of recompiling each path via B.arrayize. - ddb-sql-builder.js: factor the duplicated JSON object-field rendering in pathsToJsonStruct into a local objOf helper. - staged-sql-builder.js: hoist the `el_r` bridge variable to a BRIDGE_VAR const used by bridgeElem and both bridge CTE emitters, so the alias and the leaf root-var can't silently drift. No behaviour change: bun test 331 pass / 0 fail; chain + fork repeat e2e unchanged (5 / 10 rows). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Review fixes: %rowIndex in shared-field forEach + applyForcedJson guard Apply /review findings: - HIGH: a `forEach` that names the same field as a sibling `repeat` is routed through emitJsonEach (from_json bridge). It bound %rowIndex on the discarded f.childElem instead of the bridge element used for the body, and passed ordName=null so no ORDINALITY was even emitted — %rowIndex was always 0. Now emitJsonEach always mints an ordinal, projects the 0-based %rowIndex as a named bridge column, and binds the body element's rowIndexSql to it. Verified: forEach beside repeat on the same field now reports 0,1,2. - MEDIUM: applyForcedJson force-JSON'd a shallower ancestor on a partial path match (truncating a subtree a typed sibling may need). Guard with a `matched` flag so it only applies when the full path resolves. bun test 331 pass / 0 fail; chain (5 rows) + fork (10 rows) repeat e2e unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Fix fork inside a repeat body; reject %rowIndex over a repeat scope A repeat body that forks (>=2 fan-out children) recombined its branches via JOIN USING(keyCols), where keyCols was constant per resource (rid for a CHAIN repeat, the parent fork key for a FORK branch). With no per-node identity the branches cross-produced across ALL descended nodes of the resource, emitting wrong rows (e.g. item "a" paired with item "b"'s answers). The official repeat.json suite missed this: its repeat+internal-fork cases all expect a single row, which cannot expose the cross-product. Mint a per-node descent key when the body forks: carry the integer descent `path` (per-level WITH ORDINALITY ordinals from seed to node) through both recursive legs, then assign a deterministic pre-order index `(row_number() OVER (PARTITION BY keyCols ORDER BY path) - 1)` at the bridge and thread it as an extra recombination key. ORDER BY path is a total order (paths are unique per node), so the index survives every reference to the bridge without MATERIALIZED. This is the same descent key Stage 4 binds for %rowIndex over a repeat's own scope; here it serves only as a fork key. Also reject %rowIndex resolving to a repeat's own descent scope (Stage 4, unsupported) instead of silently binding it to the constant 0: the repeat bridge element binds rowIndexSql=null and the leaf engine throws on a rowIndex segment with no valid binding. %rowIndex for a forEach/forEachOrNull inside a repeat body is unaffected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Stage 4: %rowIndex over repeat (pre-order descent index)
Implements `%rowIndex` indexing a `repeat`'s OWN descent scope: a depth-first
pre-order index over the recursive descent, completing the hybrid-only migration.
The index is assigned at the repeat's own scope (one row per visited node) by
`row_number() OVER (PARTITION BY {enclosing-scope key} ORDER BY path)` over the
int-list descent path the rCTE already carries, BEFORE any sibling recombination,
then materialised as a scalar that survives fork joins:
- root repeat partitions by `rid` (forced even with no fork, via
`subtreeHasRepeatUsingRowIndex`);
- a repeat under a forEach carries the enclosing iteration ordinal as a key so
its window restarts per enclosing element;
- a repeat-in-repeat partitions the inner window by the outer repeat's own index
(chaining via `(rid, outer_rn)`), never PARTITION BY a list.
The pre-order window unifies with the existing fork-recombination node key (same
value, same `nord{N}` column): computed when any of fork / `%rowIndex` /
nested-`%rowIndex`-repeat needs it.
Adds tests/spec-tests/row_index_repeat.json (staged-only extended suite, skip:true)
+ tests/row-index-repeat.test.js (dedicated runner, natural + uuid key modes).
Updates the now-obsolete Stage 3 "rejected" guard into a positive assertion.
Tests: 334 -> 344 pass, 0 fail (+10 from the new repeat-rowIndex suite).
Refs #32, #1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Simplify: share %rowIndex scope traversal + cache subtree detection
- Collapse the parallel raw-string / AST `%rowIndex` scope detectors onto one
shared `scopeMentionsRowIndex` traversal parameterised by a per-column
predicate, so the "a nested fan-out opens its own scope" rule lives in one
place instead of being mirrored in four functions.
- Memoise `subtreeHasRepeatUsingRowIndex` with a WeakMap, matching the existing
`forkCache` pattern (emitScope queries it per fan-out over overlapping subtrees).
No behaviour change; 344 pass / 0 fail.
Refs #32, #1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lisions
A ViewDefinition column name is a validated SQL identifier starting with a
letter (`^[A-Za-z][A-Za-z0-9_]*$`). The staged emitter carried internal
bookkeeping columns (resource key `rid`, fork ordinals, a repeat's descent
`node`/`ord`/`path` and pre-order index `nord`) in the same CTEs as the user's
columns, so a view that legally named a column `rid`, `node`, `path`, etc.
collided: a hard `Binder Error` (duplicate column) inside a repeat's recursive
CTE, or silent value loss in `uuid` root-key mode.
Prefix every internal identifier with `_` — the smallest prefix a valid column
name can never start with — making the collision impossible rather than merely
unlikely. Covers the `name()`-generated relation/column names plus the fixed
names `_rid`, `_ord{N}`, `_nord{N}`, `_node`, `_ord`, `_path`, and BRIDGE_VAR.
Also move the unknown-type diagnostic in pathsToSchema from console.log
(stdout, which corrupts the compiled SQL the CLI prints there) to console.warn
(stderr), and document the enforcement mechanism in SPEC_hybrid §projection.
New tests/staged-reserved-names.test.js pins that a view may name a column
`rid`, `node`, `path`, or `ord` and still compile and return correct rows.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d test runners tests/spec-tests/ now holds only the official SQL-on-FHIR conformance fixtures (mirror of FHIR/sql-on-fhir.js /tests). Flatquack-specific fixtures move to a new tests/custom-tests/ dir in the same JSON format: staged.json (emitter fork handling) and row_index_repeat.json (extended %rowIndex-over-repeat). repeat.json and row_index.json stay in spec-tests/ — they are official spec features. De-duplicate the staged test harness: - Extract a shared buildStaged helper and a runFixtureSuite runner into test-util.js (replacing the copy-pasted helper in 5 runner files). The runner gains a custom-only per-test `resources` override so one fixture can hold cases with differing input data. - spec.staged.test.js + new custom.staged.test.js are now thin wrappers. - Convert inline-view JS tests to JSON fixtures: reserved_names.json, repeat_extra.json, and a no_id_fork case in staged.json. The only non-fixture bits survive as staged-sql-shape.test.js (asserts emitted SQL) and view-parser.test.js (validateVd units, moved out of the spec runner). Delete the superseded row-index-repeat / staged-repeat / staged-reserved-names / staged-rootkey runners. Route generated *.temp.json resource files to a git-ignored tests/.tmp/ scratch dir instead of polluting the fixture directories. Add tests/README.md documenting the split. Fix scripts/build-test-output.js (it imported a non-existent testQueryTemplate and was broken): use the shared buildStaged, a single shared db, and the scratch dir; regenerates an identical conformance report. Coverage note: drops the 4 redundant natural-EXCEPT-ALL-uuid cross-mode checks (both modes are still independently checked against ground truth). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-vendor tests/spec-tests/ from FHIR/sql-on-fhir.js tests/ at commit
70c05667b5fc8f82bfce16a3a2a0e5b142e5c6d2 (branch main, fetched 2026-06-16).
This pulls newer official cases, notably repeat.json (+4) and row_index.json
(+1, "%rowIndex with repeat"). The vendored revision is recorded in
tests/README.md. skip/ fixtures (constant, constant_types, fn_boundary) are
refreshed but stay skipped (unimplemented features).
Test harness: feed each case only the resources matching its view's declared
resource type. Upstream fhirpath.json now mixes an Organization (scalar name)
with Patients (array name) in one resource list, which broke flatquack's
type-coerced read_json_auto; filtering to the view's resourceType mirrors the
per-resource-type engine model and leaves results unchanged (recovers 16 tests).
Engine fix: join() over an empty collection now yields NULL, not "" — drop the
forced .ifnull2('') in ddb-sql-builder. Matches the new official fn_join cases.
Known failure (to be addressed separately): repeat.json "unionAll with repeat
and non-repeat branches". A unionAll branch that navigates the repeat-forced
JSON field (item.linkId.first()) evaluates to JSON, while the repeat branch
yields VARCHAR via its from_json bridge; UNION ALL then fails casting a plain
string to JSON. Root cause is forceJson typing not coercing to the column's
declared type; fix deferred pending design discussion.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ry (#36) Re-type a repeat-forced JSON field back to a typed STRUCT inline at every site that navigates across the boundary: columns (root / chain-forEach / unionAll branches), where paths and where() lambdas, and components/comparison operands. Multi-segment (dotted) seeds re-type at the divergence depth. forcedJsonPaths is the single source of truth; from_json cast macros are pooled. No-op for non-repeat views. Adds a systematic on-demand-typing test matrix + SQL-shape assertions and a design spec (specs/SPEC_ondemand_typing.md). Full suite: 445 tests, 0 fail.
…ocs/ Make SPEC_hybrid.md a self-contained specification of the staged-CTE lowering and drop all references to non-existent resources (Spec A/B/C, benchmark/work SQL, measured timings). Replace the inline JSON leaf machinery with the assumption of an available FHIRPath->SQL translation layer, and focus on the view-to-relational mapping (chain vs fork, lazy fork keys, repeat descent, %rowIndex, projection contract). Rename specs/ -> docs/ and SPEC_hybrid.md -> SPEC_view_lowering.md, and update all references (README, companion spec, source comments, tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
master-fix vs cte-fix — full comparisonBenchmark of two flatquack lines compiling the same SQL-on-FHIR ViewDefinitions to DuckDB SQL:
Same DuckDB engine in each cell, so every difference is the generated SQL. Speedup = master ÷ cte (>1 ⇒ cte faster). DatasetsSynthea-generated FHIR R4 bulk-data (per-resource NDJSON, US Core profiles). Three patient cohorts — s1000 (~1k patients), s10000 (~10k), s100000 (~119k, patients-only export: MeasurementPer cell: flatquack compiles the ViewDefinition → DuckDB SQL, run through the DuckDB CLI in materialize mode ( DuckDB 1.4.1 (mean ms)
DuckDB 1.5.3 (mean ms)
s100000 — both engines (~119k patients)
Takeaways
Net: cte-fix is equal-or-faster on every case, at every scale, on both engines — and engine-stable across the 1.4→1.5 jump that severely regresses master. |
johngrimes
left a comment
There was a problem hiding this comment.
Thanks @piotrszul, it's looking good!
I managed to run it against my AU PIP QI project, but ran into the issue with the breaking change to the template variables. Is this breaking change necessary?
I've also added a few other comments.
Two claims in the spec did not match the staged emitter; soften the wording to describe actual behavior: - preserve_insertion_order = false: the lowering never emits it. Document it as a DuckDB 1.5+ session setting that recovers parallel UNNEST materialisation (a significant regression without it on 1.5+), and add a matching performance note to the README. - shallow shared-array siblings: the emitter always FORKs at >=2 fan-outs; there is no CHAIN exception. Document always-FORK and record the shallow-CHAIN optimization as benchmark-gated in #39. The third claim (WITH ORDINALITY 'zero extra cost') is addressed by a follow-up code change in a separate commit. Addresses review comment on PR #34. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Previously every forEach unnest emitted WITH ORDINALITY because an
ordinal was always minted to make %rowIndex available, even when no
column read it and no fork needed it as a key (the 'zero extra cost'
the spec claimed was actually an always-on cost).
Gate the ordinal on (needsOrd || usesRi): emit it only when a fork or a
nested %rowIndex repeat below needs it as a key, or when the scope itself
binds %rowIndex. When neither holds the unnest is a plain UNNEST(...) and
%rowIndex is left unbound (a stray reference is rejected, matching a
repeat's no-ordinal default). Reuses the existing scopeUsesRowIndex
predicate and the already-conditional unnestClause; applies uniformly to
the chain, fork, json-each, and unionAll-branch paths.
The unionAll-branch path computes needsOrd/usesRi in prepUnion (the same
predicates emitScope uses for chain/fork fan-outs); without them the
emitJsonEach/unnestFrom gate read undefined and minted no ordinal,
leaving a %rowIndex-reading forEach branch over a repeat-forced field
(jsonMode) unbound. The branch ordinal stays private (branches recombine
on the parent keyCols, never their own ordinal).
Verified: fork keys (_ord{N}) and %rowIndex views still emit ORDINALITY;
nested forks still join on (_rid, _ord1, ...). Full suite: 447 pass
(adds row_index_union_jsonmode.json covering the unionAll branch).
Reconciles the third SPEC_view_lowering claim (lines 139, 278) by making
the emitter match the doc rather than softening it.
Addresses review comment on PR #34.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntract Replace the fq_staged_*/fq_where_filter template variables with one fq_sql_* contract built around a named input CTE and a column-clean output CTE. The template binds the data source as `_fq_input` (the resource's elements as its columns) and projects the emitter's `_fq_output`; the emitter's flattening pipeline is sealed between them. This drops the source-splicing hole and the with/src/materialized/tail fragments in favour of: fq_sql_with, fq_sql_input, fq_sql_input_schema, fq_sql_where, fq_sql_pipeline, fq_sql_output, fq_sql_output_columns, fq_sql_macros, fq_sql_view_macros Emitter (staged-sql-builder, query-builder): emit `_fq_src AS (… FROM _fq_input)` reading the template-provided input, seal the MATERIALIZED hint onto `_fq_src`, and add a column-clean `_fq_output` CTE; expose pipeline / input / output names and the output column list (no edge commas, no terminal SELECT). The root pipeline CTE is `_fq_`-prefixed (via the SRC_CTE constant) so it can never collide with a user column, matching the seam CTEs and the SPEC_sql_template_contract §4.2 invariant. Templates: rewrite csv/ndjson/parquet/explore/dbt_model/dbt_prehook. explore materialises its limited input so a forked view stays deterministic. dbt routes both macro blocks to the prehook so the model is a single, materializable SELECT (fixing the prior inline-CREATE-MACRO model). Docs: add docs/SPEC_sql_template_contract.md (the contract), rewrite the README variable table + migration note, and cross-link from SPEC_view_lowering. Tests: migrate the shared harness template to the new contract; add a determinism guard that renders the shipped templates. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Marks the fq_sql_ template-variable contract as a new minor release: the fq_staged_*/fq_where_filter/fq_sql_flattening_* variables from v0.2.x are replaced by the input/output CTE contract, a breaking change for custom templates. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the legacy struct SQL emitter with the hybrid (staged-CTE) emitter, and adds the
repeatdirective and%rowIndexalong the way. The hybrid emitter is now the sole emitter (code, templates, docs, tests). Closes #1.Core migration
Landed as four reviewed sub-issue PRs into this branch, in sequence:
%rowIndexforforEach/forEachOrNull(0-based,WITH ORDINALITYordinal)repeatoperation — recursive-CTE descent + typedfrom_jsonbridge (CHAIN + FORK)%rowIndexoverrepeat— pre-order descent index, nested-repeat chainingPost-stage hardening
_rid,_ord{N},_nord{N},_node,_path, bridge var) with_, the one prefix a valid ViewDefinition column name can never use. A view may now legally name a columnrid,node,path, orord.tests/spec-tests/) from flatquack-specific ones (tests/custom-tests/); de-duplicate the staged harness behind a sharedbuildStaged+runFixtureSuite(test-util.js); route scratch output to a git-ignoredtests/.tmp/.70c0566(2026-06-16) — pulls newer official cases (repeat.json+4,row_index.json+1). Engine fix:join()over an empty collection yieldsNULL, not"".wherepaths/lambdas, components, comparison operands, unionAll branches). This resolves the previously known-failing official case "unionAll with repeat and non-repeat branches", which is now unskipped and passing. Design recorded indocs/SPEC_ondemand_typing.md.specs/→docs/(SPEC_view_lowering.md,SPEC_ondemand_typing.md).Verification
repeat.jsonandrow_index.jsonsuites (both root-key modes), with no skipped repeat/rowIndex cases.%rowIndex, chainrepeat, forkrepeat, and repeat-in-repeat indexing.tablesToSql/setRepeatContext/--backend/_forEachdirectives) remain insrc/tests/templates.--root-key <natural|uuid>(defaultnatural) replaces--backend.Follow-up (not in scope)
id— cross-resource-type key collision risk #26 —getResourceKey()/getReferenceKey()resolve to bareid(cross-resource-type key collision risk)._fq_srcCTE in natural-key mode as a performance optimization #40 — investigate materializing the root_fq_srcCTE in natural-key mode as a performance optimization.🤖 Generated with Claude Code