Implement the $sqlquery-export operation and align SQL on FHIR operations to the spec#2634
Implement the $sqlquery-export operation and align SQL on FHIR operations to the spec#2634johngrimes wants to merge 15 commits into
Conversation
Add viewReference support, Reference-typed patient/group with spec cardinality, Bundle unwrapping in the resource parameter, and 422 for semantically invalid ViewDefinitions on the run operation. Across the run, export, and sqlquery-run operations, reject an explicitly unsupported _format and the unsupported source parameter rather than silently ignoring them. Add a creation-time start timestamp to the async job for later export-manifest timing.
Support view.viewReference resolution to a stored ViewDefinition with per-part exclusivity and presence checks. Rewrite the completion manifest to the SQL on FHIR shape (exportId, status, echoed clientTrackingId and _format, export timing fields, and one output per view with name and location parts), dropping the Bulk Data-style fields. Return a Parameters kick-off acknowledgement for SQL on FHIR async operations while leaving the Bulk Data kick-off body unchanged.
…ery 422 Declare the SQL on FHIR spec canonical OperationDefinition URLs for the viewdefinition-run, viewdefinition-export, and sqlquery-run operations in the CapabilityStatement, and stop serving the Pathling-authored OperationDefinitions for them. Add a regression test pinning the existing 422 response for an unmappable FHIR column type under _format=fhir on sqlquery-run, and integration coverage for the source and _format rejections across all levels and transports.
Update the admin UI manifest parser to read the spec-shaped output location parts (one or more per output) instead of a single url part, emitting one download entry per file so a view that produced several files lists every file. Add an end-to-end check that an invalid custom ViewDefinition surfaces the 422 OperationOutcome message.
Update the run, export, and sqlquery-run operation docs for the new viewReference support, Reference-typed patient/group, Bundle unwrapping, the spec-shaped export manifest and kick-off acknowledgement, the source and unsupported-format rejections, the 400/404/422 status semantics, and the spec conformance canonicals. Credit the additional author on the significantly changed sqlquery providers.
Align the admin UI export end-to-end mocks with the new completion manifest: output download files are carried in location parts, and the non-spec transactionTime and requiresAccessToken fields are removed, so the card-listing tests exercise the location-reading parser.
Strip media-type parameters (e.g. text/csv;charset=utf-8) before matching an explicit _format value across the run, export, and sqlquery-run strict parsers, so a supported media type with parameters is treated as that format rather than rejected with 400, matching the spec's content-negotiation edge case.
The strict SQL validator's expression allow-list omitted the named and positional parameter placeholder expressions, so any parameterised query (using ":name") failed static validation. This blocked runtime parameter binding for $sqlquery-run even though the binding machinery was already present. Add NamedParameter and PosParameter to the allow-list. They are leaf, unevaluable value placeholders bound at execution time, so they carry no code-execution risk.
Implement the SQL on FHIR $sqlquery-export operation, the asynchronous counterpart to $sqlquery-run, at the system, type, and instance levels. Clients can run one or more SQL queries against materialised ViewDefinition tables in the background and download each result as files, following the FHIR Asynchronous Request Pattern. Each query produces one output. Table sources can be supplied at request time, inline or by reference, in addition to being read from server storage. Supported formats are NDJSON (default), CSV, and Parquet, with patient, group, and _since filtering. A failed query fails the whole export. To avoid duplication, extract the shared asynchronous-export machinery into a new operations/export package (a generic completion-manifest builder, a file writer, and a filtered data-source builder) reused by both export operations, and refactor $sqlquery-run onto a shared SqlQueryPipeline that the export also uses. The Admin UI gains an export affordance on the SQL query result card, backed by a reusable export-job card shared with the ViewDefinition export. The operation is declared in the CapabilityStatement against the spec canonical and gated by a new sqlQueryExportEnabled configuration flag. Resolves #2633.
Replace the boolean redirectOnComplete marker on the server's asynchronous machinery with an AsyncPattern enum (STANDARD_ASYNC_PATTERN, BULK_DATA), so the @AsyncSupported annotation reads as a deliberate choice between two named patterns rather than the presence or absence of one behaviour. Correct the Javadoc that misattributed the redirect-based completion to a "SQL on FHIR unify-async" specification, citing the HL7 Asynchronous Interaction Request Pattern instead. Behaviour is unchanged; the existing async tests pass with only their identifiers updated.
isAnyExportEnabled now also covers the SQL on FHIR asynchronous export operations, so the all-exports-disabled case must also disable the viewdefinition-export and sqlquery-export flags, both of which default to enabled.
The SQL on FHIR export integration tests added to the server module push the full build-and-test suite past the previous 30-minute job limit, causing the run to be cancelled before completion. Raise the timeout on the test, pre-release, and release server workflows so the suite has room to finish.
The disabled-operation assertion called doesNotContain on its own, which passes vacuously if the operation list is empty. Assert the still-enabled sqlquery-run operation is present first, so the test proves the list is populated and only sqlquery-export was dropped.
piotrszul
left a comment
There was a problem hiding this comment.
Solid PR overall — the async refactor, the spec-aligned manifest/conformance work, the shared operations/export abstractions, and the bulk of the test suite are all strong. Two correctness issues block merge, plus a confirmed flaky-CI failure that should be fixed before this lands. Findings grouped by severity. All are static reads (plus one CI log); I haven't run the full build.
🔴 Critical (must fix)
1. $sqlquery-export inherits the synchronous run guards — a silent 1M-row truncation and a 60s abort.
Both operations funnel through SqlQueryPipeline.execute → SqlQueryExecutor.execute. That executor applies a row cap and a wall-clock timeout meant to keep the synchronous $sqlquery-run snappy:
- Row cap:
SqlQueryExecutor.java:124result = result.limit(effectiveLimit(request.getLimit(), requestId)).effectiveLimit(:148-162) returns the server cap whencallerLimit == null— and the export parser always passesnull(SqlQueryExportRequestParser.java:195). DefaultmaxRows = 1_000_000(SqlQueryConfiguration.java:43,application.yml:96). So every export is silently truncated at 1M rows, with no_limitto raise it and no manifest flag indicating truncation. - Timeout: watchdog at
SqlQueryExecutor.java:112,127-133, defaulttimeoutSeconds = 60(SqlQueryConfiguration.java:51). Since the export is all-or-nothing, a single query past 60s fails the whole export.
This defeats the purpose of an async export (large/slow result sets). The export path should bypass the synchronous cap/timeout (or have its own, much larger/unlimited config). Please add a test that exports a result set exceeding maxRows and asserts all rows are written.
2. $viewdefinition-export patient scoping is broken under the spec-aligned Reference contract.
The realignment moved sibling operations to List<Reference> with Patient/ prefix stripping, but $viewdefinition-export was left half-migrated:
ViewDefinitionExportProvider.java:172-173still declares@OperationParam(name="patient") final List<String> patientIdsand inserts them verbatim into the compartment filter vianew HashSet<>(patientIds)(:426) with no prefix stripping.- A spec-aligned
patient=Patient/123therefore lands as the literal"Patient/123"and never matches stored ids (123) → silently empty/incorrect export. (groupis unaffected —IdType.getIdPart()stripsGroup/5→5.) - No test catches it:
ViewDefinitionExportExecutorTestfeeds a plain id set straight to the executor, and no IT sends aPatient/-prefixedpatient. Docs corroborate the gap —view-run.mdshowsReference,view-export.mdstill showsid.
Fix: migrate view-export patient/group to List<Reference>, reusing the shared ReferenceParameters/toPatientIds/toGroupIds path (and update view-export.md). Add an IT that sends patient=Patient/<id> and asserts the compartment actually filters.
🟡 Important (should fix)
3. Export rejects the spec-recommended json format while supporting the optional parquet.
Issue #2633 lists _format as ndjson/csv/json/parquet and notes json/ndjson/csv are recommended, parquet is MAY. The export uses ViewExportFormat (ViewExportFormat.java:32-41 — NDJSON/CSV/PARQUET only); _format=json returns 400, whereas $sqlquery-run supports json (SqlQueryOutputFormat.java:45). So a client can get json from the sync op but not its async counterpart, and the format set is inverted relative to the spec's recommendation. The gap is documented in sql-export.md, so it's at least deliberate — but please either implement json for export or declare the narrower set in the OperationDefinition/CapabilityStatement so conformance tooling sees the constraint.
4. Job-delete race causes a confirmed flaky CI failure (test stability); benign in production.
On DELETE, JobProvider.handleJobDeleteRequest cancels the future (:181) and immediately fs.delete(jobDir, true) (:213), but Spark cancellation is asynchronous (signalled via SparkJobListener at stage boundaries), so an in-flight commit can still write files into a directory being deleted. This produces an intermittent ExportOperationIT teardown failure: cleanup (ExportOperationIT.java:134) → FileUtils.cleanDirectory → NoSuchFileException on a partitioned NDJSON part-file as a cancelled write settles → BUILD FAILURE (reproduced; passes on rerun). The production effect is minor (orphaned partial output / storage leak; narrow-window 500; no data-integrity risk), so this is Important only for test stability. Minimum: make teardown tolerant of concurrent modification (catch/retry NoSuchFileException, or await() job termination in cancel/delete tests). Proper at-source fix (block on job termination before fs.delete) can be a follow-up issue.
🟢 Minor (nice to have)
5. ExportManifest.buildResultUrl parses the job id/file out of a URL string under a mismatched catch.
ExportManifest.java:163 does localUrl.split("/jobs/")[1].split("/") then indexes [0]/[1], but the surrounding try only catches URISyntaxException (:172) — a localUrl not shaped …/jobs/<uuid>/<file> throws an uncaught ArrayIndexOutOfBoundsException → generic 500. Not a regression (carried over), but it now lives in shared code exercised by both export ops. Cheapest fix: widen the catch / parse defensively. Better: carry (jobId, fileName) structurally from ExportFileWriter (which already knows both) instead of reconstructing them from the URL.
Replaced a Javadoc-style {@code} tag, which renders literally in JSDoc,
with a backtick code span.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|



This adds the SQL on FHIR
$sqlquery-exportoperation, the asynchronous counterpart to$sqlquery-run, and brings the existing SQL on FHIR operations into line with the specification along the way. Clients can run one or more SQL queries against materialised ViewDefinition tables in the background and download each result set as files, following the FHIR Asynchronous Request Pattern.The
$sqlquery-exportoperationThe operation is available at the system, type, and instance levels. Each
queryproduces one output. Table sources can be supplied at request time, inline or by reference, in addition to being read from server storage. Supported formats are NDJSON (the default), CSV, and Parquet, withpatient,group, and_sincefiltering. A failed query fails the whole export. It is gated by a newsqlQueryExportEnabledconfiguration flag (defaulttrue) and declared in the CapabilityStatement against the spec canonicalhttp://sql-on-fhir.org/OperationDefinition/$sqlquery-export.Specification alignment
While implementing the new operation, the existing
$viewdefinition-run,$viewdefinition-export, and$sqlquery-runoperations were aligned to the spec:viewReferenceresolution to a stored ViewDefinition, with per-part exclusivity and presence checks.patientandgroupwith the spec's cardinality, and Bundle unwrapping in the resource parameter.exportId,status, echoedclientTrackingIdand_format, export timing fields, and one output per view carryingnameandlocationparts), dropping the Bulk Data-style fields. The Bulk Data export kick-off body is unchanged, with a regression test confirming this._formator the unsupportedsourceparameter rather than silently ignoring them._formatmedia type carrying parameters (e.g.text/csv;charset=utf-8) is now accepted rather than rejected.$sqlquery-runqueries pass static validation.Breaking changes
The alignment work changes the wire contract of the existing
$viewdefinition-run,$viewdefinition-export, and$sqlquery-runoperations. Each change moves Pathling onto the published SQL on FHIR contract, but clients written against the previous behaviour will need updating.$viewdefinition-exportcompletion manifest no longer carries the Bulk Data-styletransactionTime,request,requiresAccessToken, andoutput.urlparts. It now follows the SQL on FHIR shape:exportId,status, the echoedclientTrackingIdand_format, the export timing fields, and oneoutputper view withnameand one or morelocationparts. Clients parsing the old manifest will not find the download URLs in the same place.patientandgroupparameter types changed. On the run and export operations,patientmoves from bare string IDs toReference, andgroupmoves fromidtoReference. Requests sending the old types will be rejected.Parametersacknowledgement (status=accepted,exportId) with the 202, in place of the previous OperationOutcome. The FHIR Bulk Data$exportkick-off body is deliberately unchanged._formatis now rejected. A non-blank_formatthat matches no supported code or media type now returns 400 rather than silently falling back to the default. A client relying on a typo or an unsupported value quietly defaulting will now see an error.sourceparameter is now rejected. Previously accepted and silently ignored, supplyingsourcenow returns an error. Low impact in practice, since it never had any effect.http://sql-on-fhir.org/OperationDefinition/. Clients that fetched the Pathling OperationDefinitions or matched on the old canonical URLs will need to point at the spec canonicals.The internal refactors (the
AsyncPatternenum replacingredirectOnComplete, the sharedSqlQueryPipeline, and the extractedoperations/exportpackage) are confined to the server module and do not affect the public library API, so they are not breaking. The newsqlQueryExportEnabledflag is additive and defaults totrue.Shared machinery
To avoid duplication, the shared asynchronous-export machinery is extracted into a new
operations/exportpackage - a generic completion-manifest builder, a file writer, and a filtered data-source builder - reused by both export operations.$sqlquery-runand$sqlquery-exportnow share a commonSqlQueryPipeline. The booleanredirectOnCompletemarker on the async machinery is replaced with anAsyncPatternenum (STANDARD_ASYNC_PATTERN,BULK_DATA) so@AsyncSupportedreads as a deliberate choice between two named patterns, and the Javadoc that misattributed redirect-based completion to a "SQL on FHIR unify-async" spec now cites the HL7 Asynchronous Interaction Request Pattern.Admin UI
The SQL query result card gains an export affordance, backed by a reusable export-job card shared with the ViewDefinition export flow. The manifest parser reads the spec-shaped output
locationparts (one or more per output), emitting one download entry per file.Testing
The branch adds extensive unit and integration coverage for the new operation and the alignment changes, including provider integration tests across all levels and transports, format and disabled-flag integration tests, request-parser and executor unit tests, conformance assertions, and Playwright end-to-end coverage in the admin UI. Documentation for the run, export, and
$sqlquery-runoperations is updated, and a newsql-export.mdpage is added.Resolves #2633.