Skip to content

Implement the $sqlquery-export operation and align SQL on FHIR operations to the spec#2634

Open
johngrimes wants to merge 15 commits into
mainfrom
issue/2633
Open

Implement the $sqlquery-export operation and align SQL on FHIR operations to the spec#2634
johngrimes wants to merge 15 commits into
mainfrom
issue/2633

Conversation

@johngrimes

Copy link
Copy Markdown
Member

This adds the SQL on FHIR $sqlquery-export operation, the asynchronous counterpart to $sqlquery-run, and brings the existing SQL on FHIR operations into line with the specification along the way. Clients can run one or more SQL queries against materialised ViewDefinition tables in the background and download each result set as files, following the FHIR Asynchronous Request Pattern.

The $sqlquery-export operation

The operation is available at the system, type, and instance levels. Each query produces one output. Table sources can be supplied at request time, inline or by reference, in addition to being read from server storage. Supported formats are NDJSON (the default), CSV, and Parquet, with patient, group, and _since filtering. A failed query fails the whole export. It is gated by a new sqlQueryExportEnabled configuration flag (default true) and declared in the CapabilityStatement against the spec canonical http://sql-on-fhir.org/OperationDefinition/$sqlquery-export.

Specification alignment

While implementing the new operation, the existing $viewdefinition-run, $viewdefinition-export, and $sqlquery-run operations were aligned to the spec:

  • viewReference resolution to a stored ViewDefinition, with per-part exclusivity and presence checks.
  • Reference-typed patient and group with the spec's cardinality, and Bundle unwrapping in the resource parameter.
  • The export completion manifest rewritten to the spec shape (exportId, status, echoed clientTrackingId and _format, export timing fields, and one output per view carrying name and location parts), dropping the Bulk Data-style fields. The Bulk Data export kick-off body is unchanged, with a regression test confirming this.
  • A Parameters kick-off acknowledgement for the SQL on FHIR async operations.
  • 422 for semantically invalid ViewDefinitions, and rejection of an explicitly unsupported _format or the unsupported source parameter rather than silently ignoring them.
  • A supported _format media type carrying parameters (e.g. text/csv;charset=utf-8) is now accepted rather than rejected.
  • Named and positional parameter placeholders are allowed through the strict SQL validator, so parameterised $sqlquery-run queries pass static validation.
  • The CapabilityStatement now points at the spec canonical OperationDefinition URLs for these operations, and the Pathling-authored OperationDefinitions for them are no longer served.

Breaking changes

The alignment work changes the wire contract of the existing $viewdefinition-run, $viewdefinition-export, and $sqlquery-run operations. Each change moves Pathling onto the published SQL on FHIR contract, but clients written against the previous behaviour will need updating.

  • Export completion manifest reshaped. The $viewdefinition-export completion manifest no longer carries the Bulk Data-style transactionTime, request, requiresAccessToken, and output.url parts. It now follows the SQL on FHIR shape: exportId, status, the echoed clientTrackingId and _format, the export timing fields, and one output per view with name and one or more location parts. Clients parsing the old manifest will not find the download URLs in the same place.
  • patient and group parameter types changed. On the run and export operations, patient moves from bare string IDs to Reference, and group moves from id to Reference. Requests sending the old types will be rejected.
  • Async kick-off acknowledgement body changed. The SQL on FHIR async operations now return a Parameters acknowledgement (status=accepted, exportId) with the 202, in place of the previous OperationOutcome. The FHIR Bulk Data $export kick-off body is deliberately unchanged.
  • Unsupported _format is now rejected. A non-blank _format that matches no supported code or media type now returns 400 rather than silently falling back to the default. A client relying on a typo or an unsupported value quietly defaulting will now see an error.
  • The source parameter is now rejected. Previously accepted and silently ignored, supplying source now returns an error. Low impact in practice, since it never had any effect.
  • Pathling-authored OperationDefinitions removed. The server no longer serves its own OperationDefinition resources for these operations, and the CapabilityStatement now references the spec canonical URLs under http://sql-on-fhir.org/OperationDefinition/. Clients that fetched the Pathling OperationDefinitions or matched on the old canonical URLs will need to point at the spec canonicals.

The internal refactors (the AsyncPattern enum replacing redirectOnComplete, the shared SqlQueryPipeline, and the extracted operations/export package) are confined to the server module and do not affect the public library API, so they are not breaking. The new sqlQueryExportEnabled flag is additive and defaults to true.

Shared machinery

To avoid duplication, the shared asynchronous-export machinery is extracted into a new operations/export package - a generic completion-manifest builder, a file writer, and a filtered data-source builder - reused by both export operations. $sqlquery-run and $sqlquery-export now share a common SqlQueryPipeline. The boolean redirectOnComplete marker on the async machinery is replaced with an AsyncPattern enum (STANDARD_ASYNC_PATTERN, BULK_DATA) so @AsyncSupported reads as a deliberate choice between two named patterns, and the Javadoc that misattributed redirect-based completion to a "SQL on FHIR unify-async" spec now cites the HL7 Asynchronous Interaction Request Pattern.

Admin UI

The SQL query result card gains an export affordance, backed by a reusable export-job card shared with the ViewDefinition export flow. The manifest parser reads the spec-shaped output location parts (one or more per output), emitting one download entry per file.

Testing

The branch adds extensive unit and integration coverage for the new operation and the alignment changes, including provider integration tests across all levels and transports, format and disabled-flag integration tests, request-parser and executor unit tests, conformance assertions, and Playwright end-to-end coverage in the admin UI. Documentation for the run, export, and $sqlquery-run operations is updated, and a new sql-export.md page is added.

Resolves #2633.

Add viewReference support, Reference-typed patient/group with spec
cardinality, Bundle unwrapping in the resource parameter, and 422 for
semantically invalid ViewDefinitions on the run operation. Across the
run, export, and sqlquery-run operations, reject an explicitly
unsupported _format and the unsupported source parameter rather than
silently ignoring them. Add a creation-time start timestamp to the
async job for later export-manifest timing.
Support view.viewReference resolution to a stored ViewDefinition with
per-part exclusivity and presence checks. Rewrite the completion
manifest to the SQL on FHIR shape (exportId, status, echoed
clientTrackingId and _format, export timing fields, and one output per
view with name and location parts), dropping the Bulk Data-style
fields. Return a Parameters kick-off acknowledgement for SQL on FHIR
async operations while leaving the Bulk Data kick-off body unchanged.
…ery 422

Declare the SQL on FHIR spec canonical OperationDefinition URLs for the
viewdefinition-run, viewdefinition-export, and sqlquery-run operations in
the CapabilityStatement, and stop serving the Pathling-authored
OperationDefinitions for them. Add a regression test pinning the existing
422 response for an unmappable FHIR column type under _format=fhir on
sqlquery-run, and integration coverage for the source and _format
rejections across all levels and transports.
Update the admin UI manifest parser to read the spec-shaped output
location parts (one or more per output) instead of a single url part,
emitting one download entry per file so a view that produced several
files lists every file. Add an end-to-end check that an invalid custom
ViewDefinition surfaces the 422 OperationOutcome message.
Update the run, export, and sqlquery-run operation docs for the new
viewReference support, Reference-typed patient/group, Bundle unwrapping,
the spec-shaped export manifest and kick-off acknowledgement, the source
and unsupported-format rejections, the 400/404/422 status semantics, and
the spec conformance canonicals. Credit the additional author on the
significantly changed sqlquery providers.
Align the admin UI export end-to-end mocks with the new completion
manifest: output download files are carried in location parts, and the
non-spec transactionTime and requiresAccessToken fields are removed, so
the card-listing tests exercise the location-reading parser.
Strip media-type parameters (e.g. text/csv;charset=utf-8) before
matching an explicit _format value across the run, export, and
sqlquery-run strict parsers, so a supported media type with parameters
is treated as that format rather than rejected with 400, matching the
spec's content-negotiation edge case.
The strict SQL validator's expression allow-list omitted the named and
positional parameter placeholder expressions, so any parameterised query
(using ":name") failed static validation. This blocked runtime parameter
binding for $sqlquery-run even though the binding machinery was already
present.

Add NamedParameter and PosParameter to the allow-list. They are leaf,
unevaluable value placeholders bound at execution time, so they carry no
code-execution risk.
Implement the SQL on FHIR $sqlquery-export operation, the asynchronous
counterpart to $sqlquery-run, at the system, type, and instance levels.
Clients can run one or more SQL queries against materialised ViewDefinition
tables in the background and download each result as files, following the
FHIR Asynchronous Request Pattern.

Each query produces one output. Table sources can be supplied at request
time, inline or by reference, in addition to being read from server storage.
Supported formats are NDJSON (default), CSV, and Parquet, with patient,
group, and _since filtering. A failed query fails the whole export.

To avoid duplication, extract the shared asynchronous-export machinery into a
new operations/export package (a generic completion-manifest builder, a file
writer, and a filtered data-source builder) reused by both export operations,
and refactor $sqlquery-run onto a shared SqlQueryPipeline that the export also
uses. The Admin UI gains an export affordance on the SQL query result card,
backed by a reusable export-job card shared with the ViewDefinition export.

The operation is declared in the CapabilityStatement against the spec
canonical and gated by a new sqlQueryExportEnabled configuration flag.

Resolves #2633.
Replace the boolean redirectOnComplete marker on the server's asynchronous
machinery with an AsyncPattern enum (STANDARD_ASYNC_PATTERN, BULK_DATA), so the
@AsyncSupported annotation reads as a deliberate choice between two named
patterns rather than the presence or absence of one behaviour.

Correct the Javadoc that misattributed the redirect-based completion to a "SQL
on FHIR unify-async" specification, citing the HL7 Asynchronous Interaction
Request Pattern instead.

Behaviour is unchanged; the existing async tests pass with only their
identifiers updated.
@johngrimes johngrimes added new feature New feature or request server Issues relating to Pathling server. labels Jun 21, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in Pathling Jun 21, 2026
@johngrimes johngrimes moved this from Backlog to In progress in Pathling Jun 21, 2026
isAnyExportEnabled now also covers the SQL on FHIR asynchronous export
operations, so the all-exports-disabled case must also disable the
viewdefinition-export and sqlquery-export flags, both of which default
to enabled.
@johngrimes johngrimes self-assigned this Jun 21, 2026
The SQL on FHIR export integration tests added to the server module push
the full build-and-test suite past the previous 30-minute job limit,
causing the run to be cancelled before completion. Raise the timeout on
the test, pre-release, and release server workflows so the suite has room
to finish.
The disabled-operation assertion called doesNotContain on its own, which
passes vacuously if the operation list is empty. Assert the still-enabled
sqlquery-run operation is present first, so the test proves the list is
populated and only sqlquery-export was dropped.
@johngrimes johngrimes marked this pull request as ready for review June 22, 2026 02:35
@johngrimes johngrimes requested a review from piotrszul June 22, 2026 02:35

@piotrszul piotrszul left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid PR overall — the async refactor, the spec-aligned manifest/conformance work, the shared operations/export abstractions, and the bulk of the test suite are all strong. Two correctness issues block merge, plus a confirmed flaky-CI failure that should be fixed before this lands. Findings grouped by severity. All are static reads (plus one CI log); I haven't run the full build.

🔴 Critical (must fix)

1. $sqlquery-export inherits the synchronous run guards — a silent 1M-row truncation and a 60s abort.
Both operations funnel through SqlQueryPipeline.executeSqlQueryExecutor.execute. That executor applies a row cap and a wall-clock timeout meant to keep the synchronous $sqlquery-run snappy:

  • Row cap: SqlQueryExecutor.java:124 result = result.limit(effectiveLimit(request.getLimit(), requestId)). effectiveLimit (:148-162) returns the server cap when callerLimit == null — and the export parser always passes null (SqlQueryExportRequestParser.java:195). Default maxRows = 1_000_000 (SqlQueryConfiguration.java:43, application.yml:96). So every export is silently truncated at 1M rows, with no _limit to raise it and no manifest flag indicating truncation.
  • Timeout: watchdog at SqlQueryExecutor.java:112,127-133, default timeoutSeconds = 60 (SqlQueryConfiguration.java:51). Since the export is all-or-nothing, a single query past 60s fails the whole export.

This defeats the purpose of an async export (large/slow result sets). The export path should bypass the synchronous cap/timeout (or have its own, much larger/unlimited config). Please add a test that exports a result set exceeding maxRows and asserts all rows are written.

2. $viewdefinition-export patient scoping is broken under the spec-aligned Reference contract.
The realignment moved sibling operations to List<Reference> with Patient/ prefix stripping, but $viewdefinition-export was left half-migrated:

  • ViewDefinitionExportProvider.java:172-173 still declares @OperationParam(name="patient") final List<String> patientIds and inserts them verbatim into the compartment filter via new HashSet<>(patientIds) (:426) with no prefix stripping.
  • A spec-aligned patient=Patient/123 therefore lands as the literal "Patient/123" and never matches stored ids (123) → silently empty/incorrect export. (group is unaffected — IdType.getIdPart() strips Group/55.)
  • No test catches it: ViewDefinitionExportExecutorTest feeds a plain id set straight to the executor, and no IT sends a Patient/-prefixed patient. Docs corroborate the gap — view-run.md shows Reference, view-export.md still shows id.

Fix: migrate view-export patient/group to List<Reference>, reusing the shared ReferenceParameters/toPatientIds/toGroupIds path (and update view-export.md). Add an IT that sends patient=Patient/<id> and asserts the compartment actually filters.

🟡 Important (should fix)

3. Export rejects the spec-recommended json format while supporting the optional parquet.
Issue #2633 lists _format as ndjson/csv/json/parquet and notes json/ndjson/csv are recommended, parquet is MAY. The export uses ViewExportFormat (ViewExportFormat.java:32-41 — NDJSON/CSV/PARQUET only); _format=json returns 400, whereas $sqlquery-run supports json (SqlQueryOutputFormat.java:45). So a client can get json from the sync op but not its async counterpart, and the format set is inverted relative to the spec's recommendation. The gap is documented in sql-export.md, so it's at least deliberate — but please either implement json for export or declare the narrower set in the OperationDefinition/CapabilityStatement so conformance tooling sees the constraint.

4. Job-delete race causes a confirmed flaky CI failure (test stability); benign in production.
On DELETE, JobProvider.handleJobDeleteRequest cancels the future (:181) and immediately fs.delete(jobDir, true) (:213), but Spark cancellation is asynchronous (signalled via SparkJobListener at stage boundaries), so an in-flight commit can still write files into a directory being deleted. This produces an intermittent ExportOperationIT teardown failure: cleanup (ExportOperationIT.java:134) → FileUtils.cleanDirectoryNoSuchFileException on a partitioned NDJSON part-file as a cancelled write settles → BUILD FAILURE (reproduced; passes on rerun). The production effect is minor (orphaned partial output / storage leak; narrow-window 500; no data-integrity risk), so this is Important only for test stability. Minimum: make teardown tolerant of concurrent modification (catch/retry NoSuchFileException, or await() job termination in cancel/delete tests). Proper at-source fix (block on job termination before fs.delete) can be a follow-up issue.

🟢 Minor (nice to have)

5. ExportManifest.buildResultUrl parses the job id/file out of a URL string under a mismatched catch.
ExportManifest.java:163 does localUrl.split("/jobs/")[1].split("/") then indexes [0]/[1], but the surrounding try only catches URISyntaxException (:172) — a localUrl not shaped …/jobs/<uuid>/<file> throws an uncaught ArrayIndexOutOfBoundsException → generic 500. Not a regression (carried over), but it now lives in shared code exercised by both export ops. Cheapest fix: widen the catch / parse defensively. Better: carry (jobId, fileName) structurally from ExportFileWriter (which already knows both) instead of reconstructing them from the URL.

Replaced a Javadoc-style {@code} tag, which renders literally in JSDoc,
with a backtick code span.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

new feature New feature or request server Issues relating to Pathling server.

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

Implement $sqlquery-export operation

2 participants