[Investigation] Post-extraction bundle validation — approach decision #1983

Description

@leoniedickson

Background

This is a follow-on from issue #1927. As part of #1927, PR #1974 added a save gate that blocks save-as-final when the form has validation errors (required fields, targetConstraint violations). This covers form-level constraints — it ensures the questionnaire answers are valid before extraction runs.

What it doesn't cover: constraints on the extracted FHIR resources themselves. After extraction, the resulting Bundle entries (e.g. Condition, MedicationStatement) need to conform to SHC profiles. Profile constraints (required cardinality, mandatory coded fields, etc.) aren't expressed in the questionnaire and aren't checked before write-back.

The gap

Extraction can produce invalid resources even when the form-level gate passes:

  • A template has a mapping bug that drops a required field (e.g. Condition.code)
  • A field the profile requires on the extracted resource has no corresponding questionnaire item (so no form-level constraint can catch it)
  • The profile constraint is more complex than what's expressible as a FHIRPath expression scoped to the form

Without post-extraction validation, a FHIR server that does not enforce profile constraints on write may silently accept invalid resources into the patient record. If the server does reject the bundle, existing error handling surfaces a warning, but by then the QR is already saved as final (the save and the bundle write are separate sequential calls), so the user cannot fix and retry without amending.

Options

Option 1: $validate against the implementer's FHIR server

Call POST [base]/[ResourceType]/$validate for each non-FHIRPatch bundle entry after extraction. Surface issues in the write-back dialog.

$validate itself is a standard FHIR operation and widely supported by mature servers (HAPI, Azure Health Data Services, Firely, Google Cloud Healthcare, etc.). The real question is whether SHC profiles are loaded on the server — without them, $validate only checks base FHIR R4 constraints, which are much looser and won't catch most SHC-specific violations.

Whether implementers can be expected to have SHC profiles loaded is worth clarifying. If smart-forms is being deployed for SHC data capture and extraction, requiring profiles to be loaded may be a reasonable documented expectation.

A configurable validateUrl (rather than a simple on/off toggle) would let implementers point validation at either their clinical FHIR server or a separate endpoint, with validation staying off unless the URL is set.
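
For illustration, the per-entry $validate requests for Option 1 could be built roughly as below. This is a minimal sketch, not the smart-forms implementation: the local types, `buildValidateRequests` and `isFhirPatchEntry` are hypothetical names, and it assumes FHIRPatch entries can be recognised by `request.method === "PATCH"`. An unset validateUrl yields no requests, so validation stays opt-in.

```typescript
// Minimal local types for the sketch; a real implementation would likely use shared FHIR R4 typings.
interface BundleEntry {
  resource?: { resourceType: string; [key: string]: unknown };
  request?: { method: string; url: string };
}
interface Bundle {
  resourceType: "Bundle";
  entry?: BundleEntry[];
}
interface ValidateRequest {
  url: string;
  body: string;
}

// Assumption: FHIRPatch entries are the ones sent with the PATCH method.
function isFhirPatchEntry(entry: BundleEntry): boolean {
  return entry.request?.method === "PATCH";
}

// Builds one POST [base]/[ResourceType]/$validate request per non-patch entry.
// Returns an empty list when validateUrl is not configured.
function buildValidateRequests(bundle: Bundle, validateUrl?: string): ValidateRequest[] {
  if (!validateUrl) return [];
  const base = validateUrl.replace(/\/+$/, ""); // drop trailing slashes
  const requests: ValidateRequest[] = [];
  for (const entry of bundle.entry ?? []) {
    if (!entry.resource || isFhirPatchEntry(entry)) continue;
    requests.push({
      url: `${base}/${entry.resource.resourceType}/$validate`,
      body: JSON.stringify(entry.resource),
    });
  }
  return requests;
}
```

Each request would then be POSTed with `Content-Type: application/fhir+json`, and the OperationOutcome in the response surfaced in the write-back dialog.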

Option 2: Client-side validation

Validate extracted resources in the browser before write-back, with no dependency on the implementer's server. Several sub-approaches considered:

2a — Medplum validator (@medplum/core) with bundled SHC SDs
Use Medplum's open-source (MIT) StructureDefinition-aware validator against SHC SDs bundled with the app. Handles both cardinality constraints and FHIRPath invariants. Known gaps around complex slicing and terminology binding. Adds bundle-size cost from bundled SDs and a dependency on Medplum's validation coverage. This is the only sub-approach worth pursuing — see 2b and 2c.

2b — FHIRPath invariants extracted from SHC SDs (not worth pursuing)
FHIR StructureDefinitions contain explicit FHIRPath invariants (element.constraint.expression) that could be extracted and evaluated with fhirpath.js. However, cardinality constraints — the most common class of violation — are expressed as structural properties of element definitions, not FHIRPath, so they would be missed entirely. Requires significant custom implementation for incomplete coverage.
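
To make the coverage gap in 2b concrete, here is a small sketch that walks a StructureDefinition snapshot and separates the two kinds of rule. The types are trimmed-down local stand-ins and `splitProfileRules` is a hypothetical helper; the point is only that required cardinality lives in `element.min`, outside anything fhirpath.js would evaluate.

```typescript
interface ElementConstraint {
  key: string;
  severity: string;
  human: string;
  expression?: string;
}
interface ElementDefinition {
  path: string;
  min?: number;
  max?: string;
  constraint?: ElementConstraint[];
}
interface StructureDefinition {
  resourceType: "StructureDefinition";
  snapshot?: { element: ElementDefinition[] };
}
interface ProfileRules {
  // Rules a FHIRPath-only approach (2b) could evaluate.
  invariants: { path: string; key: string; expression: string }[];
  // Rules it would miss: elements required purely through cardinality.
  requiredPaths: string[];
}

function splitProfileRules(sd: StructureDefinition): ProfileRules {
  const rules: ProfileRules = { invariants: [], requiredPaths: [] };
  for (const element of sd.snapshot?.element ?? []) {
    for (const c of element.constraint ?? []) {
      if (c.expression) {
        rules.invariants.push({ path: element.path, key: c.key, expression: c.expression });
      }
    }
    if ((element.min ?? 0) > 0) {
      rules.requiredPaths.push(element.path);
    }
  }
  return rules;
}
```

For a profile that makes Condition.code 1..1 without attaching an invariant to it, `requiredPaths` would contain `Condition.code` while `invariants` would have no entry for it, so a dropped code would pass a 2b-style check.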

2c — Manually written FHIRPath checks (not worth pursuing)
Write FHIRPath expressions by hand for known resource types. No SD dependency and very simple, but unprincipled — not derived from the actual profiles, not comprehensive, and goes stale silently as profiles evolve. Noted for completeness.

Option 3: targetConstraint expressions on the questionnaire

Encode profile constraints as targetConstraint extensions on questionnaire items. These feed into the existing form-level save gate from #1927 — no new validation infrastructure needed.

This validates form answers, not the extracted resource, so it doesn't catch template bugs or constraints on fields that aren't questionnaire items. It also puts the maintenance burden on questionnaire authors, who must duplicate SHC profile constraints into each questionnaire. Viable as a complement to other options but not a standalone solution.

Option 4: Dedicated validator service

Run the HL7 Java validator or HAPI FHIR server as a REST API with SHC profiles loaded, hosted within the implementer's own infrastructure. Gives authoritative full-profile validation with no client-side complexity.

Consideration: this faces the same availability question as option 1 — if implementers can't be relied on to load profiles into their existing FHIR server, they're unlikely to stand up a separate service either. It's only practically useful as opt-in infrastructure for implementers who want stronger guarantees and are willing to operate it.

Patient data: the extracted bundle contains real patient data. If this were a shared hosted service (e.g. CSIRO-hosted), data from multiple organisations would flow to a single endpoint — likely not viable without explicit data sharing agreements. A per-deployment instance within the implementer's own infrastructure avoids this, at the cost of operational burden.

Questions to answer

  1. Can implementers deploying smart-forms for SHC data capture be expected to have SHC profiles loaded on their FHIR server? If yes, option 1 becomes much more viable as the primary mechanism.
  2. Should validation failures block write-back, or be surfaced as informational warnings?
  3. If a client-side floor is needed regardless: is option 2a (Medplum) worth the dependency and bundle-size cost?
  4. Is a validateUrl config option useful for implementers who want to opt in to stricter validation?
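
On question 2, the block-versus-warn choice could reduce to a single policy flag applied to the OperationOutcome issues returned by validation. A rough sketch, where `decideWriteBack`, `blockOnError` and the `WriteBackDecision` values are hypothetical names and the severity codes are the standard OperationOutcome ones:

```typescript
type IssueSeverity = "fatal" | "error" | "warning" | "information";
interface OperationOutcomeIssue {
  severity: IssueSeverity;
  code: string;
  diagnostics?: string;
}
interface OperationOutcome {
  resourceType: "OperationOutcome";
  issue: OperationOutcomeIssue[];
}
type WriteBackDecision = "proceed" | "warn" | "block";

// Folds all validation outcomes into one decision for the write-back dialog.
// blockOnError = true: fatal/error issues stop the write-back.
// blockOnError = false: they are downgraded to a warning the user can accept.
function decideWriteBack(outcomes: OperationOutcome[], blockOnError: boolean): WriteBackDecision {
  let hasError = false;
  let hasWarning = false;
  for (const outcome of outcomes) {
    for (const issue of outcome.issue) {
      if (issue.severity === "fatal" || issue.severity === "error") {
        hasError = true;
      } else if (issue.severity === "warning") {
        hasWarning = true;
      }
    }
  }
  if (hasError) return blockOnError ? "block" : "warn";
  return hasWarning ? "warn" : "proceed";
}
```

Keeping the policy in one place would let the same validation results drive either behaviour, whichever way the question is decided.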

Scope

This issue is for investigation and decision-making. Implementation follows once the approach is agreed.

Related: #1927, #1974
