$export is asynchronous: authenticate with SMART Backend Auth (JWT bearer assertion against the EHR's token endpoint), send the $export kick-off request, get a 202 Accepted with Content-Location, poll until status is complete, then download NDJSON files. Most production failures concentrate in JWT signing, timeout handling, and NDJSON streaming, not in the operation itself.
What FHIR Bulk Data ($export) actually is
$export is a FHIR R4 operation defined by the FHIR Bulk Data Access Implementation Guide. It lets an authorized client request all FHIR resources for a defined cohort (system-wide, all patients in a Group, or all patients the client is authorized to access) and receive them as newline-delimited JSON (NDJSON) files. It exists because fetching cohorts one resource at a time with FHIR search hits practical limits at scale: pagination overhead, rate limits, latency, network instability.
Three operations are defined: [base]/$export for system-wide, [base]/Group/[id]/$export for a Group, and [base]/Patient/$export for all patients the requester has access to. The system-wide form is rare in practice — most production deployments use Group $export against a defined cohort.
The operation is asynchronous. The client sends the kick-off request (the Bulk Data IG requires servers to support GET with query parameters; POST with a FHIR Parameters resource body is optional); the server returns 202 Accepted with a Content-Location header pointing to a status endpoint. The client polls the status endpoint until it returns 200 OK with a manifest of NDJSON file URLs. Each file is then downloaded separately. The whole flow is designed for cohorts that take minutes to hours to assemble, not seconds.
SMART Backend Authorization — the part that breaks first
$export needs system-level credentials, which means SMART Backend Authorization (RFC 7521 + RFC 7523 — JWT bearer assertion against an OAuth 2.0 token endpoint). User-context tokens don't work for cohort-level export. The flow:
- You generate an asymmetric key pair (RSA or ECDSA) and host the public key as a JWKS document at a reachable URL.
- You register your client with the EHR: the EHR records your client_id, your JWKS URL, and the scopes you're permitted (e.g. system/Patient.read system/Observation.read).
- To request a token, you build a JWT signed with your private key, with claims iss = client_id, sub = client_id, aud = token endpoint URL, jti = a one-time random ID, exp = no more than 5 minutes in the future.
- You POST to the token endpoint with form params grant_type=client_credentials, client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer, client_assertion=<your JWT>, and scope=<requested scopes>.
- You get back an access token (typically 5–60 minutes lifetime) that you use as a Bearer header on the $export call.
Common failures: signing with the wrong key (use kid in the JWT header to disambiguate); JWKS endpoint not publicly reachable (most EHRs require HTTPS with a valid cert chain); aud claim mismatch (the token endpoint URL must match exactly, including trailing slash); jti reuse (the EHR rejects a JWT it has already accepted in the same time window).
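The claims and form parameters described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete client: the function names, the client_id, and the example.org URL are hypothetical, and the signing step is deliberately left out (a JWT library would sign the claims with your private key and set kid in the header).

```python
import time
import uuid
from urllib.parse import urlencode

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"
MAX_LIFETIME_S = 300  # exp must be no more than 5 minutes ahead


def build_assertion_claims(client_id, token_url, now=None, lifetime_s=MAX_LIFETIME_S):
    """Return the (unsigned) claims for the client assertion JWT."""
    if lifetime_s > MAX_LIFETIME_S:
        raise ValueError("assertion lifetime must not exceed 5 minutes")
    issued = int(now if now is not None else time.time())
    return {
        "iss": client_id,
        "sub": client_id,
        "aud": token_url,            # must match the token endpoint exactly
        "jti": str(uuid.uuid4()),    # fresh value for every token request
        "exp": issued + lifetime_s,
    }


def build_token_request_body(signed_jwt, scopes):
    """Form-encode the token request; signed_jwt comes from your JWT library."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
        "scope": " ".join(scopes),
    })


# Hypothetical values for illustration only.
claims = build_assertion_claims("my-client-id", "https://ehr.example.org/oauth2/token", now=1000)
body = build_token_request_body("header.payload.signature", ["system/Patient.read"])
```

Generating jti inside the claims builder makes accidental reuse harder, since every token request gets a new assertion rather than a cached one.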
The $export request itself
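Send the kick-off request to [base]/Group/[id]/$export with these headers: Accept: application/fhir+json, Prefer: respond-async, Authorization: Bearer <access token>. The Bulk Data IG defines the kick-off as a GET with query parameters; servers may also accept a POST whose body is a FHIR Parameters resource, so check which form your EHR expects. Optional query params shape the cohort: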
- _outputFormat=application/fhir+ndjson: set this explicitly even though it's the default. Some EHRs are strict.
- _since=2026-01-01T00:00:00Z: only return resources updated since this timestamp. Critical for incremental pulls.
- _type=Patient,Encounter,Observation,Condition: limit which resource types come back. Cuts file size dramatically.
- _typeFilter=Observation?category=laboratory: finer-grained filter, applied per resource type. Few EHRs implement this fully; check before depending on it.
The response is 202 Accepted with a Content-Location header pointing to the status endpoint. Save this URL; you'll poll it.
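As a rough sketch, the kick-off URL and headers could be assembled like this. The function name, base URL, and Group ID are hypothetical, and the HTTP call itself is left to whatever client you use. Note that urlencode percent-encodes the "+" in the output format, which matters because a raw "+" in a query string is commonly decoded as a space.

```python
from urllib.parse import urlencode


def build_export_request(base_url, group_id, access_token,
                         since=None, types=None, type_filters=None):
    """Return (url, headers) for a Group-level $export kick-off request."""
    params = {"_outputFormat": "application/fhir+ndjson"}
    if since:
        params["_since"] = since
    if types:
        params["_type"] = ",".join(types)
    if type_filters:
        params["_typeFilter"] = ",".join(type_filters)

    url = f"{base_url.rstrip('/')}/Group/{group_id}/$export?{urlencode(params)}"
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",
        "Authorization": f"Bearer {access_token}",
    }
    return url, headers


# Hypothetical endpoint and cohort, for illustration only.
url, headers = build_export_request(
    "https://fhir.example.org/r4/",
    "cohort-123",
    "ACCESS_TOKEN",
    since="2026-01-01T00:00:00Z",
    types=["Patient", "Encounter", "Observation"],
)
```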
Polling the status endpoint
GET the status URL with the same Bearer token. Three possible responses:
- 202 Accepted: still in progress. Look for an X-Progress header (rough percentage) and a Retry-After header (suggested next poll interval). Wait, then re-poll.
- 200 OK: done. Body is a manifest JSON with output array (one entry per NDJSON file), error array, and transactionTime.
- 4xx/5xx: an error occurred. Body should be an OperationOutcome explaining what failed.
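The three outcomes can be mapped to a small result structure. The sketch below is one possible shape, with a hypothetical function name and result keys; it takes the status code, headers, and parsed JSON body as plain values so it does not depend on any particular HTTP client. A plain dict is case-sensitive, whereas most HTTP clients expose case-insensitive header mappings.

```python
def interpret_status(status_code, headers, body=None):
    """Classify a status-endpoint response as in_progress, complete, or failed."""
    if status_code == 202:
        return {
            "state": "in_progress",
            "progress": headers.get("X-Progress"),
            "retry_after": headers.get("Retry-After"),
        }
    if status_code == 200:
        manifest = body or {}
        return {
            "state": "complete",
            "transaction_time": manifest.get("transactionTime"),
            # One (resource type, download URL) pair per NDJSON file.
            "files": [(e["type"], e["url"]) for e in manifest.get("output", [])],
            "errors": manifest.get("error", []),
        }
    # Anything else is treated as a failure; body should be an OperationOutcome.
    return {"state": "failed", "status_code": status_code, "outcome": body}


in_progress = interpret_status(202, {"X-Progress": "40%", "Retry-After": "30"})
complete = interpret_status(200, {}, {
    "transactionTime": "2026-01-01T00:00:00Z",
    "output": [{"type": "Patient", "url": "https://files.example.org/1.ndjson"}],
    "error": [],
})
```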
Polling cadence: start at 15 seconds, exponentially back off to 5 minutes maximum. Honor Retry-After if present. Never poll faster than 5 seconds. Most large cohort exports complete in 5–30 minutes; very large ones can take hours.
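The cadence rule above fits in one small function. This is a sketch with a hypothetical name and defaults taken from the numbers in the paragraph; it only understands Retry-After given as a number of seconds and falls back to the backoff schedule when the header is an HTTP date or otherwise unparseable.

```python
def next_poll_delay(attempt, retry_after=None, base_s=15.0, cap_s=300.0, floor_s=5.0):
    """Seconds to wait before the next status poll.

    attempt is 0 for the first wait, 1 for the second, and so on.
    """
    if retry_after is not None:
        try:
            hinted = float(retry_after)
        except ValueError:
            hinted = None  # e.g. an HTTP-date value; not handled in this sketch
        if hinted is not None:
            # Honor the server's hint, but never poll faster than the floor.
            return max(hinted, floor_s)
    # Exponential backoff: 15, 30, 60, 120, 240, then capped at 300 seconds.
    return min(base_s * (2 ** attempt), cap_s)


schedule = [next_poll_delay(n) for n in range(6)]
```

The server hint is not capped on purpose: if the EHR asks for a longer wait than 5 minutes, polling sooner only adds load to a queue that is already behind.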
NDJSON download — the second place things break
The manifest's output array contains entries like { "type": "Patient", "url": "..." }. GET each URL, sending your Bearer token when the manifest's requiresAccessToken field is true. Files can be large (gigabytes for a real population), so stream the response body and parse line-by-line. Each line is a single FHIR resource as JSON.
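Line-by-line parsing can be sketched as a generator that never holds more than one resource in memory. The function name is hypothetical, and the in-memory sample stands in for a streamed HTTP response; with a real client you would pass its streaming line iterator instead (for example, the requests library's iter_lines() on a response opened with stream=True).

```python
import io
import json


def iter_ndjson(lines):
    """Yield one parsed FHIR resource per non-empty NDJSON line."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue  # tolerate blank lines and a trailing newline
        yield json.loads(line)


# Stand-in for a streamed response body.
sample = io.BytesIO(
    b'{"resourceType":"Patient","id":"p1"}\n'
    b'\n'
    b'{"resourceType":"Patient","id":"p2"}\n'
)
patient_ids = [resource["id"] for resource in iter_ndjson(sample)]
```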
Production checklist: connection-pool tuning (don't open one per file, reuse), retry with exponential backoff on transient failures, byte-range resume support if the EHR provides ETag/Content-Length headers, SHA-256 verification if the manifest includes integrity hashes, and post-download cleanup (delete the manifest after all files are consumed — most EHRs expire it in 24–48 hours anyway).
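Two of the checklist items, byte-range resume and SHA-256 verification, can be sketched with the standard library alone. The function names and the .part file convention are hypothetical; whether an EHR honors Range requests or publishes hashes at all is something to confirm per vendor.

```python
import hashlib
import os
import tempfile


def resume_headers(partial_path, access_token):
    """Return (headers, offset) for continuing a download from a partial file."""
    headers = {"Authorization": f"Bearer {access_token}"}
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if offset:
        # Ask the server for the remaining bytes only.
        headers["Range"] = f"bytes={offset}-"
    return headers, offset


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large exports never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo against a temporary partial file.
with tempfile.TemporaryDirectory() as tmp:
    part = os.path.join(tmp, "Patient.ndjson.part")
    with open(part, "wb") as fh:
        fh.write(b'{"resourceType":"Patient"}\n')
    demo_headers, demo_offset = resume_headers(part, "ACCESS_TOKEN")
    demo_digest = sha256_of_file(part)
    fresh_headers, fresh_offset = resume_headers(os.path.join(tmp, "missing.part"), "ACCESS_TOKEN")
```

If the server ignores the Range header and answers 200 instead of 206, the safe fallback is to discard the partial file and restart that download.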
Per-EHR notes
eClinicalWorks: Group $export is the standard path; Patient $export is supported but rarely used. JWKS registration goes through the eCW developer portal and can take 1–2 weeks. Common failure: 401 Unauthorized despite valid credentials — usually a JWKS visibility issue, fix by hosting the JWKS document on HTTPS with a public certificate. See our eClinicalWorks $export Sprint for a productized implementation.
Epic: $export is exposed through Epic's App Orchard / Vendor Services. Group $export against a Patient List is the default. Epic's implementation honors _typeFilter well. Common gotcha: scope strings differ slightly from the SMART spec — Epic uses system/Patient.Read with capital R.
Cerner (Oracle Health): Group $export is supported via the Cerner Code Console. Cerner is fast on small cohorts but has historically had longer queue times for very large ones. Cerner's NDJSON files are well-formed but conform strictly to US Core profiles, so resources missing required US Core extensions will fail downstream validation.
athenahealth: $export support is improving but historically lagged. Check current support status; some athena deployments still rely on athena's proprietary bulk APIs rather than FHIR $export. When $export is supported, the auth flow is standard SMART Backend.
Common production failures and root causes
- $export returns 202 but Content-Location polling never completes. Usually an EHR backend timeout or task-queue backlog. Break the cohort into smaller chunks with _since and chunked Group exports.
- NDJSON download fails mid-stream on large files. Connection pool exhaustion or transient network. Solution: streaming download with byte-range resume and exponential retry.
- Returned resources fail US Core validation. EHR profile-conformance setting. Configure US Core conformance in the EHR admin and validate output before downstream ETL.
- Group endpoint returns 404 even though the Group exists. Logical vs technical Group ID mismatch. Verify with FHIR search before $export.
- 401 mid-flow on long-running export. Token expired during the polling window. Solution: refresh the token proactively when its lifetime drops below 5 minutes.
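The proactive refresh in the last item can be sketched as a small cache that re-requests a token once less than 5 minutes of lifetime remain. The class name and the fetch_token callable are hypothetical; fetch_token stands in for your token request and is assumed to return the access token together with its expires_in value in seconds.

```python
import time


class TokenCache:
    """Hand out a cached access token, refreshing it before it gets too close to expiry."""

    def __init__(self, fetch_token, refresh_margin_s=300, clock=time.time):
        self._fetch = fetch_token      # callable returning (access_token, expires_in)
        self._margin = refresh_margin_s
        self._clock = clock            # injectable so the logic can be tested
        self._token = None
        self._expires_at = 0.0

    def get(self):
        remaining = self._expires_at - self._clock()
        if self._token is None or remaining < self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = self._clock() + expires_in
        return self._token


# Demo with a fake clock and a fake token endpoint issuing 1-hour tokens.
fake_now = [0.0]
issued = []


def fake_fetch():
    issued.append(1)
    return f"token-{len(issued)}", 3600


cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 3000.0          # 600 s of lifetime left: keep the cached token
second = cache.get()
fake_now[0] = 3400.0          # 200 s left: below the margin, so refresh
third = cache.get()
```

Calling get() before every poll and every file download keeps the refresh logic in one place. If the EHR issues tokens that live 5 minutes or less, the margin needs to be shortened, otherwise every call would fetch a new token.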
What this looks like in production
A production-ready $export pipeline has five components: (1) a JWT signing service with private key in HSM/KMS, (2) a token-management module with auto-refresh and rotation, (3) an export-orchestration service that chunks cohorts and tracks job status, (4) a streaming NDJSON downloader with retry logic, (5) a post-download ETL into your data warehouse with US Core profile validation. Add monitoring on top: export-job success rate, average duration, file-size trends, error categorization.
Production teams typically run their first single-cohort $export in 4–6 weeks of focused engineering work. Multi-environment rollouts with QA gates take 8–12 weeks. The wall-clock time is more about EHR registration cycles than engineering — most of the actual code is well-defined and testable in isolation.