HL7 v2 is the lingua franca of US healthcare data — by our count, still close to 85% of real clinical messages in 2026, despite every FHIR adoption projection of the last decade. It's resilient because it's simple: pipe-delimited segments, well-defined trigger events, a standard ACK/NACK handshake, and transport over MLLP or file. That simplicity is also why integration teams trip over the same set of errors year after year — the specification has a lot of flex, and every vendor implementation interprets that flex slightly differently.
This article catalogs the 15 HL7 errors that our integration team has diagnosed most often in production environments at US hospitals, labs, and healthtech companies. Each is presented with the symptoms you'll see, the root cause, the fix, and the prevention measure that stops it coming back. It's a working field guide — not a textbook.
For the broader reference, read our HL7 integration guide. If you need an engineer to pair with you on an active incident, our Mirth Connect helpdesk has an under-15-minute response SLA.
1. A Framework for HL7 Errors
Every HL7 error falls into one of five layers. Triaging to the right layer first is the single biggest time saver.
- Transport. MLLP framing, TCP connectivity, TLS. If the message never reaches the receiver intact, nothing above matters.
- Encoding. Character set, byte-level integrity. Messages can arrive intact at the transport layer but be misinterpreted at parse time.
- Structure. Segment order, field separators, cardinality, MSH version. The parser either succeeds or emits an AR NACK.
- Semantics. Required fields present, values within allowed codes, trigger-event consistency. AE NACKs live here.
- Downstream. The message parsed fine but the receiver's downstream system rejected the content — patient not found, lab order ID collision, business rule violation.
Before running any diagnostic, pick a layer. Jumping straight to a specific error category without this triage wastes time on wrong hypotheses.
2. AE and AR NACKs — Reading the MSA
The HL7 acknowledgement segment (MSA) is the first thing to read when a message fails. In original acknowledgement mode, MSA.1 carries one of three values: AA (Application Accept), AE (Application Error), or AR (Application Reject). Enhanced mode adds the commit-level equivalents CA, CE, and CR.
MSH|^~\&|LAB|HOSP|EHR|HOSP|202604210930||ACK^R01|123|P|2.5
MSA|AE|MSG-12345|Patient identifier not recognized
ERR|PID^1^3&CX.1&1|PatientID|207
2.1 AE — Application Error
The receiver parsed the message successfully but encountered a problem while processing. Common causes:
- Patient identifier doesn't resolve to a known record.
- Required business field is present but empty.
- Downstream database deadlock or timeout.
- Business rule violation (order for a discharged patient).
Rule: AE is typically retriable. Queue the message, retry with exponential backoff, and escalate after three failed attempts.
2.2 AR — Application Reject
The receiver refused the message structurally. Common causes:
- Unsupported HL7 version (sender is 2.5.1, receiver only accepts 2.3).
- Unsupported trigger event (sender emits ADT^A08, receiver only listens for A01/A04).
- Missing required segment (no PID in an ADT).
- Malformed segment failing strict parsing.
Rule: AR is not retriable. Retrying an AR is a data-quality bug; route to a dead-letter channel for engineer review.
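The two rules above can be sketched as a small decision function. This is a minimal Python sketch, not a Mirth API: the name `next_action`, the action labels, and the 2-second base delay are assumptions chosen for illustration. `attempt` counts the failed attempts so far.

```python
def next_action(ack_code: str, attempt: int, max_attempts: int = 3,
                base_delay_s: float = 2.0) -> tuple:
    """Map an MSA.1 code and the number of failed attempts to (action, delay_seconds)."""
    code = ack_code.strip().upper()
    if code == "AA":
        return ("done", 0.0)
    if code == "AE" and attempt < max_attempts:
        # Exponential backoff: 2s after the first failure, 4s after the second, ...
        return ("retry", base_delay_s * 2 ** (attempt - 1))
    if code == "AE":
        # Retry budget exhausted: hand over to a human.
        return ("escalate", 0.0)
    # AR, and anything unrecognized, is never retried automatically.
    return ("dead_letter", 0.0)
```

Treating unknown codes like AR is a deliberately conservative choice: a retry loop on a code you do not understand is harder to clean up than a dead-letter entry.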
2.3 Reading the ERR segment
HL7 v2.5 restructured the ERR segment, adding specific location pointers (ERR-2, error location) and coded error conditions (ERR-3, HL7 error code). If your receiver emits them, parse them: they tell you the exact segment, field, and component that failed, saving you diff work.
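As an illustration, here is a minimal Python sketch that pulls the acknowledgement code and any ERR pointers out of an ACK. It assumes the default `|` field separator and the v2.5 field layout (ERR-2 location, ERR-3 code); `parse_ack` and `SAMPLE_ACK` are hypothetical names, and the sample message is made up for this example.

```python
SAMPLE_ACK = (
    "MSH|^~\\&|EHR|HOSP|LAB|HOSP|202604210930||ACK^R01|123|P|2.5\r"
    "MSA|AE|MSG-12345|Patient identifier not recognized\r"
    "ERR||PID^1^3|204^Unknown key identifier^HL70357|E\r"
)

def parse_ack(ack: str) -> dict:
    """Extract MSA.1 to MSA.3 and any ERR-2/ERR-3 pairs from an ACK."""
    result = {"code": None, "control_id": None, "text": None, "errors": []}
    # Segments are CR-separated; tolerate LF or CRLF from logs and editors.
    for segment in ack.replace("\r\n", "\r").replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        fields += [""] * (4 - len(fields))  # pad short segments
        if fields[0] == "MSA":
            result["code"], result["control_id"], result["text"] = fields[1:4]
        elif fields[0] == "ERR":
            result["errors"].append({"location": fields[2], "code": fields[3]})
    return result
```

Receivers on pre-2.5 versions put the location in ERR-1 instead, so check MSH.12 of the ACK before trusting the field positions.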
3. Segment Parse Failures
A surprisingly common category — the parser sees a segment name it doesn't recognize, a segment in the wrong order, or a segment with a field count that mismatches the profile.
Symptoms
- AR NACK with MSA.3 referencing a segment name.
- Mirth log shows HL7Exception: Segment PID occurred before EVN or similar.
- Strict HAPI-HL7v2 validators raise structure exceptions.
Root causes and fixes
- Custom Z-segments unknown to the receiver. The sender emits ZPI or similar; the receiver rejects unknown segments strictly. Fix: strip Z-segments in an outbound transformer, or configure the receiver to tolerate unknown segments.
- Optional segments in wrong position. Some profiles require NK1 after PID but before PV1; a sender that places NK1 after PV1 will fail strict validation. Fix: sort segments to comply with the profile.
- Segment profile drift. Sender is on 2.5.1 which allows repeating groups that 2.3 doesn't. Fix: either upgrade the receiver or flatten repetitions in the transformer.
- Truncation of the final segment. Networking layer drops the last few bytes, so the final segment is incomplete. Fix: verify MLLP framing (section 5).
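The first fix in the list, stripping Z-segments before delivery, might look like the following Python sketch. It assumes CR-separated segments and treats any segment ID beginning with `Z` as custom; `strip_z_segments` is a hypothetical helper, not part of any engine.

```python
def strip_z_segments(message: str) -> str:
    """Drop every custom Z-segment and return a CR-terminated message."""
    segments = [s for s in message.split("\r") if s]
    kept = [s for s in segments if not s.startswith("Z")]
    return "\r".join(kept) + "\r"
```

Only strip when the receiver genuinely has no use for the data. If the Z-segment carries something the downstream workflow needs, negotiate tolerance on the receiver instead.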
4. Character Encoding Drift
The single most common silent error. Names with accents (José, Muñoz, Müller) become mojibake. Curly quotes corrupt field boundaries. Em dashes truncate segments.
Symptoms
- Names appear fine for most patients but corrupt for patients with non-ASCII characters.
- Messages sometimes appear truncated mid-field after a specific character.
- Downstream reports include replacement characters (?, □, or the literal
?). - MSH.18 is either empty or disagrees with the sender's actual encoding.
Diagnosis
# Examine the actual bytes, not the rendered text
xxd /path/to/message.hl7 | head -n 30
# Or, look at the declared encoding in MSH.18.
# Segments end in CR, so convert to LF first or grep sees one long line.
tr '\r' '\n' < message.hl7 | grep -m1 '^MSH' | awk -F'|' '{print $18}'
Compare the byte pattern with the declared charset:
- UTF-8: multi-byte sequences that start with a lead byte in 0xC2-0xF4, followed by one to three continuation bytes in 0x80-0xBF.
- Latin-1 (ISO-8859-1): single bytes in 0x80-0xFF with no continuation requirement (printable characters sit in 0xA0-0xFF).
- Windows-1252: Latin-1 superset that places curly quotes at 0x91-0x94 and en/em dashes at 0x96-0x97.
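The byte-pattern comparison can be automated as a rough first pass. The Python sketch below is a heuristic only: it cannot prove a charset, and the function name `sniff_encoding` and its return labels are assumptions made for this example.

```python
def sniff_encoding(raw: bytes) -> str:
    """Guess the charset of a raw HL7 payload from its byte patterns."""
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        raw.decode("utf-8")  # strict decode: fails on any invalid sequence
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Bytes in 0x80-0x9F are control codes in Latin-1 but printable
    # punctuation (curly quotes, dashes) in Windows-1252.
    if any(0x80 <= b <= 0x9F for b in raw):
        return "windows-1252 (likely)"
    return "iso-8859-1 (likely)"
```

Compare the result with MSH.18. A disagreement between the two is the signal worth alerting on, whichever side turns out to be wrong.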
Fix
Declare MSH.18 explicitly on the sender, set the receiver's channel encoding to match, and standardize both sides on UTF-8 unless you have a legacy requirement to do otherwise.
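// Mirth channel configuration (Groovy deploy script)
// Caution: the JVM caches its default charset at startup, so setting these
// properties at runtime does not reliably change it. Prefer passing
// -Dfile.encoding=UTF-8 as a JVM startup option, and set each connector's
// encoding to UTF-8 explicitly.
System.setProperty('file.encoding', 'UTF-8')
System.setProperty('sun.jnu.encoding', 'UTF-8')
For deeper transport discussion, see Mirth Connect MLLP connection refused.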
5. MLLP Framing Drift
MLLP framing is rigid: start-block 0x0B, message, end-block 0x1C, terminator 0x0D. Every sender and receiver in the chain must agree.
Symptoms
- Connection stays up; individual messages time out waiting for ACK.
- Receiver log shows MLLPv2 framing error or unexpected end of stream.
- Manual test with nc produces no ACK response.
Common drift patterns
- Missing <CR> after <FS>. Some sender implementations emit only <FS>. Strict receivers hang waiting for <CR>.
- <LF> where <CR> is required. A Windows sender emitting 0x0A instead of 0x0D breaks the frame.
- Segment separators of \r\n instead of \r. HL7 segments must be separated by <CR> only. Extra <LF> characters create phantom empty segments.
- MLLPv2 framing when MLLPv1 is expected. MLLPv2 adds transport-layer acknowledgements. Strictly different wire format.
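Most of these patterns can be checked mechanically on a captured frame. The following Python sketch wraps a message in MLLP framing and reports the first three drift patterns; `mllp_wrap` and `mllp_problems` are hypothetical helper names, and the check does not attempt to detect MLLPv2 acknowledgement frames.

```python
VT, FS, CR, LF = 0x0B, 0x1C, 0x0D, 0x0A

def mllp_wrap(message: str, encoding: str = "utf-8") -> bytes:
    """Frame a message as <VT>message<FS><CR>."""
    return bytes([VT]) + message.encode(encoding) + bytes([FS, CR])

def mllp_problems(frame: bytes) -> list:
    """Return a list of framing problems found in one captured frame."""
    problems = []
    if not frame.startswith(bytes([VT])):
        problems.append("missing <VT> start block")
    if frame.endswith(bytes([FS, LF])):
        problems.append("<LF> after <FS> where <CR> is required")
    elif frame.endswith(bytes([FS])):
        problems.append("missing <CR> after <FS>")
    elif not frame.endswith(bytes([FS, CR])):
        problems.append("missing <FS><CR> end block")
    if bytes([CR, LF]) in frame:
        problems.append("CRLF segment separators; HL7 requires <CR> only")
    return problems
```

An empty list means the frame looks structurally sound; it says nothing about the HL7 content inside it.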
Diagnosis with tcpdump
sudo tcpdump -i any -X -s 0 'port 6661' | head -n 200
# Look for the exact bytes around the start and end of each message.
6. MSH and PID Mismatches
Subtle misconfigurations in MSH and PID are responsible for an outsized share of interface incidents — especially after environment migrations.
MSH.3 / MSH.4 — Sending application and facility
Receivers often route based on these fields. If your test environment uses HOSP-TEST and production uses HOSP-PROD, a config copy-paste will send test traffic to production routing. Always parameterize with configuration-map keys:
// Mirth source transformer
msg['MSH']['MSH.3'] = $cfg('mirth.sending.app')
msg['MSH']['MSH.4'] = $cfg('mirth.sending.facility')
MSH.9 — Trigger event
Format is MessageType^TriggerEvent^MessageStructure. Many outages come from sending ADT^A08 when the receiver registered for ADT^A01 only. Maintain a strict contract per interface and validate at channel-start time.
MSH.11 — Processing ID
P (production), T (training), D (debugging). Many receivers treat non-P messages as discardable. A test-to-prod fat-finger that leaves MSH.11=T in place silently drops patient data on the receiver side. Alert on MSH.11 mismatches at the Mirth source filter level.
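A guard for this is small. The Python sketch below reads MSH fields by number and compares MSH.11 with the environment the channel is deployed in. The helper names and the `production`/`test` environment labels are assumptions; it also assumes `^` as the component separator.

```python
def msh_field(message: str, n: int) -> str:
    """Return MSH.n. MSH.1 is the field separator itself, so after
    splitting on it, MSH.n sits at index n-1 (for n >= 2)."""
    msh = message.split("\r", 1)[0]
    if not msh.startswith("MSH") or len(msh) < 4:
        raise ValueError("message does not start with an MSH segment")
    sep = msh[3]
    if n == 1:
        return sep
    fields = msh.split(sep)
    return fields[n - 1] if n - 1 < len(fields) else ""

def processing_id_ok(message: str, environment: str) -> bool:
    """True when MSH.11 matches the deployment environment."""
    expected = {"production": "P", "test": "T"}[environment]
    # MSH.11 is a PT data type; its first component is the processing ID.
    return msh_field(message, 11).split("^")[0] == expected
```

The off-by-one in `msh_field` is worth keeping in one place: it is the most common reason a hand-rolled check reads MSH.10 when it meant MSH.11.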
PID.3 — Patient identifier list
PID.3 is a repeating field holding one or more CX (identifier) data types. Common errors:
- Sender emits MRN in the first repetition; receiver expects it in a specific repetition with a specific assigning authority in CX.4.
- Identifier type code (CX.5) missing or wrong — receiver can't tell if it's an MRN, SSN, or insurance ID.
- Assigning-authority namespace drift after a hospital merger.
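Rather than trusting repetition order, a receiver can select the identifier by type code and assigning authority. This is a minimal Python sketch assuming default delimiters (`~`, `^`, `&`); `find_identifier` and the sample authorities are hypothetical.

```python
from typing import Optional

def find_identifier(pid3: str, type_code: str,
                    authority: Optional[str] = None) -> Optional[str]:
    """Return CX.1 of the first PID.3 repetition whose CX.5 matches
    type_code and, if given, whose CX.4 namespace matches authority."""
    for repetition in pid3.split("~"):
        comps = repetition.split("^")
        comps += [""] * (5 - len(comps))  # pad so CX.4 and CX.5 always exist
        cx1, cx4, cx5 = comps[0], comps[3], comps[4]
        namespace = cx4.split("&")[0]  # first subcomponent of the HD type
        if cx5 == type_code and (authority is None or namespace == authority):
            return cx1 or None
    return None
```

For example, with `INS777^^^PAYER^MB~12345^^^HOSP_A^MR~98765^^^HOSP_B^MR`, asking for type `MR` under authority `HOSP_B` selects `98765` regardless of where the sender placed it.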
7. Control Character Injection
HL7 uses |, ^, ~, \, and & as delimiters. When real data contains one of these characters (a patient named O'Brien|Smith, a comment with a caret), the field shifts silently unless escaped.
Symptoms
- Name fields shifted by one component in downstream records.
- Receiver reports NTE.3 fields with unexpected values.
- Parse succeeds but data is semantically wrong.
Fix
Escape with the HL7 escape sequences \F\, \S\, \R\, \E\, \T\ (field, component, repetition, escape, and subcomponent separators respectively):
// Groovy helper to sanitize free-text entering HL7 fields
static String hl7Escape(String s) {
    if (s == null) return ''
    return s
        // Escape the backslash first so later replacements are not double-escaped
        .replace('\\', '\\E\\')
        .replace('|', '\\F\\')
        .replace('^', '\\S\\')
        .replace('~', '\\R\\')
        .replace('&', '\\T\\')
}
8. Duplicate Message Control IDs
MSH.10 is the message control ID. It's supposed to be unique per message in the sender's window. Duplicates cause receivers to either double-process or silently reject — neither is good.
Common causes
- Counter reset. Sender restarts and resets its sequence; now it emits IDs colliding with recently sent messages.
- Insufficient entropy. Timestamp-to-seconds as the ID; two messages in the same second collide.
- Database restore. A DR exercise restores an older sequence value; IDs repeat for an entire day.
- Reprocessing without rewriting. Queued retry re-sends the original ID; receiver's duplicate detector rejects it.
Fix
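Use a monotonic high-entropy ID: timestamp in milliseconds plus a per-process counter, or a UUID. On retry, rewrite MSH.10 to a new ID and preserve the original in channel metadata or a custom Z-segment for traceability. (MSA.2 only exists in the acknowledgement, where it echoes whichever MSH.10 the receiver saw, so it cannot carry the original ID on the outbound message.)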
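A minimal Python sketch of the timestamp-plus-counter option follows. The class name, the two-character node prefix, and the four-digit counter width are assumptions, chosen so the result stays within the 20 characters that older receivers allow for MSH.10.

```python
import itertools
import threading
import time

class ControlIdGenerator:
    """Builds IDs as <node><epoch millis><4-digit counter>."""

    def __init__(self, node: str):
        self._node = node                # short per-process or per-host prefix
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def next_id(self) -> str:
        with self._lock:                 # counter is shared across threads
            seq = next(self._counter) % 10000
        millis = int(time.time() * 1000)
        return f"{self._node}{millis}{seq:04d}"
```

The counter restarts at zero when the process restarts, which is safe only because the millisecond timestamp has moved on. A wall clock that jumps backwards removes that guarantee, so a UUID is the safer choice where clocks are not trusted.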
// Groovy source transformer for outbound retries
import java.util.UUID
// Capture the original ID before overwriting it
def originalId = msg['MSH']['MSH.10'].text()
def newId = "${System.currentTimeMillis()}-${UUID.randomUUID().toString().take(8)}"
msg['MSH']['MSH.10'] = newId
channelMap.put('originalMessageId', originalId)
9. Timezone Corruption in TS Fields
HL7 TS (timestamp) fields can include an offset: 20260421093000-0500. They often don't. A naked timestamp is open to interpretation — sender's local time, receiver's local time, UTC, or the default timezone of whichever JVM last touched it.
Symptoms
- Appointment times drift by 4-5 hours after an interface change.
- Observation timestamps appear in the future for Alaska/Hawaii sites.
- Twice-a-year DST anomalies cause one hour of apparent future/past records.
Fix
Always emit with an explicit offset. Always parse with an explicit timezone assumption for legacy senders that don't include one. Store downstream in UTC.
// Groovy: safe timestamp normalization
import java.time.*
import java.time.format.DateTimeFormatter
def raw = msg.OBX.'OBX.14'.text()
def zone = ZoneId.of('America/Chicago') // known sender timezone
def parsed
if (raw.length() >= 14 && (raw.contains('+') || raw.contains('-'))) {
    parsed = OffsetDateTime.parse(raw, DateTimeFormatter.ofPattern('yyyyMMddHHmmssZ'))
} else {
    parsed = LocalDateTime.parse(raw, DateTimeFormatter.ofPattern('yyyyMMddHHmmss'))
        .atZone(zone).toOffsetDateTime()
}
channelMap.put('observedAt', parsed.toInstant().toString())
10. Trigger-Event Confusion
HL7 trigger events capture the business reason for the message — admission (A01), discharge (A03), update (A08), merge (A30). Senders often misalign the trigger event with the payload, or receivers only implement a subset.
Common patterns
- A08 used as a catch-all. "Update patient" is sent whenever any field changes — including fields that should trigger A31 (update person) or A04 (register patient).
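- A04 vs A28. A04 is "register a patient" and signals an outpatient or emergency visit (A01 is the inpatient admit); A28 is "add person information" and carries demographics with no visit attached, for example when loading a master patient index. Receivers that distinguish will reject the wrong one.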
- A40 merge. Sender merges two records but emits A08 instead of A40; receiver has two records instead of one.
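- ORM vs OML vs ORL. Order entry is ORM^O01 in pre-2.5 interfaces; in 2.5+ lab orders typically move to OML^O21, with ORL^O22 as the application response. A sender on the new version and a receiver on the old one still have to agree on a single message type.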
Fix
Document the trigger-event contract per interface. List the trigger events the receiver accepts, and route any unknown or unsupported trigger event to a dead-letter channel with an alert — do not silently drop.
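In code, the contract is an allowlist with an explicit dead-letter path. The Python sketch below uses a hypothetical contract of two ADT events; the names `ACCEPTED_EVENTS` and `route` are illustrative only.

```python
# Hypothetical contract for one interface: (message type, trigger event) pairs.
ACCEPTED_EVENTS = {("ADT", "A01"), ("ADT", "A04")}

def route(msh9: str, accepted=ACCEPTED_EVENTS) -> str:
    """Return 'process' for contracted events, 'dead_letter' for the rest."""
    parts = msh9.split("^")
    message_type = parts[0]
    trigger = parts[1] if len(parts) > 1 else ""
    if (message_type, trigger) in accepted:
        return "process"
    # Unknown or unsupported: keep the message and raise an alert upstream.
    return "dead_letter"
```

Keeping the allowlist as data rather than as filter logic makes it easy to publish the same list in the interface specification.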
11. Five More Production Errors
11.1 OBX value-type / observation-value mismatch
OBX.2 declares the value type (NM for numeric, ST for string, CE for coded element). When the actual OBX.5 content doesn't match, for example OBX.2 = NM but OBX.5 contains "<10", strict parsers reject. Fix by aligning the value type with the actual value: SN (structured numeric) can carry the comparator, or fall back to ST with a reference-range narrative.
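A pre-send check for the NM case can be a few lines. This Python sketch covers only NM and uses a deliberately simple numeric pattern; `obx_value_problem` is a hypothetical name.

```python
import re

# Optional sign, digits, optional decimal part.
_NM = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")

def obx_value_problem(obx2: str, obx5: str):
    """Return a description of the mismatch, or None if the pair looks fine."""
    if obx2 == "NM" and obx5 and not _NM.fullmatch(obx5.strip()):
        return f"OBX-2 is NM but OBX-5 is not numeric: {obx5!r}"
    return None
```

Running this in the sender's transformer surfaces the mismatch where it can still be corrected, instead of as an AR from the receiver.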
11.2 Empty required fields vs missing fields
HL7 distinguishes empty (||), null (|""|), and missing (field absent). Vendors treat these differently. A receiver expecting "" may fail on empty, and vice versa. Normalize on the sender side.
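The three states are easy to tell apart once you stop treating them all as "blank". A minimal Python sketch for non-MSH segments, assuming the default `|` separator (`field_state` is a hypothetical helper):

```python
def field_state(segment: str, index: int) -> str:
    """Classify field `index` of a non-MSH segment (index 0 is the segment ID)."""
    fields = segment.split("|")
    if index >= len(fields):
        return "missing"   # segment ends before this field
    value = fields[index]
    if value == "":
        return "empty"     # || : no value sent, leave existing data alone
    if value == '""':
        return "null"      # |""| : explicit request to clear the value
    return "valued"
```

Normalizing on the sender side then becomes a matter of deciding, per field, which of the three states the receiver should see.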
11.3 Batch file delimiter confusion
Batch files use FHS/BHS/BTS/FTS wrappers. Mixing standard HL7 messages with batch wrappers, or vice versa, breaks parsers that strictly expect one form. Validate the first few bytes of any inbound file.
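Checking the first few bytes can be sketched in Python as below. The function name and its return labels are assumptions; it skips a leading MLLP start byte and whitespace before looking at the segment ID.

```python
def inbound_form(data: bytes) -> str:
    """Classify an inbound payload by its first segment ID."""
    head = data.lstrip(b"\x0b\r\n \t")[:3]
    if head == b"FHS":
        return "file batch"
    if head == b"BHS":
        return "batch"
    if head == b"MSH":
        return "message"
    return "unknown"
```

Route anything classified as `unknown` to review rather than handing it to a parser that will fail with a less helpful error.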
11.4 Custom Z-segment collisions
Z-segments are site-specific. Two vendors can define ZPI with different semantics. When an interface connects across organizational boundaries, Z-segment semantics must be negotiated or stripped.
11.5 Null return on ACK timeout
Sender ships a message; receiver persists it successfully; network delays ACK past the sender's timeout. Sender assumes failure, retries, receiver double-processes. Fix with MSH.10-based idempotency on the receiver side, not by tuning the timeout.
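Receiver-side idempotency can be sketched as a wrapper that remembers the ACK for each control ID and replays it on a repeat. This Python version keeps state in memory, which is an assumption made for brevity; a real receiver would persist it. The class name, the 7-day default window, and the injectable `clock` are all illustrative.

```python
import time

class IdempotentReceiver:
    """Processes each (sending app, MSH.10) pair at most once per window."""

    def __init__(self, handler, window_s: float = 7 * 24 * 3600, clock=time.time):
        self._handler = handler      # callable: message -> ACK
        self._window_s = window_s
        self._clock = clock
        self._seen = {}              # (app, control_id) -> (timestamp, ack)

    def receive(self, sending_app: str, control_id: str, message: str):
        now = self._clock()
        # Forget entries that have aged out of the duplicate-detection window.
        self._seen = {k: v for k, v in self._seen.items()
                      if now - v[0] <= self._window_s}
        key = (sending_app, control_id)
        if key in self._seen:
            return self._seen[key][1]    # replay the stored ACK, no reprocessing
        ack = self._handler(message)
        self._seen[key] = (now, ack)
        return ack
```

Replaying the original ACK, rather than rejecting the duplicate, is what lets the sender's retry complete cleanly after a lost acknowledgement.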
12. Prevention Checklist
A mature HL7 integration program has the following in place:
- ✓ Every HL7 interface has a documented spec — versions, trigger events, expected segments, encoding, delimiter overrides
- ✓ Character encoding is explicit on both sides of every interface — never implicit or default
- ✓ MLLP framing bytes are verified at connection-establishment time, not just on parse failure
- ✓ Message control IDs are monitored for duplicates — alert on any collision inside a 7-day window
- ✓ All TS fields are stored with explicit offset or normalized to UTC — never naked local time
- ✓ NACK (AE/AR) rates are graphed per interface — a spike triggers an investigation, not just a glance at the dashboard
- ✓ Test harness sends a battery of edge-case messages to every new interface before go-live
- ✓ Control character sanitization runs on every inbound and outbound message
- ✓ Duplicate-detection window is set longer than the longest realistic reprocessing scenario
- ✓ Every interface has a rehearsed rollback procedure for a bad release
- ✓ Integration team reviews NACK samples monthly and adjusts transformations or sender contracts as needed
- ✓ MSH.9 trigger-event combinations are validated against the contracted list — unknown events alert rather than silently discard
For the operational maturity program, see our Mirth Connect issues and fixes catalog and healthcare interoperability guide.
13. Frequently Asked Questions
What is the difference between an AE and AR NACK in HL7?
AE (Application Error) means the receiver understood the message but encountered an error while processing it — bad data, failed business rule, downstream system unavailable. AR (Application Reject) means the receiver rejected the message structurally — wrong format, missing required segment, unsupported trigger event. AE is typically retriable; AR is typically not.
Why do HL7 messages sometimes appear truncated after a special character?
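Almost always character encoding. The sender emitted bytes in one charset (say, a Windows-1252 curly apostrophe, 0x92) and the receiver decoded them as another (say, UTF-8), where that byte is invalid; depending on the decoder, the rest of the field or segment is replaced or dropped. A buffer sized in characters rather than bytes has the same effect on multi-byte UTF-8 text. Fix both sides to the same encoding and declare it in MSH.18.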
What are the MLLP framing bytes?
Start block 0x0B (VT, vertical tab), end block 0x1C (FS, file separator), terminator 0x0D (CR, carriage return). The complete frame is <VT>message<FS><CR>. Drift on any of these three causes messages to hang or get silently dropped.
How do I prevent duplicate HL7 message control IDs?
Use a monotonically increasing ID source with enough entropy for your volume — timestamps in milliseconds plus a sequence counter, or a UUID. Monitor for duplicates in a sliding 7-day window and alert on any collision. Never reset the counter without also reseeding the downstream duplicate-detection window.
What is the right way to handle timezones in HL7 TS fields?
Always include the offset — HL7 v2.5+ supports YYYYMMDDHHMMSS[.S[S[S[S]]]][+/-ZZZZ]. Store in UTC downstream. Never treat a naked timestamp as local time without knowing the sender's timezone, because you almost certainly don't.
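A minimal Python sketch of that rule follows. It accepts second-precision values with optional fractional seconds and offset, and requires the caller to state the assumed timezone for naked values. `ts_to_utc` is a hypothetical helper, and shorter-precision TS values (date only, or hours and minutes) are out of scope here.

```python
import re
from datetime import datetime, timedelta, timezone, tzinfo

_TS = re.compile(r"(\d{14})(?:\.(\d{1,4}))?([+-]\d{4})?")

def ts_to_utc(raw: str, assumed_tz: tzinfo) -> datetime:
    """Parse YYYYMMDDHHMMSS[.S[S[S[S]]]][+/-ZZZZ] and return a UTC datetime."""
    m = _TS.fullmatch(raw.strip())
    if not m:
        raise ValueError(f"unsupported TS value: {raw!r}")
    digits, fraction, offset = m.groups()
    parsed = datetime.strptime(digits, "%Y%m%d%H%M%S")
    if fraction:
        # Pad to microseconds: ".5" becomes 500000.
        parsed = parsed.replace(microsecond=int(fraction.ljust(6, "0")))
    if offset:
        sign = -1 if offset[0] == "-" else 1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
        tz = timezone(sign * delta)
    else:
        tz = assumed_tz  # naked timestamp: the caller must name the zone
    return parsed.replace(tzinfo=tz).astimezone(timezone.utc)
```

For a legacy sender in a known region, pass a DST-aware zone such as `zoneinfo.ZoneInfo("America/Chicago")` as `assumed_tz`; a fixed offset would be wrong for half the year.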
What are the most common MSH field errors?
MSH.3 through MSH.6 (sending and receiving application/facility) misconfigured after environment moves, MSH.9 trigger event mismatch (sender says A01, receiver only accepts A04), MSH.11 processing ID wrong (P versus T), and MSH.12 version mismatch between sender and receiver parsers.
How do I catch encoding errors before they hit production?
Run a pre-go-live test battery that includes messages with accented characters (José, Müller), curly quotes, em dashes, and at least one character that would round-trip poorly between the sender's and receiver's claimed encodings. Verify what the receiver actually persisted, not just what it ACKed.
Can a PID segment really cause an entire interface to break?
Yes. A malformed PID.3 (patient identifier) that fails the receiver's parser will cause hard parse errors. A PID with the wrong number of repetitions for a required cardinality can fail strict validators. And a PID with a stray field separator character in the name will shift every subsequent field by one.
What is trigger-event confusion in HL7?
The sender advertises one trigger event in MSH.9 (say, ADT^A04 registration) but emits a message shaped like a different one (say, with PV1 required for an admit). Receivers that validate strictly reject; receivers that validate loosely process incorrectly. Resolution requires aligning the trigger event to the actual payload, not picking based on business terminology.
When is it okay to auto-retry a failing HL7 message?
When the NACK is AE and the error is transient — database deadlock, downstream system briefly unavailable, rate limit hit. Never auto-retry AR NACKs — those indicate a structural problem that retrying won't fix, and you'll log the same error 10,000 times instead of once.
Related Reading
- HL7 Integration: The Complete Guide
- FHIR Integration: The Complete Guide
- Mirth Connect: The Complete Guide
- EHR Integration Guide
- Healthcare Interoperability Guide
- Mirth Connect MLLP Connection Refused
- Mirth Connect Java Heap Space Error
- Groovy vs JavaScript Transformers
- HIPAA Compliance for Integration Engineers
- Common Mirth Connect Issues & Fixes
- Mirth Connect Installation Guide
- Mirth Support & HL7 Integration USA
- Best HL7 Integration Engines 2026