Transactional outbox
Ironflow uses the transactional outbox pattern to decouple accepting an event from announcing it on NATS. This document explains the architecture, the guarantees it provides, and the operational surface an operator needs to know about.
Introduced in #487 (supersedes #477).
Why an outbox
Every event in Ironflow lives in two places: the `events` table (source of truth, queryable, durable) and the NATS JetStream PUBSUB stream (notification channel consumed by projections and real-time subscriptions). An event is only useful once it exists in both — the DB row without the NATS message means projections never see it; the NATS message without the DB row means consumers act on state that the application can’t prove exists.
Before the outbox, the publish path wrote the DB row, then called NATS, then patched `events.nats_seq` with the assigned sequence. Four paths left `events.nats_seq` permanently `NULL`:
- Legacy `Emit()` API — non-entity events never published to NATS.
- `publisher == nil` (test mode, misconfigured deployments) — no publish call at all.
- Publish failure — the write-back path was documented as non-fatal, so transient NATS outages left the row unpublished.
- The t1→t3 window — between the DB insert and the seq write-back, readers could observe `NULL`. Process crashes in that window made the `NULL` permanent.
These gaps broke `WaitForEvent(eventID)`, which has no way to resolve an event ID to a NATS sequence when the column is `NULL`.
The outbox pattern
```text
AppendEvent / Emit request:
┌───────────────────────────────────────────────────────┐
│ BEGIN txn                                             │
│ INSERT INTO events (..., nats_seq=NULL)               │
│ INSERT INTO outbox (event_id, topic, kind, ...)       │
│ ← optimistic concurrency check happens here           │
│ COMMIT                                                │
└───────────────────────────────────────────────────────┘
                          │
                          ▼
          return to caller (no NATS in the hot path)

outbox worker (one goroutine per node):
┌───────────────────────────────────────────────────────┐
│ SELECT ... FOR UPDATE SKIP LOCKED                     │
│ pub.PublishWithSeqAndMsgID(topic, payload, entry.id)  │
│ UPDATE events SET nats_seq = ack.Sequence             │
│ DELETE FROM outbox WHERE id = entry.id                │
└───────────────────────────────────────────────────────┘
```

The event row and the outbox row ride the same transaction, so either both are written or neither is. Optimistic concurrency conflicts roll back before anything lands on NATS — no orphan messages. NATS outages no longer block event acceptance; the outbox absorbs the backlog and drains when NATS recovers.
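To make the worker column concrete, here is a minimal sketch of one drain pass under the scheme above. The `Publisher` interface, the function names, and the exact SQL are illustrative assumptions, not the actual Ironflow code; the real worker also buckets rows per `entity_id`, tracks `attempts`/`next_retry_at`, and dead-letters poison rows.

```go
// Sketch only: names and signatures are illustrative, not the actual implementation.
package outbox

import (
	"context"
	"database/sql"
)

// Publisher abstracts the JetStream client; assumed to return the assigned
// sequence so it can be written back to events.nats_seq.
type Publisher interface {
	PublishWithSeqAndMsgID(topic string, payload []byte, msgID string) (seq uint64, err error)
}

type entry struct {
	id, eventID, topic string
	payload            []byte
}

// drainOnce claims one batch, publishes it, and marks it done.
// Per-entity bucketing, retry bookkeeping, and dead-lettering are omitted.
func drainOnce(ctx context.Context, db *sql.DB, pub Publisher) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// SKIP LOCKED lets workers on other nodes claim disjoint rows.
	rows, err := tx.QueryContext(ctx, `
		SELECT id, event_id, topic, payload
		  FROM outbox
		 WHERE next_retry_at <= now()
		 ORDER BY created_at
		 LIMIT 32
		   FOR UPDATE SKIP LOCKED`)
	if err != nil {
		return err
	}
	var batch []entry
	for rows.Next() {
		var e entry
		if err := rows.Scan(&e.id, &e.eventID, &e.topic, &e.payload); err != nil {
			rows.Close()
			return err
		}
		batch = append(batch, e)
	}
	rows.Close()

	for _, e := range batch {
		// The outbox entry ID doubles as the JetStream msg-id (dedup key).
		seq, err := pub.PublishWithSeqAndMsgID(e.topic, e.payload, e.id)
		if err != nil {
			continue // leave the row for a later retry (attempts/next_retry_at bookkeeping omitted)
		}
		if _, err := tx.ExecContext(ctx,
			`UPDATE events SET nats_seq = $1 WHERE id = $2`, seq, e.eventID); err != nil {
			return err
		}
		if _, err := tx.ExecContext(ctx,
			`DELETE FROM outbox WHERE id = $1`, e.id); err != nil {
			return err
		}
	}
	return tx.Commit()
}
```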
Guarantees
- Atomicity. Accepting an event and scheduling its publish are one database transaction. Concurrency conflicts leave no trace on NATS.
- Eventual consistency. `events.nats_seq` is `NULL` transiently — typically 10-50ms under healthy conditions, bounded by outbox worker lag. It stops being `NULL` once the worker publishes successfully. Readers that need read-your-writes call `WaitForEvent(eventID)` to block until the row drains.
- No duplicate NATS messages on retry. The worker passes the outbox entry ID as the NATS msg-id; a worker that publishes successfully then crashes before marking the row published will re-publish with the same msg-id, and JetStream’s dedup window returns the original sequence (see the sketch after this list).
- Per-entity FIFO. Within a node, each batch is bucketed by `entity_id` and each bucket runs on a single goroutine that iterates serially. Non-entity rows (plain `Emit` calls) dispatch in parallel with no ordering constraint. Cross-node ordering relies on `SELECT FOR UPDATE SKIP LOCKED` plus a 30s claim lease on `next_retry_at`.
- Poison-row handling. Publishes that fail 10 times (default) move to `outbox_dead_letter`. The main outbox stays small and hot; dead-letter rows remain for manual inspection until explicitly requeued or discarded.
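The dedup guarantee rests on JetStream’s duplicate window. A sketch of how a wrapper along the lines of `PublishWithSeqAndMsgID` could be built on the nats.go client — the function shown here is an assumption, not the actual Ironflow code:

```go
// Sketch only, using the nats.go JetStream client; the actual wrapper may differ.
package outbox

import "github.com/nats-io/nats.go"

// publishWithMsgID publishes with an explicit msg-id. Re-publishing the same
// msg-id inside the stream's duplicate window does not append a new message;
// JetStream acks with the original sequence (ack.Duplicate == true).
func publishWithMsgID(js nats.JetStreamContext, subject string, payload []byte, msgID string) (uint64, error) {
	ack, err := js.Publish(subject, payload, nats.MsgId(msgID))
	if err != nil {
		return 0, err
	}
	return ack.Sequence, nil
}
```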
Non-guarantees
- Global FIFO across entities. Not provided. Projections dedupe by event ID, so cross-entity ordering is not required.
- Zero duplicate deliveries at the consumer. NATS is at-least-once; consumers still need idempotence by event ID (a minimal dedup sketch follows this list).
- `AppendEventResponse.Sequence` in the same round trip. Under the outbox pattern the sequence is not known at `AppendEvent` time. The field is always `0` on the response — callers that need the sequence call `WaitForEvent(eventID)` (now blocking) or `WaitProjectionCatchup` (streaming).
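Because delivery is at-least-once, every consumer needs its own dedup keyed by event ID. A minimal sketch with hypothetical names, using an in-memory set to stand in for whatever state a real projection persists:

```go
// Sketch of consumer-side idempotence keyed by event ID. In-memory only for
// illustration; a real projection would persist what it has already applied.
package consumer

type dedupingHandler struct {
	seen  map[string]struct{}                        // event IDs already applied
	apply func(eventID string, payload []byte) error // the projection's real work
}

func (h *dedupingHandler) Handle(eventID string, payload []byte) error {
	if _, dup := h.seen[eventID]; dup {
		return nil // at-least-once redelivery: already applied, drop it
	}
	if err := h.apply(eventID, payload); err != nil {
		return err
	}
	h.seen[eventID] = struct{}{}
	return nil
}
```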
Operational surface
Schema
Two tables, introduced in migration 011:
- `outbox` — the live queue. Columns: `id, event_id, entity_id, topic, kind, payload, metadata, environment_id, created_at, next_retry_at, attempts, last_error`.
- `outbox_dead_letter` — rows that exceeded the retry budget. Same shape minus `next_retry_at`, plus `dead_at`.
`kind` is `events` (consumed by projections, triggers `UPDATE events.nats_seq` on publish) or `entity` (consumed by entity stream subscriptions only).
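For tooling that reads the table directly, the columns map naturally onto a row struct. The Go types below are assumptions for illustration; migration 011 is the authoritative definition.

```go
// Hypothetical Go mapping of one outbox row; field types are assumptions, not
// the authoritative schema from migration 011.
package tooling

import "time"

type OutboxRow struct {
	ID            string    // primary key; also passed to NATS as the msg-id
	EventID       string    // the event this row will publish
	EntityID      *string   // nil for non-entity (plain Emit) rows
	Topic         string    // NATS subject to publish on
	Kind          string    // "events" or "entity"
	Payload       []byte
	Metadata      []byte
	EnvironmentID string
	CreatedAt     time.Time
	NextRetryAt   time.Time // backoff / claim-lease marker
	Attempts      int
	LastError     *string
}
```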
Tuning
Environment variables on the server binary (not yet wired — defaults only for now):
- `IRONFLOW_OUTBOX_WORKERS` (default 8) — concurrent publish goroutines per batch.
- `IRONFLOW_OUTBOX_BATCH_SIZE` (default 32) — rows claimed per poll.
- `IRONFLOW_OUTBOX_POLL_MS` (default 50) — idle poll interval, in milliseconds.
- `IRONFLOW_OUTBOX_MAX_ATTEMPTS` (default 10) — move a row to the dead-letter table after this many failures.
Inspecting the outbox
For ad-hoc inspection beyond the CLI and HTTP API (covered below), use direct SQL:
```sql
-- How many rows pending?
SELECT COUNT(*) FROM outbox;

-- Oldest pending
SELECT event_id, topic, attempts, last_error, created_at
FROM outbox
ORDER BY created_at ASC
LIMIT 20;

-- Dead-letter summary
SELECT event_id, topic, attempts, last_error, dead_at
FROM outbox_dead_letter
ORDER BY dead_at DESC
LIMIT 20;
```

Recovery
If the outbox backlogs (NATS down, sustained publish failures), the worker keeps retrying with exponential backoff (1s → 10min cap) until NATS recovers. No operator action is normally needed.
If rows land in `outbox_dead_letter`, use the CLI:
```sh
# Inspect what's in the DLQ
ironflow outbox dlq list --env prod

# Push a row back to the outbox for another publish attempt
ironflow outbox dlq requeue <event-id>

# Permanently delete a DLQ row (the underlying event row is kept)
ironflow outbox dlq discard <event-id> --yes
```

The CLI is a thin wrapper over three HTTP routes (handler in `internal/server/outbox_handler.go`):
| Method | Path | Action |
|---|---|---|
| `GET` | `/api/v1/outbox/dead-letter` | List DLQ rows (env-scoped via auth context). |
| `POST` | `/api/v1/outbox/dead-letter/{eventID}/requeue` | Move a DLQ row back to the outbox for retry. |
| `DELETE` | `/api/v1/outbox/dead-letter/{eventID}` | Permanently delete a DLQ row (event row kept). |
Use these directly when integrating with dashboards or external tooling.
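For example, an external tool can requeue a dead-lettered event with a single POST. Everything here besides the route itself (the function name, the bearer-token auth) is an assumption about the surrounding deployment:

```go
// Sketch: requeue a dead-lettered event over the HTTP API from external tooling.
// The bearer-token auth is an assumption about the deployment, not part of this doc.
package tooling

import (
	"context"
	"fmt"
	"net/http"
)

func requeueDLQRow(ctx context.Context, baseURL, token, eventID string) error {
	url := fmt.Sprintf("%s/api/v1/outbox/dead-letter/%s/requeue", baseURL, eventID)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("requeue %s: unexpected status %s", eventID, resp.Status)
	}
	return nil
}
```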
See the Outbox DLQ runbook for the full triage flow and alert rules. For raw-SQL access when the CLI is unavailable, the `outbox_dead_letter` table follows the schema above and accepts the usual requeue pattern (`INSERT ... SELECT ...` into `outbox`, then `DELETE` from `outbox_dead_letter`).
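A sketch of that requeue pattern wrapped in one transaction, in Go for consistency with the other examples; the column handling (resetting `attempts`, setting `next_retry_at` to `now()`) is an assumption rather than a documented procedure:

```go
// Sketch of the raw-SQL requeue pattern as a single transaction. Column choices
// are assumptions based on the schema above, not a canonical recovery script.
package tooling

import (
	"context"
	"database/sql"
)

func requeueFromDLQ(ctx context.Context, db *sql.DB, eventID string) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	if _, err := tx.ExecContext(ctx, `
		INSERT INTO outbox (id, event_id, entity_id, topic, kind, payload,
		                    metadata, environment_id, created_at, next_retry_at,
		                    attempts, last_error)
		SELECT id, event_id, entity_id, topic, kind, payload,
		       metadata, environment_id, created_at, now(), 0, NULL
		  FROM outbox_dead_letter
		 WHERE event_id = $1`, eventID); err != nil {
		return err
	}
	if _, err := tx.ExecContext(ctx,
		`DELETE FROM outbox_dead_letter WHERE event_id = $1`, eventID); err != nil {
		return err
	}
	return tx.Commit()
}
```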
What the SDK change means for callers
The breaking change: `AppendEventResponse.Sequence` and `EntityEmitResult.Sequence` are always `0`. Callers that previously used them for read-your-writes patterns should switch to `WaitForEvent(eventID)`, which blocks until the outbox worker drains the row (or returns a terminal error if the row is dead-lettered):
```go
// before
resp, _ := client.AppendEvent(ctx, req)
_, _ = client.WaitProjectionCatchup(ctx, &WaitReq{MinSeq: resp.Msg.Sequence})

// after
resp, _ := client.AppendEvent(ctx, req)
wait, _ := client.WaitForEvent(ctx, &WaitForEventReq{
	EventId: resp.Msg.EventId,
	Timeout: durationpb.New(5 * time.Second),
})
// wait.Msg.TargetSeq is the resolved NATS seq.
```

Observability
- `ironflow_outbox_dead_letter_count{env}` — gauge, current DLQ depth per environment, computed at scrape time.
- `ironflow_dlq_writes_total{source}` — counter, incremented every time the worker moves an outbox row to the DLQ (`source="outbox"`).
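One way the scrape-time gauge and the DLQ-write counter could be wired with the Prometheus Go client — a sketch, not the actual implementation; the collector, query, and registration are assumptions:

```go
// Sketch of the metrics wiring using the Prometheus Go client; the real
// implementation in Ironflow may differ.
package metrics

import (
	"context"
	"database/sql"

	"github.com/prometheus/client_golang/prometheus"
)

// Counter bumped each time the worker dead-letters a row.
var dlqWrites = prometheus.NewCounterVec(prometheus.CounterOpts{
	Name: "ironflow_dlq_writes_total",
	Help: "Outbox rows moved to the dead-letter table.",
}, []string{"source"})

// dlqDepthCollector computes ironflow_outbox_dead_letter_count{env} at scrape
// time by querying the dead-letter table.
type dlqDepthCollector struct {
	db   *sql.DB
	desc *prometheus.Desc
}

func newDLQDepthCollector(db *sql.DB) *dlqDepthCollector {
	return &dlqDepthCollector{
		db: db,
		desc: prometheus.NewDesc("ironflow_outbox_dead_letter_count",
			"Current DLQ depth per environment.", []string{"env"}, nil),
	}
}

func (c *dlqDepthCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

func (c *dlqDepthCollector) Collect(ch chan<- prometheus.Metric) {
	rows, err := c.db.QueryContext(context.Background(),
		`SELECT environment_id, COUNT(*) FROM outbox_dead_letter GROUP BY environment_id`)
	if err != nil {
		return
	}
	defer rows.Close()
	for rows.Next() {
		var env string
		var n float64
		if rows.Scan(&env, &n) == nil {
			ch <- prometheus.MustNewConstMetric(c.desc, prometheus.GaugeValue, n, env)
		}
	}
}
```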
Alert rules and triage steps live in the Outbox DLQ runbook.
Out of scope for the initial release
- Prometheus publish latency histogram and `ironflow_outbox_backlog_size` gauge (only the DLQ count is wired today).
- Cross-node per-entity advisory locking — the current implementation serializes per-entity only within one node. Multiple nodes can publish concurrently for the same entity; acceptable until cluster scale warrants `pg_try_advisory_xact_lock`.
These are tracked as follow-ups on #487.