# Ironflow Architecture

Ironflow is designed as a Control Plane that decouples workflow state from execution. It collapses the traditional backend stack (Queue + Database + Orchestrator) into a single, high-performance binary.
## The Stack

Ironflow leverages the Go ecosystem to deliver industrial-strength durability with zero external dependencies for local development.
| Layer | Technology | Role |
|---|---|---|
| Gateway | ConnectRPC | Unified HTTP/1.1 and HTTP/2 gateway for Webhooks and Workers. |
| Event Bus | NATS JetStream | The internal “central nervous system” for all events and state changes. |
| State Store | SQLite / Postgres | Persistent storage for workflow runs, steps, organizations, projects, and environments. |
| Execution | Go Goroutines | Lightweight threads that manage state machines and retry logic. |
| Control Plane | React + Tailwind | Embedded Dashboard for real-time monitoring and hot-patching. |
## Execution Modes

Ironflow's answer to the Serverless Trilemma is that you define a function once and can execute it in two different patterns simultaneously.
### Mode A: Push (Serverless)

- Mechanism: Ironflow makes an outbound HTTP POST to your endpoint (Next.js, Lambda, etc.).
- Ideal for: High-concurrency, stateless, short-lived tasks (under 10s).
- Benefit: Scales to zero; works with any HTTP-capable platform.
### Mode B: Pull (Workers)

- Mechanism: Your application opens a long-lived gRPC stream to Ironflow and pulls jobs.
- Ideal for: GPU workloads, video processing, or long-running reasoning chains (minutes/hours).
- Benefit: Persistent connections bypass serverless timeout limits.
## Durable Execution Engine

At the heart of Ironflow is a Write-Ahead-Log (WAL) approach to workflow logic.
- Memoization: Before running any step, the engine checks the database. If a successful result exists, it is returned instantly.
- Persistence: New results are committed to the database before the SDK proceeds to the next step.
- Observability: Every state change emits a system event to NATS, powering the Dashboard and TUI without extra database queries.
## Internal Networking

### Synchronous Triggers (await)

When a client triggers a workflow with `wait: true`, Ironflow parks the request goroutine and subscribes to an internal NATS topic: `ironflow.{project_name}.{env_name}.results.run.{run_id}`. When the worker completes the final step, the result is published to that topic, waking the goroutine to deliver the HTTP response. This provides a “Sync-over-Async” experience with zero CPU polling.
### Smart Concurrency Lanes

Ironflow implements per-function concurrency limits using virtual “Lanes.” When a function is configured with a `concurrencyKey` (e.g., `event.data.orgId`), the engine creates an in-memory lane keyed by the extracted value and enforces a per-key active-job limit. Jobs over the limit are queued rather than dropped. This prevents a single user from starving the entire worker pool.
## Security Model

Ironflow operates on an Identity-Embedded model.
- Single-Binary Auth: On first boot, the engine auto-bootstraps a root Organization and an Admin API Key.
- Resource Naming: Every internal resource follows the IRN (Ironflow Resource Name) format: `irn:ironflow:{org}:{project}:{type}:{env}:{id}`.
- Zero-Trust: All internal service communication (Engine to Worker) is authenticated via the same API Key system as the external REST API.
## Debugging & Observability

Ironflow includes built-in debugging and observability features:
- History Navigation (time-travel debugging): Reconstruct run state at any historical timestamp by replaying recorded events. Available via the CLI (`ironflow inspect --at`, `--replay`), TUI, SDK, and dashboard.
- History Editing (scoped injection): Pause running workflows at step boundaries, inspect completed step outputs, inject modified data, and resume execution. Supports both push and pull modes.
- Circuit Breaker: Push endpoints are protected by a circuit breaker (keyed by function ID + endpoint URL) that opens when repeated errors occur. State is persisted in NATS KV (`SYS_circuit_breakers`) and shared across cluster nodes, surviving restarts and rolling deploys. The Dashboard shows circuit state badges on the Functions page.
- Continuous History Log (audit stream): An append-only event log per function records step starts, completions, failures, injections, and saga compensations. This log is the foundation that powers history navigation: the audit trail is always there.
- OpenTelemetry & Prometheus: Optional tracing via an OTLP gRPC exporter with W3C propagation, plus Prometheus metrics at `/metrics`. Zero overhead when disabled.
## What Makes it Different?

- Collapsing the Stack: Replaces Redis (Queue), Postgres (State), and Temporal (Orchestrator) with one binary.
- Continuous History: Because every step is a recorded fact in one continuous history, you can navigate to any moment and correct any step in real-time.
- Pure Go: No CGO dependencies (ModernC SQLite), making Ironflow portable to any architecture without a toolchain.