# Ironflow Architecture

Ironflow is designed as a Control Plane that decouples workflow state from execution. It collapses the traditional backend stack (Queue + Database + Orchestrator) into a single, high-performance binary.
## The Stack

Ironflow leverages the Go ecosystem to deliver industrial-strength durability with zero external dependencies for local development.
| Layer | Technology | Role |
|---|---|---|
| Gateway | ConnectRPC | Unified HTTP/1.1 and HTTP/2 gateway for Webhooks and Workers. |
| Event Bus | NATS JetStream | The internal “central nervous system” for all events and state changes. |
| State Store | SQLite / Postgres | Persistent storage for workflow runs, steps, organizations, projects, and environments. |
| Execution | Go Goroutines | Lightweight threads that manage state machines and retry logic. |
| Control Plane | React + Tailwind | Embedded Dashboard for real-time monitoring and hot-patching. |
## Execution Modes

Ironflow's answer to the Serverless Trilemma is that you define a function once and can execute it in two different patterns simultaneously.
### Mode A: Push (Serverless)

- Mechanism: Ironflow makes an outbound HTTP POST to your endpoint (Next.js, Lambda, etc.).
- Ideal for: High-concurrency, stateless, short-lived tasks (under 10s).
- Benefit: Scales to zero; works with any HTTP-capable platform.
### Mode B: Pull (Workers)

- Mechanism: Your application opens a long-lived gRPC stream to Ironflow and pulls jobs.
- Ideal for: GPU workloads, video processing, or long-running reasoning chains (minutes/hours).
- Benefit: Persistent connections bypass serverless timeout limits.
## Durable Execution Engine

At the heart of Ironflow is a Write-Ahead-Log (WAL) approach to workflow logic.
- Memoization: Before running any step, the engine checks the database. If a successful result exists, it is returned instantly.
- Persistence: New results are committed to the database before the SDK proceeds to the next step.
- Observability: Every state change emits a system event to NATS, powering the Dashboard and TUI without extra database queries.
## Internal Networking

### Synchronous Triggers (await)

When a client triggers a workflow with `wait: true`, Ironflow parks the request goroutine and subscribes to an internal NATS topic: `ironflow.{project_name}.{env_name}.results.run.{run_id}`. When the worker completes the final step, the result is published to that topic, waking the goroutine to deliver the HTTP response. This provides a “Sync-over-Async” experience with zero CPU polling.
### Smart Concurrency Lanes

Ironflow implements per-function concurrency limits using virtual “Lanes.” When a function is configured with a `concurrencyKey` (e.g., `event.data.orgId`), the engine creates an in-memory lane keyed by the extracted value and enforces a per-key active-job limit. Jobs over the limit are queued rather than dropped. This prevents a single user from starving the entire worker pool.
## Security Model

Ironflow operates on an Identity-Embedded model.
- Single-Binary Auth: On first boot, the engine auto-bootstraps a root Organization and an Admin API Key.
- Resource Naming: Every internal resource follows the IRN (Ironflow Resource Name) format: `irn:ironflow:{org}:{project}:{type}:{env}:{id}`.
- Zero-Trust: All internal service communication (Engine to Worker) is authenticated via the same API Key system as the external REST API.
## Debugging & Observability

Ironflow includes built-in debugging and observability features:
- History Navigation (time-travel debugging): Reconstruct run state at any historical timestamp by replaying recorded events. Available via the CLI (`ironflow inspect --at`, `--replay`), TUI, SDK, and dashboard.
- History Editing (scoped injection): Pause running workflows at step boundaries, inspect completed step outputs, inject modified data, and resume execution. Supports both push and pull modes.
- Circuit Breaker: Push endpoints are protected by a circuit breaker (keyed by function ID + endpoint URL) that opens when repeated errors occur. State is persisted in NATS KV (`SYS_circuit_breakers`) and shared across cluster nodes, surviving restarts and rolling deploys. The Dashboard shows circuit state badges on the Functions page.
- Continuous History Log (audit stream): An append-only event log per function records step starts, completions, failures, injections, and saga compensations. This log is the foundation that powers history navigation: the audit trail is always there.
- OpenTelemetry & Prometheus: Optional tracing via an OTLP gRPC exporter with W3C propagation, plus Prometheus metrics at `/metrics`. Zero overhead when disabled.
## What Makes it Different?

- Collapsing the Stack: Replaces Redis (Queue), Postgres (State), and Temporal (Orchestrator) with one binary.
- Continuous History: Because every step is a recorded fact in one continuous history, you can navigate to any moment and correct any step in real-time.
- Pure Go: No CGO dependencies (ModernC SQLite), making Ironflow portable to any architecture without a toolchain.