Architecture

Ironflow is designed as a Control Plane that decouples workflow state from execution. It collapses the traditional backend stack (Queue + Database + Orchestrator) into a single, high-performance binary.

Continuous History: Entity Lifecycle — showing how events, workflows, projections, and time-travel connect in one unified history


Ironflow leverages the Go ecosystem to deliver industrial-strength durability with zero external dependencies for local development.

| Layer | Technology | Role |
| --- | --- | --- |
| Gateway | ConnectRPC | Unified HTTP/1.1 and HTTP/2 gateway for Webhooks and Workers. |
| Event Bus | NATS JetStream | The internal “central nervous system” for all events and state changes. |
| State Store | SQLite / Postgres | Persistent storage for workflow runs, steps, organizations, projects, and environments. |
| Execution | Go Goroutines | Lightweight threads that manage state machines and retry logic. |
| Control Plane | React + Tailwind | Embedded Dashboard for real-time monitoring and hot-patching. |

Ironflow is unique in how it addresses the Serverless Trilemma: you define a function once and can execute it in two different patterns simultaneously.

Push (outbound webhooks):

  • Mechanism: Ironflow makes an outbound HTTP POST to your endpoint (Next.js, Lambda, etc.).
  • Ideal for: High-concurrency, stateless, short-lived tasks (under 10s).
  • Benefit: Scales to zero; works with any HTTP-capable platform.

Pull (streaming workers):

  • Mechanism: Your application opens a long-lived gRPC stream to Ironflow and pulls jobs.
  • Ideal for: GPU workloads, video processing, or long-running reasoning chains (minutes/hours).
  • Benefit: Persistent connections bypass serverless timeout limits.
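The push side can be sketched as an ordinary HTTP handler that receives the engine's POST and returns the step result. The payload shape below (`StepRequest`, the `/ironflow/step` path) is illustrative, not Ironflow's actual wire format:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// StepRequest is a hypothetical shape for the engine's outbound POST
// in push mode; the real Ironflow payload may differ.
type StepRequest struct {
	RunID string          `json:"run_id"`
	Step  string          `json:"step"`
	Data  json.RawMessage `json:"data"`
}

// handleStep does the actual work for one stateless, short-lived step.
func handleStep(req StepRequest) map[string]string {
	return map[string]string{"run_id": req.RunID, "status": "completed"}
}

// pushEndpoint is the HTTP entry point the engine POSTs to.
func pushEndpoint(w http.ResponseWriter, r *http.Request) {
	var req StepRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	json.NewEncoder(w).Encode(handleStep(req))
}

func main() {
	http.HandleFunc("/ironflow/step", pushEndpoint)
	fmt.Println("push worker listening on :8080")
	// Uncomment to serve for real:
	// http.ListenAndServe(":8080", nil)
}
```

Because the worker is just an HTTP endpoint, it can run on any platform that accepts requests, which is what lets push mode scale to zero.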

At the heart of Ironflow is a Write-Ahead-Log (WAL) approach to workflow logic.

Durable Workflow Execution — showing memoized steps, failure recovery, and parallel branches

Durable Execution Engine — SDK step calls flow through cache check, execution, and persistence, connected to State Store and Event Stream

  1. Memoization: Before running any step, the engine checks the database. If a successful result exists, it is returned instantly.
  2. Persistence: New results are committed to the database before the SDK proceeds to the next step.
  3. Observability: Every state change emits a system event to NATS, powering the Dashboard and TUI without extra database queries.
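The memoization and persistence steps above can be sketched as a check-execute-commit loop. The `stepStore` type and its field names are illustrative stand-ins for the State Store, not Ironflow's actual SDK:

```go
package main

import "fmt"

// stepStore is a stand-in for the engine's State Store (SQLite/Postgres).
type stepStore struct {
	results map[string]string // key: runID + "/" + stepName
	execs   int               // how many times work actually ran
}

// Step runs fn at most once per (runID, stepName): a cached result is
// returned instantly (memoization); a new result is persisted before
// the workflow proceeds to the next step.
func (s *stepStore) Step(runID, stepName string, fn func() string) string {
	key := runID + "/" + stepName
	if res, ok := s.results[key]; ok {
		return res // replay: skip execution entirely
	}
	res := fn()          // execute the step
	s.results[key] = res // commit BEFORE moving on
	s.execs++
	return res
}

func main() {
	store := &stepStore{results: map[string]string{}}
	// The second pass simulates a crash and replay of the workflow:
	// the side-effecting step runs only once.
	for i := 0; i < 2; i++ {
		store.Step("run-1", "charge-card", func() string { return "ok" })
	}
	fmt.Println("executions:", store.execs)
}
```

This is why a replayed workflow is safe: side effects behind `Step` are never re-executed once a successful result is on disk.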

When a client triggers a workflow with wait: true, Ironflow parks the request goroutine and subscribes to an internal NATS topic: ironflow.{project_name}.{env_name}.results.run.{run_id}. When the worker completes the final step, the result is published to that topic, waking the goroutine to deliver the HTTP response. This provides a “Sync-over-Async” experience with zero CPU polling.
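The parking mechanism can be sketched with channels standing in for NATS subscriptions. The `bus` type here is a toy in-memory substitute; only the topic naming scheme comes from the text above:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bus is a toy stand-in for NATS JetStream: one buffered channel per
// result topic, following ironflow.{project}.{env}.results.run.{run_id}.
type bus struct {
	mu     sync.Mutex
	topics map[string]chan string
}

func (b *bus) subscribe(topic string) chan string {
	b.mu.Lock()
	defer b.mu.Unlock()
	ch := make(chan string, 1)
	b.topics[topic] = ch
	return ch
}

func (b *bus) publish(topic, msg string) {
	b.mu.Lock()
	ch := b.topics[topic]
	b.mu.Unlock()
	if ch != nil {
		ch <- msg
	}
}

// awaitRun parks the calling goroutine on the run's result topic:
// no CPU polling, the goroutine sleeps until the worker publishes.
func awaitRun(b *bus, project, env, runID string) string {
	topic := fmt.Sprintf("ironflow.%s.%s.results.run.%s", project, env, runID)
	return <-b.subscribe(topic)
}

func main() {
	b := &bus{topics: map[string]chan string{}}
	go func() { // the worker completes the final step a moment later
		time.Sleep(10 * time.Millisecond)
		b.publish("ironflow.shop.prod.results.run.42", `{"status":"ok"}`)
	}()
	fmt.Println(awaitRun(b, "shop", "prod", "42"))
}
```

The blocked channel receive is what makes this Sync-over-Async: the HTTP handler's goroutine consumes no CPU while parked, and Go schedules thousands of such waiters cheaply.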

Ironflow implements per-function concurrency limits using virtual “Lanes.” When a function is configured with a concurrencyKey (e.g., event.data.orgId), the engine creates an in-memory lane keyed by the extracted value and enforces a per-key active-job limit. Jobs over the limit are queued rather than dropped. This prevents a single user from starving the entire worker pool.
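A minimal sketch of keyed lanes, assuming an in-memory map and a hypothetical `Submit`/`Done` API (the real engine's types are not shown in this document):

```go
package main

import "fmt"

// lane tracks active and queued jobs for one concurrency-key value
// (e.g. one orgId extracted from event.data.orgId).
type lane struct {
	limit  int
	active int
	queue  []string // job IDs waiting for a slot
}

type laneManager struct{ lanes map[string]*lane }

// Submit admits the job if the key's lane has a free slot; otherwise
// the job is queued, never dropped.
func (m *laneManager) Submit(key, jobID string, limit int) bool {
	l := m.lanes[key]
	if l == nil {
		l = &lane{limit: limit}
		m.lanes[key] = l
	}
	if l.active < l.limit {
		l.active++
		return true // running now
	}
	l.queue = append(l.queue, jobID)
	return false // queued behind this key only
}

// Done releases a slot and promotes the next queued job, if any.
func (m *laneManager) Done(key string) (next string, ok bool) {
	l := m.lanes[key]
	l.active--
	if len(l.queue) > 0 {
		next, l.queue = l.queue[0], l.queue[1:]
		l.active++
		return next, true
	}
	return "", false
}

func main() {
	m := &laneManager{lanes: map[string]*lane{}}
	fmt.Println(m.Submit("org-a", "job-1", 2)) // true: runs
	fmt.Println(m.Submit("org-a", "job-2", 2)) // true: runs
	fmt.Println(m.Submit("org-a", "job-3", 2)) // false: queued
	fmt.Println(m.Submit("org-b", "job-4", 2)) // true: other keys unaffected
}
```

Because limits are enforced per key, a burst from one org fills only that org's lane while every other key keeps its full quota.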


Ironflow operates on an Identity-Embedded model.

  • Single-Binary Auth: On first boot, the engine auto-bootstraps a root Organization and an Admin API Key.
  • Resource Naming: Every internal resource follows the IRN (Ironflow Resource Name) format: irn:ironflow:{org}:{project}:{type}:{env}:{id}.
  • Zero-Trust: All internal service communication (Engine to Worker) is authenticated via the same API Key system as the external REST API.
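The IRN format above is simple enough to parse with a split on colons. The parser below is an illustrative sketch, not Ironflow's own code:

```go
package main

import (
	"fmt"
	"strings"
)

// IRN models the documented format
// irn:ironflow:{org}:{project}:{type}:{env}:{id}.
type IRN struct {
	Org, Project, Type, Env, ID string
}

// ParseIRN validates the fixed prefix and the seven-segment layout.
func ParseIRN(s string) (IRN, error) {
	parts := strings.Split(s, ":")
	if len(parts) != 7 || parts[0] != "irn" || parts[1] != "ironflow" {
		return IRN{}, fmt.Errorf("malformed IRN: %q", s)
	}
	return IRN{parts[2], parts[3], parts[4], parts[5], parts[6]}, nil
}

func main() {
	irn, err := ParseIRN("irn:ironflow:acme:shop:function:prod:f_123")
	fmt.Println(irn, err)
}
```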

Ironflow includes built-in debugging and observability features:

  • History Navigation (time-travel debugging): Reconstruct run state at any historical timestamp by replaying recorded events. Available via CLI (ironflow inspect --at, --replay), TUI, SDK, and dashboard.
  • History Editing (scoped injection): Pause running workflows at step boundaries, inspect completed step outputs, inject modified data, and resume execution. Supports both push and pull modes.
  • Circuit Breaker: Push endpoints are protected by a circuit breaker (keyed by function ID + endpoint URL) that opens when repeated errors occur. State is persisted in NATS KV (SYS_circuit_breakers) and shared across cluster nodes, surviving restarts and rolling deploys. Dashboard shows circuit state badges on the Functions page.
  • Continuous History Log (audit stream): Append-only event log per function recording step starts, completions, failures, injections, and saga compensations. This is the foundation that powers history navigation — because the audit trail was always there.
  • OpenTelemetry & Prometheus: Optional tracing via OTLP gRPC exporter with W3C propagation, and Prometheus metrics at /metrics. Zero overhead when disabled.
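The circuit-breaker keying described above can be sketched with a consecutive-failure counter per (function ID, endpoint URL) pair. The threshold and the in-memory map are illustrative; per the text, the real engine persists this state in NATS KV (`SYS_circuit_breakers`) so it survives restarts:

```go
package main

import "fmt"

// breaker counts consecutive push failures per (functionID, endpointURL).
type breaker struct {
	threshold int
	failures  map[string]int
}

func key(functionID, url string) string { return functionID + "|" + url }

// Allow reports whether a push to this endpoint may proceed
// (the circuit is still closed).
func (b *breaker) Allow(functionID, url string) bool {
	return b.failures[key(functionID, url)] < b.threshold
}

// Record updates the failure count after a push attempt; a success
// closes the circuit again.
func (b *breaker) Record(functionID, url string, ok bool) {
	k := key(functionID, url)
	if ok {
		b.failures[k] = 0
		return
	}
	b.failures[k]++
}

func main() {
	b := &breaker{threshold: 3, failures: map[string]int{}}
	for i := 0; i < 3; i++ {
		b.Record("fn-1", "https://api.example.com/hook", false)
	}
	fmt.Println(b.Allow("fn-1", "https://api.example.com/hook")) // false: circuit open
	fmt.Println(b.Allow("fn-2", "https://api.example.com/hook")) // true: other functions unaffected
}
```

Keying by function ID plus endpoint URL means one failing webhook target stops receiving pushes without tripping the breaker for any other function or endpoint.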

  1. Collapsing the Stack: Replaces Redis (Queue), Postgres (State), and Temporal (Orchestrator) with one binary.
  2. Continuous History: Because every step is a recorded fact in one continuous history, you can navigate to any moment and correct any step in real-time.
  3. Pure Go: No CGO dependencies (ModernC SQLite), making Ironflow portable to any architecture without a toolchain.