
# Benchmarks

Ironflow ships a benchmark suite that measures component throughput and end-to-end latency. Use it to establish baselines for your hardware and detect regressions.

```sh
# Component benchmarks (in-process, no server needed)
make bench
# Load tests (starts server, runs k6 scripts, captures pprof)
make loadtest
# Both
make bench-all
```
| Tool | Required for | Install |
| --- | --- | --- |
| Go 1.25+ | `make bench` | Already required |
| k6 | `make loadtest` | `brew install k6` |

Both `make bench` and `make loadtest` run against in-memory SQLite with embedded NATS; a PostgreSQL comparison is not currently wired into the bench pipeline.

`make bench` runs Go `testing.B` benchmarks against in-memory SQLite with embedded NATS. No running server is needed.

What’s measured:

| Category | Benchmarks |
| --- | --- |
| Store | CreateRun, CreateRun (parallel), GetRun, CreateStep, CreateEvent, ListRuns |
| Engine | StepMemoLookup, CreateAndCompleteStep |
| NATS | Publish, KV Put, KV Get, PublishSubscribe |
| Pattern | Parse, Match, MatchWildcard, PatternMatcher |
| API | EmitEvent, GetRun, ListEvents, HealthCheck |
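
These are ordinary Go `testing.B` functions. As a rough illustration of their shape, here is a minimal sketch; the `memStore` stub and the `CreateRun` signature are stand-ins for this example, not Ironflow's actual API.

```go
package store_test

import (
	"context"
	"fmt"
	"sync/atomic"
	"testing"
)

// memStore is a stub so the sketch compiles; Ironflow's real store type
// and CreateRun signature are assumptions here.
type memStore struct{ seq atomic.Int64 }

func (s *memStore) CreateRun(ctx context.Context, name string) (string, error) {
	return fmt.Sprintf("%s-%d", name, s.seq.Add(1)), nil
}

func BenchmarkStore_CreateRun(b *testing.B) {
	st := &memStore{}
	ctx := context.Background()
	b.ReportAllocs() // emit the B/op and allocs/op columns
	b.ResetTimer()   // exclude setup from the measurement
	for i := 0; i < b.N; i++ {
		if _, err := st.CreateRun(ctx, "bench"); err != nil {
			b.Fatal(err)
		}
	}
}

// The "CreateRun (parallel)" variant: RunParallel fans the loop out
// across GOMAXPROCS goroutines.
func BenchmarkStore_CreateRunParallel(b *testing.B) {
	st := &memStore{}
	ctx := context.Background()
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			if _, err := st.CreateRun(ctx, "bench"); err != nil {
				b.Error(err) // Error, not Fatal: Fatal must stay on the main benchmark goroutine
			}
		}
	})
}
```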

The suite also includes TestGoroutineLeak (asserts no goroutine leak after 500 step cycles) and TestBootTime (cold start to first /health response, 10 iterations, reporting median and p95; it logs a warning above 500ms but does not fail).
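
A minimal sketch of the leak-check pattern, assuming a plain before/after goroutine count (the actual test may use a different mechanism, and `runStepCycle` is a hypothetical stand-in for the real step workload):

```go
package engine_test

import (
	"runtime"
	"testing"
	"time"
)

// runStepCycle stands in for creating and completing one step.
func runStepCycle() {
	done := make(chan struct{})
	go func() { close(done) }()
	<-done
}

func TestGoroutineLeak(t *testing.T) {
	before := runtime.NumGoroutine()
	for i := 0; i < 500; i++ {
		runStepCycle()
	}
	time.Sleep(100 * time.Millisecond) // give spawned goroutines time to exit
	if after := runtime.NumGoroutine(); after > before+2 { // small tolerance for runtime noise
		t.Fatalf("possible goroutine leak: %d goroutines before, %d after", before, after)
	}
}
```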

A benchmark output line breaks down like this:

```
BenchmarkStore_CreateRun-10    85423    14025 ns/op    2048 B/op    42 allocs/op
│                        │     │        │              │            │
│                        │     │        │              │            └ heap allocations per op
│                        │     │        │              └ bytes allocated per op
│                        │     │        └ nanoseconds per operation
│                        │     └ iterations run
│                        └ GOMAXPROCS
└ benchmark name
```

Lower `ns/op` = faster. Lower `B/op` and `allocs/op` = less GC pressure.
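
The same numbers are available programmatically through `testing.Benchmark`, which is a handy way to see where each column comes from; the `fmt.Sprintf` body here is just a stand-in workload:

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	// testing.Benchmark runs the function with the same iteration
	// scaling that `go test -bench` uses and returns the result.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs() // required for the allocation columns to be populated
		for i := 0; i < b.N; i++ {
			_ = fmt.Sprintf("run-%d", i) // stand-in for the operation under test
		}
	})

	// These map one-to-one onto the columns of a -bench output line.
	fmt.Printf("%d iterations, %d ns/op, %d B/op, %d allocs/op\n",
		res.N, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```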

`make loadtest` starts an Ironflow server with `serve --dev --port 9199 --pprof` (with `IRONFLOW_METRICS_ENABLED=true` and a dedicated bench DB), registers the SDK benchmark worker from tests/loadtest/functions/, then runs the k6 scripts over a 3m30s ramp.

What’s measured:

| Script | Peak VUs | Measures |
| --- | --- | --- |
| event-emission.js | 100 | Event ingest throughput and latency |
| mixed-workload.js | 100 | Weighted mix (40% emit, 20% list runs, 20% list events, 10% functions, 10% health) |
| function-invoke.js | 50 | Function trigger-to-completion time |
| event-to-projection.js | 50 | End-to-end event → projection latency via WebSocket |
| cancel-on-event.js | varies | Event-driven run cancellation latency |
| policy-eval.js | varies | CEL policy evaluation throughput |
Sample k6 summary output:

```
http_req_duration...: avg=3.19ms min=245µs med=2.23ms max=27.3ms p(90)=7.67ms p(95)=9.28ms
http_req_failed.....: 0.00%  0 out of 134602
http_reqs...........: 134602 640.899/s
```
- Thresholds are per-script: event-emission enforces p95 < 500ms and failure rate < 1%; mixed-workload p95 < 1s and failure < 1%; event-to-projection p95 < 2s; function-invoke p95 < 10s and failure < 5%; policy-eval failure < 1%.
- Results and pprof profiles are saved to tests/loadtest/results/{timestamp}/; captures include heap-before.prof, heap-after.prof, goroutine-before.prof, and goroutine-after.prof.
```sh
# Compare heap before/after load
go tool pprof -diff_base results/heap-before.prof results/heap-after.prof
# Check goroutine state
go tool pprof results/goroutine-after.prof
```
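
If you want to produce comparable before/after pairs outside the harness, here is a sketch using in-process `runtime/pprof`. The file names mirror the harness's captures; how the harness itself grabs profiles (presumably via the server's `--pprof` endpoint) is not shown here.

```go
package main

import (
	"os"
	"runtime"
	"runtime/pprof"
)

// writeProfile dumps a named runtime profile ("heap", "goroutine", ...)
// in the protobuf format that `go tool pprof` reads.
func writeProfile(name, path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return pprof.Lookup(name).WriteTo(f, 0)
}

func main() {
	// Force a GC first so the heap profile reflects live objects only.
	runtime.GC()
	writeProfile("heap", "heap-before.prof")           // errors ignored for brevity
	writeProfile("goroutine", "goroutine-before.prof")

	// ... apply load here ...

	runtime.GC()
	writeProfile("heap", "heap-after.prof")
	writeProfile("goroutine", "goroutine-after.prof")
}
```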

The Grafana performance dashboard is included in the Helm chart at `deploy/helm/ironflow/dashboards/ironflow-performance.json`. When deployed with `monitoring.dashboards.enabled=true`, Grafana auto-imports it via sidecar. For standalone Grafana, import the JSON file directly. Requires `IRONFLOW_METRICS_ENABLED=true`.