# Benchmarks
Ironflow ships a benchmark suite that measures component throughput and end-to-end latency. Use it to establish baselines for your hardware and detect regressions.
## Quick Start

```sh
# Component benchmarks (in-process, no server needed)
make bench

# Load tests (starts server, runs k6 scripts, captures pprof)
make loadtest

# Both
make bench-all
```

## Prerequisites
| Tool | Required for | Install |
|---|---|---|
| Go 1.25+ | `make bench` | Already required |
| k6 | `make loadtest` | `brew install k6` |
`make bench` and `make loadtest` both run against in-memory SQLite with embedded NATS. PostgreSQL comparison is not currently wired into the bench pipeline.
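For context, "embedded NATS" means the broker runs inside the benchmark process rather than as a separate daemon. Below is a minimal sketch of that pattern using the `nats-server` library; it is illustrative only, and Ironflow's own test wiring may differ:

```go
package bench

import (
	"testing"
	"time"

	"github.com/nats-io/nats-server/v2/server"
	"github.com/nats-io/nats.go"
)

// startEmbeddedNATS runs a NATS server inside the test process on a
// random free port and returns a client connected to it.
// Illustrative sketch; not Ironflow's actual setup code.
func startEmbeddedNATS(t testing.TB) *nats.Conn {
	t.Helper()
	ns, err := server.NewServer(&server.Options{Port: -1}) // -1 = pick a random free port
	if err != nil {
		t.Fatal(err)
	}
	go ns.Start()
	if !ns.ReadyForConnections(5 * time.Second) {
		t.Fatal("embedded NATS did not start in time")
	}
	t.Cleanup(ns.Shutdown)

	nc, err := nats.Connect(ns.ClientURL())
	if err != nil {
		t.Fatal(err)
	}
	t.Cleanup(nc.Close)
	return nc
}
```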
## Component Benchmarks (`make bench`)

Runs Go `testing.B` benchmarks against in-memory SQLite with embedded NATS. No running server needed.
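Because these are standard `go test` benchmarks, you can also run a subset directly, e.g. `go test -bench CreateRun -benchmem ./...` from the repo root (the exact package paths depend on the repository layout).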
What’s measured:
| Category | Benchmarks |
|---|---|
| Store | CreateRun, CreateRun (parallel), GetRun, CreateStep, CreateEvent, ListRuns |
| Engine | StepMemoLookup, CreateAndCompleteStep |
| NATS | Publish, KV Put, KV Get, PublishSubscribe |
| Pattern | Parse, Match, MatchWildcard, PatternMatcher |
| API | EmitEvent, GetRun, ListEvents, HealthCheck |
Plus `TestGoroutineLeak` (asserts no goroutine leak after 500 step cycles) and `TestBootTime` (cold start to `/health` response: 10 iterations, reports median and p95; logs a warning above 500ms but does not fail).
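To make the table and the output columns concrete, here is a minimal sketch of benchmarks in this style. The `memStore` type and its method are toy stand-ins invented for illustration; the real Store benchmarks exercise the SQLite-backed store:

```go
package bench

import (
	"runtime"
	"strconv"
	"sync"
	"sync/atomic"
	"testing"
)

// memStore is a toy stand-in for the real store (illustration only).
type memStore struct {
	mu   sync.Mutex
	runs map[string]struct{}
}

func (s *memStore) CreateRun(id string) {
	s.mu.Lock()
	s.runs[id] = struct{}{}
	s.mu.Unlock()
}

func BenchmarkCreateRun(b *testing.B) {
	s := &memStore{runs: make(map[string]struct{})}
	b.ReportAllocs() // emit the B/op and allocs/op columns
	for i := 0; i < b.N; i++ {
		s.CreateRun("run-" + strconv.Itoa(i))
	}
}

// Parallel flavor, as in "CreateRun (parallel)": RunParallel spreads
// the b.N iterations across GOMAXPROCS goroutines.
func BenchmarkCreateRunParallel(b *testing.B) {
	s := &memStore{runs: make(map[string]struct{})}
	b.ReportAllocs()
	var n int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			s.CreateRun("run-" + strconv.FormatInt(atomic.AddInt64(&n, 1), 10))
		}
	})
}

// Leak check in the spirit of TestGoroutineLeak: the goroutine count
// must return to its baseline once the workload finishes.
func TestNoGoroutineLeak(t *testing.T) {
	before := runtime.NumGoroutine()
	s := &memStore{runs: make(map[string]struct{})}
	for i := 0; i < 500; i++ {
		s.CreateRun(strconv.Itoa(i)) // placeholder for a real step cycle
	}
	if after := runtime.NumGoroutine(); after > before {
		t.Fatalf("goroutine leak: %d before, %d after", before, after)
	}
}
```

Note that `b.ReportAllocs()` opts a benchmark into memory statistics, so the B/op and allocs/op columns appear without needing the `-benchmem` flag.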
## Reading the Output

```text
BenchmarkStore_CreateRun-10    85423    14025 ns/op    2048 B/op    42 allocs/op
│                       │      │        │              │            │
│                       │      │        │              │            └ heap allocations per op
│                       │      │        │              └ bytes allocated per op
│                       │      │        └ nanoseconds per operation
│                       │      └ iterations run
│                       └ GOMAXPROCS
└ benchmark name
```

Lower ns/op = faster. Lower B/op and allocs/op = less GC pressure.
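To detect regressions, save the output of two runs to files and compare them with `benchstat` (from golang.org/x/perf), which reports per-benchmark deltas with statistical significance.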
## Load Tests (`make loadtest`)

Starts an Ironflow server with `serve --dev --port 9199 --pprof` (with `IRONFLOW_METRICS_ENABLED=true` and a dedicated bench DB), registers the SDK benchmark worker from `tests/loadtest/functions/`, then runs k6 scripts over a 3m30s ramp.
What’s measured:
| Script | Peak VUs | Metric |
|---|---|---|
| `event-emission.js` | 100 | Event ingest throughput and latency |
| `mixed-workload.js` | 100 | Weighted mix (40% emit, 20% list runs, 20% list events, 10% functions, 10% health) |
| `function-invoke.js` | 50 | Function trigger-to-completion time |
| `event-to-projection.js` | 50 | End-to-end event → projection latency via WebSocket |
| `cancel-on-event.js` | varies | Event-driven run cancellation latency |
| `policy-eval.js` | varies | CEL policy evaluation throughput |
## Reading k6 Output

```text
http_req_duration...: avg=3.19ms min=245µs med=2.23ms max=27.3ms p(90)=7.67ms p(95)=9.28ms
http_req_failed.....: 0.00%   0 out of 134602
http_reqs...........: 134602  640.899/s
```

- Thresholds are per-script: `event-emission` enforces p95 < 500ms / failure < 1%, `mixed-workload` p95 < 1s / failure < 1%, `event-to-projection` p95 < 2s, `function-invoke` p95 < 10s / failure < 5%, `policy-eval` failure < 1%
- Results and pprof profiles are saved to `tests/loadtest/results/{timestamp}/`; captures include `heap-before.prof`, `heap-after.prof`, `goroutine-before.prof`, and `goroutine-after.prof`
## Profiling After Load Tests

```sh
# Compare heap before/after load
go tool pprof -diff_base results/heap-before.prof results/heap-after.prof

# Check goroutine state
go tool pprof results/goroutine-after.prof
```
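Both commands open pprof's interactive terminal; add `-http=:8080` to either one to get the web UI with flame graphs and a source view instead.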
## Grafana Dashboard

The Grafana performance dashboard is included in the Helm chart at `deploy/helm/ironflow/dashboards/ironflow-performance.json`. When deployed with `monitoring.dashboards.enabled=true`, Grafana auto-imports it via sidecar. For standalone Grafana, import the JSON file directly. Requires `IRONFLOW_METRICS_ENABLED=true`.