Error Handling

Ironflow provides automatic retries for transient failures while allowing you to mark permanent failures that shouldn’t be retried.

Retry Behavior

By default, errors thrown in step functions are retried according to your function’s retry configuration. The values below are the system defaults, which apply when retry is not specified:

createFunction(
  {
    id: "my-function",
    triggers: [{ event: "my.event" }],
    retry: {
      maxAttempts: 3,        // Maximum retry attempts
      initialDelayMs: 1000,  // Initial delay (1 second)
      backoffFactor: 2.0,    // Exponential backoff multiplier
      maxDelayMs: 300_000,   // Maximum delay (5 minutes)
    },
  },
  async ({ event, step }) => { /* ... */ },
);

Retry delays follow exponential backoff:

  1. First retry: 1 second
  2. Second retry: 2 seconds
  3. Third retry: 4 seconds
  4. Any later retry: double the previous delay, capped at maxDelayMs

(Derived from initialDelayMs=1000 × backoffFactor=2.0 shown above.)
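The schedule above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the delay is simply initialDelayMs × backoffFactor^(retry − 1) capped at maxDelayMs, with no jitter. retryDelayMs is a hypothetical helper for illustration, not an SDK export, and the engine's exact formula may differ.

```typescript
// Retry settings mirroring the defaults shown above.
interface RetryConfig {
  maxAttempts: number;
  initialDelayMs: number;
  backoffFactor: number;
  maxDelayMs: number;
}

const defaults: RetryConfig = {
  maxAttempts: 3,
  initialDelayMs: 1000,
  backoffFactor: 2.0,
  maxDelayMs: 300_000,
};

// Delay before the nth retry (1-based). Hypothetical helper, not part of the SDK.
// Assumes plain exponential backoff without jitter.
function retryDelayMs(retry: number, cfg: RetryConfig = defaults): number {
  const uncapped = cfg.initialDelayMs * Math.pow(cfg.backoffFactor, retry - 1);
  return Math.min(uncapped, cfg.maxDelayMs);
}

console.log([1, 2, 3].map((n) => retryDelayMs(n))); // [ 1000, 2000, 4000 ]
```

Under the same assumption, a tenth retry would be 512 seconds uncapped, so the cap brings it down to 300 seconds.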


NonRetryableError

Use NonRetryableError to indicate permanent failures that shouldn’t be retried:

import { NonRetryableError } from "@ironflow/node";

await step.run("validate", async () => {
  if (!isValid(data)) {
    // Won't retry - permanent failure
    throw new NonRetryableError("Invalid input");
  }
  // Regular errors retry automatically
  throw new Error("Temporary failure");
});

When to use NonRetryableError:

  • Invalid input data that won’t change on retry
  • Business logic failures (e.g., insufficient funds)
  • Authentication/authorization errors
  • Resource not found errors

When NOT to use NonRetryableError:

  • Network timeouts
  • External service temporary failures
  • Rate limiting (should back off and retry)
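One way to apply these guidelines when calling an external HTTP service is to map the response status to the right kind of error. The sketch below is illustrative only: errorForStatus and its status list are assumptions, and the local NonRetryableError class is a stand-in so the example runs on its own. In a real function you would import NonRetryableError from @ironflow/node instead.

```typescript
// Local stand-in so the sketch is self-contained; in a real function,
// import NonRetryableError from "@ironflow/node" instead.
class NonRetryableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NonRetryableError";
  }
}

// Hypothetical helper: choose the error type for an HTTP status returned
// by an external service. The status list is an assumption, not SDK behavior.
function errorForStatus(status: number): Error {
  switch (status) {
    case 400: // invalid input
    case 401: // authentication
    case 403: // authorization
    case 404: // resource not found
    case 422: // validation or business-rule failure
      return new NonRetryableError(`Permanent failure (HTTP ${status})`);
    default:
      // Timeouts (408), rate limiting (429), and 5xx responses stay regular
      // errors, so they are retried with backoff.
      return new Error(`Transient failure (HTTP ${status})`);
  }
}

console.log(errorForStatus(404).name); // NonRetryableError
console.log(errorForStatus(429).name); // Error
```

Inside a step, you would throw the returned error, for example `throw errorForStatus(response.status)`.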

Error Types Summary

Error Type | Behavior | Use Case
--- | --- | ---
Regular Error | Retried with backoff | Transient failures
NonRetryableError | Not retried | Permanent failures: invalid input, business-rule violations
StepError | Thrown by Ironflow when a step’s underlying error propagates out of step.run | Catch around step calls to inspect step name + attempt count
StepTimeoutError | Not retried beyond function policy | A step exceeded its configured timeout
TimeoutError | Subject to function retry policy | Function-level timeout (whole run exceeded its deadline)
ValidationError / SchemaValidationError | Not retried | Event payload failed schema validation

Use the isRetryable(err) helper (exported from @ironflow/node) to test whether a caught error will be retried by the engine.


Webhook Signature Verification

All requests from Ironflow are signed for security. The SDK verifies signatures automatically when you provide a signing key:

import { serve } from "@ironflow/node";

export const POST = serve({
  functions: [myFunction],
  signingKey: process.env.IRONFLOW_SIGNING_KEY, // Automatic verification
});

Manual signature verification is not yet available in the JS SDK. Use the signingKey option in serve() for automatic verification.

Signature Header

Ironflow includes the signature in the X-Ironflow-Signature header using HMAC-SHA256.
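To make the mechanism concrete, here is a sketch of how an HMAC-SHA256 signature check works in general. It is not Ironflow's documented scheme: the hex encoding, signing only the raw request body, and the verifyHmacSha256 name are all assumptions made for illustration. In practice, rely on the signingKey option in serve().

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper showing a generic HMAC-SHA256 check.
// Assumptions: the signature is hex-encoded and covers only the raw body.
function verifyHmacSha256(
  rawBody: string,
  signatureHex: string,
  signingKey: string,
): boolean {
  // Recompute the MAC over the body with the shared key.
  const expected = createHmac("sha256", signingKey).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The constant-time comparison matters: comparing signatures with === can leak how many leading bytes matched through timing differences.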

Development Mode

During local development, you can skip verification:

export const POST = serve({
  functions: [myFunction],
  skipVerification: true, // Only for local development!
});

Never disable signature verification in production. This protects your endpoints from unauthorized requests.
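One way to avoid shipping skipVerification by accident is to derive the option from the environment. The sketch below is a possible pattern, not an SDK feature: verificationOptions is a hypothetical helper, and using NODE_ENV === "development" to detect local development is an assumption. Only the signingKey and skipVerification options shown in this section are modeled.

```typescript
// Only the two serve() options discussed in this section are modeled here.
interface VerificationOptions {
  signingKey?: string;
  skipVerification?: boolean;
}

// Hypothetical helper: pick verification options from environment variables.
// Treating NODE_ENV === "development" as "local development" is an assumption.
function verificationOptions(
  env: Record<string, string | undefined>,
): VerificationOptions {
  const key = env.IRONFLOW_SIGNING_KEY;
  if (key) {
    return { signingKey: key }; // verify whenever a key is available
  }
  if (env.NODE_ENV === "development") {
    return { skipVerification: true }; // local development only
  }
  // Fail fast rather than serve unverified requests anywhere else.
  throw new Error("IRONFLOW_SIGNING_KEY is required outside local development");
}
```

You could then spread the result into the handler, for example `serve({ functions: [myFunction], ...verificationOptions(process.env) })`.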


Global Error Observation (Client onError)

For client-side operations (emitting events, managing runs, KV store, etc.), you can register a global onError handler to observe all errors without wrapping every call in try/catch:

import { createClient } from "@ironflow/node";

const client = createClient({
  onError: async (error, context) => {
    // Send to your logging/metrics system
    await logger.error("Ironflow client error", {
      method: context.method,         // e.g. "emit", "kv.bucket.get"
      endpoint: context.endpoint,     // e.g. "/ironflow.v1.IronflowService/Trigger"
      statusCode: context.statusCode, // HTTP status or undefined for network errors
      error: error.message,
    });
  },
});

Key behaviors:

  • The handler fires before the error is re-thrown — it never suppresses errors
  • Async handlers are fully awaited before the error propagates
  • If the handler itself throws, its error is swallowed (logged to stderr)
  • Propagates to sub-clients created via client.kv() and client.config()
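The first three behaviors can be pictured as a small wrapper around each client call. This is a sketch of the semantics only, not the SDK's actual internals: callWithOnError and the ErrorContext shape are hypothetical, with fields taken from the example above.

```typescript
// Context fields as used in the example above; the exact type is an assumption.
interface ErrorContext {
  method: string;
  endpoint?: string;
  statusCode?: number;
}

type OnError = (error: Error, context: ErrorContext) => void | Promise<void>;

// Hypothetical wrapper illustrating the listed behaviors.
async function callWithOnError<T>(
  op: () => Promise<T>,
  context: ErrorContext,
  onError?: OnError,
): Promise<T> {
  try {
    return await op();
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    if (onError) {
      try {
        // Async handlers are awaited before the error propagates.
        await onError(error, context);
      } catch (handlerErr) {
        // A throwing handler is swallowed and logged to stderr.
        console.error("onError handler threw:", handlerErr);
      }
    }
    // The original error is always re-thrown, never suppressed.
    throw error;
  }
}
```

Because the original error is re-thrown in every path, callers still need their own try/catch wherever they want to recover from a failure.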

This is useful for centralized logging, metrics collection, and alerting on client errors. See the @ironflow/node reference for the full API.

onError is for observing client errors. For controlling retry behavior inside functions, use NonRetryableError instead.


Handling Failed Runs

When a run fails after exhausting retries, you can:

  1. Hot Patch: Edit step outputs and resume from a specific step
  2. Investigate: Use the TUI debugger or dashboard to inspect the failure
  3. Fix and Retry: Fix the underlying issue and trigger a new event

See Debugging for more details on investigating and recovering from failures.

