Common Pitfalls

AI assistants make predictable mistakes when generating Ironflow code. Learn to spot and fix them.


1. Non-Idempotent Step Code

The Problem: AI generates step code that causes duplicate operations on retry.

await step.run("create-payment", async () => {
  // No idempotency key - creates duplicate charges on retry!
  return await stripe.charges.create({
    amount: order.total,
    currency: "usd",
  });
});

How to Fix: Always pass an idempotency key when a step calls an external API that creates or changes data. Many payment processors, email services, and other APIs support idempotency keys.
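
A minimal sketch of the fix, assuming the order ID is stable across retries. The `ChargeApi` interface, `FakeChargeApi`, and `chargeIdempotencyKey` are hypothetical stand-ins so the example runs on its own, and the fake is synchronous for brevity. With the real Stripe Node client, the key goes in the per-request options as `idempotencyKey`, the second argument to `stripe.charges.create`.

```typescript
// Hypothetical minimal shape of a payment client.
interface ChargeApi {
  create(
    params: { amount: number; currency: string },
    options: { idempotencyKey: string },
  ): { id: string };
}

// Derive the key from stable business data, never from the clock or a
// random value, so every retry of the same step sends the same key.
function chargeIdempotencyKey(orderId: string): string {
  return `order-${orderId}-create-payment`;
}

// In-memory fake that deduplicates on the key, as a provider would.
class FakeChargeApi implements ChargeApi {
  private seen = new Map<string, { id: string }>();
  calls = 0;

  create(
    _params: { amount: number; currency: string },
    options: { idempotencyKey: string },
  ): { id: string } {
    this.calls += 1;
    const existing = this.seen.get(options.idempotencyKey);
    if (existing) return existing; // retry: return the original charge
    const charge = { id: `ch_${this.seen.size + 1}` };
    this.seen.set(options.idempotencyKey, charge);
    return charge;
  }
}

// Body of the "create-payment" step: safe to run more than once.
function createPayment(api: ChargeApi, order: { id: string; total: number }) {
  return api.create(
    { amount: order.total, currency: "usd" },
    { idempotencyKey: chargeIdempotencyKey(order.id) },
  );
}
```

Calling `createPayment` twice for the same order reaches the API twice but yields a single charge, which is the property a retried step depends on.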

Tell the AI:

Ensure all external API calls inside steps use idempotency keys.
For Stripe, use idempotencyKey. For other services, use appropriate
deduplication mechanisms.

2. Side Effects Outside Steps

The Problem: AI puts side effects (API calls, database queries) outside step.run, causing them to execute on every replay.

async ({ event, step }) => {
  // BAD: This runs on EVERY replay, not just once
  const user = await db.users.find(event.data.userId);
  await step.run("process", async () => {
    return processUser(user);
  });
};

How to Fix: Every external call (database, API, file system) must be inside a step.run.
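
The effect of moving the lookup into a step can be illustrated with a toy model. `ToyRun` below is a hypothetical, synchronous stand-in that memoizes step results by ID. It mimics only the replay behavior described above and is not Ironflow's implementation.

```typescript
// Toy model of replay with memoized steps (illustration only).
class ToyRun {
  private memo = new Map<string, unknown>();

  run<T>(id: string, fn: () => T): T {
    if (this.memo.has(id)) return this.memo.get(id) as T; // replay: reuse result
    const result = fn();
    this.memo.set(id, result);
    return result;
  }
}

let dbReads = 0;

// Stands in for a real database query.
function findUser(userId: string): { id: string } {
  dbReads += 1;
  return { id: userId };
}

// GOOD: the lookup lives inside a step, so a replay reuses the stored result.
function workflow(step: ToyRun, userId: string): string {
  const user = step.run("load-user", () => findUser(userId));
  return step.run("process", () => `processed ${user.id}`);
}

const toyRun = new ToyRun();
const firstResult = workflow(toyRun, "u1"); // first execution
const replayResult = workflow(toyRun, "u1"); // simulated replay
```

After both calls `dbReads` is 1. Had `findUser` been called directly in the function body, it would have run once per execution.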

Tell the AI:

All side effects (database queries, API calls, file operations) must be
inside step.run. The function body outside steps should only contain
control flow logic using data returned from steps.

3. Duplicate Step IDs

The Problem: AI reuses step IDs, especially in loops.

for (const item of items) {
  // BAD: Same step ID for every item - only first executes
  await step.run("process-item", async () => {
    return processItem(item);
  });
}

How to Fix: Use step.map for arrays. If using loops, include a unique identifier in the step ID.
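
For the loop case, a small sketch of building unique step IDs. `stepIdFor` and `assertUnique` are hypothetical helpers, not SDK functions. The sketch uses the item ID rather than the array index, on the assumption that item IDs stay the same if the array is reordered between executions.

```typescript
interface Item {
  id: string;
}

// Builds a step ID that is unique per item and depends only on the item.
function stepIdFor(prefix: string, itemId: string): string {
  return `${prefix}-${itemId}`;
}

// Guard worth keeping in tests: a duplicate means two items share one step.
function assertUnique(ids: string[]): void {
  const dupes = ids.filter((id, i) => ids.indexOf(id) !== i);
  if (dupes.length > 0) {
    throw new Error(`duplicate step IDs: ${dupes.join(", ")}`);
  }
}

const items: Item[] = [{ id: "a1" }, { id: "b2" }, { id: "c3" }];
const stepIds = items.map((item) => stepIdFor("process-item", item.id));
assertUnique(stepIds);

// Inside the workflow the loop body would then read:
//   await step.run(stepIdFor("process-item", item.id), async () => processItem(item));
```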

Tell the AI:

Each step needs a unique ID. For arrays, use step.map instead of loops.
If you must use loops, include the index or item ID in the step name.

4. Missing Error Handling

The Problem: AI doesn’t distinguish between retryable and non-retryable errors.

await step.run("validate", async () => {
  if (!isValidEmail(email)) {
    // BAD: Will retry forever, but email won't become valid
    throw new Error("Invalid email");
  }
});

How to Fix: Use NonRetryableError for validation failures and permanent errors. Let transient errors (network, timeout) retry normally.
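
A sketch of the corrected validation step body. The `NonRetryableError` class is declared locally only so the example is self-contained. In a real project it would be imported from the SDK (the import path is not given here). `isValidEmail` is a deliberately simple, hypothetical check.

```typescript
// Local stand-in for the SDK's NonRetryableError (assumption: the SDK
// class is an Error subclass that the engine does not retry).
class NonRetryableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NonRetryableError";
  }
}

// Simple illustrative check, not a complete address validator.
function isValidEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Body of the "validate" step: a malformed address is a permanent
// failure, so signal that retrying is pointless.
function validateEmail(email: string): string {
  if (!isValidEmail(email)) {
    throw new NonRetryableError(`Invalid email: ${email}`);
  }
  return email;
}
```

Transient failures such as a dropped connection should still surface as ordinary errors so the normal retry policy applies.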

Tell the AI:

Use NonRetryableError for:
- Validation failures
- Invalid input
- Business rule violations
- Resource not found
- Permission denied
Let these retry normally:
- Network timeouts
- Database connection errors
- Rate limiting (with backoff)

5. Misunderstanding waitForEvent Timeout

The Problem: AI treats waitForEvent timeout as something the workflow can catch and recover from inline. It cannot — the SDK return type is non-nullable (Promise<IronflowEvent<T>>), and the engine transitions the step to timed_out and fails the run with "waitForEvent timed out". The code after the wait never runs.

// BAD: null check never fires — timeout fails the run instead
const approval = await step.waitForEvent("wait-approval", {
  event: "order.approved",
  match: "data.orderId",
  timeout: "24h",
});
if (!approval) {
  await step.run("cancel", async () => cancelOrder(orderId));
  return { status: "cancelled" };
}

How to Fix: Pick the model that matches the business requirement:

  • Hard deadline is acceptable — let waitForEvent time out. The run fails with "waitForEvent timed out"; an external listener on system.run.*.failed triggers the recovery flow.
  • Soft deadline needs inline branching — schedule a fallback event via step.sleep + emit (or a sibling cron function) so the waitForEvent matches either the real event or the timeout event. The workflow keeps running on a single, well-typed result.
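
The soft-deadline option can be sketched at the type level. The event name `order.settled`, the `outcome` field, and `nextAction` are hypothetical. Scheduling the fallback event (step.sleep plus emit) is left out because those signatures are not shown in this guide.

```typescript
// Hypothetical unified event: both the approval path and the delayed
// fallback publish it, so the wait always resolves with a typed result.
type OrderSettled = {
  name: "order.settled";
  data: { orderId: string; outcome: "approved" | "timed_out" };
};

type NextAction =
  | { kind: "fulfil"; orderId: string }
  | { kind: "cancel"; orderId: string };

// Branching on the outcome replaces the `if (!approval)` null check.
function nextAction(settled: OrderSettled): NextAction {
  switch (settled.data.outcome) {
    case "approved":
      return { kind: "fulfil", orderId: settled.data.orderId };
    case "timed_out":
      return { kind: "cancel", orderId: settled.data.orderId };
  }
}
```

Because the outcome is a closed union, the compiler flags any branch the workflow forgets to handle.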

Tell the AI:

waitForEvent does NOT return null on timeout. The return type is
Promise<IronflowEvent<T>>, and a timeout marks the step "timed_out"
and fails the run.
If the workflow needs to handle the timeout case inline, model the
timeout as its own event: kick off a delayed emit with step.sleep,
then waitForEvent on a unified "settled" event that either branch
publishes. Otherwise, let the run fail and react via a separate
listener on system.run.*.failed.

6. Wrong Execution Mode

The Problem: AI uses push mode for long-running tasks or pull mode for quick tasks.

// BAD: Video transcoding in push mode will time out
export const POST = serve({
  functions: [transcodeVideo], // Takes 10+ minutes
});

How to Fix:

  • Push mode: Tasks under 30 seconds, serverless deployments
  • Pull mode: Long-running tasks, GPU workloads, no timeout limits
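
The rule of thumb above can be written down as a small check, for example in a code review script. `recommendedMode` is a hypothetical helper, not an SDK function, and the 30-second threshold is taken from the guidance above rather than from any platform limit.

```typescript
type ExecutionMode = "push" | "pull";

// Threshold from the guidance above; adjust it to the real timeout of
// the serverless platform in use.
const PUSH_LIMIT_SECONDS = 30;

// Recommends a mode from the expected worst-case duration of a task.
function recommendedMode(expectedSeconds: number): ExecutionMode {
  return expectedSeconds < PUSH_LIMIT_SECONDS ? "push" : "pull";
}

const webhookMode = recommendedMode(2); // quick task
const transcodeMode = recommendedMode(10 * 60); // 10-minute transcode
```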

Tell the AI:

Use push mode for tasks < 30 seconds (serverless).
Use pull mode for:
- Video/audio processing
- ML inference
- Large file operations
- Any task that might exceed serverless timeouts

7. Incorrect Event Matching

The Problem: AI uses wrong field path for event correlation.

// Original event: { data: { orderId: "123" } }
await step.waitForEvent("wait-payment", {
  event: "payment.completed",
  match: "orderId", // BAD: Should be "data.orderId"
  timeout: "1h",
});

How to Fix: The match field takes the full path from the event root, including the data. prefix.
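
To see why the prefix matters, here is a generic dot-path lookup applied to the event from the example. `getPath` is a hypothetical helper written only for illustration. The engine's own matching logic is not shown in this guide.

```typescript
// Walks a dot-separated path from the root object, one key at a time.
function getPath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const paymentEvent = { name: "payment.completed", data: { orderId: "123" } };

const fromRoot = getPath(paymentEvent, "data.orderId"); // "123"
const missingPrefix = getPath(paymentEvent, "orderId"); // undefined
```

The payload lives under `data`, so a path without the prefix finds nothing to correlate on.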

Tell the AI:

The match field in waitForEvent uses dot notation from the event root.
Event payload is in data, so use "data.fieldName" not just "fieldName".

8. Mixing Async Patterns

The Problem: AI starts promises inside a step without awaiting them, so the step returns before the work has finished.

await step.run("fetch-all", async () => {
  // BAD: Untracked promises - may not complete before step returns
  const users = fetchUsers();
  const orders = fetchOrders();
  return { users, orders }; // Returns promises, not values
});

How to Fix: Either use step.parallel for tracked concurrent operations, or properly await all promises inside a step.
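
A sketch of the second option, awaiting everything inside one step. `fetchUsers` and `fetchOrders` are hypothetical stand-ins for real API calls. The step.parallel variant is not shown because its signature is not given in this guide.

```typescript
// Hypothetical fetchers standing in for real API calls.
async function fetchUsers(): Promise<string[]> {
  return ["ada", "lin"];
}

async function fetchOrders(): Promise<number[]> {
  return [101, 102];
}

// Body of the "fetch-all" step: both calls start concurrently, and the
// function resolves only after both finish, so it returns plain values.
async function fetchAll(): Promise<{ users: string[]; orders: number[] }> {
  const [users, orders] = await Promise.all([fetchUsers(), fetchOrders()]);
  return { users, orders };
}
```

Note that `Promise.all` rejects as soon as either call fails, which fails the whole step and lets it be retried as a unit.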

Tell the AI:

For concurrent operations, prefer step.parallel so each operation is
independently memoized and tracked. If using Promise.all inside a step,
await the result - don't return unresolved promises.

9. Hardcoded Configuration

The Problem: AI hardcodes URLs and configuration instead of using environment variables.

const client = createClient({
  serverUrl: "http://localhost:9123", // BAD: Hardcoded
});

How to Fix: Use environment variables for URLs, API keys, and configuration.
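
One way to do this is to read and validate the variables once at startup. `loadConfig` and `IronflowConfig` are hypothetical names. The variable names are the ones listed in this section. The environment is passed in as an argument so the function is easy to test.

```typescript
interface IronflowConfig {
  serverUrl: string;
  signingKey: string;
}

// Fails fast with a clear message instead of falling back to a
// hardcoded URL. In an application, pass process.env as `env`.
function loadConfig(env: Record<string, string | undefined>): IronflowConfig {
  const serverUrl = env["IRONFLOW_SERVER_URL"];
  const signingKey = env["IRONFLOW_SIGNING_KEY"];
  if (!serverUrl) throw new Error("IRONFLOW_SERVER_URL is not set");
  if (!signingKey) throw new Error("IRONFLOW_SIGNING_KEY is not set");
  return { serverUrl, signingKey };
}

// Usage with the client from the example above:
//   const config = loadConfig(process.env);
//   const client = createClient({ serverUrl: config.serverUrl });
```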

Tell the AI:

Use environment variables for configuration:
- IRONFLOW_SERVER_URL for server address
- IRONFLOW_SIGNING_KEY for webhook verification
- Never hardcode production URLs

10. Ignoring Type Safety

The Problem: AI uses any types instead of proper TypeScript types.

const processOrder = ironflow.createFunction(
  { id: "process-order", triggers: [{ event: "order.placed" }] },
  async ({ event, step }) => {
    const data = event.data; // any type - no autocomplete, no safety
    await step.run("process", async () => {
      return data.orderId; // Could crash at runtime
    });
  },
);

How to Fix: Define Zod schemas for event data and pass them via the schema config option.
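
To show what a schema buys you without pulling in a dependency, the sketch below uses a hand-written type guard as a stand-in for a Zod schema. With Zod, the same shape would be declared as `z.object({ orderId: z.string(), total: z.number() })`. The `total` field and the function names are assumptions made for the example.

```typescript
// Shape of the "order.placed" payload (the `total` field is assumed).
interface OrderPlacedData {
  orderId: string;
  total: number;
}

// Runtime check that narrows `unknown` to OrderPlacedData.
function isOrderPlacedData(value: unknown): value is OrderPlacedData {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["orderId"] === "string" &&
    typeof record["total"] === "number"
  );
}

// Parse once at the boundary; everything after this call is fully typed.
function parseOrderPlaced(value: unknown): OrderPlacedData {
  if (!isOrderPlacedData(value)) {
    throw new Error("Invalid order.placed payload");
  }
  return value;
}
```

A malformed payload is rejected at the start of the function, so it cannot surface later as an undefined field deep inside a step.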

Tell the AI:

Define Zod schemas for:
- Event data payloads
- Step return types
- Function return types
Pass the schema via createFunction({ schema: MySchema }) to get full type safety.

11. Impure Managed Projections

The Problem: AI puts side effects (database queries, external API calls, fetch) inside managed projections. Managed projections must be pure reducers.

const statsProjection = createProjection({
  name: "stats",
  events: ["order.completed"],
  initialState: () => ({ total: 0 }),
  handler: async (state, event) => {
    // BAD: Side effect in a managed projection
    await sendSlackNotification("New order!");
    return { total: state.total + event.data.amount };
  }
});

How to Fix: Use mode: "external" for projections that need to perform side effects. Managed projections must only return the new state based purely on the previous state and the event.
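
The handler from the example, reduced to a pure function, might look like the sketch below. The type names and `applyOrderCompleted` are hypothetical. The function would be supplied as the `handler` of the managed projection, and the Slack notification would move to an external projection or a function.

```typescript
interface StatsState {
  total: number;
}

interface OrderCompleted {
  data: { amount: number };
}

// Pure, synchronous reducer: the same inputs always give the same
// output, and the previous state object is never mutated.
function applyOrderCompleted(state: StatsState, event: OrderCompleted): StatsState {
  return { total: state.total + event.data.amount };
}

// Rebuilding state by replaying events is then repeatable.
const completedOrders: OrderCompleted[] = [
  { data: { amount: 20 } },
  { data: { amount: 5 } },
];
const initialStats: StatsState = { total: 0 };
const finalStats = completedOrders.reduce(applyOrderCompleted, initialStats);
```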

Tell the AI:

Managed projections (which return state) MUST be pure synchronous functions with NO side effects.
If you need to make API calls, send emails, or write to external databases, use an external projection (mode: "external") or a function.

12. Missing Expected Versions in Entity Streams

The Problem: AI appends to entity streams without using optimistic concurrency control, which can lead to race conditions when multiple workers update the same entity.

// BAD: Unconditional append can overwrite other updates
await ironflow.streams.append("user-123", {
  name: "user.updated",
  data: { status: "active" },
  entityType: "user"
});

How to Fix: Always provide expectedVersion when appending to an entity stream if the append depends on previous state.
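
The mechanism behind expectedVersion can be shown with a toy in-memory stream. `ToyStream` is a hypothetical illustration of optimistic concurrency. Only the option name `expectedVersion` comes from this guide, and the real client call may differ in shape.

```typescript
// Toy stream whose version is simply the number of stored events.
class ToyStream {
  private events: { name: string }[] = [];

  get version(): number {
    return this.events.length;
  }

  // Rejects the append if the stream moved on since the caller read it.
  append(event: { name: string }, options: { expectedVersion: number }): number {
    if (options.expectedVersion !== this.version) {
      throw new Error(
        `version conflict: expected ${options.expectedVersion}, stream is at ${this.version}`,
      );
    }
    this.events.push(event);
    return this.version;
  }
}

const userStream = new ToyStream();

// Two workers read the same version before either one writes.
const seenByA = userStream.version;
const seenByB = userStream.version;

userStream.append({ name: "user.updated" }, { expectedVersion: seenByA });

let conflict = false;
try {
  userStream.append({ name: "user.suspended" }, { expectedVersion: seenByB });
} catch {
  conflict = true; // worker B must re-read the stream and decide again
}
```

The second writer gets an explicit conflict instead of silently appending a decision based on stale state.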

Tell the AI:

When appending events to entity streams, always use optimistic concurrency control by fetching the current version and passing it as expectedVersion in the options object.

13. Missing Properties in Upcasters

The Problem: AI writes upcasters that fail to return all required properties, inadvertently deleting data during schema migrations.

registry.register("user.created", 1, 2, (data) => {
  // BAD: Forgot to spread the rest of the old data!
  // This deletes email, name, etc.
  return {
    fullName: `${data.firstName} ${data.lastName}`
  };
});

How to Fix: Always spread the existing data payload (...data) or use destructuring to preserve un-migrated properties.
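
A sketch of the corrected upcaster body. The `UserCreatedV1` and `UserCreatedV2` shapes are assumptions, including the choice to keep `firstName` and `lastName` in version 2. The function would be passed as the last argument to `registry.register("user.created", 1, 2, ...)`.

```typescript
interface UserCreatedV1 {
  firstName: string;
  lastName: string;
  email: string;
}

interface UserCreatedV2 extends UserCreatedV1 {
  fullName: string;
}

// Spread first, then add the new field, so every property that was not
// migrated is carried over unchanged.
function upcastUserCreated(data: UserCreatedV1): UserCreatedV2 {
  return {
    ...data,
    fullName: `${data.firstName} ${data.lastName}`,
  };
}

const upcasted = upcastUserCreated({
  firstName: "Ada",
  lastName: "Lovelace",
  email: "ada@example.com",
});
```

Declaring the output type explicitly means the compiler reports a missing property if the spread is ever removed.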

Tell the AI:

When writing event upcasters, always return ALL properties of the event data, not just the migrated fields. Use the spread operator to ensure data isn't lost during the upcast.

When NOT to Use AI

Some tasks should be done manually:

Security Configuration

Don’t let AI generate:

  • Signing keys or secrets
  • Authentication logic
  • Permission checks
  • Encryption/decryption code

Instead, use established libraries and review security code manually.

Production Deployments

Don’t let AI:

  • Configure production infrastructure
  • Set up monitoring/alerting rules
  • Create database migrations for production
  • Manage secrets in CI/CD

Complex Business Logic

Review AI-generated code carefully for:

  • Financial calculations
  • Legal compliance logic
  • Data privacy handling
  • Multi-tenant isolation

Verification Checklist

After AI generates Ironflow code, verify:

  • Every step has a unique, descriptive ID
  • All external calls are inside step.run
  • External API calls use idempotency keys
  • NonRetryableError used for permanent failures
  • waitForEvent timeout strategy is intentional (hard fail via system.run.*.failed, or modeled as a fallback event the wait can match)
  • Correct execution mode (push vs pull)
  • Event match field uses the data. prefix
  • Environment variables used for configuration
  • Types are properly defined (no any)
  • No sensitive data logged or exposed
  • Promises inside steps are awaited (or step.parallel is used)
  • Managed projections are pure, with side effects moved to external projections
  • Entity stream appends that depend on previous state pass expectedVersion
  • Upcasters preserve all existing properties (...data)