Common Pitfalls

AI assistants make predictable mistakes when generating Ironflow code. Learn to spot and fix them.

1. Non-Idempotent Step Code

The Problem: AI generates step code that causes duplicate operations on retry.

Wrong:

await step.run("create-payment", async () => {
  // No idempotency key - creates duplicate charges on retry!
  return await stripe.charges.create({
    amount: order.total,
    currency: "usd",
  });
});

Correct:

await step.run("create-payment", async () => {
  // Idempotency key prevents duplicate charges. stripe-node takes it as a
  // request option (second argument), not as a charge parameter.
  return await stripe.charges.create(
    {
      amount: order.total,
      currency: "usd",
    },
    { idempotencyKey: `order-${orderId}` },
  );
});

How to Fix: Always pass an idempotency key when a step calls an external API that supports one. Many payment processors and email services do; for services that don't, use another deduplication mechanism.

Tell the AI:

Ensure all external API calls inside steps use idempotency keys.
For Stripe, pass idempotencyKey as a request option. For other services, use appropriate
deduplication mechanisms.
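
To see why a stable key makes a retried step safe, here is a minimal sketch built on a hypothetical in-memory provider. FakePayments and paymentKey are illustrative names, not part of Stripe or Ironflow. The only point is that a key derived from stable business data lets the provider return the original result instead of charging twice.

```typescript
// Hypothetical in-memory stand-in for a provider that honors idempotency keys.
type Charge = { id: string; amount: number };

class FakePayments {
  private byKey = new Map<string, Charge>();
  created = 0; // number of charges actually created

  createCharge(amount: number, idempotencyKey: string): Charge {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing; // retry: hand back the original charge
    this.created += 1;
    const charge: Charge = { id: `ch_${this.created}`, amount };
    this.byKey.set(idempotencyKey, charge);
    return charge;
  }
}

// Derive the key from stable business data, never from a timestamp or a
// random value, so every retry of the same step produces the same key.
function paymentKey(orderId: string): string {
  return `order-${orderId}-payment`;
}

const payments = new FakePayments();
const first = payments.createCharge(4200, paymentKey("123"));
const retry = payments.createCharge(4200, paymentKey("123")); // simulated retry
```

Both calls return the same charge, and only one charge is created.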

2. Side Effects Outside Steps

The Problem: AI puts side effects (API calls, database queries) outside step.run, causing them to execute on every replay.

Wrong:

async ({ event, step }) => {
  // BAD: This runs on EVERY replay, not just once
  const user = await db.users.find(event.data.userId);

  await step.run("process", async () => {
    return processUser(user);
  });
};

Correct:

async ({ event, step }) => {
  // GOOD: Database call is memoized
  const user = await step.run("fetch-user", async () => {
    return await db.users.find(event.data.userId);
  });

  await step.run("process", async () => {
    return processUser(user);
  });
};

How to Fix: Every external call (database, API, file system) must be inside a step.run.

Tell the AI:

All side effects (database queries, API calls, file operations) must be
inside step.run. The function body outside steps should only contain
control flow logic using data returned from steps.
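
The replay behavior can be illustrated with a toy model of step memoization. This is a sketch for intuition only and not Ironflow's implementation; makeStep, the memo map, and the call counters are all hypothetical.

```typescript
// Toy model: a step result is stored under its ID and reused on replay.
type Memo = Map<string, unknown>;

function makeStep(memo: Memo) {
  return {
    run<T>(id: string, fn: () => T): T {
      if (memo.has(id)) return memo.get(id) as T; // replay: reuse stored result
      const result = fn();
      memo.set(id, result);
      return result;
    },
  };
}

let outsideCalls = 0;
let insideCalls = 0;

function handler(step: ReturnType<typeof makeStep>): string {
  outsideCalls += 1; // work outside a step: repeated on every replay
  return step.run("fetch-user", () => {
    insideCalls += 1; // work inside a step: executed once, then memoized
    return "user-123";
  });
}

const memo: Memo = new Map();
handler(makeStep(memo)); // first execution
handler(makeStep(memo)); // replay of the same run
```

After the two passes, outsideCalls is 2 and insideCalls is 1, which is the difference between the wrong and correct versions above.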

3. Duplicate Step IDs

The Problem: AI reuses step IDs, especially in loops.

Wrong:

for (const item of items) {
  // BAD: Same step ID for every item - only first executes
  await step.run("process-item", async () => {
    return processItem(item);
  });
}

Correct:

// GOOD: Use step.map for arrays
const results = await step.map(
  "process-items",
  items,
  async (item, s, index) => {
    return await s.run(`process-${index}`, async () => {
      return processItem(item);
    });
  },
);

// OR if manual loop is needed, use unique IDs
for (let i = 0; i < items.length; i++) {
  await step.run(`process-item-${i}`, async () => {
    return processItem(items[i]);
  });
}

How to Fix: Use step.map for arrays. If using loops, include a unique identifier in the step ID.

Tell the AI:

Each step needs a unique ID. For arrays, use step.map instead of loops.
If you must use loops, include the index or item ID in the step name.
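
When a manual loop is unavoidable, one option is to build the step IDs up front from a stable item key and fail loudly on collisions. The helper below is a hypothetical sketch, not an Ironflow API; it assumes each item has some stable identifier (here a made-up sku field).

```typescript
// Hypothetical helper: derive one step ID per item and reject duplicates.
function stepIdsFor<T>(
  prefix: string,
  items: T[],
  keyOf: (item: T) => string,
): string[] {
  const seen = new Set<string>();
  return items.map((item) => {
    const id = `${prefix}-${keyOf(item)}`;
    if (seen.has(id)) throw new Error(`Duplicate step ID: ${id}`);
    seen.add(id);
    return id;
  });
}

const lineItems = [{ sku: "A1" }, { sku: "B2" }];
const ids = stepIdsFor("process-item", lineItems, (item) => item.sku);
// ids: ["process-item-A1", "process-item-B2"]
```

Keying by item ID rather than array index keeps the IDs stable even if the array order differs between executions.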

4. Missing Error Handling

The Problem: AI doesn’t distinguish between retryable and non-retryable errors.

Wrong:

await step.run("validate", async () => {
  if (!isValidEmail(email)) {
    // BAD: Will be retried, but the email won't become valid
    throw new Error("Invalid email");
  }
});

Correct:

import { NonRetryableError } from "@ironflow/core";

await step.run("validate", async () => {
  if (!isValidEmail(email)) {
    // GOOD: Stops immediately, won't waste retries
    throw new NonRetryableError("Invalid email format");
  }
});

How to Fix: Use NonRetryableError for validation failures and permanent errors. Let transient errors (network, timeout) retry normally.

Tell the AI:

Use NonRetryableError for:
- Validation failures
- Invalid input
- Business rule violations
- Resource not found
- Permission denied

Let these retry normally:
- Network timeouts
- Database connection errors
- Rate limiting (with backoff)
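
For steps that call HTTP APIs, the two lists above can be encoded in a small classifier. This is a hedged sketch: classifyStatus and the Disposition type are hypothetical names, and the status mapping is a common convention rather than anything Ironflow prescribes.

```typescript
// Hypothetical helper: decide how a step should react to a failed HTTP call.
type Disposition = "retry" | "fail-fast";

function classifyStatus(status: number): Disposition {
  // Timeouts (408), rate limiting (429), and server-side failures (5xx)
  // are usually transient, so a later attempt may succeed.
  if (status === 408 || status === 429 || status >= 500) return "retry";
  // Remaining client errors (400, 403, 404, 422, ...) will not succeed
  // on retry because the request itself is the problem.
  return "fail-fast";
}
```

Inside a step, a "fail-fast" result is where NonRetryableError would be thrown, while a "retry" result throws a plain Error so the normal retry policy applies.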

5. Misunderstanding waitForEvent Timeout

The Problem: AI treats a waitForEvent timeout as something the workflow can catch and recover from inline. It cannot: the SDK return type is non-nullable (Promise<IronflowEvent<T>>), and on timeout the engine transitions the step to timed_out and fails the run with "waitForEvent timed out". The code after the wait never runs.

Wrong:

// BAD: null check never fires; a timeout fails the run instead
const approval = await step.waitForEvent("wait-approval", {
  event: "order.approved",
  match: "data.orderId",
  timeout: "24h",
});

if (!approval) {
  await step.run("cancel", async () => cancelOrder(orderId));
  return { status: "cancelled" };
}

Correct:

// GOOD: model the timeout as its own event, so approval and timeout race
// as two events instead of relying on a null return.

// Timer function: sleep, then emit the fallback event.
await step.sleep("wait-timer", "24h");
await step.run("fire-timeout", async () =>
  ironflow.emit("order.approval_timeout", { orderId })
);

// Main workflow: wait on one unified event. Separate handlers react to
// approval or timeout and publish "order.settled".
const settled = await step.waitForEvent("await-settlement", {
  event: "order.settled", // emitted by approval handler OR timeout handler
  match: "data.orderId",
  timeout: "48h", // hard ceiling: the run fails if this fires
});

await step.run("process", async () => processSettlement(settled.data));

How to Fix: Pick the model that matches the business requirement:

- Hard deadline is acceptable: let waitForEvent time out. The run fails with "waitForEvent timed out"; an external listener on system.run.*.failed triggers the recovery flow.
- Soft deadline needs inline branching: schedule a fallback event via step.sleep + emit (or a sibling cron function) so the waitForEvent matches either the real event or the timeout event. The workflow keeps running on a single, well-typed result.

Tell the AI:

waitForEvent does NOT return null on timeout. The return type is
Promise<IronflowEvent<T>>, and a timeout marks the step "timed_out"
and fails the run.

If the workflow needs to handle the timeout case inline, model the
timeout as its own event: kick off a delayed emit with step.sleep,
then waitForEvent on a unified "settled" event that either branch
publishes. Otherwise, let the run fail and react via a separate
listener on system.run.*.failed.
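
One way to keep the unified "settled" result well typed is a discriminated union. The payload shape below is an assumption for illustration (the outcome field, approvedBy, and nextAction are hypothetical); the source does not specify what order.settled carries.

```typescript
// Hypothetical payload for the unified "order.settled" event.
type Settlement =
  | { outcome: "approved"; orderId: string; approvedBy: string }
  | { outcome: "timed_out"; orderId: string };

// Branch on the outcome instead of on a null check.
function nextAction(settled: Settlement): "fulfil" | "cancel" {
  switch (settled.outcome) {
    case "approved":
      return "fulfil";
    case "timed_out":
      return "cancel";
  }
}
```

Because the switch is exhaustive over outcome, the compiler flags any new outcome that is added without a matching branch.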

6. Wrong Execution Mode

The Problem: AI uses push mode for long-running tasks or pull mode for quick tasks.

Wrong:

// BAD: Video transcoding in push mode will time out
export const POST = serve({
  functions: [transcodeVideo], // Takes 10+ minutes
});

Correct:

// GOOD: Long-running task uses pull mode
const transcodeVideo = ironflow.createFunction(
  {
    id: "transcode-video",
    mode: "pull", // Worker pulls jobs via gRPC
    triggers: [{ event: "video.uploaded" }],
  },
  async ({ event, step }) => {
    // Can run for hours without timeout
  },
);

const worker = createWorker({
  serverUrl: "http://localhost:9123",
  functions: [transcodeVideo],
});
await worker.start();

How to Fix:

- Push mode: tasks under 30 seconds, serverless deployments
- Pull mode: long-running tasks, GPU workloads, no timeout limits

Tell the AI:

Use push mode for tasks < 30 seconds (serverless).
Use pull mode for:
- Video/audio processing
- ML inference
- Large file operations
- Any task that might exceed serverless timeouts
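
The rule of thumb above can be written down as a tiny helper, which is handy in code review checklists. chooseMode and PUSH_LIMIT_SECONDS are hypothetical names; the 30-second threshold is the one stated in this section.

```typescript
type ExecutionMode = "push" | "pull";

// Threshold taken from the guidance above; adjust to your platform's limit.
const PUSH_LIMIT_SECONDS = 30;

// Pick a mode from the worst-case duration, not the typical one.
function chooseMode(worstCaseSeconds: number): ExecutionMode {
  return worstCaseSeconds < PUSH_LIMIT_SECONDS ? "push" : "pull";
}
```

For example, a webhook handler with a 5-second worst case maps to push, while a 10-minute transcode maps to pull.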

7. Incorrect Event Matching

The Problem: AI uses wrong field path for event correlation.

Wrong:

// Original event: { data: { orderId: "123" } }

await step.waitForEvent("wait-payment", {
  event: "payment.completed",
  match: "orderId", // BAD: Should be "data.orderId"
  timeout: "1h",
});

Correct:

// Original event: { data: { orderId: "123" } }

await step.waitForEvent("wait-payment", {
  event: "payment.completed",
  match: "data.orderId", // GOOD: Full path including "data."
  timeout: "1h",
});

How to Fix: The match field takes the full path from the event root, including the data. prefix.

Tell the AI:

The match field in waitForEvent uses dot notation from the event root.
Event payload is in data, so use "data.fieldName" not just "fieldName".
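
A toy dot-path lookup shows why the short path finds nothing. This is a sketch of the idea only, not the engine's matcher; resolvePath and the sample event are hypothetical.

```typescript
// Toy dot-path lookup that starts at the event root.
function resolvePath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const paymentEvent = { name: "payment.completed", data: { orderId: "123" } };

const wrong = resolvePath(paymentEvent, "orderId"); // undefined
const right = resolvePath(paymentEvent, "data.orderId"); // "123"
```

With "orderId" the lookup starts at the root, where no such field exists, so there is nothing to correlate on.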

8. Mixing Async Patterns

The Problem: AI starts promises inside a step without awaiting them, so the step returns promises instead of values.

Wrong:

await step.run("fetch-all", async () => {
  // BAD: Untracked promises - may not complete before step returns
  const users = fetchUsers();
  const orders = fetchOrders();
  return { users, orders }; // Returns promises, not values
});

Correct:

// GOOD: Use step.parallel for concurrent operations
const [users, orders] = await step.parallel("fetch-all", [
  async (s) => s.run("fetch-users", fetchUsers),
  async (s) => s.run("fetch-orders", fetchOrders),
]);

// OR await inside a single step
await step.run("fetch-all", async () => {
  const [users, orders] = await Promise.all([fetchUsers(), fetchOrders()]);
  return { users, orders };
});

How to Fix: Either use step.parallel for tracked concurrent operations, or properly await all promises inside a step.

Tell the AI:

For concurrent operations, prefer step.parallel so each operation is
independently memoized and tracked. If using Promise.all inside a step,
await the result - don't return unresolved promises.
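
To make the failure concrete: assuming step results are persisted as JSON (an assumption about the engine, not something stated here), an un-awaited promise serializes to an empty object, so the data is silently lost.

```typescript
// A step result that still holds promises instead of resolved values.
const pending = {
  users: Promise.resolve(["ada"]),
  orders: Promise.resolve([1]),
};

// Promises have no enumerable own properties, so each becomes {}.
const serialized = JSON.stringify(pending);
// serialized === '{"users":{},"orders":{}}'
```

Awaiting the values first, as in the corrected examples above, stores the real arrays instead.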

9. Hardcoded Configuration

The Problem: AI hardcodes URLs and configuration instead of using environment variables.

Wrong:

const client = createClient({
  serverUrl: "http://localhost:9123", // BAD: Hardcoded
});

Correct:

const client = createClient({
  serverUrl: process.env.IRONFLOW_SERVER_URL || "http://localhost:9123",
});

// Or in Go
client := ironflow.NewClient(ironflow.ClientConfig{
    ServerURL: ironflow.GetServerURL(),  // Reads env var
})

How to Fix: Use environment variables for URLs, API keys, and configuration.

Tell the AI:

Use environment variables for configuration:
- IRONFLOW_SERVER_URL for server address
- IRONFLOW_SIGNING_KEY for webhook verification
- Never hardcode production URLs
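
A small loader that validates the environment at startup makes missing configuration fail early rather than at the first request. This is a sketch: loadConfig and IronflowConfig are hypothetical names, and only the two variable names come from the list above.

```typescript
type Env = Record<string, string | undefined>;

interface IronflowConfig {
  serverUrl: string;
  signingKey: string;
}

// Read required settings from an environment-like object and fail fast
// with a clear message when one is missing.
function loadConfig(env: Env): IronflowConfig {
  const serverUrl = env["IRONFLOW_SERVER_URL"];
  const signingKey = env["IRONFLOW_SIGNING_KEY"];
  if (!serverUrl) throw new Error("IRONFLOW_SERVER_URL is not set");
  if (!signingKey) throw new Error("IRONFLOW_SIGNING_KEY is not set");
  return { serverUrl, signingKey };
}
```

In a Node.js process this would be called once as loadConfig(process.env); taking the environment as a parameter keeps the function easy to test.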

10. Ignoring Type Safety

The Problem: AI leaves event data typed as any instead of defining proper TypeScript types.

Wrong:

const processOrder = ironflow.createFunction(
  { id: "process-order", triggers: [{ event: "order.placed" }] },
  async ({ event, step }) => {
    const data = event.data; // any type - no autocomplete, no safety
    await step.run("process", async () => {
      return data.orderId; // Could crash at runtime
    });
  },
);

Correct:

import { z } from "zod";

const OrderEventSchema = z.object({
  orderId: z.string(),
  customerId: z.string(),
  items: z.array(z.object({ productId: z.string(), quantity: z.number() })),
  total: z.number(),
});

const processOrder = ironflow.createFunction(
  {
    id: "process-order",
    triggers: [{ event: "order.placed" }],
    schema: OrderEventSchema,
  },
  async ({ event, step }) => {
    const { orderId, customerId, items, total } = event.data;
    // Full type safety and autocomplete — inferred from Zod schema
  },
);

How to Fix: Define Zod schemas for event data and pass them via the schema config option.

Tell the AI:

Define Zod schemas for:
- Event data payloads
- Step return types
- Function return types

Pass the schema via createFunction({ schema: MySchema }) to get full type safety.

11. Impure Managed Projections

The Problem: AI puts side effects (database queries, external API calls, fetch) inside managed projections. Managed projections must be pure reducers.

Wrong:

const statsProjection = createProjection({
  name: "stats",
  events: ["order.completed"],
  initialState: () => ({ total: 0 }),
  handler: async (state, event) => {
    // BAD: Side effect in a managed projection
    await sendSlackNotification("New order!");
    return { total: state.total + event.data.amount };
  }
});

Correct:

// GOOD: Pure reducer for state
const statsProjection = createProjection({
  name: "stats",
  events: ["order.completed"],
  initialState: () => ({ total: 0 }),
  handler: (state, event) => {
    return { total: state.total + event.data.amount };
  }
});

// GOOD: Separate external projection for side effects
const notificationProjection = createProjection({
  name: "notifications",
  events: ["order.completed"],
  mode: "external",
  handler: async (event) => {
    await sendSlackNotification("New order!");
  }
});

How to Fix: Use mode: "external" for projections that need to perform side effects. Managed projections must only return the new state based purely on the previous state and the event.

Tell the AI:

Managed projections (which return state) MUST be pure synchronous functions with NO side effects.
If you need to make API calls, send emails, or write to external databases, use an external projection (mode: "external") or a function.
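
Purity is what makes a projection safe to rebuild. The sketch below assumes managed projection state may be recomputed by replaying the same events (a common property of event-sourced systems, assumed here rather than taken from the source); reduceStats and orderEvents are hypothetical names.

```typescript
type OrderCompleted = { data: { amount: number } };
type Stats = { total: number };

// Pure reducer: the new state depends only on the old state and the event.
const reduceStats = (state: Stats, ev: OrderCompleted): Stats => ({
  total: state.total + ev.data.amount,
});

const orderEvents: OrderCompleted[] = [
  { data: { amount: 10 } },
  { data: { amount: 25 } },
];

// Rebuilding from scratch folds every event through the reducer.
const rebuild = (): Stats => orderEvents.reduce(reduceStats, { total: 0 });

const firstBuild = rebuild();
const secondBuild = rebuild(); // same events, same state, no side effects
```

If the reducer also sent a notification, every rebuild would send it again for each historical event.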

12. Missing Expected Versions in Entity Streams

The Problem: AI appends to entity streams without using optimistic concurrency control, which can lead to race conditions when multiple workers update the same entity.

Wrong:

// BAD: Unconditional append can race with concurrent updates
await ironflow.streams.append("user-123", {
  name: "user.updated",
  data: { status: "active" },
  entityType: "user"
});

Correct:

// GOOD: Use optimistic concurrency
const info = await ironflow.streams.getInfo("user-123");
const currentVersion = info ? info.version : 0;

await ironflow.streams.append("user-123", {
  name: "user.updated",
  data: { status: "active" },
  entityType: "user"
}, { expectedVersion: currentVersion });

How to Fix: Always provide expectedVersion when appending to an entity stream if the append depends on previous state.

Tell the AI:

When appending events to entity streams, always use optimistic concurrency control by fetching the current version and passing it as expectedVersion in the options object.
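
The mechanics of an expected-version check can be shown with an in-memory stand-in. FakeStream is a hypothetical toy, not the Ironflow streams API, and how the real API reports a conflict is not covered here; the sketch only shows why a stale version must be rejected.

```typescript
// Toy stream: the version is the number of events appended so far.
class FakeStream {
  private events: string[] = [];

  currentVersion(): number {
    return this.events.length;
  }

  // Append only if the caller saw the latest version.
  append(eventName: string, expectedVersion: number): boolean {
    if (expectedVersion !== this.currentVersion()) return false; // stale read
    this.events.push(eventName);
    return true;
  }
}

const stream = new FakeStream();
const seenByA = stream.currentVersion(); // 0
const seenByB = stream.currentVersion(); // 0, read before A writes

const aOk = stream.append("user.updated", seenByA); // true
const bOk = stream.append("user.deactivated", seenByB); // false: version moved on
// B re-reads the version (and, in real code, re-checks its decision) and retries.
const bRetryOk = stream.append("user.deactivated", stream.currentVersion()); // true
```

Without the version check, writer B would have appended a decision based on state that no longer existed.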

13. Missing Properties in Upcasters

The Problem: AI writes upcasters that fail to return all required properties, inadvertently deleting data during schema migrations.

Wrong:

registry.register("user.created", 1, 2, (data) => {
  // BAD: Forgot to spread the rest of the old data!
  // This deletes email, name, etc.
  return {
    fullName: `${data.firstName} ${data.lastName}`
  };
});

Correct:

registry.register("user.created", 1, 2, (data) => {
  const { firstName, lastName, ...rest } = data;
  // GOOD: Preserves all unchanged properties
  return {
    ...rest,
    fullName: `${firstName} ${lastName}`
  };
});

How to Fix: Always spread the existing data payload (...data) or use destructuring to preserve un-migrated properties.

Tell the AI:

When writing event upcasters, always return ALL properties of the event data, not just the migrated fields. Use the spread operator to ensure data isn't lost during the upcast.
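
A small unit test catches dropped fields before a migration ships. The sketch below restates the corrected upcaster as a plain function; droppedKeys and the sample payload are hypothetical and exist only for illustration.

```typescript
type UserCreatedV1 = { firstName: string; lastName: string; email: string };

// Same logic as the corrected upcaster above, written as a plain function.
function upcastUserCreated(data: UserCreatedV1) {
  const { firstName, lastName, ...rest } = data;
  return { ...rest, fullName: `${firstName} ${lastName}` };
}

// Hypothetical test helper: list keys that vanished without being
// intentionally replaced by the migration.
function droppedKeys(before: object, after: object, replaced: string[]): string[] {
  return Object.keys(before).filter(
    (key) => !(key in after) && replaced.indexOf(key) === -1,
  );
}

const v1: UserCreatedV1 = {
  firstName: "Ada",
  lastName: "Lovelace",
  email: "ada@example.com",
};
const v2 = upcastUserCreated(v1);
const lost = droppedKeys(v1, v2, ["firstName", "lastName"]); // []
```

Run against the wrong version, which returns only fullName, the same check reports email as dropped.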

When NOT to Use AI

Some tasks should be done manually:

Security Configuration

Don’t let AI generate:

- Signing keys or secrets
- Authentication logic
- Permission checks
- Encryption/decryption code

Instead, use established libraries and review security code manually.

Production Deployments

Don’t let AI:

- Configure production infrastructure
- Set up monitoring/alerting rules
- Create database migrations for production
- Manage secrets in CI/CD

Complex Business Logic

Review AI-generated code carefully for:

- Financial calculations
- Legal compliance logic
- Data privacy handling
- Multi-tenant isolation

Verification Checklist

After AI generates Ironflow code, verify: