Build a Durable AI Agent in 5 Minutes

AI agents are just workflows with LLM-powered decision steps. But most agent frameworks treat them as throwaway scripts — if the process crashes mid-execution, you start over. If a tool call fails, the whole chain breaks. If you want to debug what happened, you grep through logs.

There’s a better way.

A typical AI agent makes a series of tool calls: search the web, read documents, call APIs, summarize findings. Each call depends on the previous result. If any step fails — network timeout, rate limit, API error — the entire agent run is lost.

Plan → Search → Search → Summarize → Store
 ✓       ✓        ✗ (timeout)
                  Agent dies here. Restart from scratch.

Worse, when an agent produces a wrong answer, debugging is painful. What queries did it generate? What did each search return? What did the summarizer actually receive? Without step-by-step execution history, you’re guessing.

Ironflow treats agent tool calls as durable steps — each one is memoized, retried on failure, and permanently recorded:

Plan → Search → Search → Summarize → Store
 ✓       ✓        ✗ (timeout)
                  Ironflow retries this step automatically.
                  Previous steps are NOT re-executed.

If the process crashes entirely, the agent resumes from the last completed step. No wasted API calls, no lost context, no starting over.
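To make the resume behavior concrete, here is a minimal in-memory sketch of step memoization. It is not Ironflow's implementation: `Journal`, `makeStep`, `countExecution`, and `runAgent` are hypothetical names, a real engine would persist the journal outside the process, and the sketch is synchronous only to keep it short.

```typescript
// Toy model of durable steps: results are journaled by step ID, so a re-run
// replays completed steps instead of executing them again.
// All names are hypothetical; this is not Ironflow's implementation.
type Journal = Map<string, unknown>;

function makeStep(journal: Journal) {
  return function run<T>(id: string, fn: () => T): T {
    if (journal.has(id)) {
      return journal.get(id) as T; // replayed from the record, fn is skipped
    }
    const result = fn(); // if fn throws, nothing is journaled for this step
    journal.set(id, result);
    return result;
  };
}

// Counts real executions so the replay behavior is observable.
const executions = new Map<string, number>();
function countExecution(id: string): void {
  executions.set(id, (executions.get(id) ?? 0) + 1);
}

let searchShouldFail = true;

function runAgent(journal: Journal): string {
  const run = makeStep(journal);
  const plan = run("plan", () => {
    countExecution("plan");
    return ["durable agents overview"];
  });
  const hits = run("search", () => {
    countExecution("search");
    if (searchShouldFail) throw new Error("timeout");
    return plan.map((q) => `result for ${q}`);
  });
  return hits.join("; ");
}

const journal: Journal = new Map();
let firstAttemptFailed = false;
try {
  runAgent(journal); // first attempt: the search step times out
} catch {
  firstAttemptFailed = true;
}

searchShouldFail = false;
// Second attempt with the same journal: "plan" is replayed, "search" runs again.
const output = runAgent(journal);
console.log(output, Array.from(executions.entries()));
```

Because the journal outlives the failed attempt, the second call to `runAgent` skips the plan step entirely. The same idea extends to a process crash if the journal is kept in durable storage.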

Here’s a research agent built with Ironflow. Each step.run() is durable:

import { createFunction } from "@ironflow/node";

const researchAgent = createFunction({
  id: "research-agent",
  triggers: [{ event: "agent.research" }],
  recording: true, // Enable time-travel debugging
}, async ({ event, step }) => {
  const topic = event.data.topic;

  // Step 1: Generate search queries (durable, so it only runs once)
  const plan = await step.run("plan-research", async () => {
    return {
      queries: [
        `${topic} overview`,
        `${topic} best practices`,
        `${topic} common pitfalls`,
      ],
    };
  });

  // Step 2: Execute each search (each is independently durable)
  const results = [];
  for (const query of plan.queries) {
    const result = await step.run(`search-${query}`, async () => {
      // Your search API call here
      return await searchWeb(query);
    });
    results.push(result);
  }

  // Step 3: Summarize findings (durable)
  const summary = await step.run("summarize", async () => {
    return await summarizeResults(results);
  });

  return { topic, summary, sourcesUsed: results.length };
});
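The agent calls `searchWeb` and `summarizeResults`, which the example leaves for you to supply. As a starting point, the stand-ins below return canned data so the flow can be exercised without external services. The `SearchResult` shape and both function bodies are assumptions for illustration, not part of Ironflow.

```typescript
// Hypothetical stand-ins for the helpers used by the research agent.
// The SearchResult shape is an assumption; adapt it to your search provider.
interface SearchResult {
  query: string;
  snippets: string[];
}

async function searchWeb(query: string): Promise<SearchResult> {
  // Replace with a real search API call; this stub returns canned data.
  return { query, snippets: [`placeholder snippet for "${query}"`] };
}

async function summarizeResults(results: SearchResult[]): Promise<string> {
  // Replace with an LLM call; this stub joins the snippets, one per line.
  return results
    .map((result) => result.snippets.join("\n"))
    .join("\n");
}
```

Swapping in real implementations later should not require changes to the agent itself, as long as the function names and return shapes stay compatible.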

Key properties:

  • Memoized: If plan-research completes but search-... fails, the plan step is NOT re-executed on retry. Its output is replayed from the record.
  • Retried: Each search step retries independently. A timeout on search 2 doesn’t affect search 1 or 3.
  • Recorded: Every step’s input and output is permanently stored, so you can time-travel through any agent run.
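The independent-retry property can be illustrated with a small sketch. `runWithRetry`, its `maxAttempts` default, and the flaky search stub are assumptions for illustration; Ironflow's actual retry policy (attempt limits, backoff) is not described here, and delays between attempts are omitted.

```typescript
// Sketch of per-step retries: each step has its own attempt budget, so a
// flaky step does not force its siblings to run again.
// Hypothetical helper, not an Ironflow API; backoff delays are omitted.
function runWithRetry<T>(fn: (attempt: number) => T, maxAttempts = 3): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try this step again
    }
  }
  throw lastError;
}

// Records the last attempt number seen for each query.
const attemptsPerQuery = new Map<string, number>();

// Stand-in for a search call: the second query times out on its first attempt.
function flakySearch(query: string, attempt: number): string {
  attemptsPerQuery.set(query, attempt);
  if (query === "best practices" && attempt === 1) {
    throw new Error("timeout");
  }
  return `results for "${query}"`;
}

const queries = ["overview", "best practices", "common pitfalls"];
const searchResults = queries.map((query) =>
  runWithRetry((attempt) => flakySearch(query, attempt))
);

console.log(searchResults);
console.log(Array.from(attemptsPerQuery.entries()));
```

Only the failing query consumes a second attempt; the first and third searches each run once.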

Instead of stuffing conversation history into a database column, record agent activities as events and derive the agent’s memory from them with a projection:

import { createProjection } from "@ironflow/node";
const agentMemory = createProjection({
name: "agent-memory",
events: ["agent.research"],
initialState: () => ({ tasks: 0, topics: [] }),
handler: (state, event) => ({
tasks: state.tasks + 1,
topics: [...state.topics, event.data.topic],
}),
});

The projection automatically derives the agent’s history from events. Query it anytime:

curl http://localhost:9123/api/v1/projections/agent-memory | jq '.state.state'
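Conceptually, a projection is a fold over the event stream: the handler is applied to each subscribed event in order, starting from the initial state. The sketch below reproduces that idea in plain TypeScript, reusing the handler logic from the agent-memory projection. The `AgentEvent` and `MemoryState` types and the sample events are assumptions, and this says nothing about how Ironflow stores or replays events internally.

```typescript
// A projection modeled as a reducer over events (illustrative only).
// AgentEvent, MemoryState, and the sample events are hypothetical.
interface AgentEvent {
  name: string;
  data: { topic: string };
}

interface MemoryState {
  tasks: number;
  topics: string[];
}

const initialState = (): MemoryState => ({ tasks: 0, topics: [] });

// Same logic as the agent-memory handler: count tasks and collect topics.
const handler = (state: MemoryState, event: AgentEvent): MemoryState => ({
  tasks: state.tasks + 1,
  topics: [...state.topics, event.data.topic],
});

const events: AgentEvent[] = [
  { name: "agent.research", data: { topic: "event sourcing" } },
  { name: "agent.deploy", data: { topic: "ignored" } },
  { name: "agent.research", data: { topic: "durable execution" } },
];

// Only subscribed events reach the handler, in the order they were recorded.
const memory = events
  .filter((event) => event.name === "agent.research")
  .reduce(handler, initialState());

console.log(memory);
```

Because the state is derived rather than stored directly, changing the handler and replaying the same events yields a new view of the agent's history without a data migration.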

This is where it gets powerful. When an agent produces unexpected results, you can scrub through the entire execution:

ironflow inspect <run-id>

Arrow keys move frame-by-frame through the agent’s execution. You see the exact output of every step — what the planner generated, what each search returned, what the summarizer received.

No log grepping. No guessing. Just step through the timeline.

brew install sahina/tap/ironflow
ironflow serve --dev
ironflow init my-app && cd my-app
pnpm dev
ironflow emit agent.research --data '{"topic":"event sourcing best practices"}'

Check the AI agent example for a complete, runnable implementation, or start with the getting started tutorial.

Your agents deserve the same durability guarantees as your production workflows.