Cancel On Event

cancelOn declares a list of cancel specs at function config time. Ironflow auto-cancels in-flight runs whose recorded match value equals an incoming event’s extracted JSON path value. No cancelRun() calls in handler code, no listener bookkeeping — the engine wires the listener at run start and tears it down at run end.

Use it when:

A user can cancel a long-running workflow by emitting a *.cancelled event (order checkout, deployment pipeline, batch job).
A downstream system signals “stop this work” with a domain event (shipment.aborted, tenant.suspended).
A run should abort the moment a sibling workflow publishes a “no longer applicable” signal.

How It Works

emit order.placed{orderId: "ord-42"}
        │
        ▼
   ┌─────────────────────────┐
   │ run starts              │   atomic insert:
   │ cancel_on_specs persist │   - run row
   └────────────┬────────────┘   - one cancel_on_spec per cancelOn entry
                │
                ▼
       step.run("charge"…)
       step.run("ship"…)
                │
   ┌────────────┴────────────────────┐
   │ meanwhile…                       │
   │ emit order.cancelled{orderId:    │
   │   "ord-42"}                      │
   └────────────┬────────────────────┘
                │
                ▼
   scheduler matches event.orderId
   against cancel_on_specs index
                │
                ▼
        run cancelled — cleanly,
        at the next step boundary

At run start, the engine extracts each spec’s match value from the triggering event’s payload (e.g. event.data.orderId = "ord-42") and persists one cancel_on_specs row per spec, in the same transaction as the run row. When any future event in the same environment matches a spec’s (event_name, match_path, match_value) tuple, the scheduler fires CancelRunVersionedWithCause against the run via the same compare-and-swap path as imperative cancelRun(). Concurrent cancellers (parallel events, user cancel, scheduler match) all converge on a single terminal transition; losers swallow ErrVersionConflict.

OR semantics across specs. Multiple specs on the same function are independent triggers. Any matching event fires the cancel.

Tenant isolation. Spec rows carry environment_id. The match query is WHERE event_name = ? AND environment_id = ? AND match_path = ? AND match_value = ?. Tenant A’s order.cancelled cannot match Tenant B’s run.

Define a Function With `cancelOn`

TypeScript (Node SDK)

import { ironflow } from "@ironflow/node";

const processOrder = ironflow.createFunction(
  {
    id: "process-order",
    triggers: [{ event: "order.placed" }],
    cancelOn: [
      { event: "order.cancelled", match: "orderId" },
    ],
  },
  async ({ event, step }) => {
    await step.run("charge", () => chargeCard(event.data));
    await step.run("ship", () => createShipment(event.data));
    return { ok: true };
  }
);

Go SDK

import "github.com/sahina/ironflow/sdk/go/ironflow"

var ProcessOrder = ironflow.CreateFunction(ironflow.FunctionConfig{
    ID:       "process-order",
    Triggers: []ironflow.Trigger{{Event: "order.placed"}},
    CancelOn: []ironflow.CancelOnConfig{
        {Event: "order.cancelled", Match: "orderId"},
    },
}, func(ctx ironflow.Context) (any, error) {
    // ...
    return nil, nil
})

A processOrder run started by order.placed{orderId: "ord-42"} cancels when any subsequent order.cancelled event in the same environment carries orderId = "ord-42". A order.cancelled event with a different orderId (or no orderId) does nothing.

Configuration

Each entry in the cancelOn array:

Field	Type	Required	Description
`event` (TS) / `Event` (Go)	string	yes	Event name to listen for. Must be non-empty. Case-sensitive, must match the emitted event name exactly.
`match` (TS) / `Match` (Go)	string	yes	JSON path into the event’s `data` field. Resolved against the triggering event at run start, then matched against incoming candidates. Supports nested paths (`order.id`, `customer.organization.tenantId`). Same extraction rules as `concurrency.key` and `debounce.key`.

The SDK rejects empty event, empty match, and duplicate specs (same event + match pair) at createFunction time so misconfigurations fail fast at registration, not at runtime.

Match Semantics

v1 is equality only. The extracted value from the cancel event must equal the value recorded at run start. Comparison is string-based: internal/eventpath.Extract uses gjson scalar string serialization (via gjson.GetBytes(...).String()) on both sides, so 42 (number) and "42" (string) collapse to the same "42" and match. Strings, numbers, booleans, and null all match by stringified value. If you need strict typing across the boundary, project the value to a canonical string at emit time on both sides (e.g. always emit orderId as a string).

Path missing. If the cancel event’s payload lacks the match path (e.g. spec is match: "orderId" but event.data has no orderId key), the event is ignored. No match, no cancel.

Nested paths. match: "order.id" traverses event.data.order.id. Missing intermediate object → no match.

Malformed JSON. Incoming events with unparseable data are ignored safely — no panic, no mismatch fired by accident.

Equality only by design. Regex, numeric ranges, and predicate matchers are out of scope for v1. If you need them, extract the matchable scalar at the emit site and put it on the event payload.

Run-Start Race

A cancel event can arrive in the millisecond window before cancel_on_specs rows commit. Ironflow closes this race with a per-process bounded replay window (D11): every event passing through Scheduler.HandleIncomingEvent is appended to an in-memory ring buffer. Immediately after CreateRunWithCancelOnSpecs, the engine replays the buffer against the new specs. A cancel event that landed up to 30 seconds before the run+specs commit still fires.

The buffer is per-process — in multi-node clusters, a cancel event landing on Node A while run-start commits on Node B is not caught by the replay (Node B’s buffer never saw the event). The permanent answer is being scoped — the structural fix (commit run+specs before the matching event row is visible) is on the internal TODO list. Best-effort by design until the window_miss rate proves the structural fix worth the work.

Tunables:

Env var	Default	Description
`IRONFLOW_CANCEL_ON_REPLAY_WINDOW_SECONDS`	`30`	TTL for buffered events. Older entries are evicted on next append.
`IRONFLOW_CANCEL_ON_REPLAY_BUFFER_SIZE`	`10000`	Max entries in the ring. Oldest evicted when full.

Larger windows raise memory cost; shrink only when you measure pressure.

Edge Cases

cancelOn integrates cleanly with the rest of the runtime — the cancel CAS path is the same one user-initiated cancels and platform cancels use, so behavior under sleep, pause, and parallel fanout is uniform:

Cancel during step.sleep. Run is paused awaiting wakeup. Cancel arrives, run flips to cancelled. When the sleep wakeup fires, resumeRun short-circuits because the run is no longer paused. No resurrection.
Cancel during scoped-injection pause. Same path — resume after injection unwind sees terminal status, drops the resume.
Cancel during step.waitForEvent. Same. The wait-satisfaction path checks the run status and honors the terminal.
Cancel mid-parallel fanout. Children stop at the next step boundary. Already-completed children are not undone (compensation requires sagas — see Sagas & Compensation).
Cancel event arrives after run already terminal. Idempotent swallow. The CAS path filters on status, the late event matches no live spec (specs are deleted on terminal cleanup), and the late attempt no-ops cleanly.
Rapid-fire identical cancel events. The CAS path dedups via the version guard. 100 identical cancels = exactly one terminal transition; the other 99 swallow ErrVersionConflict.

Cluster Behavior

Spec persistence is transactional with the run row. The match query hits a selective index on (event_name, env_id, match_path, match_value) so the hot path is O(matches) regardless of total spec count. The CAS cancel path is the P1 primitive — exactly-one-winner across nodes for any given run+version pair.

Cleanup runs on every terminal transition. MarkRunTerminal (cancel-only) and finalizeRun (completed/failed) both call DeleteCancelOnSpecsByRunID so spec rows do not orphan.

Compensation

Cancellation triggers the same termination path as cancelRun(). Saga compensation on cancel is tracked in #571 (P2b — compensateOnCancel runtime dispatch in pull workers). For now, declarative cancel terminates the run cleanly but does not auto-run saga compensate steps — compensateOnCancel: true on the function is parsed but not yet honored at runtime. Track #571 for the runtime dispatch.

Trade-offs

What you get: declarative, zero-handler-code cancellation. Cancel intent lives at the function boundary where it can be seen, audited, and tested. The cancel CAS path is the same one used everywhere else — no new failure modes.

What you give up:

Equality only in v1. Need a regex match? Project the regex output to a scalar at emit time.
Single-node race window. D11 covers ~30s on the same node; a cross-node race in a 3+ node cluster is not caught. Track the ironflow_cancel_on_spec_resolution_failed_total{reason="window_miss"} metric in production.
No mid-step cancellation. Step boundaries are the cancel points (same as imperative cancelRun()). A long-running step finishes its current call, then the boundary check honors the cancel. Both Node and Go SDKs receive the cancel signal at step boundaries only; mid-step pre-emption is not supported in either. Split long-running work into smaller steps so cancel boundaries land more often.