Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and earned 745 points on Hacker News. Eight months later, the team shipped V2 as a “Temporal alternative for TypeScript” and got 172 points. That pivot tells you something about what developers thought they wanted versus what their systems actually needed.

The shift from event-driven workflows to durable execution is not cosmetic. It changes how you think about state persistence, retry semantics, and what happens when a worker crashes halfway through a multi-step agent task.

What Durable Execution Actually Means

Event-driven workflows treat each step as a discrete function call. You trigger an event, run some code, emit another event. If a step fails, you retry that step. State lives in a database or message queue between hops.

Durable execution treats the entire workflow as a single logical unit. The orchestrator persists execution state at every decision point. If a worker dies mid-task, the runtime resumes from the last checkpoint without re-executing completed steps. You write code that looks synchronous but survives crashes, network partitions, and multi-hour delays.

Temporal popularized this model, building on its predecessor Cadence at Uber. Workflows are functions (originally written in Go or Java) that call activities (side-effecting operations) and can sleep for days without holding a connection open. The runtime replays the workflow function from history whenever it needs to make a decision, skipping already-completed activities.

The trade-off: you cannot use non-deterministic operations (random numbers, system time, direct I/O) inside workflow code. Everything must be deterministic so replay produces the same execution path.
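The replay idea can be illustrated with a small sketch. This is not Temporal's or Trigger.dev's implementation; the `Replayer` class, its `step` method, and the in-memory journal are hypothetical names used only to show how recorded results let a re-run skip completed work, and why a non-deterministic value has to be recorded rather than recomputed.

```typescript
// Minimal sketch of journal-based replay (illustrative only).
type Journal = Map<string, unknown>;

class Replayer {
  private journal: Journal;

  constructor(journal: Journal) {
    this.journal = journal;
  }

  // Run `fn` the first time a step name is seen; on replay, return the recorded result.
  step<T>(stepName: string, fn: () => T): T {
    if (this.journal.has(stepName)) {
      return this.journal.get(stepName) as T;
    }
    const value = fn();
    this.journal.set(stepName, value);
    return value;
  }
}

let chargeCalls = 0;

function workflow(r: Replayer, crashAfterCharge: boolean): string {
  // The random value is recorded, so a replay sees the same request ID.
  const requestId = r.step("request-id", () => Math.random().toString(36).slice(2));
  const charge = r.step("charge", () => {
    chargeCalls += 1;
    return `charged:${requestId}`;
  });
  if (crashAfterCharge) {
    throw new Error("simulated worker crash");
  }
  return r.step("ship", () => `shipped:${charge}`);
}

const journal: Journal = new Map();
try {
  workflow(new Replayer(journal), true);
} catch {
  // The journal outlives the crash; a real system would store it durably.
}
const finalResult = workflow(new Replayer(journal), false);
console.log(finalResult, chargeCalls);
```

The second call re-enters the function from the top, but the charge step returns its recorded result, so the side effect runs once. If the request ID were generated outside `step`, the replay would compute a different value than the one the recorded charge was made with.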

Why TypeScript Teams Hit a Wall with Temporal

Temporal’s TypeScript SDK exists, but the platform was designed around Go’s concurrency model. The documentation, examples, and operational tooling assume you are comfortable with:

Running persistent worker processes (not serverless functions)
Managing gRPC connections to a Temporal cluster
Understanding workflow replay semantics and the determinism constraints they place on ordinary async TypeScript code
Deploying and scaling workers separately from your application code

For teams building agent systems in Next.js or Remix, this is a context switch. You want to write a TypeScript function that calls an LLM, waits for a webhook, retries on rate limits, and eventually returns a result. You do not want to learn a new deployment model.

Trigger.dev V2 targets this gap. It provides durable execution primitives with a TypeScript-native API and a managed runtime that handles worker orchestration.

Orchestration Primitives in Trigger.dev

The core abstraction is the task. A task is a TypeScript function that runs durably. If it throws or its worker crashes, the platform retries it. If it waits, the platform persists state and releases resources. The examples here import from the SDK’s @trigger.dev/sdk/v3 entry point.

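import { task, wait } from "@trigger.dev/sdk/v3";
// `stripe` and `shippo` are assumed to be pre-configured API clients
// exported from a hypothetical local module.
import { stripe, shippo } from "./clients";

export const processOrder = task({
  id: "process-order",
  run: async (payload: { orderId: string; amount: number; token: string }) => {
    // Step 1: Charge payment (a thrown error fails the attempt and triggers a retry)
    const charge = await stripe.charges.create({
      amount: payload.amount,
      currency: "usd", // assumed; Stripe requires a currency
      source: payload.token,
    });

    // Step 2: Pause for 5 minutes (checkpoints state and releases the worker)
    await wait.for({ seconds: 300 });

    // Step 3: Ship order (only reached if the charge call did not throw)
    const shipment = await shippo.transactions.create({
      orderId: payload.orderId,
    });

    return { charge, shipment };
  },
});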

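The wait.for call does not block a thread. The runtime checkpoints state, releases the worker, and resumes execution when the timer expires. If the worker crashes after that checkpoint but before shipping, the task resumes at the wait.for boundary without re-charging. A crash between the charge and the checkpoint is a different case: nothing has recorded that the charge happened, so a retried attempt would issue it again unless the charge call carries an idempotency key.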
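One common way to make a repeated charge harmless, whatever the platform's checkpoint granularity, is an idempotency key derived from values that stay stable across attempts. The sketch below uses a hypothetical in-memory `FakePayments` class in place of a real payment provider; the class, the `chargeKey` helper, and the key format are all assumptions made for illustration.

```typescript
// Hypothetical in-memory payment provider that deduplicates on an idempotency key.
interface Charge {
  id: string;
  amount: number;
}

class FakePayments {
  private seen = new Map<string, Charge>();
  chargeCount = 0;

  charge(amount: number, idempotencyKey: string): Charge {
    const existing = this.seen.get(idempotencyKey);
    if (existing !== undefined) {
      return existing; // repeated request: return the original charge, do not bill again
    }
    this.chargeCount += 1;
    const created: Charge = { id: `ch_${this.chargeCount}`, amount };
    this.seen.set(idempotencyKey, created);
    return created;
  }
}

// Derive the key from data that does not change between attempts for the same order.
function chargeKey(orderId: string): string {
  return `charge:${orderId}`;
}

const payments = new FakePayments();
const firstAttempt = payments.charge(1999, chargeKey("order_42"));
// Simulated re-execution after a crash: same order, same key.
const secondAttempt = payments.charge(1999, chargeKey("order_42"));

console.log(firstAttempt.id === secondAttempt.id, payments.chargeCount);
```

Real providers such as Stripe accept an idempotency key as a request option, so the same pattern applies there: the key travels with the request, and the provider decides whether the work has already been done.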

State Persistence Model

Trigger.dev persists execution state in Postgres. Each task run gets:

A unique run ID
A status (pending, running, completed, failed)
A payload snapshot
A history of completed steps
Retry metadata (attempt count, next retry time)

When a task resumes, the runtime loads the snapshot and replays the function. Completed steps return cached results instead of re-executing. This is similar to Temporal’s event sourcing model but implemented with relational storage instead of a custom event log.
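To make the record described above concrete, here is a sketch of what such a persisted run might look like and how a runtime could decide where to resume. The field names and the `firstIncompleteStep` helper are hypothetical and do not reflect Trigger.dev's actual schema.

```typescript
// Hypothetical shape of a persisted run; field names are illustrative only.
type RunStatus = "pending" | "running" | "completed" | "failed";

interface RunRecord {
  runId: string;
  status: RunStatus;
  payload: unknown;
  completedSteps: Record<string, unknown>; // step name -> cached result
  attempt: number;
  nextRetryAt: number | null; // epoch milliseconds, null when no retry is scheduled
}

// Return the first step whose result is not yet stored, or null if all are done.
function firstIncompleteStep(run: RunRecord, stepOrder: string[]): string | null {
  for (const step of stepOrder) {
    if (!(step in run.completedSteps)) {
      return step;
    }
  }
  return null;
}

const saved: RunRecord = {
  runId: "run_1234",
  status: "running",
  payload: { orderId: "order_42" },
  completedSteps: { charge: { id: "ch_1" } },
  attempt: 2,
  nextRetryAt: null,
};

const resumeAt = firstIncompleteStep(saved, ["charge", "wait", "ship"]);
console.log(resumeAt); // "wait"
```

Because the charge result is already stored, a resumed run would read it from `completedSteps` and continue from the wait step.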

Retry Semantics

Tasks retry automatically with exponential backoff. You configure retry behavior per task:

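export const flakeyTask = task({
  id: "flakey-task",
  retry: {
    maxAttempts: 5,
    factor: 2,
    minTimeoutInMs: 1000,
    maxTimeoutInMs: 60000,
  },
  run: async (payload: { requestId: string }) => {
    // `unreliableAPI` stands in for any client that throws on transient errors;
    // a thrown error fails the attempt and the platform schedules the next one
    return await unreliableAPI.call(payload);
  },
});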

Retries are not immediate. The platform schedules the next attempt based on the backoff policy and releases the worker. This prevents retry storms and allows other tasks to use worker capacity.
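As a rough illustration of how those four settings interact, the sketch below computes the wait before each retry using the common formula min(max, min × factor^n). This is an assumption about the general shape of exponential backoff, not a statement of Trigger.dev's exact scheduling, which may also add jitter; `minDelayMs` and `maxDelayMs` are hypothetical names standing in for the minimum and maximum timeout settings.

```typescript
interface RetryPolicy {
  maxAttempts: number;
  factor: number;
  minDelayMs: number;
  maxDelayMs: number;
}

// Delay before each retry. Attempt 1 runs immediately, so a policy with
// maxAttempts = 5 produces four delays (before attempts 2 through 5).
function retryDelays(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < policy.maxAttempts - 1; retry++) {
    const uncapped = policy.minDelayMs * policy.factor ** retry;
    delays.push(Math.min(policy.maxDelayMs, uncapped));
  }
  return delays;
}

const delays = retryDelays({ maxAttempts: 5, factor: 2, minDelayMs: 1000, maxDelayMs: 60000 });
console.log(delays); // [1000, 2000, 4000, 8000]
```

With these numbers the cap never applies. It only starts to matter from the seventh retry onward, where the uncapped delay would reach 64 seconds.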

Concurrency and Queue Control

Agent systems often need to limit concurrency. You might want to:

Run only one instance of a task per user (avoid duplicate work)
Limit API calls to respect rate limits
Serialize access to a shared resource

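Trigger.dev provides per-task queue limits and concurrency keys. The limit is set on the task, and the key is supplied when the run is triggered:

export const userTask = task({
  id: "user-task",
  queue: {
    concurrencyLimit: 1,
  },
  run: async (payload: { userId: string }) => {
    // With a per-user concurrencyKey, only one run per userId executes at a time
  },
});

// When triggering: each distinct concurrencyKey gets its own queue
await userTask.trigger({ userId: "user_123" }, { concurrencyKey: "user_123" });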

The platform maintains a queue per concurrency key. Tasks with the same key execute serially. Tasks with different keys run in parallel up to the global concurrency limit.

This is cheaper than distributed locks. The orchestrator enforces serialization at the queue level without requiring workers to coordinate.
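The scheduling rule described above can be sketched as a pure function. This is a simplified model, assuming a per-key limit of one and a single global limit; `QueuedRun` and `pickRunnable` are hypothetical names, not part of the Trigger.dev SDK.

```typescript
interface QueuedRun {
  id: string;
  key: string; // concurrency key, for example a user ID
}

// Choose which queued runs may start now: at most one in flight per key and at
// most `globalLimit` running overall. Queue order is preserved within a key.
function pickRunnable(queued: QueuedRun[], running: QueuedRun[], globalLimit: number): QueuedRun[] {
  const busyKeys = new Set(running.map((r) => r.key));
  const started: QueuedRun[] = [];
  for (const run of queued) {
    if (running.length + started.length >= globalLimit) {
      break; // global capacity exhausted
    }
    if (busyKeys.has(run.key)) {
      continue; // same key already in flight: this run stays queued
    }
    busyKeys.add(run.key);
    started.push(run);
  }
  return started;
}

const queued: QueuedRun[] = [
  { id: "a1", key: "alice" },
  { id: "a2", key: "alice" },
  { id: "b1", key: "bob" },
  { id: "c1", key: "carol" },
];

const startedIds = pickRunnable(queued, [], 2).map((r) => r.id);
console.log(startedIds); // ["a1", "b1"]
```

The second run for "alice" is skipped because her key is busy, and "carol" waits because the global limit of two is reached. No worker ever takes a lock; the decision is made entirely by whoever owns the queue.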

Long-Running Task Guarantees

Agent workflows often involve:

Waiting for human approval
Polling external APIs until a condition is met
Running multi-hour data processing jobs

Trigger.dev tasks can run indefinitely. The platform does not impose a timeout. You can write:

export const pollUntilReady = task({
  id: "poll-until-ready",
  run: async (payload: { jobId: string }) => {
    while (true) {
      const status = await externalAPI.checkStatus(payload.jobId);
      if (status === "complete") {
        return status;
      }
      await wait.for({ minutes: 5 });
    }
  },
});

The loop does not consume resources while waiting. Each wait.for releases the worker. The task resumes every 5 minutes, checks status, and either completes or waits again.

If the external API is down for 3 hours, the task survives, provided failed status checks are caught or retried rather than exhausting the task’s attempts. If the worker crashes, the task resumes from the last checkpoint.

Observability Shape

Trigger.dev provides a web UI that shows:

All task runs with status, duration, and retry count
Execution timeline (when each step started and completed)
Logs and errors per run
Queue depth and concurrency metrics

The platform instruments tasks automatically. You do not need to add tracing code. Each step in the task function appears as a span in the timeline.

For programmatic access, the SDK exposes a runs API:

import { runs } from "@trigger.dev/sdk/v3";

const run = await runs.retrieve("run_1234");
console.log(run.status, run.output, run.error);

This is useful for building custom dashboards or integrating task status into your application UI.

Deployment Model Comparison

Aspect	Temporal	Trigger.dev
Runtime	Self-hosted cluster or Temporal Cloud (Cassandra or SQL database + workers)	Managed platform, self-hostable (Postgres + workers)
Worker deployment	You deploy and scale worker processes	Platform manages workers
Language support	Go, Java, TypeScript, Python, and others	TypeScript only
State storage	Event sourcing (append-only event history)	Postgres snapshots
Replay model	Full workflow replay from history	Step-level checkpointing
Serverless compatibility	Requires persistent workers	Tasks can trigger from serverless
Operational complexity	High when self-hosted (cluster, storage, workers)	Low (managed service)

Temporal gives you more control. You can tune storage, run workers on specific hardware, and integrate with existing infrastructure. The cost is operational overhead.

Trigger.dev trades control for simplicity. You write tasks, deploy them, and the platform handles execution. On the managed service you cannot customize the runtime or run it on-premises; for that you would need to self-host the open-source version and take on the operational work yourself.

Failure Modes and Boundaries

Worker Crashes

If a worker crashes mid-task, the platform detects the failure (via heartbeat timeout) and reschedules the task. The task resumes from the last checkpoint. Completed steps do not re-execute.
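Heartbeat-based detection reduces to a simple comparison. The sketch below assumes each running task records the time of its last heartbeat; the `RunningRun` shape, the `findStalledRuns` helper, and the 30-second timeout are hypothetical values chosen for illustration.

```typescript
interface RunningRun {
  runId: string;
  lastHeartbeatAt: number; // epoch milliseconds
}

// A run is considered stalled when its last heartbeat is older than the timeout.
function findStalledRuns(runs: RunningRun[], nowMs: number, timeoutMs: number): string[] {
  return runs
    .filter((r) => nowMs - r.lastHeartbeatAt > timeoutMs)
    .map((r) => r.runId);
}

const stalled = findStalledRuns(
  [
    { runId: "run_a", lastHeartbeatAt: 95_000 }, // 5 s ago: healthy
    { runId: "run_b", lastHeartbeatAt: 60_000 }, // 40 s ago: stalled
  ],
  100_000,
  30_000,
);
console.log(stalled); // ["run_b"]
```

The timeout is a trade-off: a short value reschedules crashed work quickly but risks treating a slow, still-alive worker as dead, in which case two workers may briefly execute the same task.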

Database Failures

If Postgres is unavailable, new tasks cannot start and running tasks cannot checkpoint. The platform queues task triggers and retries checkpointing until the database recovers. Workers continue executing until they need to checkpoint or complete.

Poison Pills

If a task throws an error on every attempt, it eventually exhausts its retries and moves to a failed state. Once a run has failed, the platform schedules no further attempts. You must fix the underlying issue and then retry the run manually.

Non-Determinism Bugs

If you use Math.random() or Date.now() inside a task, replay may produce different results. Trigger.dev does not enforce determinism like Temporal. You are responsible for avoiding non-deterministic operations or accepting that replays may diverge.

When Agent Systems Need Durable Execution

Agent workflows often involve:

Multi-step reasoning loops with LLM calls
Tool calls that may fail or timeout
Human-in-the-loop approvals
Long-running data processing between reasoning steps

Without durable execution, you must manually persist state between steps. If the agent crashes after calling a tool but before processing the result, you lose context. You either re-execute the tool call (wasting money and time) or fail the entire workflow.

Durable execution makes the agent resilient to these crashes. The orchestrator checkpoints state after each tool call. If the agent crashes, it resumes with the tool result already available. The reasoning loop continues without re-executing completed steps.

This is especially important for agentic workflows that involve external APIs with rate limits or costs. You do not want to re-call GPT-4 because a worker restarted.

Technical Verdict

Use Trigger.dev when:

You are building agent systems in TypeScript and want durable execution without operational overhead
You need long-running tasks with retries and concurrency control
You want observability and queue management without writing custom infrastructure
You are comfortable with a managed platform and do not need on-premises deployment

Avoid Trigger.dev when:

You need multi-language support (Temporal has SDKs for Go, Java, Python, and other languages)
You require full control over the runtime and storage layer
You already have a Temporal cluster and expertise
You need strict determinism guarantees enforced by the runtime

The platform exposes the plumbing that agent systems need: state persistence, retry logic, concurrency control, and long-running task guarantees. It does not abstract away the orchestration model. You still write code that thinks about checkpoints, retries, and failure boundaries. But you do not deploy workers, manage queues, or run a database cluster.

For TypeScript teams building agentic workflows, that trade-off often makes sense.