Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and earned 745 points on Hacker News. Eight months later, the team shipped V2 and repositioned as a “Temporal alternative for TypeScript.” That pivot says a lot about what developers building agent infrastructure actually need: not webhook glue, but durable execution with retries, resumable state, and long-running task orchestration.

The shift exposes a fundamental architectural divide. Zapier-style tools handle event-driven workflows where each step completes in seconds. Temporal-style systems handle long-running tasks that span minutes or hours, survive crashes, and resume exactly where they left off. Trigger.dev’s V2 architecture reveals what it takes to build the second kind in TypeScript.

What Durable Execution Actually Means

Durable execution guarantees that a function runs to completion even if the process crashes, the network fails, or the server restarts. The runtime must:

Serialize execution state at checkpoints
Persist that state to durable storage
Resume from the last checkpoint after a failure
Replay deterministic operations without side effects
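
The four requirements above can be sketched as a small step runner. This is a toy model under stated assumptions, not Trigger.dev’s or Temporal’s implementation: DurableRun, CheckpointStore, and pipeline are hypothetical names, and an in-memory Map stands in for durable storage.

```typescript
// Toy model of checkpointed execution. All names are hypothetical;
// a Map stands in for durable storage such as Postgres.
type CheckpointStore = Map<string, string>; // key: "runId:seq", value: JSON result

class DurableRun {
  private seq = 0;
  private readonly runId: string;
  private readonly store: CheckpointStore;

  constructor(runId: string, store: CheckpointStore) {
    this.runId = runId;
    this.store = store;
  }

  // Runs fn at most once per (runId, sequence number). On a later attempt
  // with the same runId, the persisted result is returned and fn is skipped.
  // Results must be JSON-serializable.
  async step<T>(fn: () => Promise<T>): Promise<T> {
    const key = `${this.runId}:${this.seq++}`;
    const saved = this.store.get(key);
    if (saved !== undefined) return JSON.parse(saved) as T; // resume: skip completed step
    const result = await fn();
    this.store.set(key, JSON.stringify(result)); // serialize and persist
    return result;
  }
}

// Demo pipeline: step 1 counts its executions, step 2 can simulate a crash.
let fetchCalls = 0;
async function pipeline(run: DurableRun, crashAtStepTwo: boolean): Promise<number> {
  const a = await run.step(async () => {
    fetchCalls++;
    return 20;
  });
  return run.step(async () => {
    if (crashAtStepTwo) throw new Error("simulated crash");
    return a + 22;
  });
}
```

Running pipeline twice against the same store and run ID, with the first attempt crashing at step two, executes step one only once. The second attempt reads the first step’s result from the store instead of calling it again.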

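Temporal solves this with a history service, part of its Go-based server, that logs every workflow event. Workers replay the event log to reconstruct state. This works because workflow code is required to be deterministic: given the same history it makes the same decisions, while side effects are isolated in activities whose results are recorded in that history.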

TypeScript presents different challenges. JavaScript closures capture lexical scope, async/await creates implicit state machines, and the event loop makes timing non-deterministic. You cannot naively serialize a JavaScript function mid-execution and resume it later.

Trigger.dev’s approach:

Tasks are TypeScript functions wrapped in a task() call
The runtime instruments async boundaries (await points)
State snapshots happen at each await
Retries resume from the last successful await, not from the start

This means your task code looks like normal TypeScript, but the runtime tracks execution progress behind the scenes.

Architecture Comparison: Trigger.dev vs Temporal

Component	Temporal	Trigger.dev V2
Runtime	Go server (including the history service) + SDK workers	Node.js/Bun workers + Postgres
State persistence	Event sourcing (append-only log)	Checkpoint snapshots at await points
Replay mechanism	Full event replay from start; re-executes deterministic code	Resume from last checkpoint; skips completed steps using persisted results
Language support	Go, Java, Python, TypeScript, .NET, PHP (SDKs)	TypeScript native
Deployment model	Self-hosted cluster or Temporal Cloud	Managed platform or self-hosted
Observability	Temporal UI + history queries	Real-time dashboard + trace logs
Failure recovery	Automatic replay from history	Automatic retry from checkpoint

Temporal’s event sourcing gives you complete audit trails and time-travel debugging. Every decision, timer, and activity is logged. You can replay a workflow from any point in history.
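
To make the replay idea concrete, here is a toy model of event-sourced replay. It is a sketch, not Temporal’s API: ReplayContext, HistoryEvent, and workflow are hypothetical names, and the activities are synchronous stand-ins for real calls.

```typescript
// Toy model of event-sourced replay (hypothetical names; not Temporal's API).
type HistoryEvent = { type: "ActivityCompleted"; name: string; result: unknown };

class ReplayContext {
  private cursor = 0;
  readonly events: HistoryEvent[];
  executed: string[] = []; // activities actually run during this attempt

  constructor(events: HistoryEvent[]) {
    this.events = events;
  }

  // While recorded events remain, return the recorded result. Past the end
  // of the log, run the activity for real and append its result.
  activity<T>(name: string, fn: () => T): T {
    const recorded = this.events[this.cursor];
    if (recorded !== undefined) {
      if (recorded.name !== name) {
        throw new Error(`non-deterministic workflow: expected ${recorded.name}, got ${name}`);
      }
      this.cursor++;
      return recorded.result as T;
    }
    const result = fn();
    this.events.push({ type: "ActivityCompleted", name, result });
    this.cursor++;
    this.executed.push(name);
    return result;
  }
}

// Workflow code re-runs from the top on every attempt; only activities
// missing from the log are actually executed.
function workflow(ctx: ReplayContext): number {
  const price = ctx.activity("getPrice", () => 100);
  const tax = ctx.activity("getTax", () => 8);
  return price + tax;
}
```

The name check is why replay-based systems insist on deterministic workflow code: if the code asks for a different activity than the log recorded, the recorded result cannot be trusted.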

Trigger.dev’s checkpoint model trades audit granularity for simpler state management. You get resumability without replaying every operation. Intermediate results are persisted and reused on retry. This matters for non-deterministic operations like API calls. Temporal requires you to wrap them in activities, while Trigger.dev checkpoints after each await.

Task Orchestration Plumbing

Here’s what a durable task looks like:

// task() is exported from the SDK's v3 entry point.
import { task } from "@trigger.dev/sdk/v3";
import OpenAI from "openai";
// Hypothetical application helpers; replace with your own implementations.
import { fetchDocument, ocrService, saveAnalysis } from "./services";

// Reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();

export const processDocument = task({
  id: "process-document",
  retry: {
    maxAttempts: 3,
    factor: 2, // each retry waits twice as long as the previous one
    minTimeoutInMs: 1000, // delay before the first retry
  },
  run: async (payload: { documentId: string }) => {
    // Checkpoint 1: Persisted to Postgres after completion
    const doc = await fetchDocument(payload.documentId);

    // Checkpoint 2: OCR result stored; skipped on retry if already completed
    const text = await ocrService.extract(doc.url);

    // Checkpoint 3: AI response persisted; reused if this step succeeded before crash
    const analysis = await openai.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "user", content: text }],
    });

    // Checkpoint 4: Final state written atomically
    await saveAnalysis(payload.documentId, analysis);

    return { status: "complete", analysisId: analysis.id };
  },
});

If the worker crashes after the OCR step, the next retry resumes after checkpoint 2. The document fetch and OCR results are already persisted, so the runtime skips those steps and continues at the OpenAI call.

Key plumbing details:

Each await creates an implicit checkpoint
The runtime serializes the payload and intermediate results
Retries use exponential backoff with jitter
Timeouts apply per-checkpoint, not per-task
Concurrency limits prevent queue overload
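
Exponential backoff with jitter is easy to get subtly wrong, so a worked sketch helps. The helper below uses the “full jitter” strategy, which is one common choice and an assumption here, not a statement of how Trigger.dev randomizes delays. The names backoffDelay, baseMs, and maxMs are hypothetical and local to this example.

```typescript
// Hypothetical helper: delay in milliseconds before retry number `attempt` (1-based).
interface BackoffOptions {
  baseMs: number; // delay ceiling for the first retry
  maxMs: number; // upper bound on any delay
  factor: number; // multiplier applied per attempt
}

function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random // injectable for deterministic tests
): number {
  const exponential = opts.baseMs * Math.pow(opts.factor, attempt - 1);
  const capped = Math.min(exponential, opts.maxMs);
  // Full jitter: pick uniformly in [0, capped) so retries from many
  // tasks do not all land on the same instant.
  return Math.floor(random() * capped);
}
```

With baseMs 1000, factor 2, and maxMs 30000, the ceilings for the first three retries are 1000, 2000, and 4000 ms, and the tenth retry is capped at 30000 ms. The actual delay is a random value below each ceiling.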

State Management Across Failures

The hardest part of durable execution is handling partial state. If a task makes three API calls and crashes after the second, you need to:

Avoid re-executing the first two calls
Preserve their results for the third call
Handle cases where the second call succeeded but the state write failed

Trigger.dev’s checkpoint system addresses this by:

Writing state to Postgres after each await completes
Tagging each checkpoint with a sequence number
Comparing sequence numbers on retry to skip completed steps
Using database transactions to ensure state consistency
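
Sequence numbers alone do not cover the third case above, where the external call succeeded but the checkpoint write failed. One common mitigation, assumed here rather than taken from Trigger.dev’s internals, is to derive an idempotency key from the run ID and sequence number and send it with the call. ExternalService and runChargeStep are hypothetical names, and the service is a stand-in for any API that deduplicates by key.

```typescript
// Hypothetical external API that deduplicates requests by idempotency key.
class ExternalService {
  private receipts = new Map<string, string>();
  sideEffects = 0; // how many times the real side effect happened

  charge(key: string, amountCents: number): string {
    const prior = this.receipts.get(key);
    if (prior !== undefined) return prior; // duplicate request: return the original receipt
    this.sideEffects++;
    const receipt = `receipt-${this.sideEffects}-${amountCents}`;
    this.receipts.set(key, receipt);
    return receipt;
  }
}

// One attempt at step `seq` of run `runId`. `checkpoints` stands in for Postgres.
function runChargeStep(
  service: ExternalService,
  checkpoints: Map<string, string>,
  runId: string,
  seq: number,
  checkpointWriteFails: boolean // simulates losing the database after the call
): string {
  const key = `${runId}:${seq}`; // same key on every retry of this step
  const saved = checkpoints.get(key);
  if (saved !== undefined) return saved; // step already checkpointed: skip the call
  const receipt = service.charge(key, 500);
  if (checkpointWriteFails) {
    throw new Error("checkpoint write failed after a successful call");
  }
  checkpoints.set(key, receipt);
  return receipt;
}
```

The retry repeats the call because no checkpoint exists, but the service recognizes the key and returns the original receipt, so the side effect happens once.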

This creates a failure mode: if Postgres is unavailable, tasks cannot checkpoint and will fail. Temporal’s event log has the same dependency on its persistence layer, but the append-only model is simpler to scale and replicate.

Long-Running Task Failure Modes

Agent workflows often run for minutes or hours. Common failure scenarios:

Timeout cascades: A task with a 30-minute budget calls five APIs in sequence, each with a 10-minute timeout. The worst case is 50 minutes, so three slow calls can exhaust the whole budget before the fourth step starts. Solution: set per-step timeouts whose sum fits inside the task-level timeout, instead of relying on the task-level timeout alone.
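
A per-step timeout can be sketched with Promise.race. The helper name withTimeout is hypothetical, and this is a minimal version under one important caveat: it stops waiting for the work but does not cancel it, so a real implementation would also pass an AbortSignal to the underlying call where the client supports one.

```typescript
// Hypothetical helper: reject if `work` does not settle within `ms` milliseconds.
// Note: this stops waiting but does not cancel the underlying operation.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot
  // keep the process alive after the step finishes.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Wrapping each step, for example withTimeout(ocrService.extract(doc.url), 120000, "ocr"), means a hung dependency fails that step quickly and leaves the rest of the task budget for retries.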

Non-deterministic retries: If your task reads the current timestamp or generates random IDs, retries produce different results. Solution: generate IDs and timestamps once, before the first checkpoint, and pass them as state.
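
A minimal sketch of that fix, assuming the run’s saved state is available at the start of each attempt. RunState and initState are hypothetical names, and the ID and clock sources are injected so the behavior is easy to test.

```typescript
// Hypothetical per-run state that must stay identical across retries.
interface RunState {
  requestId: string;
  startedAt: string; // ISO-8601 timestamp
}

// Returns the saved state if a previous attempt created it; otherwise
// generates the values once. Callers persist the result before doing any work.
function initState(
  saved: RunState | undefined,
  newId: () => string,
  now: () => Date
): RunState {
  if (saved !== undefined) return saved;
  return { requestId: newId(), startedAt: now().toISOString() };
}
```

Every later step reads requestId and startedAt from this state instead of calling the clock or an ID generator again, so a retry sends the same values as the first attempt.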

External state drift: A task fetches a user record, processes it, then updates the record. If the user record changes between retries, the update may overwrite newer data. Solution: use optimistic locking or version checks.
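
Optimistic locking can be sketched with a version column. UserTable is a hypothetical in-memory stand-in for a database table; the comment shows the SQL shape the method imitates.

```typescript
interface UserRecord {
  id: string;
  version: number;
  email: string;
}

// In-memory stand-in for a users table with a version column.
class UserTable {
  private rows = new Map<string, UserRecord>();

  insert(row: UserRecord): void {
    this.rows.set(row.id, { ...row });
  }

  get(id: string): UserRecord | undefined {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined; // return a copy, like a query result
  }

  // Same idea as:
  //   UPDATE users SET email = $1, version = version + 1
  //   WHERE id = $2 AND version = $3
  // Returns false when another writer changed the row first.
  updateIfVersion(id: string, expectedVersion: number, email: string): boolean {
    const row = this.rows.get(id);
    if (!row || row.version !== expectedVersion) return false;
    this.rows.set(id, { id, email, version: expectedVersion + 1 });
    return true;
  }
}
```

When updateIfVersion returns false, the task should re-read the record and decide whether its change still applies, instead of overwriting the newer data.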

Queue backpressure: If tasks arrive faster than workers can process them, the queue grows unbounded. Solution: set concurrency limits and reject new tasks when the queue is full.
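
The two controls, a cap on queued work and a cap on running work, can be sketched as a small in-memory queue. BoundedQueue is a hypothetical name, and a production system would back this with Redis or a database instead of an array.

```typescript
// Hypothetical in-memory queue with a size cap and a concurrency cap.
class BoundedQueue<T> {
  private waiting: T[] = [];
  private running = 0;
  private readonly maxQueued: number;
  private readonly maxConcurrent: number;

  constructor(maxQueued: number, maxConcurrent: number) {
    this.maxQueued = maxQueued;
    this.maxConcurrent = maxConcurrent;
  }

  // Returns false (rejects the task) when the queue is full, so the
  // caller can shed load or tell the producer to slow down.
  enqueue(item: T): boolean {
    if (this.waiting.length >= this.maxQueued) return false;
    this.waiting.push(item);
    return true;
  }

  // Hands out the next task only while below the concurrency limit.
  take(): T | undefined {
    if (this.running >= this.maxConcurrent) return undefined;
    const next = this.waiting.shift();
    if (next !== undefined) this.running++;
    return next;
  }

  // Called by a worker when its task finishes, freeing a slot.
  complete(): void {
    if (this.running > 0) this.running--;
  }
}
```

Rejecting at enqueue time pushes the overload signal back to the producer, which is usually easier to handle than a queue that silently grows until memory or latency limits are hit.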

Trigger.dev provides concurrency controls and per-task retry policies, but you still need to design for idempotency and state consistency.

Observability and Debugging

Durable execution makes debugging harder because execution is non-linear. A task may pause for hours, resume on a different worker, and retry multiple times.

Trigger.dev’s observability stack:

Real-time dashboard showing task status, duration, and retry count
Trace logs for each checkpoint with payload and result
Span-based tracing to correlate task execution across retries (a span represents a single operation like one API call; tracing links spans across retries to show the full execution path)
Webhook notifications for task completion or failure

This is less granular than Temporal’s event history, which logs every decision and timer. But for most use cases, checkpoint-level visibility is enough.

Deployment Shape

Trigger.dev V2 runs as a managed platform or self-hosted. The self-hosted option requires:

Postgres for state persistence
Redis for queue management
Node.js or Bun workers
Optional: S3 for large payloads

The managed platform handles scaling, monitoring, and infrastructure. Workers auto-scale based on queue depth. You deploy task code as Docker images or via CLI.

Temporal requires more operational overhead: a cluster with frontend, history, matching, and worker services, plus Cassandra or Postgres for persistence. The tradeoff is more control over data residency and scaling behavior.

When TypeScript-Native Matters

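Temporal’s TypeScript SDK runs workflow code in a deterministic sandbox on top of a shared Rust core, and it inherits the replay-based execution model used by every Temporal SDK. You write TypeScript, but you write it under replay rules. This creates friction:

Workflow code must be deterministic: no external I/O, and the sandbox swaps Date.now() and Math.random() for deterministic versions
Side effects must live in activities, which run outside the workflow sandbox and pay serialization overhead on arguments and results
Debugging requires understanding how replay reconstructs workflow state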

Trigger.dev’s TypeScript-native runtime removes these constraints. You write normal async TypeScript. The runtime handles durability without requiring you to separate workflows from activities.

This matters for teams that want to move fast without learning Temporal’s mental model. The tradeoff is less control over replay behavior and audit trails.

Technical Verdict

Use Trigger.dev V2 when:

Your tasks have 4+ async boundaries and average runtime between 2 and 60 minutes (the checkpoint model works best when execution has clear async breakpoints, not tight loops)
You need automatic retries for API orchestration workflows where each step is an external call (document processing, data enrichment, multi-step AI agent tasks)
Your team writes TypeScript and wants to avoid the operational cost of running a Temporal cluster (managed platform handles scaling; self-hosting requires only Postgres and Redis)
You can tolerate checkpoint-level observability instead of full event replay (you get execution status per await, not per conditional branch)
Your workflows are primarily linear pipelines with occasional branches, not complex state machines with parallel execution or human-in-the-loop steps

Avoid it when:

You need event-level replay for compliance audits or financial transaction workflows (Trigger.dev’s checkpoint model does not log every decision, only await boundaries)
Your workflows require complex branching with parallel execution, sagas, or compensation logic (Temporal’s workflow APIs handle these patterns natively; Trigger.dev requires manual orchestration)
You already run Temporal and need consistent orchestration across Go, Python, and TypeScript services (adding a second orchestration system creates operational overhead)
You require on-premises deployment with strict data residency requirements and cannot use managed infrastructure (self-hosting is possible but less mature than Temporal’s deployment options)
Your tasks execute tight loops or CPU-bound operations without async boundaries (the checkpoint model depends on await points; synchronous code cannot be resumed mid-execution)

The V1-to-V2 pivot reveals a market truth: developers building agent infrastructure need more than webhook glue. They need durable execution, automatic retries, and resumable state. Trigger.dev’s TypeScript-native approach trades Temporal’s event sourcing rigor for simpler developer experience. That tradeoff works for most agent and automation use cases.

Source Links

Trigger.dev V2 Announcement (172 points, 39 comments)
Trigger.dev V1 Show HN (745 points, 190 comments)
Trigger.dev GitHub
Trigger.dev Documentation