Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

How Trigger.dev pivoted from event triggers to durable execution, exposing retry semantics, state persistence, and the TypeScript orchestration gap.

Source: trigger.dev

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and earned 745 points on Hacker News. Eight months later, the team shipped V2 with a complete architectural pivot: they became a Temporal alternative for TypeScript developers. That shift from event-driven webhooks to durable execution exposes the infrastructure gap between workflow orchestration and long-running background jobs in the JavaScript ecosystem.

The V2 announcement (172 points, October 2023) positioned the platform explicitly against Temporal, targeting developers who need retry semantics, state persistence, and observability without operating a Temporal cluster or adopting its deterministic workflow model. The pivot reveals what happens when you build orchestration plumbing for a language runtime that was never designed for it.

The V1 to V2 Pivot: Event Triggers to Durable Tasks

V1 focused on event-driven automation. You connected webhooks, scheduled jobs, and API integrations through a visual builder. The execution model was stateless: trigger fires, handler runs, job completes or fails.

V2 rebuilt the core around durable execution:

  • Task primitives replace event handlers
  • Automatic retries with exponential backoff
  • State persistence across failures and restarts
  • Long-running workflows that survive process crashes
  • Observability with execution traces and step-level logging

The shift addresses a specific pain point: TypeScript developers building AI agents, media pipelines, or data workflows need Temporal’s guarantees but cannot justify the operational overhead of running a Temporal cluster (a multi-service Go server backed by a separate database).

Durable Execution Without Event Sourcing

Temporal uses event sourcing. Every state transition writes an event to durable storage. Replay reconstructs execution state by re-running the workflow code against the event log. This model guarantees determinism but requires strict rules: workflow code must not generate random numbers, read the wall clock, or perform I/O directly. Those operations go through SDK-provided APIs or activities, so their results are recorded in the log and returned unchanged on replay.
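To make the replay rule concrete, here is a minimal sketch of replay-based recovery. It is a simplified model, not Temporal's API or internals: makeContext, call, and orderWorkflow are hypothetical names, and the "event log" is a plain array.

```typescript
// Simplified model of replay-based recovery; not Temporal's actual API.
type HistoryEvent = { activity: string; result: unknown };

// Builds a context whose `call` returns recorded results while replaying
// and only runs the activity for real once the recorded history is exhausted.
function makeContext(eventLog: HistoryEvent[]) {
  let cursor = 0;
  let executed = 0;
  return {
    call<T>(activity: string, fn: () => T): T {
      if (cursor < eventLog.length) {
        const recorded = eventLog[cursor++];
        if (recorded.activity !== activity) {
          // The code took a different path than the one recorded: replay is unsafe.
          throw new Error(
            `non-deterministic workflow: log has "${recorded.activity}", code called "${activity}"`,
          );
        }
        return recorded.result as T;
      }
      const result = fn();
      executed++;
      eventLog.push({ activity, result });
      cursor++;
      return result;
    },
    executedCount: () => executed,
  };
}

// Hypothetical workflow: the sequence of `call`s must be identical on every run.
function orderWorkflow(ctx: ReturnType<typeof makeContext>): string {
  const charge = ctx.call("charge", () => "charge-1");
  return ctx.call("ship", () => `label-for-${charge}`);
}

// The first run records two events; the second run replays them without side effects.
const eventLog: HistoryEvent[] = [];
const firstRun = orderWorkflow(makeContext(eventLog));
const replayCtx = makeContext(eventLog);
const replayedRun = orderWorkflow(replayCtx);
```

In this model the second run reaches the same return value without executing either activity again. If the workflow branched on an unrecorded Math.random() call, the sequence of calls could diverge from the log, which is the failure the determinism rules exist to prevent.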

Trigger.dev takes a different approach:

  • Checkpoint-based persistence: State snapshots at explicit await boundaries
  • Idempotency keys: Automatic deduplication for retried steps
  • Step isolation: Each task step is a separate execution unit with its own retry policy
  • No replay constraints: You can call Math.random() or Date.now() in task code

This trades deterministic replay for developer ergonomics. The system does not re-execute completed steps during retries. Instead, it resumes from the last successful checkpoint.
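The resume-from-checkpoint behavior can be sketched as follows. This is an illustration of the general technique under stated assumptions, not Trigger.dev's implementation: step, CheckpointStore, and the in-memory Map are hypothetical stand-ins for a durable store.

```typescript
// Hypothetical checkpoint store; a real system would persist this outside the process.
type CheckpointStore = Map<string, unknown>;

// Runs `fn` once per (runId, stepName). On resume the saved output is returned
// and `fn` is skipped, so non-deterministic code is safe inside a step.
function step<T>(store: CheckpointStore, runId: string, stepName: string, fn: () => T): T {
  const key = `${runId}:${stepName}`;
  if (store.has(key)) return store.get(key) as T;
  const output = fn();
  store.set(key, output);
  return output;
}

let draws = 0;
function pickShard(): number {
  draws++;
  return Math.floor(Math.random() * 16);
}

// A two-step task that can be made to crash between its steps.
function runTask(store: CheckpointStore, runId: string, crashBeforeUpload: boolean): string {
  const shard = step(store, runId, "pick-shard", pickShard);
  if (crashBeforeUpload) throw new Error("simulated worker crash");
  return step(store, runId, "upload", () => `uploaded-to-shard-${shard}`);
}

const store: CheckpointStore = new Map();
let crashedOnce = false;
try {
  runTask(store, "run-1", true); // first attempt dies after the first checkpoint
} catch {
  crashedOnce = true;
}
const shardAfterCrash = store.get("run-1:pick-shard");
const finalOutput = runTask(store, "run-1", false); // resume: pickShard is not called again
```

The random draw happens exactly once. The second attempt reads the stored shard instead of drawing a new one, which is why this model tolerates Math.random() and Date.now() inside a step.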

Architecture: Task Queue and Worker Isolation

The V2 runtime separates orchestration from execution:

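// The "/v3" import path is an assumption about the installed SDK version;
// newer releases also export task from "@trigger.dev/sdk".
import { task } from "@trigger.dev/sdk/v3";

// downloadVideo, transcodeVideo, and uploadToS3 are application helpers defined elsewhere.

// Task definition with retry and timeout boundaries
export const processVideo = task({
  id: "process-video",
  retry: {
    maxAttempts: 3,        // first try plus two retries
    factor: 2,             // exponential backoff multiplier
    minTimeoutInMs: 1000,  // delay before the first retry
    maxTimeoutInMs: 10000, // cap on any single delay
  },
  run: async (payload: { videoUrl: string }) => {
    // Step 1: Download (isolated retry boundary)
    const file = await downloadVideo(payload.videoUrl);

    // Step 2: Transcode (separate checkpoint)
    const transcoded = await transcodeVideo(file);

    // Step 3: Upload (final checkpoint)
    const url = await uploadToS3(transcoded);

    return { url };
  },
});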

Orchestration layer:

  • Manages task queues and scheduling
  • Tracks execution state in Postgres
  • Handles retry logic and timeout enforcement
  • Routes tasks to available workers

Worker layer:

  • Pulls tasks from queues
  • Executes task code in isolated Node.js processes
  • Reports checkpoints and failures back to orchestrator
  • Scales horizontally based on queue depth

The separation allows elastic scaling. Workers can crash, restart, or scale to zero without losing execution state.
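One common way to get this crash tolerance is a lease: a worker holds a task only for a bounded time, and the orchestrator hands the task out again if the lease lapses. The sketch below assumes that design for illustration. The source does not say Trigger.dev uses leases, and Orchestrator, TaskRecord, and the method names are hypothetical.

```typescript
// Hypothetical in-memory stand-in for the orchestrator's task table.
type TaskRecord = {
  id: string;
  state: "queued" | "leased" | "done";
  leaseExpiresAt: number;    // ms timestamp; only meaningful while leased
  checkpoint: string | null; // last step a worker reported as complete
};

class Orchestrator {
  private tasks: TaskRecord[] = [];

  enqueue(id: string): void {
    this.tasks.push({ id, state: "queued", leaseExpiresAt: 0, checkpoint: null });
  }

  // A worker claims the first task that is queued or whose lease has expired.
  claim(now: number, leaseMs: number): TaskRecord | undefined {
    const task = this.tasks.find(
      (t) => t.state === "queued" || (t.state === "leased" && t.leaseExpiresAt <= now),
    );
    if (!task) return undefined;
    task.state = "leased";
    task.leaseExpiresAt = now + leaseMs;
    return task;
  }

  // Checkpoints live with the orchestrator, so they outlive any single worker.
  reportCheckpoint(id: string, stepName: string): void {
    const task = this.tasks.find((t) => t.id === id);
    if (task) task.checkpoint = stepName;
  }

  complete(id: string): void {
    const task = this.tasks.find((t) => t.id === id);
    if (task) task.state = "done";
  }
}

const orchestrator = new Orchestrator();
orchestrator.enqueue("process-video-1");

// Worker A claims the task at t=0, finishes the download step, then crashes.
const claimByA = orchestrator.claim(0, 30_000);
orchestrator.reportCheckpoint("process-video-1", "download");

// Worker B polls at t=10s (lease still held) and again at t=30s (lease expired).
const claimTooEarly = orchestrator.claim(10_000, 30_000);
const claimByB = orchestrator.claim(30_000, 30_000);
const resumeFrom = claimByB ? claimByB.checkpoint : null;
```

Because the checkpoint is stored with the task record rather than in the worker, the second worker learns that the download already finished and can continue from there.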

Retry Semantics and Failure Boundaries

Trigger.dev exposes three retry scopes:

  1. Task-level retries: Entire task re-executes from the beginning
  2. Step-level retries: Individual await boundaries retry independently
  3. Manual retries: Explicit retry blocks with custom logic

Each scope has configurable backoff, jitter, and max attempts. The system tracks retry counts in the execution state, so you can implement circuit breakers or escalation logic.
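As a worked example of how backoff, jitter, and max attempts interact, the sketch below computes a retry schedule. The formula is the standard capped exponential backoff, assumed here rather than taken from Trigger.dev's source, and RetryPolicy, backoffDelay, and backoffSchedule are hypothetical names. The sample values mirror the process-video task (3 attempts, factor 2, 1000 ms to 10000 ms).

```typescript
// Hypothetical policy shape for illustration.
type RetryPolicy = {
  maxAttempts: number; // total tries, including the first
  factor: number;      // multiplier applied per retry
  minDelayMs: number;  // delay before the first retry
  maxDelayMs: number;  // cap on any single delay
};

// Delay before retry number `retryNumber` (1 = the retry after the first failure).
// jitterFraction = 0 disables jitter; 0.5 shortens the delay by up to 50%.
function backoffDelay(
  policy: RetryPolicy,
  retryNumber: number,
  jitterFraction = 0,
  random: () => number = Math.random,
): number {
  const exponential = policy.minDelayMs * Math.pow(policy.factor, retryNumber - 1);
  const capped = Math.min(policy.maxDelayMs, exponential);
  // Randomizing the delay keeps many failed tasks from retrying in lockstep.
  return Math.round(capped * (1 - jitterFraction * random()));
}

// Full schedule without jitter: maxAttempts - 1 delays, one per retry.
function backoffSchedule(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < policy.maxAttempts; retry++) {
    delays.push(backoffDelay(policy, retry));
  }
  return delays;
}

const videoPolicy: RetryPolicy = { maxAttempts: 3, factor: 2, minDelayMs: 1000, maxDelayMs: 10000 };
const videoSchedule = backoffSchedule(videoPolicy); // [1000, 2000]
const longSchedule = backoffSchedule({ ...videoPolicy, maxAttempts: 6 }); // [1000, 2000, 4000, 8000, 10000]
```

With three attempts only two delays are ever used, so the 10-second cap never applies. It starts to matter from the fifth retry, where the uncapped delay would be 16 seconds.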

Failure modes:

| Failure type            | Behavior                               | Recovery            |
| ----------------------- | -------------------------------------- | ------------------- |
| Transient network error | Step retries with backoff              | Automatic           |
| Timeout (step)          | Step fails, task retries               | Automatic           |
| Timeout (task)          | Task fails, enters dead letter queue   | Manual intervention |
| Unhandled exception     | Step fails, retries up to max attempts | Automatic or DLQ    |
| Worker crash            | Task resumes from last checkpoint      | Automatic           |

The dead letter queue captures tasks that exhaust all retries. You can inspect failures, modify state, and manually re-queue.
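The inspect, modify, and re-queue flow can be sketched like this. It is a generic dead letter queue pattern under stated assumptions: DeadLetter, runWithRetries, and requeue are hypothetical names, not Trigger.dev APIs, and backoff delays are omitted to keep the example synchronous.

```typescript
// Hypothetical dead-letter record; field names are illustrative.
type DeadLetter<P> = { taskId: string; payload: P; attempts: number; lastError: string };

// Runs the handler up to maxAttempts times. On exhaustion the task is parked
// in the DLQ with its payload and last error instead of being dropped.
function runWithRetries<P, R>(
  taskId: string,
  payload: P,
  maxAttempts: number,
  handler: (payload: P) => R,
  dlq: DeadLetter<P>[],
): R | undefined {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return handler(payload);
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  dlq.push({ taskId, payload, attempts: maxAttempts, lastError });
  return undefined;
}

// Operator flow: take the oldest dead letter, patch its payload, and run it again.
function requeue<P, R>(
  dlq: DeadLetter<P>[],
  patch: (payload: P) => P,
  maxAttempts: number,
  handler: (payload: P) => R,
): R | undefined {
  const letter = dlq.shift();
  if (!letter) return undefined;
  return runWithRetries(letter.taskId, patch(letter.payload), maxAttempts, handler, dlq);
}

type Email = { to: string; body: string };
let sendCalls = 0;
function send(email: Email): string {
  sendCalls++;
  if (!email.to.includes("@")) throw new Error(`invalid recipient: ${email.to}`);
  return `sent to ${email.to}`;
}

const dlq: DeadLetter<Email>[] = [];
const firstOutcome = runWithRetries("send-email-1", { to: "ops", body: "hi" }, 3, send, dlq);
const parked = dlq[0]; // what an operator would inspect
const secondOutcome = requeue(dlq, (p) => ({ ...p, to: "ops@example.com" }), 3, send);
```

Retrying cannot fix a payload that is wrong, so all three attempts fail the same way. The dead letter keeps enough context (payload and last error) for someone to correct the input and run the task again.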

State Persistence: Postgres as the Execution Log

Trigger.dev uses Postgres for state storage:

  • Execution records: Task metadata, status, retry count
  • Checkpoints: Serialized state at each await boundary
  • Event log: Step completions, failures, retries
  • Queue tables: Pending tasks, priorities, concurrency limits

This differs from Temporal’s event sourcing model. Trigger.dev does not replay events to reconstruct state. Instead, it loads the most recent checkpoint and resumes execution.

Checkpoint serialization:

  • JSON for simple types
  • Binary for buffers and streams
  • References for large objects (stored in S3)

The system automatically handles serialization. You do not need to mark functions as deterministic or avoid side effects in task code.
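A serializer that picks between those three encodings might look like the sketch below. The size threshold, the StoredValue shape, the putObject callback, and the mem:// URI scheme are all assumptions made for illustration; the source does not describe Trigger.dev's actual encoding rules.

```typescript
// Hypothetical on-disk shapes for a checkpoint value.
type StoredValue =
  | { kind: "json"; body: string }
  | { kind: "binary"; bytes: Uint8Array }
  | { kind: "reference"; uri: string };

// Chooses an encoding by type and size. `putObject` stands in for an object-store
// upload (for example S3) and returns the URI to record in the database row.
function encodeCheckpoint(
  value: unknown,
  maxInlineBytes: number,
  putObject: (bytes: Uint8Array) => string,
): StoredValue {
  if (value instanceof Uint8Array) {
    return value.byteLength > maxInlineBytes
      ? { kind: "reference", uri: putObject(value) }
      : { kind: "binary", bytes: value };
  }
  const body = JSON.stringify(value);
  const bytes = new TextEncoder().encode(body);
  return bytes.byteLength > maxInlineBytes
    ? { kind: "reference", uri: putObject(bytes) }
    : { kind: "json", body };
}

// In-memory stand-in for the object store.
const blobs: Uint8Array[] = [];
const putBlob = (bytes: Uint8Array): string => {
  blobs.push(bytes);
  return `mem://checkpoints/${blobs.length - 1}`;
};

const smallState = encodeCheckpoint({ step: "download", ok: true }, 1024, putBlob);
const smallFrame = encodeCheckpoint(new Uint8Array(16), 1024, putBlob);
const largeFrame = encodeCheckpoint(new Uint8Array(4096), 1024, putBlob);
```

Only the 4 KB buffer exceeds the 1 KB inline limit, so it is the only value uploaded. The row for that checkpoint stores a short URI instead of the payload.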

Observability: Execution Traces and Step-Level Logs

The V2 dashboard shows:

  • Execution timeline: Visual graph of steps, retries, and wait states
  • Step logs: Console output and error traces for each checkpoint
  • Metrics: Latency, retry rates, queue depth
  • Alerts: Configurable triggers for failure thresholds

Each task execution gets a unique trace ID. You can link traces to external observability tools (Datadog, Honeycomb) via OpenTelemetry.

The step-level granularity helps debug long-running workflows. You can see exactly which API call failed, how many times it retried, and what the backoff schedule was.

Concurrency Control and Queue Shaping

Trigger.dev exposes concurrency primitives:

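// The "/v3" import path is an assumption about the installed SDK version.
import { task } from "@trigger.dev/sdk/v3";

// emailProvider is an application-level client defined elsewhere.
export const sendEmail = task({
  id: "send-email",
  queue: {
    name: "email-queue",
    concurrencyLimit: 10, // Max 10 concurrent executions
  },
  run: async (payload: { to: string; body: string }) => {
    await emailProvider.send(payload);
  },
});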

Queue configuration:

  • Concurrency limits: Per-queue or per-task
  • Priority levels: High, normal, low
  • Rate limiting: Max tasks per second
  • FIFO guarantees: Optional ordering within a queue

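This allows you to shape traffic to downstream APIs. A concurrency limit caps in-flight requests rather than requests per unit of time, so the effective rate depends on how long each call takes: with concurrencyLimit: 2 and sends that take about 1.2 seconds each, throughput tops out near 100 requests/minute. If your email provider rate-limits at 100 requests/minute and your calls finish faster than that, pair the concurrency limit with the queue's rate limit and let the queue absorb bursts.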
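The relationship between a concurrency limit and a requests-per-minute limit is simple arithmetic, worked through below. The function names and the sample call durations are hypothetical; the point is that concurrency bounds in-flight work, so the resulting rate depends on how long each call takes.

```typescript
// Upper bound on completed calls per minute when `concurrencyLimit` calls
// run back to back and each takes `avgCallMs` milliseconds.
function maxCallsPerMinute(concurrencyLimit: number, avgCallMs: number): number {
  return (concurrencyLimit * 60_000) / avgCallMs;
}

// Largest concurrency that keeps the rate at or below the provider's limit.
// A result of 0 means even one-at-a-time execution is too fast, so a
// concurrency limit alone cannot enforce the limit and a rate limiter is needed.
function safeConcurrency(limitPerMinute: number, avgCallMs: number): number {
  return Math.floor((limitPerMinute * avgCallMs) / 60_000);
}

const slowCalls = maxCallsPerMinute(2, 1200); // 100 per minute
const fastCalls = maxCallsPerMinute(2, 300);  // 400 per minute
const slowBudget = safeConcurrency(100, 1200); // 2
const fastBudget = safeConcurrency(100, 300);  // 0
```

With 1.2-second sends, two concurrent executions land exactly on 100 requests/minute. With 300 ms sends the same setting allows 400 requests/minute, four times over the provider's limit.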

Deployment Shape: Managed vs. Self-Hosted

Trigger.dev offers two deployment models:

Managed (Cloud):

  • Hosted orchestration layer
  • Elastic worker scaling
  • Postgres and S3 managed by Trigger.dev
  • Pay per task execution

Self-Hosted (Open Source):

  • Run orchestration and workers in your infrastructure
  • Bring your own Postgres and object storage
  • Deploy via Docker Compose or Kubernetes
  • No execution fees

The self-hosted option uses the same codebase as the managed service. You can develop locally, test in staging, and deploy to production without changing task definitions.

Comparison: Trigger.dev vs. Temporal

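| Dimension              | Trigger.dev                           | Temporal                                                                                                   |
| ---------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| Language runtime       | Node.js, TypeScript                   | Official SDKs for Go, Java, Python, PHP, TypeScript, and .NET                                              |
| State model            | Checkpoint-based                      | Event sourcing                                                                                             |
| Replay semantics       | Resume from checkpoint                | Re-execute workflow code                                                                                   |
| Determinism required   | No                                    | Yes                                                                                                        |
| Operational complexity | Low (managed) or medium (self-hosted) | High when self-hosted (Cassandra, MySQL, or Postgres, with Elasticsearch optional for advanced visibility) |
| Observability          | Built-in dashboard                    | Built-in Web UI for workflow history; metrics and tracing via external tools                               |
| Ecosystem maturity     | Early (launched 2023)                 | Mature (launched 2019)                                                                                     |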

Temporal offers stronger guarantees for mission-critical workflows. The event sourcing model ensures you can always reconstruct state, even years later. But it requires discipline: workflow code must be deterministic, and you need to manage schema evolution carefully.

Trigger.dev optimizes for developer velocity. You write normal TypeScript. The system handles retries, state persistence, and observability without forcing you to learn a new programming model.

When Checkpoint-Based Persistence Breaks Down

The checkpoint model has limits:

  1. Non-deterministic retries: If a step produces different results on retry (e.g., calling a non-idempotent API), you can get inconsistent state.
  2. Large state objects: Serializing multi-GB objects to Postgres is slow. You need to manually offload to S3.
  3. Long-running transactions: If a step holds a database lock for hours, retries can cause deadlocks.
  4. Audit requirements: Event sourcing provides a complete history. Checkpoints only capture snapshots.

For workflows that need strict auditability or handle billions of events, Temporal’s event sourcing model is safer. For most background jobs, AI agents, and data pipelines, checkpoints are sufficient.
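The first limit (a retried step calling a non-idempotent API) is usually mitigated with idempotency keys. The sketch below shows the idea from the downstream service's side. PaymentApi, its in-memory key table, and the runId:stepName key format are hypothetical and do not reflect Trigger.dev's or any payment provider's real API.

```typescript
// Hypothetical downstream service that deduplicates requests by idempotency key.
class PaymentApi {
  private receiptsByKey = new Map<string, string>();
  chargeCount = 0;

  charge(idempotencyKey: string, amountCents: number): string {
    const existing = this.receiptsByKey.get(idempotencyKey);
    if (existing !== undefined) return existing; // repeat request: no second charge
    this.chargeCount++;
    const receipt = `receipt-${this.chargeCount}-${amountCents}`;
    this.receiptsByKey.set(idempotencyKey, receipt);
    return receipt;
  }
}

// A key that stays the same across retries of one step within one run.
function stepKey(runId: string, stepName: string): string {
  return `${runId}:${stepName}`;
}

const payments = new PaymentApi();
const chargeKey = stepKey("run-42", "charge-card");

// Attempt 1: the charge succeeds, but assume the response is lost to a timeout,
// so the step is recorded as failed and retried.
payments.charge(chargeKey, 1999);

// Attempt 2: same key, so the service returns the original receipt.
const retriedReceipt = payments.charge(chargeKey, 1999);

// A different step or run gets its own key and is charged normally.
const otherReceipt = payments.charge(stepKey("run-43", "charge-card"), 500);
```

The retry is safe only because the key is derived from stable identifiers. A key built from a timestamp or a random value would change on every attempt and defeat the deduplication.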

Real-World Use Case: AI Agent with Tool Calling

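The following snippet, based on the one on Trigger.dev’s site, shows a research agent:

// The "/v3" import path is an assumption about the installed SDK version.
import { task } from "@trigger.dev/sdk/v3";
import { generateText, type CoreMessage } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

// search, browse, and analyze (tool definitions) and executeTool are application
// code defined elsewhere. executeTool is assumed to return tool-result content
// in the shape the "tool" message role expects.

export const researchAgent = task({
  id: "research-agent",
  run: async ({ topic }: { topic: string }) => {
    const messages: CoreMessage[] = [
      { role: "user", content: `Research: ${topic}` },
    ];

    // Cap the agent at 10 model turns
    for (let i = 0; i < 10; i++) {
      const { text, toolCalls, steps } = await generateText({
        model: anthropic("claude-opus-4-20250514"),
        system: "You are a research assistant with web access.",
        messages,
        tools: { search, browse, analyze },
        maxSteps: 5,
      });

      // No tool calls means the model produced its final answer
      if (!toolCalls.length) {
        return { summary: text, stepsUsed: steps.length };
      }

      for (const call of toolCalls) {
        const result = await executeTool(call);
        messages.push({ role: "tool", content: result });
      }
    }

    // Fail loudly instead of returning undefined when the turn budget runs out
    throw new Error("research-agent did not produce a summary within 10 turns");
  },
});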

Plumbing details:

  • Each generateText call is a checkpoint
  • If the LLM API times out, the task retries from the last successful tool call
  • The messages array persists across retries, so the agent does not lose context
  • Tool execution failures trigger step-level retries
  • The entire loop can run for minutes or hours without losing state

This pattern works because the checkpoint model does not require deterministic replay. You can call the LLM API, get different results on retry, and the system handles it gracefully.

Technical Verdict

Use Trigger.dev when:

  • You need durable execution for TypeScript/Node.js workflows
  • You want automatic retries and observability without operational overhead
  • Your team prefers managed infrastructure over self-hosting Temporal
  • You are building AI agents, media pipelines, or data workflows that run for minutes to hours

Avoid Trigger.dev when:

  • You need strict auditability with event sourcing
  • Your workflows handle billions of events and require Temporal’s scalability
  • You already run Temporal and have invested in its ecosystem
  • You need multi-language support (Trigger.dev is TypeScript-only)

The V2 pivot reveals a real gap in the TypeScript ecosystem. Developers want Temporal’s guarantees without the operational complexity of running a distributed system. Trigger.dev fills that gap by trading event sourcing for checkpoint-based persistence, making durable execution accessible to teams that cannot justify a dedicated platform team.