mech.app

Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

How Trigger.dev pivoted to durable execution primitives, examining state persistence, retry semantics, and observability for long-running agent workflows.

Source: trigger.dev
Trigger.dev launched in February 2023 as a developer-first Zapier alternative and pulled 745 points on Hacker News. Eight months later, the team shipped V2 with a hard pivot: forget event hooks, build durable execution primitives for TypeScript. The repositioning as a “Temporal alternative” signals what developers building agent workflows actually need: resumable tasks, automatic retries, and state persistence without managing checkpoints manually.

This shift exposes the gap between event-driven automation and long-running orchestration. Agent workflows that call LLMs, wait for human approval, or scrape data across hours need execution guarantees that webhooks cannot provide. Trigger.dev V2 addresses this with TypeScript-native primitives that handle state, retries, and observability without requiring developers to learn Temporal’s workflow and activity model or operate a Temporal cluster.

Why Durable Execution Matters for Agent Workflows

Serverless functions have hard execution limits; AWS Lambda, for example, caps a single invocation at 15 minutes. Agent workflows often run longer: a research task might call an LLM, browse multiple pages, wait for rate limits, and synthesize results over 30 minutes or more. If a network error occurs at minute 28, you lose all progress unless you manually checkpoint state.

Durable execution frameworks solve this by persisting execution state at each step. If a task fails, the runtime resumes from the last successful checkpoint instead of restarting from scratch. This requires:

  • State persistence: Serializing task context to durable storage after each step
  • Retry semantics: Configurable backoff, max attempts, and idempotency guarantees
  • Resumability: Reconstructing execution context from stored state
  • Observability: Inspecting task history, pending steps, and failure reasons
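The four requirements above can be sketched in a few lines. This is a minimal illustration of the checkpoint-and-resume idea, not Trigger.dev's or Temporal's implementation: `CheckpointStore`, `Step`, and `runWithCheckpoints` are hypothetical names, the store is a plain in-memory object standing in for durable storage, and steps are assumed to return JSON-serializable values.

```typescript
// Hypothetical in-memory stand-in for durable storage (e.g. a Postgres table).
type CheckpointStore = Record<string, string>;

interface Step {
  name: string;
  // Receives the previous step's output and returns this step's output.
  run: (input: unknown) => unknown;
}

// Runs steps in order and persists each result. When called again with the
// same store, completed steps are read back instead of being executed again.
function runWithCheckpoints(
  store: CheckpointStore,
  steps: Step[],
  input: unknown
): unknown {
  let current = input;
  steps.forEach((step, index) => {
    const key = index + ":" + step.name;
    const saved = store[key];
    if (saved !== undefined) {
      // Resumability: rebuild the context from stored state.
      current = JSON.parse(saved);
      return;
    }
    // May throw; checkpoints written by earlier steps survive the failure.
    current = step.run(current);
    // State persistence: serialize the result after the step succeeds.
    store[key] = JSON.stringify(current);
  });
  return current;
}
```

Calling `runWithCheckpoints` a second time with the same store after a failure skips every step that already has a stored result, which is the behavior a durable runtime provides without the caller managing the store.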

Temporal provides these primitives through SDKs for Go, Java, TypeScript, and other languages, but it requires a separate worker fleet, a Temporal service to operate or subscribe to, and a solid understanding of workflow vs. activity boundaries. Trigger.dev V2 collapses this complexity into TypeScript task definitions that run on managed infrastructure.

Orchestration Primitives in Trigger.dev V2

The core abstraction is the task: a TypeScript function configured with retry logic, concurrency controls, and automatic state persistence. Here’s what the plumbing looks like (note that the task API is imported from the SDK’s v3 entry point):

```typescript
import { task } from "@trigger.dev/sdk/v3";

// fetchDocument, extractText, analyzeLLM, and storeResults are application
// helpers defined elsewhere; they are not part of the SDK.

export const processDocument = task({
  id: "process-document",
  retry: {
    maxAttempts: 3,
    factor: 2, // exponential backoff multiplier
    minTimeoutInMs: 1000, // lower bound on the delay between attempts
    maxTimeoutInMs: 10000, // upper bound on the delay between attempts
  },
  run: async (payload: { documentId: string }) => {
    // Step 1: Fetch document
    const doc = await fetchDocument(payload.documentId);

    // Step 2: Extract text
    const text = await extractText(doc);

    // Step 3: Analyze with LLM
    const analysis = await analyzeLLM(text);

    // Step 4: Store results
    await storeResults(payload.documentId, analysis);

    return { status: "complete", analysis };
  },
});
```

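If analyzeLLM fails due to rate limiting, the runtime catches the error and retries the run according to the retry policy; you don’t write retry loops or manage state serialization. One caveat: in the v3 SDK imported above, a retry re-invokes the run function from the top, and checkpoints are taken at explicit wait points such as wait.for() and triggerAndWait() rather than at every await. To resume from the extractText result instead of repeating it, move each expensive step into its own task and call it with triggerAndWait(), or make the step idempotent.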

State Persistence Mechanics

Trigger.dev persists task state to Postgres (self-hosted) or their managed cloud storage. The runtime serializes function arguments, return values, and intermediate results using structured cloning. This means:

  • Primitive types, objects, and arrays serialize automatically
  • Functions, class instances, and circular references fail serialization
  • Large payloads (>1MB) trigger warnings but don’t block execution
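Because serialization problems surface late, it can help to validate a value before returning it from a task. The sketch below is an assumption-laden illustration: `findUnserializable` is a hypothetical helper, and it uses a JSON round trip as the yardstick, which may be stricter or looser than the serializer the platform actually uses.

```typescript
// Returns the paths of values that would not survive a JSON round trip.
// JSON is a stand-in here; the platform's real serializer may accept more types.
function findUnserializable(
  value: unknown,
  path = "$",
  ancestors: object[] = []
): string[] {
  if (value === null) return [];
  const kind = typeof value;
  if (
    kind === "function" ||
    kind === "symbol" ||
    kind === "undefined" ||
    kind === "bigint"
  ) {
    return [path];
  }
  if (kind !== "object") return []; // string, number, boolean

  const obj = value as object;
  // An object that is its own ancestor forms a cycle.
  if (ancestors.indexOf(obj) !== -1) return [path + " (circular)"];

  // Anything other than a plain object or array loses its prototype.
  const proto = Object.getPrototypeOf(obj);
  const isPlain =
    Array.isArray(obj) || proto === Object.prototype || proto === null;
  if (!isPlain) return [path + " (class instance)"];

  const nextAncestors = ancestors.concat([obj]);
  let problems: string[] = [];
  Object.keys(obj).forEach((key) => {
    const child = (obj as Record<string, unknown>)[key];
    problems = problems.concat(
      findUnserializable(child, path + "." + key, nextAncestors)
    );
  });
  return problems;
}
```

An empty result means the value is made of primitives, plain objects, and arrays only; any returned path points at the field to convert (for example, a class instance to a plain object) before it reaches the persistence layer.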

The persistence layer tracks:

  • Task ID and run ID for deduplication
  • Execution timeline with step-level granularity
  • Retry attempts and backoff state
  • Pending, running, and completed steps

Retry and Idempotency

Retry configuration lives at the task level, not buried in infrastructure config:

  • maxAttempts: Hard limit on retry count
  • factor: Exponential backoff multiplier
  • minTimeoutInMs/maxTimeoutInMs: Backoff bounds in milliseconds (the names the v3 SDK uses for these options)
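To see how these three settings interact, here is a small sketch of an exponential backoff calculation. The formula (lower bound multiplied by the factor raised to the number of prior failures, capped at the upper bound) is the conventional one and is an assumption about the platform's behavior; `BackoffPolicy`, `minDelayMs`, `maxDelayMs`, and `backoffDelayMs` are illustrative names, and jitter is omitted.

```typescript
// Illustrative policy shape; field names are not the SDK's.
interface BackoffPolicy {
  maxAttempts: number; // total attempts allowed, including the first
  factor: number; // exponential multiplier
  minDelayMs: number; // lower backoff bound
  maxDelayMs: number; // upper backoff bound
}

// Given the number of the attempt that just failed (1-based), returns the
// delay before the next attempt, or null when no attempts remain.
function backoffDelayMs(
  policy: BackoffPolicy,
  failedAttempt: number
): number | null {
  if (failedAttempt >= policy.maxAttempts) return null;
  const uncapped =
    policy.minDelayMs * Math.pow(policy.factor, failedAttempt - 1);
  return Math.min(policy.maxDelayMs, uncapped);
}
```

With three attempts, a factor of 2, and bounds of 1000 and 10000 ms, the first failure waits 1000 ms, the second waits 2000 ms, and the third failure ends the run. The upper bound only comes into play with more attempts, once the doubled delay would exceed it.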

Idempotency requires explicit handling. Trigger.dev accepts an idempotencyKey when a run is triggered and deduplicates runs that share the same key. For external API calls made inside a task, you still need idempotent operations (e.g., using PUT instead of POST, or checking for an existing record before creating one).
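The deduplication behavior can be pictured with a small in-memory model. This is only a sketch of the concept: `RunRegistry` and `RunRecord` are hypothetical, and a real platform would keep the key-to-run mapping in durable storage, likely with an expiry.

```typescript
interface RunRecord {
  runId: string;
  payload: unknown;
}

// Hypothetical registry that maps idempotency keys to the run they created.
class RunRegistry {
  private byKey: Record<string, RunRecord> = {};
  private nextId = 1;

  // Returns the existing run when the key was seen before; otherwise
  // creates a new run and remembers it under the key (if one was given).
  trigger(
    payload: unknown,
    idempotencyKey?: string
  ): { run: RunRecord; deduplicated: boolean } {
    if (idempotencyKey !== undefined) {
      const existing = this.byKey[idempotencyKey];
      if (existing) return { run: existing, deduplicated: true };
    }
    const run: RunRecord = { runId: "run_" + this.nextId, payload };
    this.nextId += 1;
    if (idempotencyKey !== undefined) this.byKey[idempotencyKey] = run;
    return { run, deduplicated: false };
  }
}
```

Triggering twice with the same key yields the same run ID and no second execution, which makes a retried webhook or a double-clicked button harmless. It does nothing for side effects inside the task, which is why the external calls still need to be idempotent themselves.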

Concurrency and Queue Controls

Agent workflows often need rate limiting or sequential execution. Trigger.dev exposes concurrency primitives at the task and queue level:

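```typescript
import { task } from "@trigger.dev/sdk/v3";

// parseHTML is an application helper defined elsewhere.

export const rateLimitedScraper = task({
  id: "scrape-page",
  queue: {
    name: "scraping",
    concurrencyLimit: 5, // Max 5 concurrent runs on this queue
  },
  run: async ({ url }: { url: string }) => {
    const response = await fetch(url);
    const html = await response.text();
    return parseHTML(html);
  },
});
```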

Queues provide:

  • Concurrency limits: Cap parallel execution across all instances
  • Priority: Integer-based task ordering within a queue
  • FIFO guarantees: Sequential processing when concurrency is 1

This matters for agent workflows that hit rate-limited APIs or need ordered execution (e.g., processing a conversation thread chronologically).
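The scheduling rules in the list above can be modeled as a small dispatcher. This is a sketch under stated assumptions, not the platform's scheduler: `ConcurrencyQueue` and `QueuedRun` are hypothetical names, and the choice that a higher integer means higher priority is an assumption made for the example.

```typescript
interface QueuedRun {
  id: string;
  priority: number;
  seq: number; // arrival order, used to break priority ties
}

class ConcurrencyQueue {
  private limit: number;
  private waiting: QueuedRun[] = [];
  private running: string[] = [];
  private seq = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  enqueue(id: string, priority = 0): void {
    this.waiting.push({ id, priority, seq: this.seq });
    this.seq += 1;
  }

  // Starts as many waiting runs as the limit allows and returns their ids.
  dispatch(): string[] {
    // Highest priority first; ties fall back to arrival order (FIFO).
    this.waiting.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
    const started: string[] = [];
    while (this.running.length < this.limit && this.waiting.length > 0) {
      const next = this.waiting.shift() as QueuedRun;
      this.running.push(next.id);
      started.push(next.id);
    }
    return started;
  }

  // Frees a slot so the next dispatch can start another run.
  complete(id: string): void {
    this.running = this.running.filter((runningId) => runningId !== id);
  }
}
```

With a limit of 1 and equal priorities, runs start strictly in arrival order, which is the sequential case needed for a chronological conversation thread. Raising the limit trades that ordering for throughput.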

Observability and Debugging

The V2 dashboard provides a timeline view similar to Temporal’s workflow history:

  • Run list: All task executions with status, duration, and retry count
  • Step-by-step trace: Each await appears as a discrete step with timing
  • Logs and errors: Captured stdout/stderr and exception stack traces
  • Replay capability: Re-run failed tasks from the last checkpoint

The observability model assumes you’re debugging production failures, not local development. There’s no local timeline viewer; you deploy to a dev environment and inspect runs in the cloud dashboard.

Deployment and Infrastructure Shape

Trigger.dev offers three deployment models:

| Model | State Storage | Worker Execution | Use Case |
| --- | --- | --- | --- |
| Managed Cloud | Trigger.dev Postgres | Trigger.dev containers | Fastest setup, least control |
| Self-Hosted | Your Postgres | Your Kubernetes/Docker | Full control, compliance needs |
| Hybrid | Trigger.dev Postgres | Your workers | Observability without hosting DB |

The managed cloud model is the default. You push code with npx trigger.dev@latest deploy, and the platform handles:

  • Building a Docker image from your TypeScript
  • Deploying to their container runtime
  • Auto-scaling workers based on queue depth
  • Persisting state to their managed Postgres

Self-hosting requires running the Trigger.dev server (open source under the Apache 2.0 license) and worker containers. You manage Postgres, Redis (for queue coordination), and container orchestration. This adds operational overhead but keeps task state in your VPC.

Comparing Temporal and Trigger.dev Primitives

| Aspect | Temporal | Trigger.dev V2 |
| --- | --- | --- |
| Language | Go, Java, TypeScript (via SDK) | TypeScript-native |
| State persistence | Workflow history in Cassandra/Postgres | Task state in Postgres |
| Retry semantics | Activity retry policies | Task-level retry config |
| Concurrency | Worker pools, task queues | Queue concurrency limits |
| Observability | Workflow timeline, event history | Task run timeline, step traces |
| Deployment | Self-hosted workers + Temporal server | Managed cloud or self-hosted |
| Learning curve | Workflow vs. activity mental model | Standard async TypeScript |

Temporal’s workflow/activity split prevents non-deterministic code in workflows (e.g., Math.random(), Date.now()). Trigger.dev allows non-deterministic code but warns that replay behavior may differ. This trade-off favors developer ergonomics over strict determinism.
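The determinism concern is easiest to see with a record-and-replay sketch. The helper below belongs to neither Temporal nor Trigger.dev; `ReplayLog` and `recordOnce` are hypothetical names for the general technique of capturing a non-deterministic value on first execution and returning the captured value on replay.

```typescript
// Hypothetical log of recorded values, keyed by a caller-chosen label.
type ReplayLog = Record<string, string>;

// Runs `produce` the first time a key is seen and stores the result.
// On replay, returns the stored value instead of calling `produce` again.
function recordOnce<T>(log: ReplayLog, key: string, produce: () => T): T {
  const recorded = log[key];
  if (recorded !== undefined) return JSON.parse(recorded) as T;
  const value = produce();
  log[key] = JSON.stringify(value);
  return value;
}
```

Wrapping `Math.random()` or `Date.now()` this way means a replayed run takes the same branches as the original. Without it, a replay can draw a different random number or timestamp and diverge from the recorded history, which is the behavior difference the trade-off above refers to.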

Failure Modes and Limitations

Serialization failures: If your task returns a class instance or function, serialization does not fail loudly. The runtime logs a warning but doesn’t block execution, so the persisted state can be incomplete when the run is later recovered.

Large payloads: Tasks with >1MB of intermediate state trigger performance warnings. The platform doesn’t enforce hard limits, but Postgres storage costs scale with payload size.

Cold start latency: Managed cloud workers exhibit 2-5 second cold starts for infrequently used tasks. This matters for latency-sensitive workflows but not for batch processing.

No distributed tracing: Trigger.dev provides task-level traces but doesn’t integrate with OpenTelemetry or Datadog APM. If your agent workflow spans multiple services, you lose end-to-end visibility.

Queue ordering guarantees: FIFO only applies within a single queue. If you split tasks across queues for priority handling, you lose ordering guarantees.

When to Use Trigger.dev V2

Good fit:

  • TypeScript-first teams building agent workflows with LLM calls, API integrations, and human-in-the-loop steps
  • Workflows that need retries and resumability but don’t require strict determinism
  • Teams that want managed infrastructure without learning Temporal’s workflow and activity model
  • Batch processing tasks (document analysis, media transcoding) that run for 10+ minutes

Poor fit:

  • Workflows requiring strict determinism (financial transactions, consensus protocols)
  • Teams already invested in Temporal with Go/Java expertise
  • Latency-sensitive real-time systems (sub-second response requirements)
  • Workflows with complex distributed tracing needs across multiple services

Technical Verdict

Trigger.dev V2 solves the “I need Temporal but don’t want to manage Temporal” problem for TypeScript developers. The task abstraction hides state persistence and retry logic behind familiar async/await syntax, making durable execution accessible without infrastructure overhead.

The trade-off is less control. Temporal’s workflow/activity boundary enforces determinism and provides fine-grained replay guarantees. Trigger.dev prioritizes developer ergonomics, accepting non-deterministic code and simpler observability.

Use Trigger.dev when you’re building agent workflows in TypeScript and need resumability without managing checkpoints. Avoid it when you need strict determinism, sub-second latency, or already have Temporal expertise in-house.