Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and pulled 745 points on Hacker News. Eight months later, the team shipped V2 with a completely different pitch: a Temporal alternative for TypeScript developers. That pivot exposes a real infrastructure gap. Event-driven webhooks work fine for simple automation, but multi-step agent tasks need durable execution primitives that survive crashes, retries, and long delays without forcing you into Go or Java.
Why the V1 Model Broke Down
V1 followed the Zapier pattern: trigger on event, run a handler, call some APIs, done. This works when your workflow is a linear chain of HTTP calls that complete in seconds. It falls apart when you need:
- State persistence across retries. If your LLM call fails halfway through a 10-step research loop, you want to resume from step 5, not restart from scratch.
- Arbitrary delays. A 24-hour wait for human approval or a rate-limit cooldown must not tie up a worker thread for the whole duration.
- Fan-out and join. Spawning 50 parallel tool calls and waiting for all results requires coordination beyond a single function invocation.
- Idempotency guarantees. Retrying a failed step shouldn’t double-charge a credit card or send duplicate emails.
V1’s webhook model had no answer for these. You could bolt on Redis for state and a job queue for retries, but then you’re building Temporal yourself.
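To make the idempotency requirement concrete, here is a minimal sketch of a keyed guard. It is not Trigger.dev's API: `IdempotencyGuard`, `runOnce`, and `chargeCard` are hypothetical names, and the in-memory `Map` stands in for the Redis or database table you would need in a real deployment.

```typescript
// Hypothetical sketch: an in-memory idempotency guard.
// A real system would persist keys in Redis or a database table.
class IdempotencyGuard {
  private results = new Map<string, Promise<unknown>>();

  // Runs `effect` at most once per key; later calls reuse the first result.
  runOnce<T>(key: string, effect: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) return existing as Promise<T>;
    const pending = effect();
    this.results.set(key, pending);
    // Forget failures so a later retry can attempt the effect again
    pending.catch(() => this.results.delete(key));
    return pending;
  }
}

// Example side effect: counts how many times the "charge" really happens.
let chargesMade = 0;
async function chargeCard(orderId: string): Promise<string> {
  chargesMade += 1;
  return `receipt-for-${orderId}`;
}

async function demoIdempotency(): Promise<{ first: string; second: string; charges: number }> {
  const guard = new IdempotencyGuard();
  const first = await guard.runOnce("charge:order-42", () => chargeCard("order-42"));
  // Simulated retry of the same step with the same key
  const second = await guard.runOnce("charge:order-42", () => chargeCard("order-42"));
  return { first, second, charges: chargesMade };
}
```

The retry receives the first receipt and the card is charged once. Making this survive process restarts is exactly the part that pushes you toward a persistent store, and from there toward a durable execution platform.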
What V2 Actually Provides
Trigger.dev V2 gives you durable execution primitives in TypeScript without running your own Temporal cluster. The core abstraction is a task that can pause, wait, and resume across process boundaries.
Key Primitives
| Primitive | Purpose | Agent Use Case |
|---|---|---|
| task.run() | Durable function that survives crashes | Multi-step research agent that calls search, browse, analyze tools in sequence |
| wait.for() | Pause execution for seconds to days | Rate-limit backoff, human-in-the-loop approval gates |
| batch.trigger() | Fan-out parallel subtasks | Scrape 100 URLs concurrently, wait for all results |
| Idempotency keys | Prevent duplicate execution on retry | Ensure LLM tool calls aren’t invoked twice if network fails |
| Observability hooks | Trace every step, retry, and state transition | Debug why your agent got stuck in a loop |
Execution Model
When you call a task, Trigger.dev serializes the function state to Postgres. If the worker crashes mid-execution, the platform replays from the last checkpoint. This is the same pattern Temporal uses, but implemented in TypeScript and managed as a service.
The replay mechanism works by logging every non-deterministic operation (HTTP calls, random numbers, timestamps) and replaying them from the log. Your code runs multiple times, but external side effects only happen once.
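The log-and-replay idea can be sketched in a few lines. This is an illustration of the general pattern, not Trigger.dev's internal code: `ReplayJournal`, `step`, `fakeHttpCall`, and `workflow` are hypothetical names, and the journal lives in memory rather than in Postgres.

```typescript
// Hypothetical sketch of log-based replay; not Trigger.dev's internal code.
class ReplayJournal {
  private entries: unknown[] = [];
  private cursor = 0;

  // Call before each (re-)execution so reads start from the first entry.
  beginExecution(): void {
    this.cursor = 0;
  }

  // First execution: run `op` and record its result.
  // Replay: return the recorded result without running `op` again.
  async step<T>(op: () => Promise<T>): Promise<T> {
    if (this.cursor < this.entries.length) {
      return this.entries[this.cursor++] as T;
    }
    const result = await op();
    this.entries.push(result);
    this.cursor++;
    return result;
  }
}

let httpCalls = 0;
async function fakeHttpCall(url: string): Promise<string> {
  httpCalls += 1;
  return `body of ${url}`;
}

// A two-step workflow that "crashes" after step one on its first execution.
async function workflow(journal: ReplayJournal, crashAfterFirstStep: boolean): Promise<string> {
  journal.beginExecution();
  const page = await journal.step(() => fakeHttpCall("https://example.com/a"));
  if (crashAfterFirstStep) throw new Error("simulated worker crash");
  const stamp = await journal.step(async () => Date.now());
  return `${page} @ ${stamp}`;
}

async function demoReplay(): Promise<{ result: string; calls: number }> {
  const journal = new ReplayJournal();
  try {
    await workflow(journal, true);
  } catch {
    // A platform would detect the crash and schedule a replay.
  }
  const result = await workflow(journal, false);
  return { result, calls: httpCalls };
}
```

The workflow body runs twice, but `fakeHttpCall` runs once because the second execution reads its result from the journal. The sketch matches entries by position, which is why the surrounding code has to issue its steps in the same order on every execution.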
Architecture: Hosted Workers vs. BYO Infra
Trigger.dev offers two deployment modes:
Hosted (Cloud): You write tasks, push to GitHub, and Trigger.dev builds a container and runs it on its infrastructure. You get automatic scaling, retries, and observability without managing servers. The tradeoffs are cold start latency (similar to AWS Lambda) and vendor lock-in for the execution layer.
Self-hosted: You run the Trigger.dev platform in your own Kubernetes cluster or Docker Compose setup. You control the worker pool, database, and network boundaries. The tradeoff is operational overhead: you’re responsible for Postgres backups, worker autoscaling, and monitoring.
Both modes use the same task API. The difference is who operates the control plane.
Agent Workflow Example
Here’s how a research agent maps to Trigger.dev primitives:
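```typescript
// `task` is exported from the SDK's v3 entry point
import { task } from "@trigger.dev/sdk/v3";
import { generateText, type CoreMessage } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
// Assumed to exist elsewhere: tool definitions without `execute` handlers,
// and a Trigger.dev task that runs one tool call and returns its result.
import { search, browse, analyze } from "./tools";
import { executeToolTask } from "./execute-tool";

export const researchAgent = task({
  id: "research-agent",
  run: async ({ topic }: { topic: string }) => {
    const messages: CoreMessage[] = [
      { role: "user", content: `Research: ${topic}` },
    ];
    let stepsUsed = 0;

    // Cap the loop at 10 LLM turns
    for (let i = 0; i < 10; i++) {
      const { text, toolCalls, steps, response } = await generateText({
        model: anthropic("claude-opus-4-20250514"),
        system: "You are a research assistant with web access.",
        messages,
        tools: { search, browse, analyze },
        maxSteps: 5,
      });
      stepsUsed += steps.length;

      // No tool calls means the model has produced its final answer
      if (!toolCalls.length) {
        return { summary: text, stepsUsed };
      }
      // Keep the assistant turn, including its tool calls, in the history
      messages.push(...response.messages);

      // Fan out tool calls as subtasks and wait for all of them to finish
      const { runs } = await executeToolTask.batchTriggerAndWait(
        toolCalls.map((call) => ({ payload: call }))
      );

      // Return each result to the model, matched to its originating call
      messages.push({
        role: "tool",
        content: runs.map((run, idx) => ({
          type: "tool-result" as const,
          toolCallId: toolCalls[idx].toolCallId,
          toolName: toolCalls[idx].toolName,
          result: run.ok ? run.output : { error: String(run.error) },
        })),
      });
    }

    // Turn budget exhausted without a final answer
    return { summary: null, stepsUsed };
  },
});
```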
If the worker crashes after the first generateText call, Trigger.dev replays the function. Provided the call was recorded through an SDK-tracked operation, the replay returns the cached LLM response instead of calling Anthropic again, which prevents duplicate API charges and non-deterministic behavior.
State Management and Replay Boundaries
The durable execution model requires careful handling of side effects. Trigger.dev tracks:
- Deterministic code: Pure functions, loops, conditionals. These re-execute on replay.
- Non-deterministic operations: API calls, database writes, random number generation. These are logged once and replayed from the log.
You mark non-deterministic operations by wrapping them in Trigger.dev’s SDK functions (fetch, wait.for, batch.trigger). If you call Math.random() directly, you’ll get different values on replay and break idempotency.
Checkpoint Frequency
Trigger.dev checkpoints state after every SDK operation. This means:
- Fine-grained recovery: If your task crashes on step 47 of 100, you resume from step 47.
- Storage overhead: A task with 100 steps generates approximately 100 checkpoint rows per execution. Cumulative storage depends on retry count and task volume.
For agent loops with hundreds of tool calls, this can bloat your database. The mitigation is to batch tool calls into subtasks that checkpoint less frequently.
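The batching mitigation comes down to simple arithmetic. The sketch below assumes one parent-level checkpoint per SDK operation, as described above; `chunkCalls` and `checkpointCount` are hypothetical helpers, and the batch size of 25 is an arbitrary illustration.

```typescript
// Splits a list into consecutive groups of at most `size` items.
function chunkCalls<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Assumption: the parent task checkpoints once per batch it dispatches.
function checkpointCount(totalCalls: number, batchSize: number): number {
  return Math.ceil(totalCalls / batchSize);
}

const toolCallIds = Array.from({ length: 300 }, (_, i) => `call-${i}`);
const groups = chunkCalls(toolCallIds, 25);
console.log(groups.length); // 12 batches, so 12 parent-level checkpoints instead of 300
```

Each group would be handed to one subtask. The subtasks still do their own work, so the saving applies to the parent's checkpoint rows; measure total storage before settling on a batch size.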
Observability and Debugging
The platform provides:
- Execution traces: See every step, retry, and state transition in a timeline view.
- Logs per checkpoint: Stdout/stderr captured at each step, not just at task completion.
- Replay simulation: Re-run a failed task locally with the same inputs and logged responses to debug non-deterministic issues.
This is critical for agent debugging. When your research loop gets stuck, you need to see which tool call failed, what the LLM returned, and whether the retry logic triggered correctly.
Security Boundaries
Trigger.dev runs your tasks in isolated containers, but all tasks in a project share the same Postgres database for state. This means:
- No multi-tenancy isolation: If you’re building a SaaS where each customer triggers tasks, you need to implement your own access control. Trigger.dev won’t prevent Task A from reading Task B’s state.
- Secrets management: Environment variables are encrypted at rest but visible to all tasks in the project. Use a separate secrets manager (AWS Secrets Manager, Vault) for customer-specific credentials.
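One way to add the missing access control is to check ownership inside each task before touching any data. The sketch below is an assumption about how an application might do this, not a Trigger.dev feature: `TenantPayload`, `StoredRecord`, `recordsById`, and `loadForTenant` are hypothetical, and the `Map` stands in for your application database.

```typescript
// Hypothetical payload shape: every task payload carries the caller's tenant.
interface TenantPayload {
  tenantId: string;
  recordId: string;
}

interface StoredRecord {
  ownerTenantId: string;
  data: string;
}

// In-memory stand-in for the application's database.
const recordsById = new Map<string, StoredRecord>([
  ["rec-1", { ownerTenantId: "tenant-a", data: "A's research notes" }],
  ["rec-2", { ownerTenantId: "tenant-b", data: "B's research notes" }],
]);

// Loads a record only if it belongs to the tenant named in the payload.
function loadForTenant(payload: TenantPayload): StoredRecord {
  const record = recordsById.get(payload.recordId);
  // Same error for "missing" and "not yours" so callers cannot probe for IDs
  if (!record || record.ownerTenantId !== payload.tenantId) {
    throw new Error("record not found");
  }
  return record;
}
```

The check is only as trustworthy as the `tenantId` it receives, so that value should be set server-side from the authenticated session when the task is triggered, never taken from client input.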
Failure Modes
| Failure | Behavior | Mitigation |
|---|---|---|
| Worker crash mid-task | Task resumes from last checkpoint | None needed, automatic |
| Postgres outage | All tasks pause until DB recovers | Run Postgres with replication, monitor lag |
| Infinite retry loop | Task retries forever, exhausts quota | Set maxAttempts to 5-10, monitor CloudWatch/Datadog for retry rate spikes |
| Non-deterministic replay | Task produces different results on retry | Only use SDK functions for side effects, avoid raw Math.random() or Date.now() |
| Cold start latency | First task invocation takes 2-5 seconds | Keep workers warm with scheduled pings or use self-hosted mode |
The most dangerous failure is the infinite retry loop. A call that can never succeed (bad API key, permanently exhausted rate limit) is retried until the task reaches maxAttempts, and with no cap or a very high one every attempt burns quota. Set maxAttempts to a reasonable number (5-10) and monitor retry rates.
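The effect of a retry cap is easy to see in isolation. The helper below is a generic sketch of bounded retries with exponential backoff, not Trigger.dev's retry implementation: `withBoundedRetry`, `RetryOptions`, and the injectable `sleep` are hypothetical names, and only `maxAttempts` corresponds to a setting named in this article.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // delay before the second attempt
  maxDelayMs: number; // upper bound on any single delay
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `op` with exponential backoff, then rethrows the last error.
async function withBoundedRetry<T>(
  op: (attempt: number) => Promise<T>,
  opts: RetryOptions
): Promise<T> {
  if (opts.maxAttempts < 1) throw new Error("maxAttempts must be at least 1");
  const sleep = opts.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await op(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === opts.maxAttempts) break; // budget spent, stop retrying
      // Double the delay after each failure, up to the configured ceiling
      const delay = Math.min(opts.baseDelayMs * 2 ** (attempt - 1), opts.maxDelayMs);
      await sleep(delay);
    }
  }
  throw lastError;
}
```

With `maxAttempts: 5` and a 1-second base delay, a call that always fails is attempted five times with waits of 1, 2, 4, and 8 seconds, then surfaces the error instead of retrying indefinitely.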
When to Use Trigger.dev vs. Temporal
| Factor | Trigger.dev | Temporal |
|---|---|---|
| Language | TypeScript only | Go, Java, Python, TypeScript |
| Deployment | Hosted or self-hosted | Self-hosted (complex setup) or managed Temporal Cloud |
| Learning curve | Low (familiar async/await) | High (workflow/activity split) |
| Ecosystem | Smaller, newer | Mature, large community |
| Agent use case | Good for LLM loops, tool orchestration | Better for multi-service workflows |
| Cost model | Pay-per-task (hosted) or ops cost (self-hosted) | Self-hosted infrastructure and ops overhead, or usage-based pricing on Temporal Cloud |
Trigger.dev makes sense if you’re already in the TypeScript ecosystem (Next.js, Remix, Node.js) and want durable execution without operating a Temporal cluster. Temporal makes sense if you need multi-language support, have complex workflow requirements, or already run Kubernetes at scale.
Technical Verdict
Use Trigger.dev when:
- You’re building agent workflows in TypeScript and need retries, delays, and state persistence without rolling your own queue infrastructure.
- You want hosted execution with automatic scaling and don’t want to manage Temporal’s operational complexity.
- Your tasks are primarily LLM tool loops, API orchestration, or scheduled jobs that benefit from durable execution.
- Your agent loops involve 5+ sequential tool calls per iteration where cold start latency (2-5 seconds) is acceptable.
- You can tolerate vendor lock-in for the execution layer in exchange for zero-ops deployment.
Avoid Trigger.dev when:
- You need polyglot workflows where Python data processing agents call Rust ML inference services or Java backend systems.
- You require strict multi-tenancy isolation at the infrastructure level (each customer gets separate state stores and network boundaries).
- Your workflows involve complex distributed transactions across multiple services with compensation logic (use Temporal or Cadence).
- You’re already running Temporal in production and have the operational expertise to maintain it.
- You need sub-500ms latency between tool invocations (cold starts in hosted mode work against this, and even self-hosted mode adds checkpoint overhead).
- Your agent tasks generate more than 1,000 checkpoints per execution (checkpoint storage grows and Postgres query performance may degrade; test at scale).
The V1 to V2 pivot reveals a real pattern: as agents move from demos to production, the infrastructure gap between “run a function” and “orchestrate a multi-step workflow that survives failures” becomes critical. Trigger.dev fills that gap for TypeScript developers who don’t want to become Temporal experts.
Source Links
- Trigger.dev V2 Announcement (Hacker News, Oct 2023)
- Trigger.dev V1 Launch (Hacker News, Feb 2023)
- Trigger.dev GitHub Repository
- Official Documentation