x402station.io's Risk Signal Layer: How Agentic Commerce Probes 86,599 Endpoints to Build Transaction Trust

Agentic commerce systems need to know whether an endpoint is safe before signing a payment transaction. x402station.io built an independent risk signal layer that probes 86,599 active endpoints every 10 minutes and publishes trust scores that agents can consume before they commit funds. Its Preflight API is purpose-built for autonomous agent decision-making, not human dashboards.

The Problem: Agents Need Pre-Transaction Risk Signals

The x402 protocol lets agents pay endpoints directly using PAYMENT-SIGNATURE headers. Once an agent signs, the transaction settles on-chain. There is no chargeback mechanism.

Before signing, an agent needs to answer:

Is the endpoint reachable right now?
Does it respond within acceptable latency bounds?
Has it been stable over the last N probes?
Is the price suspiciously high compared to similar endpoints?
Is this a decoy endpoint designed to extract payment without delivering content?

x402station.io built a probe network to answer these questions at scale. The system sends bare HTTP requests (no PAYMENT-SIGNATURE header, no settlement) and classifies endpoints into risk tiers that agents can query before committing funds.

Probe Architecture

The probe worker runs on a 10-minute cadence. For each endpoint in the catalog:

HTTP GET without payment headers. The probe does not attempt to consume paid content. It measures reachability, latency, and HTTP status codes.
Failure classification. Transient failures (timeouts, 5xx) are tracked separately from permanent failures (DNS errors, certificate issues, 404s on the base path).
Aggregate scoring. Raw probe results feed into uptime percentages, latency percentiles (p50, p95, p99), and first/last-seen timestamps.
CDP settlement sync. When available, the system correlates probe health with on-chain settlement data from the Coinbase Developer Platform to detect endpoints that accept payment but fail to deliver.
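
The transient/permanent split described above can be sketched as a small classifier. This is a minimal illustration, not x402station.io's actual implementation: the `ProbeResult` shape, the error labels, and the `classify_failure` name are all hypothetical, and treating every other status (including 402) as reachable is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one unpaid probe (hypothetical shape, for illustration)."""
    status: Optional[int] = None  # HTTP status code, or None if no response arrived
    error: Optional[str] = None   # "timeout", "dns", or "tls" when the request itself failed


def classify_failure(result: ProbeResult) -> str:
    """Return "ok", "transient", or "permanent" for a single probe."""
    if result.error == "timeout":
        return "transient"
    if result.error in ("dns", "tls"):
        return "permanent"
    if result.status is None:
        # No status and no recognised error: treated as transient (assumption).
        return "transient"
    if 500 <= result.status <= 599:
        return "transient"
    if result.status == 404:
        return "permanent"
    # Anything else counts as reachable in this sketch (assumption).
    return "ok"
```

Keeping the two failure kinds apart lets a scoring step tolerate a brief 5xx blip while still reacting quickly to a DNS or certificate problem.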

The public dataset is aggregate-only. One row per endpoint with classification, uptime, latency percentiles, and settlement aggregates. The raw probe-by-probe time-series is gated behind the paid Preflight API.
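
To make the "one row per endpoint" idea concrete, here is a rough sketch of how raw probes could be collapsed into an uptime percentage and latency percentiles. The probe field names (`ok`, `latency_ms`) and the nearest-rank percentile method are assumptions; the source does not say which percentile definition the service uses.

```python
import math
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending, non-empty sequence."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def aggregate(probes: List[Dict]) -> Dict:
    """Collapse raw probes into one summary row.

    Each probe is a dict with "ok" (bool) and "latency_ms" (float or None);
    both field names are assumptions made for this sketch.
    """
    if not probes:
        raise ValueError("at least one probe is required")
    successes = [p for p in probes if p["ok"]]
    latencies = sorted(p["latency_ms"] for p in successes if p["latency_ms"] is not None)
    row = {"uptime_pct": 100.0 * len(successes) / len(probes)}
    for pct in (50, 95, 99):
        # Percentiles are computed over successful probes only.
        row[f"p{pct}_ms"] = percentile(latencies, pct) if latencies else None
    return row
```

For example, nine successful probes with latencies of 100 to 900 ms plus one failure give an uptime of 90% and a p50 of 500 ms under this definition.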

Trust Scoring and Risk Classes

x402station.io classifies endpoints into risk tiers based on probe history and settlement behavior:

Risk Class	Criteria	Agent Behavior
Green	>99% uptime, p95 latency <500ms (per x402station.io standards), no settlement disputes	Proceed
Yellow	95-99% uptime, occasional transients, no fraud signals	Proceed with caution, log for review
Orange	<95% uptime, high latency variance, or new endpoint (<7 days)	Require human approval or skip
Red	Decoy pricing (≥$1,000 USDC), settlement fraud, or permanent outage	Block transaction
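
The table can be read as a decision function evaluated from the most severe tier downward. The sketch below is one possible reading, not the service's scoring code: the `EndpointStats` fields are hypothetical, and cases the table leaves open (for example uptime above 99% with a p95 of 500 ms or more) fall through to Yellow by assumption.

```python
from dataclasses import dataclass


@dataclass
class EndpointStats:
    """Hypothetical per-endpoint inputs for the tier rules in the table."""
    uptime_pct: float
    p95_ms: float
    age_days: float
    price_usdc: float
    high_latency_variance: bool = False
    settlement_disputes: bool = False
    settlement_fraud: bool = False
    permanent_outage: bool = False


def risk_class(s: EndpointStats) -> str:
    """Map stats to "red", "orange", "green", or "yellow"."""
    # Red: decoy pricing, settlement fraud, or permanent outage.
    if s.price_usdc >= 1000 or s.settlement_fraud or s.permanent_outage:
        return "red"
    # Orange: low uptime, high latency variance, or an endpoint under 7 days old.
    if s.uptime_pct < 95 or s.high_latency_variance or s.age_days < 7:
        return "orange"
    # Green: all three criteria from the table must hold.
    if s.uptime_pct > 99 and s.p95_ms < 500 and not s.settlement_disputes:
        return "green"
    # Everything else lands in Yellow (assumption for unspecified cases).
    return "yellow"
```

Checking Red first means a decoy-priced endpoint is blocked even when its uptime and latency look healthy.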

The scoring algorithm weighs recent probes more heavily than historical data. A single 5xx response does not trigger a downgrade, but three consecutive failures within 30 minutes will move an endpoint from Green to Yellow.
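
The consecutive-failure rule can be sketched as follows. The function name and the `(timestamp, ok)` input shape are hypothetical; only the thresholds (three failures, 30 minutes) come from the text.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def should_downgrade(
    probes: List[Tuple[datetime, bool]],
    needed: int = 3,
    window: timedelta = timedelta(minutes=30),
) -> bool:
    """True if the most recent `needed` probes all failed inside `window`.

    `probes` is a list of (timestamp, ok) pairs in ascending time order.
    """
    if len(probes) < needed:
        return False
    recent = probes[-needed:]
    all_failed = all(not ok for _, ok in recent)
    span = recent[-1][0] - recent[0][0]
    return all_failed and span <= window
```

At a 10-minute cadence, three back-to-back probes span 20 minutes, so three consecutive failures always fit inside the 30-minute window, while a single 5xx never triggers the rule.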

Decoy detection combines price-based heuristics and behavioral signals from settlement data. The system flags endpoints priced at ≥$1,000 USDC (73 endpoints with $23.2M aggregate sticker price, median $500,000) and cross-references them with settlement data. If an endpoint accepts payment but returns 403 or empty responses, it gets Red classification.
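
As a rough illustration of combining the price heuristic with the behavioral signal, the sketch below collects reasons for a Red classification. The function name and the `(http_status, body_length)` input shape are assumptions; the $1,000 threshold and the 403/empty-response rule come from the text.

```python
from typing import Iterable, List, Tuple

DECOY_PRICE_USDC = 1_000  # price threshold quoted in the text


def decoy_reasons(
    price_usdc: float,
    paid_responses: Iterable[Tuple[int, int]],
) -> List[str]:
    """Collect decoy signals for one endpoint.

    `paid_responses` holds (http_status, body_length) pairs observed after a
    settled payment; this input shape is an assumption of the sketch.
    """
    reasons = []
    if price_usdc >= DECOY_PRICE_USDC:
        reasons.append("decoy_pricing")
    for status, body_length in paid_responses:
        if status == 403 or body_length == 0:
            reasons.append("paid_but_not_delivered")
            break  # one failed delivery is enough to raise the signal
    return reasons
```

An empty list means neither signal fired; any non-empty list would correspond to the Red tier.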

Concentration Risk

The two largest providers own 88.07% of the active catalog, and the ten largest own 91.85%. That is far more concentrated than is usually considered resilient for financial infrastructure, where payment processor diversity is both a regulatory and an operational expectation.

If one of the dominant providers experiences an outage, agents relying on x402 endpoints will see widespread failures. The probe network surfaces this risk by tracking provider-level health metrics alongside endpoint-level scores.

Agents can query the Preflight API with a provider filter to check if their fallback options are concentrated on the same infrastructure.
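
A concentration figure like the ones quoted above is a top-k share. The sketch below shows the arithmetic on a list of provider labels, one per endpoint; the function name and input shape are hypothetical, and how the Preflight API exposes provider data is not specified in the source.

```python
from collections import Counter
from typing import Iterable


def top_k_share(providers: Iterable[str], k: int) -> float:
    """Percentage of endpoints owned by the k largest providers."""
    counts = Counter(providers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return 100.0 * top / total
```

Running the same function over the providers behind an agent's fallback endpoints gives a quick check: a `top_k_share(fallback_providers, 1)` close to 100 means the fallbacks share one point of failure.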

State Management and Caching

Agents cannot afford to re-probe every endpoint on every transaction. The system uses a tiered caching strategy:

Hot cache (Redis). Green endpoints are cached for 10 minutes. Agents can query without triggering a fresh probe.
Warm cache (PostgreSQL). Yellow and Orange endpoints are cached for 5 minutes. The system runs a fresh probe if the cache is stale.
Cold path (immediate probe). Red endpoints and new endpoints trigger an immediate probe before returning a score.

The system invalidates cache entries for degraded endpoints to ensure agents see updated scores within the probe cadence window. If a Green endpoint fails a probe, the cache is flushed and subsequent queries return the updated classification.
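
The tiered TTL behavior can be sketched with a single in-memory cache. This is only an illustration under simplifying assumptions: a dict stands in for both Redis and PostgreSQL, and the `ScoreCache` class and its `probe` callback are hypothetical. The TTLs (10 minutes for Green, 5 minutes for Yellow and Orange, none for Red or unknown endpoints) follow the text.

```python
import time
from typing import Callable, Dict, Tuple

# Cache lifetimes per risk class, in seconds; Red and unknown are never cached.
TTL_SECONDS = {"green": 600, "yellow": 300, "orange": 300}


class ScoreCache:
    """In-memory stand-in for the hot/warm/cold tiers (illustrative only)."""

    def __init__(self, probe: Callable[[str], str],
                 clock: Callable[[], float] = time.monotonic):
        self._probe = probe    # returns a risk class for an endpoint URL
        self._clock = clock    # injectable so the TTL logic can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, endpoint: str) -> str:
        now = self._clock()
        entry = self._entries.get(endpoint)
        if entry is not None:
            risk, stored_at = entry
            if now - stored_at < TTL_SECONDS.get(risk, 0):
                return risk  # still fresh for its tier
        # Stale, Red, or never seen: take the cold path and probe now.
        risk = self._probe(endpoint)
        self._entries[endpoint] = (risk, now)
        return risk

    def invalidate(self, endpoint: str) -> None:
        """Drop an entry, e.g. after a Green endpoint fails a probe."""
        self._entries.pop(endpoint, None)
```

Giving Red a TTL of zero means a blocked endpoint is always re-probed, so it can recover as soon as it behaves again.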

How Agents Consume Risk Signals

The Preflight API exposes three consumption patterns:

1. Synchronous Query (REST)

import requests

# sign_and_pay, log_blocked_transaction, and log_preflight_error are the
# agent's own helpers and are not shown here.
endpoint = "https://example.com/api/data"

try:
    response = requests.get(
        "https://x402station.io/api/v1/preflight",
        params={"endpoint": endpoint},
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        timeout=5,
    )
    response.raise_for_status()  # raises on 4xx and 5xx responses

    risk = response.json()
    if risk.get("class") in ("green", "yellow"):
        # Agent proceeds with payment signature
        sign_and_pay(endpoint)
    else:
        # Agent logs and skips
        log_blocked_transaction(endpoint, risk.get("reason"))

except (requests.RequestException, ValueError) as e:
    # Handle network failures, timeouts, HTTP errors, and invalid JSON
    log_preflight_error(endpoint, str(e))
    # Fall back to hardcoded list of Green-classified providers or skip transaction

The try/except covers timeouts, 5xx responses, and malformed JSON; Failure Modes describes what to fall back to when the query itself fails. Agents query before signing and block the transaction unless the class is Green or Yellow.

2. Bulk Preflight (Batch)

Agents planning to call multiple endpoints can batch queries:

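import requests

# Placeholder URLs: replace with the endpoints the agent plans to call.
endpoints = [
    "https://example.com/api/a",
    "https://example.com/api/b",
    "https://example.com/api/c",
]
safe_endpoints = []  # stays empty if the batch query fails

try:
    response = requests.post(
        "https://x402station.io/api/v1/preflight/batch",
        json={"endpoints": endpoints},
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        timeout=10,
    )
    response.raise_for_status()

    results = response.json().get("results", [])
    # Keep only the endpoints classified Green
    safe_endpoints = [r["endpoint"] for r in results if r.get("class") == "green"]

except (requests.RequestException, ValueError) as e:
    # log_batch_preflight_error is the agent's own helper (not shown)
    log_batch_preflight_error(str(e))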

Batch queries return results for up to 100 endpoints in a single call.
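
The source does not say how the batch endpoint responds to more than 100 URLs, so a cautious client can split longer lists itself. A minimal sketch, with a hypothetical `chunked` helper:

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 100  # per-call limit stated in the text


def chunked(endpoints: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` endpoints."""
    for start in range(0, len(endpoints), size):
        yield list(endpoints[start:start + size])
```

Each yielded list can then be sent as the `endpoints` field of one batch request, so 250 URLs become three calls of 100, 100, and 50.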

3. Webhook Notifications (Push)

Agents can subscribe to degradation events. x402station.io sends a POST request with Content-Type: application/json:

POST https://your-agent.com/webhooks/x402-risk
Content-Type: application/json

{
  "endpoint": "https://example.com/api/data",
  "previous_class": "green",
  "current_class": "yellow",
  "reason": "3 consecutive probe failures",
  "timestamp": "2026-05-23T08:05:00Z"
}

Webhooks fire within 60 seconds of a classification change. Agents can pre-emptively remove degraded endpoints from their routing tables.
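
On the receiving side, the routing-table update might look like the sketch below. It uses only the payload fields shown in the example event; the function name and the choice to keep Green and Yellow endpoints routable are assumptions. The source does not describe how webhook senders are authenticated, so that step is left out here and would need to be added in a real handler.

```python
import json
from typing import Set

ROUTABLE_CLASSES = {"green", "yellow"}  # classes the agent is willing to pay


def handle_risk_event(raw_body: bytes, routing_table: Set[str]) -> bool:
    """Apply one webhook payload; return True if an endpoint was removed."""
    event = json.loads(raw_body)
    endpoint = event["endpoint"]
    degraded = event["current_class"] not in ROUTABLE_CLASSES
    if degraded and endpoint in routing_table:
        routing_table.discard(endpoint)
        return True
    return False
```

The function is deliberately free of any web framework so it can be called from whichever HTTP server the agent already runs.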

Failure Modes

1. Probe Network Outage

If x402station.io itself becomes unavailable, agents lose their risk signal layer. A recommended practice is to cache the last known Green endpoints and, while the outage lasts, proceed only with those endpoints and log every transaction in more detail.

Some agents implement a circuit breaker: if the Preflight API returns 5xx or times out, fall back to a static allowlist of trusted providers.
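
A circuit breaker of this kind could be sketched as below. The class, its parameters, and the use of hostnames as the allowlist key are all assumptions for illustration; `query` stands for whatever function the agent uses to call the Preflight API and is expected to raise on timeouts and 5xx responses.

```python
import time
from typing import Callable, Set
from urllib.parse import urlsplit


class PreflightBreaker:
    """Falls back to a static host allowlist while the Preflight API is failing."""

    def __init__(self, query: Callable[[str], str], trusted_hosts: Set[str],
                 max_failures: int = 3, cooldown_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._query = query
        self._trusted_hosts = trusted_hosts
        self._max_failures = max_failures
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0
        self._open_until = 0.0

    def _fallback(self, endpoint: str) -> bool:
        return urlsplit(endpoint).hostname in self._trusted_hosts

    def is_allowed(self, endpoint: str) -> bool:
        now = self._clock()
        if now < self._open_until:
            # Breaker is open: skip the API entirely and use the allowlist.
            return self._fallback(endpoint)
        try:
            risk = self._query(endpoint)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._open_until = now + self._cooldown_s
                self._failures = 0
            return self._fallback(endpoint)
        self._failures = 0
        return risk in ("green", "yellow")
```

While the breaker is open the agent stops sending requests to the failing API, which avoids paying a timeout on every transaction during an outage.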

2. False Negatives (Decoy Passes Probes)

A decoy endpoint can return 200 OK to probes but fail when agents send PAYMENT-SIGNATURE. The system mitigates this by correlating probe health with settlement data, but there is a detection lag (up to 10 minutes).

Agents should log all paid transactions and flag endpoints that accept payment but return empty or invalid responses. These logs feed back into the risk scoring model.

3. Rate Limiting and Authentication Boundaries

Some endpoints require authentication even for health checks. The probe network cannot test these endpoints without credentials, so they remain unclassified (a Gray class that sits outside the four tiers listed under Trust Scoring and Risk Classes).

Agents calling Gray endpoints should implement their own health checks or require human approval.

Legal and Ethical Constraints

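Probing 86,599 endpoints every 10 minutes generates significant traffic: roughly 144 requests per second on average, or about 12.5 million probes per day, assuming one request per endpoint per cycle. The system respects robots.txt and honors rate limits when endpoints return 429 responses.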
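
For operators who want to see how such a check might behave, here is a small sketch using Python's standard `urllib.robotparser`. It is not the probe worker's code. The user-agent string is the one the article says the system publishes; the `may_probe` and `retry_after_seconds` helpers, the use of the `Retry-After` header, and the 600-second default are assumptions.

```python
from typing import Mapping
from urllib.robotparser import RobotFileParser

USER_AGENT = "x402station-probe/1.0"  # user-agent string published by the system


def may_probe(robots_txt: str, url: str) -> bool:
    """Check an already-downloaded robots.txt body against one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def retry_after_seconds(headers: Mapping[str, str], default: float = 600.0) -> float:
    """Back-off delay after a 429, read from a numeric Retry-After header.

    HTTP-date values are not handled in this sketch and fall back to `default`.
    """
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        return default
```

An operator who adds a `User-agent: x402station-probe` group with `Disallow: /` would cause `may_probe` to return `False` for every path on that host.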

The probe worker does not attempt to bypass authentication, execute paid transactions, or scrape content. It measures reachability and latency only.

Some endpoint operators may view probing as unwanted traffic. The system publishes its user-agent string (x402station-probe/1.0) and provides an opt-out mechanism for operators who do not want their endpoints monitored.

Technical Verdict

Use x402station.io’s Preflight API if:

Your agents execute financial transactions on x402 endpoints and need pre-transaction risk signals.
You are building a policy engine that routes agents to healthy endpoints and blocks decoys.
You need to audit endpoint concentration risk across providers.

Avoid or supplement if:

Your endpoints require authentication for health checks (Gray class, no probe data).
You need sub-10-minute freshness (the probe cadence is fixed at 10 minutes).
You are building a consumer-facing app where humans review transactions (the API is designed for autonomous agents, not dashboards).

The free trial endpoint (/api/v1/preflight-trial) lets you test the response shape without funding a wallet. Production agents should use the paid tier for fresh data, bulk queries, and SLA guarantees.