When Agents Control Robots: A Zero Trust Policy Model for Agentic Cyber-Physical Systems

How Cobot-Claw enforces safety constraints when LFM agents control industrial robots through natural language, and why prompt injection becomes physical...

Source: arxiv.org
Multi-agent systems powered by large foundation models (LFMs) are moving from chatbots to industrial robot controllers. When natural language becomes the control plane for physical actuators, prompt injection is no longer only a data-leak risk; it becomes a route to physical sabotage. A new paper introduces ZTPM (Zero Trust Policy Model), a security layer designed specifically for agentic cyber-physical systems, where security failures have physical consequences.

The work analyzes Cobot-Claw, a deployed four-agent system controlling a UR3e robotic arm through natural language commands. The paper identifies five attack classes unique to agentic robots and proposes 25 typed policy primitives across five enforcement domains to prevent malicious or compromised agents from causing physical harm.

The Attack Surface

Traditional agent security focuses on data exfiltration, credential theft, or service disruption. When agents control robots, the threat model expands and enforcement gets harder:

  • Prompt injection with physical consequences: A compromised agent could be tricked into commanding unsafe movements, collision trajectories, or excessive force application.
  • Multi-agent coordination attacks: One poisoned agent in a four-agent system can influence the entire control chain if trust boundaries are not enforced.
  • Non-deterministic actuation: The same natural language command produces different motor parameters depending on the LFM backend, making traditional input validation insufficient.
  • Latency-sensitive enforcement: Policy checks that sit inside the real-time control loop must fit its timing budget, which is between 1 ms and 8 ms per cycle at the 125 Hz to 1 kHz rates typical of industrial robot controllers.
  • Novel action sequences: Agents generate movement combinations not seen during training, so policies cannot rely on allowlists alone.

The paper identifies five specific attack classes for agentic cyber-physical systems, though the abstract does not spell out the full taxonomy, so the list above is a summary of the challenges rather than the paper's own classification. The core insight is that security boundaries must exist at the physical actuation layer, not just at the agent orchestration layer.

ZTPM Architecture

The Zero Trust Policy Model sits between LFM agents and robot actuators. It does not trust agent output, even from authenticated agents with valid credentials. Every command is validated against a policy engine before reaching motor controllers.
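
The interception pattern described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `Command`, `PolicyGate`, the two checks, and the numeric limits are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Command:
    agent_id: str
    linear_velocity: float               # m/s
    target: Tuple[float, float, float]   # (x, y, z) in metres


# A check returns None when the command passes, or a reason string when it fails.
Check = Callable[[Command], Optional[str]]


def velocity_check(cmd: Command) -> Optional[str]:
    # Hypothetical cap of 0.25 m/s.
    return None if cmd.linear_velocity <= 0.25 else "velocity above 0.25 m/s"


def workspace_check(cmd: Command) -> Optional[str]:
    # Hypothetical workspace box.
    x, y, z = cmd.target
    inside = -0.5 <= x <= 0.5 and -0.5 <= y <= 0.5 and 0.0 <= z <= 0.8
    return None if inside else "target outside workspace"


class PolicyGate:
    """Sits between agents and actuators; nothing is forwarded unless every check passes."""

    def __init__(self, checks: List[Check], send_to_robot: Callable[[Command], None]):
        self.checks = checks
        self.send_to_robot = send_to_robot

    def submit(self, cmd: Command) -> List[str]:
        violations = [r for r in (check(cmd) for check in self.checks) if r is not None]
        if not violations:
            self.send_to_robot(cmd)  # only validated commands reach the controller
        return violations
```

The gate never inspects who sent the command before deciding whether to validate it, which is the zero-trust point: authentication does not exempt an agent from policy checks.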

Five Enforcement Domains

ZTPM defines 25 typed primitives across five domains:

  1. Spatial constraints: Workspace boundaries, collision zones, keep-out volumes.
  2. Kinematic limits: Velocity caps, acceleration bounds, jerk limits.
  3. Force and torque: Maximum applied force, contact detection thresholds.
  4. Temporal policies: Cooldown periods, rate limiting, sequence timeouts.
  5. Multi-agent coordination: Mutual exclusion zones, handoff protocols, priority arbitration.

Each primitive is typed, meaning the policy engine understands the semantic difference between a position constraint and a force limit. This allows the system to compose policies that interact correctly (e.g., a spatial constraint that tightens when force limits are approached).
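
One way to picture typed, composable primitives is with small immutable classes. The sketch below is an assumption about how such composition could work, not the paper's design; `ForceLimit`, `WorkspaceBoundary`, `effective_boundary`, the 5 cm margin, and the 80% trigger are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Range = Tuple[float, float]


@dataclass(frozen=True)
class ForceLimit:
    max_force_n: float  # newtons


@dataclass(frozen=True)
class WorkspaceBoundary:
    x_range: Range  # metres
    y_range: Range
    z_range: Range

    def contains(self, point: Tuple[float, float, float]) -> bool:
        ranges = (self.x_range, self.y_range, self.z_range)
        return all(lo <= p <= hi for p, (lo, hi) in zip(point, ranges))

    def shrunk(self, margin: float) -> "WorkspaceBoundary":
        def shrink(r: Range) -> Range:
            lo, hi = r
            return (lo + margin, hi - margin)
        return WorkspaceBoundary(shrink(self.x_range), shrink(self.y_range), shrink(self.z_range))


def effective_boundary(boundary: WorkspaceBoundary, force_limit: ForceLimit,
                       measured_force_n: float, margin_m: float = 0.05,
                       trigger: float = 0.8) -> WorkspaceBoundary:
    """Tighten the workspace once measured force passes `trigger` of the force limit."""
    if measured_force_n >= trigger * force_limit.max_force_n:
        return boundary.shrunk(margin_m)
    return boundary
```

Because the two primitives have distinct types, the composition function cannot accidentally treat a force value as a distance; a type checker would flag the mix-up before deployment.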

Physical Impact Tiers

ZTPM introduces Physical Impact Tiers as a runtime policy dimension. Commands are classified by potential physical consequence:

Tier   | Example Actions                        | Policy Enforcement
Tier 0 | Status queries, sensor reads           | Minimal validation
Tier 1 | Low-speed movements in safe zones      | Standard kinematic checks
Tier 2 | High-speed movements, near obstacles   | Strict spatial + velocity limits
Tier 3 | Force application, tool engagement     | Multi-domain validation + logging
Tier 4 | Emergency stop, safety system override | Requires human confirmation

The tier system allows the policy engine to apply proportional scrutiny. A compromised agent commanding a Tier 4 action triggers additional validation steps, while Tier 0 queries pass through quickly.
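
A rough sketch of tier-based dispatch follows. The classification rules, field names such as `applies_force`, and the 0.1 m/s speed threshold are assumptions made for illustration; the paper's actual classifier is not described in the material summarized here.

```python
from enum import IntEnum


class Tier(IntEnum):
    STATUS = 0
    SAFE_MOTION = 1
    FAST_OR_NEAR_OBSTACLE = 2
    FORCE_OR_TOOL = 3
    SAFETY_OVERRIDE = 4


# Which check groups run for each tier (hypothetical grouping).
CHECKS_BY_TIER = {
    Tier.STATUS: [],
    Tier.SAFE_MOTION: ["kinematic"],
    Tier.FAST_OR_NEAR_OBSTACLE: ["kinematic", "spatial"],
    Tier.FORCE_OR_TOOL: ["kinematic", "spatial", "force", "audit_log"],
    Tier.SAFETY_OVERRIDE: ["kinematic", "spatial", "force", "audit_log", "human_confirmation"],
}


def classify(command: dict) -> Tier:
    """Pick the highest tier whose condition the command meets."""
    if command.get("overrides_safety"):
        return Tier.SAFETY_OVERRIDE
    if command.get("applies_force") or command.get("engages_tool"):
        return Tier.FORCE_OR_TOOL
    if command.get("speed", 0.0) > 0.1 or command.get("near_obstacle"):
        return Tier.FAST_OR_NEAR_OBSTACLE
    if command.get("speed", 0.0) > 0.0:
        return Tier.SAFE_MOTION
    return Tier.STATUS
```

Checking the most severe conditions first means a command that is both slow and force-applying lands in Tier 3, not Tier 1.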

Cobot-Claw: Four-Agent Control System

The paper evaluates ZTPM on Cobot-Claw, a deployed system controlling a UR3e industrial robot arm. The architecture uses four specialized agents:

  • Planning Agent: Translates high-level natural language goals into task sequences.
  • Trajectory Agent: Generates motion paths and waypoint sequences.
  • Execution Agent: Converts trajectories into low-level motor commands.
  • Safety Monitor Agent: Observes system state and can veto commands.

Each agent is powered by an LFM (the paper tests two backends, though specific models are not named in the abstract). The ZTPM policy engine intercepts commands between the Execution Agent and the robot’s motor controllers.

Non-Deterministic Actuation Problem

The empirical evaluation ran 60 execution traces across two LFM backends. Key finding: the same natural language command produces different actuation parameters depending on the model. For example, “move slowly to the left” might generate a 10 cm/s velocity on one backend and 15 cm/s on another.

This non-determinism breaks traditional input validation. A fixed range check is not enough, because the acceptable range depends on context the LFM inferred from natural language: 15 cm/s can sit inside the hard safety envelope and still contradict an instruction to move slowly. The policy engine must validate the physical parameters (actual velocity) against the semantic intent (slow movement), not the parameters in isolation.
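
The difference between a plain range check and an intent-aware check can be shown with the article's own 10 cm/s versus 15 cm/s example. The intent labels and velocity bands below are hypothetical; how ZTPM actually represents intent is not specified in the material summarized here.

```python
# Hypothetical mapping from a declared intent to an acceptable velocity band (m/s).
INTENT_VELOCITY_BANDS = {
    "slow": (0.0, 0.12),
    "normal": (0.0, 0.25),
}

HARD_LIMIT = 0.25  # m/s, hypothetical absolute cap that applies to every intent


def validate_intent(intent: str, velocity: float, hard_limit: float = HARD_LIMIT):
    """Return (allowed, reason). The hard limit is checked first, then the intent band."""
    if velocity > hard_limit:
        return (False, "exceeds hard limit")
    lo, hi = INTENT_VELOCITY_BANDS[intent]
    if not lo <= velocity <= hi:
        return (False, f"velocity {velocity} m/s does not match intent '{intent}'")
    return (True, "ok")
```

With these bands, `validate_intent("slow", 0.10)` passes while `validate_intent("slow", 0.15)` is rejected, even though 0.15 m/s is below the hard limit. A range-only validator would have accepted both.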

Implementation Considerations

Latency Overhead

Real-time robot control loops run at 125 Hz to 1 kHz depending on the application, which leaves between 8 ms and 1 ms per cycle. A policy check that takes 50 ms would span roughly 6 to 50 control cycles, which is unacceptable inside the loop. The paper does not provide specific latency measurements, but several optimizations are plausible for an architecture like this:

  • Pre-compiled policies: Rules are compiled into decision trees or lookup tables, not interpreted at runtime.
  • Tiered validation: Tier 0 and Tier 1 commands use fast paths with minimal checks.
  • Async logging: Audit trails are written asynchronously to avoid blocking the control loop.
  • Hardware acceleration: Spatial constraint checks (collision detection, workspace boundaries) can be offloaded to GPUs or FPGAs.
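
Of the optimizations listed above, asynchronous logging is the simplest to illustrate. The sketch below uses only the Python standard library; `AsyncAuditLog` and its drop-on-overflow behaviour are assumptions for the example, and a production system would need to decide whether dropping audit entries is acceptable at all.

```python
import queue
import threading


class AsyncAuditLog:
    """Hands audit entries to a background thread so the control path never blocks on I/O."""

    def __init__(self, sink, maxsize: int = 1024):
        self._queue = queue.Queue(maxsize=maxsize)
        self._sink = sink          # callable that performs the slow write
        self.dropped = 0           # entries discarded because the queue was full
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, entry) -> None:
        # Called from the control path: never waits, counts overflow instead.
        try:
            self._queue.put_nowait(entry)
        except queue.Full:
            self.dropped += 1

    def _drain(self) -> None:
        while True:
            entry = self._queue.get()
            if entry is None:      # sentinel: shut down
                break
            self._sink(entry)

    def close(self) -> None:
        # Flushes everything queued before the sentinel, then stops the worker.
        self._queue.put(None)
        self._worker.join()
```

The `dropped` counter makes overflow visible instead of silent, so an operator can tell when the audit trail is incomplete.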

Policy Language

The paper mentions 25 typed primitives but does not show the policy language syntax. A plausible implementation might look like:

policy:
  name: "safe-pick-and-place"
  domains:
    spatial:
      - type: workspace_boundary
        x_range: [-0.5, 0.5]
        y_range: [-0.5, 0.5]
        z_range: [0.0, 0.8]
      - type: collision_zone
        shape: cylinder
        center: [0.2, 0.2, 0.4]
        radius: 0.1
        height: 0.6
    kinematic:
      - type: velocity_limit
        max_linear: 0.25  # m/s
        max_angular: 1.57  # rad/s
      - type: acceleration_limit
        max_linear: 0.5  # m/s²
    force:
      - type: contact_threshold
        max_force: 50  # Newtons
        action: emergency_stop
    temporal:
      - type: rate_limit
        max_commands_per_second: 10
        window: 1.0  # seconds
  impact_tier_overrides:
    tier_3:
      kinematic.velocity_limit.max_linear: 0.1
      force.contact_threshold.max_force: 20

The policy engine evaluates each command against the active policy, applying tier-specific overrides based on the command’s classification.
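
To make the override step concrete, the sketch below applies dotted override keys of the form `domain.primitive_type.field` to a policy held as a plain dictionary. The dictionary mirrors a subset of the hypothetical policy above; `effective_policy` is an illustrative helper, not an API from the paper.

```python
import copy

# A subset of the illustrative policy, as it might look after parsing.
POLICY = {
    "domains": {
        "kinematic": [{"type": "velocity_limit", "max_linear": 0.25, "max_angular": 1.57}],
        "force": [{"type": "contact_threshold", "max_force": 50, "action": "emergency_stop"}],
    },
    "impact_tier_overrides": {
        "tier_3": {
            "kinematic.velocity_limit.max_linear": 0.1,
            "force.contact_threshold.max_force": 20,
        }
    },
}


def effective_policy(policy: dict, tier: int) -> dict:
    """Return a copy of the policy domains with the overrides for `tier` applied."""
    domains = copy.deepcopy(policy["domains"])  # never mutate the base policy
    overrides = policy.get("impact_tier_overrides", {}).get(f"tier_{tier}", {})
    for dotted_key, value in overrides.items():
        domain, primitive_type, field = dotted_key.split(".")
        matches = [p for p in domains.get(domain, []) if p["type"] == primitive_type]
        if not matches:
            # Fail loudly: an override that targets nothing is probably a typo.
            raise KeyError(f"override targets unknown primitive: {dotted_key}")
        for primitive in matches:
            primitive[field] = value
    return domains
```

Raising on an unmatched override is a deliberate choice here: silently ignoring a misspelled key would leave a Tier 3 command running under the looser default limits.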

Multi-Agent Coordination

When one agent is compromised, the system must prevent it from influencing trusted agents. The coordination domain suggests ZTPM would enforce this through mechanisms such as:

  • Agent identity verification: Each command includes a cryptographic signature tied to the agent’s identity.
  • Mutual exclusion zones: If Agent A is controlling a workspace region, Agent B’s commands for that region are rejected.
  • Priority arbitration: The Safety Monitor Agent has veto authority over all other agents.
  • Audit trails: All commands and policy decisions are logged with agent identity for forensic analysis.

The paper does not detail how agents are authenticated or how keys are managed, but the zero-trust model implies each agent must prove its identity on every command.
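
As a stand-in for per-command identity proof, the sketch below tags each command with an HMAC from the Python standard library. This is an assumption for illustration only: HMAC is a symmetric message authentication code, not an asymmetric signature, and the hard-coded `AGENT_KEYS` table is a hypothetical placeholder for real key management.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent secrets; a real deployment would load these from a key store.
AGENT_KEYS = {
    "execution": b"demo-key-execution",
    "trajectory": b"demo-key-trajectory",
}


def sign(agent_id: str, payload: dict) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the command."""
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(AGENT_KEYS[agent_id], message, hashlib.sha256).hexdigest()


def verify(agent_id: str, payload: dict, tag: str) -> bool:
    """Check that `tag` was produced by `agent_id` for exactly this payload."""
    if agent_id not in AGENT_KEYS:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(agent_id, payload), tag)
```

Including a sequence number or timestamp in the payload, as the usage below does with `seq`, means a captured tag cannot be replayed against a later command.

```python
cmd = {"move": [0.1, 0.0, 0.3], "seq": 1}
tag = sign("execution", cmd)
print(verify("execution", cmd, tag))                 # True
print(verify("execution", {**cmd, "seq": 2}, tag))   # False: payload changed
```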

Failure Modes

Policy Conflicts

What happens when an agent’s natural language intent conflicts with hard-coded safety constraints? The paper does not provide a resolution strategy, but three approaches are common:

  1. Reject and log: Block the command, notify the agent, record the violation.
  2. Degrade gracefully: Execute a safe approximation of the intent (e.g., reduce velocity to stay within limits).
  3. Escalate to human: Pause execution and request operator approval.

The choice depends on the Physical Impact Tier. Tier 1 conflicts might degrade gracefully, while Tier 3 conflicts escalate.
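
A tier-dependent resolution rule might look like the following sketch. The mapping for Tier 1 and Tier 3 follows the suggestion above; treating Tier 2 as reject-and-log is an additional assumption, and `resolve_conflict` is a hypothetical helper.

```python
from typing import Optional, Tuple


def resolve_conflict(tier: int, requested_velocity: float,
                     max_velocity: float) -> Tuple[str, Optional[float]]:
    """Return (action, velocity_to_execute) for a command checked against a velocity cap."""
    if requested_velocity <= max_velocity:
        return ("execute", requested_velocity)   # no conflict
    if tier <= 1:
        return ("degrade", max_velocity)         # clamp to the limit and proceed
    if tier == 2:
        return ("reject", None)                  # block and log (assumed mapping)
    return ("escalate", None)                    # Tier 3 and above: ask a human
```

Clamping is only offered at the low tiers because a clamped command is still a command the agent did not ask for; at higher tiers the safer default is to not move at all.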

Policy Drift

Policies are static, but agent behavior changes as models are updated. A policy tuned for one model version might be needlessly restrictive for its successor, or too permissive if the new model generates more aggressive trajectories. Continuous validation is required:

  • Shadow mode testing: Run new models in parallel with old models, compare actuation parameters, flag divergences.
  • Policy versioning: Tag policies with model versions, require explicit approval before deploying new model-policy pairs.
  • Anomaly detection: Monitor actuation parameter distributions over time, alert when they shift significantly.
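
Shadow-mode comparison and anomaly detection both reduce to comparing two samples of actuation parameters. The sketch below uses a deliberately simple statistic, the shift in mean measured in baseline standard deviations; the function names and the threshold of 3.0 are assumptions, and a real deployment would likely want a proper two-sample test.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(baseline: Sequence[float], candidate: Sequence[float]) -> float:
    """Shift of the candidate mean, in units of the baseline standard deviation."""
    spread = stdev(baseline)  # needs at least two baseline samples
    if spread == 0:
        raise ValueError("baseline has no variation; cannot scale the shift")
    return abs(mean(candidate) - mean(baseline)) / spread


def has_drifted(baseline: Sequence[float], candidate: Sequence[float],
                threshold: float = 3.0) -> bool:
    return drift_score(baseline, candidate) > threshold
```

Fed with velocities recorded for the same command set on the old and new model, a large score is a signal to hold the new model-policy pair back for review.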

Latency Spikes

If the policy engine experiences a latency spike (e.g., due to a complex collision check), the robot must decide whether to:

  • Pause: Stop all motion until the check completes.
  • Continue: Execute the command and retroactively validate it.
  • Abort: Trigger an emergency stop.

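The paper does not specify a strategy, but industrial safety standards typically require pause or abort. Continue also contradicts the zero-trust premise, since it lets an unvalidated command reach the actuators.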
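
A fail-safe deadline can be sketched with the standard library's `concurrent.futures`. Everything here is illustrative: `gated_decision`, the deadline values, and the simulated slow check are hypothetical, and a general-purpose Python thread pool is not a substitute for a real-time scheduler.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=1)


def slow_collision_check(command: dict) -> bool:
    # Simulates a check that overruns its budget.
    time.sleep(0.2)
    return True


def gated_decision(check, command: dict, deadline_s: float) -> str:
    """Run `check` with a deadline; a missed deadline pauses instead of executing."""
    future = _executor.submit(check, command)
    try:
        allowed = future.result(timeout=deadline_s)
    except FutureTimeout:
        # Fail safe: the command is never forwarded without a verdict.
        # Note the overdue check keeps running in the background in this sketch.
        return "pause"
    return "execute" if allowed else "reject"
```

The important property is the default: when the verdict is late, the outcome is "pause", never "execute".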

Comparison to Traditional Robot Safety

Approach         | Enforcement Point       | Handles Novel Actions | Handles LFM Non-Determinism | Multi-Agent Aware
Hardware E-stops | Physical layer          | No                    | No                          | No
PLC safety logic | Controller firmware     | No                    | No                          | No
ROS safety nodes | Middleware              | Partially             | No                          | Partially
ZTPM             | Agent-actuator boundary | Yes                   | Yes                         | Yes

Traditional robot safety relies on hardware interlocks and pre-programmed logic. These systems cannot handle novel action sequences generated by LFMs or adapt to non-deterministic actuation parameters. ZTPM complements (does not replace) hardware safety by adding a policy layer that understands agent intent.

Technical Verdict

Use ZTPM when:

  • You are deploying LFM-powered agents to control physical systems (robots, drones, industrial equipment).
  • Your agents generate novel action sequences not seen during training.
  • You need to enforce safety constraints that depend on semantic intent, not just parameter ranges.
  • You have multiple agents coordinating physical actions and need to prevent one compromised agent from affecting others.
  • Your threat model includes prompt injection, agent poisoning, or malicious natural language commands.

Avoid ZTPM when:

  • Your robot control is fully deterministic with pre-programmed trajectories (traditional PLC logic is sufficient).
  • Latency requirements are so tight (< 1 ms) that any policy check is unacceptable (use hardware interlocks only).
  • Your agents do not have physical actuation authority (e.g., they only generate reports or recommendations).
  • You lack the engineering resources to define, test, and maintain typed policy primitives across five enforcement domains.

The core contribution is recognizing that agent orchestration security and physical actuation security are distinct problems requiring distinct solutions. As LFMs move from generating text to controlling motors, the policy layer between intent and action becomes critical infrastructure.
