MOSS: Self-Rewriting Agents That Evolve Their Own Source Code

Most deployed agents are static. They fail the same way twice, log the error, and wait for a human to ship a fix. Self-evolving agents exist, but they confine mutation to text artifacts: skill files, prompt templates, memory schemas, workflow graphs. The agent harness itself (routing logic, hook ordering, state invariants, dispatch) stays frozen in code.

MOSS crosses that boundary. It rewrites the agent’s own Python source at runtime, then replays production failures against the modified code in ephemeral workers before promoting the change via container swap.

Why Source-Level Rewriting Matters

Text-mutable artifacts are a subset of what an agent can do. If your routing logic lives in a YAML workflow, you can mutate it. If your routing logic lives in a Python function that inspects request headers and decides which tool to call, you cannot touch it without rewriting code.

Source-level adaptation is Turing-complete. It can express any change a text artifact can express, plus structural changes: reordering middleware, adding exception handlers, changing state machine transitions, inserting observability hooks.

It also takes effect deterministically. A prompt change depends on the base model’s compliance. A code change either compiles or it does not.

Architecture: Pipeline, Replay, Swap

MOSS operates in stages:

Failure curation: Automatically batch production failures that share a root cause.
Code modification: Delegate the rewrite to a pluggable external coding agent (CLI tool, LLM-backed editor, symbolic patcher).
Verification: Replay the failure batch against the modified code in ephemeral trial workers.
Promotion: If replay passes, swap the running container in place. If health probes fail, roll back.

The system does not decide what to change. It orchestrates the pipeline and enforces stage boundaries. The coding agent is a black box that receives failure evidence and returns a patch.

State Isolation

Each trial worker runs in an ephemeral container with a snapshot of production state at the time of the original failure. The replay is deterministic: same inputs, same environment, same initial state.

If the modified code passes replay, MOSS promotes it by replacing the running container. The old container stays warm for rollback. Health probes run continuously. If response latency spikes or error rate crosses a threshold, the system reverts to the previous image.

Version Control and Audit

Every rewrite is a Git commit. The failure batch, the patch, the replay logs, and the promotion decision are all stored in a structured audit log. You can replay any historical failure against any historical version of the agent.

This is not optional. Without deterministic replay, you cannot verify that a change fixed the failure. Without version control, you cannot roll back or understand what changed.

Failure Modes and Boundaries

Table 1: MOSS Risk Mitigation Matrix

Risk	Mitigation	Residual Exposure
Infinite rewrite loop	Rate limit rewrites per time window; require human approval after N consecutive failures	Agent stuck in degraded mode if loop detection is too aggressive
Cascading failure from bad code	Replay in ephemeral workers; health-probe-gated rollback	Window between promotion and rollback detection (seconds to minutes)
State corruption from rewrite	Snapshot state before replay; isolate trial workers from production DB	Replay may not catch state bugs that only appear under load
Malicious patch injection	Require signed commits; audit all external coding-agent outputs	Compromised coding agent can still generate valid but harmful patches
Drift from human-maintained codebase	Periodic reconciliation with main branch; flag conflicts for manual review	Divergence accumulates if reconciliation is infrequent

The biggest risk is not technical failure. It is loss of human understanding. If the agent rewrites itself 50 times, the deployed code no longer matches what any engineer has read. You need tooling to diff the live agent against the last human-reviewed version and flag semantic changes.

Code Shape: Replay Harness

# Pseudocode; actual implementation depends on container runtime (Docker, Kubernetes, etc.)

class ReplayHarness:
    def __init__(self, failure_batch, candidate_image):
        self.batch = failure_batch
        self.image = candidate_image
        self.results = []

    def run_trial(self):
        for failure in self.batch:
            worker = EphemeralWorker(
                image=self.image,
                state_snapshot=failure.state_snapshot,
                env=failure.env
            )
            result = worker.replay(
                request=failure.request,
                timeout=failure.timeout * 1.5
            )
            self.results.append({
                "failure_id": failure.id,
                "passed": result.status == "success",
                "latency": result.latency,
                "logs": result.logs
            })
            worker.destroy()
        return all(r["passed"] for r in self.results)

    def promote_if_passed(self):
        if self.run_trial():
            swap_container(
                old_image=current_agent_image(),
                new_image=self.image,
                health_check=lambda: check_error_rate() < 0.01
            )
            return "promoted"
        else:
            return "rejected"

The harness does not care what the coding agent changed. It only cares whether the modified code passes replay. If it does, the change is safe to promote. If it does not, the change is discarded.

When to Use This

You should consider source-level rewriting if:

Your agent has structural failure modes that cannot be fixed by changing prompts or skill files (routing bugs, state machine deadlocks, missing exception handlers).
You can afford the operational overhead of ephemeral workers, replay infrastructure, and continuous health monitoring.
You have a deterministic way to reproduce failures (request logs, state snapshots, environment variables).
You trust your coding agent to generate safe patches, or you have strong enough verification to catch unsafe ones.

You should avoid this if:

Your agent is stateless or its failures are always prompt-fixable.
You cannot snapshot state or replay requests deterministically.
You do not have the engineering capacity to maintain version control, audit logs, and rollback automation.
Your deployment environment does not support in-place container swaps or health-probe-gated rollback.

Technical Verdict

MOSS is infrastructure for agents that need to fix their own plumbing. It is not a replacement for human-driven refactoring. It is a way to automate the loop from production failure to verified fix to deployment, with the agent itself as the patch author.

The key insight is that source code is just another mutable artifact, and if you can replay failures deterministically, you can verify code changes the same way you verify prompt changes. The hard part is not the rewriting. The hard part is the replay harness, the rollback automation, and the audit trail that lets you understand what the agent did.

This approach is viable if you can afford ephemeral worker overhead and deterministic replay infrastructure. If your agents are stateless or your failures are always fixable in the text layer, stick with skill-file evolution.