mech.app

Claude Code's Terminal Agent Architecture: How Anthropic Routes Commands Between Chat, Execution, and Git Workflows

How Claude Code decides when to execute shell commands, read files, or respond directly. Routing logic, state persistence, and git workflow guardrails.

Source: github.com

Anthropic just open-sourced Claude Code, a terminal-native agent that sits in your shell and decides whether your request needs a file read, a shell command, a git operation, or just a conversational response. It hit #3 on GitHub trending for Python, and the repo exposes production patterns for routing agent decisions across multiple execution contexts without explicit mode switching.

The interesting problem is not that it calls tools. It’s how it decides which tool to call, when to skip tools entirely, and how it maintains state across multi-turn coding sessions where context includes conversation history, file contents, and git status.

Routing Architecture

Claude Code runs a continuous decision loop. Each user input triggers a classification step that routes to one of four execution paths:

  • Direct response: Conversational queries, explanations, or clarifications that don’t require file or shell access.
  • File operations: Reading, writing, or searching files when the request references specific paths or code patterns.
  • Shell execution: Running commands, installing dependencies, or testing code when the request implies action.
  • Git workflows: Commits, branches, PRs, or status checks when the request involves version control.

The agent does not ask “which mode do you want?” It infers intent from the natural language input and the current project state. If you say “explain this function,” it reads the file. If you say “run the tests,” it executes a shell command. If you say “commit these changes,” it invokes git tools.

The routing layer sits between the LLM and the tool execution runtime. It parses the LLM’s tool call requests, validates them against the current working directory and git state, and either executes or rejects them based on safety rules.
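A minimal sketch of that routing step, assuming the path is chosen from the tool call the LLM emits (or from its absence). The tool names and the TOOL_ROUTES table are hypothetical; the article does not list the real tool names.

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    DIRECT = "direct response"
    FILE = "file operation"
    SHELL = "shell execution"
    GIT = "git workflow"


@dataclass
class ToolCall:
    name: str                                  # hypothetical tool name from the LLM
    args: dict = field(default_factory=dict)   # arguments the LLM supplied


# Hypothetical mapping from tool name to execution path.
TOOL_ROUTES = {
    "read_file": Route.FILE,
    "write_file": Route.FILE,
    "search_files": Route.FILE,
    "run_shell": Route.SHELL,
    "git_status": Route.GIT,
    "git_commit": Route.GIT,
}


def route(tool_call):
    """Pick the execution path for one LLM turn.

    None means the LLM answered in plain text, so no tool runs.
    An unknown tool name is rejected instead of being guessed at.
    """
    if tool_call is None:
        return Route.DIRECT
    if tool_call.name not in TOOL_ROUTES:
        raise ValueError(f"unknown tool: {tool_call.name}")
    return TOOL_ROUTES[tool_call.name]
```

In this sketch "explain this function" would surface as a read_file call and land on Route.FILE, while a plain clarification produces no tool call and lands on Route.DIRECT.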

State Persistence

Claude Code maintains three layers of state:

State Layer           | Contents                                            | Storage Location                      | Lifetime
Conversation history  | User inputs, agent responses, tool results          | In-memory session buffer              | Until terminal exit
File context          | Recently read or modified files, search results     | Ephemeral cache tied to conversation  | Cleared on new session
Git metadata          | Current branch, uncommitted changes, remote status  | Queried on-demand via git CLI         | Persistent across sessions

Conversation history lives only in memory. When you close the terminal, the session ends. This avoids the complexity of serializing and rehydrating multi-turn context, but it means you lose continuity if the process crashes or you restart.

File context is built incrementally. When the agent reads a file to answer a question, it keeps that file in the conversation context. If you ask a follow-up question about the same file, the agent already has it. But this context does not persist to disk. Each new claude invocation starts with an empty file cache.

The agent keeps no git state of its own. Every time it needs to know the current branch or uncommitted changes, it shells out to git status or git branch. This avoids stale state but adds latency to every git-related decision.
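The three layers can be sketched as one small session object. This is an illustration under assumptions, not Claude Code's actual data model: the Session class, its attribute names, and the run_git helper are all hypothetical. The git subcommands used (git branch --show-current, git status --porcelain) are standard git CLI.

```python
import subprocess


def run_git(args):
    """Shell out to the git CLI in the current directory and return stdout."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


class Session:
    """Hypothetical session object; every name here is illustrative."""

    def __init__(self, git=run_git):
        self.history = []      # layer 1: conversation turns, memory only
        self.file_cache = {}   # layer 2: path -> contents, memory only
        self._git = git        # layer 3: nothing stored, just a way to ask git

    def read_file(self, path):
        # Follow-up questions about the same file reuse the cached copy.
        if path not in self.file_cache:
            with open(path, encoding="utf-8") as fh:
                self.file_cache[path] = fh.read()
        return self.file_cache[path]

    def current_branch(self):
        # Never cached: every call asks git again, trading latency for freshness.
        return self._git(["branch", "--show-current"]).strip()

    def has_uncommitted_changes(self):
        # --porcelain prints one line per changed path and nothing when clean.
        return bool(self._git(["status", "--porcelain"]).strip())
```

Nothing in this object is written to disk, so a new Session starts with an empty history and an empty file cache, while the git answers survive because they live in the repository itself.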

Git Workflow Guardrails

Git operations are the highest-risk tool category. A bad git reset --hard or git push --force can destroy work. Claude Code implements three layers of guardrails:

  1. Read-only by default: The agent can query git status, log, and diff without confirmation. Write operations (commit, push, branch creation) require explicit user approval.
  2. Destructive operation blocking: Commands like git reset --hard, git clean -fd, or git push --force are flagged and require a second confirmation with a warning message.
  3. Uncommitted change detection: Before executing any git write operation, the agent checks for uncommitted changes and warns the user if they exist.

These guardrails are implemented in the tool execution layer, not in the LLM prompt. The LLM can request any git command, but the runtime validates it before execution. This separation means you can trust the guardrails even if the LLM hallucinates or misunderstands the request.
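A sketch of what runtime-side classification might look like, assuming the three tiers described above. The Risk enum, the rule table, and classify_git are hypothetical names; only the git commands and flags themselves come from the article or from git's documented CLI.

```python
import shlex
from enum import Enum


class Risk(Enum):
    READ_ONLY = "run without confirmation"
    WRITE = "ask for approval"
    DESTRUCTIVE = "warn, then ask for a second confirmation"


READ_ONLY_SUBCOMMANDS = {"status", "log", "diff"}


def _has_flag(args, long_flag, short_letter=None):
    for arg in args:
        if arg == long_flag:
            return True
        # Combined short flags such as -fd carry the letter after the dash.
        if (short_letter and arg.startswith("-")
                and not arg.startswith("--") and short_letter in arg[1:]):
            return True
    return False


# Assumed rule table covering the destructive commands named in the article.
DESTRUCTIVE_RULES = {
    "reset": lambda args: _has_flag(args, "--hard"),
    "clean": lambda args: _has_flag(args, "--force", "f"),
    "push": lambda args: _has_flag(args, "--force", "f"),
}


def classify_git(command):
    """Classify a git command line by the approval it should need."""
    tokens = shlex.split(command)
    if len(tokens) < 2 or tokens[0] != "git":
        raise ValueError("expected a git command such as 'git status'")
    subcommand, args = tokens[1], tokens[2:]
    rule = DESTRUCTIVE_RULES.get(subcommand)
    if rule is not None and rule(args):
        return Risk.DESTRUCTIVE
    if subcommand in READ_ONLY_SUBCOMMANDS:
        return Risk.READ_ONLY
    return Risk.WRITE  # anything unrecognised takes the cautious path
```

Because the check runs on the parsed command and not on the model's stated intent, a hallucinated git push --force still hits the DESTRUCTIVE tier. The sketch is deliberately conservative: it does not parse global options such as git -C, so those fall through to WRITE.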

Tool Invocation Flow

Here’s what happens when you type “commit these changes with a message about the bug fix”:

  1. LLM receives input: The user message plus conversation history and file context.
  2. LLM emits tool call: git_commit(message="Fix bug in parser logic").
  3. Runtime validates: Checks for uncommitted changes (pass), checks for destructive flags (none), checks user approval setting (required).
  4. User confirmation prompt: “Commit 3 files with message ‘Fix bug in parser logic’? (y/n)”
  5. Execution: If approved, shells out to git commit -m "Fix bug in parser logic".
  6. Result injection: Stdout/stderr from git command is appended to conversation history.
  7. LLM follow-up: Confirms success or explains any errors.

The conversation history now includes the tool call, the user’s approval, and the git output. If you ask “did that work?” the agent already knows the result.
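The seven steps compress into a short function once the LLM call is factored out. This is a sketch under assumptions: handle_commit, the history entry shapes, and the injected confirm and run callables are hypothetical, chosen so the flow can be exercised without a real terminal or repository.

```python
def handle_commit(message, changed_files, confirm, run, history):
    """Validate, confirm, execute, and record a commit the LLM asked for.

    confirm(prompt) -> bool asks the user; run(argv) -> str executes the
    command and returns its output. Both are injected by the caller.
    """
    # Steps 2-3: record the tool call, then validate it.
    history.append({"role": "tool_call", "name": "git_commit", "message": message})
    if not changed_files:
        history.append({"role": "tool_result", "output": "nothing to commit"})
        return False

    # Step 4: write operations need explicit approval.
    prompt = f"Commit {len(changed_files)} files with message '{message}'? (y/n)"
    approved = confirm(prompt)
    history.append({"role": "approval", "approved": approved})
    if not approved:
        return False

    # Steps 5-6: execute, then inject the output into the conversation.
    output = run(["git", "commit", "-m", message])
    history.append({"role": "tool_result", "output": output})
    return True
```

Passing the message as a separate argv element avoids shell quoting problems. After a successful run, history holds the tool call, the approval, and the git output in order, which is what lets a follow-up like "did that work?" be answered without re-running anything.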

Plugin System

Claude Code ships with a plugin architecture that lets you extend the tool set. Plugins are Python modules that register new commands or agents. The repo includes several examples:

  • Custom commands: Add new slash commands like /deploy or /test that map to shell scripts or API calls.
  • Custom agents: Register new tool categories (e.g., database queries, cloud API calls) that the LLM can invoke.

Plugins hook into the same routing and validation layer as built-in tools. You define the tool schema, the execution function, and any guardrails. The LLM learns about new tools through an updated system prompt that includes the plugin’s tool definitions.

This is useful for teams that want to encode domain-specific workflows (e.g., “deploy to staging” or “run security scan”) without forking the core agent.
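To make the shape concrete, here is a minimal registry sketch. It is not Claude Code's plugin API: ToolRegistry, register, tool_definitions, and invoke are hypothetical names, and the deploy tool is a stand-in for a team-specific workflow.

```python
class ToolRegistry:
    """Hypothetical registry: schema, execution function, and guardrail per tool."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description, parameters, requires_approval=True):
        def decorator(func):
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = {
                "description": description,
                "parameters": parameters,
                "requires_approval": requires_approval,
                "func": func,
            }
            return func
        return decorator

    def tool_definitions(self):
        """Definitions that would be added to the system prompt."""
        return [
            {"name": name, "description": t["description"], "parameters": t["parameters"]}
            for name, t in self._tools.items()
        ]

    def invoke(self, tool_name, args, approve):
        """Run a tool through the same approval gate as built-in tools."""
        tool = self._tools[tool_name]
        if tool["requires_approval"] and not approve(tool_name, args):
            raise PermissionError(f"user rejected {tool_name}")
        return tool["func"](**args)


registry = ToolRegistry()


@registry.register("deploy", "Deploy the app to an environment", {"env": "string"})
def deploy(env):
    # Placeholder body; a real plugin would call a script or an API here.
    return f"deployed to {env}"
```

The guardrail lives in invoke, not in the plugin body, so a plugin author cannot accidentally skip the approval step.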

Data Collection and Eval Infrastructure

Claude Code collects “code acceptance or rejections” and “associated conversation data” when you use it. This is more than usage telemetry: it is the raw signal for evaluating agent performance.

The repo does not expose the eval pipeline, but the data collection signals what Anthropic is measuring:

  • Acceptance rate: How often users approve tool calls versus rejecting them.
  • Conversation length: How many turns it takes to complete a task.
  • Tool call accuracy: Whether the agent invokes the right tool for a given request.

Data like this can feed back into model training and prompt engineering, although the repo does not show how. If users frequently reject git commits, that signals the agent is misunderstanding intent. If conversations run long, that signals the agent is not inferring context efficiently.
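The first two metrics fall straight out of a session log. The sketch below assumes a hypothetical event format (dicts with a "type" key, plus "accepted" on tool calls); the real collection schema is not public.

```python
def summarize(sessions):
    """Compute acceptance rate and mean turns per session.

    sessions is a list of sessions; each session is a list of event dicts
    in an assumed format: {"type": "user_turn"} or
    {"type": "tool_call", "accepted": bool}.
    """
    proposed = 0
    accepted = 0
    turns_per_session = []
    for events in sessions:
        turns_per_session.append(
            sum(1 for e in events if e["type"] == "user_turn")
        )
        for e in events:
            if e["type"] == "tool_call":
                proposed += 1
                accepted += 1 if e["accepted"] else 0
    return {
        # None when there is nothing to divide by.
        "acceptance_rate": accepted / proposed if proposed else None,
        "mean_turns": (
            sum(turns_per_session) / len(turns_per_session)
            if turns_per_session else None
        ),
    }
```

Tool call accuracy is the odd one out: it cannot be computed from logs alone, because it needs a label for which tool would have been correct.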

Deployment Shape

Claude Code offers four installation methods:

  • Curl script: Downloads a binary and installs it to /usr/local/bin.
  • Homebrew cask: Installs via the macOS package manager.
  • WinGet: Installs via the Windows package manager.
  • NPM (deprecated): The original distribution method, now discouraged.

The shift from NPM to native installers signals a move toward treating Claude Code as a system-level tool rather than a Node.js package. This avoids Node version conflicts and makes it easier to bundle with IDEs or CI/CD pipelines.

The binary is a standalone executable that shells out to git, node, and other system tools. It does not embed a runtime. This keeps the binary small but requires the user to have git and other dependencies installed.

Likely Failure Modes

Stale file context: If you edit a file outside the terminal session, the agent’s cached version is out of date. It will answer questions based on the old content until you explicitly ask it to re-read the file.
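One way an agent could narrow this gap is to compare file metadata before trusting a cached copy. The FileCache class below is a hypothetical mitigation sketch, not something the article attributes to Claude Code.

```python
import os


class FileCache:
    """Cache file contents, re-reading when mtime or size has changed."""

    def __init__(self):
        self._entries = {}  # path -> ((mtime_ns, size), contents)

    def read(self, path):
        stat = os.stat(path)
        fingerprint = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is None or cached[0] != fingerprint:
            # First read, or the file changed on disk since we cached it.
            with open(path, encoding="utf-8") as fh:
                cached = (fingerprint, fh.read())
            self._entries[path] = cached
        return cached[1]
```

The check costs one os.stat call per read. It can still miss an edit that keeps the same size and lands within the filesystem's timestamp resolution, so it reduces staleness without eliminating it.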

Git state race conditions: If you run git commands outside Claude Code while a session is active, the agent’s understanding of branch state or uncommitted changes can drift. It queries git on-demand, but there’s a window between the query and the tool execution where state can change.

Tool call hallucination: The LLM can request tools that don’t exist or pass invalid arguments. The runtime validates tool schemas, but it can’t catch semantic errors (e.g., committing to the wrong branch).
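The limit of schema validation is easy to see in a sketch. TOOL_SCHEMAS and validate_call are hypothetical; the point is which class of error they can and cannot catch.

```python
# Hypothetical schemas: argument name -> expected Python type.
TOOL_SCHEMAS = {
    "git_commit": {"message": str},
    "read_file": {"path": str},
}


def validate_call(name, args):
    """Return a list of structural problems; an empty list means valid."""
    if name not in TOOL_SCHEMAS:
        return [f"unknown tool: {name}"]
    schema = TOOL_SCHEMAS[name]
    problems = []
    for key, expected in schema.items():
        if key not in args:
            problems.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    for key in args:
        if key not in schema:
            problems.append(f"unexpected argument: {key}")
    return problems
```

A call to a nonexistent git_stash tool or a commit with a numeric message is rejected. A well-formed git_commit issued while the wrong branch is checked out passes cleanly, because nothing in the schema knows which branch you meant.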

Session loss on crash: Because conversation history lives in memory, a terminal crash or accidental exit loses all context. You can’t resume a multi-turn task without starting over.

Technical Verdict

Use Claude Code when you want a terminal agent that handles routine git workflows, file operations, and shell commands without switching between tools. The routing logic is solid, the guardrails prevent most destructive mistakes, and the plugin system lets you encode team-specific workflows.

Avoid it when you need persistent session state across terminal restarts, or when you’re working in a repository with frequent external changes (e.g., a team pushing to the same branch). The in-memory state model loses context on restart, and the gap between an on-demand git query and the command that follows it leaves room for race conditions that are hard to debug.

The architecture is a good reference for anyone building terminal agents. The separation between LLM reasoning, tool validation, and execution runtime is clean. The guardrails are implemented in code, not prompts. The plugin system is extensible without being over-engineered.