CodeGraph: How Pre-Indexed Knowledge Graphs Cut Claude Code's Tool Calls by 92%

Claude Code’s Explore agents burn tokens by spawning grep, glob, and Read tool calls to understand codebases. CodeGraph replaces that file-scanning loop with a pre-indexed knowledge graph of symbol relationships and call graphs. The result: 92% fewer tool calls and 71% faster execution across six production codebases.

This is not a vector database or external service. It’s a local filesystem graph that agents query instead of reading files.

The Token Budget Problem

When Claude Code needs to understand how a function is used across a codebase, it spawns an Explore sub-agent. That agent:

Greps for the function name
Globs for candidate files
Reads each file to extract context
Repeats for related symbols

Each tool call costs tokens. Each file read adds latency. In the VS Code codebase, a single exploration question triggered 52 tool calls and took 97 seconds.

CodeGraph pre-indexes the codebase into a graph of symbols, references, and call relationships. The agent queries the graph instead of scanning files and gets structured results back; in the same VS Code test, it needed 3 tool calls and 17 seconds.

Architecture: How the Graph Gets Built

CodeGraph uses language-specific parsers to extract:

Symbols: functions, classes, methods, variables
References: where each symbol is imported or called
Call graphs: which functions invoke which other functions
File boundaries: module structure and exports

The graph is stored as JSON on disk, one file per project. No database, no server, no external dependencies. The graph handles cycles by storing edges without traversal limits; agents must manage recursion depth in their queries.
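Because edges are stored without traversal limits, any query that follows calls transitively needs its own guard against cycles. The following is a minimal Python sketch of a depth-bounded walk. The edge layout (a dict mapping each caller to its callees) and the `reachable` helper are assumptions made for illustration; CodeGraph's on-disk format may differ.

```python
from collections import deque

# Hypothetical call edges: caller -> callees. Note the cycle a -> b -> a.
CALLS = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["d"],
    "d": [],
}

def reachable(calls, start, max_depth):
    """Return {symbol: depth} for symbols reachable from `start` within `max_depth` hops.

    The `seen` dict stops the walk from revisiting nodes on a cycle,
    and `max_depth` bounds how far the walk goes.
    """
    seen = {start: 0}  # symbol -> depth at which it was first reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for callee in calls.get(node, []):
            if callee not in seen:
                seen[callee] = depth + 1
                queue.append(callee)
    del seen[start]  # the start symbol is not its own result
    return seen
```

With a limit of 2, the walk from `a` stops at `c` and never loops back through the `a -> b -> a` cycle.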

Supported Languages

TypeScript/JavaScript (tree-sitter)
Python (ast module)
Rust (syn crate)
Java (JavaParser)
Swift (SwiftSyntax)
C++ (libclang)

Each parser extracts the same schema: symbol name, type, location, and edges to other symbols.
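To make the shared schema concrete, here is a small sketch using Python's standard `ast` module, the parser listed above for Python. The output field names and the `extract` function are assumptions for this example, not CodeGraph's actual schema, and the sketch only records direct calls by plain name.

```python
import ast

SOURCE = '''
def parse_config(path):
    return load(path)

def main():
    cfg = parse_config("app.toml")
    return cfg
'''

def extract(source, filename="example.py"):
    """Collect function definitions and direct by-name calls from Python source."""
    tree = ast.parse(source, filename=filename)
    symbols, edges = [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append({"name": node.name, "type": "function",
                            "file": filename, "line": node.lineno})
            for inner in ast.walk(node):
                # Only plain-name calls like f(x); attribute calls (obj.f()) are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.append((node.name, inner.func.id))
    return symbols, edges

symbols, edges = extract(SOURCE)
```

Calls made through attributes, dynamic dispatch, or nested scopes would need extra handling, which is one reason a static graph can miss edges.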

Incremental Updates

CodeGraph watches the filesystem for changes. When a file is modified:

Re-parse only that file
Diff the new symbols against the existing graph
Update edges for affected references
Write the updated graph to disk
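The steps above can be sketched as a single function that replaces one file's contribution to the graph. The graph layout (symbols keyed by name, edges as caller/callee pairs) and the `apply_file_update` name are hypothetical, and keying symbols by bare name assumes names are unique across the project, which a real indexer could not rely on.

```python
def apply_file_update(graph, path, new_symbols, new_edges):
    """Replace one file's contribution to the graph; return (added, removed) names.

    Assumed layout:
      graph["symbols"]: name -> {"file": ..., "line": ...}
      graph["edges"]:   set of (caller, callee) name pairs
    `new_symbols` maps name -> line for the freshly parsed file.
    """
    old_names = {n for n, s in graph["symbols"].items() if s["file"] == path}
    new_names = set(new_symbols)

    # Drop symbols that disappeared from the file, then upsert the current ones.
    for name in old_names - new_names:
        del graph["symbols"][name]
    for name, line in new_symbols.items():
        graph["symbols"][name] = {"file": path, "line": line}

    # Outgoing edges from this file's old symbols are stale; replace them.
    graph["edges"] = {e for e in graph["edges"] if e[0] not in old_names}
    graph["edges"] |= set(new_edges)
    return new_names - old_names, old_names - new_names
```

Edges from other files that point at a removed symbol are left in place in this sketch; a real indexer would have to flag or drop them.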

Full re-indexing is triggered either manually, via codegraph rebuild, or when the project structure shifts (new directories, moved files).

Benchmark Results: Token Savings at Scale

The benchmark ran identical Explore queries against six real codebases. Each test spawned a single agent that asked the same question once with CodeGraph and once without.

Codebase	Tool Calls (CG)	Tool Calls (No CG)	Time (CG)	Time (No CG)	Improvement
VS Code (TypeScript)	3	52	17s	1m 37s	94% fewer calls, 82% faster
Excalidraw (TypeScript)	3	47	29s	1m 45s	94% fewer calls, 72% faster
Claude Code (Python/Rust)	3	40	39s	1m 8s	93% fewer calls, 43% faster
Claude Code (Java)	1	26	19s	1m 22s	96% fewer calls, 77% faster
Alamofire (Swift)	3	32	22s	1m 39s	91% fewer calls, 78% faster
Swift Compiler (Swift/C++)	6	37	35s	2m 8s	84% fewer calls, 73% faster

Average across all tests: 92% fewer tool calls, 71% faster execution.
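The averages can be reproduced from the table. This short Python check recomputes each row's reduction as 1 minus the ratio of the CodeGraph figure to the baseline, then takes the mean over the six rows; times are converted to seconds.

```python
# (calls with CodeGraph, calls without, seconds with, seconds without), from the table
ROWS = {
    "VS Code": (3, 52, 17, 97),
    "Excalidraw": (3, 47, 29, 105),
    "Claude Code (Python/Rust)": (3, 40, 39, 68),
    "Claude Code (Java)": (1, 26, 19, 82),
    "Alamofire": (3, 32, 22, 99),
    "Swift Compiler": (6, 37, 35, 128),
}

def reduction(with_cg, without_cg):
    """Percentage reduction relative to the baseline."""
    return 100 * (1 - with_cg / without_cg)

call_cuts = [reduction(c, nc) for c, nc, _, _ in ROWS.values()]
time_cuts = [reduction(t, nt) for _, _, t, nt in ROWS.values()]

avg_calls = sum(call_cuts) / len(call_cuts)
avg_time = sum(time_cuts) / len(time_cuts)
print(f"average: {avg_calls:.0f}% fewer calls, {avg_time:.0f}% faster")
# prints "average: 92% fewer calls, 71% faster"
```

These are unweighted means of the per-codebase percentages, so a small project counts as much as a large one.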

In principle, the savings should grow with codebase size: a graph query costs roughly the same regardless of project size, while file scanning touches more files as the project grows. Note that the table above reports tool calls and wall-clock time, not token counts directly.

Installation and Integration

CodeGraph installs as a global npm package and auto-configures Claude Code:

npx @colbymchenry/codegraph

The installer:

Detects your Claude Code installation
Injects the CodeGraph tool into the MCP server config
Adds the graph query tool to the available tool list

To initialize a project:

cd your-project
codegraph init -i

This scans the project, builds the initial graph, and writes .codegraph/graph.json to disk. The graph updates automatically as you edit files.

Tool Call Flow: Graph Query vs File Scan

Without CodeGraph

Agent wants to find all callers of parseConfig():

Grep for parseConfig across all files (1 tool call)
Glob for TypeScript files in likely directories (1 tool call)
Read each candidate file to extract context (N tool calls)
Grep again for related symbols discovered while reading those files (M tool calls)

Total: 2 + N + M tool calls, where N and M grow with codebase size.

With CodeGraph

Agent queries the graph for symbol relationships and call sites. The query returns all call sites with file paths and line numbers, symbol definitions, and related functions in the call chain.

Total: 1 tool call per lookup. In the benchmarks above, a complete exploration question took between 1 and 6 calls.

The exact query structure depends on the tool schema; see the GitHub README for the current version.
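As an illustration of what such a lookup could return, here is a minimal Python sketch over a hypothetical in-memory graph. The field names, file paths, and the `find_callers` helper are all assumptions for this example and do not reflect CodeGraph's tool schema.

```python
# Hypothetical graph fragment: call edges annotated with the call site.
GRAPH = {
    "symbols": {
        "parseConfig": {"type": "function", "file": "src/config.ts", "line": 12},
        "startServer": {"type": "function", "file": "src/server.ts", "line": 30},
        "runCli": {"type": "function", "file": "src/cli.ts", "line": 8},
    },
    "calls": [
        {"caller": "startServer", "callee": "parseConfig", "file": "src/server.ts", "line": 34},
        {"caller": "runCli", "callee": "parseConfig", "file": "src/cli.ts", "line": 15},
        {"caller": "runCli", "callee": "startServer", "file": "src/cli.ts", "line": 21},
    ],
}

def find_callers(graph, symbol):
    """Return the definition of `symbol` and every recorded call site in one lookup."""
    return {
        "definition": graph["symbols"].get(symbol),
        "call_sites": [
            {"caller": c["caller"], "file": c["file"], "line": c["line"]}
            for c in graph["calls"] if c["callee"] == symbol
        ],
    }

result = find_callers(GRAPH, "parseConfig")
```

The point of the sketch is the shape of the answer: one structured response carries what the file-scanning flow assembles across several grep and Read calls.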

Failure Modes and Fallback Strategy

Stale Graph

If the graph falls out of sync with the filesystem (for example, because files were edited while the watcher was not running), queries may return outdated results. The agent may reference functions that no longer exist or miss new code.

Fallback: If a graph query returns unexpected or incomplete results, the agent can fall back to file scanning. The cost is the same as running without CodeGraph, but you only pay it when the graph fails.
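One simple way to detect staleness before trusting a result is to compare file modification times against the graph file's. This is a sketch of that idea in Python, not a description of how CodeGraph detects it; the function names are hypothetical.

```python
import os

def stale_paths(graph_mtime, file_mtimes):
    """Return paths whose modification time is newer than the graph's."""
    return sorted(p for p, m in file_mtimes.items() if m > graph_mtime)

def check_project(graph_path, source_paths):
    """Filesystem wrapper: compare each source file's mtime to the graph file's."""
    mtimes = {p: os.path.getmtime(p) for p in source_paths}
    return stale_paths(os.path.getmtime(graph_path), mtimes)
```

A non-empty result suggests scanning those files directly. The check does not catch deleted files, which would need a comparison of the indexed file list against the directory contents.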

Incomplete Graph

Constructs that static parsers cannot fully resolve (e.g., dynamically generated code, macros, reflection) may leave missing edges in the graph, as may languages without full parser support.

Mitigation: The agent can combine graph queries with targeted file reads. Query the graph for known symbols, then use Read to fill gaps for dynamic or generated code.
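A minimal Python sketch of that combined strategy, assuming a hypothetical symbol table and in-memory file contents standing in for Read results; the `locate` and `grep_files` names are illustrative:

```python
import re

def grep_files(symbol, files):
    """Fallback scan: return (path, line_number) for each line mentioning `symbol`."""
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    return [(path, i) for path, text in files.items()
            for i, line in enumerate(text.splitlines(), start=1)
            if pattern.search(line)]

def locate(symbol, graph_symbols, files):
    """Prefer the graph; scan file contents only when the graph has no entry."""
    hit = graph_symbols.get(symbol)
    if hit is not None:
        return "graph", [(hit["file"], hit["line"])]
    return "scan", grep_files(symbol, files)
```

The first element of the result records which path answered the question, so the cost of falling back stays visible.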

Parser Errors

Syntax errors or unsupported language features can cause the parser to skip files. Parse failures are logged, and you can inspect which files were indexed and which were skipped.

Deployment Shape

CodeGraph runs entirely on the developer’s machine. No cloud service, no API keys, no network calls.

Storage: Graph files are stored on disk in .codegraph/graph.json. Size scales with project complexity.

Memory: Indexing uses memory proportional to project size. The graph stays on disk; queries load only the relevant subgraph into memory.

CPU: Initial indexing completes in seconds to minutes depending on project size. Incremental updates are fast enough for interactive use.

Security Boundaries

The graph contains symbol names, file paths, and code structure. It does not store full source code, but it does expose:

Function and variable names
Module organization
Call relationships

If your codebase includes sensitive information in symbol names (e.g., API_KEY_PROD), those names appear in the graph. Add .codegraph/ to .gitignore to keep graph files from being committed.
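Before sharing a graph file, it can be worth scanning the symbol names it contains. The sketch below uses an assumed keyword heuristic in Python; the pattern and the `flag_sensitive` helper are illustrative and will produce both false positives and false negatives.

```python
import re

# Assumed heuristic: names containing these words deserve a manual look.
SENSITIVE = re.compile(r"(secret|token|passw(or)?d|api_?key|credential)", re.IGNORECASE)

def flag_sensitive(names):
    """Return symbol names that match the heuristic pattern, in input order."""
    return [n for n in names if SENSITIVE.search(n)]
```

For example, `flag_sensitive(["API_KEY_PROD", "parseConfig"])` returns only the first name.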

Claude Code reads the graph via the MCP tool interface. The agent cannot write to the graph or modify the filesystem through CodeGraph. The graph changes only through indexing: the filesystem watcher, which responds to actual file changes, and the codegraph init and codegraph rebuild commands.

Technical Verdict

Use CodeGraph if:

Codebase >500 files AND token costs are tracked
Agent exploration tasks run frequently (multiple times per day)
Supported language with stable project structure
Local development environment with filesystem access

Avoid CodeGraph if:

Project <100 files (graph overhead exceeds savings)
Unsupported language or heavy code generation (incomplete graph)
Distributed team without per-developer graph sync strategy
Agent orchestration already limits tool calls via other mechanisms

The core value is token cost reduction. If you’re not paying for API calls or your codebases are small, the benefit is marginal. For teams running agents on large projects, the 92% average reduction in tool calls should translate into lower bills and faster iteration.