Chat context windows expire. Agentic coding sessions span days. MetaBrain is a local document store designed to bridge that gap with a CLI that agents can discover and use without structured prompts or API keys.
The project wraps LevelDB in Swift for cross-platform deployment (Mac, Linux, Windows) and uses a filesystem-like path model with searchable content, tags, metadata, and retained versions. The CLI is the primary interface. A Mac native GUI is in App Store review.
Why Agents Need Persistent Memory
Agentic coding workflows hit a wall when context resets between sessions. The agent forgets decisions, task state, and workspace conventions. Developers compensate by pasting summaries into every new chat or maintaining scratch files that agents cannot reliably discover.
MetaBrain solves this by providing a workspace-local store at .metabrain/store.leveldb. Agents can write durable facts, search by content or metadata, and retrieve context without relying on in-context knowledge or external vector databases.
Key design choices:
- CLI-first: Commands like `mb put`, `mb search`, and `mb patch` are simple enough for agents to learn from tool descriptions.
- Workspace-local: No global config. The store lives in the project directory.
- Lexical search: Full-text search with tag and metadata filters, no embedding model required.
- Versioned documents: Updates keep snapshots. Unified diffs can be applied with `mb patch`.
Architecture
MetaBrain has three components:
- `mb` CLI: The primary interface for reading, writing, searching, and patching documents.
- `mbd` daemon: Optional local server that can manage multiple stores concurrently.
- `MetaBrainCore` library: Shared Swift library for embedding in other tools.
Storage Layer
LevelDB provides the key-value store. Documents are indexed for lexical search. Each document carries:
- Path: Filesystem-like hierarchy (e.g., `/tasks/release-checklist`).
- Body: Text content, optionally compressed.
- Tags: Freeform labels for filtering.
- Metadata: Key-value pairs (e.g., `status=active`).
- References: Links to other documents or external resources.
- Versions: Retained snapshots of previous states.
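To make the field list concrete, here is a minimal sketch of a document record in Python. It is not MetaBrain's actual schema: the `Document` and `Version` classes, the `saved_at` timestamp, and the `update` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """A retained snapshot of an earlier document state (hypothetical shape)."""
    body: str
    saved_at: float  # Unix timestamp; an assumption, not taken from MetaBrain


@dataclass
class Document:
    """Illustrative record carrying the six fields listed above."""
    path: str
    body: str
    tags: set[str] = field(default_factory=set)
    metadata: dict[str, str] = field(default_factory=dict)
    references: list[str] = field(default_factory=list)
    versions: list[Version] = field(default_factory=list)

    def update(self, new_body: str, now: float) -> None:
        """Replace the body, keeping the previous body as a snapshot."""
        self.versions.append(Version(body=self.body, saved_at=now))
        self.body = new_body


doc = Document(
    path="/tasks/release-checklist",
    body="Prepare first public release.",
    tags={"release"},
    metadata={"status": "active"},
)
doc.update("Release prep complete. Awaiting final review.", now=1700000000.0)
```

After the update, `doc.versions` holds one snapshot containing the original body, which mirrors the "updates keep snapshots" behavior described earlier.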
Compression Strategy
MetaBrain borrows a trick from OpenZFS:
- Try ZSTD quick compression.
- If savings exceed 10%, apply ZSTD level 9.
- Otherwise, store uncompressed.
This avoids compression overhead for small or incompressible documents while maximizing storage efficiency for large text bodies.
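The probe-then-commit decision can be sketched in a few lines. This example swaps in `zlib` from the Python standard library as a stand-in for ZSTD so that it runs without extra packages; the level numbers, the `encode_body` and `decode_body` names, and the codec labels are assumptions for illustration, not MetaBrain's implementation.

```python
import os
import zlib

QUICK_LEVEL = 1      # fast probe pass (stand-in for ZSTD quick compression)
THOROUGH_LEVEL = 9   # slower, denser pass (stand-in for ZSTD level 9)
MIN_SAVINGS = 0.10   # the 10% threshold described above


def encode_body(raw: bytes) -> tuple[str, bytes]:
    """Return (codec, payload) using the probe-then-commit strategy."""
    if not raw:
        return ("raw", raw)
    probe = zlib.compress(raw, QUICK_LEVEL)
    savings = 1 - len(probe) / len(raw)
    if savings > MIN_SAVINGS:
        # The cheap probe paid off, so spend the CPU on the thorough pass.
        return ("deflate", zlib.compress(raw, THOROUGH_LEVEL))
    # Small or incompressible input: store it as-is.
    return ("raw", raw)


def decode_body(codec: str, payload: bytes) -> bytes:
    """Reverse encode_body based on the stored codec label."""
    return zlib.decompress(payload) if codec == "deflate" else payload


text = b"Prepare first public release. " * 200
codec, payload = encode_body(text)
noise_codec, _ = encode_body(os.urandom(4096))
```

Repetitive text takes the compressed path, while random bytes fail the probe and are stored raw, so the expensive pass only runs when the cheap one suggests it is worthwhile.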
Agent Discoverability
The CLI is designed for agents to discover and use without human intervention. Tool descriptions can expose commands like:
```shell
# Write a document
mb put /tasks/release-checklist \
  "Prepare first public release." \
  --tag release --meta status=active

# Search by content and tag
mb search "public release" --tag release

# Apply a unified diff
mb patch /tasks/release-checklist --patch-file change.diff
```
Agents can infer the schema from command output and adapt their queries based on search results.
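One way an agent could produce the `change.diff` file used above is Python's standard `difflib` module. This is a sketch under assumptions: the `make_unified_diff` helper is hypothetical, and the `a`/`b` header labels follow the common Git convention because the source does not say which header format `mb patch` expects.

```python
import difflib


def make_unified_diff(old: str, new: str, label: str) -> str:
    """Return a unified diff that turns `old` into `new`."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a{label}",  # header labels are an assumption
        tofile=f"b{label}",
    )
    return "".join(lines)


old_body = "Prepare first public release.\nStatus: drafting\n"
new_body = "Prepare first public release.\nStatus: awaiting review\n"
diff_text = make_unified_diff(old_body, new_body, "/tasks/release-checklist")
print(diff_text)
```

Saving `diff_text` to `change.diff` gives a small patch containing only the changed line and its context, which is cheaper for an agent to generate and review than rewriting the whole body.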
Trade-offs: LevelDB vs. Vector Databases
| Dimension | LevelDB (MetaBrain) | Vector Database |
|---|---|---|
| Latency | Sub-millisecond key lookups | 10-100ms for similarity search |
| Storage | Local filesystem, no network | Often requires remote service |
| Query model | Lexical search with filters | Semantic similarity via embeddings |
| Setup | Zero config, workspace-local | Requires embedding model, API keys |
| Concurrency | One process per store at a time (LevelDB lock) | Varies by implementation |
| Cost | Free, local compute | API costs or self-hosted infrastructure |
MetaBrain optimizes for low latency, zero setup, and predictable behavior. Vector databases excel at semantic search but introduce network dependencies, API costs, and embedding model drift.
For agentic coding workflows, lexical search often suffices. Agents can search by exact terms, tags, or metadata without needing to embed queries or manage similarity thresholds.
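To show what lexical search with filters amounts to, here is a toy inverted index in Python. It is a conceptual sketch only: the `LexicalIndex` class, its tokenizer, and its all-terms-must-match rule are assumptions for illustration and say nothing about how MetaBrain indexes or ranks documents.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; a deliberately simple analyzer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class LexicalIndex:
    """Toy inverted index: term -> set of document paths."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)
        self.docs: dict[str, dict] = {}

    def add(self, path: str, body: str, tags=(), meta=None) -> None:
        # Note: re-adding a path does not remove its stale terms in this toy.
        self.docs[path] = {"tags": set(tags), "meta": dict(meta or {})}
        for term in tokenize(body):
            self.postings[term].add(path)

    def search(self, query: str, tag=None, meta=None) -> list[str]:
        terms = tokenize(query)
        if not terms:
            return []
        # A document matches only if it contains every query term.
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        results = []
        for path in sorted(hits):
            doc = self.docs[path]
            if tag is not None and tag not in doc["tags"]:
                continue
            if meta and any(doc["meta"].get(k) != v for k, v in meta.items()):
                continue
            results.append(path)
        return results


index = LexicalIndex()
index.add("/tasks/release-checklist", "Prepare first public release.",
          tags=["release"], meta={"status": "active"})
index.add("/notes/style", "Public API names use camelCase.", tags=["conventions"])
matches = index.search("public release", tag="release")
```

No embedding step or similarity threshold is involved: a query either contains the terms and passes the filters or it does not, which is what makes the results predictable for an agent.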
Concurrency and Failure Modes
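LevelDB permits only one process to hold a database open at a time (it takes an OS-level lock), although threads inside that process can safely share the handle. If multiple agents try to write to the same store concurrently, one will block or fail. For multi-agent workflows, use the `mbd` daemon to serialize writes across stores. Within a single store, concurrency control is the caller’s responsibility.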
Likely failure modes:
- Write conflicts: Two agents update the same document path simultaneously. Last write wins, no merge.
- Search index lag: If writes happen faster than indexing, search results may be stale.
- Compression overhead: ZSTD level 9 is CPU-intensive. For high-frequency writes, compression may bottleneck throughput.
- Disk space: Retained versions accumulate. No automatic pruning yet.
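Because concurrency control is left to the caller, one caller-side guard against the write-conflict case is an exclusive lock file held around each write. The sketch below is an assumption-laden illustration, not part of MetaBrain: the `store_write_lock` helper and any lock path you pass to it are hypothetical, and a crashed process would leave a stale lock that needs manual cleanup.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def store_write_lock(lock_path: str, timeout: float = 5.0, poll: float = 0.05):
    """Hold an exclusive lock file for the duration of a write."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process can create the lock at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        # Record the owner's PID to help diagnose stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

An agent would wrap its `mb put` or `mb patch` subprocess call in `with store_write_lock(...)`, so that a second writer waits or times out instead of silently overwriting the first.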
When Agents Should Use the Store
Agents should write to MetaBrain when:
- A decision or fact will be useful in future sessions.
- Task state needs to persist across context resets.
- Multiple agents need shared workspace knowledge.
Agents should skip the store when:
- The information is ephemeral (e.g., intermediate calculation).
- The context fits comfortably in the current window.
- The agent is already working from a retrieved document.
The CLI design encourages agents to check the store early in a session and write back before exiting.
Implementation Example
Here is a minimal agent loop that uses MetaBrain for task state. The subprocess pattern is intentional: agents see the CLI as a tool, not a library, which makes it discoverable through tool descriptions without requiring language-specific bindings.
```python
import subprocess


def mb_search(query, tag=None):
    """Run `mb search` and return its non-empty output lines."""
    cmd = ["mb", "search", query]
    if tag:
        cmd.extend(["--tag", tag])
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Drop blank lines so that an empty result is an empty (falsy) list.
    return [line for line in result.stdout.splitlines() if line.strip()]


def mb_put(path, content, tag=None, meta=None):
    """Run `mb put`, passing each metadata pair as its own --meta flag."""
    cmd = ["mb", "put", path, content]
    if tag:
        cmd.extend(["--tag", tag])
    if meta:
        for k, v in meta.items():
            cmd.extend(["--meta", f"{k}={v}"])
    subprocess.run(cmd, check=True)


# Agent starts a session
tasks = mb_search("status=active", tag="release")
if tasks:
    print(f"Resuming: {tasks[0]}")
else:
    print("No active tasks found.")

# Agent completes work and updates state
mb_put(
    "/tasks/release-checklist",
    "Release prep complete. Awaiting final review.",
    tag="release",
    meta={"status": "review"},
)
```
This pattern works with any agent framework that can shell out to CLI tools.
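For frameworks that describe tools with a JSON-Schema-style definition, the write command could be exposed roughly as follows. The schema shape, the `MB_PUT_TOOL` name, and the `build_put_argv` helper are hypothetical and not tied to any particular framework; only the `--tag` and `--meta` flags come from the commands shown earlier.

```python
MB_PUT_TOOL = {
    "name": "mb_put",
    "description": "Write a durable document to the workspace-local MetaBrain store.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "e.g. /tasks/release-checklist"},
            "content": {"type": "string", "description": "Document body text"},
            "tag": {"type": "string", "description": "Optional label for filtering"},
            "meta": {
                "type": "object",
                "additionalProperties": {"type": "string"},
                "description": "Optional key-value metadata, e.g. status=active",
            },
        },
        "required": ["path", "content"],
    },
}


def build_put_argv(args: dict) -> list[str]:
    """Translate tool-call arguments into an `mb put` argument list."""
    for key in MB_PUT_TOOL["parameters"]["required"]:
        if key not in args:
            raise ValueError(f"missing required argument: {key}")
    argv = ["mb", "put", args["path"], args["content"]]
    if args.get("tag"):
        argv += ["--tag", args["tag"]]
    # Sort metadata keys so the generated command is deterministic.
    for k, v in sorted(args.get("meta", {}).items()):
        argv += ["--meta", f"{k}={v}"]
    return argv


argv = build_put_argv({
    "path": "/tasks/release-checklist",
    "content": "Release prep complete. Awaiting final review.",
    "tag": "release",
    "meta": {"status": "review"},
})
```

Passing the resulting list to `subprocess.run` avoids shell quoting problems, since each argument reaches `mb` as a separate element rather than as part of an interpolated command string.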
Technical Verdict
Use MetaBrain when:
- You need persistent, searchable context for agentic workflows (LangGraph, Claude MCP tools, custom agent loops).
- You want zero-config, workspace-local storage.
- Lexical search with tags and metadata is sufficient.
- You are building tools that need embeddable document memory.
- You need multi-agent coordination and can run the `mbd` daemon to serialize writes.
Avoid MetaBrain when:
- You need semantic search or embedding-based retrieval.
- You require multi-writer concurrency without a daemon.
- You need automatic version pruning or garbage collection.
- You are working in a language ecosystem without easy subprocess access.
- Your agents already have reliable access to a vector database with acceptable latency.
MetaBrain fills a gap between ephemeral chat context and heavyweight vector databases. It gives agents a durable, discoverable memory layer without requiring API keys, network calls, or embedding models. The CLI-first design makes it easy for agents to adopt, and the workspace-local model keeps setup friction low.