AISuite's Unified Provider API: What Andrew Ng's Multi-LLM Interface Reveals About Agent Portability

Andrew Ng’s aisuite (14.3k GitHub stars and #6 among trending Python repositories at the time of writing) ships a two-layer abstraction: a Chat Completions API that normalizes calls across OpenAI, Anthropic, Google, Mistral, Hugging Face, AWS, Cohere, Ollama, and OpenRouter, plus an Agents API that adds tools, toolkits, and MCP (Model Context Protocol) integration. The reference implementation is OpenCoworker, a desktop agent that handles file access, Slack/email, PDF generation, and scheduled automations while letting you swap the underlying model mid-session.

This is not another wrapper library. It is infrastructure for agent portability: the plumbing that lets you change providers without rewriting tool-calling logic, state management, or permission boundaries.

Architecture: Two Layers, One Interface

┌───────────────────────────────────────────────┐
│ OpenCoworker                                  │ agent harness for everyday tasks
├───────────────────────────────────────────────┤
│ Agents API · Toolkits · MCP                   │
├───────────────────────────────────────────────┤
│ Chat Completions API                          │
├────────┬──────────┬────────┬────────┬─────────┤
│ OpenAI │ Anthropic│ Google │ Ollama │ Others  │
└────────┴──────────┴────────┴────────┴─────────┘

Chat Completions API normalizes the request/response shape. You call client.chat.completions.create(model="openai:gpt-4", messages=[...]) or model="anthropic:claude-3-5-sonnet" and get back the same structure. The library translates provider-specific quirks: Anthropic’s max_tokens requirement, Google’s generationConfig, Ollama’s local endpoint.
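
The routing idea behind the provider:model string can be sketched in a few lines. This is a minimal illustration, assuming the text before the first colon selects the provider and everything after it is passed through as the model name; split_model_id is a hypothetical helper, not part of aisuite.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider:model' into (provider, model).

    Only the first colon is treated as the separator, so model names that
    contain colons themselves (for example Ollama tags) are preserved.
    """
    provider, sep, model = model_id.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {model_id!r}")
    return provider, model


print(split_model_id("openai:gpt-4"))
print(split_model_id("ollama:llama3.1:8b"))
```

Splitting on the first colon only is a deliberate choice here: a naive split(":") would break on local model tags that carry their own colon.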

Agents API sits on top and adds:

Tool registration and execution
Toolkit bundles (file operations, web search, code execution)
MCP server integration for external context
State persistence across turns

OpenCoworker is the harness that wraps this stack with a desktop UI, permission dialogs, and a scheduler.

Tool Calling Normalization: The Hard Part

Each provider implements function calling differently:

Provider	Schema Format	Response Key	Parallel Calls
OpenAI	JSON Schema	`tool_calls`	Yes
Anthropic	JSON Schema (`input_schema`)	`tool_use` blocks in `content`	Yes
Google	FunctionDeclaration	`functionCall`	Yes (newer Gemini models)
Ollama	OpenAI-compatible	`tool_calls`	Depends on model

aisuite translates your tool definition into the provider’s native format before sending the request. When the response arrives, it parses the provider-specific structure back into a unified ToolCall object.

from aisuite import Client

client = Client()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

# Same tool definition works across providers
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=tools
)

# Switch provider, same code
response = client.chat.completions.create(
    model="anthropic:claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=tools
)

The library handles:

Converting OpenAI-style tools entries (schema under function.parameters) into Anthropic’s tools format (schema under input_schema)
Mapping the unified schema to Google’s FunctionDeclaration
Extracting tool call IDs and arguments from different response shapes
Formatting tool results back into the next request
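
The translation steps above can be sketched without any network calls. This is not aisuite's actual code: the ToolCall dataclass and the three helper functions are hypothetical names, and the sample messages are hand-written dicts that follow the publicly documented OpenAI and Anthropic response shapes.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical unified record; aisuite's real class may differ."""
    id: str
    name: str
    arguments: dict


def to_anthropic_tool(openai_tool: dict) -> dict:
    """OpenAI nests the schema under function.parameters;
    Anthropic expects a flat entry with the schema under input_schema."""
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }


def from_openai_message(message: dict) -> list[ToolCall]:
    """OpenAI returns arguments as a JSON-encoded string."""
    return [
        ToolCall(tc["id"], tc["function"]["name"],
                 json.loads(tc["function"]["arguments"]))
        for tc in message.get("tool_calls") or []
    ]


def from_anthropic_message(message: dict) -> list[ToolCall]:
    """Anthropic returns tool_use content blocks with input already parsed."""
    return [
        ToolCall(block["id"], block["name"], block["input"])
        for block in message["content"]
        if block.get("type") == "tool_use"
    ]


# Hand-written sample responses (assumed shapes, no API call involved).
openai_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "SF"}'},
    }],
}
anthropic_msg = {
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Checking the weather."},
        {"type": "tool_use", "id": "toolu_1", "name": "get_weather",
         "input": {"location": "SF"}},
    ],
}

print(from_openai_message(openai_msg))
print(from_anthropic_message(anthropic_msg))
```

Both parsers yield the same name and arguments, which is the property the agent loop depends on: downstream tool execution never needs to know which provider produced the call.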

State Persistence and Provider Switching

The Agents API maintains conversation state independently of the provider. When you switch from openai:gpt-4 to anthropic:claude-3-5-sonnet mid-session, the agent:

Serializes the message history into a provider-agnostic format
Translates tool call records into the new provider’s schema
Re-sends context with the new model
Continues execution without losing state

This works because aisuite stores messages in a normalized structure:

{
    "role": "user" | "assistant" | "tool",
    "content": "string or null",
    "tool_calls": [{"id": "...", "function": {...}}],
    "tool_call_id": "string (for tool role)"
}
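
To make the provider switch concrete, here is a sketch of replaying a normalized history in Anthropic's message shape. The function name to_anthropic_messages is hypothetical and this is not taken from aisuite's source; it assumes the history holds only user, assistant, and tool roles (Anthropic takes the system prompt as a separate parameter, so it is left out here).

```python
import json


def to_anthropic_messages(history: list[dict]) -> list[dict]:
    """Translate normalized (OpenAI-style) messages into Anthropic's shape."""
    out = []
    for msg in history:
        if msg["role"] == "tool":
            # Anthropic carries tool results as tool_result blocks in a user turn.
            block = {
                "type": "tool_result",
                "tool_use_id": msg["tool_call_id"],
                "content": msg["content"],
            }
            # Consecutive tool results are merged into a single user turn.
            if out and out[-1]["role"] == "user" and isinstance(out[-1]["content"], list):
                out[-1]["content"].append(block)
            else:
                out.append({"role": "user", "content": [block]})
        elif msg["role"] == "assistant" and msg.get("tool_calls"):
            blocks = []
            if msg.get("content"):
                blocks.append({"type": "text", "text": msg["content"]})
            for tc in msg["tool_calls"]:
                blocks.append({
                    "type": "tool_use",
                    "id": tc["id"],
                    "name": tc["function"]["name"],
                    # Normalized arguments are a JSON string; Anthropic wants a dict.
                    "input": json.loads(tc["function"]["arguments"]),
                })
            out.append({"role": "assistant", "content": blocks})
        else:
            out.append({"role": msg["role"], "content": msg["content"]})
    return out


history = [
    {"role": "user", "content": "Weather in SF?"},
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "SF"}'},
    }]},
    {"role": "tool", "tool_call_id": "call_1", "content": "18C, fog"},
]

for message in to_anthropic_messages(history):
    print(message)
```

The tool call and its result stay linked through the shared ID, which is what lets the new model pick up a half-finished tool exchange.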

OpenCoworker uses this to let users switch models in settings without breaking active automations. A scheduled daily news summary can start with a local Ollama model, then fall back to OpenAI if the local model fails.
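
A fallback chain like the one described can be sketched as a loop over candidate models. The helper complete_with_fallback and the fake_call stand-in are hypothetical; in real use the callable would wrap client.chat.completions.create(model=model, messages=...), and the model names below are placeholders.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    call: Callable[[str], str],
    models: Sequence[str],
) -> tuple[str, str]:
    """Try each model in order; return (model_used, result) from the first success."""
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # sketch only: real code should catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stand-in for a real completion call; the local model is simulated as down."""
    if model.startswith("ollama:"):
        raise ConnectionError("local server not reachable")
    return f"summary from {model}"


used, text = complete_with_fallback(fake_call, ["ollama:llama3.1", "openai:gpt-4"])
print(used, "->", text)
```

Because the message history is provider-agnostic, the same history can be passed to every candidate in the chain without reshaping it per attempt.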

OpenCoworker: Reference Harness Implementation

OpenCoworker demonstrates production-grade agent plumbing:

Sandboxing and Permissions

File access requires explicit user approval per directory
Slack/email integrations use OAuth tokens stored in OS keychain
Code execution runs in isolated subprocess with timeout
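
The subprocess-with-timeout pattern for code execution might look like the sketch below. It uses only the Python standard library; run_snippet is a hypothetical name, and OpenCoworker's actual implementation may differ. Note that a separate process with a timeout limits runaway code but is not a security sandbox by itself.

```python
import subprocess
import sys


def run_snippet(code: str, timeout_s: float = 5.0) -> dict:
    """Run Python code in a separate interpreter process with a wall-clock limit."""
    try:
        proc = subprocess.run(
            # -I is isolated mode: ignores PYTHON* env vars and user site-packages.
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return {"ok": False, "stdout": "", "error": f"timed out after {timeout_s}s"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "error": proc.stderr if proc.returncode else "",
    }


print(run_snippet("print(1 + 1)"))
print(run_snippet("import time; time.sleep(10)", timeout_s=0.5))
```

Returning a plain dict instead of raising keeps the result easy to feed back to the model as a tool message, whether the snippet succeeded, crashed, or timed out.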

Scheduled Automations

Cron-style scheduler backed by SQLite
Each automation stores its own model preference and tool allowlist
Failures trigger fallback to alternate provider if configured
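
One way to store per-automation settings in SQLite is sketched below. The table layout, column names, and the model identifiers are assumptions made for illustration; the article does not document OpenCoworker's real schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # a real app would use a file on disk
conn.execute("""
    CREATE TABLE automations (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL UNIQUE,
        cron           TEXT NOT NULL,      -- cron-style schedule expression
        model          TEXT NOT NULL,      -- preferred provider:model
        fallback_model TEXT,               -- optional alternate provider
        allowed_tools  TEXT NOT NULL       -- JSON-encoded tool allowlist
    )
""")
conn.execute(
    "INSERT INTO automations (name, cron, model, fallback_model, allowed_tools) "
    "VALUES (?, ?, ?, ?, ?)",
    ("daily-news", "0 8 * * *", "ollama:llama3.1", "openai:gpt-4",
     json.dumps(["web_search"])),
)
conn.commit()


def load_automation(conn: sqlite3.Connection, name: str) -> dict:
    """Fetch one automation and decode its tool allowlist."""
    row = conn.execute(
        "SELECT name, cron, model, fallback_model, allowed_tools "
        "FROM automations WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    keys = ("name", "cron", "model", "fallback_model", "allowed_tools")
    automation = dict(zip(keys, row))
    automation["allowed_tools"] = json.loads(automation["allowed_tools"])
    return automation


def is_tool_allowed(automation: dict, tool_name: str) -> bool:
    """Deny by default: only tools on the allowlist may run."""
    return tool_name in automation["allowed_tools"]


job = load_automation(conn, "daily-news")
print(job["model"], job["fallback_model"], is_tool_allowed(job, "send_email"))
```

Keeping the allowlist on the automation row means a scheduled job cannot gain access to a tool just because the interactive session has it enabled.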

Observability

All LLM calls logged to local SQLite with request/response/latency
Tool execution traces include input/output/duration
UI shows real-time token usage per provider
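
Call logging with latency can be sketched as a thin wrapper around whatever function performs the completion. The table layout and the make_logged_call helper are hypothetical, and the wrapper assumes the response is JSON-serializable.

```python
import json
import sqlite3
import time
from typing import Callable


def make_logged_call(conn: sqlite3.Connection, call: Callable) -> Callable:
    """Wrap call(model, messages) so every invocation is recorded in SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_calls ("
        "ts REAL, model TEXT, request TEXT, response TEXT, latency_ms REAL)"
    )

    def logged(model: str, messages: list) -> dict:
        start = time.perf_counter()
        response = call(model, messages)
        latency_ms = (time.perf_counter() - start) * 1000
        conn.execute(
            "INSERT INTO llm_calls VALUES (?, ?, ?, ?, ?)",
            (time.time(), model, json.dumps(messages), json.dumps(response), latency_ms),
        )
        conn.commit()
        return response

    return logged


def fake_completion(model: str, messages: list) -> dict:
    """Stand-in for a real provider call."""
    return {"model": model, "content": "ok"}


log_conn = sqlite3.connect(":memory:")
logged_call = make_logged_call(log_conn, fake_completion)
logged_call("openai:gpt-4", [{"role": "user", "content": "ping"}])
print(log_conn.execute("SELECT model, latency_ms FROM llm_calls").fetchall())
```

This version logs only successful calls; recording failures as well would need a try/finally around the wrapped call.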

Deployment Shape

Electron app (macOS ARM64, Windows x64)
Bundled Python runtime with aisuite pre-installed
API keys stored in OS credential manager, never in config files
Updates via GitHub releases with signature verification

Failure Modes and Mitigation

Provider-Specific Rate Limits

aisuite does not implement retry logic. Your code must handle RateLimitError.
OpenCoworker uses exponential backoff with jitter and falls back to alternate provider after 3 failures.
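
Exponential backoff with jitter, as described above, can be sketched as follows. RateLimited is a stand-in exception defined here for illustration (substitute whatever rate-limit error your provider SDK raises), and call_with_backoff is a hypothetical helper rather than OpenCoworker's code.

```python
import random
import time
from typing import Callable


class RateLimited(Exception):
    """Stand-in for the rate-limit error raised by a provider SDK."""


def call_with_backoff(
    call: Callable[[], object],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry call() on RateLimited, waiting a jittered, exponentially growing delay."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller fall back to another provider
            # "Full jitter": pick a random wait up to the exponential cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

With the defaults, the third consecutive failure re-raises, and that is the point at which a caller would switch to the alternate provider. The sleep parameter is injectable so the retry logic can be tested without real waiting.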

Tool Schema Incompatibility

Some models (older Ollama versions, certain Hugging Face endpoints) do not support function calling.
The library raises ToolsNotSupportedError at request time. Check client.supports_tools(model) before registering tools.

Context Window Overflow

Switching from a 128k-context model (GPT-4 Turbo) to an 8k model (the original GPT-4, or many local models) mid-session can leave a history that no longer fits the new context window.
aisuite does not auto-truncate. You must implement a sliding window or summarization yourself.
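
A sliding window is the simpler of the two options. The sketch below assumes a crude estimate of about four characters per token (replace it with a real tokenizer for the target model); estimate_tokens and fit_to_window are hypothetical helpers.

```python
def estimate_tokens(message: dict) -> int:
    """Rough heuristic: about 4 characters per token, minimum 1."""
    return max(1, len(message.get("content") or "") // 4)


def fit_to_window(history: list[dict], max_tokens: int) -> list[dict]:
    """Keep system messages plus the most recent messages that fit the budget."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m) for m in system)

    kept = []
    for msg in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    kept.reverse()

    # Do not start with a tool result whose originating tool call was dropped.
    while kept and kept[0]["role"] == "tool":
        kept.pop(0)
    return system + kept


history = [{"role": "system", "content": "You are helpful."}]
history += [
    {"role": "user" if i % 2 == 0 else "assistant", "content": str(i) * 40}
    for i in range(6)
]
trimmed = fit_to_window(history, max_tokens=25)
print([m["content"][:1] for m in trimmed])
```

Dropping an orphaned tool result matters because most providers reject a tool message that does not follow the assistant turn that requested it.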

MCP Server Availability

MCP integration assumes the external server is reachable. Network failures surface as tool execution errors.
OpenCoworker retries MCP calls once, then marks the tool as unavailable for the session.
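
The retry-once-then-disable behavior can be sketched as a small session-scoped wrapper. SessionTools is a hypothetical class, and it assumes network failures surface as ConnectionError; a real MCP client may raise a different exception type.

```python
class SessionTools:
    """Tracks which tools have been disabled for the current session."""

    def __init__(self, retries: int = 1):
        self.retries = retries
        self.unavailable: set[str] = set()

    def call(self, name: str, fn, *args, **kwargs):
        """Invoke fn, retrying on ConnectionError; disable the tool if all attempts fail."""
        if name in self.unavailable:
            raise RuntimeError(f"tool {name!r} is unavailable for this session")
        last_error = None
        for _ in range(1 + self.retries):  # first attempt plus retries
            try:
                return fn(*args, **kwargs)
            except ConnectionError as exc:
                last_error = exc
        self.unavailable.add(name)
        raise RuntimeError(f"tool {name!r} failed and was disabled") from last_error


def unreachable_server(query: str) -> str:
    """Stand-in for an MCP tool call whose server is down."""
    raise ConnectionError("MCP server not reachable")


tools = SessionTools()
try:
    tools.call("docs_search", unreachable_server, "quarterly report")
except RuntimeError as exc:
    print(exc)
print(tools.unavailable)
```

Marking the tool unavailable avoids paying the connection timeout on every later turn, at the cost of not noticing if the server comes back before the session ends.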

Trade-Offs: When Abstraction Costs You

Benefit	Cost
Swap providers without rewrite	Cannot use provider-specific features (e.g., Anthropic’s prompt caching, OpenAI’s structured outputs)
Unified tool calling schema	Lowest-common-denominator: features that not every provider or model supports, such as parallel calls or streaming tool results, cannot be relied on
State persistence across models	Serialization overhead on every turn
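Local + cloud model support	Must handle wildly different latencies (illustrative: roughly 2s for a local Ollama model vs. 200ms for GPT-4; actual figures depend on hardware, model size, and output length)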

If you need Anthropic’s prompt caching or OpenAI’s structured outputs, you will have to bypass aisuite and call the provider SDK directly. The library is for portability, not feature maximization.

Technical Verdict

Use aisuite when:

You want to A/B test models without rewriting agent logic
You need fallback paths (cloud to local, OpenAI to Anthropic)
You are building a multi-tenant system where users bring their own API keys
You want to avoid vendor lock-in at the orchestration layer

Avoid aisuite when:

You need provider-specific features (prompt caching, structured outputs, vision with specific formatting)
You are optimizing for the absolute lowest latency (abstraction adds 10-20ms per call)
You require streaming tool calls (not yet supported across all providers)
You need fine-grained control over retry logic and rate limiting

OpenCoworker proves the architecture works for real desktop agents. The code under platform/ is the blueprint: permission boundaries, state management, and provider switching without breaking user workflows.