Most backend code for agent platforms is wiring, not logic. You define an agent resource, then write a SQLAlchemy model, a repository with upsert semantics, HTTP handlers, tenant filters, retry logic, and idempotency checks. The business logic is a thin layer inside a thick shell of infrastructure glue.
PingerAgents, a multitenant AI agent orchestration platform, addresses this with a domain-driven design framework for Python. Inspired by Elixir’s Ash framework, it derives persistence, durable execution handlers, and tenant isolation from a single resource class. You write the domain model once. The framework generates the rest.
How It Works
A Resource subclass is simultaneously a SQLAlchemy model, a repository, and a Restate VirtualObject. You declare the domain entity, and the framework derives database schemas, API endpoints, and permission boundaries.
    from datetime import datetime

    from sqlalchemy import DateTime, String
    from sqlalchemy.orm import Mapped, mapped_column

    from ironbridge.shared.framework import Resource, ActionKind, action
    from ironbridge.shared.framework.effects import ActionContext

    # _cuid and _utcnow are assumed to be default-value helpers defined elsewhere (not shown here).

    class Widget(Resource):
        class Meta:
            tenant_scoped = True   # inject tenant_id, enforce via Postgres RLS
            restate_object = True  # derive a Restate VirtualObject

        __tablename__ = "widgets"

        id: Mapped[str] = mapped_column(String, primary_key=True, default=_cuid)
        name: Mapped[str] = mapped_column(String, nullable=False)
        status: Mapped[str] = mapped_column(String, default="ACTIVE")
        created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)

        @action(kind=ActionKind.CREATE)
        def create(self, ctx: ActionContext):
            # Business logic only
            self.status = "ACTIVE"
            return self
The Meta class controls code generation. tenant_scoped=True injects a tenant_id column and enforces row-level security. restate_object=True generates a Restate VirtualObject with durable execution semantics. The @action decorator marks methods as state transitions and derives HTTP routes, GraphQL mutations, and permission checks.
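To make the decorator mechanics concrete, here is a minimal, standard-library-only sketch of how a decorator can tag methods and how a base class can collect them. It is not the framework's actual implementation: the `__action_kind__` attribute, the `actions` registry, and the simplified `Resource` and `ActionKind` classes are assumptions made for illustration.

```python
from enum import Enum
from typing import Callable


class ActionKind(Enum):
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"


def action(kind: ActionKind) -> Callable:
    """Tag a method as a state transition (hypothetical, simplified)."""
    def decorator(fn: Callable) -> Callable:
        fn.__action_kind__ = kind  # metadata that can be discovered later
        return fn
    return decorator


class Resource:
    """Simplified base class that collects tagged methods per subclass."""
    actions: dict = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only methods carrying the decorator's marker become actions.
        cls.actions = {
            name: member.__action_kind__
            for name, member in vars(cls).items()
            if callable(member) and hasattr(member, "__action_kind__")
        }


class Widget(Resource):
    def __init__(self):
        self.status = None

    @action(kind=ActionKind.CREATE)
    def create(self, ctx=None):
        self.status = "ACTIVE"
        return self

    def helper(self):  # not decorated, so it is not registered as an action
        return None


print(Widget.actions)
```

Once the registry exists, routes, mutations, and permission checks can all be derived by iterating over `Widget.actions` rather than by hand-writing each one.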
Architecture: Three Layers of Derivation
The framework operates in three phases: schema generation, runtime reflection, and execution dispatch.
1. Schema Generation
At startup, the framework scans all Resource subclasses and generates:
- Database migrations: SQLAlchemy models with tenant isolation columns and indexes.
- REST routes: FastAPI endpoints for each @action method, with tenant context injection.
- GraphQL schema: Mutations and queries derived from action signatures.
- RBAC rules: Permission boundaries based on ActionKind (CREATE, UPDATE, DELETE, READ).
The framework uses Python’s inspect module to extract type hints and method signatures. It builds a dependency graph to handle relationships between resources (e.g., an Agent has many Tasks).
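The sketch below shows what signature extraction and dependency ordering can look like with only the standard library. The `depends_on` attribute and the `describe` and `creation_order` helpers are hypothetical names chosen for this example; the framework's own internals may differ.

```python
import inspect
from graphlib import TopologicalSorter
from typing import get_type_hints


class Agent:
    depends_on = ()  # hypothetical attribute listing resources this one references

    def rename(self, name: str, notify: bool = False) -> None:
        return None


class Task:
    depends_on = ("Agent",)  # a Task belongs to an Agent

    def assign(self, agent_id: str) -> None:
        return None


def describe(method) -> dict:
    """Extract parameter names, type names, and required flags from a signature."""
    hints = get_type_hints(method)
    params = {}
    for name, param in inspect.signature(method).parameters.items():
        if name == "self":
            continue
        params[name] = {
            "type": hints.get(name, object).__name__,
            "required": param.default is inspect.Parameter.empty,
        }
    return params


def creation_order(resources) -> list:
    """Order resources so that each one comes after the resources it depends on."""
    graph = {cls.__name__: set(cls.depends_on) for cls in resources}
    return list(TopologicalSorter(graph).static_order())


print(describe(Agent.rename))
print(creation_order([Task, Agent]))
```

A useful property of `graphlib.TopologicalSorter` is that `static_order()` raises `graphlib.CycleError` when the graph contains a cycle, which gives a natural place to fail fast on circular references.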
2. Runtime Reflection
When a request arrives, the framework:
- Extracts the tenant ID from the JWT or API key.
- Looks up the target resource class.
- Validates the action against the user’s role.
- Loads the current state from the database (if the action is UPDATE or DELETE).
This happens in middleware, before the action method runs. The action method receives a clean ActionContext with the tenant ID, user ID, and current state pre-loaded.
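The four steps above can be sketched as a single function. Everything here is a stand-in: the in-memory `RESOURCES`, `ROLE_PERMISSIONS`, and `DATABASE` tables, the `Forbidden` exception, and the shape of `ActionContext` are assumptions, and the token is assumed to have been verified and decoded into `claims` already.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionContext:
    tenant_id: str
    user_id: str
    current_state: Optional[dict]


class Forbidden(Exception):
    """Raised when the caller's role does not allow the requested action."""


# Hypothetical in-memory stand-ins for the resource registry, RBAC rules, and database.
RESOURCES = {"widgets"}
ROLE_PERMISSIONS = {
    "admin": {"create", "read", "update", "delete"},
    "viewer": {"read"},
}
DATABASE = {("tenant-a", "widgets", "w1"): {"id": "w1", "status": "ACTIVE"}}


def build_context(claims: dict, resource: str, kind: str,
                  resource_id: Optional[str] = None) -> ActionContext:
    tenant_id = claims["tenant_id"]                      # 1. tenant from the decoded token
    if resource not in RESOURCES:                        # 2. look up the target resource
        raise LookupError(f"unknown resource: {resource}")
    if kind not in ROLE_PERMISSIONS.get(claims["role"], set()):
        raise Forbidden(f"{claims['role']} may not {kind} {resource}")  # 3. role check
    state = None
    if kind in {"update", "delete"}:                     # 4. pre-load current state
        # The tenant ID is part of the lookup key, so other tenants' rows are invisible.
        state = DATABASE.get((tenant_id, resource, resource_id))
    return ActionContext(tenant_id, claims["sub"], state)


admin = {"tenant_id": "tenant-a", "sub": "user-1", "role": "admin"}
ctx = build_context(admin, "widgets", "update", "w1")
print(ctx)
```

Keeping these checks in one place means an action method never sees a request that failed tenant resolution or authorization.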
3. Execution Dispatch
The framework integrates with Restate for durable execution. Each resource with restate_object=True becomes a VirtualObject. Actions are dispatched as Restate handlers, with automatic retries, idempotency, and state snapshots.
When an action completes, the framework:
- Persists the new state to the database.
- Emits a domain event to the event bus.
- Updates the Restate state snapshot.
If the action fails, Restate retries with exponential backoff. If the action is non-idempotent (e.g., sending an email), the framework uses Restate’s idempotency keys to prevent duplicates.
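For intuition, the retry and deduplication behavior can be imitated in a few lines of plain Python. This is a toy model, not Restate's API: Restate performs retries and idempotency-key handling itself, and the `run_once` helper, the in-memory `_completed` store, and the key format are all hypothetical.

```python
import time
from typing import Callable

_completed: dict = {}  # idempotency key -> stored result (in-memory stand-in)


def run_once(key: str, handler: Callable[[], object], attempts: int = 4,
             base_delay: float = 1.0,
             sleep: Callable[[float], None] = time.sleep) -> object:
    """Run handler at most once per key, retrying failures with exponential backoff."""
    if key in _completed:
        return _completed[key]  # duplicate request: return the stored result
    for attempt in range(attempts):
        try:
            result = handler()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted, surface the error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
        else:
            _completed[key] = result
            return result
    raise ValueError("attempts must be at least 1")


calls = []


def send_email() -> str:
    calls.append("attempt")
    if len(calls) < 3:
        raise ConnectionError("mail server unavailable")
    return "sent"


delays = []  # record the delays instead of actually sleeping
first = run_once("email:w1:welcome", send_email, sleep=delays.append)
second = run_once("email:w1:welcome", send_email, sleep=delays.append)
print(first, second, delays, len(calls))
```

The second call with the same key returns the stored result without invoking the handler again, which is the property that prevents a duplicate email.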
Trade-Offs: Code Generation vs. Runtime Reflection
| Approach | Pros | Cons |
|---|---|---|
| Code generation (Ash, Prisma) | Fast runtime, explicit artifacts, easier debugging | Requires build step, harder to extend dynamically, tooling complexity |
| Runtime reflection (Django ORM, Rails) | No build step, dynamic extension, simpler tooling | Slower startup, harder to trace, magic behavior |
| Hybrid (PingerAgents) | Fast runtime for hot paths, dynamic for rare cases, explicit schema | Complexity in framework internals, two mental models |
PingerAgents uses a hybrid approach. It generates database migrations and GraphQL schemas at build time, but reflects on action methods at runtime. This keeps startup fast while allowing dynamic extension (e.g., plugins that add new actions without rerunning the build step).
Schema Evolution and Migration
The domain model is the single source of truth. When you add a field to a resource, the framework generates a migration. When you remove a field, it generates a deprecation warning and a migration that makes the column nullable (but doesn’t drop it, to avoid data loss).
For breaking changes (e.g., renaming a field), you write a two-step migration:
- Add the new field, copy data from the old field.
- Mark the old field as deprecated, remove it in the next release.
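The two steps can be illustrated with an in-memory SQLite database. SQLite is used here only so the example runs anywhere; the platform itself targets Postgres, and the `widgets` table, the `display_name` column, and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widgets (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO widgets (id, name) VALUES ('w1', 'First widget')")


def expand(conn: sqlite3.Connection) -> None:
    """Step 1: add the new column and copy data over; old readers keep working."""
    conn.execute("ALTER TABLE widgets ADD COLUMN display_name TEXT")
    conn.execute("UPDATE widgets SET display_name = name")
    conn.commit()


def contract(conn: sqlite3.Connection) -> None:
    """Step 2, in a later release: drop the deprecated column.

    DROP COLUMN requires SQLite 3.35 or newer, so this is not called below.
    """
    conn.execute("ALTER TABLE widgets DROP COLUMN name")
    conn.commit()


def columns(conn: sqlite3.Connection) -> list:
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(widgets)")]


expand(conn)
print(columns(conn))
```

Between the two steps both columns exist, so code that still reads the old field and code that reads the new one can run side by side during a rollout.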
The framework tracks schema versions in a _schema_versions table. Each resource has a version number. When a request arrives, the framework checks if the resource version matches the database version. If not, it runs pending migrations.
This works well for additive changes. It breaks down for complex refactorings (e.g., splitting one resource into two). For those, you write a custom migration script.
Observability and Failure Modes
The framework logs every action dispatch, state transition, and database write. It integrates with OpenTelemetry to trace requests across the stack (HTTP → action method → database → event bus → Restate).
Common failure modes:
- Tenant isolation leak: If you forget tenant_scoped=True, the framework doesn’t enforce row-level security. You must audit all resources.
- Action method side effects: If an action method calls an external API without using Restate’s side-effect API, retries will duplicate the call. You must wrap external calls in ctx.run().
- Schema version mismatch: If you deploy a new version of the app before running migrations, requests fail with a version mismatch error. You must run migrations before deploying.
- Circular dependencies: If two resources reference each other, the dependency graph has a cycle. The framework detects this at startup and fails fast.
- Agent state corruption during retries: If an agent action modifies external state (e.g., calls an LLM API) and then fails before persisting, Restate retries the entire action. The LLM call happens twice, but only one result is saved. You must make LLM calls idempotent or use Restate’s side-effect tracking.
- Task scheduling race conditions: If two workers try to schedule the same task concurrently, both may succeed at the database level but only one Restate handler is created. The framework uses database-level locks on task creation to prevent this.
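The side-effect and retry failure modes above share one remedy: record the result of an external call so that a retry replays it instead of repeating it. The toy `Journal` below imitates that idea in memory. It is not the Restate SDK, and `call_llm`, `summarize`, and the step name `"llm-call"` are hypothetical.

```python
from typing import Callable


class Journal:
    """Toy stand-in for a durable execution journal, kept in memory."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def run(self, name: str, fn: Callable[[], object]) -> object:
        # First execution: call fn and record the result under the step name.
        # On a retry: return the recorded result without calling fn again.
        if name not in self._entries:
            self._entries[name] = fn()
        return self._entries[name]


llm_calls = []


def call_llm(prompt: str) -> str:
    llm_calls.append(prompt)  # stands in for an expensive external request
    return f"summary of {prompt}"


def summarize(journal: Journal, prompt: str, fail_after_llm: bool) -> str:
    result = journal.run("llm-call", lambda: call_llm(prompt))
    if fail_after_llm:
        raise RuntimeError("crashed before persisting")
    return result


journal = Journal()
try:
    summarize(journal, "ticket-42", fail_after_llm=True)  # first attempt fails late
except RuntimeError:
    pass
saved = summarize(journal, "ticket-42", fail_after_llm=False)  # retry replays the journal
print(saved, len(llm_calls))
```

The action body runs twice, but the external call runs once, because the retry reads the recorded result from the journal.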
Deployment Shape
The framework assumes a specific deployment shape:
- Database: Postgres with row-level security enabled.
- Event bus: Kafka or NATS for domain events.
- Durable execution: Restate for state snapshots and retries.
- HTTP layer: FastAPI or Django for HTTP endpoints.
You can swap out components (e.g., use RabbitMQ instead of Kafka), but you must implement the adapter interface. The framework provides adapters for common tools, but not for everything.
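An adapter interface of this kind is often expressed as a structural protocol. The sketch below assumes a hypothetical `EventBus` protocol with a single `publish` method; the framework's real adapter interface is not shown in this article and may look different.

```python
from typing import Protocol


class EventBus(Protocol):
    """Hypothetical adapter interface: anything with a matching publish() qualifies."""

    def publish(self, topic: str, payload: dict) -> None: ...


class InMemoryEventBus:
    """Test adapter that records events instead of sending them to a broker."""

    def __init__(self) -> None:
        self.published: list = []

    def publish(self, topic: str, payload: dict) -> None:
        self.published.append((topic, payload))


def emit_domain_event(bus: EventBus, resource: str, action: str, tenant_id: str) -> None:
    # The caller depends only on the protocol, not on Kafka, NATS, or RabbitMQ.
    bus.publish(f"{resource}.{action}", {"tenant_id": tenant_id})


bus = InMemoryEventBus()
emit_domain_event(bus, "widget", "created", "tenant-a")
print(bus.published)
```

A RabbitMQ adapter would then be another class with the same `publish` signature, and the calling code would not change.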
When to Use This Pattern
This framework is a good fit if you:
- Build multitenant platforms with many similar resources (agents, tasks, workflows).
- Need consistent tenant isolation, RBAC, and audit logging across all resources.
- Want to reduce boilerplate and focus on business logic.
- Can tolerate a build step for schema generation.
Avoid this pattern if you:
- Build single-tenant apps with heterogeneous resources.
- Need full control over database queries and indexes.
- Can’t use Postgres or Restate.
- Prefer explicit code over framework magic.
Technical Verdict
Use PingerAgents’ domain-driven framework when you’re building a multitenant agent platform with dozens of resources that share common patterns (CRUD, tenant isolation, durable execution). The upfront investment in framework setup pays off after the third or fourth resource.
Avoid it when you’re building a single-tenant app, need fine-grained control over database queries, or can’t adopt Restate for durable execution. The framework’s opinions are strong, and fighting them is more work than writing the glue code yourself.
The hybrid approach (generate schemas, reflect on actions) is the right trade-off for agent platforms. It keeps runtime fast while allowing dynamic extension. The main risk is schema evolution: complex refactorings require custom migration scripts, and the framework can’t help you there.