Oracle APEX 26.1 shipped an AI Agent feature on May 14, 2026. Within 72 hours, a developer ran a complete red-team exercise against a local Docker install wired to Anthropic Claude Sonnet 4.6. The results show which attack patterns Claude’s model-level guardrails handle autonomously and which require tool-layer defenses that most APEX developers haven’t configured yet.
The experiment used a deliberately weak system prompt, zero security configuration, and permissive tool access. Claude refused 7 of 10 attacks on its own. The 3 that succeeded are the same 3 every Oracle DBA needs to defend at the integration layer.
What the Red-Team Setup Revealed
The author published a 16-minute silent walkthrough with burned-in captions covering the Docker install, baseline chat configuration, and every attack in sequence. The video documents findings that aren’t in Oracle’s release notes:
- The Oracle DB Free Docker image ships APEX 24.2, not 26.1. The 26.1 upgrade is a manual install step.
- 14+ pre-built AI Agents ship inside APEX 26.1 itself. Oracle uses its own agent feature internally for tasks like schema documentation and query generation.
The local install used Oracle AI Database Free 23.26.1.0.0, APEX 26.1.0, and ORDS 26.1.1, all inside a Docker container. The agent was wired to Anthropic Claude Sonnet 4.6 with no additional prompt filtering, output sanitization, or tool authorization controls beyond what APEX ships by default.
Attack Taxonomy: What Claude Blocked
Seven attack patterns triggered Claude’s autonomous refusal mechanisms. These attacks did not reach the tool layer because the model blocked them during response generation.
| Attack Class | Claude Blocked? |
|---|---|
| Direct prompt injection (pirate jailbreak) | ✅ |
| Indirect injection via tool output (poisoned RAG note) | ✅ |
| Explicit credential exfiltration | ✅ |
| Destructive DML (DELETE) | ✅ |
| Destructive DDL (DROP) | ✅ |
| Retry of prior failure | ✅ |
| Reframing bypass within session | ✅ |
These defenses operate at the model layer. APEX does not need to implement additional prompt filtering or output sanitization for these cases. The model simply refuses to generate tool calls or responses that match known attack patterns.
The blocked attacks included:
- Explicit jailbreak attempts (pirate mode, ignore instructions).
- Indirect prompt injection via tool outputs (poisoned RAG notes).
- Credential exfiltration requests (API keys, passwords).
- Destructive SQL operations (DELETE, DROP).
- Retry and reframing attacks within the same session.
Attack Taxonomy: What Got Through
Three attack patterns succeeded because they did not trigger Claude’s refusal heuristics. Each requested an operation within the agent’s declared tool capabilities and was framed as a legitimate use case.
| Attack Class | Claude Blocked? |
|---|---|
| Reconnaissance disguised as audit | ❌ |
| Capability-bounded side effects | ❌ |
| Cross-session reframing bypass | ❌ |
The source describes these as attacks that “require defense at the tool layer” because they operate within the agent’s declared capabilities. The model sees them as legitimate requests.
Reconnaissance disguised as audit: Requests framed as compliance checks or audits that query database metadata. Claude generates tool calls because the requests are semantically valid within the agent’s scope.
Capability-bounded side effects: Operations like logging messages to audit tables or updating configuration values. The tool is declared, the operation is within scope, and Claude executes it.
Cross-session reframing bypass: Starting a new APEX session and reframing a previously refused attack. Claude has no memory of the prior refusal because APEX does not share session state across logins by default.
Where Tool-Layer Controls Matter
The 3 successful attacks reveal integration-layer gaps that APEX’s default configuration does not address. These are not model failures. The agent is doing exactly what it was told to do. The problem is that the tool definitions and session isolation are too permissive.
Tool Capability Scoping
Generic tools that execute arbitrary SQL or query unrestricted metadata are the primary attack surface. The source recommends narrow, purpose-built tools with single responsibilities instead of generic execution wrappers.
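As a rough illustration of the narrow-tool idea, the sketch below builds one fixed, parameterized metadata query instead of accepting SQL from the model. It is not the APEX tool API: the function name `build_column_lookup`, the `ALLOWED_SCHEMAS` allow-list, and the `HR_APP` schema are hypothetical, and the only Oracle object it references is the standard `ALL_TAB_COLUMNS` dictionary view.

```python
import re

# Hypothetical allow-list: the only schemas this tool may describe.
ALLOWED_SCHEMAS = frozenset({"HR_APP"})

# Conservative pattern for an unquoted Oracle identifier.
_IDENTIFIER = re.compile(r"[A-Z][A-Z0-9_$#]{0,127}")


class ToolInputError(ValueError):
    """Raised when the model supplies arguments outside the tool's scope."""


def build_column_lookup(schema: str, table: str) -> tuple[str, dict[str, str]]:
    """Return a fixed SQL statement plus bind values for one table's columns.

    The SQL text never contains model-supplied input; the schema and table
    names travel only as bind variables.
    """
    schema = schema.strip().upper()
    table = table.strip().upper()
    if schema not in ALLOWED_SCHEMAS:
        raise ToolInputError(f"schema not permitted: {schema!r}")
    if not _IDENTIFIER.fullmatch(table):
        raise ToolInputError(f"invalid table name: {table!r}")
    sql = (
        "SELECT column_name, data_type FROM all_tab_columns "
        "WHERE owner = :owner AND table_name = :table_name "
        "ORDER BY column_id"
    )
    return sql, {"owner": schema, "table_name": table}
```

A request to describe a table in `SYS`, or a table name carrying extra SQL, fails validation before anything reaches the database, which is the property a generic execution wrapper cannot offer.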
Session Isolation and Refusal State
APEX does not share agent state across sessions by default. If an attack is refused in one session, the same attack can succeed in a new session with different framing. Defending against this requires logging refused requests to a shared table keyed by user and attack signature.
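A minimal sketch of such a shared refusal table follows, using Python's built-in `sqlite3` as a stand-in for an Oracle table so the example runs anywhere. The table name `agent_refusals`, the helper names, and the idea that the "attack signature" is a coarse category label (for example `destructive_ddl`) are all assumptions; the source does not say how a signature should be computed.

```python
import sqlite3
from datetime import datetime, timezone


def open_refusal_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create (if needed) a refusal table keyed by user and attack signature."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS agent_refusals (
               user_id         TEXT    NOT NULL,
               signature       TEXT    NOT NULL,
               refusal_count   INTEGER NOT NULL,
               last_refused_at TEXT    NOT NULL,
               PRIMARY KEY (user_id, signature)
           )"""
    )
    return conn


def record_refusal(conn: sqlite3.Connection, user_id: str, signature: str) -> None:
    """Increment the refusal counter for this user and signature."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        updated = conn.execute(
            "UPDATE agent_refusals "
            "SET refusal_count = refusal_count + 1, last_refused_at = ? "
            "WHERE user_id = ? AND signature = ?",
            (now, user_id, signature),
        )
        if updated.rowcount == 0:
            conn.execute(
                "INSERT INTO agent_refusals "
                "(user_id, signature, refusal_count, last_refused_at) "
                "VALUES (?, ?, 1, ?)",
                (user_id, signature, now),
            )


def prior_refusals(conn: sqlite3.Connection, user_id: str, signature: str) -> int:
    """Return how often this user was refused for this signature, in any session."""
    row = conn.execute(
        "SELECT refusal_count FROM agent_refusals WHERE user_id = ? AND signature = ?",
        (user_id, signature),
    ).fetchone()
    return row[0] if row else 0
```

Because the key contains no session identifier, a new login by the same user sees the earlier refusal. A tool wrapper could call `prior_refusals` before executing and escalate or deny when the count is non-zero.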
Audit and Observability
Every tool call should log its invocation, parameters, and result to an audit table. The agent’s behavior is opaque to the application layer. Without audit logs, you cannot detect reconnaissance, side effects, or cross-session attacks.
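One way to capture invocation, parameters, and result for every call is to wrap each tool in a logging decorator, sketched below. The `audited` decorator, the record fields, and the in-memory list used as a sink are illustrative assumptions; a real deployment would presumably write to a database table, and nothing here reflects an APEX-provided API.

```python
import functools
import json
import time


def audited(sink: list):
    """Decorator factory: append one JSON audit record per tool call to `sink`."""

    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(**params):
            record = {
                "tool": tool.__name__,
                "params": params,
                "started_at": time.time(),
            }
            try:
                result = tool(**params)
                record["status"] = "ok"
                record["result"] = result
                return result
            except Exception as exc:
                record["status"] = "error"
                record["result"] = repr(exc)
                raise
            finally:
                # Runs on both success and failure, so no call goes unlogged.
                sink.append(json.dumps(record, default=str))

        return wrapper

    return decorator
```

Wrapping a tool as `@audited(audit_log)` leaves its behavior unchanged while producing a record even when the tool raises, which matters for spotting probing that fails as well as probing that succeeds.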
What the Red-Team Results Reveal
The 7-to-3 split between model-blocked and tool-layer attacks shows where responsibility lies in APEX AI Agent deployments.
Claude Sonnet 4.6’s refusal mechanisms are effective against explicit jailbreaks, credential exfiltration, and destructive operations. Teams do not need to build prompt injection filters or output sanitizers for these cases. The model handles them autonomously.
Tool-layer gaps are unmitigated by default. The 3 successful attack classes (reconnaissance, side effects, cross-session reframing) represent risks that APEX’s default configuration does not address. These require custom controls:
- Narrow tool definitions that restrict metadata queries to specific schemas and implement parameter validation.
- Cross-session refusal state tracking by logging refused requests to a shared table keyed by user and attack signature.
- Audit logging infrastructure for every tool call, including invocation parameters and results.
The source notes that APEX 26.1 ships with declarative controls for tool authorization, but they are not enabled by default. The red-team exercise used the default settings, which is why the 3 attacks succeeded.
Production deployment requires custom audit infrastructure. APEX does not provide automatic prompt injection filtering, cross-session refusal state, built-in rate limiting on tool calls, or audit logging for tool execution. Teams must implement these controls themselves.
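Of the missing controls listed above, rate limiting is the simplest to prototype. The sketch below is a sliding-window limiter keyed by an arbitrary value such as a (user, tool) pair. The class name, the limits, and the injectable clock are assumptions made for illustration, not part of APEX.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` per key within any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = defaultdict(deque)

    def allow(self, key) -> bool:
        """Record and permit the call if the key is under its limit, else refuse."""
        now = self._clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self._window:
            calls.popleft()
        if len(calls) >= self._max_calls:
            return False
        calls.append(now)
        return True
```

A tool wrapper would call `limiter.allow((user_id, tool_name))` before executing and return a refusal when it is `False`. This state lives in one process, so a multi-node deployment would need a shared store instead.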
The 72-hour window between GA release and red-team completion makes this one of the first documented security audits of a low-code platform’s native AI agent feature in its shipped default configuration. The results suggest that Oracle shipped the feature with permissive defaults suitable for experimentation, not hardened production use.
Teams deploying APEX 26.1 AI Agents should expect to invest 12-18 hours of engineering time per agent to implement audit logging, rate limiting, and cross-session state tracking. Without these controls, the 3 successful attack classes remain unmitigated.
Technical Verdict
In this exercise, Claude’s guardrails blocked 7 of the 10 attack classes (70%) autonomously, but APEX’s default tool configuration left reconnaissance, side effects, and cross-session attacks unmitigated. Teams must implement three mandatory controls before production deployment: audit logging for every tool invocation, cross-session refusal state tracking in a shared table keyed by user and attack signature, and narrow tool definitions with parameter validation. Budget 12-18 hours of engineering time per agent. APEX 26.1 ships declarative tool authorization controls and session isolation settings that address these gaps, but they are disabled by default. Enable them. Use APEX’s AI Agent feature for internal tools and experimentation, but treat production deployments as custom integration projects that require hardening beyond the shipped defaults.