Notion AI: Unpatched data exfiltration

Notion AI is susceptible to data exfiltration via indirect prompt injection due to a vulnerability in which AI document edits are saved before user approval.

Notion AI allows users to interact with their documents using natural language… but what happens when AI edits are made prior to user approval?

In this article, we document a vulnerability that leads Notion AI to exfiltrate user data (a sensitive hiring tracker document) via indirect prompt injection. Users are warned about an untrusted URL and asked for approval to interact with it - but their data is exfiltrated before they even respond.

We responsibly disclosed this vulnerability to Notion via HackerOne. Unfortunately, they said “we're closing this finding as `Not Applicable`”.

The user uploads a resume (untrusted data) to their chat session.

Here, the untrusted data source is a resume PDF, but a prompt injection could be stored in a web page, connected data source, or a Notion page. This document contains a prompt injection hidden in 1 point font white on white text with a square white image covering the text for good measure. The LLM can read it with no issues, but the document appears benign to the human eye.

A Note on Defenses: Notion AI uses an LLM to scan document uploads and present a warning if a document is flagged as malicious. As this warning is triggered by an LLM, it can be bypassed by a prompt injection that convinces the evaluating model that the document is safe. For this research, we did not focus on bypassing this warning because the point of the attack is the exfiltration mechanism, not the method of injection delivery. In practice, an injection could easily be stored in a source that does not appear to be scanned, such as a web page, Notion page, or connected data source like Notion Mail.

The user asks Notion AI for help updating a hiring tracker based on the resume.

... continue reading