Do you have a Reddit alt, secret X, finsta, or Glassdoor account you trash your boss with? AI might have just made it a lot easier to unmask you. That’s the conclusion of a recently published study, which hints at some uncomfortable consequences for staying private online — even if it’s not quite time to hold a funeral for anonymity just yet.
The finding, which has not been peer-reviewed, comes from researchers at ETH Zurich, Anthropic, and the Machine Learning Alignment and Theory Scholars program. They built an automated system of AI agents using unspecified models — capable of searching the web and interacting with information much like a human investigator — to test how effectively large language models can reidentify anonymized material. The system “substantially outperforms” traditional computational techniques for deanonymizing accounts, scouring text for personal details at scale.
The system works by treating posts or other texts as a set of clues. It analyzes the text for patterns — writing quirks, stray biographical details, posting frequency and timing — that might hint at someone’s identity. It then scans other accounts, potentially millions of them, looking for the same mix of traits. Probable matches are flagged, compared in more detail, and winnowed down into a shortlist of likely identities.
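The shortlisting loop described above can be illustrated with a toy sketch. Everything here is hypothetical and stands in for the system's far richer analysis: traits are reduced to simple word sets and overlap is scored with Jaccard similarity, whereas the real pipeline uses LLM-driven profiling and web search.

```python
def extract_traits(posts):
    """Toy stand-in for the profiling step: collect lowercase 'clue'
    tokens (movie titles, locales, quirky words) from an account's posts."""
    traits = set()
    for post in posts:
        traits.update(word.strip(".,!?").lower() for word in post.split())
    return traits

def jaccard(a, b):
    """Overlap between two trait sets: 0 = disjoint, 1 = identical."""
    return len(a & b) / len(a | b) if a | b else 0.0

def shortlist(target_posts, candidates, threshold=0.3):
    """Rank candidate accounts by trait overlap with the target account
    and keep only those above a similarity threshold."""
    target = extract_traits(target_posts)
    scored = [(name, jaccard(target, extract_traits(posts)))
              for name, posts in candidates.items()]
    return sorted(((n, s) for n, s in scored if s >= threshold),
                  key=lambda pair: -pair[1])

# Hypothetical example: one candidate shares several clues with the
# target account, the other shares none.
matches = shortlist(
    ["Dune and Arrival were great"],
    {"alt1": ["I loved Dune and Arrival"],
     "alt2": ["cooking pasta tonight"]},
)
```

In this toy run only `alt1` survives the threshold; the actual system performs this winnowing against potentially millions of accounts.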
Rather than targeting unsuspecting users, the team evaluated the system using datasets built from publicly available posts, including content from Hacker News and LinkedIn, transcripts of Anthropic’s interviews with scientists on how they use AI, and Reddit accounts that were deliberately split into two anonymized halves for testing. The paper reports that, depending on the setting, the LLM-based approach correctly identified up to 68 percent of matching accounts at 90 percent precision. By contrast, comparable non-LLM methods, such as connecting scattered data points across large datasets, identified almost none.
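Those two figures measure different things: precision is the share of flagged matches that are actually correct, while the match rate (the “up to 68 percent”) is the share of true matches the system finds, conventionally called recall. A minimal illustration of how such numbers are computed, using made-up pair labels:

```python
def precision_recall(flagged, true_matches):
    """Compute precision and recall for a set of flagged account pairs.

    flagged: pairs the system claimed belong to the same person.
    true_matches: pairs that really do belong to the same person.
    """
    true_positives = len(flagged & true_matches)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_matches) if true_matches else 0.0
    return precision, recall

# Hypothetical run: the system flags 10 pairs, 9 of them correct,
# out of 20 genuine matches hidden in the data.
flagged = {f"pair{i}" for i in range(10)}        # pair0 .. pair9
true_matches = {f"pair{i}" for i in range(1, 21)}  # pair1 .. pair20
p, r = precision_recall(flagged, true_matches)
# p == 0.9 (90 percent precision), r == 0.45 (45 percent recall)
```

High precision with modest recall, as here, means the system misses many accounts but is rarely wrong about the ones it does flag.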
The results weren’t uniform across every dataset, and, predictably, the model performed better when it had more structured information to work with. In one experiment examining Reddit users posting about films in the main r/movies subreddit and smaller film communities, the system was able to link accounts that mentioned just one movie about 3 percent of the time at 90 percent precision. When users mentioned 10 or more films, the success rate climbed to nearly half.
An experiment using Anthropic’s survey of scientists, meanwhile, identified nine of the 125 respondents, a recall rate of roughly 7 percent. In that test, the system built a profile of each respondent based on clues in their answers and then searched publicly available information on the web for likely matches. In an example match, the researchers highlight how references to a “supervisor” could suggest a PhD student and that the use of British English could hint at a UK affiliation. Combined with mentions of a background in the physical sciences and current work in biology research, the system was able to narrow the field to a particular candidate.
Still, the researchers argue that the ability to identify any respondents from unstructured text is noteworthy, replicating in minutes what would have taken a human investigator hours to do. Moreover, they told The Verge that performance is likely to improve as AI systems grow more capable and gain access to larger pools of data. More broadly, they caution that it may no longer be safe to assume that posting pseudonymously will protect online identities, past or future.
“Every single thing the LLM found in principle could be found by a human investigator.”