Like many organizations, Wiki Education has grappled for several years with generative AI: its impacts, opportunities, and threats. As an organization that runs large-scale programs to bring new editors to Wikipedia (we’re responsible for about 19% of all new active editors on English Wikipedia), we have a deep understanding of the challenges new content contributors face, and of how to support them in editing successfully. As more people use generative AI chatbots like ChatGPT, Gemini, or Claude in their daily lives, it’s unsurprising that some also consider using them to help draft contributions to Wikipedia. Since Wiki Education’s programs provide a cohort of content contributors whose work we can evaluate, we’ve looked into how our participants are using GenAI tools.
We are choosing to share our perspective through this blog post because we hope it will help inform discussions of GenAI-created content on Wikipedia. In an open environment like the Wikimedia movement, it’s important to share what you’ve learned. In this case, we believe our learnings can help Wikipedia editors who are trying to protect the integrity of content on the encyclopedia, Wikipedians who may be interested in using generative AI tools themselves, other program leaders globally who are trying to onboard new contributors who may be interested in using these tools, and the Wikimedia Foundation, whose product and technology team builds software to help support the development of high-quality content on Wikipedia.
Our fundamental conclusion about generative AI is: Wikipedia editors should never copy and paste the output from generative AI chatbots like ChatGPT into Wikipedia articles.
Let me explain more.
AI detection and investigation
Since the launch of ChatGPT in November 2022, we’ve been paying close attention to GenAI-created content and how it relates to Wikipedia. We’ve spot-checked the work of new editors from our programs, primarily focusing on citations to ensure they were real rather than hallucinated. We experimented with the tools ourselves, led video sessions about GenAI for our program participants, and closely tracked on-wiki policy discussions around GenAI. Currently, English Wikipedia prohibits the use of generative AI both to create images and to write talk page comments, and it recently adopted a guideline against using large language models to generate new articles.
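For readers curious how part of that spot-checking can be automated, one simple approach is confirming that a cited DOI actually resolves. The sketch below is illustrative, not the workflow Wiki Education used: it assumes DOIs have already been extracted from an article’s references (the sample list is made up) and queries the public Crossref REST API, which returns metadata for real DOIs and a 404 for nonexistent ones.

```python
# Minimal sketch: flag citations whose DOIs don't resolve.
# Assumes DOIs were already parsed out of the article's references;
# the sample list below is purely illustrative.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the Crossref REST API knows this DOI."""
    resp = requests.get(
        f"https://api.crossref.org/works/{doi}",
        headers={"User-Agent": "citation-spot-check/0.1 (mailto:example@example.org)"},
        timeout=30,
    )
    return resp.status_code == 200

dois = ["10.1038/nature12373", "10.9999/definitely-not-real"]  # hypothetical input
for doi in dois:
    status = "ok" if doi_exists(doi) else "FAKE? does not resolve"
    print(f"{doi}: {status}")
```

A check like this only catches citations to sources that don’t exist at all; as we describe below, that turned out to be the less common failure mode.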
As our Wiki Experts Brianda Felix and Ian Ramjohn worked with program participants throughout the first half of 2025, they found more and more article text bearing the hallmarks of generative AI, like bolded words or bulleted lists in odd places. On its own, that use of generative AI wasn’t necessarily a problem, as long as the content was accurate: Wikipedia’s open editing process makes it easy for other editors to revise factual text to better fit Wikipedia’s style.
But was the text factually accurate? This fundamental question led our Chief Technology Officer, Sage Ross, to investigate different generative AI detectors. He landed on a tool called Pangram, which we have found to be highly accurate for Wikipedia text. Sage generated a list of all the new articles created through our work since 2022 and ran them all through Pangram. Of the 3,078 articles, 178 were flagged for AI: none before the launch of ChatGPT in late 2022, with increasing percentages term over term since then.

About half of our staff spent a month during summer 2025 painstakingly reviewing the text of these 178 articles. Based on the discourse around AI hallucinations, we expected these articles to cite sources that didn’t exist, but that wasn’t the case: only 7% of the articles had fake sources. The rest cited real, relevant sources.

Far more insidious was something else we discovered: more than two-thirds of these articles failed verification. That means an article contained a plausible-sounding sentence, cited to a real, relevant-sounding source, but when you read that source, it doesn’t actually support the claim. When a claim fails verification, it’s impossible to tell whether the information is true or not. For most of the articles Pangram flagged as written by GenAI, nearly every cited sentence failed verification.
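For those curious what this kind of batch screening looks like in practice, here is a minimal sketch. The article fetch uses the real MediaWiki Action API (action=query with the TextExtracts prop, which returns plain-text extracts); the detector call is a hypothetical stand-in, since this post doesn’t describe Pangram’s actual API — the DETECTOR_URL endpoint and its response field are assumptions.

```python
# Minimal sketch of batch-screening articles with an AI detector.
# The Wikipedia fetch uses the real MediaWiki Action API; the detector
# endpoint and response format are hypothetical stand-ins for a
# service like Pangram.
import requests

WIKI_API = "https://en.wikipedia.org/w/api.php"
DETECTOR_URL = "https://detector.example.com/v1/classify"  # hypothetical

def fetch_plaintext(title: str) -> str:
    """Fetch a plain-text extract of an article via the MediaWiki API."""
    resp = requests.get(WIKI_API, params={
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "titles": title,
        "format": "json",
        "formatversion": 2,
    }, timeout=30)
    resp.raise_for_status()
    pages = resp.json()["query"]["pages"]
    return pages[0].get("extract", "")

def looks_ai_generated(text: str) -> bool:
    """Ask the (hypothetical) detector whether the text reads as AI-generated."""
    resp = requests.post(DETECTOR_URL, json={"text": text}, timeout=60)
    resp.raise_for_status()
    return resp.json()["ai_likelihood"] > 0.5  # assumed response field

titles = ["Example article one", "Example article two"]  # placeholder list
flagged = [t for t in titles if looks_ai_generated(fetch_plaintext(t))]
print(f"{len(flagged)} of {len(titles)} articles flagged: {flagged}")
```

Note that a detector can only flag likely AI text; the verification failures we describe above still required human reviewers reading each cited source.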
This finding led us to invest significant staff time into cleaning up these articles, far more time than these editors had likely spent creating them. Wiki Education’s core mission is to improve Wikipedia, and when we discover our program has unknowingly contributed misinformation to Wikipedia, we are committed to cleaning it up. In the clean-up process, Wiki Education staff moved more recent work back to sandboxes, stub-ified articles that passed notability but mostly failed verification, and PRODed (proposed for deletion) some articles that, in our judgment, weren’t salvageable. All of these are established ways of addressing Wikipedia articles with flawed content. (While there are many grumblings about Wikipedia’s deletion processes, several of the articles we PRODed for fully hallucinated GenAI content were then de-PRODed by other editors, showing the diversity of opinion about generative AI among the Wikipedia community.)
Revising our guidance