
How 'overworked, underpaid' humans train Google's AI to seem smart


In the spring of 2024, when Rachael Sawyer, a technical writer from Texas, received a LinkedIn message from a recruiter hiring for the vaguely titled role of writing analyst, she assumed the work would be similar to her previous content-creation gigs. On her first day a week later, however, those expectations were upended. Instead of writing words herself, Sawyer’s job was to rate and moderate content created by artificial intelligence.

The job initially involved a mix of parsing through meeting notes and chats summarized by Google’s Gemini, and, in some cases, reviewing short films made by the AI.

On occasion, she was asked to deal with extreme content, flagging violent and sexually explicit material generated by Gemini for removal, mostly text. Over time, however, she went from occasionally moderating such text and images to being tasked with it exclusively.

“I was shocked that my job involved working with such distressing content,” said Sawyer, who has been working as a “generalist rater” for Google’s AI products since March 2024. “Not only because I was given no warning and never asked to sign any consent forms during onboarding, but because neither the job title nor the description ever mentioned content moderation.”

The pressure to complete dozens of these tasks every day, each within 10 minutes, has led Sawyer into spirals of anxiety and panic attacks, she says – without mental health support from her employer.

Sawyer is one of thousands of AI workers contracted for Google through the Japanese conglomerate Hitachi’s GlobalLogic to rate and moderate the output of Google’s AI products, including its flagship chatbot Gemini, launched early last year, and its summaries of search results, AI Overviews. The Guardian spoke to 10 current and former employees of the firm. Google also contracts with other firms for AI rating services, including Accenture and, previously, Appen.

Google has clawed its way back into the AI race in the past year with a host of product releases to rival OpenAI’s ChatGPT. Google’s most advanced reasoning model, Gemini 2.5 Pro, is touted as outperforming OpenAI’s o3, according to LMArena, a leaderboard that tracks the performance of AI models. Each new model release comes with the promise of higher accuracy, which means that for every version these AI raters must check whether the model’s responses are accurate and safe for users. Thousands of humans lend their intelligence to teach chatbots the right responses across domains as varied as medicine, architecture and astrophysics, correcting mistakes and steering models away from harmful outputs.

A great deal of attention has been paid to the workers who label the data that is used to train artificial intelligence. There is, however, another corps of workers, including Sawyer, working day and night to moderate the output of AI, ensuring that chatbots’ billions of users see only safe and appropriate responses.

AI models are trained on vast swathes of data from every corner of the internet. Workers such as Sawyer sit in a middle layer of the global AI supply chain – paid more than the data annotators in Nairobi or Bogotá whose work mostly involves labelling data for AI models or self-driving cars, but far less than the engineers in Mountain View who design these models.

Despite their significant contributions to these AI models, which would be far more prone to hallucination without these quality-control editors, the workers feel hidden.
