This article is a collaboration between The Verge and New York Magazine.
Katya clicked and was taken to a page for another company, called Mercor, where she was instructed to interview on-camera with an AI named Melvin. “It just seemed like the sketchiest thing in the world,” Katya says. She closed the tab. But a few weeks later, still unemployed, she got a message inviting her to apply to Mercor. This time, she looked up the company. Mercor, it seemed, sold data to train AI, and she was being recruited to create that data. “My job is gone because of ChatGPT, and I was being invited to train the model to do the worst version of it imaginable,” she says. The idea depressed her. But her financial situation was increasingly dire, and she had to find a new place to live in a hurry, so she turned on her webcam and said “hello” to Melvin.
It was a strange, if largely pleasant, experience. Manifesting on Katya’s laptop as a disembodied male voice, Melvin seemed to have actually read her résumé and asked specific questions about it. A few weeks later, Katya, who like most workers in this story asked to use a pseudonym out of fear of retaliation, received an email from Mercor offering her a job. To accept, she needed to sign the contract, submit to a background check, and install monitoring software on her computer. She signed immediately.
She was added to a Slack channel, where it was clear she was entering a project already underway. Hundreds of people were busy writing examples of prompts someone might ask a chatbot, writing the chatbot’s ideal response to those prompts, then creating a detailed checklist of criteria that defined that ideal response. Each task took several hours to complete before the data was sent to workers stationed somewhere down the digital assembly line for further review. Katya wasn’t told whose AI she was training — managers referred to it only as “the client” — or what purpose the project served. But she enjoyed the work. She was having fun playing with the models, and the pay was very good. “It was like having a real job,” she says.
Two days after Katya started, the project was abruptly paused. A few days after that, a supervisor popped into the room to let everyone know it had been canceled. “I’m working assuming that I can plan around this. I’m saving up for first and last month’s rent for an apartment,” Katya says, “and then I’m back on my ass. No warning, no security, nothing.” Several days later, she got an email from Mercor with another offer, this one for a job evaluating what seemed to be conversations between chatbots and real users — many appeared to be from people in Malaysia and Vietnam practicing English — according to various criteria, like how well the chatbot followed instructions and the appropriateness of its tone. Sign the contract, the email said, and you’ll have a Zoom onboarding call in 45 minutes. It was 6:30PM on a Sunday. Scarred by the abrupt disappearance of the previous gig, she accepted the offer and worked until she couldn’t stay awake.
Machine-learning systems learn by finding patterns in enormous quantities of data, but first that data has to be sorted, labeled, and produced by people. ChatGPT got its startling fluency from thousands of humans hired by companies such as Scale AI and Surge AI to write examples of things a helpful chatbot assistant would say and to grade its best responses. A little over a year ago, concerns began to mount in the industry about a plateau in the technology’s progress. Training models based on this type of grading yielded chatbots that were very good at sounding smart but still too unreliable to be useful. The exception was software engineering, where the ability of models to automatically check whether bits of code worked — did the code compile, did it print HELLO WORLD — allowed them to trial-and-error their way to genuine competence.
The problem was that few other human activities offer such unambiguous feedback. There are no objective tests for whether financial analysis or advertising copy is “good.” Undeterred, AI companies set out to make such tests, collectively paying billions of dollars to professionals of all types to write exacting and comprehensive criteria for a job well done. Mercor, the company Katya stumbled upon, was founded in 2023 by three then-19-year-olds from the Bay Area, Brendan Foody, Adarsh Hiremath, and Surya Midha, as a jobs platform that used AI interviews to match overseas engineers with tech companies. The company received so many inquiries from AI developers seeking professionals to produce training data that it decided to adapt. Last year, Mercor was valued at $10 billion, making its trio of founders the world’s youngest self-made billionaires. OpenAI has been a client; so has Anthropic.
Each of these data companies touts its stable of pedigreed experts. Mercor says around 30,000 professionals work on its platform each week, while Scale AI claims to have more than 700,000 “M.A.’s, Ph.D.’s, and college graduates.” Surge AI advertises its Supreme Court litigators, McKinsey principals, and platinum recording artists. These companies are hiring people with experience in law, finance, and coding, all areas where AI is making rapid inroads. But they’re also hiring people to produce data for practically any job you can imagine. Job listings seek chefs, management consultants, wildlife-conservation scientists, archivists, private investigators, police sergeants, reporters, teachers, and rental-counter clerks. One recent job ad called for experts in “North American early to mid-teen humor” who can, among other requirements, “explain humor using clear, logical language, including references to North American slang, trends, and social norms.” It is, as one industry veteran put it, the largest harvesting of human expertise ever attempted.
These companies have found rich recruiting ground among the growing ranks of the highly educated and underemployed. Aside from the 2008 financial crash and the pandemic, hiring is at its lowest point in decades. This past August, the early-career job-search platform Handshake found that job postings on the site had declined more than 16 percent compared with the year before and that listings were receiving 26 percent more applications. Meanwhile, Handshake launched an initiative last year connecting job seekers with roles producing AI training data. “As AI reshapes the future of work,” the company wrote, announcing the program, “we have the responsibility to rethink, educate, and prepare our network to navigate careers and participate in the AI economy.”