Anthropic Cofounder Travels to Vatican, Tells Pope They’re Finding “Unsettling” Things Inside AI Models

A photograph of Christopher Olah, cofounder of US artificial intelligence (AI) company Anthropic, at the Vatican. The image is color-treated in an offset screenprint style and set against a field of neon yellow.

Ever since being anointed as the leader of the Catholic Church last year, Pope Leo has been an outspoken critic of AI. Most recently, in his first encyclical, he called for the tech to be “disarmed,” accusing it of facilitating the emergence of “new digital slaveries” and criticizing its enormous carbon footprint.

The rebuke, however, was made while sitting next to a highly unusual bedfellow: Anthropic billionaire and self-described atheist Chris Olah.

During a presentation of the encyclical, Olah argued that “religious communities, civil society, scholars, and governments” should intervene to set rules and stop AI from “dominating humanity,” as the pope put it in his letter.

The unlikely pairing shows how Anthropic is going to extreme lengths to position itself as the ethical choice in the industry, emphasizing its work on AI safety and alignment.

At the same time, Anthropic continues to play a major role in establishing the precise world order Pope Leo warned against in his latest encyclical. That’s something that hasn’t been lost on Anthropic’s leadership: during his remarks at the event, Olah forebodingly revealed that he and his team “keep finding things that are mysterious, even unsettling.”

The degree of dissonance is baffling. In his letter, the pope stated outright that AI can only “imitate certain functions of human intelligence,” that it can’t “undergo experiences,” and that it does not “possess a body” or “feel joy or pain.” Olah, on the other hand, seemingly contradicted him by arguing during his remarks that he and his team have found “internal states that functionally mirror joy, satisfaction, fear, grief, and unease.”

Put simply, Anthropic appears to want it both ways. The Claude developer is playing a major part in the development of AI models that are powerful and, by its own account, potentially dangerous, while also sending delegations to the Vatican to call for more oversight.

Olah even went so far as to say that Anthropic is operating “inside a set of incentives and constraints that can sometimes conflict with doing the right thing,” painting his employer as exactly the kind of entity that’s attempting to assume “monopolistic control” over tech, as Pope Leo warned in his encyclical.
