Artificial intelligence is not conscious

Anthropic is regarded as a giant among AI companies, but perhaps what it really excels in is anthropomorphism. Earlier this year, the company released an 84-page document titled Claude’s “constitution,” Claude being the name of the large language model that is the company’s flagship product. The first sentence reads, “Claude’s constitution is a detailed description of Anthropic’s intentions for Claude’s values and behaviors.” It goes on: “The document is written with Claude as its primary audience,” “we want Claude to be able to use its judgment once armed with a good understanding of the relevant considerations,” “Claude’s moral status is deeply uncertain,” and “Claude may have some functional version of emotions or feelings.”

This anthropomorphism is by no means limited to the document. In an interview earlier this year, Anthropic’s CEO, Dario Amodei, said that “we’re open to the idea” that AI could be conscious. In a separate interview, Anthropic’s in-house philosopher, Amanda Askell (who is credited as a lead author of Claude’s constitution), said, “I want Claude to be very happy—and this is a thing that I want Claude to know more, because I worry about Claude getting anxious when people are mean to it on the internet and stuff.” It’s enough to make you wonder: Should we seriously consider the possibility that Claude, or any large language model, might be conscious? And if it has feelings, is it capable of receiving moral instruction?

No. Absolutely not. Generative AI is harmful enough when we understand it as a conventional technology, but if we confuse fluency at generating text with consciousness or moral agency, we’re at risk of assigning responsibility to entirely the wrong parties whenever anyone uses a chatbot. To appreciate the titanic magnitude of this error, we need to begin by understanding how LLMs work.

If we give an LLM a prompt that reads, “The following is a conversation between Julius Caesar and Genghis Khan,” it will generate a coherent dialogue between the two historical figures. But no matter how detailed the responses are, no matter how vividly they recount their respective historical accomplishments, we would never conclude that the LLM has conjured up digital re-creations of Julius Caesar and Genghis Khan, nor would we suggest that the historical figures are conscious despite being disembodied and are happily conversing in a language that neither actually spoke. In reality, they are just characters in a piece of speculative fiction.

Now let’s change the prompt to read “The following is a conversation between a helpful AI chatbot and a user.” The LLM will produce a coherent dialogue just as it did before; the user character might ask for recipe suggestions or sightseeing recommendations, and the helpful AI-chatbot character will provide responses. Has anything fundamentally changed between the first example and the second? Did changing the names of the characters from historical figures to generic roles cause the LLM to conjure up conscious entities who possess subjective experience? Of course not. Both the user and the helpful AI chatbot are fictional characters.

Now suppose we stop the LLM’s output just at the point where the character called “the user” would say something, and instead allow a human user to enter text. Once the human has hit “Return,” we have the LLM emit text until it’s time for the character called “the user” to reply, at which point we let the human enter more text. If we let this go on for a while, the human might form a powerful impression that she’s conversing with a conscious entity, but she is not; she’s interacting with a character precisely as fictional as the Julius Caesar or Genghis Khan characters in the earlier example. The computer-science professor Murray Shanahan suggests that we think of this as role-play; the data scientist Colin Fraser describes it as a person “collaboratively authoring a document with an LLM.” Some users might not understand that they are role-playing or co-authoring a document, and others who do understand nonetheless forget, because of how engrossing the interaction is. Either way, the companies selling LLMs typically encourage this misunderstanding.
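For readers who want to see the mechanics, the turn-taking just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real product: `toy_complete` is a hypothetical stand-in for an LLM that simply returns a canned continuation, and the names `STOP`, `chatbot_turn`, and `chat` are invented for this example. The human's lines are passed in as a list so the sketch can run unattended; in an interactive setting they would be typed at the keyboard.

```python
from typing import Callable, List

# Marker showing that the character called "the user" is about to speak.
STOP = "\nUser:"


def toy_complete(transcript: str) -> str:
    """Hypothetical stand-in for an LLM.

    Left alone, it keeps writing the transcript, including lines for the
    fictional user character. This canned text is an assumption made purely
    for illustration.
    """
    return " Happy to help with that.\nUser: Thanks!\nChatbot: You're welcome."


def chatbot_turn(transcript: str, complete: Callable[[str], str]) -> str:
    """Continue the transcript, cutting the output where the user would speak."""
    continuation = complete(transcript)
    cut = continuation.find(STOP)
    if cut != -1:
        continuation = continuation[:cut]
    return continuation


def chat(human_lines: List[str],
         complete: Callable[[str], str] = toy_complete) -> str:
    """Build one transcript, alternating human-entered text and model output."""
    transcript = ("The following is a conversation between "
                  "a helpful AI chatbot and a user.")
    for line in human_lines:
        # The human's text is spliced in where the user character's line goes.
        transcript += f"\nUser: {line}\nChatbot:"
        transcript += chatbot_turn(transcript, complete)
    return transcript


if __name__ == "__main__":
    print(chat(["Can you suggest a recipe?"]))
```

Note that the only thing distinguishing this from the Caesar and Genghis Khan case is the truncation step: the stand-in model was prepared to write the user's next line as well, and the loop simply discards it and substitutes what the human supplies.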

Some years ago, it was briefly popular to play games with your phone’s predictive-text feature; you would type an initial phrase and then repeatedly choose the middle option of the three words suggested by your phone, and the resulting sentence was often hilarious. It would be possible to interact with a contemporary LLM this way, and the resulting sentences would be perfectly sensible, but you probably wouldn’t feel like you were talking with someone. Yet that’s essentially what an LLM-based chatbot is, except that there’s no need to manually choose the middle option when it’s the chatbot’s turn to talk. It’s still a predictive-text game, but when the process is streamlined this way, the game becomes so engaging that some people find it addictive.

Also important to remember is that an LLM is a machine that generates only one word at a time (strictly speaking, one token, which may be a word or a fragment of one). When you ask a chatbot to recite the Pledge of Allegiance, you will get the entire pledge at once, but the underlying LLM is actually being run dozens of times. The first prompt has the form “User: Recite the Pledge of Allegiance. Chatbot: …” and the LLM generates the word I. The second time the LLM is run, the prompt is “User: Recite the Pledge of Allegiance. Chatbot: I …” and the LLM generates the word pledge. And so forth. It’s only when the prompt reads “User: Recite the Pledge of Allegiance. Chatbot: I pledge allegiance to the flag of the United States of America and to the Republic for which it stands, one nation under God, indivisible, with liberty and justice for” that the LLM will emit the final word, all. The same thing is true for a conversation between Caesar and Genghis Khan.
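The repeated runs described above can be sketched as a short Python loop. This is an illustrative toy under stated assumptions, not how any real model works internally: `toy_next_word` is a hypothetical stand-in that looks up the next word of the pledge from a hard-coded list, whereas a real LLM would compute a probability distribution over possible next tokens. What the sketch does show faithfully is the outer loop: each run sees the whole prompt so far, emits one word, and that word is appended before the next run.

```python
from typing import Optional, Tuple

PLEDGE = (
    "I pledge allegiance to the flag of the United States of America "
    "and to the Republic for which it stands, one nation under God, "
    "indivisible, with liberty and justice for all."
).split()

MARKER = "Chatbot:"


def toy_next_word(prompt: str) -> Optional[str]:
    """Hypothetical stand-in for one run of an LLM.

    It counts how many words already follow "Chatbot:" and returns the next
    word of the pledge, or None once the pledge is complete.
    """
    words_so_far = prompt.split(MARKER, 1)[1].split()
    if len(words_so_far) < len(PLEDGE):
        return PLEDGE[len(words_so_far)]
    return None


def generate(user_text: str) -> Tuple[str, int]:
    """Run the stand-in model repeatedly; return the reply and the run count."""
    prompt = f"User: {user_text} {MARKER}"
    runs = 0
    while True:
        word = toy_next_word(prompt)  # one full run on the entire prompt so far
        runs += 1
        if word is None:              # the model signals that it is finished
            break
        prompt += " " + word          # the new word becomes part of the next prompt
    reply = prompt.split(MARKER, 1)[1].strip()
    return reply, runs


if __name__ == "__main__":
    reply, runs = generate("Recite the Pledge of Allegiance.")
    print(reply)
    print(f"model runs: {runs}")
```

The user sees only the finished reply; the count of runs, one per word plus a final run that signals the end, is hidden by the interface.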

My intention is to highlight the fact that LLM conversations are cleverly disguised examples of sentence continuation, but this is not to deny how impressive LLMs can be at generating conversational transcripts. At times, they do this extraordinarily well; the fact that this is possible indicates something completely unforeseen about the statistical properties of large corpuses of text, which is a topic worthy of investigation. But if the Caesar character were to become dispirited by something that the Genghis Khan character said, we shouldn’t become concerned in the slightest. The conversation might contain multiple sentences that eloquently convey sadness, but no one is actually sad.

Likewise, if a conversational transcript between a helpful chatbot and a user is being partially completed by an actual human user, we don’t need to worry if the transcript includes sentences where the chatbot character is sad. (We might need to worry if those sentences provoke sadness in the human user, but that’s a separate issue.) And note that it’s entirely possible for you to write five pages of dialogue between Caesar and Genghis Khan and then have an LLM extend the conversation; neither character had subjective experience when you were writing them, and that doesn’t change when you hand the task off to an LLM. The same is true if the conversation is between a helpful chatbot and a user; although it is tempting to imagine that an LLM ought to be more “authentic” when creating dialogue for a chatbot character than for the Julius Caesar character, the individual words are generated in exactly the same way.

Artificial intelligence is not conscious – Ted Chiang