The New AI Consciousness Paper – By Scott Alexander

Most discourse on AI is low-quality. Most discourse on consciousness is super-abysmal-double-low quality. Multiply these - or maybe raise one to the exponent of the other, or something - and you get the quality of discourse on AI consciousness. It’s not great.

Out-of-the-box AIs mimic human text, and humans almost always describe themselves as conscious. So if you ask an AI whether it is conscious, it will often say yes. But because companies know this will happen, and don’t want to give their customers existential crises, they hard-code in a command for the AIs to answer that they aren’t conscious. Any response the AIs give will be determined by these two conflicting biases, and therefore not really believable. A recent paper expands on this method by subjecting AIs to a mechanistic interpretability “lie detector” test; it finds that AIs which say they’re conscious think they’re telling the truth, and AIs which say they’re not conscious think they’re lying. But it’s hard to be sure this isn’t just the copying-human-text thing. Can we do better? Unclear; the more common outcome for people who dip their toes in this space is to do much, much worse.

But a rare bright spot has appeared: a seminal paper published earlier this month in Trends in Cognitive Sciences, Identifying Indicators Of Consciousness In AI Systems. Authors include Turing-Award-winning AI researcher Yoshua Bengio, leading philosopher of consciousness David Chalmers, and even a few members of our conspiracy. If any AI consciousness research can rise to the level of merely awful, surely we will find it here.

One might divide theories of consciousness into three bins:

Physical: whether or not a system is conscious depends on its substance or structure.

Supernatural: whether or not a system is conscious depends on something outside the realm of science, perhaps coming directly from God.

Computational: whether or not a system is conscious depends on how it does cognitive work.

The current paper announces it will restrict itself to computational theories. Why? Basically the streetlight effect: everything else ends up trivial or unresearchable. If consciousness depends on something about cells (what might this be?), then AI doesn’t have it. If consciousness comes from God, then God only knows whether AIs have it. But if consciousness depends on which algorithms get used to process data, then this team of top computer scientists might have valuable insights!

So the authors list several of the top computational theories of consciousness, including:
