
What I don’t like about chains of thought


What I don’t like about chains of thought, and why language is a bottleneck to efficient reasoning.

20 May 2023

Chain of thought (CoT) is a formidable yet simple idea that empowers autoregressive LLMs (GPT-like models) by allowing them to “reason” by explaining, in language, step by step how to solve a given problem. It has unlocked many use cases that LLMs (even instruction-tuned ones) could not solve initially. It may actually have surprised AI researchers by how well it works, and it has become an essential tool for AI practitioners.
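
To make this concrete, here is a minimal sketch of what CoT prompting looks like (my illustration, not code from this post, using the well-known example from Wei et al., 2022). The trick is purely textual: the few-shot examples demonstrate step-by-step reasoning, so the model imitates that pattern before committing to an answer.

# Minimal sketch of chain-of-thought prompting (illustrative, not from
# the article). Both prompts pose the same question; only the worked
# reasoning in the few-shot example differs.

direct_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: 11\n"
    "Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"
)

cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"  # the model now spells out its steps: 23 - 20 + 6 = 9
)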

As an engineer I do believe that CoT will play a major role in converting current LLM capabilities into real-world usage (and into money as well). Nevertheless, in this blog post I will let my scientific self speak and explain why I believe that CoT and the agent wave are mainly a hack that makes LLMs express reasoning in an inefficient way.


I thought that LLMs can’t reason

It all started in the Jina AI office in Berlin, in a heated after-work AI debate with two of my colleagues. The discussion quickly turned (as always in this office) into a debate around the capabilities of current LLMs and how far we are from human-like intelligence.

Even though I am amazed, like everybody, by the capabilities of GPT-like models, I am in the camp of “we are still very far away from human intelligence, let alone what people fantasize about as AGI”.

At some point one of my colleagues started explaining that, with the help of tools like chain of thought and agents, current LLMs exhibit human-like cognitive abilities like reasoning, reflection, and the ability to correct themselves… and I had to step in and use my (what I thought was) deadly argument to show that LLMs are still dumb in some ways.

Did you know that it requires the same amount of compute for an LLM to perform the following two tasks?
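
The intuition behind this question can be sketched with a back-of-the-envelope calculation (mine, not the article’s): a dense decoder-only transformer spends roughly 2 × n_params FLOPs of forward-pass compute per generated token, the standard approximation from the scaling-laws literature (Kaplan et al., 2020), no matter how easy or hard that token is to produce.

# Back-of-the-envelope sketch (my addition): per-token forward-pass
# compute for a dense transformer is roughly constant, independent of
# the difficulty of what is being generated.

def flops_per_token(n_params: int) -> int:
    """Approximate forward-pass FLOPs to generate one token."""
    return 2 * n_params

GPT3_PARAMS = 175_000_000_000  # 175B parameters, for scale

# The model spends the same compute emitting a trivial token ("the")
# as it does emitting the pivotal step of a hard problem:
print(flops_per_token(GPT3_PARAMS))  # ~3.5e11 FLOPs, whatever the token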
