
A trillion dollars (potentially) wasted on gen-AI


Breaking news from famed machine learning researcher Ilya Sutskever:

Below is another summary, a somewhat more technical one, of a just-released interview of his that is making waves. In essence, Sutskever is saying that scaling (achieving improvements in AI through more chips and more data) is flattening out and that we need new techniques; he is even open to neurosymbolic techniques and to innateness. He is clearly not forecasting a bright future for pure large language models.

Sutskever also said that “The thing which I think is the most fundamental is that these models somehow just generalize dramatically worse than people. And it’s super obvious. That seems like a very fundamental thing.”

Some of this may come as news to much of the machine learning community, and it might seem surprising coming from Sutskever, an icon of deep learning who worked, inter alia, on the landmark 2012 AlexNet paper that showed in practice how much GPUs could accelerate deep learning, the foundation of LLMs. He is also a co-founder of OpenAI, considered by many to have been its leading researcher until he departed after a failed effort to oust Sam Altman.

But none of what Sutskever said should actually come as a surprise, especially not to readers of this Substack, or to anyone who has followed me over the years. Essentially all of it was in my pre-GPT 2018 article "Deep Learning: A Critical Appraisal," which argued for neurosymbolic approaches to complement neural networks (as Sutskever now does) and for more innate (i.e., built-in, rather than learned) constraints (what Sutskever calls "new inductive constraints"). It was also in my 2022 "Deep learning is hitting a wall" evaluation of LLMs, which explicitly argued that the Kaplan scaling laws would eventually reach a point of diminishing returns (as Sutskever just did), and that problems with hallucinations, truth, generalization and reasoning would persist even as models scaled, much of which Sutskever has now acknowledged.
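For readers who have not seen them, the Kaplan scaling laws model test loss as a power law in compute, roughly L(C) = (C_c / C)^alpha. A power law never stops improving, but each successive tenfold increase in compute buys a smaller absolute drop in loss, which is what "diminishing returns" means here. The sketch below is a minimal illustration of that shape, not Sutskever's argument and not the paper's fitted values; the constants are placeholders chosen for readability.

# Illustrative sketch of a Kaplan-style compute scaling law, L(C) = (C_c / C)**ALPHA.
# ALPHA and C_C below are placeholder constants, not the fitted values
# from Kaplan et al. (2020).

ALPHA = 0.05   # illustrative scaling exponent
C_C = 1.0      # illustrative compute constant (arbitrary units)

def loss(compute: float) -> float:
    """Power-law loss curve: loss falls as compute grows, but ever more slowly."""
    return (C_C / compute) ** ALPHA

prev = loss(1.0)
for exponent in range(1, 7):   # compute = 10x, 100x, ..., 1,000,000x baseline
    c = 10.0 ** exponent
    cur = loss(c)
    print(f"{c:>12,.0f}x compute -> loss {cur:.4f} (improvement {prev - cur:.4f})")
    prev = cur

Run it and the improvement column shrinks with every decade of compute even as the budget grows a millionfold: the curve flattens without ever quite stopping, and at some point the next increment of loss is no longer worth the next data center.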

Subbarao Kambhampati, meanwhile, has been arguing for years about the limits of planning with LLMs. Emily Bender has been saying for ages that an excessive focus on LLMs has been "sucking the oxygen from the room" relative to other research approaches. The unfairly dismissed Apple reasoning paper laid bare the generalization issues; another paper, "Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens," put a further nail in the LLM reasoning and generalization coffin.

Alexia Jolicoeur-Martineau, a machine learning researcher at Samsung, summed the situation up well on X on Tuesday, following the release of Sutskever's interview:

§

Of course, it ain't over til it's over. Maybe pure scaling (adding more data and compute without fundamental architectural changes) will yet somehow magically solve what researchers such as Sutskever, LeCun, Sutton, Chollet, and I no longer think it can.

And investors may be loath to kick the habit. As Phil Libin put it presciently last year, scaling, not the generation of new ideas, is what investors know best.
