The tremendous hype surrounding AI coding shows no signs of dying down. Last month, Anthropic released a suite of industry-specific plug-ins for its Claude Cowork AI agent, panicking investors over fears that traditional enterprise software-as-a-service companies could soon be made obsolete. The announcement triggered a trillion-dollar sell-off, with many tech companies seeing sharp declines in their share prices.
It even seemed to jolt Sam Altman’s OpenAI, which moved to drop many of its distracting “side quests” in a concerted effort to double down on coding and enterprise-specific AI tools.
Yet plenty of glaring questions about the long-term viability of AI programming persist, with some warning that questionable and unverified code could spell disaster for the corporations eagerly embracing it.
Indeed, contrary to the hype, researchers have consistently found that AI-generated code is a bug-filled mess, forcing some programmers to pick up the pieces.
“No one knows right now what the right reference architectures or use cases are for their institution,” Dorian Smiley, CTO and founder of AI software engineering company Codestrap, told The Register.
“From the large language model perspective, people aren’t really addressing the fallibility of the underlying text,” CEO Connor Deeks added.
As software engineers continue to be put under pressure to use AI for their work — or else land on the chopping block — many errors could fall through the cracks.
“Even within the coding, it’s not working well,” Smiley told The Register. “Code can look right and pass the unit tests and still be wrong.”
The executive explained that the benchmarks needed to verify code simply haven't caught up yet, which means companies relying on AI to check AI-generated code may be flying by the seat of their pants, trapped in a potentially dangerous feedback loop.