AI Code Is a Bug-Filled Mess

The adoption rate of AI tools has skyrocketed in the programming world, enabling coders to generate vast amounts of code with simple text prompts.

Earlier this year, Google found that 90 percent of software developers across the industry are using AI tools on the job, up from a mere 14 percent last year.

But all that convenience has come with some glaring drawbacks. The tools have repeatedly been found to be unreliable and inaccurate, which can let mistakes slip through the cracks and even force some programmers to put in long hours identifying and correcting them.

Adding to the reality check, a new report by AI software company CodeRabbit found that AI-generated code was significantly more error-prone than human-written code. Across the 470 pull requests the company analyzed, AI code produced an average of 10.83 issues per request, while human-authored code produced just 6.45.

In other words, AI code produced roughly 1.7 times as many issues as human code, once again highlighting major weaknesses plaguing generative AI tools.
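For readers who want to check the math, the multiple follows directly from the two per-request averages reported above. The short Python sketch below is only an illustration; the variable names are ours, not CodeRabbit's, and the only inputs are the two averages quoted in the report.

```python
# Average issues per pull request, as quoted from the CodeRabbit report.
ai_issues_per_pr = 10.83
human_issues_per_pr = 6.45

# How many times as many issues AI-authored pull requests produced.
ratio = ai_issues_per_pr / human_issues_per_pr

# Absolute gap: additional issues per pull request for AI-authored code.
extra = ai_issues_per_pr - human_issues_per_pr

print(f"ratio: {ratio:.2f}x")
print(f"extra issues per pull request: {extra:.2f}")
```

Rounded to one decimal place, the ratio gives the 1.7 figure, which works out to a little over four additional issues per pull request.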

“The results?” CodeRabbit concluded in its report. “Clear, measurable, and consistent with what many developers have been feeling intuitively: AI accelerates output, but it also amplifies certain categories of mistakes.”

Worse yet, the company found that AI-generated code produced a higher rate of “critical” and “major” issues, which it described as a “meaningful rise in substantive concerns that demand reviewer attention.”

AI code was also most likely to contain errors related to logic and correctness. However, the biggest weakness CodeRabbit found was in code quality and readability, which are issues that can “slow teams down and compound into long-term technical debt.”

Then there are serious cybersecurity concerns, with generated code introducing issues related to improper password handling that could lead to protected information being exposed, among other insecure practices.

On the upside, CodeRabbit found that AI code was adept at keeping spelling errors to a minimum. Humans were twice as likely to introduce misspellings.
