Tech News

Apple researchers built an AI that tests several ideas in parallel before answering

Why This Matters

Apple's new LaDiR framework combines diffusion and autoregressive techniques to strengthen large language models' reasoning, letting a model explore multiple candidate solutions in parallel. The approach improves both the accuracy and the diversity of generated answers, particularly on complex tasks like math and coding, and marks a step toward more reliable, versatile AI systems.

Key Takeaways

In a new paper, a team of Apple researchers details a creative framework that improves LLM answers in math reasoning, code generation, and more. Here are the details.

Diffusion and autoregression, united

In a newly revised study titled LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning, Apple researchers, alongside researchers from the University of California, San Diego, detail an interesting way to improve the quality of answers generated by large language models (LLMs) in certain domains.

In the past, we’ve discussed diffusion models, which generate text by refining many tokens in parallel with each pass, in contrast to autoregressive models, which predict tokens one at a time.
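To make that contrast concrete, here is a toy Python sketch of the two decoding styles. The function names and signatures are ours for illustration only; they are not from the LaDiR paper.

```python
# Toy contrast: autoregressive decoding vs. diffusion-style refinement.
# All names here are illustrative assumptions, not the paper's API.

def autoregressive_decode(predict_next, prompt, n_tokens):
    """Generate one token at a time, each conditioned on everything so far."""
    seq = list(prompt)
    for _ in range(n_tokens):
        seq.append(predict_next(seq))  # one prediction per new token
    return seq

def diffusion_decode(denoise_step, noisy_seq, n_steps):
    """Refine every position of the sequence in parallel on each pass."""
    seq = list(noisy_seq)
    for step in range(n_steps):
        seq = denoise_step(seq, step)  # updates all positions at once
    return seq
```

The key difference: the autoregressive loop makes one call per *token*, while the diffusion loop makes one call per *refinement pass*, touching the whole sequence each time.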

Apple has even looked at diffusion models applied to protein folding prediction and coding, which is endlessly interesting.

What LaDiR does, in a nutshell, is combine both approaches: it adopts diffusion during the reasoning process, and then generates the final output autoregressively.

More than that, it actually works with many reasoning paths in parallel, each one running its own diffusion process, with a mechanism that pushes them to explore different possibilities, thus producing a diverse set of candidate answers.

They explain that at inference time, when the model is essentially working out what it will say and how, LaDiR generates a series of hidden reasoning blocks, each starting as a random pattern (noise) and gradually being refined into a more coherent step.

Once the model determines it has done enough reasoning, it switches to generating the final answer autoregressively, one token at a time.
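The two-phase flow described above can be sketched in a few lines of Python. Everything here is an illustrative assumption: the function names, the stopping rule, and the block sizes are ours, not the paper's actual implementation.

```python
import random

# Minimal sketch of LaDiR-style two-phase inference, under assumed names:
#   denoise(block, step, reasoning) -> refined block (our stand-in)
#   is_done(reasoning)              -> model's "enough reasoning" signal
#   decode_token(reasoning, answer) -> next output token

def two_phase_inference(denoise, decode_token, is_done, block_dim=4,
                        max_blocks=8, refine_steps=10, answer_len=16):
    reasoning = []
    # Phase 1: latent reasoning blocks, each refined from pure noise.
    for _ in range(max_blocks):
        block = [random.gauss(0.0, 1.0) for _ in range(block_dim)]
        for step in range(refine_steps):
            block = denoise(block, step, reasoning)  # gradual refinement
        reasoning.append(block)
        if is_done(reasoning):  # model decides it has reasoned enough
            break
    # Phase 2: final answer generated autoregressively, one token at a time.
    answer = []
    for _ in range(answer_len):
        answer.append(decode_token(reasoning, answer))
    return answer
```

Note the switch in granularity: phase 1 iterates over whole latent blocks, phase 2 over individual output tokens.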

The key detail is that LaDiR can run several of these reasoning paths in parallel, with a mechanism that encourages it to explore different possibilities so the paths don't all converge on the same idea too early, which would defeat the purpose of the whole exercise.
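One simple way to picture such a diversity mechanism is a repulsion term that nudges each path away from the average of all paths during refinement. This heuristic is our illustration, not the paper's actual technique:

```python
import random

# Hedged sketch: several reasoning paths run their own diffusion process in
# parallel; a small repulsion term pushes each away from the group mean so
# they don't collapse onto one idea. Illustrative only, not LaDiR's method.

def diverse_parallel_paths(denoise, n_paths=4, dim=3, steps=20, push=0.1):
    # Each path starts as independent random noise.
    paths = [[random.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(n_paths)]
    for step in range(steps):
        paths = [denoise(p, step) for p in paths]  # refine each path
        mean = [sum(p[i] for p in paths) / n_paths for i in range(dim)]
        # Push each path away from the average of all paths.
        paths = [[x + push * (x - m) for x, m in zip(p, mean)]
                 for p in paths]
    return paths  # a diverse set of candidate reasoning latents
```

Each returned latent would then be decoded into its own candidate answer, giving the model several distinct solutions to choose from.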
