GoKawiil - Why I find diffusion models interesting?

Diffusion models are interesting. I stumbled across this tweet a week or so back where this company called Inception Labs released a Diffusion LLM (dLLM). Instead of being autoregressive and predicting tokens left to right, here you start with the whole sequence at once and gradually arrive at sensible words simultaneously (start, middle, and end all at once). Something that historically worked for image and video models is now outperforming similar-sized LLMs in code generation. The company also claims a 5-10x improvement in speed and efficiency.

Why are they interesting to me?

After spending the better part of the last 2 years reading, writing, and working in LLM evaluation, I see some obvious first-hand benefits for this paradigm:

Traditional LLMs hallucinate. It's like they are confidently spitballing text while actually making up facts on the go. This is why they sometimes start sentences super confidently only to suggest something nonsensical in the end. dLLMs can generate certain impor ...
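To make the left-to-right versus all-at-once contrast concrete, here is a toy sketch in Python. It is not Inception Labs' algorithm and involves no model at all: both functions simply reveal a known target sentence, so the only thing it illustrates is the order in which positions get filled. The function names, the `_` mask symbol, and the shuffled reveal order (a stand-in for a model's confidence ranking) are all assumptions made for illustration.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-generated token


def autoregressive_order(target):
    """Fill positions strictly left to right, one token per step."""
    seq = [MASK] * len(target)
    steps = []
    for i, tok in enumerate(target):
        seq[i] = tok
        steps.append(list(seq))
    return steps


def diffusion_style_order(target, num_steps, seed=0):
    """Start fully masked and reveal a batch of positions per step.

    The revealed positions can sit anywhere in the sequence. A real
    dLLM would predict tokens; here we copy them from `target`.
    """
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    hidden = list(range(len(target)))
    rng.shuffle(hidden)  # assumption: stands in for a confidence ranking
    steps = []
    for step in range(num_steps):
        remaining_steps = num_steps - step
        k = -(-len(hidden) // remaining_steps)  # ceiling division
        for pos in hidden[:k]:
            seq[pos] = target[pos]
        hidden = hidden[k:]
        steps.append(list(seq))
    return steps


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over".split()
    for state in autoregressive_order(sentence):
        print("AR  :", " ".join(state))
    for state in diffusion_style_order(sentence, num_steps=3):
        print("dLLM:", " ".join(state))
```

In this toy, the autoregressive version needs one step per token, while the diffusion-style version finishes in a fixed number of steps regardless of length, filling the start, middle, and end in no particular order.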

