When I tap the app for Anthropic's Claude AI on my phone and give it a prompt -- say, "Tell me a story about a mischievous cat" -- a lot happens before the result ("The Great Tuna Heist") appears on my screen.
My request gets sent to the cloud -- a computer in a big data center somewhere -- to be run through Claude's Sonnet 4.5 large language model. The model assembles a plausible response using advanced predictive text, drawing on the massive amount of data it's been trained on. That response is then routed back to my iPhone, appearing word by word, line by line, on my screen. It's traveled hundreds, if not thousands, of miles and passed through multiple computers on its journey to and from my little phone. And it all happens in seconds.
This system works well if what you're doing is low-stakes and speed isn't really an issue. I can wait a few seconds for my little story about Whiskers and his misadventure in a kitchen cabinet. But not every task for artificial intelligence is like that. Some require tremendous speed. If an AI device is going to alert someone to an object blocking their path, it can't afford to wait a second or two.
Other requests require more privacy. I don't care if the cat story passes through dozens of computers owned by people and companies I don't know and may not trust. But what about my health information, or my financial data? I might want to keep a tighter lid on that.
Speed and privacy are two major reasons why tech developers are increasingly shifting AI processing away from massive corporate data centers and onto personal devices such as your phone, laptop or smartwatch. There are cost savings too: There's no need to pay a big data center operator. Plus, on-device models can work without an internet connection.
But making this shift possible requires better hardware and more efficient -- often more specialized -- AI models. The convergence of those two factors will ultimately shape how fast and seamless your experience is on devices like your phone.
Mahadev Satyanarayanan, known as Satya, is a professor of computer science at Carnegie Mellon University. He's long researched what's known as edge computing -- the concept of handling data processing and storage as close as possible to the actual user. He says the ideal model for true edge computing is the human brain, which doesn't offload tasks like vision, recognition, speech or intelligence to any kind of "cloud." It all happens right there, completely "on-device."
"Here's the catch: It took nature a billion years to evolve us," he told me. "We don't have a billion years to wait. We're trying to do this in five years or 10 years, at most. How are we going to speed up evolution?"