If you believe what Google and Amazon want you to believe, this is the age of voice assistants… once again. This week, both companies unveiled more details about the future of their smart home ecosystems, which centers on a couple of things: new smart speakers (of course), in the form of the Google Home Speaker and new Echo products with better sound and faster chips, and, perhaps more importantly, a new and supposedly upgraded crop of voice assistants to power them. For Google, it's Gemini for Home; for Amazon, it's Alexa+. Both are being fueled by advances in large language models (LLMs) like those used by ChatGPT.

In both companies' estimation, Alexa+ and Gemini for Home are not just new generations of voice assistants, but the first real generational leap since voice assistants debuted 10 years ago. With that anticipation come some big promises. This time around, the companies say, you'll be able to do it all. Want an Uber? Order it with Alexa+. Want to check your home camera to see what your cats have been up to all day? Ask Gemini. Want to turn off every smart light in your house except one? Well, that's something you can actually ask for now instead of painstakingly lobbing several commands and hoping they stick.

It all sounds great. It sounds like exactly the type of ambient computing we've yearned for since voice assistants crept their way into our homes ages ago. It all sounds so ideal, and it also sounds, if I'm being honest, like it could be a total crock of shit.

Let me be clear: I have no doubt that chatbots can be transformative in some ways. We've already seen how they can be applied to areas like search, allowing for more complex queries, comparisons, and advice. We've seen their generative capabilities when extended to models like Veo or Sora. We've seen how they can code basic apps from an idea typed into a text box. Even if all of those capabilities are far from perfect, we've seen hard examples of how they can work when they work well. Voice assistants? Well, we have a lot less proof.

Amid the fanfare from Amazon this week, there was one glaring omission. While Alexa+, its next-gen voice assistant, has been available in early access, Amazon hasn't revealed when it plans to release the upgrade more broadly after a year of heralding it. It's still in "early access" for U.S. customers. That could mean a lot of things, obviously; maybe Amazon is just taking its time and making sure things are fine-tuned before unleashing Alexa+ on the world. But on a more cynical level, it could also mean that Alexa+ isn't quite ready for the big leagues.

Giving credence to that last theory? Siri. Remember when Apple promised an imminent release of a next-gen, LLM-powered Siri as part of Apple Intelligence way back in June 2024? Yeah, well, it's still not here, and there's no real indication of when it will arrive. And the reason? Well, if we were to put on our thinking caps, it would be that it's just not ready yet. To take that inference one step further: functional LLM-powered voice assistants are turning out to be a taller order than companies like Apple, arguably the best-resourced tech company in the world, had anticipated.

Now, maybe Amazon or Google will have more luck on that front. Google in particular has poured vast resources into advancing Gemini, and it wouldn't be unreasonable to think all of that attention and investment could lead to some kind of breakthrough. But there's still plenty to be skeptical about.
I recently got a briefing on Google's new smart home products, including its Gemini for Home assistant, and according to Google, the process of imbuing a voice assistant with an LLM isn't quite as straightforward as you'd think. While Gemini might be great at understanding natural language, it might actually not be ideal for the simpler stuff, like turning your lights off and on. Chatbots, though they can be impressive at times, have a tendency to overthink and interpret, which makes them good for some tasks but is not what you want for the bread-and-butter smart home ones. Anish Kattukaran, the chief product officer of Google Home and Nest, told Gizmodo that simple commands have to "work 10 out of 10 times."

Because of those differences, Google says it's actually separating models in Gemini for Home, meaning the more advanced LLM probably won't be switching your lights on and off or setting timers. When you say "Hey Google," you'll get the pared-down, more task-focused Gemini: the one you'll be using for timers, lights, playing music, quick web searches, and other general smart home automation. When you say "Hey Google, let's chat," however, Gemini for Home activates a Gemini Live mode that uses more of the LLM's natural language powers to hold a conversational "chat." This is where you'll get more reasoning and creativity for making recipes on the fly or brainstorming a vacation. In this Gemini Live mode, the AI will be "listening" and more anticipatory, letting you speak naturally without it feeling like you're constantly barking a command and waiting for it to do one thing at a time.

That raises the question: how much of Gemini is really in the more dumbed-down model you'll be using every day? And how advanced is it, really? It's also worth noting that Gemini for Home, like Alexa+, is currently in early access, and Google's challenge of retrofitting a voice assistant with an LLM isn't exclusive to Gemini; it's the same issue companies face across the board.

Listen, I'm not just willing to wind up with egg on my face for writing all this; I'm hoping that I do. I, like many others, have a fairly simple smart home myself, and have (also like many others) experienced the frustration and friction of using it even for simple tasks. I'm ready for the next generation of voice assistants, even if it means I have to fork over a monthly subscription to use them. But as hopeful as I am, it's been a long decade of wanting more but continually expecting less. So, in the case of Gemini, Alexa, and Siri, I'm going to need to see results before I buy into a full-on voice assistant redo.