TL;DR

- Google is enhancing Gemini Live with visual overlays that highlight objects in your camera feed and a new audio model for more expressive conversations.
- The visual overlay feature helps you identify items or get advice by placing a white-bordered rectangle around objects in your camera’s view.
- The new native audio model is designed for more responsive and expressive conversations.
At last year’s Made by Google event, Google unveiled Gemini Live, a feature designed for more natural, hands-free conversations with its AI chatbot. Since its debut, Google has introduced numerous upgrades to Gemini Live, most notably the ability to share your camera feed and screen. Today, Google announced a major enhancement to Gemini Live’s camera-sharing capabilities and a new audio model to make interactions even more natural.
Visual overlays in Gemini Live

During its presentation on the new Google Pixel 10 series, Google detailed several improvements coming to Gemini Live on Android. First, when you share your camera feed, Gemini Live will be able to display visual overlays that highlight specific objects. These highlights appear as a white-bordered rectangle around an object, while the rest of the view is slightly dimmed to make it stand out.
This “visual guidance” feature is designed to help you quickly locate and identify items in your camera’s view. For instance, you could use it to highlight the correct button on a machine, point out a specific bird in a flock, or identify the right tool for a project. You can also use it for advice, like asking Gemini to recommend the right pair of shoes for an occasion.
The feature can also handle more complex scenarios. In a briefing, a Google product manager shared a personal example from a recent international trip. He was struggling to figure out if he could park in a certain spot, unable to make sense of the foreign-language signs, road markings, and local regulations. After pulling out his phone and opening Gemini Live, he pointed his camera at the scene and asked if parking was allowed. Gemini looked up the local rules, translated the signs, and then highlighted a spot on the street where he could park for free for the next two hours.
Visual guidance in Gemini Live will be available out of the box on the Google Pixel 10 series and will start rolling out to other Android devices next week. The feature will expand to iOS devices in the coming weeks. A Google AI Pro or Ultra subscription will not be required.
New native audio model in Gemini Live

Alongside the visual overlays, Google is upgrading Gemini Live with a new native audio model designed for more responsive and expressive conversations.