According to Google, it doesn’t send all of your video to Gemini. That would be a huge waste of compute cycles, so Gemini only sees (and summarizes) event clips. Those summaries are then distilled at the end of the day to create the Daily Brief, which usually results in a rather boring list of people entering and leaving rooms, dropping off packages, and so on.
Importantly, the Gemini model powering this experience is not multimodal: it only processes the visual elements of videos and does not integrate audio from your recordings. So unusual noises or conversations captured by your cameras will not be searchable or reflected in AI summaries. This may be intentional to ensure your conversations are not regurgitated by an AI.
Paying for Google’s AI-infused subscription also adds Ask Home, a conversational chatbot that can answer questions about what has happened in your home based on the status of smart home devices and your video footage. You can ask questions about events, retrieve video clips, and create automations.
There are definitely some issues with Gemini’s understanding of video, but Ask Home is quite good at creating automations. It was possible to set up automations in the old Home app, but the updated AI is able to piece together automations based on your natural language request. Perhaps thanks to the limited set of possible automation elements, the AI gets this right most of the time. Ask Home is also usually able to dig up past event clips, as long as you are specific about what you want.
The Advanced plan for Gemini Home keeps your videos for 60 days, so you can only query the robot on clips from that time period. Google also says it does not retain any of that video for training. The only instance in which Google will use security camera footage for training is if you choose to “lend” it to Google via an obscure option in the Home app. Google says it will keep these videos for up to 18 months or until you revoke access. However, your interactions with Gemini (like your typed prompts and ratings of outputs) are used to refine the model.