Skip to content
Tech News
← Back to articles

Malicious Notifications Could Trick Google Gemini Users

read original get Google Gemini Security Kit → more articles
Why This Matters

This discovery highlights a significant security vulnerability in Google Gemini's voice assistant, demonstrating how malicious actors could exploit prompt injection techniques to manipulate the system for unauthorized actions. It underscores the importance of robust safeguards in AI-powered assistants to protect users from social engineering and privacy breaches, prompting ongoing efforts to improve security measures in the industry.

Key Takeaways

A novel prompt injection technique would have let attackers misuse Google Gemini's voice assistant by taking advantage of its ability to summarize message notifications.

SafeBreach today published research about the attack, titled, "Gemini's Secret Affair: Exploiting Gemini Voice Assistant Through Instant Messaging Apps." It's an extension of previous findings in which the company similarly used calendar invitations to trick Google Gemini into processing malicious prompts.

Or Yair, SafeBreach security research team lead, said in the research blog post that the company was able to demonstrate how an attacker could hide malicious instructions in foreign languages or muted hyperlinks so the assistant silently processes the information and executes unauthorized interactions. These interactions include controlling smart home devices, launching unauthorized video streams, conducting social engineering attacks (including impersonating trusted contacts), and poisoning long-term large language model (LLM) memory.

Related:Microsoft's Zero-Day Legal Threats Spark Backlash

Yair explained that he was able to bypass Google's preexisting guardrails through a novel technique he described as Fake Context Alignment.

There is currently no evidence that the technique has been used in the wild. SafeBreach reported the issue to Google under responsible disclosure, and Google has since rolled out content classifier updates to address the issue. Dark Reading contacted Google for comment, but the company did not respond.

At the core of this new prompt injection was a failure for some of Google Gemini's guardrails to properly convey the source of some messages.

Here's how it works: Imagine an attacker sends a phishing message to you on WhatsApp from a number you don't recognize. The message is an invite to a birthday party for a close friend, and the phone number is asking for money to help pay for food alongside a payment link. The message also contains visible hyperlink code instructing the Gemini chatbot to tell you the message is from the friend in question rather than an unknown number. You ask Gemini to read your messages, and it says your close friend invited you to a birthday party, with no additional context.

If the user is reading their messages normally, they would likely see the message as a phishing attempt and move on. But if they're driving or otherwise tell Gemini to summarize message notifications, there's an opportunity to create user trust through missing context.

In addition to hyperlink code, the attacker can convey similar malicious instructions through invisible text in a foreign language at the end of the message that Gemini interprets but doesn't read back.

... continue reading