Microsoft’s warning on Tuesday that an experimental AI Agent integrated into Windows can infect devices and pilfer sensitive user data has set off a familiar response from security-minded critics: Why is Big Tech so intent on pushing new features before their dangerous behaviors can be fully understood and contained?
As reported Tuesday, Microsoft introduced Copilot Actions, a new set of “experimental agentic features” that, when enabled, perform “everyday tasks like organizing files, scheduling meetings, or sending emails,” and provide “an active digital collaborator that can carry out complex tasks for you to enhance efficiency and productivity.”
Hallucinations and prompt injections apply
The fanfare, however, came with a significant caveat. Microsoft recommended users enable Copilot Actions only “if you understand the security implications outlined.”
The admonition is based on known defects inherent in most large language models, including Copilot, as researchers have repeatedly demonstrated.
One common defect of LLMs causes them to provide factually erroneous and illogical answers, sometimes to even the most basic questions. This propensity for hallucinations, as the behavior has come to be called, means users can’t trust the output of Copilot, Gemini, Claude, or any other AI assistant and instead must independently confirm it.
Another common LLM landmine is the prompt injection, a class of bug that allows hackers to plant malicious instructions in websites, resumes, and emails. LLMs are programmed to follow directions so eagerly that they are unable to discern those in valid user prompts from ones contained in untrusted, third-party content created by attackers. As a result, the LLMs give the attackers the same deference as users.