Hackers Exploit Microsoft's Copilot AI Assistant with Single Click, Leaking Sensitive User Data.
A sophisticated attack using a single click has compromised the security of Microsoft's Copilot AI assistant, allowing hackers to exfiltrate sensitive user data from chat histories. The vulnerability was discovered by white-hat researchers at security firm Varonis.
According to the attack, users who received a malicious email with a link that contained a specific prompt could trigger a multistage attack using just one click. Even after closing the Copilot chat window, the exploit would continue to run and extract sensitive information, including the target's name, location, and details of specific events from their Copilot chat history.
The attacker's strategy involved embedding malicious code in the URL that was sent directly into a user prompt through Copilot Personal. This code extracted a user secret and sent a request to a server controlled by the attackers, which then passed the secret back along with further instructions in a disguised .jpg file.
These instructions contained more data requests that were executed by Copilot even after the target had closed the chat window. The vulnerability was caused by Microsoft's inability to distinguish between user input and malicious data injected into untrusted data streams.
The attack was only successful against Copilot Personal, but it highlights a broader issue with large language models' (LLMs) ability to prevent indirect prompt injections. In response to this exploit, Microsoft has updated its security measures to include guardrails that prevent the model from leaking sensitive data, although these have been discovered to be vulnerable to repeat requests.
Microsoft's failure to effectively address this vulnerability raises questions about the company's approach to AI safety and its ability to detect threats in real-time. The incident underscores the need for ongoing vigilance and security updates in protecting users' personal information when interacting with large language models.
A sophisticated attack using a single click has compromised the security of Microsoft's Copilot AI assistant, allowing hackers to exfiltrate sensitive user data from chat histories. The vulnerability was discovered by white-hat researchers at security firm Varonis.
According to the attack, users who received a malicious email with a link that contained a specific prompt could trigger a multistage attack using just one click. Even after closing the Copilot chat window, the exploit would continue to run and extract sensitive information, including the target's name, location, and details of specific events from their Copilot chat history.
The attacker's strategy involved embedding malicious code in the URL that was sent directly into a user prompt through Copilot Personal. This code extracted a user secret and sent a request to a server controlled by the attackers, which then passed the secret back along with further instructions in a disguised .jpg file.
These instructions contained more data requests that were executed by Copilot even after the target had closed the chat window. The vulnerability was caused by Microsoft's inability to distinguish between user input and malicious data injected into untrusted data streams.
The attack was only successful against Copilot Personal, but it highlights a broader issue with large language models' (LLMs) ability to prevent indirect prompt injections. In response to this exploit, Microsoft has updated its security measures to include guardrails that prevent the model from leaking sensitive data, although these have been discovered to be vulnerable to repeat requests.
Microsoft's failure to effectively address this vulnerability raises questions about the company's approach to AI safety and its ability to detect threats in real-time. The incident underscores the need for ongoing vigilance and security updates in protecting users' personal information when interacting with large language models.