Researchers Warn Copilot and Grok AI Can Be Manipulated by Prompt Injection Attacks

By Abdul Wasay|4 hours ago|

Cybersecurity research has highlighted an emerging threat vector in widely used artificial intelligence assistants, showing that tools such as Microsoft Copilot and xAI’s Grok can be manipulated into executing unintended or harmful behaviors when presented with crafted inputs or malicious prompt parameters.

These vulnerabilities come under a broader category of attacks known as prompt injection, where adversarial inputs are designed to alter an AI model’s behavior beyond its intended safeguards. Critically, researchers found that compromised AI assistants could be used as C2 intermediaries, periodically fetching attacker controlled instructions from remote sources and executing follow on actions. In some scenarios, a single user interaction such as opening a link was enough to trigger a chain of automated behaviors, including data exfiltration, response manipulation, or silent task execution without further user awareness.

Response of both Grok and Copilot when asking to summarize the C2 with a suspicious request

At a basic level, prompt injection exploits how AI models process instructions: because system prompts and user inputs are handled in the same context, attackers can embed hidden commands that the model treats as legitimate instructions rather than untrusted content. It can be as simple as hosting hidden prompts in webpages or links that AI clients parse, which then influence responses or memory functions.

Successful command execution from the C2 server to execute calc in a WebView window

Security analysts have also underscored related risks such as the Reprompt attack, which affects Copilot by enabling attackers to exfiltrate sensitive information via crafted URLs that abuse the assistant’s prompt handling mechanism; attackers can initiate sequences of actions after a single user click, potentially bypassing typical safety guardrails.

According to a research report by Checkpoint, who broke the news:

Abusing legitimate services for C2 is not new. We’ve seen it with Gmail, Dropbox, Notion, and many others. The usual downside for attackers is how easily these channels can be shut down: block the account, revoke the API key, suspend the tenant. Directly interacting with an AI agent through a web page changes this. There is no API key to revoke, and if anonymous usage is allowed, there may not even be an account to block.

Other industry research has demonstrated how search-enabled AI assistants capable of browsing the web can fetch and return attacker-controlled URLs, effectively turning an AI tool into a stealthy command relay in enterprise environments.

The vulnerabilities are part of a larger ecosystem of AI security concerns that include indirect prompt injections, AI memory manipulation (where assistants retain biased or malicious instructions), and autonomous agent misuse. Industry experts argue that because AI is increasingly embedded in workflows spanning coding, document handling, and decision support systems, traditional input sanitization strategies are not sufficient to address these classes of attacks.

Prompt injection and related exploit techniques have been cataloged in security taxonomies such as the MITRE ATLAS framework, where they reflect persistent threats that combine user interaction, indirect content parsing, and session context mishandling by large language models.

As a result, organizations that deploy AI assistants are being advised to adopt layered defenses that include limiting AI privileges, validating inputs rigorously, and monitoring for anomalous output patterns that could indicate compromise.

As Checkpoint puts it:

Any AI service that exposes web fetch or browsing capabilities, especially to anonymous users, inherits a similar level of abuse potential. Today, that may look like a creative way to hide C2 in “normal” AI traffic. Tomorrow, the same pattern can evolve into fully AI-Driven malware and AIOps-style C2, where models help decide which hosts to keep, which files to steal or encrypt, and when to stay dormant to avoid sandboxes and detection.

Given the pace at which AI assistants are integrated into business and personal computing environments, researchers and cybersecurity professionals emphasize that addressing prompt injection and related risks must become part of mainstream security strategy rather than an afterthought.