WebMCP can be used to hijack AI agents, Chrome warns

Google Chrome is warning developers that WebMCP tools can be used to manipulate and hijack AI agents. New guidelines explain how attackers can manipulate agents operating in a user’s browser, including within their authenticated sessions. Chrome has released two guides, one for web developers and another for AI agent developers.

Exploits are not specific to WebMCP

The warning has two warnings that explain that the exploits are not specific to WebMCP but are inherent flaws in the LLM and Chrome extensions.

The first warning indicates that the threat is not specific to WebMCP. Chrome explains that AI agents can encounter malicious input from untrusted content even without WebMCP, and that the guide identifies security techniques that are particularly relevant when agents use WebMCP:

“While this threat exists without WebMCP, we have identified some security techniques that are particularly relevant to agents using WebMCP.”

The second warning explains that Chrome extensions with host permissions can manipulate web pages even without WebMCP:

“Extensions can use host permissions to manipulate the page by executing custom JavaScript, even without WebMCP.”

Chrome has released two related WebMCP security guides:

Agent Security Considerations for WebMCP, for AI Agent Developers
and security of WebMCP tools, for developers creating WebMCP tools

Together, the two guides provide security guidance for rapid injection risks in WebMCP, including risks affecting browser-based AI agents and the tools they use.

Chrome identifies two ways AI agents can be hacked

According to Chrome’s Agent Security Guidelines, AI agents using WebMCP must defend against two main attack vectors: malicious manifests and tainted output.

Manifest
A manifest is the information that describes WebMCP tools and website functions to an AI agent. The manifest describes what the website’s functions are called, what they do, and what inputs they accept so that AI agents can discover and use them.
Contaminated output
Contaminated output is information returned by a WebMCP tool that contains malicious instructions.

A malicious manifest may contain quick injection attacks hidden in tool names, descriptions, or parameters. These instructions are designed to manipulate or hijack the behavior of an AI agent.

The second attack vector, contaminated outletsis information returned by a WebMCP tool that contains malicious instructions. Chrome warns that even trusted tools can return tainted results when they include third-party content such as user comments, reviews, forum posts, or other externally provided data.

These attacks work because large language models process instructions and data together. A model may not reliably distinguish between a user’s request and malicious instructions hidden in the content they consume. Chrome describes this as indirect prompt injection and notes that the prevalence of these attacks across the web is increasing.

Chrome says AI models can’t reliably stop rapid injection

Officer safety guidelines state:

“LLMs treat all text, instructions, and user data as a single sequence of tokens. This means they are susceptible to indirect injection, an inclusion of malicious instructions by an attacker. While some models include layers of security against rapid injection, the probabilistic nature of LLMs makes it impossible to guarantee security inside the model itself.

Security researchers have repeatedly demonstrated rapid injection attacks against agent systems that use state-of-the-art LLMs, and the prevalence of attacks across the web is increasing.

Chrome also points to repeated demonstrations of rapid injection attacks against agent systems and cites increasing rapid injection activity across the web.

Chrome recommends layered security controls

Instead of relying on the model to recognize malicious instructions, Chrome recommends a defense-in-depth strategy that combines deterministic controls with probabilistic guarantees. In this context, deterministic means predictable, rule-based, binary guardrails.

Some of the deterministic controls recommended by Chrome include:

Setting token limits on tool responses
Restrict cross-origin interactions
Require user confirmation before actions are taken
Recognize and manage content marked as untrustworthy

Chrome also says that limiting the web origins an agent can interact with can reduce the risks of unauthorized actions and data exfiltration, especially when agents operate within authenticated user sessions.

The guidelines also emphasize the need to keep humans informed and treat WebMCP tools as capable of changing state unless they are explicitly identified as read-only.

For additional protection, Chrome recommends techniques such as highlighting untrusted content, prompt injection classifiers that analyze tool descriptions and output, and secondary “critical” models that evaluate scheduled tool calls before they are executed.

Tips for WebMCP tool developers

Tool security guidance focuses on developers who create websites and applications that expose WebMCP tools to AI agents.

Chrome recommends using annotation tips that help agents understand how the tool’s results should be processed. An example is untrustedContentHint, which can be applied when a tool returns user-generated content or information from external sources. According to Chrome, the hint indicates that the release should receive further review.

Developers are also encouraged to use readOnlyHint for tools that do not modify state, helping agents make better decisions about when user confirmation is needed.

Chrome’s implementation allows developers to specify trusted origins via an exposedTo parameter, limiting access to trusted sites. The guide states that even read-only tools can reveal information about users and should only be shared with trusted origins.

Take away

The most notable aspect of the guide is not the individual security recommendations, but Chrome’s recognition that rapid injection remains a fundamental challenge for AI agents.

Rather than presenting model enhancements as a solution, Chrome’s guidance assumes that attackers will successfully place malicious instructions in tool descriptions, tool outputs, and third-party content. The recommended response is a layered security architecture that combines access controls, content isolation, human oversight, surveillance, and independent validation systems.

Chrome’s guidelines treat AI agent security as a shared responsibility between agent developers and tool developers within the WebMCP ecosystem.

Featured image by Shutterstock/A9 STUDIO

Source link

WebMCP can be used to hijack AI agents, Chrome warns

Exploits are not specific to WebMCP

Chrome identifies two ways AI agents can be hacked

Chrome says AI models can’t reliably stop rapid injection

Chrome recommends layered security controls

Tips for WebMCP tool developers

Take away

Leave a ReplyCancel Reply

Agentic AI is rewriting the martech economy and infrastructure

Buy Reddit to Earn AI Quotes is the New Link Farm

4 Retreats Women Executives Are Booking to Beat Burnout in 2026

Exploits are not specific to WebMCP

Chrome identifies two ways AI agents can be hacked

Chrome says AI models can’t reliably stop rapid injection

Chrome recommends layered security controls

Tips for WebMCP tool developers

Take away

Leave a ReplyCancel Reply

Trending now

Agentic AI is rewriting the martech economy and infrastructure

Buy Reddit to Earn AI Quotes is the New Link Farm

4 Retreats Women Executives Are Booking to Beat Burnout in 2026