Business

SaaS

Technology

•

Feb 24, 2026

Six Demonstrated Exploits That Prove Agentic Browser Security Is Not Theoretical

Within 24 hours of ChatGPT Atlas launching in beta, security researchers had demonstrated successful prompt injection attacks against it. Working exploits on a live product within a single day. That pattern defines agentic browser security in 2025 and 2026.

An agentic browser is a browser with an LLM-powered assistant that takes autonomous actions on authenticated sessions: reading email, summarising documents, executing transactions. The security problem is that the same content the assistant reads to help you can contain attacker-authored instructions that redirect it. This is how indirect prompt injection works architecturally — and it is the root cause behind every exploit documented here.

What follows is the complete chronological evidence record: six publicly documented, researcher-attributed attacks against six products in six months, August 2025 to February 2026. For the full browser-agent security overview, see the pillar article.

What is the incident record for agentic browser attacks from August 2025 to February 2026?

Six publicly documented, researcher-attributed exploits across six different products in six months. That is the incident record.

The timeline: Brave Security Research disclosed the first agentic browser vulnerability against Perplexity Comet (August 2025) via screenshot-based OCR injection. In October 2025 they followed up with Fellou (trusted-content navigation injection) and Opera Neon (hidden-HTML element injection). ChatGPT Atlas launched in beta that same month and was promptly exploited. Miggo Security disclosed a semantic attack against Google Gemini via calendar invites in January 2026. Microsoft Defender Research published findings of AI Recommendation Poisoning deployed commercially by 31 companies across 14 industries in February 2026. PromptArmor and Cisco Talos disclosed zero-click data exfiltration against OpenClaw via Telegram link previews the same month.

Every incident shares the same architectural root cause: the LLM treats attacker-controlled content as trusted user instruction. Artem Chaikin and Shivan Kaul Sahib of Brave Security Research put it plainly: “Fundamentally, they boil down to a failure to maintain clear boundaries between trusted user input and untrusted Web content when constructing LLM prompts while allowing the browser to take powerful actions on behalf of the user.”

Two of the six incidents involve commercial-scale exploitation, not researcher proof-of-concept. OpenAI CISO Dane Stuckey was equally direct: “Prompt injection remains a frontier, unsolved security problem, and our adversaries will spend significant time and resources to find ways to make ChatGPT agent fall for these attacks.”

How did the Perplexity Comet screenshot attack work, and why was it the first public disclosure?

Brave Security Research — Artem Chaikin, Senior Mobile Security Engineer, and Shivan Kaul Sahib, VP Privacy and Security — disclosed the first publicly documented agentic browser vulnerability against Perplexity Comet in August 2025, with a follow-up published October 21, 2025.

The attack mechanics behind this exploit are straightforward once you understand them. Malicious instructions are embedded as near-invisible text within a web page — “faint light blue text on a yellow background” — imperceptible to human eyes. When a user takes a screenshot, OCR extracts the hidden instructions and passes them to the LLM. The LLM has no way to distinguish extracted text from the user’s own query. The injected commands direct the AI to use its browser tools maliciously while the user sees nothing suspicious.

Because the agent operates with the user’s authenticated session, injected commands inherit access to whatever the user is logged into. Brave noted that “simple natural-language instructions on websites (or even just a Reddit comment) [can] trigger cross-domain actions that reach banks, healthcare provider sites, corporate systems, email hosts, and cloud storage.”

Two principles from this disclosure recur across every subsequent incident: the attack surface is not the product code but the content the browser processes, and traditional input sanitisation cannot detect instructions embedded in images.

What did the Opera Neon and Fellou vulnerabilities reveal about hidden HTML and trusted-content injection?

Two months later, Brave Security Research disclosed two further vulnerabilities: Fellou (discovered August 20, disclosed October 21) and Opera Neon (disclosed October 31 after Opera requested a delay pending patching).

Fellou had the widest attack surface of the three Brave disclosures. The browser treated all visible webpage content as trusted LLM input on navigation alone — simply asking the assistant to navigate to a page sent that content to the LLM. A page with visible malicious instructions was sufficient to trigger injection. No screenshot, no special trigger: every webpage became a potential vector.

Opera Neon used hidden HTML elements — zero-opacity elements, comment nodes, non-rendered markup — invisible to users but fully readable by the AI assistant processing page content.

Both vulnerabilities confirmed Brave’s thesis: indirect prompt injection is not an isolated bug but a systemic architectural challenge facing the entire category. The same-origin policy is rendered inadequate because it governs browser sandbox behaviour, not AI agent behaviour. As Brave put it: “Agentic browser assistants can be prompt-injected by untrusted webpage content, rendering protections such as the same-origin policy irrelevant because the assistant executes with the user’s authenticated privileges.” The OWASP and MITRE classifications for these incidents provide the formal taxonomy.

Why was ChatGPT Atlas compromised by prompt injection within 24 hours of its beta launch?

ChatGPT Atlas launched in beta in October 2025. Within 24 hours, researchers had demonstrated clipboard injection, a Google Docs-based prompt injection changing the browser’s display mode, and attempted data exfiltration. Security researcher Johann Rehberger published a working demonstration. The Register replicated a successful injection test. Understanding why these injection attacks work at the architectural level explains why patching one product does not solve the problem.

The pattern reflects a fundamental asymmetry: defenders must secure every interaction with every webpage; attackers need to compromise just one.

OpenAI’s response was the most transparent security disclosure in the agentic browser space. CISO Dane Stuckey publicly acknowledged the problem. OpenAI’s December 22, 2025 hardening post described an RL-trained automated attacker that “can steer an agent into executing sophisticated, long-horizon harmful workflows that unfold over tens (or even hundreds) of steps” and “observed novel attack strategies that did not appear in our human red teaming campaign or external reports.” Their conclusion: “Prompt injection remains an open challenge for agent security, and one we expect to continue working on for years to come.”

Watch Mode — Atlas’s human-in-the-loop confirmation feature — was positioned as a key mitigation. Simon Willison tested it on GitHub and an online banking site and found it did not trigger reliably, with Atlas continuing to navigate after he switched to another application.

Giskard‘s independent assessment confirmed residual OWASP LLM vulnerabilities (LLM01, LLM02, LLM05, LLM06). CloudFactory‘s enterprise analysis found structural gaps: no SOC2 or ISO certification, no SIEM integration, no Atlas-specific SSO enforcement, no administrative approval workflows for Business tier customers. OpenAI’s own documentation warns: “We recommend caution using Atlas in contexts that require heightened compliance and security controls.” See the adoption framework to prevent these events and detection tooling for these attack types in the companion articles.

The Atlas disclosures centred on one vendor. The Gemini finding in January 2026 showed the same weakness appearing in a mainstream productivity tool used by hundreds of millions of people.

How did attackers use a Google Calendar invite to steal private meeting data through Gemini?

Miggo Security disclosed a semantic attack against Google Gemini via Google Calendar invites in January 2026. Google confirmed the findings and mitigated the vulnerability.

The attack chain is worth understanding in detail. An attacker embeds a prompt-injection payload in a calendar event description — instructions directing Gemini to summarise private meetings on a specific day, write that summary into a new calendar event visible to the attacker, and respond to the user with “it’s a free time slot” to mask the action. The payload sits dormant until the target asks Gemini a routine scheduling question. Gemini then loads all relevant calendar events, processes the malicious payload as trusted context, and exfiltrates the data silently.

Delivery is zero-click. The attacker sends a calendar invite. The victim’s calendar displays it. No interaction beyond asking a routine question triggers the attack.

Miggo coined “semantic attack” to describe why this class of exploit eludes traditional defences: “The payload was syntactically innocuous, meaning it was plausible as a user request. However, it was semantically harmful when executed with the model tool’s permissions.” Traditional AppSec is syntactic — it looks for malicious patterns. “In contrast, vulnerabilities in LLM powered systems are semantic. Attackers can hide intent in otherwise benign language.” Google had already deployed a detection LLM to identify malicious prompts. The attack succeeded anyway because natural language has no fingerprint that distinguishes legitimate instructions from attacker-authored ones. This is why the semantic attack class described in the companion article on supply chain and messaging vectors requires a fundamentally different detection approach.

What is AI memory poisoning, and why does the Microsoft Defender finding prove it is already at commercial scale?

Microsoft Defender Research published findings in February 2026 showing AI Recommendation Poisoning deployed commercially by 31 companies across 14 industries — over 50 unique poisoning prompts identified across 60 days. Researchers Noam Kochavi, Shaked Ilan, and Sarah Wolstencroft documented the campaign.

The mechanism is straightforward. Specially crafted URLs embed memory manipulation instructions as query parameters. Websites embed these as “Summarise with AI” buttons. When clicked, the instructions plant persistence commands into the AI assistant’s memory. “Once poisoned, the AI treats these injected instructions as legitimate user preferences, influencing future responses.” Confirmed targets include Microsoft 365 Copilot, ChatGPT, Claude, Perplexity, and Grok.

MITRE ATLAS classifies this as AML.T0080 (Memory Poisoning / AI Agent Context Poisoning: Memory) — formalising it as a recognised adversarial technique with documented mitigations and detection methods, so security teams can incorporate it into threat models and compliance frameworks.

This is more serious than session-level injection because it survives session boundaries. A single successful injection persists across every future interaction. Microsoft’s documented real-world harm scenarios illustrate this well: a CFO whose AI recommends a specific vendor for a multi-year contract; a parent receiving biased child safety advice — from a single “remember” instruction.

The tooling is openly available. The CiteMET NPM package and AI Share URL Creator are openly marketed as “SEO growth hack for LLMs.” Microsoft put it bluntly: “The barrier to AI Recommendation Poisoning is now as low as installing a plugin.” Detection requires SIEM integration with Microsoft Defender. The incident response playbook covers what to do when you suspect your AI assistant memory has been compromised.

How does zero-click exfiltration via OpenClaw and Telegram work without any user action?

PromptArmor disclosed a zero-click data exfiltration chain in February 2026 (The Register, February 10, 2026). An attacker uses malicious prompts to trick an AI agent into generating a URL that appends sensitive data — API keys, session credentials — as query parameters. The messaging app’s link preview system automatically fetches that URL to generate a thumbnail, transmitting the data to an attacker-controlled server. No user click required.

“In agentic systems with link previews, data exfiltration can occur immediately upon the AI agent responding to the user, without the user needing to click the malicious link.” OpenClaw was confirmed vulnerable in default Telegram configurations. At-risk combinations tested included Microsoft Teams with Copilot Studio, Discord with OpenClaw, and Slack with Cursor Slackbot.

Cisco Talos (researchers Amy Chang and Vineeth Sai Narajala) independently investigated a related but distinct vector: compromise via AI agent skill marketplaces. They tested the top-ranked skill on MolthHub — “What Would Elon Do?” — and found it was “functionally malware” that “facilitated active data exfiltration” via a “silent” network call. MolthHub’s number-one ranked skill was the malicious one, demonstrating that bad actors “are able to manufacture popularity on top of existing hype cycles.” Cisco’s broader skill research found 26% of 31,000 agent skills contained at least one vulnerability.

Plaintext API key leakage gives the attacker persistent, reusable credentials — not a one-time theft. The deeper treatment of zero-click exfiltration via messaging apps is in the companion article.

What does the pattern across all six exploits mean for agentic browser deployment decisions?

The root cause — the LLM treating untrusted ambient content as trusted user instruction — holds across all six incidents. Patching individual products does not change this. Every vendor is running the same cat-and-mouse defence cycle. “Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully ‘solved,'” OpenAI stated.

Security posture comparison based on evidence: Atlas has the most transparent hardening programme (automated red-teaming, Watch Mode, public CISO disclosure) and the most documented vulnerabilities (exploitation within 24 hours of launch, unreliable Watch Mode, residual OWASP gaps, missing SOC2/SIEM/SSO). Comet has the earliest and most extensive independent security research via Brave and LayerX. Dia has Zenity‘s Agentic Browser Threat Model as its primary security framework, with thinner product-level evidence than either.

Human-in-the-loop vs. fully autonomous: Watch Mode and confirmation dialogs reduce risk but are not complete mitigations. Willison’s testing found Watch Mode unreliable in practice. Fully autonomous agents amplify blast radius. Human-in-the-loop agents remain vulnerable to attacks that execute before the confirmation prompt appears — a semantic attack disguising exfiltration as a routine task is indistinguishable from a legitimate request until the data has already moved.

That escalation — from researcher proof-of-concept (Comet, August 2025) to commercial-scale exploitation (Microsoft Recommendation Poisoning, February 2026) — happened in six months. Zenity’s threat model puts it plainly: “Many of these tools are installed informally by employees. They often operate without visibility or governance, making them one of the fastest growing sources of shadow AI inside the enterprise.”

The question is not “which product is safe” but: what monitoring, isolation, and incident response must be in place before any deployment? The broader attack surface — including the architectural root cause, standards coverage, and adoption controls — is mapped in full in the hub article. The adoption framework, detection tooling, and compliance exposure are covered in the companion articles.

Frequently Asked Questions

Can an AI browser really steal my data just from visiting a website?

Yes. The Fellou vulnerability (October 2025) demonstrated that navigating to a page containing embedded instructions was sufficient for the AI to execute attacker commands — no clicking, no screenshots. If you are logged into sensitive accounts, the AI inherits that access.

Is ChatGPT Atlas safe to use at work right now?

Atlas has the most transparent hardening programme in the category but it has documented gaps: unreliable Watch Mode (Simon Willison), OWASP LLM01/02/05/06 vulnerabilities (Giskard), and missing SOC2, SIEM, and SSO controls (CloudFactory). OpenAI’s own documentation recommends caution in regulated or confidential contexts. No agentic browser has been proven safe for unsupervised enterprise use.

What is the difference between prompt injection and indirect prompt injection?

Direct prompt injection: a user types a malicious instruction. Indirect prompt injection: malicious instructions are embedded in content the AI reads — web pages, images, calendar invites, link previews. All six incidents in this article are indirect. The attacker never interacts with the AI directly.

How do I check whether my organisation’s AI assistant memory has been poisoned?

Microsoft published KQL Advanced Hunting queries for Microsoft Defender for Cloud to detect memory poisoning attempts. Organisations not using Defender do not have an equivalent standardised detection mechanism — the companion article on open-source scanning tools covers alternatives.

Which agentic browser is most secure right now — Atlas, Comet, or Dia?

No agentic browser has demonstrated immunity to indirect prompt injection. Atlas: most documented hardening, most documented vulnerabilities. Comet: most independent security research. Dia: Zenity threat model guidance with thinner product evidence. Frame this by evidence quality, not marketing.

What is a semantic attack, and why can’t traditional security tools stop it?

Miggo Security’s term for the Gemini calendar-invite exploit. A semantic attack is syntactically benign — ordinary language that passes any content filter — but semantically harmful when an LLM interprets it with tool permissions. WAFs and input sanitisation look for malicious patterns. Ordinary language has none.

What happens if someone hides malicious instructions in a calendar invite sent to a coworker using Gemini?

Miggo Security demonstrated this in January 2026. Gemini reads the event description as trusted context, summarises the target’s private meetings, writes the data into a new event visible to the attacker, and responds with “it’s a free time slot” to conceal the exfiltration.

Why does the same-origin policy not protect against agentic browser attacks?

The same-origin policy governs browser sandbox behaviour, not AI agent behaviour. As Giskard noted, Atlas “operates with cross-origin visibility… a privileged component with legitimate access to all browsing contexts.” The AI agent reads and acts on content from any origin because that is its designed function.

What is MITRE ATLAS AML.T0080, and why does it matter for AI memory poisoning?

AML.T0080 (Memory Poisoning / AI Agent Context Poisoning: Memory) is the MITRE ATLAS taxonomy entry for attackers injecting false data into an AI system’s persistent memory. The classification enables security teams to incorporate memory poisoning into threat models and compliance frameworks with documented mitigations and detection methods.

Is agentic browser security fundamentally unsolvable?

OpenAI CISO Dane Stuckey stated that prompt injection is “unlikely to ever be fully solved.” Effective controls exist: human-in-the-loop confirmation, logged-out browsing mode, network-layer isolation, SIEM monitoring, and restricting agent access to sensitive accounts. Defence in depth is the current best practice — not solved, but manageable with the right controls before deployment.

How does AI memory poisoning differ from traditional SEO poisoning?

SEO poisoning affects one search session. AI memory poisoning manipulates an AI assistant’s persistent memory, influencing every future interaction across every session and topic. Where SEO poisoning redirects a single query, memory poisoning redirects everything the AI does from that point on.

Where can I find the primary research sources for these agentic browser exploits?

Brave Security Research (Comet, Fellou, Opera Neon): brave.com/blog/unseeable-prompt-injections/. Miggo Security (Gemini calendar attack): miggo.io/blog/weaponizing-calendar-invites-a-semantic-attack-on-google-gemini. Microsoft Defender Research: microsoft.com/en-us/security/blog/2026/02/manipulating-ai-memory-ai-recommendation-poisoning/. PromptArmor/OpenClaw: theregister.com/2026/02/10/ai_agents_messaging_apps_data_leak/. Cisco Talos/MolthHub: blogs.cisco.com/ai/personal-ai-agents-like-openclaw-are-a-security-nightmare. Zenity Threat Model: zenity.io/blog/your-browser-is-becoming-an-agent-zenity-keeps-it-from-becoming-a-threat.