If you’ve been through the microservices transformation, you’re going to recognise what’s happening with AI right now. Monolithic applications gave way to microservices, unlocking scalability, specialisation, and independent deployment across the industry. AI systems are following the exact same path, moving from monolithic LLMs through single-agent tools to coordinated multi-agent AI orchestration.
And this isn’t theoretical hand-waving. 57% of organisations already have AI agents in production. The autonomous agent market is projected to reach $35 billion by 2030. If you’ve worked with microservices, you already understand the core patterns behind multi-agent orchestration—it’s distributed systems thinking applied to AI. By the end of this article, you’ll have a clear framework for deciding whether and when multi-agent orchestration is relevant to what you’re doing.
Multi-agent AI orchestration is the structured coordination of multiple autonomous AI agents working together to achieve complex, shared objectives. Single-agent systems consolidate all logic, context, and capability into one system. Multi-agent distributes specialised responsibilities across coordinated agents.
Each agent operates with its own reasoning loop. This enables independent decision-making and learning from outcomes. It’s different from traditional automation because agents are proactive rather than reactive. They adapt to changing conditions without you having to reprogram them.
The mechanism is called the TAO cycle: Think-Act-Observe. The agent analyses context and plans (Think), executes using available tools (Act), then evaluates outcomes and updates its understanding (Observe). This continuous loop enables agents to learn and improve without manual retraining.
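The TAO cycle can be sketched as a simple loop. Everything below is illustrative: the planner, tool, and evaluator are hypothetical stand-ins for what would be LLM calls and real tool integrations in practice.

```python
# Minimal sketch of a Think-Act-Observe (TAO) loop.
# think/act/observe are hypothetical stand-ins for LLM reasoning
# and real tool integrations.

def think(context: dict) -> str:
    """Think: analyse current context and plan the next action."""
    return "retry" if context.get("last_error") else "fetch"

def act(action: str) -> dict:
    """Act: execute the planned action using an available tool."""
    tools = {"fetch": lambda: {"data": 42}, "retry": lambda: {"data": 42}}
    return tools[action]()

def observe(context: dict, result: dict) -> dict:
    """Observe: evaluate the outcome and update the agent's understanding."""
    return dict(context, last_result=result, last_error=None)

def run_agent(context: dict, steps: int = 3) -> dict:
    for _ in range(steps):
        action = think(context)
        result = act(action)
        context = observe(context, result)
    return context

state = run_agent({})
print(state["last_result"])  # → {'data': 42}
```

The point of the sketch is the shape, not the contents: state persists across iterations, and each pass can change what the agent does next.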
The conversation has intensified because three enabling conditions have converged. LLM reasoning capability has matured, communication protocols are standardising, and orchestration frameworks have reached production readiness.
Use cases already in production? Customer service accounts for 26.5% of deployments, research and analysis represents 24.4%, and internal knowledge management is gaining traction.
The term “microservices moment” captures a specific pattern: the point where a distributed architecture becomes clearly superior to monolithic approaches for complex systems.
In software, the microservices moment arrived when containerisation (Docker), orchestration (Kubernetes), and API standardisation made it practical to decompose monoliths. For AI, an equivalent convergence is happening now. Advanced LLM reasoning provides the cognitive foundation. Protocol standards provide the communication layer. Frameworks provide the orchestration tooling.
Just as microservices didn’t replace all monoliths overnight, multi-agent will not replace all single-agent systems. The shift is towards using the right architecture for the right complexity level.
The “moment” is defined by market signals. Deloitte projects $35 billion by 2030, up from a projected $8.5 billion in 2026. Gartner predicts 40% of enterprise applications will incorporate agentic AI by 2026, up from less than 5% in 2025. And 93% of IT leaders report intentions to introduce autonomous agents within the next two years.
Both architectures evolved from the same pressure: complexity outgrows what a single unit can efficiently handle. Monolithic application to microservices mirrors monolithic LLM to single-agent to multi-agent. In both cases, the driver is specialisation, independent scaling, and fault isolation.
The analogy maps like this. Monolithic application maps to monolithic LLM. Service decomposition maps to agent specialisation. API contracts map to agent communication protocols. Service mesh maps to orchestration layer.
In microservices, services communicate through well-defined APIs with rigid contracts. In multi-agent systems, agents communicate through dynamic protocols that allow negotiation and adaptation. Instead of predefined API calls, agents exchange goals, capabilities, and results.
Microservices use centralised orchestration like service mesh or API gateway. Multi-agent systems offer three patterns: centralised (supervisor agent), decentralised (peer-to-peer negotiation), and hierarchical (nested delegation).
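The centralised pattern is the easiest to picture in code. A rough sketch, with illustrative agent roles and a deliberately naive routing rule, might look like this:

```python
# Sketch of the centralised (supervisor) orchestration pattern:
# a supervisor delegates subtasks to specialised worker agents
# and collects their results. Roles and routing are illustrative.

class Agent:
    def __init__(self, name: str, skill: str):
        self.name, self.skill = name, skill

    def handle(self, task: str) -> str:
        # Stand-in for a real agent's reasoning and tool use.
        return f"{self.name}: {self.skill}({task})"

class Supervisor:
    def __init__(self, workers: dict[str, Agent]):
        self.workers = workers

    def route(self, task: str) -> str:
        # Naive routing: pick the worker whose skill appears in the task.
        for skill, agent in self.workers.items():
            if skill in task:
                return agent.handle(task)
        raise ValueError(f"no agent can handle: {task}")

    def run(self, tasks: list[str]) -> list[str]:
        return [self.route(t) for t in tasks]

sup = Supervisor({
    "research": Agent("researcher", "research"),
    "summarise": Agent("writer", "summarise"),
})
results = sup.run(["research market data", "summarise findings"])
print(results)
```

A decentralised system would replace `Supervisor.route` with peer-to-peer negotiation; a hierarchical one would nest supervisors inside supervisors.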
The key difference? Microservices are stateless and execute predefined logic. Agents maintain state, learn from outcomes, and can modify their behaviour through the TAO cycle.
The emerging trend is hybrid systems, where microservices handle stable transactional workloads while agents handle reasoning and orchestration. Think of microservices as a collection of tools and agentic AI as something that knows when and why to use each one.
Shared challenges? Both face complexity in debugging distributed interactions, require robust observability, and demand new operational skills from teams.
Three categories of enabling technology have matured simultaneously. This convergence mirrors the Docker-Kubernetes-REST moment for microservices. Each technology alone was insufficient, but together they made the pattern practical.
First, LLM capability maturity. Models can now reason, plan, use tools, and maintain context across extended interactions. This provides the cognitive foundation each agent needs to operate autonomously.
Second, protocol standardisation. Anthropic’s Model Context Protocol (MCP) standardises how agents access tools and contextual data. Google’s Agent2Agent (A2A) protocol governs peer coordination and delegation. Cisco’s AGNTCY provides agent coordination standards for enterprise deployments. The industry is converging toward 2-3 dominant standards.
Third, framework proliferation. LangGraph, CrewAI, and AutoGen have reached production-grade maturity. They provide abstractions that simplify defining agent roles, goals, and communication patterns.
The protocol standardisation race determines whether the ecosystem will be open and interoperable or fragmented into vendor-specific walled gardens, where companies are locked into a single protocol and agent ecosystem.
AWS Bedrock and IBM Watsonx Orchestrate are embedding multi-agent capabilities. This reduces the engineering effort required for organisations without dedicated AI teams.
The autonomous AI agent market is projected to reach $35 billion by 2030, up from a projected $8.5 billion in 2026. IDC projects overall AI spending growth of 31.9% annually through 2029, reaching $1.3 trillion, with agentic AI as a primary growth driver.
The shift is already underway. 57% of organisations have AI agents in production. Gartner predicts 40% of enterprise applications will incorporate agentic AI by 2026, and by 2028, 33% of enterprise software will include agentic AI, enabling 15% of day-to-day work decisions to be made autonomously.
Guardian agents, focused on risk management and compliance, are expected to capture 10-15% of the agentic AI market by 2030. This signals institutional maturity—governance as an architectural component rather than an afterthought.
Now for the reality check. Gartner estimates more than 40% of agentic AI projects could be cancelled by 2027, due to unanticipated cost, complexity, or unexpected risks. Market growth does not guarantee organisational success. Careful assessment and realistic expectations are essential. Understanding why multi-agent projects fail and how to avoid the same mistakes is critical before committing resources.
Multi-agent orchestration isn’t universally the right choice. Specific problem characteristics determine whether it adds value or adds unnecessary complexity.
Problems that benefit from multi-agent? Context overflow, where a single agent can’t hold all relevant information. Specialisation conflicts, where one agent can’t be expert in all required domains. Parallel processing needs, where tasks can be decomposed and executed concurrently. And security boundary requirements, where different agents operate at different trust levels.
Problems better served by single-agent? Well-defined tasks with clear inputs and outputs. Low complexity requiring no domain specialisation. Cost-sensitive deployments where orchestration overhead is unjustified. And environments where simplicity of debugging outweighs coordination benefits.
Use Microsoft’s Azure Cloud Adoption Framework to evaluate security boundaries, team structure, growth trajectory, role complexity, time-to-market needs, and cost priorities. Build multiple agents when regulations mandate strict data isolation, when distinct teams manage separate knowledge areas, or when your solution roadmap spans more than three to five distinct functions.
Don’t assume role separation requires multiple agents. Often a single agent using persona switching and conditional prompting can satisfy role-based behaviour without added orchestration.
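Persona switching can be as simple as conditional prompt assembly. The personas and the routing rule below are illustrative assumptions, not a prescribed design:

```python
# Sketch of role separation inside a single agent: one conditional
# prompt template instead of multiple orchestrated agents.
# Personas and the routing condition are illustrative.

PERSONAS = {
    "support": "You are a patient support specialist. Cite documentation.",
    "sales": "You are a concise sales assistant. Focus on business value.",
}

def build_prompt(user_message: str) -> str:
    # Conditional prompting: derive the persona from the request itself.
    persona = "sales" if "pricing" in user_message.lower() else "support"
    return f"{PERSONAS[persona]}\n\nUser: {user_message}"

prompt = build_prompt("What does pricing look like for 50 seats?")
print(prompt.splitlines()[0])
```

If this pattern covers your role requirements, you have avoided an orchestration layer entirely.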
Organisational readiness matters. Does your team have distributed systems experience? Is there existing infrastructure for monitoring and observability? Is leadership prepared for the shift from deterministic processes to probabilistic outcomes?
A practical decision tree. If your use case involves fewer than three distinct domain specialisations, a single agent is likely sufficient. If it requires parallel processing across security boundaries with multiple knowledge domains, multi-agent becomes compelling. Comprehensive guidance on this decision, including specific problem categories and anti-patterns, is covered in a dedicated companion article.
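The decision tree can be encoded as a small helper. The thresholds and signals below simply mirror this article’s heuristics; they are not a formal standard:

```python
# Hedged encoding of the decision heuristics above. Thresholds
# (e.g. "fewer than three domains") follow the article, not a spec.

def recommend_architecture(domains: int,
                           needs_parallelism: bool,
                           crosses_security_boundaries: bool) -> str:
    if domains < 3 and not crosses_security_boundaries:
        return "single-agent"
    if needs_parallelism and crosses_security_boundaries and domains >= 2:
        return "multi-agent"
    return "single-agent first, revisit as complexity grows"

print(recommend_architecture(domains=1, needs_parallelism=False,
                             crosses_security_boundaries=False))
# → single-agent
```

Real assessments involve more dimensions (team readiness, cost, governance), but making the rule explicit forces the conversation the article recommends.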
Start with honest assessment of current maturity rather than aspiration. A strong single-agent implementation is better than a poorly governed multi-agent system.
ROI from multi-agent systems is real but not instant. 88% of US executives report seeing ROI from AI investments, yet Gartner’s projected 40% cancellation rate shows that poor planning still sinks a large share of initiatives. Successful implementations are seeing 5x-10x returns.
Typical adoption follows a crawl-walk-run pattern. Start with a single well-scoped agent, validate ROI, then progressively add agents as competence and infrastructure mature. Expect 6-12 months to achieve meaningful single-agent ROI before expanding to multi-agent.
ROI calculation should combine tangible savings (cost reduction, productivity gains, revenue growth) with intangible value (improved agility, faster time-to-market, employee satisfaction). Use the formula: ROI = (Net Benefit / Total Investment) x 100%, where costs include infrastructure, frameworks, team training, governance tooling, and ongoing observability.
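A worked example of the formula, under one reading where net benefit is total value (tangible plus estimated intangible) minus total investment. The figures are hypothetical:

```python
# Worked example of ROI = (Net Benefit / Total Investment) x 100%.
# Net benefit here nets out the investment; all figures are illustrative.

def roi_percent(tangible_savings: float, intangible_value: float,
                total_investment: float) -> float:
    net_benefit = tangible_savings + intangible_value - total_investment
    return net_benefit / total_investment * 100

# Hypothetical year one: $400k tangible savings, $100k estimated
# intangible value, $250k total cost (infrastructure, frameworks,
# training, governance tooling, observability).
print(round(roi_percent(400_000, 100_000, 250_000), 1))  # → 100.0
```

The intangible term is the contentious input; be explicit about how you estimate it, or report ROI with and without it.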
Measurable benefits? Cost reduction of $1-$4 saved per $1 spent, with 80% lower Tier-1 support costs documented. Productivity gains of 20-30% more output for same spend. Revenue growth of 10-30% increase in sales and conversions.
The hybrid systems approach is recommended. Microservices handle stable transactional workloads while agents progressively take over reasoning and orchestration. This approach forms a core principle of effective multi-agent AI orchestration strategies.
Human-in-the-loop patterns are necessary during early adoption. The autonomy spectrum moves from continuous human oversight to periodic review to monitored autonomy as trust matures.
Common failure patterns? Over-scoping initial implementations, underinvesting in observability, neglecting governance frameworks, and treating multi-agent as a technology project rather than an organisational change initiative.
Before evaluating tools or frameworks, ask “What specific problem would multi-agent solve that our current approach cannot?” If the answer is vague, the timing is wrong.
Assess architectural fit. Does your current workload involve multiple distinct domains that require specialised knowledge? Do you need parallel processing across trust boundaries? Is context overflow limiting your single-agent effectiveness?
Evaluate team readiness. Does your engineering team have experience with distributed systems, event-driven architectures, or microservices patterns? If not, invest in foundational skills before multi-agent adoption.
Consider governance requirements. How will you monitor agent decisions? What compliance requirements apply? Where do human approval gates need to exist in your workflows?
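Approval gates can start as a simple policy check in front of agent actions. The risk scoring and policy shape below are illustrative assumptions:

```python
# Sketch of a human-in-the-loop approval gate: high-impact or
# sensitive agent actions pause for review, the rest proceed.
# The policy structure and impact scale are illustrative.

def requires_approval(action: dict, policy: dict) -> bool:
    return (action["impact"] >= policy["impact_threshold"]
            or action["type"] in policy["always_review"])

policy = {"impact_threshold": 7, "always_review": {"refund", "data_export"}}

print(requires_approval({"type": "refund", "impact": 2}, policy))  # → True
print(requires_approval({"type": "reply", "impact": 3}, policy))   # → False
```

As trust matures along the autonomy spectrum, you loosen the policy rather than rewrite the agents.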
Map the ecosystem. Which orchestration frameworks (LangGraph, CrewAI, AutoGen) align with your existing technology stack? Which communication protocols (MCP, A2A) does your tooling ecosystem support?
Challenge vendor narratives. Every platform vendor is positioning multi-agent capabilities. Distinguish between genuine orchestration and rebranded workflow automation.
Plan the exit. If multi-agent doesn’t deliver expected outcomes, what’s your fallback? A well-designed single-agent system should remain viable as a graceful degradation path.
Identify who in your C-suite will own your organisation’s AI agent vision and strategy with aligned incentives and accountability. Stress-test orchestrations rigorously before scaling. Simulate with real complexities—incomplete data, conflicting goals, or adversarial scenarios.
For organisations ready to proceed, getting started with multi-agent implementation requires a structured three-phase approach with pilot project selection and realistic ROI expectations.
Traditional workflow automation follows pre-defined, deterministic sequences where each step is explicitly programmed. Multi-agent AI uses autonomous agents that can reason, adapt, and make independent decisions through the TAO cycle (Think-Act-Observe). Agents can negotiate with each other, handle unexpected inputs, and learn from outcomes. This makes them suited to complex, dynamic tasks that workflow automation can’t handle.
The analogy is imprecise in two key areas. First, microservices are stateless and execute pre-defined logic, while agents maintain state, learn from outcomes, and modify their behaviour over time. Second, microservices communicate through rigid API contracts, while agents can negotiate dynamically through evolving protocols. The analogy is strongest for understanding decomposition, specialisation, and independent scaling.
The TAO cycle (Think-Act-Observe) is the reasoning loop that makes agents autonomous. In the Think phase, the agent analyses its current context and plans actions. In the Act phase, it executes those actions using available tools. In the Observe phase, it evaluates outcomes and updates its understanding. This continuous loop enables agents to learn and improve without manual retraining. It’s what differentiates them from static automation.
Most small companies are better served by well-implemented single-agent systems. Multi-agent orchestration adds coordination overhead, observability requirements, and governance complexity that typically only pays off when workloads involve multiple distinct knowledge domains, security boundaries, or parallel processing needs. Start with a single agent, validate ROI, and only expand when specific limitations of the single-agent approach become evident.
Three major protocols are emerging. Anthropic’s Model Context Protocol (MCP) for standardising how agents access tools and contextual data. Google’s Agent2Agent (A2A) for inter-agent communication. And Cisco’s AGNTCY for agent coordination standards. The industry is converging toward 2-3 dominant standards, similar to how REST and gRPC became dominant in the microservices era.
Use the formula: ROI = (Net Benefit / Total Investment) x 100%, where Net Benefit combines tangible savings (cost reduction, productivity gains, revenue growth) with intangible value (improved agility, time-to-market, satisfaction). Include costs for infrastructure, frameworks, team training, governance tooling, and ongoing observability. Most organisations should expect 6-12 months to achieve meaningful single-agent ROI before expanding to multi-agent.
Guardian agents are specialised agents focused on risk management, compliance validation, and governing the behaviour of other agents. They monitor agent decisions, enforce policy constraints, and prevent unsafe actions. Gartner expects guardian agents to capture 10-15% of the agentic AI market by 2030. This indicates that governance is becoming an architectural component rather than an afterthought.
Gartner estimates a 40% cancellation rate for agentic AI projects. The main reasons? Over-scoping, unrealistic expectations, insufficient governance, and treating the initiative as a technology project rather than an organisational change effort. Success rates improve substantially when organisations adopt a phased approach, starting with well-scoped single-agent implementations before expanding to multi-agent coordination.
Running multiple chatbots means operating several independent, disconnected systems that don’t communicate or coordinate. Multi-agent orchestration means those agents share context, delegate tasks to each other, negotiate approaches, and work toward shared objectives through defined patterns (centralised, decentralised, or hierarchical). The orchestration layer is what transforms independent agents into a coordinated system.
Teams need distributed systems expertise (similar to microservices experience), understanding of event-driven architectures, familiarity with at least one orchestration framework (LangGraph, CrewAI, or AutoGen), comfort with probabilistic rather than deterministic outcomes, and skills in observability and monitoring for non-linear agent interactions. If your team has microservices experience, the transition is more natural.
Market evidence supports a genuine shift. 57% of organisations have agents in production, the market is projected to reach $35 billion by 2030, and major technology platforms are embedding native multi-agent capabilities. However, not every organisation needs multi-agent architecture, and the 40% project cancellation rate indicates that hype-driven adoption without clear use-case fit is a real risk.
The agent layer sits above microservices infrastructure. It uses existing services for stable transactional workloads while adding autonomous decision-making and dynamic coordination. This hybrid approach means organisations don’t need to replace their microservices. Agents consume and orchestrate existing services while handling the reasoning, planning, and adaptive coordination that static service orchestration can’t provide.
Software Supply Chain Security After SolarWinds and XZ Utils

In December 2020, approximately 18,000 organisations received backdoored software through routine SolarWinds Orion updates. The attackers were Russian intelligence. Their malware lived inside digitally signed, legitimate-looking updates for over 14 months before anyone noticed. Just over three years later, a Microsoft engineer happened to notice that SSH connections were taking 500 milliseconds longer than expected. That anomaly led to the discovery of a CVSS 10.0 backdoor planted in the XZ Utils compression library through a multi-year social engineering campaign targeting a burned-out volunteer maintainer.
These aren’t edge cases. Third-party breaches now account for 30% of all data breaches, doubling year-over-year. Over 75% of organisations experienced a software supply chain attack within the past year. Malicious open source packages have grown by 156% annually, with 1,300% growth between 2020 and 2023. Global costs from supply chain compromise are projected to hit $60 billion in 2025, escalating to $138 billion by 2031.
If you’re responsible for a software product and the team that builds it, supply chain security is no longer something you can delegate to “the security people” and forget about. It requires strategic decisions about tooling, process changes, and where you spend money. This guide covers the threat landscape, the frameworks that emerged in response, and the practical steps that actually reduce risk.
It also serves as a navigation hub to nine detailed companion articles, linked throughout the sections below.
Let’s start with the fundamentals.
Software supply chain security protects the development, build, and distribution processes from compromise. Unlike traditional attacks targeting running applications, supply chain attacks inject malicious code during development or distribution, affecting all downstream users. The 2020 SolarWinds breach (18,000 organisations compromised via malicious build system updates) and 2024 XZ Utils backdoor (multi-year social engineering of a maintainer) demonstrate how adversaries shifted from endpoint attacks to upstream compromise, scaling impact exponentially.
The economic truth is stark: compromising a single widely-used component reaches thousands of targets at once. Your average SaaS application pulls in around 203 direct dependencies and over 1,200 transitive ones. Each is a potential entry point. More than 70% of organisations experienced at least one material third-party cybersecurity incident in the past year. Average breach costs run to $4.44 million globally, with US breaches reaching $10.22 million.
Three defining incidents reshaped the landscape. The SolarWinds SUNBURST attack showed nation-states can compromise build systems to distribute malware through trusted update channels. The XZ Utils backdoor demonstrated that social engineering of volunteer maintainers is a viable multi-year attack strategy. And the tj-actions GitHub Actions compromise proved that CI/CD pipelines themselves are high-value targets where mutable tags can be weaponised to steal secrets from thousands of repositories.
These incidents catalysed the adoption of frameworks like SLSA for build integrity and SBOM for component transparency, while exposing critical vulnerabilities in maintainer sustainability and dependency management practices.
Supply chain attacks exploit trust relationships in software development. Attack vectors include: (1) compromising build systems to inject malware into legitimate updates (SolarWinds model), (2) social engineering maintainers to gain commit access (XZ Utils model), (3) dependency confusion attacks uploading malicious packages with trusted names, and (4) compromising CI/CD pipelines to steal credentials or modify artefacts. Each model bypasses traditional security controls by operating within trusted processes.
Understanding these four models helps you figure out where your defences need strengthening.
Build system compromise targets the infrastructure that compiles and packages software. Attackers gain access through credential theft or insider access, inject malicious code during compilation, and the resulting software is signed with legitimate certificates and distributed through official channels.
Social engineering takeover targets the people who maintain software. Attackers build trust over months or years through legitimate contributions, exploit maintainer burnout, and eventually gain commit access to introduce backdoors disguised as benign refactoring.
CI/CD pipeline attacks target the automation that builds and deploys software. Platforms like GitHub Actions hold production credentials. Mutable tags allow attackers to substitute malicious code after workflows already reference those tags, and a compromised action can dump environment secrets to logs or exfiltrate them externally.
Dependency confusion targets the package resolution process. Attackers upload a malicious package to a public registry using a name matching a private internal package. Build tools that prioritise public registries download the malicious version instead.
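The dependency-confusion mechanics can be simulated in a few lines. The registries and package names below are made up; the vulnerable part is the resolution policy:

```python
# Simulation of dependency confusion: a resolver that simply picks
# the highest version across registries selects the attacker's public
# package over the internal one. Registry contents are illustrative.

def resolve(name: str, registries: list[dict]) -> tuple[str, str]:
    """Return (registry, version) under a naive highest-version policy."""
    candidates = [(reg["name"], reg["packages"][name])
                  for reg in registries if name in reg["packages"]]
    # Vulnerable policy: highest version wins, wherever it lives.
    return max(candidates, key=lambda c: tuple(map(int, c[1].split("."))))

internal = {"name": "internal", "packages": {"acme-billing": "1.4.0"}}
public = {"name": "public", "packages": {"acme-billing": "99.0.0"}}  # attacker upload

source, version = resolve("acme-billing", [internal, public])
print(source, version)  # → public 99.0.0
```

The mitigation is to make resolution order explicit: scope private package names, pin your internal registry as the only source for them, and never let version comparison decide across trust boundaries.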
The common thread is that all four models exploit trust. Signed artefacts, legitimate maintainers, approved actions, public registries — these are things your systems are configured to accept. When malicious code arrives through these authorised channels, it bypasses the security controls you already have in place.
Defending against these attack vectors requires a layered approach. Hardening your CI/CD pipelines addresses build system and workflow compromise. Selecting the right security tools provides the detection and analysis capabilities needed to spot malicious packages before they enter your codebase.
For detailed technical breakdowns of how each model plays out in practice, see the next section and our dedicated analyses of SolarWinds build infiltration, the XZ Utils social engineering campaign, and responding to the tj-actions compromise.
SolarWinds (2020): Russian APT29 compromised SolarWinds’ build system, injecting SUNBURST malware into Orion platform updates signed with legitimate certificates. 18,000+ organisations received backdoored software, with attackers selectively activating malware in ~100 high-value targets including US Treasury and Commerce departments. XZ Utils (2024): Multi-year social engineering campaign gained maintainer access to xz compression library, inserting CVSS 10.0 backdoor targeting OpenSSH authentication. Discovered accidentally via 500ms SSH delay, narrowly preventing Linux distribution compromise.
These two incidents illustrate the two ends of the supply chain attack spectrum: infrastructure compromise and human exploitation. Understanding both clarifies why no single defensive measure is sufficient.
SolarWinds SUNBURST (December 2020) was a Russian intelligence operation. APT29 compromised SolarWinds’ build system, injecting the SUNBURST backdoor into Orion platform updates signed with legitimate certificates. Around 18,000 organisations received compromised updates, though attackers selectively activated the backdoor in roughly 100 high-value targets including the US Treasury, Commerce Department, and Department of Homeland Security. The attackers managed their intrusion through multiple US-based servers and mimicked legitimate network traffic, evading detection for over 14 months. FireEye discovered the campaign in December 2020 when attackers stole their red team tools.
The regulatory response was immediate and structural. Executive Order 14028 mandated Software Bills of Materials (SBOM) for federal software procurement. Google proposed the SLSA framework shortly after, providing a structured approach to build integrity and provenance tracking.
XZ Utils Backdoor CVE-2024-3094 (March 2024) took a completely different approach. A pseudonymous actor called “Jia Tan” created an identity in 2021, earned commit access by 2022 through legitimate contributions, and inserted a CVSS 10.0 backdoor in February 2024. The backdoor targeted OpenSSH authentication through obfuscated test files, affecting xz-utils versions 5.6.0 and 5.6.1 that had already landed in Fedora 41 and Debian testing.
The discovery was accidental. Andres Freund, a Microsoft engineer, noticed a 500-millisecond SSH delay during performance testing and investigated. Security researcher Alex Stamos called it potentially “the most widespread and effective backdoor ever planted in any software product.” The incident ignited a discussion about whether critical cyberinfrastructure should depend on unpaid volunteers.
Both attacks demonstrated nation-state sophistication and multi-year planning. Where SolarWinds targeted build infrastructure, XZ Utils targeted maintainer psychology. Together, they catalysed the framework adoption and regulatory responses that now define the supply chain security landscape.
The SolarWinds attack drove immediate adoption of build integrity frameworks and SBOM requirements. The XZ Utils incident exposed how maintainer burnout creates security vulnerabilities and highlighted gaps in dependency management practices.
For detailed analyses, see the SolarWinds SUNBURST case study and the XZ Utils backdoor deep dive.
SLSA (Supply-chain Levels for Software Artefacts, pronounced “salsa”) is a security framework providing four maturity levels for build integrity, from basic provenance (Level 1) to tamper-resistant builds (Level 3-4). SBOM (Software Bill of Materials) inventories all software components, versions, and dependencies. SLSA ensures builds aren’t tampered with; SBOM ensures you know what’s in your software. Together, they provide build integrity and component transparency, enabling rapid vulnerability response and meeting regulatory requirements (Executive Order 14028).
Two complementary frameworks emerged from these incidents. SLSA handles build integrity. SBOM handles component transparency. You need both, but you don’t need to implement them simultaneously.
SLSA (Supply-chain Levels for Software Artefacts, pronounced “salsa”) is a security framework developed by Google and maintained by the OpenSSF. It defines four maturity levels for build integrity, with Level 0 as the implicit baseline, ranging from basic provenance at Level 1 to tamper-resistant, hardened builds at Levels 3 and 4.
The practical takeaway: SLSA Level 2 is achievable for most teams in two to four weeks using hosted build services. It would have made the SolarWinds-style build tampering detectable. Levels 3 and 4 require substantial infrastructure investment and are typically justified only for high-security environments or specific compliance requirements.
SBOM (Software Bill of Materials) is a comprehensive inventory of all software components, dependencies, and metadata within an application. Think of it as a nutritional label for software. Executive Order 14028 requires SBOM for federal software procurement, and CISA has published 2025 minimum elements: author, supplier, component name, version, dependencies, unique identifier, and timestamp.
Two format standards dominate. CycloneDX is the OWASP standard, security-focused and extensible for vulnerability enrichment. It’s the recommended choice for most teams. SPDX is an ISO standard with comprehensive licensing metadata, preferred in regulated industries and some government contracts.
The operational value is immediate. Organisations with SBOMs pinpointed Log4j-affected components within hours of disclosure. Those without faced weeks of manual inventorying.
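That response speed comes from a mechanical lookup. A minimal sketch, using a simplified CycloneDX-style structure and invented advisory data, shows the shape of the query:

```python
# Sketch of matching an SBOM against a vulnerability advisory.
# The SBOM structure is a simplified CycloneDX-style document;
# component names, versions, and the advisory are illustrative.
import json

sbom_json = """
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "log4j-core", "version": "2.14.1"},
    {"name": "jackson-databind", "version": "2.13.0"}
  ]
}
"""

advisory = {"name": "log4j-core", "affected_below": "2.15.0"}

def affected(sbom: dict, adv: dict) -> list[dict]:
    def ver(v: str) -> tuple:
        return tuple(int(x) for x in v.split("."))
    return [c for c in sbom["components"]
            if c["name"] == adv["name"]
            and ver(c["version"]) < ver(adv["affected_below"])]

hits = affected(json.loads(sbom_json), advisory)
print([c["version"] for c in hits])  # → ['2.14.1']
```

With SBOMs for every artefact, "are we affected?" becomes a query across documents rather than a manual inventory exercise.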
Here’s how the two frameworks relate: SLSA answers “Was this artefact tampered with during build?” SBOM answers “What components are in this artefact and do they have known vulnerabilities?” Neither is sufficient alone. Together with CI/CD security hardening and the right security scanning tools, they form a comprehensive defensive posture.
Start with SBOM. It’s easier to implement, delivers immediate regulatory and operational value, and provides the foundation for everything else. Then progress to SLSA Level 2. For detailed implementation guidance, see the SLSA build integrity guide and the SBOM implementation roadmap.
Open source maintainer burnout creates security vulnerabilities. Overwhelmed volunteer maintainers are susceptible to social engineering (XZ Utils), abandon projects creating unmaintained dependencies, and lack resources for timely security patching. Data shows foundation-supported projects (Apache, CNCF, Eclipse) remediate vulnerabilities 264 days faster than independent projects. Paid maintainer programs (Tidelift, Open Source Pledge) deliver 3x fewer unfixed vulnerabilities and 55% higher security practice adoption. Maintainer sustainability is infrastructure security investment.
The XZ Utils attack worked because a volunteer maintainer was overwhelmed, and that’s not an isolated case. The Kubernetes Ingress NGINX controller was reduced to a single part-time maintainer. The External Secrets Operator nearly collapsed until CNCF intervened. Industry-wide, 67% of critical open source infrastructure is maintained by volunteers with no organisational backing.
Governance and funding models directly affect security outcomes. Foundation-supported projects under Apache, Eclipse, and CNCF show 264 days faster median remediation time and 73% adoption of security best practices compared to 28% for independent projects. Paid maintainer programs like Tidelift deliver 45% faster security issue resolution and three times fewer unfixed vulnerabilities over 12 months.
Burnout creates a specific attack surface. Overwhelmed maintainers are susceptible to social engineering. They skip code reviews. When they quit without succession planning, dependencies become unmaintained with unfixed vulnerabilities accumulating silently.
What does this mean practically? Prefer projects with foundation backing or paid maintainer models. Use OpenSSF Scorecard to assess project health before adoption. Watch for red flags: single maintainer, no commits in six months or more, unanswered security issues. And consider budgeting for maintainer sponsorship — the Open Source Pledge recommends $2,000 per developer per year. Sponsoring the maintainers of libraries your product depends on is an infrastructure investment that directly reduces your attack surface.
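The red flags above are easy to screen for programmatically once you have repository metadata in hand. Here is a minimal sketch, assuming you have already gathered that metadata elsewhere (from OpenSSF Scorecard output, a forge API, or a manual audit); the dictionary keys are illustrative, not any particular API's:

```python
from datetime import datetime, timedelta

def health_red_flags(repo, now=None):
    """Return the red flags (described above) that a project trips.

    `repo` is a plain dict of metadata gathered elsewhere; the key
    names here are illustrative assumptions, not a real API schema.
    """
    now = now or datetime.utcnow()
    flags = []
    if repo.get("active_maintainers", 0) <= 1:
        flags.append("single maintainer")
    last_commit = repo.get("last_commit")
    if last_commit is None or now - last_commit > timedelta(days=183):
        flags.append("no commits in six months or more")
    if repo.get("open_security_issues_unanswered", 0) > 0:
        flags.append("unanswered security issues")
    return flags

# Hypothetical project metadata for illustration
repo = {
    "active_maintainers": 1,
    "last_commit": datetime(2020, 1, 1),
    "open_security_issues_unanswered": 2,
}
print(health_red_flags(repo, now=datetime(2021, 1, 1)))
# → ['single maintainer', 'no commits in six months or more', 'unanswered security issues']
```

A check like this belongs in your dependency intake process, alongside the Scorecard score, rather than as a one-off audit.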
The XZ Utils backdoor demonstrated how catastrophically maintainer burnout can be exploited. Understanding how foundation support reduces security risk helps you make better dependency selection decisions and prioritise maintainer sustainability in your security strategy.
Active dependency management reduces risk 7x compared to passive approaches. The persistent risk framework explains why 95% of vulnerable downloads had fixes available: organisations fail to update, not because patches are unavailable. Implement automated updates (Dependabot, Renovate) with appropriate pinning strategies: applications should pin exact versions with automated updates; libraries should specify minimum compatible versions. Measure dependency age using the libyears metric; target <2 years for critical dependencies.
One statistic defines the dependency management problem: 95% of vulnerable downloads had fixes already available. Organisations simply do not apply the patches they have.
Sonatype’s persistent risk framework breaks this down into two components. Unfixed risk covers vulnerabilities without available patches, accounting for roughly 5% of vulnerable downloads. Corrosive risk covers vulnerabilities where fixes exist but organisations haven’t applied them, accounting for the other 95%. The Log4Shell vulnerability illustrates this: three years after CVE-2021-44228 was published, 13% of Java applications remained vulnerable despite patches being available within days of disclosure.
Active dependency management reduces vulnerability exposure by seven times compared to passive approaches. Passive means manual updates when incidents occur. Active means automated scanning, continuous updates, and policy-driven acceptance criteria using tools like Dependabot or Renovate.
Your pinning strategy matters. Applications deployed to production should pin exact versions with lock files, using automated updates with test suites to control timing. Libraries consumed by other code should specify minimum compatible versions with upper bounds. Never use wildcard ranges.
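The pinning policy above can be enforced mechanically in a lint step. This sketch classifies pip-style version specifiers against that policy; the syntax rules are an assumption for illustration and would need adapting for other ecosystems (npm, Cargo, Maven):

```python
import re

def classify_specifier(spec: str) -> str:
    """Rough classification of a version specifier against the policy
    above: exact pins for applications, bounded ranges for libraries,
    wildcards never. Assumes pip-style specifier syntax."""
    spec = spec.strip()
    if spec in ("*", "") or "*" in spec:
        return "wildcard: never use"
    if re.fullmatch(r"==\d+(\.\d+)*", spec):
        return "exact pin: suitable for applications"
    if ">=" in spec and "<" in spec:
        return "bounded range: suitable for libraries"
    return "unbounded or unusual: review"

print(classify_specifier("==1.26.4"))     # exact pin
print(classify_specifier(">=1.21,<2.0"))  # bounded range
print(classify_specifier("*"))            # wildcard
```

In practice a lock file plus automated update PRs gives applications the same effect with less manual policing.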
The libyears metric tracks dependency health by measuring how many years behind the current release each dependency sits. Target less than two libyears for critical dependencies and less than five for non-critical ones. And remember the transitive dependency challenge: your average application has 203 direct dependencies but over 1,200 transitive ones. A vulnerability buried three levels deep still affects your application, and you need an SBOM to even see it.
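The libyears calculation itself is simple: for each dependency, take the time between the release date of the version you use and the release date of the latest version, and sum across dependencies. A minimal sketch, with made-up release dates for illustration:

```python
from datetime import date

def libyears(dependencies):
    """Sum of years each dependency lags behind its newest release --
    the libyears metric described above. Each entry carries the release
    date of the version in use and of the latest available version."""
    total = 0.0
    for dep in dependencies:
        lag = (dep["latest_release"] - dep["used_release"]).days / 365.25
        total += max(lag, 0.0)  # a dep ahead of "latest" contributes zero
    return total

# Hypothetical dependency data for illustration
deps = [
    {"name": "left-pad", "used_release": date(2020, 3, 1), "latest_release": date(2024, 3, 1)},
    {"name": "requests", "used_release": date(2023, 6, 1), "latest_release": date(2024, 1, 1)},
]
print(round(libyears(deps), 2))  # → 4.59
```

Against the targets above, this two-dependency example already exceeds the two-libyear threshold for critical dependencies, driven almost entirely by the first entry.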
When evaluating dependencies, also consider maintainer sustainability indicators — foundation-backed projects remediate vulnerabilities 264 days faster than independent projects. The combination of active dependency management and appropriate security tooling reduces your exposure significantly.
For the full persistent risk framework and tool comparison, see understanding persistent risk in dependency management and the supply chain security tool selection framework.
Immediate response: (1) Identify affected repositories and workflow runs via audit logs, (2) rotate all secrets accessible to compromised workflows (GitHub secrets, cloud credentials, API keys), (3) switch from mutable tags (v1) to commit SHA pinning (uses: actions/checkout@a81bbbf8298c0fa03ea29cdc473d45769f953675) for all third-party actions, (4) audit workflow logs for evidence of secret exfiltration, (5) implement OIDC for future credential access eliminating long-lived secrets.
The tj-actions incident in March 2025 demonstrated that CI/CD compromise is not theoretical. If you used tj-actions/changed-files at any version, your secrets may have been exposed through runtime memory dumping and log-based exfiltration. The response pattern applies to any CI/CD compromise.
Day one: contain and rotate. Identify affected repositories through audit logs. Rotate all secrets accessible to compromised workflows — GitHub secrets, cloud credentials, API keys, deploy tokens. Switch from mutable tags to commit SHA pinning for all third-party actions.
Week one: harden. Audit all third-party actions across your organisation. Implement branch protection rules requiring pull request reviews for workflow changes. Enable tag protection. Deploy runtime monitoring with StepSecurity Harden-Runner.
Weeks two through four: prevent recurrence. Migrate from long-lived secrets to OIDC for cloud access. Create an approved action allowlist. Document an incident response playbook for future CI/CD compromises.
The highest-return preventive measure costs nothing: replacing mutable tag references like actions/checkout@v3 with full commit SHA references like actions/checkout@8e5e7e5ab.... This single change eliminates the entire class of tag-based attacks. As the SolarWinds attack showed with build systems, CI/CD pipelines are high-value targets that require comprehensive hardening.
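Auditing your workflows for mutable references is straightforward to script. A minimal sketch, assuming the standard `uses: owner/repo@ref` syntax of GitHub Actions workflow files; treat the regex as a starting point, not a complete parser:

```python
import re

# Matches `uses:` lines referencing actions. A 40-character lowercase
# hex ref is a commit SHA pin; anything else (v3, main, a tag) is
# mutable and should be replaced.
USES_RE = re.compile(r"uses:\s*([\w.-]+/[\w.-]+)@(\S+)")

def mutable_refs(workflow_text: str):
    """Return (action, ref) pairs in a workflow that are not SHA-pinned."""
    findings = []
    for action, ref in USES_RE.findall(workflow_text):
        if not re.fullmatch(r"[0-9a-f]{40}", ref):
            findings.append((action, ref))
    return findings

workflow = """
steps:
  - uses: actions/checkout@v3
  - uses: actions/checkout@a81bbbf8298c0fa03ea29cdc473d45769f953675
"""
print(mutable_refs(workflow))  # → [('actions/checkout', 'v3')]
```

Run something like this across every repository in the organisation during week one; each finding is a one-line fix.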
For the complete incident response checklist, see responding to the tj-actions compromise. For comprehensive preventive hardening, see the GitHub Actions hardening guide.
Prioritise based on company size. 50-100 employees: Free tier baseline (GitHub Dependabot, OWASP Dependency-Track, OpenSSF Scorecard). 100-300 employees: Add commercial SCA (Snyk, Socket Security) for developer-friendly scanning and automated fix PRs ($50-100/developer/year). 300-500+ employees: Consider platform consolidation (Sonatype Nexus, JFrog Xray) integrating artefact management with security scanning (enterprise pricing). All companies: Implement SBOM generation (free via Syft/cdxgen) and commit SHA pinning in CI/CD (free, highest ROI).
Tool selection anxiety is real. There are dozens of options across free, commercial, and enterprise tiers. Here’s how to cut through it based on your organisation’s size.
All teams (free tier baseline, 1-2 weeks setup): GitHub Dependabot for automated dependency updates, OWASP Dependency-Track for SBOM consumption, OpenSSF Scorecard for dependency health assessment, and Syft or cdxgen for SBOM generation. Costs nothing. Limitation: manual triage and basic remediation guidance.
100 to 300 employees ($50-150 per developer per year): When manual triage causes productivity loss or compliance requires audit trails, upgrade to Snyk (developer-friendly UX, automated fix PRs) or Socket Security (deep package behaviour analysis, malware detection beyond known CVEs).
300 or more employees (~$100-200 per developer per year): Sonatype Nexus Lifecycle or JFrog Xray for organisations needing central artefact governance with integrated security scanning and compliance reporting.
The decision framework: Start free. Validate workflows. Upgrade when manual triage costs more than tooling. And don’t neglect the free wins — commit SHA pinning, OIDC migration, and branch protection rules deliver immediate risk reduction at zero cost. Combined with SLSA Level 2 implementation, you can achieve substantial security improvements within your first month.
For the comprehensive comparison including implementation effort and total cost of ownership, see the tool selection framework.
If you’re starting from scratch, begin with the incident case studies to understand the threat landscape. Then move to frameworks and standards for the defensive response. Finish with the implementation guides for CI/CD hardening, dependency management, and tool selection.
Budget $50-150 per developer per year for commercial tooling once you outgrow the free tier, plus $2,000 per developer per year for maintainer sponsorship through the Open Source Pledge. Start with free tools (Dependabot, OWASP Dependency-Track, Syft) to validate workflows first. The highest ROI comes from free controls: commit SHA pinning, OIDC migration, and branch protection rules. See the tool selection framework for cost breakdowns by company size.
CycloneDX for most teams: OWASP standard, security-focused, excellent tooling support through Syft, cdxgen, and OWASP Dependency-Track. Choose SPDX if you’re in a regulated industry or have government contracts requiring the ISO standard. Both meet CISA 2025 minimum elements. The real decision factor is downstream tooling compatibility — if your consumption platform (Dependency-Track, Snyk, etc.) prefers one format, let that guide your choice. See the SBOM implementation roadmap for detailed selection criteria.
Request their SBOM and SLSA attestations for all production artefacts. Review their CI/CD pipeline configuration for commit SHA pinning and OIDC usage. Check their OpenSSF Scorecard results for critical dependencies. Ask about their incident response playbook for supply chain compromises and whether they’ve executed it. A target without answers to these questions carries hidden technical debt that will become your problem post-acquisition. See the tool selection framework for what a mature supply chain security setup looks like.
SBOM is data: an inventory of components in a machine-readable format. SCA (Software Composition Analysis) is analysis: continuous scanning of that SBOM data against vulnerability databases to identify risks. SBOM enables SCA. SCA consumes SBOM. The workflow: generate SBOM during build, import into an SCA tool (Dependency-Track, Snyk), monitor for new CVEs, prioritise remediation. See the SBOM implementation guide and tool selection framework.
SOC 2 Trust Services Criteria and ISO 27001 Annex A both include controls for supplier relationships and software development security, but neither specifically mandates SLSA or SBOM. In practice, generating SBOMs and achieving SLSA Level 2 satisfies multiple audit requirements around change management, asset inventory, and vendor risk. If you’re pursuing either certification, building supply chain security into your compliance programme from the start avoids retrofitting controls later. See the SBOM implementation roadmap for how SBOM maps to compliance requirements.
Level 2: yes. GitHub Actions meets Level 2 requirements including provenance generation and signed attestations. Level 3: partially. It requires hardened build platforms with source/build isolation that GitHub-hosted runners don’t fully provide. Target Level 2 first and defer Level 3 until the baseline is mature. Common blockers for Level 3 include configuring hermetic builds, setting up isolated build environments, and implementing two-party review requirements — each of which typically requires dedicated infrastructure beyond what hosted CI/CD platforms offer. See the SLSA implementation guide for the level-by-level roadmap.
Supply Chain Security Tool Selection Framework: Comparing Snyk, Dependabot, and Open Source Alternatives

Since SolarWinds and XZ Utils turned supply chain attacks into headline news, the software supply chain security tools market has exploded. The problem isn’t finding tools—it’s cutting through the noise of too many overlapping options and working out which ones you actually need.
You’ve got dependency update tools like Dependabot and Renovate competing with vulnerability scanners like Snyk and Trivy, malicious package detectors like Socket and Phylum, and enterprise platforms like Sonatype and JFrog. Most comparison articles target either solo developers or enterprises with unlimited budgets. If you’re in that 50-500 employee SMB space, you’re pretty much left to work it out yourself.
This article gives you a structured decision framework that maps your organisational characteristics to actual tool recommendations across three tiers: free, commercial mid-market, and enterprise. Answer five questions and you’ll know which tier is right for you, with transparent cost and implementation effort estimates.
This guide is part of our comprehensive software supply chain security landscape, where we explore defensive frameworks, operational practices, and the systemic challenges facing modern development teams.
Software Composition Analysis is the umbrella term, but underneath you’ll find three distinct tool categories that address different supply chain risks. Individual SCA tools typically cover one or more of these layers, but rarely all three.
Layer 1 is dependency update automation. Tools like Dependabot and Renovate keep your dependencies current by automatically creating pull requests when new versions drop. They reduce the window of exposure but don’t scan for vulnerabilities directly.
Layer 2 is vulnerability scanning and analysis. Snyk, OWASP Dependency-Track, Trivy, and Grype identify known vulnerabilities (CVEs) sitting in your dependencies. Commercial tools add reachability analysis to cut down false positives by working out whether vulnerable code paths actually get executed in your app.
Layer 3 is malicious package detection. Socket Security and Phylum detect intentionally malicious code that’s been injected into packages—which is a fundamentally different threat from accidental vulnerabilities. Traditional SCA tools won’t catch a deliberately planted backdoor.
Users constantly conflate these categories. They expect Snyk to handle updates or Dependabot to catch malicious packages. Comprehensive coverage means combining tools from multiple layers.
Enterprise platforms like Sonatype Nexus and JFrog Artifactory/Xray bundle multiple layers with repository management, policy enforcement, and compliance reporting. They control what enters your development environment rather than just flagging problems after you’ve already downloaded them.
OpenSSF Scorecard sits alongside these categories as a project health assessment tool. It evaluates upstream dependency risk rather than scanning your code directly.
What matters: no single tool covers all three layers. Even commercial platforms have coverage gaps in malicious package detection. You’re building a stack, not buying a solution.
The free tier stack—Dependabot + OWASP Dependency-Track + OpenSSF Scorecard—provides meaningful baseline coverage at zero licensing cost. Implementation takes 1-2 weeks. For teams under 50 developers working primarily on GitHub, this delivers 70-80% of the practical security value you’d get from commercial alternatives.
Dependabot is GitHub’s native dependency update tool. Enable it in your GitHub repository settings and you’re done in 30 minutes. The limitations are real though. It’s GitHub-only, groups updates poorly, and creates individual PRs that fragment developer attention.
OWASP Dependency-Track is the heavyweight in the free tier. It’s an open source platform for continuous SBOM analysis that consumes CycloneDX Software Bill of Materials created during CI/CD. It provides surprisingly capable vulnerability tracking, integrating with the National Vulnerability Database, Sonatype OSS Index, GitHub Advisories, Snyk, and OSV.
The catch? You need to self-host it. Budget 1-2 days for setup, plus $50-200 per month for cloud hosting. You’ll also need to generate SBOMs in your build pipeline.
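The CI/CD side of that wiring is a small HTTP call: generate a CycloneDX SBOM during the build, then push it to Dependency-Track. The sketch below builds that request using only the standard library; the endpoint and payload shape (base64-encoded BOM, `autoCreate`) follow Dependency-Track's REST API as commonly documented, but verify them against your server version before relying on this:

```python
import base64
import json
from urllib import request

def build_bom_upload(server, api_key, project_name, project_version, bom_json: bytes):
    """Build a PUT /api/v1/bom request for a Dependency-Track server.

    Payload shape is an assumption based on Dependency-Track's documented
    REST API; check it against your server's version.
    """
    payload = json.dumps({
        "projectName": project_name,
        "projectVersion": project_version,
        "autoCreate": True,  # create the project on first upload
        "bom": base64.b64encode(bom_json).decode("ascii"),
    }).encode("utf-8")
    return request.Request(
        f"{server}/api/v1/bom",
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
    )

# Usage (not executed here):
#   req = build_bom_upload("http://localhost:8081", api_key, "my-app", "1.4.0", sbom_bytes)
#   request.urlopen(req)
```

With Syft or cdxgen emitting the SBOM and a step like this shipping it, every build refreshes the inventory Dependency-Track monitors.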
OpenSSF Scorecard assesses the health of your upstream open source dependencies. It’s an automated security evaluation tool that examines important security heuristics and assigns each project a score from 0 to 10. Run it as a single command against public repositories or integrate it as a GitHub Action. Setup takes 2-4 hours.
Combined, these three tools cover dependency updates (Dependabot), vulnerability tracking (Dependency-Track), and upstream risk assessment (Scorecard). What they don’t cover: malicious package detection, reachability analysis, policy enforcement gates, or compliance reporting. For context on how these tools fit into a broader vulnerability management strategy, see our guide on persistent risk in dependency management.
Engineering time is the hidden cost. Expect 10-20 hours for initial setup and 2-5 hours per week for ongoing maintenance and triage. For a 100-person organisation, that’s $15,000-30,000 per year in engineering time even with zero licensing fees.
The choice comes down to simplicity versus configurability. Dependabot is simpler to enable but less flexible. Renovate is more powerful but needs more setup.
Dependabot’s advantages are all about ease. Native GitHub integration means zero infrastructure. Click a button in GitHub settings and automated security updates start flowing. No configuration required, free for all repositories.
Dependabot’s limitations frustrate teams at scale. It’s GitHub-only, so if you’re using GitLab, Bitbucket, or Azure DevOps, you’re out of luck. Limited grouping means related updates come in separate PRs. Less flexible scheduling creates noise. According to GitHub’s own data, repositories with automated dependency updates experience 40% fewer security vulnerabilities—but that assumes you can keep up with the PR volume.
Renovate’s advantages are about control. Platform-agnostic means it works across GitHub, GitLab, Bitbucket, and Azure DevOps. Powerful grouping and scheduling rules reduce PR noise. Auto-merge for safe updates. Regex-based custom managers. Monorepo-aware. Extensive preset system that lets you inherit configurations from the community.
Renovate’s limitations are the flip side of that power. You need hosted infrastructure—either self-hosted or Mend.io hosted. More complex initial configuration. Steeper learning curve.
For teams using only GitHub with straightforward dependency needs, Dependabot is the pragmatic choice. For teams with multi-platform repositories, monorepos, or needing auto-merge policies, Renovate is worth the setup investment.
Both are dependency update tools, not vulnerability scanners. Neither replaces the need for SCA scanning. The migration path from Dependabot to Renovate is straightforward if your needs evolve.
The upgrade decision should be triggered by concrete signals rather than arbitrary growth milestones. Look for these four triggers.
Trigger 1 is false positive overload. When OWASP Dependency-Track generates too many alerts without reachability context, developers start ignoring all alerts. Snyk’s reachability analysis reduces noise by 60-80% by identifying whether vulnerable code paths are actually executed. If your team is spending more than 5 hours per week triaging false positives, that’s your signal.
Trigger 2 is compliance requirements. Regulated industries like FinTech and HealthTech need audit trails, policy enforcement gates, and automated compliance reporting. Free tools can’t provide that. If you’re facing SOC 2, ISO 27001, or similar compliance frameworks, free tools won’t get you there. Our SBOM implementation roadmap covers the compliance dimensions in detail.
Trigger 3 is malicious package threat. If your threat model includes targeted supply chain attacks—not just accidental vulnerabilities—Socket Security or Phylum provide behaviour-based detection that no free tool offers. They detect install scripts, network calls, and obfuscated code that indicate malicious intent.
Trigger 4 is developer productivity. When security tooling friction slows development velocity, Snyk’s IDE integrations and developer-first workflow reduce context switching. Developers fix issues in their editor rather than bouncing between tools. For broader context on integrating security scanning into CI/CD workflows, see our GitHub Actions hardening guide.
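Encoding the four triggers makes the upgrade decision repeatable rather than a matter of gut feel. A minimal sketch; the 5-hour threshold mirrors the figure above, and the remaining inputs are yes/no judgements you supply:

```python
def upgrade_triggers(
    triage_hours_per_week: float,
    compliance_required: bool,
    targeted_attacks_in_threat_model: bool,
    tooling_friction_reported: bool,
):
    """Return which of the four upgrade triggers described above fire.
    Thresholds follow the article (>5 h/week triage); tune for your team."""
    triggers = []
    if triage_hours_per_week > 5:
        triggers.append("false positive overload")
    if compliance_required:
        triggers.append("compliance requirements")
    if targeted_attacks_in_threat_model:
        triggers.append("malicious package threat")
    if tooling_friction_reported:
        triggers.append("developer productivity")
    return triggers

print(upgrade_triggers(6, False, True, False))
# → ['false positive overload', 'malicious package threat']
```

Any single trigger firing is worth a tooling review; two or more firing is a strong signal to budget for commercial SCA.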
Snyk pricing runs approximately $50-100 per developer per year for the Team plan. Free tier available for up to 5 users with limited scans. For 100 developers, that’s $5,000-10,000 annually in licensing plus 40-80 hours of implementation labour.
Socket Security and Phylum specialise in the malicious package detection layer. They complement traditional SCA rather than replacing it. You’d typically run Socket alongside Snyk, not instead of it.
The typical upgrade point is 100-300 employees, or when engineering time spent maintaining free tools exceeds the cost of a commercial licence. That crossover usually happens around 75-100 developers.
Enterprise platforms bundle repository management, artefact lifecycle control, and policy enforcement into a unified platform—capabilities that go beyond what mid-market SCA tools like Snyk provide.
Sonatype Nexus Lifecycle provides repository proxy that blocks vulnerable downloads at source, policy enforcement across the SDLC, automated compliance reporting, and deep vulnerability intelligence. The repository firewall automatically detects and blocks risky or malicious components from entering your repository. Strongest in policy granularity and compliance automation.
JFrog Artifactory + Xray emphasises artefact lifecycle management with integrated security scanning. Xray is natively integrated with Artifactory, enabling analysis of software artefacts and their dependencies. It identifies security issues and licence violations at the dependency declaration stage, blocking insecure builds before they progress. Stronger in DevOps workflow integration and binary management.
Both function as repository firewalls, blocking malicious or vulnerable packages before they enter your development environment. That’s a proactive approach not available in Snyk or free tools.
When enterprise platforms make sense: organisations with 300+ developers, multiple technology stacks, regulatory compliance obligations, or the need for centralised artefact governance across many teams.
Cost considerations: Enterprise pricing is custom-negotiated, starting significantly higher than Snyk. Expect $200-500+ per developer per year plus infrastructure costs for self-hosted deployments. For a 100-developer organisation, you’re looking at $20,000-50,000+ annually before engineering time.
Most 50-300 employee SMBs don’t need enterprise platforms. The additional capability doesn’t justify the cost and implementation complexity (1-3 months deployment). If you’re wondering whether you need it, you probably don’t.
Total cost of ownership for SCA tools extends far beyond licence fees. You need to account for implementation labour, infrastructure, ongoing maintenance, false positive triage time, and opportunity costs.
Free tier TCO looks like this for a 100-person organisation: $0 licensing, 40-80 hours implementation labour, self-hosted infrastructure for Dependency-Track ($50-200 per month cloud hosting), 8-20 hours per month ongoing maintenance, higher false positive triage burden. Estimated total: $15,000-30,000 per year in engineering time.
Mid-market commercial TCO using Snyk at $50-100 per developer per year: $5,000-10,000 per year licensing for 100 developers, 40-80 hours implementation, minimal infrastructure (SaaS), 4-8 hours per month maintenance, lower false positive burden. Estimated total: $15,000-25,000 per year including engineering time.
Notice something interesting? The free tier and commercial tier cost roughly the same once you factor in engineering time. The crossover point where commercial tools become cheaper than “free” typically occurs around 75-100 developers.
Enterprise platform TCO: $20,000-50,000+ per year licensing for 100 developers, 200-500 hours implementation (1-3 months), significant infrastructure requirements, 10-20 hours per month maintenance, requires dedicated platform owner. Estimated total: $50,000-100,000+ per year.
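The crossover claim is easy to sanity-check with arithmetic. This sketch plugs in midpoints of the ranges above for a 100-developer organisation; the $100/hour engineering rate is an assumption, so substitute your own figures:

```python
def annual_tco(developers, licence_per_dev, impl_hours, monthly_maint_hours,
               infra_per_month=0.0, eng_rate=100.0):
    """Rough annual total cost of ownership from the figures above.
    eng_rate ($/hour) is an assumed blended engineering rate; implementation
    labour is counted fully against year one."""
    return (developers * licence_per_dev
            + impl_hours * eng_rate
            + monthly_maint_hours * 12 * eng_rate
            + infra_per_month * 12)

# Free tier vs Snyk-style commercial for 100 developers, using range midpoints:
free = annual_tco(100, 0, impl_hours=60, monthly_maint_hours=14, infra_per_month=125)
commercial = annual_tco(100, 75, impl_hours=60, monthly_maint_hours=6)
print(round(free), round(commercial))  # → 24300 20700
```

At these midpoint assumptions the "free" stack already costs more than the commercial one at 100 developers, which is consistent with the 75-100 developer crossover point.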
Implementation effort comparison: Free tier takes 1-2 weeks part-time. Snyk takes 2-4 weeks including policy configuration. Enterprise platforms take 1-3 months with a dedicated team.
Hidden costs that surprise teams: training and onboarding, integration with existing CI/CD pipelines, handling vendor-specific data formats, and migration costs if you switch tools later. Switching from Dependabot to Renovate takes 1-2 weeks, migrating to Snyk takes 2-4 weeks, transitioning to Sonatype or JFrog takes 1-3 months.
Tool selection maps to organisational characteristics along two axes: team size (determines budget and complexity) and security maturity (determines readiness for advanced tooling).
Tier 1 is startups and early-stage teams (10-50 developers). Use Dependabot + OpenSSF Scorecard. Focus on keeping dependencies updated and assessing upstream project health. Total investment: 1-2 days setup, near-zero ongoing cost.
Tier 2 is growing SMBs (50-100 developers). Add OWASP Dependency-Track to the baseline. Now you have Dependabot (or Renovate if multi-platform) + Dependency-Track + Scorecard. This adds SBOM-based vulnerability tracking. Total investment: 1-2 weeks setup, $15,000-30,000 per year in engineering time.
Tier 3 is scaling SMBs (100-300 developers). Add Snyk for reachability analysis and developer workflows. Add Socket Security if supply chain attack threats are relevant to your threat model. Maintain Dependabot or Renovate for updates—commercial tools complement rather than replace them. Total investment: $20,000-40,000 per year including licences and engineering.
Tier 4 is large SMBs (300-500 developers). Evaluate Sonatype Nexus or JFrog Artifactory/Xray for centralised artefact governance, policy enforcement, and compliance automation. Maintain mid-market tools during transition. Total investment: $50,000-100,000+ per year.
Maturity signals override size-based recommendations. If you face regulatory compliance requirements, accelerate to commercial tools regardless of size. If you’ve had an active incident response involving supply chain compromise, add Socket or Phylum immediately. If you’re managing multiple technology stacks, prefer Renovate over Dependabot from day one.
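The size bands and maturity overrides above can be captured in a few lines. A minimal sketch; the bands are the article's, while the function shape and tier labels are illustrative:

```python
def recommend_tier(developers: int, compliance: bool = False,
                   past_supply_chain_incident: bool = False,
                   multi_platform: bool = False) -> str:
    """Map team size and maturity signals to the tiers described above."""
    if developers < 50:
        tier = "Tier 1: Dependabot + OpenSSF Scorecard"
    elif developers < 100:
        tier = "Tier 2: add OWASP Dependency-Track"
    elif developers < 300:
        tier = "Tier 3: add Snyk (plus Socket if warranted)"
    else:
        tier = "Tier 4: evaluate Sonatype / JFrog"
    # Maturity signals override the size-based recommendation:
    if compliance and developers < 100:
        tier = "Tier 3: add Snyk (plus Socket if warranted)"
    if past_supply_chain_incident:
        tier += "; add Socket/Phylum immediately"
    if multi_platform:
        tier += "; prefer Renovate over Dependabot"
    return tier

print(recommend_tier(80, compliance=True))
# → Tier 3: add Snyk (plus Socket if warranted)
```

The point is not the code but the discipline: write the rules down once, so every team in the organisation lands on a tier for the same reasons.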
A 100-person SaaS company should start with Dependabot + Dependency-Track, then add Snyk when revenue exceeds $10M or compliance demands increase.
The recommendation sequence prioritises highest ROI actions: automated updates first (reduces exposure window), then vulnerability visibility (identifies problems), then advanced detection (catches sophisticated threats), then enterprise governance (manages at scale).
A maturity roadmap prevents both premature investment in enterprise tooling and dangerous delay in addressing growing security gaps.
Phase 1 is foundation (months 1-3). Enable Dependabot or Renovate across all repositories. Run OpenSSF Scorecard against your dependencies. Establish baseline visibility. Success metric: 100% of repositories have automated dependency updates.
Phase 2 is visibility (months 3-6). Deploy OWASP Dependency-Track. Begin generating and consuming SBOMs. Create a dependency inventory. For detailed guidance on SBOM generation and consumption, see our SBOM implementation roadmap. Success metric: complete SBOM for all production applications.
Phase 3 is analysis (months 6-12). Evaluate and deploy Snyk or equivalent commercial SCA for reachability analysis and policy enforcement. Integrate scanning into CI/CD pipeline as a gate. Success metric: zero high-severity vulnerabilities deployed to production.
Phase 4 is advanced detection (months 12-18). Add Socket Security or Phylum for malicious package detection if your threat model warrants. Implement repository firewall if using proxy repositories. Success metric: proactive blocking of suspicious packages.
Phase 5 is governance (months 18-24+). Evaluate enterprise platforms like Sonatype or JFrog only if centralised artefact management is needed at scale. Success metric: unified policy enforcement across all teams.
Each phase has explicit upgrade triggers. Move to the next phase when specific thresholds are met, not on arbitrary timelines. If false positives exceed 5 hours per week, move from Phase 2 to Phase 3. If compliance requirements emerge, accelerate to Phase 3 regardless of timeline. If malicious package threats materialise, jump to Phase 4.
Most organisations don’t need to reach Phase 5. Most 50-300 employee companies achieve adequate security posture at Phase 3. You’re not climbing a ladder where the top is the goal—you’re finding the right level for your risk tolerance and resources. For a complete understanding of how these tools fit into broader supply chain security strategies, explore the full range of defensive frameworks and operational practices.
Review and reassess tool choices annually. The SCA landscape evolves rapidly and new tools or improved free tiers may shift the optimal selection.
SCA analyses third-party open source components in your applications for known vulnerabilities, licence issues, and malicious code. SAST analyses your own source code for bugs. SCA covers supply chain risk from dependencies while SAST covers flaws in code you wrote. Most organisations need both, but SCA is the priority for supply chain security because dependencies represent the larger attack surface.
Renovate is more configurable and supports more platforms, but “better” depends on context. For GitHub-only teams with simple dependency needs, Dependabot’s zero-configuration approach is genuinely easier. For teams with monorepos, multiple platforms, or needing auto-merge rules, Renovate’s flexibility provides measurable workflow improvements. The migration path from Dependabot to Renovate is straightforward if needs evolve.
Yes, for baseline coverage. Dependabot + OWASP Dependency-Track + OpenSSF Scorecard provides dependency updates, vulnerability tracking, and upstream risk assessment at zero licence cost. However, free tools lack reachability analysis (more false positives), malicious package detection, and compliance reporting. For teams under 50 developers without regulatory requirements, the free tier delivers 70-80% of practical security value. Understanding how active dependency management reduces persistent risk helps maximise the value of these free tools.
Snyk’s Team plan costs approximately $50-100 per developer per year, so $5,000-10,000 annually for 100 developers. Add 40-80 hours of implementation labour ($4,000-8,000 at typical engineering rates) and 4-8 hours monthly maintenance. Total first-year cost: approximately $12,000-22,000. This is often comparable to the engineering time cost of maintaining free tools at the same scale.
Snyk leads in false positive reduction due to its reachability analysis, which determines whether vulnerable code paths are actually executed in your application. This typically reduces actionable alerts by 60-80% compared to tools that flag all known CVEs in dependencies. OWASP Dependency-Track and Trivy report all known vulnerabilities without reachability context, resulting in higher alert volumes that require manual triage.
A repository firewall blocks malicious or vulnerable packages at download time, before they enter your development environment. Sonatype Nexus and JFrog Artifactory provide this capability through proxy repositories. Most SMBs under 300 developers don’t need dedicated repository firewalls—the cost and complexity outweigh the risk reduction for smaller teams. Focus on dependency updates and vulnerability scanning first.
Signals to watch for: developers spending more than 5 hours per week managing Dependabot PRs, needing dependency updates across non-GitHub platforms, requiring auto-merge rules for low-risk updates, managing monorepos with complex dependency relationships, or wanting grouped updates to reduce PR noise. If three or more of these apply, evaluate Renovate or a commercial alternative.
A vulnerability is an accidental coding flaw in a legitimate package that attackers can exploit (like a buffer overflow). A malicious package is intentionally designed to cause harm (like the XZ Utils backdoor). Traditional SCA tools detect known vulnerabilities but not malicious packages. Tools like Socket Security and Phylum use behaviour analysis to detect malicious intent, addressing a fundamentally different threat category.
You don’t need an existing SBOM to begin evaluating tools, but SBOM capability should influence your selection. If your organisation faces SBOM compliance requirements (federal contracts, regulated industries), prioritise tools that generate and consume SBOMs: OWASP Dependency-Track for free tier, Sonatype or JFrog for enterprise. SBOM generation is increasingly table-stakes across all tiers.
Migration timelines vary significantly. Switching from Dependabot to Renovate takes 1-2 weeks (configuration translation). Migrating from free tools to Snyk takes 2-4 weeks (policy setup, team training). Transitioning to enterprise platforms like Sonatype or JFrog takes 1-3 months (infrastructure, workflow changes). Factor migration costs into your initial tool selection to avoid vendor lock-in penalties.
Dependabot requires zero configuration and activates in minutes from GitHub repository settings. OpenSSF Scorecard runs as a single command against any public repository. These two tools provide immediate, actionable security insights with no security expertise required. For the next level, Snyk’s developer-first approach and IDE integration make it the most accessible commercial option.
Language and package manager support matters but is often overstated in marketing. Evaluate based on the specific languages your team actually uses, not the total count. A tool supporting 40+ languages provides no advantage if you only use three. More important factors: quality of vulnerability data for your specific ecosystem, false positive rates, and CI/CD integration depth with your pipeline.
SBOM Implementation Roadmap: From CISA 2025 Compliance to Operational Vulnerability Tracking

Is SBOM just another checkbox for compliance audits, or does it actually deliver operational security value? The honest answer: it depends entirely on how you implement it.
When SBOMs get generated and then filed away for auditors, they’re compliance theatre. But when they’re continuously monitored against live vulnerability feeds and mapped to your actual deployments, they become your fastest response mechanism for zero-day vulnerabilities. The difference between those two outcomes is your consumption workflow, not your generation tooling.
This guide is part of our comprehensive software supply chain security framework, where we explore defensive controls from build integrity to dependency management. Here, we focus on SBOM implementation: what makes a compliant SBOM, how to integrate generation into your build pipeline, and how to establish continuous vulnerability tracking workflows that deliver operational value beyond regulatory requirements.
The good news is that implementing SBOMs properly isn’t complicated. You’ll make three decisions—format selection, generation tool, and consumption platform—then integrate them into your build pipeline. The whole process takes 1-2 weeks for most teams.
An SBOM is a formal record of software components and their supply chain relationships. Think of it as an ingredients list for your application—every open-source library, third-party dependency, and their nested dependencies, documented in a machine-readable format.
The operational value comes from speed. When Log4Shell was disclosed in December 2021, organisations with SBOMs identified affected systems in hours. Those without spent weeks manually inventorying dependencies across their environments. That’s the difference between contained damage and widespread compromise.
Unlike traditional inventory lists, SBOMs track nested dependencies and provenance throughout the software delivery lifecycle. When a vulnerability is disclosed, you query your SBOM to identify every affected component, including transitive dependencies you didn’t directly include.
This matters because most vulnerabilities appear in dependencies you inherit, not libraries you explicitly chose. Your application might not use Log4j directly, but if one of your 200 dependencies does, you’re exposed. SBOMs make that relationship visible.
Executive Order 14028, issued in 2021, established new requirements to secure the federal government’s software supply chain. If you sell software to federal agencies, you provide SBOMs. That federal procurement mandate is now driving industry-wide adoption as enterprises follow government practices.
SBOMs function as software blueprints. When security incidents occur, an SBOM pinpoints the vulnerable component, which means your team can prioritise response and assess broader impact.
The operational applications extend beyond vulnerability response. SBOMs enable licence compliance automation, dependency hygiene tracking, and vendor risk assessment. You’re analysing third-party component supply chains, identifying outdated or unmaintained dependencies, and detecting GPL contamination before it reaches production.
CISA notes that SBOM data can be transformed into insights to drive risk management decisions. The key word is “transformed”—raw SBOM files sitting in artefact repositories don’t drive decisions. Consumed SBOMs continuously evaluated against vulnerability feeds do.
CISA published updated guidance in 2025 to reflect current SBOM needs and capabilities. The minimum elements define what data must appear for each component in your SBOM. Think of this as the compliance baseline—the floor, not the ceiling.
The CISA minimum elements now include:
SBOM Author: Who created the SBOM. This establishes accountability for the data quality.
Software Producer: The entity that built the software. This is who you contact when issues are discovered.
Component Name: The actual name of each software component. Naming consistency matters for CVE mapping.
Component Version: Specific version numbers enable precise CVE mapping. A vulnerability might affect version 2.3.1 but not 2.3.2.
Component Hash: New in 2025, cryptographic hashes provide edition-specific accuracy for compiled binaries and container layers.
Dependency Relationships: How components connect to each other. This reveals transitive risk—if component A depends on vulnerable component B, you need to update A even though A itself isn’t vulnerable.
Timestamp: When the SBOM was generated.
On top of those, CISA 2025 introduces Licence (for compliance tracking), Tool Name (transparency in generation), and Generation Context (understanding SBOM scope).
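To make the baseline concrete, a build step could check a parsed SBOM for these minimum elements before publishing it. The sketch below assumes a simplified CycloneDX-style dict; the field names are illustrative assumptions, not the exact CycloneDX schema, and would need adjusting to your real format.

```python
# Sketch: check a parsed SBOM dict for CISA-style minimum elements.
# Field names here are simplified assumptions, not the exact CycloneDX schema.

REQUIRED_COMPONENT_FIELDS = ("name", "version", "hashes")

def missing_minimum_elements(sbom: dict) -> list[str]:
    """Return a list of human-readable gaps; an empty list means the baseline is met."""
    gaps = []
    metadata = sbom.get("metadata", {})
    if not metadata.get("authors"):
        gaps.append("SBOM author missing")
    if not metadata.get("supplier"):
        gaps.append("software producer missing")
    if not metadata.get("timestamp"):
        gaps.append("timestamp missing")
    for i, comp in enumerate(sbom.get("components", [])):
        for field in REQUIRED_COMPONENT_FIELDS:
            if not comp.get(field):
                gaps.append(f"component {i}: {field} missing")
    if "dependencies" not in sbom:
        gaps.append("dependency relationships missing")
    return gaps
```

In a pipeline you would fail the build whenever the returned list is non-empty, which turns the compliance floor into an enforced invariant rather than a manual check.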
CISA’s minimum elements apply to all software regardless of type, but comprehensive SBOMs include additional data. Best practice implementations add:
Package URLs (purl) for precise component identification across ecosystems. These are the primary lookup keys for vulnerability databases, reducing ambiguity.
Licence information beyond basic compliance. Detailed licence fields help track copyleft obligations and incompatible licence combinations.
External references like CVE IDs, security advisories, and documentation URLs. These accelerate remediation by linking directly to fix information.
The regulatory baseline gets you compliant. The comprehensive approach gets you operational value.
Two dominant SBOM standards implement the NTIA’s minimum elements: CycloneDX from OWASP and SPDX from the Linux Foundation. Both capture the same core data with different emphases and tooling ecosystems.
The practical reality is that format selection depends on your development workflow, not format superiority. Both work. Both are widely supported. The question is which fits your existing toolchain better.
CycloneDX was developed by OWASP Foundation with primary strength in vulnerability identification and outdated dependency analysis. It supports JSON, XML, and Protocol Buffers formats.
The security-centric design includes native VEX (Vulnerability Exploitability eXchange) support. VEX allows you to document when a CVE exists in your SBOM but isn’t exploitable in your application. This prevents alert fatigue when vulnerability scanners flag components you don’t actually use.
CycloneDX is lighter-weight and faster to implement than SPDX for most use cases.
SPDX was developed by Linux Foundation with primary strength in licensing compliance. It includes detailed fields for complex licensing information and is an ISO/IEC 5962 standard.
For regulated industries, ISO standardisation matters. Defence contractors, financial services, and healthcare organisations often require ISO-compliant standards. SPDX delivers that compliance.
Here’s the practical breakdown:
Choose CycloneDX when:
- Vulnerability identification and security tooling integration are your priority
- You want native VEX support to suppress non-exploitable CVEs
- You need a lighter-weight format that's faster to implement
Choose SPDX when:
- A regulated industry (defence, financial services, healthcare) requires an ISO-recognised standard (ISO/IEC 5962)
- Licence compliance is your primary driver
- Detailed licensing fields matter more than security metadata
Both formats capture essential component data. Conversion tools exist if you need to switch later.
For most organisations, CycloneDX’s security focus and simpler implementation make it the recommended choice. Start there unless you have a specific requirement for SPDX.
SBOM generation should happen during your build process, not post-release. Embedding SBOM creation into CI/CD pipelines ensures every build includes current component inventories, which reduces human error and removes developer friction.
The implementation pattern is straightforward: install a generation tool, configure the output format, integrate it into your build step, then publish the SBOM artefact alongside your release.
Different tools excel in different ecosystems:
Syft from Anchore excels in container environments with fast scanning across Python, Go, Java, JavaScript, PHP, and Rust. It defaults to SPDX format but supports CycloneDX. Its speed and container focus make it ideal for Kubernetes deployments.
cdxgen is OWASP’s endorsed comprehensive SBOM generator supporting 20+ programming languages with CycloneDX output. It offers comprehensive transitive dependency resolution. For multi-language enterprise applications, cdxgen is the recommended choice.
npm-sbom is the best choice for Node.js, offering native integration supporting both CycloneDX and SPDX formats. Zero configuration required, comprehensive transitive dependency analysis included.
The recommendation is to prioritise single-language tools for accuracy when possible, choosing cdxgen for multi-language enterprise applications. Remember: imperfect SBOMs provide significantly more value than no SBOMs. For a comprehensive comparison of SBOM generation and consumption tools, including commercial platforms like Snyk and Sonatype, see our tool selection guide.
Once generated, SBOMs need to be stored and distributed. Options include:
Attaching to release artefacts in GitHub Releases or GitLab Package Registry. This keeps SBOMs versioned with releases.
Publishing to OCI registries alongside container images. The SBOM can be embedded in containers for distribution with images.
Submitting to consuming systems like Dependency-Track. This enables the operational monitoring that makes SBOMs valuable beyond compliance.
Secure SBOMs with digital signatures and checksums, logging each version and tracking changes to create verifiable audit trails.
SBOM value is created when the SBOM is consumed, continuously correlated with vulnerability intelligence, and mapped to deployment endpoints.
Generating SBOMs then filing them away accomplishes nothing operational. A consumed SBOM functions as a cryptographically anchored, continuously refreshable source of truth for post-deployment security.
OWASP Dependency-Track is an open source SBOM analysis platform providing continuous monitoring, policy enforcement, and integration with vulnerability databases like NVD, GitHub Advisories, and OSV.
The workflow is: generate SBOM during build, upload to Dependency-Track, continuous vulnerability scanning against live feeds, alert configuration based on severity thresholds, remediation tracking with deadlines.
Traditional software composition analysis tools focus on pre-deployment scanning and cannot answer operational questions about deployed software. Dependency-Track answers those questions by continuously evaluating SBOM data against live vulnerability intelligence. For comparisons with commercial alternatives like Snyk and Sonatype, see our SBOM consumption platforms analysis.
Upload integration happens via API. After generating your SBOM in the build pipeline, upload it to Dependency-Track for continuous monitoring.
The platform automatically scans the SBOM on arrival, mapping components to vulnerability databases using Package URLs, component name plus version, and cryptographic hashes. PURL is the primary OSV.dev lookup key, reducing ambiguity and false positives.
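Package URLs follow the purl spec's `pkg:type/namespace/name@version` shape, which is why they work as unambiguous lookup keys across ecosystems. A minimal sketch of building one (a real implementation should use a purl library such as packageurl-python, which also handles percent-encoding and qualifiers):

```python
def make_purl(ecosystem: str, name: str, version: str, namespace: str = "") -> str:
    """Build a simplified Package URL for vulnerability-database lookups.

    Ignores percent-encoding and qualifiers; use a dedicated purl library
    in production.
    """
    parts = ["pkg:", ecosystem, "/"]
    if namespace:
        parts += [namespace, "/"]
    parts += [name, "@", version]
    return "".join(parts)
```

The same string identifies the same component whether it came from an npm lock file, a Maven POM, or a container layer, which is what makes cross-ecosystem CVE correlation possible.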
The advantage of automated upload is continuous monitoring. When SBOM and OSV.dev results are mapped to live production systems, teams immediately see which endpoints, containers, and binaries are affected by new CVEs.
Alert configuration enables targeted notifications. Set severity thresholds like CVSS ≥7.0 for high-severity vulnerabilities. Configure policy violations for GPL licences or known-malicious components. Route notifications through email, Slack, or PagerDuty.
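The routing logic itself is a simple partition over findings. A hypothetical sketch of the policy described above (the field names and thresholds are assumptions for illustration, not Dependency-Track's actual notification schema):

```python
def route_findings(findings, cvss_threshold=7.0, blocked_licences=("GPL-2.0", "GPL-3.0")):
    """Partition findings into alert-worthy vulnerabilities and policy violations.

    A finding can land in both buckets if it is severe AND carries a
    blocked licence.
    """
    alerts, violations = [], []
    for f in findings:
        if f.get("cvss", 0.0) >= cvss_threshold:
            alerts.append(f)
        if f.get("licence") in blocked_licences:
            violations.append(f)
    return alerts, violations
```

In practice the two buckets would feed different channels: alerts to PagerDuty or Slack for triage, policy violations back to the build pipeline as failures.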
Remediation tracking includes vulnerability suppression with justification, version upgrade recommendations, and remediation deadline tracking. When a CVE appears in your dependencies, Dependency-Track flags it, suggests the fixed version, and tracks your remediation progress.
The key is continuous re-consumption. New CVEs appear daily requiring real-time lookups, automated OSV.dev reprocessing, and dynamic remediation workflows. Your SBOM consumption practices must be continuous, not a point-in-time event.
SBOM and SLSA serve complementary purposes. SBOM provides component transparency—what’s inside your software. SLSA provides build integrity—how it was built. You need both for comprehensive supply chain security, but resource-constrained teams need to sequence them.
SBOM answers “what components are in this application?” SLSA answers “was the build process compromised?” SBOM provides the canonical definition of software composition, while SLSA provenance documents the build environment, build steps, and artefact integrity.
The integration benefits are real. Build provenance and component transparency work together to create supply chain visibility. SLSA provenance can include SBOM references. Combined approaches provide both “what” and “how” transparency. Sigstore keyless signing applies to both SBOMs and SLSA provenance, creating a unified verification workflow.
Implement SBOM first. Executive Order 14028 mandates SBOMs for federal procurement, creating immediate compliance pressure. SLSA is industry best practice without regulatory mandate yet.
SBOM implementation takes 1-2 weeks and requires tool integration plus artefact generation. SLSA requires build platform changes and takes 2-4 weeks to achieve Level 2.
SBOM implementation provides immediate operational value through vulnerability tracking. It establishes the foundation for SLSA by documenting what you’re building before you document how you build it.
Sequence: SBOM first (1-2 weeks), establish consumption workflow with Dependency-Track (1 week), then SLSA Level 2 (2-4 weeks). The total timeline is 4-7 weeks for both frameworks.
The maturity path for SBOM implementation has three stages: basic compliance, operational tracking, and risk-based prioritisation. Most organisations start at compliance and stop. The operational value lives in the latter two stages.
Zero-day response becomes your fastest capability. When vulnerabilities are disclosed, organisations with consumed SBOMs identify affected components in minutes, not weeks. You query your SBOM database, identify affected systems, prioritise based on operational exposure.
Licence compliance automation prevents GPL contamination before deployment. Your consumption platform flags licensing violations based on policy rules, blocking builds that introduce incompatible licences.
Dependency hygiene tracking identifies outdated and unmaintained dependencies. You track dependency age through continuous component analysis, identify components without active maintenance, and prioritise updates based on abandonment risk. SBOMs become the foundation for tracking dependency vulnerabilities through your entire application portfolio.
Traditional audits generate SBOMs annually or quarterly, then file them for compliance. Continuous security regenerates SBOMs every build and continuously scans against NVD and OSV feeds.
The difference is response time. Point-in-time SBOMs tell you what was true when they were generated. Continuous monitoring tells you what’s true right now, updated as new CVEs are published.
Stage one is compliance checkbox: generate SBOM artefact, attach to release, satisfy auditors. You have SBOMs but don’t use them.
Stage two is operational tracking: continuous vulnerability monitoring, automated alerts, remediation tracking. You use SBOMs to respond to disclosed vulnerabilities faster.
Stage three is risk-based prioritisation: combine SBOM data with CVSS scores, EPSS exploitability predictions, and business context. You prioritise remediation based on operational exposure, not raw CVE counts.
Advanced capabilities at stage three include VEX statements for false positive suppression, SBOM diff analysis showing what changed between releases, and supply chain risk scoring per component.
The compliance theatre concern is valid if SBOMs are only generated for audits then ignored. Operational value requires consumption: upload to Dependency-Track, configure alerts, integrate with vulnerability management.
The ROI becomes visible during incident response when you can identify affected systems in hours rather than conducting manual dependency inventories. Federal mandate creates industry momentum, but operational benefits justify implementation regardless of compliance requirements.
Yes. SBOM generation tools analyse installed dependencies, not just first-party code. Tools like Syft scan container images, package manager lock files, and installed binaries.
The limitation is that SBOM quality depends on artefact metadata. Maven and npm packages include excellent dependency metadata, so generated SBOMs are comprehensive. Compiled binaries without metadata produce limited SBOMs.
Two approaches exist: single comprehensive SBOM covering all components, or per-service SBOMs for each deployable unit.
Per-service SBOMs are recommended because they enable granular vulnerability tracking and clearer ownership. When a vulnerability appears, you know exactly which service is affected.
The trade-off is more SBOMs to manage versus more precise vulnerability attribution. For microservices architectures, precision wins.
Both list specific component versions, but SBOMs include standardised formats (CycloneDX/SPDX) that are machine-readable across ecosystems.
Lock files like package-lock.json are ecosystem-specific. They enable build reproducibility but lack supplier metadata, licence data, and vulnerability correlation fields.
Use both: lock files for build reproducibility, SBOMs for vulnerability tracking and compliance. SBOM generation often uses lock files as input source.
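To make the "lock files as input" point concrete, here is a hedged sketch that lifts entries from a package-lock.json-style structure into minimal SBOM-like component records. Notice what a generator must add on top: supplier, licence, and hash data simply aren't in the lock file.

```python
def lockfile_to_components(lock: dict) -> list[dict]:
    """Convert package-lock.json v2/v3 style 'packages' entries into
    minimal SBOM component records.

    Supplier and licence data are absent from lock files, which is
    exactly why a lock file is not an SBOM.
    """
    components = []
    for path, info in lock.get("packages", {}).items():
        if not path:  # the root "" entry describes the project itself
            continue
        name = path.split("node_modules/")[-1]
        version = info.get("version", "unknown")
        components.append({
            "type": "library",
            "name": name,
            "version": version,
            "purl": f"pkg:npm/{name}@{version}",
        })
    return components
```

Real generators like npm-sbom and cdxgen do this resolution for you and enrich the result with the metadata lock files lack.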
Recommended: every build or release to ensure SBOMs reflect actual deployed components. Embedding SBOM creation into CI/CD pipelines ensures every build includes current component inventories.
Automation eliminates regeneration burden. Manual regeneration creates staleness risk. SBOM artefacts are typically under 1MB, so storage cost doesn’t constrain frequency.
No. An SBOM's role is transparency and tracking, not malware detection. SBOMs enable rapid response once backdoors are discovered, identifying whether you use affected components and which versions.
SBOMs do not detect zero-day backdoors before disclosure. Complementary tools are required: SLSA provenance verifies build integrity, OpenSSF Scorecard assesses maintainer health, runtime monitoring detects anomalous behaviour.
SBOM value is incident response speed, not prevention.
VEX (Vulnerability Exploitability eXchange) provides metadata indicating vulnerability applicability to specific products. Use case: your SBOM shows a vulnerable component, but the code path isn’t reachable in your application.
This prevents alert fatigue. Suppress false positives with justification. Your security team acknowledges the CVE but documents non-exploitability.
Implement VEX after establishing basic SBOM workflow. It addresses alert fatigue in mature programmes, not initial implementations.
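Conceptually, applying VEX is a join between scanner findings and exploitability statements. A sketch under assumed field names (real VEX documents follow the CycloneDX VEX or OpenVEX schemas, which carry richer status and justification vocabularies):

```python
def apply_vex(findings: list[dict], vex_statements: list[dict]) -> list[dict]:
    """Drop findings that a VEX statement marks as not exploitable.

    Field names ('cve', 'status') are simplified assumptions for
    illustration; real schemas differ.
    """
    not_affected = {
        v["cve"] for v in vex_statements if v.get("status") == "not_affected"
    }
    return [f for f in findings if f["cve"] not in not_affected]
```

The justification field on each statement is what keeps this auditable: you are not deleting the CVE, you are documenting why it doesn't apply.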
Business case elements include regulatory compliance for federal procurement, customer requirements from enterprise buyers demanding SBOMs, operational efficiency through faster vulnerability response, and risk reduction.
Implementation cost is low: 1-2 weeks engineering time with open source tooling available.
Competitive advantage: SBOM provision differentiates in enterprise sales and demonstrates security maturity. Start small: pilot with single service, demonstrate value, expand organisation-wide.
GitHub Dependency Graph provides SBOM-like functionality for repositories hosted on GitHub. Advantages: automatic, no configuration required, integrates with Dependabot alerts.
Limitations: GitHub-only so not portable, limited format export, no support for custom registries, less comprehensive than dedicated tools.
Recommendation: GitHub Dependency Graph is sufficient for GitHub-native workflows. Generate explicit SBOMs for compliance requirements, vendor distribution, or cross-platform analysis.
Container scanning tools like Trivy and Grype generate SBOMs as part of the scanning process. The workflow integration is: scan the container image, extract the SBOM, upload it to Dependency-Track, and enable continuous vulnerability tracking.
Cryptographic hashes in SBOMs provide edition-specific accuracy for container layers. SBOM persistence matters because the SBOM outlives the container image, enabling historical vulnerability analysis.
Recommendation: integrate container scanning with SBOM extraction. Don’t rely on scanning alone.
Validation tools include NTIA SBOM Checker CLI tool, CycloneDX validation API, and SPDX validation tools. Manual verification checks for author, supplier, component name, version, unique identifier, dependency relationships, and timestamp fields.
Automated validation in CI/CD adds SBOM validation step after generation, failing builds if minimum elements are missing.
Common issues: missing supplier information where tools default to “unknown”, incomplete dependency graphs with transitive dependencies omitted, missing timestamps.
Validate during implementation phase, then automate validation to prevent regression.
Information disclosure concern: SBOMs reveal internal dependencies, potentially exposing proprietary libraries or security vulnerabilities.
Mitigation strategies include redacting proprietary components, providing summarised SBOMs with major dependencies only, offering SBOMs under NDA, and using VEX to indicate patched vulnerabilities.
Industry practice: SBOM provision is increasingly standard in enterprise procurement. Refusing creates competitive disadvantage.
Recommendation: implement internal SBOM workflow first, develop external distribution policy once comfortable with process.
You’ve got three decisions to make: format (CycloneDX for most, SPDX when ISO required), generation tool (Syft for containers, cdxgen for multi-language, language-specific for single ecosystems), and consumption platform (Dependency-Track for open source, commercial if budget allows).
Implementation timeline is 1-2 weeks. Week one: select tools, integrate generation into build pipeline, validate CISA minimum elements. Week two: deploy Dependency-Track, configure upload automation, establish alert policies.
The operational value appears in week three when the first vulnerability is disclosed and you query your SBOM database instead of manually inventorying dependencies. That’s when SBOM shifts from compliance checkbox to operational capability.
Start with a single service pilot. Generate SBOM, upload to Dependency-Track, configure alerts. When the first vulnerability alert arrives and you remediate it in hours instead of weeks, you’ll have your business case for organisation-wide rollout.
SBOM implementation is one component of a comprehensive supply chain security framework. Once you have continuous vulnerability tracking in place, you can expand to build integrity verification with SLSA, dependency management automation, and CI/CD pipeline hardening to establish defence-in-depth across your software delivery lifecycle.
Understanding Persistent Risk in Dependency Management, and Why 95 Percent of Vulnerable Downloads Had Fixes Available

Here's something that should worry you: 95% of vulnerable component downloads already had fixes available when developers downloaded them. That's right. Developers weren't downloading vulnerable packages because patches didn't exist. They were downloading them despite patches existing.
The 95% statistic isn’t about patch availability. It’s about process failure.
After SolarWinds and XZ Utils, your dependencies are the primary attack surface. As we explore in our comprehensive security approach, the question isn’t whether you’ve got vulnerable dependencies. It’s how long you’re leaving them unfixed and why.
This is where persistent risk comes in. It’s a framework from Sonatype combining unfixed risk (those unpatched vulnerabilities) with corrosive risk (the time-to-discover and time-to-remediate vulnerabilities hiding in old versions). Together, they tell you exactly how bad your dependency management process is.
In this article we’re going to walk through what persistent risk means, why that 95% statistic matters, how Log4Shell is still haunting 13% of downloads three years later, and how active dependency management with tools like Dependabot and Renovate reduces your risk by 7x.
Persistent risk is unfixed risk plus corrosive risk.
Unfixed risk refers to vulnerabilities within software components that have been identified but have yet to be addressed and in many cases will never be addressed. It’s straightforward. It’s known vulnerabilities sitting in your dependencies without patches applied. Every CVE scanner measures this. It’s those red alerts you’re ignoring.
Corrosive risk is the sneaky one. Corrosive risk impacts current and historical releases by incorporating the time needed to resolve vulnerabilities plus the delay in discovering vulnerabilities in old versions. It’s the time-to-discover vulnerabilities in old versions plus the time-to-remediate when you finally get around to updating. The older your dependencies get, the more likely they contain undiscovered vulnerabilities. And the harder they are to update because of breaking changes piling up.
Like metal corroding over time, old dependencies accumulate hidden weaknesses.
Traditional risk models only track unfixed risk. They tell you which CVEs exist right now. But they don’t tell you how exposed you are to future CVEs based on how old your dependencies are. They don’t quantify how hard it’ll be to fix those future vulnerabilities when version gaps are measured in years.
That’s why persistent risk matters. It captures both your current exposure (unfixed risk) and your future exposure (corrosive risk).
Unfixed risk gets measured by vulnerable components, severity scores, and exposure time. Every extra day a known CVE sits in production increases your risk.
Corrosive risk gets measured by dependency age. This is where the libyears metric comes in. Older dependencies correlate with more vulnerabilities because security researchers keep finding new CVEs in old code. And older dependencies are harder to update because version gaps keep getting wider.
The research is clear: persistent risk is driven by consumption practices, not open source quality. The best projects find and fix vulnerabilities quickly. Most downloads just aren’t of the fixed version.
That’s on you.
Sonatype found that 95% of vulnerable downloads already had fixes available at download time. This isn’t developers waiting for patches. This is developers choosing vulnerable versions despite available fixes.
Three root causes.
First up, complacent dependency management. The “set and forget” anti-pattern. Dependencies get pinned and never updated. No process for monitoring security advisories. Update cadence is quarterly or annual, if it exists at all.
Second, update friction without automation. The manual process doesn’t scale. Check the registry, test compatibility, merge the PR, deploy. Do that for 200 dependencies per project. High effort, high perceived risk. “If it works, don’t touch it” takes over.
Third, risk aversion creating update debt. Teams fear breaking changes from non-security updates. So they adopt “security updates only” policies. This creates large version gaps. The larger the gap, the harder updates become. Debt accumulates.
These three causes create worst-case exposure. Complacent management means high unfixed risk. Update debt means high corrosive risk.
Complacent dependency management results in risk that is seven times higher at 3.6% compared to 0.5% for active management. That 0.5% is the no path forward rate – components with no fixed version available.
The other 3.1% is complacency. Vulnerable components with available fixes that weren’t applied.
That gap from 0.5% to 3.6% is process failure. Your failure.
Libyears measures dependency age. For each dependency, you calculate the time between your current version and the latest version. Express it in years. Sum it across all dependencies.
Say your project uses React 16.8 (released February 2019). Latest is React 18.2 (June 2022). That’s 3.36 libyears. If you’ve got 50 dependencies with similar gaps, you could have 50+ libyears total.
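The calculation is just date arithmetic summed over your dependency list. A sketch (in practice the release dates come from your registry's metadata; the dates here are the React example from the text):

```python
from datetime import date

def libyears(dependencies: list[tuple[date, date]]) -> float:
    """Sum, across dependencies, the years between the release date of
    the version you use and the release date of the latest version.

    Each tuple is (current_version_release_date, latest_version_release_date).
    """
    return sum(
        (latest - current).days / 365.25 for current, latest in dependencies
    )
```

Run over a whole manifest, this gives the single process-health number the text describes: a baseline to set reduction goals against.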
Higher libyears correlate with more vulnerabilities. You are four times as likely to have security issues when you use outdated dependencies.
Libyears correlates with vulnerability density (older versions more likely to have CVEs), remediation difficulty (larger gaps mean more breaking changes), and maintenance signal (high libyears suggests complacent management).
Use libyears operationally. Calculate current libyears as a baseline. Set reduction goals. Track it as a process health metric. Prioritise updates for dependencies contributing the most libyears.
Libyears has limitations though. Age doesn’t equal vulnerable. Some old versions are still maintained. Use it alongside CVE scanning, not as a replacement.
Log4Shell (CVE-2021-44228) was disclosed December 9, 2021. CVSS 10.0. Remote code execution. Apache released a fix on December 10 – less than 24 hours later.
Thirteen percent of all Log4j downloads are still vulnerable versions nearly three years later.
Read that again. Three years. Non-breaking patches available within 24 hours. Mainstream media coverage. Government advisories. And 13% of downloads are still vulnerable.
The patch was a minor version bump from 2.14.1 to 2.15.0. No API breaking changes. Drop-in replacement. Clear upgrade guidance. This was as easy as security patches get.
The timeline tells the story:
December 10, 2021: Patch available.
January 2022: 80%+ of downloads still vulnerable.
June 2022: 50% still vulnerable.
December 2023: 25% still vulnerable.
December 2024: 13% still vulnerable.
This proves three things. High unfixed risk (13% didn’t patch despite availability), high corrosive risk (organisations using old Log4j likely have high libyears across all dependencies), and process failure (severity and publicity didn’t guarantee patching without proactive automation).
Despite Log4Shell being one of the best-known vulnerabilities of the last decade, teams continue downloading known-vulnerable versions.
If a CVSS 10.0 with non-breaking patches and global coverage still affects 13% of downloads three years later, what’s happening with medium-severity CVEs in your less-critical dependencies?
Active management reduces persistent risk by 7x compared to passive approaches.
Passive management is reactive. Updates get triggered by incidents or quarterly reviews. Process is manual. Cadence is infrequent and batched. Result: high unfixed risk plus high corrosive risk.
Here’s the passive pattern: scheduled “dependency update sprint” addressing months of debt. Teams find 50+ outdated dependencies. Half have breaking changes. Testing takes weeks. Something breaks. They learn: “dependency updates are risky and painful.” So they do it less frequently. The cycle reinforces itself.
Active management is proactive. Updates get triggered by automated monitoring and continuous PRs from bots like Dependabot or Renovate. Process is automated: bot creates PR, CI runs tests, human reviews if needed, merge. Cadence is continuous and incremental. Result: low unfixed risk plus low corrosive risk.
Here’s the active pattern: daily or weekly automated PRs with small version bumps. Tests run automatically. Most patch versions automerge if tests pass. Minor versions get reviewed weekly. Major versions get planned quarterly. Effort spreads out: 10 minutes per week instead of 40 hours per quarter.
Active management prevents update debt. Small gaps are easier to close than large gaps. Updating from 1.2.3 to 1.2.4 is low risk. Updating from 1.2.3 to 2.5.0 after two years is high risk.
Continuous testing catches breaking changes early. Effort gets spread out. Teams maintain familiarity with dependency changes.
You need a few things: automated testing (CI/CD with good coverage), merge policies (criteria for automerging patches, review cadence for minors, planning for majors), the tools (Dependabot or Renovate), and a cultural shift (trust automation, embrace continuous small changes).
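As a sketch of what this looks like in practice, here is a minimal Dependabot configuration. The ecosystem, directory, and schedule are assumptions to adapt to your stack.

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"   # also: pip, maven, gomod, cargo, ...
    directory: "/"
    schedule:
      interval: "daily"        # security updates arrive regardless
    open-pull-requests-limit: 10
```

A daily schedule keeps individual PRs small; the PR limit stops the bot from flooding a neglected repository on day one.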
The data backs this up. Mean-time-to-remediation: 2-7 days for active versus 60-180 days for passive. Lower update failure rate. Reduced corrosive risk.
Both tools automate the same workflow. Monitor package registries. Detect new versions and security advisories. Create pull requests. Run CI tests. Wait for review or automerge based on policies.
Repositories with automated dependency updates experience 40% fewer security vulnerabilities than those without automation.
Dependabot is GitHub’s built-in tool. Free. GitHub only. Configuration in .github/dependabot.yml. Security updates, version updates, grouped updates, scheduling (daily, weekly, monthly). Works with 14 package managers. Minimal configuration. Limitations: GitHub-only, less flexibility, basic scheduling.
Renovate is from Mend. Free for open source, freemium for private. Multi-platform: GitHub, GitLab, Bitbucket, Azure DevOps, Gitea. Configuration in renovate.json. All Dependabot features plus advanced scheduling (timezone-aware, business hours), sophisticated automerge policies, dependency dashboard, regex support. Works with over 30 package managers. Steeper learning curve, more complex configuration, higher flexibility.
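A sketch of the automerge policy described above, expressed as a renovate.json. The preset and schedule strings are illustrative; check Renovate's configuration docs for current option names.

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "packageRules": [
    { "matchUpdateTypes": ["patch"], "automerge": true },
    { "matchUpdateTypes": ["minor"], "schedule": ["before 9am on monday"] }
  ]
}
```

This is the shape of Renovate's extra flexibility: fine-grained rules per update type, rather than one global schedule.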
Use Dependabot if you’re GitHub-only, want zero setup complexity, and basic scheduling works for you.
Use Renovate if you’re multi-platform, need advanced scheduling, want sophisticated automerge policies, or you’ll invest setup time for flexibility.
For most small to medium organisations on GitHub, Dependabot is the right choice. For complex requirements (monorepos, multi-platform), Renovate is worth it.
Either tool reduces persistent risk. The choice is about organisational fit, not security outcomes. For a complete dependency management tool comparison covering Snyk, open source alternatives, and decision frameworks, see our tool selection guide.
Applications (web apps, APIs, services) should pin exact versions. Always commit lockfiles (package-lock.json, Gemfile.lock, poetry.lock). Update via automation.
Why? Reproducible builds. Without exact versions and lockfiles, npm install on different days installs different versions. That’s untested code in production.
Libraries (published packages) should use semantic version ranges (^1.2.3 or ~1.2.3). Still commit lockfiles for development. Test against minimum and maximum supported versions.
Why? Downstream flexibility. If your library pins React to exactly 18.2.0, consumers can’t use it with React 18.3.0 without conflicts.
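For example, a hypothetical library manifest can declare a range for consumers while developing against a pinned version:

```json
{
  "name": "my-ui-kit",
  "version": "1.4.0",
  "peerDependencies": { "react": "^18.2.0" },
  "devDependencies": { "react": "18.2.0" }
}
```

The caret range lets consumers resolve any compatible 18.x; the exact devDependency plus the committed lockfile keeps the library's own builds reproducible.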
Patch versions (1.2.3 to 1.2.4): bug fixes and security patches. No breaking changes. Risk: very low. Strategy: automerge if tests pass. Cadence: continuous.
Minor versions (1.2.3 to 1.3.0): new features, backward compatible in theory. Risk: low (but breaking changes happen). Strategy: automated PR, human review, weekly merge.
Major versions (1.2.3 to 2.0.0): breaking changes. Risk: medium to high. Strategy: manual planning, changelog review, deliberate update. Cadence: quarterly or as needed.
Security updates always bypass the normal schedule.
End-of-life (EOL) components no longer receive security patches. EOL components indicate lack of dependency management. They signal systemic complacency.
EOL components require replacement, not updating. Check if the maintainer recommends a successor. Evaluate alternatives: community forks, foundation-backed projects, different approaches. Budget migration time like a major upgrade.
Dependabot and Renovate flag EOL versions automatically. Don’t ignore the warnings. EOL dependencies accumulate unfixed vulnerabilities indefinitely. No patches are coming.
Start with a pilot project. Choose one with 10-20 dependencies and good test coverage. Enable Dependabot or Renovate. Observe before rolling out broadly.
This transition isn’t just technical. It’s cultural. Teams need to shift from “if it works, don’t touch it” to “keep it current.”
Measure where you are. Calculate current libyears. Identify EOL components. Document your current update cadence.
Choose your tool. Dependabot for GitHub-only organisations wanting simplicity. Renovate for multi-platform or complex requirements.
Enable automation on the pilot. Configure schedules that won’t overwhelm. Daily security updates, weekly version updates. Set up dependency grouping.
Define criteria for automated merges versus human review. Patches passing CI automerge. Minors and majors need review.
Establish a regular PR review cadence. Weekly 15-minute standup or asynchronous. Consistency is key.
After 4-6 weeks, expand to additional projects. Monitor metrics: libyears decreasing, time-to-patch decreasing, PR merge rate increasing.
Plan full rollout over 2-3 months. Rushing creates resistance. Gradual expansion builds buy-in.
Success looks like: dependencies stay current (low libyears), patches deploy quickly (days not months), and your team spends less time on dependency management because automation handles the routine work.
Persistent risk is unfixed risk plus corrosive risk. The 95% statistic proves this isn’t a patch availability problem. It’s a process problem. Log4Shell persistence (13% still vulnerable three years post-patch) proves that severity and publicity don’t guarantee patching without proactive process.
Active management reduces risk by 7x compared to passive approaches. Dependabot (GitHub, simple, free) or Renovate (multi-platform, flexible, more complex) enable this shift.
The tooling exists. The patches exist. The evidence is clear.
Start today. Calculate libyears for your projects. Enable Dependabot or Renovate on one pilot project this week. Establish a regular review cadence. Measure and iterate.
Active dependency management isn’t a technical problem requiring new patches. It’s an operational problem requiring better processes. The question isn’t whether to implement it. The question is how quickly you can roll it out.
For a complete overview of supply chain security practices covering frameworks, threat landscape, and additional operational strategies, see our comprehensive guide.
Direct dependencies are packages you explicitly declare in your manifest file like package.json or requirements.txt. You chose them. You added them to your project.
Transitive dependencies are packages required by your direct dependencies. Dependencies of dependencies. You didn’t choose them directly, but your project needs them.
A project with 100 direct dependencies typically has 500+ transitive dependencies. That’s a 5x multiplier. Both Dependabot and Renovate monitor both types because vulnerable transitive dependencies compromise your application just as much as vulnerable direct dependencies.
Patch versions (1.2.3 to 1.2.4): automerge continuously if tests pass. These should only contain bug fixes and security patches.
Minor versions (1.2.x to 1.3.0): review and merge weekly in batches. These should be backward compatible but deserve human review because semantic versioning promises get broken.
Major versions (1.x.x to 2.0.0): plan deliberately with quarterly review sessions. These contain breaking changes requiring migration effort.
Security updates always bypass the schedule for immediate review. If a patch addresses a CVE, review it now.
Dependency confusion is a supply chain attack where an attacker publishes a malicious package with the same name as your private package to a public registry. Package managers may install the malicious public version instead of your private version.
Lockfiles prevent this by recording the exact registry source and cryptographic checksum for each dependency. When you run npm install with a lockfile present, npm installs exactly what the lockfile specifies, including which registry it came from. No substitution possible.
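A trimmed, hypothetical package-lock.json entry shows both fields; the integrity value here is a placeholder for the real checksum:

```json
{
  "packages": {
    "node_modules/left-pad": {
      "version": "1.3.0",
      "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz",
      "integrity": "sha512-<base64-checksum-placeholder>"
    }
  }
}
```

If a public package with the same name appears, the resolved URL and checksum no longer match, so the substitution fails instead of silently installing attacker code.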
No. Automerge only patch versions that pass your CI test suite.
Patch versions (1.2.3 to 1.2.4) should contain bug fixes and security patches with no breaking changes according to semantic versioning. Configure Dependabot or Renovate to automerge these if tests pass.
Minor versions (1.3.0) and major versions (2.0.0) require human review because they can introduce breaking changes. Real-world packages break semantic versioning contracts regularly.
A lockfile (package-lock.json, Gemfile.lock, poetry.lock, go.sum) records the exact version of every dependency and transitive dependency installed, including cryptographic checksums.
Committing it ensures every developer and every CI/CD run uses identical versions. Without committed lockfiles, npm install run on Monday installs different versions than npm install run on Tuesday if any package published a new version in between.
This prevents “works on my machine” issues. Always commit lockfiles. Validate them in CI. Treat changes to lockfiles as significant.
End-of-life dependencies require replacement, not updating. No patches are coming. Vulnerabilities will accumulate indefinitely.
First, check if the maintainer recommends a successor package. Often this is noted in the README or a GitHub banner.
Second, evaluate alternatives. Look for community forks with active maintenance, foundation-backed alternatives, or different approaches to solving the same problem.
Third, budget migration time like you would for a major version upgrade. EOL migrations often require significant refactoring.
Tools like Dependabot flag EOL versions automatically. Don’t ignore these warnings.
Libyears measures your dependency age by summing the years between your current version and the latest version across all dependencies.
If you use React 16.8 (released February 2019) and the latest is 18.2 (June 2022), that’s 3.36 libyears for one dependency. Sum across all dependencies to get your project’s total.
Research shows projects with higher libyears have more vulnerabilities. Track libyears as a process health metric. Goal: less than 10 libyears per project, indicating active dependency management.
No. Running both simultaneously creates duplicate PRs and merge conflicts. Choose one based on your needs.
Dependabot for GitHub-only organisations wanting zero setup complexity. Renovate for multi-platform (GitLab, Bitbucket) or needing advanced scheduling.
Both reduce persistent risk equally when configured properly. The choice is about organisational fit, not security outcomes. Pick one, configure it well, and stick with it.
Start with a pilot project with good test coverage. Enable Dependabot or Renovate. Configure weekly scheduling. Observe PR volume for 2-3 weeks.
Establish automerge rules for patch versions that pass CI tests. Hold weekly review sessions for minor versions. After 4-6 weeks, expand to 3-5 more projects.
Track metrics: libyears (should decrease), mean-time-to-patch (should decrease), and update PR merge rate (should increase).
Full rollout takes 2-3 months as teams adapt to continuous small updates instead of large infrequent batches. The cultural shift from “if it works, don’t touch it” to “keep it current” takes time.
Vulnerability scanners (Snyk, GitHub Advanced Security, npm audit) identify CVEs in your current dependencies. They tell you what’s vulnerable. They don’t fix it.
Dependabot and Renovate both identify vulnerabilities AND create automated update PRs to fix them. Then they rerun after merging to verify remediation worked.
Think of scanners as detection tools. Think of Dependabot and Renovate as detection plus remediation automation.
For complete coverage, use both. Scanner provides comprehensive CVE database coverage. Dependabot provides automated updates.
Use this priority order:
First, patch immediately for critical CVEs with public exploits. These are emergencies.
Second, plan replacement for EOL components. No patches are coming. Risk accumulates daily.
Third, patch within 7 days for high-severity CVEs.
Fourth, reduce libyears by focusing on dependencies older than 2 years. These carry high corrosive risk.
Fifth, handle medium-severity CVEs and feature updates on normal cadence.
Configure Dependabot or Renovate to group related dependencies like all AWS SDK packages. This reduces PR count by 50-70%. Enable automerge for patches to eliminate manual review.
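Grouping in Dependabot might look like this excerpt (the package pattern is illustrative):

```yaml
# excerpt of .github/dependabot.yml
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    groups:
      aws-sdk:
        patterns:
          - "@aws-sdk/*"
```

All matching packages arrive in a single PR, so one review covers the whole family of related bumps.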
When Dependabot or Renovate creates a PR and your CI tests fail, you have three options.
First, investigate and fix if it’s a real compatibility issue. Update your code to work with the new dependency version.
Second, skip this version if it’s a known regression in the dependency. Wait for the next patch release. Close the PR with a note explaining why.
Third, pin temporarily if you need time to investigate. Add this version to your ignore list. Schedule follow-up work to either fix your code or find an alternative dependency. Don’t let temporary pins become permanent.
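In Dependabot, a temporary pin can be expressed as an ignore rule; dependency name and version here are illustrative. Remember to remove the rule once the follow-up work lands.

```yaml
# excerpt of .github/dependabot.yml
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    ignore:
      - dependency-name: "express"
        versions: ["5.0.0"]   # known regression; revisit next release
```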
Tools won’t automerge failing tests. This is why good test coverage is a prerequisite for active dependency management. Tests catch breaking changes before production.
Foundation Support and Paid Maintainers Reduce Open Source Security Risk by 300 Percent

In February 2024, the world learned that a lone, burnt-out volunteer had been socially engineered over two years in an attack that nearly compromised every Linux SSH server on the internet. The XZ Utils backdoor didn’t happen because of a technical oversight. It happened because of a human sustainability failure.
Here’s what the data shows: maintainer sustainability is a supply chain security vulnerability. Foundation-supported projects show 4.1x higher vulnerability reporting, 94% higher fix rates, and remediate vulnerabilities 264 days faster. Paid maintainer programmes deliver 3x fewer unfixed vulnerabilities.
This article is part of our comprehensive guide on software supply chain security, where we explore the evolving threat landscape and practical security frameworks. The data presented here demonstrates how governance models and maintainer compensation directly impact security outcomes. The conditions that enabled XZ Utils probably exist somewhere in your dependency chain right now.
Your software stack contains hundreds of open source components. A lot of them are maintained by unpaid volunteers working a burnout-inducing second shift after their day job.
The mechanics are straightforward. Maintainers burn out from unsustainable workloads while maintaining infrastructure the entire internet relies on. When Log4Shell hit, maintainers were working 22-hour days. For free. The Kubernetes External Secrets Operator saw contributions increase and support requests pile up while active members kept shrinking. The team was “mostly burnt out.”
By September 2024, ESO had one active maintainer left. When he went on vacation, zero pull requests merged and 20 issues opened. The project essentially shut down until five maintainers stepped up.
This is the bus factor problem. Single-maintainer projects create single points of failure. When one person burns out, your entire dependency chain is exposed. And burnt-out maintainers become susceptible to accepting help from malicious actors.
Most organisations don’t realise the scale of this. Modern software supply chain security depends on volunteers without formal security resources, governance, or succession planning. Every dependency increases your attack surface.
Burnout reduces code review quality. It slows vulnerability response. It creates social engineering opportunities that sophisticated attackers exploit.
As soon as maintainers burn out, supply chain attacks become easy. The only reason Log4Shell and XZ Utils didn’t cause catastrophes affecting hundreds of millions of people is that they were discovered early. By luck, not process.
The XZ Utils backdoor was a three-year campaign. Between November 2021 and February 2024, an account using the name “Jia Tan” worked to gain control of the xz compression utility.
Sock puppet accounts (Jigar Kumar, krygorin4545, misoeater91) pressured the lone maintainer into accepting help, and under that sustained campaign Jia Tan gained co-maintainer status. It worked.
In February 2024, a malicious backdoor was inserted into versions 5.6.0 and 5.6.1 of the liblzma library. The backdoor gave attackers with a specific private key remote code execution through OpenSSH on affected Linux systems. CVE-2024-3094 earned a CVSS score of 10.0. That’s the highest possible score.
Developer Andres Freund discovered it by accident. He noticed SSH connections generating unexpectedly high CPU usage—a 500ms latency anomaly—and errors in Valgrind. Discovery happened accidentally, not through security processes.
The attack succeeded because of human sustainability failure, not technical vulnerability. Foundation governance or paid maintainer support would likely have prevented the conditions that enabled it.
Multiple maintainers from different organisations would have blunted the sock puppet pressure campaign. Formal security processes might have caught the insertion. Professional infrastructure would have prevented a single burnt-out individual from being the sole gatekeeper.
Computer scientist Alex Stamos noted this could have provided unprecedented access to hundreds of millions of computers running SSH. Had it remained undetected, it would have given its creators a master key to those systems.
The incident started discussion about infrastructure depending on unpaid volunteers. The foundation governance data shows how that translates into action.
Foundation-supported projects under Apache, Eclipse, and CNCF governance show measurably stronger security outcomes than independent projects.
They demonstrate 4.1x higher vulnerability reporting rates, 94% higher fix rates, and 264 days faster remediation. These aren’t marginal improvements. They represent structural advantages.
Foundation-supported projects are more likely to have multiple maintainers from different organisations. That reduces single points of failure. They show higher OpenSSF Best Practices badge certification rates. They maintain repository security features including branch protection and secure development standards compliance.
Foundation governance provides what independent projects typically lack: formal disclosure processes, dedicated security teams, multi-organisational contributor bases, and professional infrastructure.
Vulnerability reporting increases because foundations establish detection and disclosure infrastructure. Fix rates improve because dedicated resources and formal processes enable faster response. Remediation speeds up because professional infrastructure and coordinated capabilities reduce friction.
Foundation projects are more likely to implement formal vulnerability disclosure, CI/CD security integration, signed commits, and branch protection. They maintain up-to-date dependency management and security audits with vulnerability remediation records.
The governance mechanisms driving these outcomes include formal security teams, mentorship programmes, release management processes, and contributor diversity requirements. These mechanisms provide infrastructure that makes security sustainable.
Foundation projects demonstrate automated testing within CI/CD pipelines more consistently than independent projects. They show timely bug fixes and LTS support policies as part of their governance structure. They maintain recent activity within 12 months due to organisational backing.
Security assurance documentation and compliance with secure development standards come standard. Communication channels show active project engagement. These practices compound to create environments where security becomes systematic, not heroic.
Paid maintainer programmes deliver measurably better security outcomes by compensating developers for dedicated security work.
Compensated projects show 45% faster vulnerability resolution, 55% more security practices adoption, and 3x fewer unfixed vulnerabilities compared to unpaid counterparts.
The Tidelift model compensates maintainers directly for security work including vulnerability response, security practices adoption, and compliance documentation. This enables dedicated time for security work rather than sporadic volunteer effort.
These programmes complement foundation governance by filling the coverage gap for projects outside formal foundation structures.
Resolution speed improves because maintainers can allocate dedicated time to security work. Security practices adoption increases because compensation enables investment in tooling and processes. Unfixed vulnerabilities decrease because sustained attention replaces sporadic effort.
Paying maintainers helps companies safeguard the stability, security, and innovation keeping their products going. It enables maintainers to help companies comply with upcoming cybersecurity legislation. Companies that pay maintainers get competitive advantages attracting customers, employees, and contributors.
Investing in maintainer sustainability is sound financial risk management.
The Open Source Pledge establishes a minimum $2,000 per full-time equivalent developer per year corporate investment in maintainer sustainability.
Platforms like thanks.dev, Open Collective, GitHub Sponsors, and ecosyste.ms Funds facilitate payments to maintainers. Though open source contributors aren’t primarily motivated by money, people are more likely to contribute knowing they’ll be paid fairly. Supporting maintainer sustainability demonstrates commitment to supply chain security.
OpenSSF Scorecard is an automated security evaluation tool measuring open source project health. It runs a series of security checks (heuristics) and assigns each a score from 0 to 10.
Scorecard maintains weekly scans of the 1 million most critical open source projects. Results are available through BigQuery’s public dataset. The tool lets you identify specific areas needing improvement and make informed decisions about accepting associated risks.
The OpenSSF Best Practices Working Group provides a systematic framework for assessing dependencies across six categories: necessity, authenticity, maintenance and sustainability, security practices, usability and security, and adoption and licensing.
Scorecard operates through multiple interfaces: a web-based viewer at scorecard.dev, a REST API for programmatic access, a GitHub Action for continuous monitoring, and a command-line interface for standalone usage.
It evaluates repositories across multiple security dimensions, producing individual check scores and an aggregate score reflecting overall security posture. The tool can be installed as a Docker container, a standalone binary, or via package managers, or compiled from source.
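Typical usage looks roughly like the following transcript. The repository path and token placeholder are illustrative; check the Scorecard documentation for current endpoints and flags.

```shell
# Query the public API for an already-scanned project
curl -s https://api.securityscorecards.dev/projects/github.com/ossf/scorecard

# Or run the CLI locally against a repository (requires a GitHub token)
GITHUB_AUTH_TOKEN=<token> scorecard --repo=github.com/ossf/scorecard
```

The API route is usually the cheaper starting point, since the weekly scans of critical projects have already done the work.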
Key maintainer health signals include recent commits within 12 months, multiple maintainers from different organisations, formal security processes, and CI/CD integration. Current version stability indicators and established communication channels showing active project engagement matter.
Single maintainer with no succession plan. No activity in the past six to twelve months. No formal security process. No CI/CD integration. No signed releases. Declining contributor activity.
Every new dependency increases the attack surface. Assess necessity first. Confirm software authenticity by verifying it originates from authorised sources rather than unauthorised forks. Check for typosquatting through name verification and popularity assessment.
Security practices assessment should verify OpenSSF Best Practices badge certification, repository security features, existing security audits, compliance with secure development standards, and OpenSSF Scorecard ratings. Hands-on assessment includes behaviour testing in isolated environments, dependency bloat analysis, and code review for malicious patterns.
Individual check results contain false positives and negatives. Aggregate scores mask nuances about specific security behaviours. Scorecard isn’t one-size-fits-all. Multiple different practices can produce identical scores. Use it as one signal among many.
Dependency selection criteria should prefer foundation-governed projects, assess maintainer health before adoption, and require minimum Scorecard thresholds.
Dual-sourcing strategy means identifying alternative libraries for single-maintainer dependencies and maintaining migration capability. When a dependency is needed for your application, has limited maintainer support, and viable alternatives exist, have an exit plan.
Tiered dependency management classifies dependencies by usage. Apply stricter governance requirements to higher-risk tiers. Treat open source dependencies with the same rigour as commercial vendor assessments. Include maintainer sustainability in your evaluation criteria. For comprehensive strategies on managing dependency risk, see our guide on understanding persistent risk in dependency management.
Verify recent activity within 12 months, established communication channels, and current version stability. Prefer dependencies with multiple maintainers from different organisations to reduce single points of failure.
Assess security practices including OpenSSF Best Practices badge certification, repository security features, and up-to-date dependency management. Safe interfaces matter: verify security-conscious API design, interface stability supporting version upgrades, secure defaults in configuration, and clear security usage guidance.
Licensing clarity correlates with security practices. Require OSI-approved licences with consistent application. Assess popularity and real-world adoption patterns.
Internal sponsorship programmes allocate engineering time for upstream contributions to dependencies. This builds direct maintainer relationships. You learn the codebase. You establish communication channels. You reduce bus factor risk.
Vendor risk assessment for open source should mirror commercial assessments. Verify vulnerability reporting procedures. Establish automated testing within CI/CD pipelines. Verify known vulnerability status before accepting dependencies.
Update and monitoring cadence matters. Implement continuous vulnerability monitoring, automated dependency updates, and periodic maintainer health reassessment. Conduct hands-on assessment including behaviour testing, code review, and test suite validation.
Assess whether dependencies suit specific use cases rather than making hype-driven selections. Just because everyone uses it doesn’t mean it’s right for you.
Organisations reduce supply chain risk by actively supporting the open source ecosystem through corporate sponsorship, foundation membership, and engineering contribution programmes.
Companies should become members of the Open Source Pledge by paying maintainers at least $2,000 per FTE developer per year.
Corporate sponsorship mechanisms include GitHub Sponsors, Open Collective, thanks.dev, and ecosyste.ms Funds. Foundation membership provides direct financial support to Apache, CNCF, Eclipse, and Linux Foundation, enabling governance infrastructure.
Engineering contribution programmes allocate developer time for upstream security work, bug fixes, and documentation. If you want your business to continue leveraging innovative open source software, the most sustainable way is to pay the maintainers doing the innovating.
The EU Cyber Resilience Act sets minimum cybersecurity requirements that must be met before software is placed on the EU market. Compliance is required by December 2027.
By December 2027, companies will have to ensure both their internally authored software and the entire open source supply chain their software depends on complies with CRA regulations.
At least 50% of foundations report insufficient financial support to ensure CRA compliance. Open source foundations are your greatest ally in ensuring you comply with EU law, but they can’t do this without funding.
Organisations should collaborate with open source software stewards that take on certifying certain packages for CRA compliance. The most reliable way to ensure compliance is supporting foundations that can audit all packages you depend on and all packages they depend on.
The cost difference isn’t even close. The Equifax breach cost over $1.4 billion. Proactive investment in maintainer sustainability is cheaper.
Most organisations cannot answer basic questions about their dependencies’ maintainer health.
Maintainer burnout is becoming common. Without addressing maintainer sustainability, businesses will face zero-day security problems without warning or any clue how to address them.
How many vulnerabilities in industry-wide infrastructure are lurking undiscovered as maintainers burn out?
Quick diagnostic: How many of your dependencies have a single maintainer? When was the last commit? Is there a formal security disclosure process? Most organisations can’t answer these questions without investigation.
Assessment cost is trivial compared to supply chain incident cost.
Start with an OpenSSF Scorecard scan of your top 20 dependencies this week. Access Scorecard through any interface to scan your dependencies. Weekly scans of the 1 million most critical projects are already running—your job is to look at the results.
Use Scorecard to verify the maintainer health signals and security practices we’ve discussed above.
The systematic evaluation framework covers necessity, authenticity, maintenance, security practices, usability, and adoption. Use it.
The action framework: assess with OpenSSF Scorecard, select using dependency criteria, support through sponsorship, monitor with continuous evaluation.
This assessment cost is negligible compared to the cost of being wrong about your supply chain.
What is open source maintainer sustainability and why does it matter for security?
Maintainer sustainability refers to whether open source project maintainers have the resources, support, and compensation to continue their work long-term. It matters for security because burnt-out maintainers produce less secure code, respond more slowly to vulnerabilities, and become targets for social engineering attacks like the XZ Utils backdoor.
How much does it cost to sponsor an open source maintainer?
The Open Source Pledge recommends a minimum of $2,000 per full-time equivalent developer employed per year. Platforms like GitHub Sponsors, Open Collective, and thanks.dev facilitate payments. This is a fraction of the cost of responding to a single supply chain security incident.
What is the OpenSSF Scorecard and how does it assess project security?
OpenSSF Scorecard is an automated tool evaluating open source projects across multiple security dimensions, assigning each check a score from 0 to 10. It checks for branch protection, CI/CD integration, vulnerability disclosure processes, signed releases, and maintenance activity. It runs weekly scans of the 1 million most critical open source projects.
Can foundation support alone prevent supply chain attacks like XZ Utils?
Foundation support reduces risk but cannot eliminate it entirely. Foundation governance provides multi-maintainer requirements, formal security processes, and professional infrastructure that make social engineering attacks much harder. However, comprehensive supply chain security also requires build integrity frameworks, dependency monitoring, and organisational security practices.
What is the difference between foundation-supported and paid maintainer models?
Foundation-supported models like Apache, Eclipse, and CNCF provide organisational governance, legal protection, infrastructure, and community processes. Paid maintainer models like Tidelift and GitHub Sponsors provide direct financial compensation to individual maintainers. Both improve security outcomes through different mechanisms, and they’re complementary rather than competing approaches.
How do I know if my dependencies are at risk from maintainer burnout?
Key warning signs include: single maintainer with no succession plan, no commits in the past six to twelve months, no formal vulnerability disclosure process, no CI/CD integration, declining contributor activity, and open issues without responses. OpenSSF Scorecard automates many of these checks.
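The warning signs above are mechanical enough to script. This is a sketch under assumed inputs—the field names are illustrative, not from any specific API—showing how you might flag at-risk dependencies from repository metadata:

```python
from datetime import datetime, timedelta, timezone

def burnout_warnings(last_commit, maintainers, has_disclosure_policy, has_ci):
    """Return the maintainer-burnout warning signs that apply to a dependency.
    Inputs are illustrative: last_commit is a timezone-aware datetime,
    maintainers an integer count, the rest booleans."""
    warnings = []
    if maintainers <= 1:
        warnings.append("single maintainer with no succession plan")
    if datetime.now(timezone.utc) - last_commit > timedelta(days=365):
        warnings.append("no commits in the past twelve months")
    if not has_disclosure_policy:
        warnings.append("no formal vulnerability disclosure process")
    if not has_ci:
        warnings.append("no CI/CD integration")
    return warnings
```

Anything returning two or more warnings is a candidate for replacement, sponsorship, or dual-sourcing.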
What did the XZ Utils backdoor (CVE-2024-3094) actually do?
The XZ Utils backdoor was a CVSS 10.0 vulnerability inserted through a multi-year social engineering campaign. It targeted the OpenSSH authentication process on Linux systems, potentially allowing unauthorised remote access. The attacker gained co-maintainer access by exploiting the original maintainer’s burnout and isolation.
How does the EU Cyber Resilience Act affect open source dependency management?
The CRA requires companies placing software on the EU market to ensure their entire supply chain, including open source dependencies, meets minimum cybersecurity requirements by December 2027. This creates a regulatory driver for supporting foundations and maintainers, as at least 50% of foundations report insufficient funding for CRA compliance work.
What is dual-sourcing for open source dependencies and when should I use it?
Dual-sourcing means identifying and maintaining migration capability to alternative libraries for dependencies, particularly those with single maintainers or uncertain sustainability. Use it when a dependency is needed for your application, has limited maintainer support, and viable alternatives exist.
How do foundation-governed projects achieve 264 days faster vulnerability remediation?
Faster remediation comes from multiple factors: formal vulnerability disclosure processes ensuring timely reporting, dedicated security teams or response coordinators, multiple maintainers available to develop and review patches, professional CI/CD infrastructure for rapid testing and release, and established communication channels for coordinated disclosure.
Hardening GitHub Actions Workflows from Mutable Tags to Runtime Monitoring

In early 2025, the tj-actions/changed-files GitHub action was compromised, exposing more than 15,000 repositories to a supply chain attack. The attack method was simple: the attacker replaced the v1 tag to point to a malicious commit. Every workflow using uses: tj-actions/changed-files@v1 suddenly started executing attacker-controlled code.
The root cause wasn’t some sophisticated zero-day exploit. It was a design choice most of us made without thinking – relying on mutable version tags instead of immutable commit SHAs.
Understanding software supply chain security in the context of CI/CD pipelines has become critical for any organisation running production workloads through GitHub Actions. If your workflows touch production credentials, deploy to cloud infrastructure, or publish packages, you’ve got the same vulnerability. The good news is hardening your workflows doesn’t require a massive security overhaul. You can migrate to commit SHA pinning in 1-2 weeks, configure OIDC authentication in 2-4 weeks, and layer on runtime monitoring without disrupting your existing pipelines.
Let’s walk through exactly how to do it.
GitHub Actions processes over 100 million workflows weekly across millions of repositories. That scale makes it a high-value target for supply chain attacks.
The default workflow configuration uses mutable tags like actions/checkout@v3. These are Git references that can be deleted and re-pointed to different commits. The tj-actions compromise proved this trust is misplaced.
Until February 2023, workflows ran with write permissions by default. A compromised action could push commits, create releases, or modify repository content. Even with more restrictive defaults now in place, many workflows still grant broad permissions.
CI/CD pipelines access production infrastructure, cloud provider APIs, and package registries. They’re sitting on AWS access keys, GitHub personal access tokens, and deployment credentials. Compromise a single popular GitHub Action and you get access to thousands of credential stores.
The attack timeline tells the story. SolarWinds in 2020 compromised the build system. Codecov in 2021 exfiltrated credentials via CI integration. The tj-actions attack in 2025 exploited mutable tags.
Developers implicitly trust actions from the actions/* namespace because GitHub publishes them. That trust often extends to third-party namespaces without scrutiny. Analysis shows 60% of popular actions use mutable dependencies, creating transitive risk even when you’ve hardened your workflows.
Commit SHA pinning replaces mutable tags with immutable 40-character commit hashes. Instead of actions/checkout@v3, you use actions/checkout@8e5e7e5a8f1e2a3b4c5d6e7f8a9b0c1d2e3f4a5b. The commit SHA is a cryptographic hash that can’t be changed or re-pointed.
Start by identifying all third-party actions in your workflows. Run grep -r "uses:" .github/workflows/ from your repository root.
For each action, navigate to its GitHub repository and locate the version tag you’re using. Click through to see which commit the tag points to. Copy the full 40-character commit SHA.
Now replace it. Change uses: tj-actions/changed-files@v40 to uses: tj-actions/changed-files@2d756ea93da014e7b7df225d13f5e6e43e5c2ee7 # v40.0.2. Notice the comment at the end. This maintains human readability while giving you immutability.
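The replacement step is easy to automate once you’ve looked up the SHAs. This is a sketch, not a definitive tool—the helper name and the SHA mapping format are my own—showing one way to rewrite a workflow line while preserving readability:

```python
import re

def pin_action(line: str, sha_map: dict) -> str:
    """Rewrite `uses: owner/repo@tag` to `uses: owner/repo@<sha> # tag`.
    sha_map maps 'owner/repo@tag' to the full 40-character commit SHA
    you looked up manually. Lines already pinned to a SHA are left alone."""
    m = re.search(r"uses:\s*([\w.-]+/[\w.-]+)@([\w.\-/]+)", line)
    if not m:
        return line
    action, ref = m.group(1), m.group(2)
    if re.fullmatch(r"[0-9a-f]{40}", ref):
        return line  # already SHA-pinned
    sha = sha_map.get(f"{action}@{ref}")
    if sha is None:
        return line  # no mapping available; leave untouched
    return line.replace(f"@{ref}", f"@{sha} # {ref}")
```

Run it over every file matched by your earlier grep, then review the diff in a pull request as usual.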
Test the workflow in a pull request branch before merging to main.
Don’t migrate everything at once. Start with workflows that have production deployment access – your highest risk targets (1-3 days). Then move to workflows that access secrets (4-7 days). Finally, handle read-only workflows (8-14 days).
Doesn’t commit SHA pinning break automatic security updates? No. Dependabot supports GitHub Actions updates.
Create .github/dependabot.yml in your repository:
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
Branch protection rules enforce code review and status checks before merge. Someone can’t just push a change to .github/workflows/deploy.yml and compromise your production pipeline.
Configure branch protection on your main branch: “Require pull request reviews” (minimum one approval), “Require status checks to pass before merge”, and “Require signed commits”. Find these under Repository Settings → Branches → Branch protection rules.
Configure your CI workflow jobs as required checks. This forces security scans – secret scanning, dependency review, vulnerability checks – to pass before any code merges.
Tag protection rules prevent unauthorised tag deletion or re-pointing, which is exactly the attack vector used in the tj-actions compromise.
Set up tag protection: Repository Settings → Tags → Add protection rule. Use the pattern v* to match all version tags. Enable “Prevent tag deletion” and “Require signed tags”.
Create .github/CODEOWNERS and add .github/workflows/ @yourorg/security-team. This ensures any workflow changes require review from your security team.
For organisations with dozens of repositories, applying protection rules per-repository doesn’t scale. Use organisation rulesets instead. Go to Organisation Settings → Rulesets → New ruleset → Target “All repositories”.
Limit “Allow force pushes” and “Allow deletions” to break-glass administrator accounts with MFA enforcement.
OIDC (OpenID Connect) replaces long-lived AWS access keys with short-lived federated tokens issued per workflow run. Instead of storing AWS_ACCESS_KEY_ID in GitHub Secrets where it remains valid for 90+ days, you get temporary credentials that expire in 15-60 minutes.
Here’s how it works: GitHub Actions requests a JWT token, AWS STS validates that token against your trust policy, then issues temporary credentials scoped to the specific workflow run.
Setting this up on AWS involves three steps. First, create an OIDC identity provider in IAM. The provider URL is https://token.actions.githubusercontent.com and the audience is sts.amazonaws.com.
Second, create an IAM role with a trust policy that allows your specific GitHub repository. The trust policy condition: token.actions.githubusercontent.com:sub equals repo:yourorg/yourrepo:ref:refs/heads/main. Lock it down to specific repositories and branches.
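Putting those conditions together, a minimal trust policy looks roughly like this. The account ID and repository path are placeholders you must replace:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:yourorg/yourrepo:ref:refs/heads/main"
        }
      }
    }
  ]
}
```

The sub condition is the critical line: widen it to a wildcard and any repository in your organisation can assume the role.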
Third, attach a least-privilege permissions policy. If your workflow deploys to S3, grant s3:PutObject for the specific deployment bucket. Don’t attach s3:* or AdministratorAccess. The most common mistake is attaching excessive IAM permissions because it’s easier.
On the workflow side, add permissions: { id-token: write, contents: read } to your job. Forgetting this causes the “OIDC token not found” error.
Then use aws-actions/configure-aws-credentials@e3dd6d6512e493a47ee3ea56a9890a770ddb8787 # v4 with the role-to-assume parameter pointing to your IAM role ARN.
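A minimal job combining both pieces might look like this. The role ARN and region are placeholders for your own values:

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # required for the OIDC token request
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@e3dd6d6512e493a47ee3ea56a9890a770ddb8787 # v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy  # placeholder
          aws-region: eu-west-1
```

No AWS_ACCESS_KEY_ID anywhere: the credentials exist only for the duration of the run.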
Phase the migration: identify workflows using long-lived secrets (1-3 days), create OIDC providers and IAM roles (4-7 days), test in staging (8-14 days), roll out to production (15-21 days). Only after confirming OIDC works should you remove long-lived secrets – give yourself a 30-day rollback window.
Validate by verifying AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are absent from repository secrets. Check CloudTrail logs for AssumeRoleWithWebIdentity API calls.
Harden-Runner from StepSecurity monitors GitHub Actions workflow execution at runtime, detecting anomalous behaviour that static analysis can’t catch. You can hash pin every action, configure perfect branch protection, and migrate to OIDC – but if a compromised action executes malicious code, none of those controls stop it. Runtime monitoring fills that gap – and contributes to SLSA build integrity requirements.
Add step-security/harden-runner@v2 as the first step in your job. Configure egress-policy: audit for monitoring mode or egress-policy: block for enforcement mode. Start with audit mode to establish a baseline.
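In workflow terms, that looks like this (shown here in audit mode):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: step-security/harden-runner@v2
        with:
          egress-policy: audit   # switch to "block" once a baseline is established
      - uses: actions/checkout@v4
      # ...the rest of your existing steps, unchanged
```

Note that harden-runner must be the first step so it can observe everything that follows; in line with the rest of this article, you’d pin it to a commit SHA in production.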
Network monitoring tracks all outbound network calls. If your deployment workflow normally calls api.github.com, registry.npmjs.org, and s3.amazonaws.com, that’s your baseline. When a workflow suddenly calls attacker-controlled-domain.com, Harden-Runner alerts you. This is exactly how the tj-actions attack would have been detected.
File integrity monitoring detects modifications outside expected paths. If a workflow starts writing to /home/runner/.ssh/ for persistence, that triggers an alert.
Process monitoring identifies unexpected process execution. If a workflow that normally runs npm and AWS CLI suddenly spawns a cryptocurrency miner or reverse shell, that’s flagged immediately.
The correlation engine compares runtime behaviour against baselines from previous runs. Deviations – new network destinations, unusual file access patterns, unexpected processes – become potential compromise indicators.
Harden-Runner generates provenance attestation recording observed behaviour, contributing to SLSA Level 2 requirements.
Performance impact is minimal. Sub-second overhead per step, less than 2% measured across thousands of runs.
Deploy in audit mode for 2-4 weeks to establish baselines. Then enable block mode for highest-risk workflows – anything touching production infrastructure or handling secrets. After validating enforcement mode isn’t breaking legitimate workflows, expand to all workflows.
You need to know what you’re running. Inventory all third-party actions, assess maintainer health, review permissions, and identify high-risk dependencies.
Use the GitHub API to enumerate repositories and parse uses: directives across your organisation. The CLI approach starts with gh api /orgs/{org}/repos --paginate | jq -r '.[].full_name' to list repositories, but you’ll want to script fetching each repository’s .github/workflows/ files and parsing their uses: lines.
Categorise by risk. Actions from the actions/* namespace are low risk – GitHub publishes them. Verified publishers are medium risk. Individual maintainers are high risk and require more scrutiny.
Assess maintainer health using OpenSSF Scorecard. This tool evaluates repositories across multiple security dimensions and produces a score from 0-10. For more detailed guidance on evaluating security tools and frameworks, see our supply chain security tool selection framework.
Set minimum acceptable scores. For workflows touching production or accessing secrets, require 7.0 or higher. For non-sensitive workflows, 5.0 is reasonable. Red flags: scores below 3.0, failed “Dangerous-Workflow” checks, or failed “Token-Permissions” checks.
actions/checkout scores 9.8, actions/setup-node scores 9.5. Many third-party actions fall in the 3.0-6.0 range.
Review the permissions each action requests. Flag excessive permission requests for manual review.
Check update frequency. Actions with the last commit more than 12 months ago might be abandoned. Consider replacing them with actively maintained alternatives.
Generate a compliance report showing what percentage of your actions use tags versus commit SHAs. Target 100% SHA adoption.
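Once you’ve collected the uses: references from your inventory, the adoption metric is a one-liner. A sketch, assuming you’ve already extracted the value after each uses: key:

```python
import re

SHA_RE = re.compile(r"[0-9a-f]{40}")

def sha_adoption(uses_refs):
    """Percentage of `uses:` references pinned to a full 40-character
    commit SHA. uses_refs holds strings like 'actions/checkout@v3' or
    'actions/checkout@8e5e... # v3.5.2' (trailing comments are ignored)."""
    if not uses_refs:
        return 0.0
    pinned = 0
    for ref in uses_refs:
        _, _, version = ref.partition("@")
        parts = version.split()
        version = parts[0] if parts else ""
        if SHA_RE.fullmatch(version):
            pinned += 1
    return 100.0 * pinned / len(uses_refs)
```

Track this number over time; it should only move towards 100%.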
Set up a remediation workflow: any action scoring below 3.0 requires replacement or manual security review within 30 days. Either find a better-maintained alternative or perform your own security audit.
Hardening your workflows isn’t a one-time exercise. New actions get added weekly. Existing actions release updates. The threat landscape evolves. You need continuous maintenance.
Configure Dependabot to track GitHub Actions and auto-generate pull requests when new commits are available. Weekly schedule is reasonable.
If you’re on GitHub Enterprise, enable GitHub Advanced Security dependency alerts for actions with known vulnerabilities.
Run quarterly manual audits. Re-execute your third-party action inventory and OpenSSF Scorecard assessment. Track metrics – is your average Scorecard score improving?
Enforce CODEOWNERS review for .github/workflows/* changes. This ensures your security team has visibility into new action adoption.
Conduct developer training on secure workflow authorship semi-annually. Cover commit SHA pinning, minimal permissions, and proper secret handling.
Define an incident response runbook for suspected workflow compromise: disable GitHub Actions organisation-wide, rotate all secrets and credentials, audit CloudTrail and workflow run logs, perform forensic analysis, re-enable Actions once you’ve remediated the compromise. Having this written down before you need it saves time during an incident.
Track metrics: SHA pinning adoption percentage (target 100%), average OpenSSF Scorecard across all actions (target above 7.0), number of long-lived secrets in use (target 50% reduction through OIDC migration).
Enable secret scanning push protection to prevent developers from accidentally committing credentials. Use dependency review to block vulnerable action versions.
Subscribe to threat intelligence sources: GitHub Security Advisories (GHSA), StepSecurity threat reports, and Hacker News for emerging attack techniques. When new attack vectors are disclosed, assess whether your workflows are vulnerable and remediate proactively.
Review and update your organisation security policy annually. Security isn’t static – your policies shouldn’t be either. For a comprehensive overview of supply chain security strategies beyond GitHub Actions hardening, see our comprehensive guide to software supply chain security.
Semantic versioning tags like @v3 or @v3.5.2 are Git references that point to commits. They can be deleted and re-pointed to different code – that’s what enabled the tj-actions compromise.
Commit SHAs like @8e5e7e5a8f1e2a3b4c5d6e7f8a9b0c1d2e3f4a5b are cryptographic hashes. They’re immutable. You cannot modify them retroactively.
When you pin to a SHA, you guarantee that the exact same code executes every time. Best practice is to use the commit SHA with a version comment – @8e5e7e5a # v3.5.2 – so you get both security and readability.
No. Dependabot supports GitHub Actions and creates pull requests when new commits are available for your pinned actions.
Configure .github/dependabot.yml with package-ecosystem: "github-actions" and Dependabot will scan your workflows weekly, identify newer versions, and open PRs with updated commit SHAs. It preserves version comments automatically.
Yes. Major cloud providers support OIDC federation.
Azure uses Entra ID workload identity. You create a federated credential in your App Registration and configure the subject claim filter to match your repository.
Google Cloud offers Workload Identity Federation. Create a Workload Identity Pool and configure a provider with https://token.actions.githubusercontent.com as the issuer.
The workflow pattern is consistent: request a JWT from GitHub, exchange it with your cloud provider, receive temporary credentials scoped to your workflow run.
Pricing is $49 per active committer per month for GitHub Enterprise Cloud. For 100 committers, that’s $4,900 monthly.
You get secret scanning push protection, dependency review, code scanning with CodeQL, and the security overview dashboard.
An active committer is any developer who committed to a private repository in the last 90 days. Public repositories get these features free.
For smaller organisations, you can combine free features like Dependabot alerts and public repository scanning with third-party tools like StepSecurity or Snyk to get comparable coverage at lower cost.
When Harden-Runner is in block mode and detects a policy violation, your workflow fails. You’ll need to investigate and adjust your policy.
The recommended approach is running egress-policy: audit for 2-4 weeks to establish a baseline before enabling block mode. This identifies legitimate behaviour that needs to be whitelisted.
When you do get a false positive, review the StepSecurity dashboard to see what network call was blocked. If it’s legitimate, add it to your allowed endpoints in the workflow configuration.
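Whitelisting a legitimate destination is a small change to the workflow configuration. An illustrative example—the endpoints are placeholders for whatever your baseline actually shows:

```yaml
- uses: step-security/harden-runner@v2
  with:
    egress-policy: block
    allowed-endpoints: >
      api.github.com:443
      registry.npmjs.org:443
      s3.amazonaws.com:443
```

Anything not on the list is blocked, which is exactly the behaviour you want once the baseline is trustworthy.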
If you need an escape hatch during production deployment, temporarily set egress-policy: audit for that specific workflow while you troubleshoot.
Private actions stored in internal repositories require different authentication than public Marketplace actions.
For actions in the same repository, use GITHUB_TOKEN with contents: read permission. For cross-repository access, use GitHub App installation tokens instead of personal access tokens.
Private actions bypass Marketplace security review entirely. Apply the same hardening standards you use for your workflows – commit SHA pinning, CODEOWNERS review, automated testing.
OpenSSF Scorecard ranges from 0 to 10, evaluating security practices like branch protection, dependency updates, and vulnerability disclosure.
For workflows touching production infrastructure or accessing secrets, require a minimum score of 7.0. For non-sensitive workflows, 5.0 is acceptable.
Red flags that should trigger immediate review: scores below 3.0, failed “Dangerous-Workflow” checks, or failed “Token-Permissions” checks.
Yes, if you have GitHub Enterprise.
Organisation rulesets let you apply branch protection, tag protection, and workflow permissions across all repositories. Go to Organisation Settings → Rulesets → New ruleset → Target “All repositories” and configure your required checks, signed commits, and permission defaults.
You can set organisation-level defaults for workflow permissions – permissions: read-all by default, requiring explicit write grants per workflow.
Free and Team plans only support repository-level policies. Organisation-level enforcement requires Enterprise.
Timeline depends on workflow complexity. Simple workflows take 1-2 hours each. Complex workflows with many dependencies might take 4-6 hours.
For 50 workflows, use a phased approach: inventory and prioritise (1 day), migrate the 10 highest-risk workflows (2-3 days), handle the remaining 40 workflows (5-7 days), and allocate 2-3 days for testing and validation.
Total duration is 10-14 days for an experienced team, or 15-21 days if you’re learning as you go.
You can script the SHA replacement to generate PRs automatically, which reduces manual effort to under an hour per workflow.
First mistake: overly permissive IAM trust policies that allow any repository in your organisation instead of locking down to specific repositories and branches.
Second mistake: attaching excessive IAM permissions like AdministratorAccess instead of least-privilege policies scoped to specific resources.
Third mistake: forgetting to add permissions: { id-token: write } to the workflow job, which causes “OIDC token not found” errors.
Fourth mistake: deleting long-lived secrets before validating OIDC works, creating a production outage if you need to roll back.
Prevention strategy: test OIDC in staging workflows first, validate CloudTrail shows AssumeRoleWithWebIdentity calls, and maintain secrets for a 30-day rollback window after migration.
The objection is valid – commit SHAs are unreadable. Developers can’t tell at a glance which version is pinned.
The solution is appending version comments: uses: actions/checkout@8e5e7e5a # v3.5.2. You get immutability with human readability.
Dependabot preserves these comments automatically when creating update PRs, so there’s no ongoing maintenance burden.
For the risk framing, show them the tj-actions incident timeline. Walk through how tag mutation works and why SHA pinning eliminates that entire attack vector.
Start with production deployment workflows – the highest risk, highest value targets. Once the team sees there’s no operational impact, expand to other workflows.
Yes, Harden-Runner supports self-hosted runners on Linux – Ubuntu, Debian, and RHEL.
Installation uses the same workflow configuration (step-security/harden-runner@v2). The monitoring agent installs on your runner during step execution.
Performance overhead is slightly higher on self-hosted runners – 1-3% CPU – compared to GitHub-hosted runners due to the instrumentation.
Network monitoring is particularly valuable for self-hosted runners with access to internal networks. You can detect lateral movement attempts that wouldn’t be possible from GitHub-hosted runners.
Windows and macOS self-hosted runners aren’t supported as of 2025. Linux only.
Implementing SLSA Build Integrity Framework for Software Supply Chain Security

Software supply chain attacks have become the dominant enterprise threat vector, as explored in our comprehensive guide to supply chain security. The SolarWinds attack hit more than 18,000 organisations through one point of failure: the build pipeline. Attackers injected malicious code called Sunburst directly into the build process. Every customer who installed the update got compromised. It proved that build systems are the most consequential attack surface in modern software development.
SLSA (Supply-chain Levels for Software Artifacts, pronounced “salsa”) is the framework designed to prevent exactly this category of attack. It’s a structured maturity model from Level 0 (no guarantees) through Level 4 (two-party review with hermetic builds). Level 2? You can hit it in 2-4 weeks using existing hosted build services like GitHub Actions.
Governed by the OpenSSF and released as version 1.0 in April 2023, SLSA gives you a checklist of standards to prevent tampering and improve integrity in your software build processes. The steering committee includes Google, Chainguard, Intel, Datadog, and Microsoft.
SLSA targets threats to software supply chains that existing efforts like NIST SSDF don’t cover well. You get documentation, automated tooling, and in-depth guidelines to help you evaluate and communicate the security level of your software artefacts.
Google proposed SLSA in 2021 as a direct response to the SolarWinds breach. It’s based on security practices Google developed internally and has been developed in public since 2021. The OpenSSF now stewards it with governance from a diverse industry-wide group of specialists.
Build integrity means ensuring artefacts are built from intended source code without modification during compilation, packaging, or distribution. When your CI/CD system produces a Docker image or binary, you want proof that it came from your repository’s commit hash and nothing else touched it along the way.
That’s what SLSA provides through provenance attestation and progressive hardening of your build infrastructure. It’s a maturity model, not a binary pass/fail. You start where you are and work your way up the levels as your security posture improves.
The framework’s goals are to prevent tampering during the build process, maintain the integrity of package distribution systems, and safeguard build environments. It addresses the trust gap between source code review and the build process that transforms code into deployable artefacts. You might trust your source code reviews, but how do you know the build process hasn’t been compromised?
SLSA is a framework, not a certification. There’s no official SLSA certification body. Organisations self-assess their level based on the published requirements. Practitioners can self-evaluate or audit others using SLSA’s language to communicate security status.
The steering committee includes companies that have been burned by supply chain attacks or have the scale to worry about them constantly. They’re building a framework that works for organisations from 50 to 50,000 people.
SLSA gives you a progressive path. You don’t need to implement everything at once. Start with transparency at Level 1, add verification at Level 2, harden your builds at Level 3, and eventually reach maximum assurance at Level 4 if your threat model justifies it.
The levels progress from no guarantees through to maximum assurance with two-party review. Each level builds on the previous one, adding requirements that make tampering progressively harder. The SLSA version 1.0 release in April 2023 focuses primarily on the build track, with levels applied to source code, build systems, packaging, and dependencies.
Level 0 is the baseline. No SLSA requirements implemented. Most organisations start here. You’re building software, but you have no formal proof about how it was built.
Level 1 introduces provenance: the who, what, and how of the build process. The provenance includes the digest of the output artefact, dependencies used during the build, and build metadata like source repository and commit hash. But it’s unsigned. That means it’s just metadata anyone could forge.
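Level 1 provenance is typically expressed as an in-toto statement. A trimmed, illustrative example—the artefact name, digest, and repository are placeholders:

```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "myapp.tar.gz",
      "digest": { "sha256": "placeholder-digest-of-the-artefact" }
    }
  ],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {
    "buildDefinition": {
      "externalParameters": {
        "repository": "https://github.com/yourorg/yourrepo",
        "ref": "refs/heads/main"
      }
    },
    "runDetails": {
      "builder": { "id": "https://example.com/your-build-platform" }
    }
  }
}
```

At Level 1 this document exists but carries no signature, which is exactly the weakness the next level addresses.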
Level 1 gives you transparency but not verification. You can look at the provenance and see what happened, but you can’t prove it wasn’t tampered with. Attackers with access to artefact metadata can modify it without detection. Level 1 doesn’t verify that dependencies are secure, meaning attackers could inject malicious dependencies.
Level 1 is designed to promote secure coding and prevent common mistakes like distributing incorrect packages. It’s a starting point that establishes visibility without adding security guarantees.
Level 2 requires digitally signed provenance by the build platform and dedicated, protected infrastructure. The signature makes it harder to forge or tamper with artefact metadata. This is where most teams should aim for their first year.
The signed provenance makes it significantly harder to forge or tamper with artefact metadata without detection. GitHub Actions, Google Cloud Build, and GitLab CI already satisfy the “dedicated build infrastructure” requirement. If you’re using hosted CI/CD, you’re probably closer to Level 2 than you realise.
Level 2 is a practical target representing a significant step towards secure software supply chain. But it doesn’t address threats from insiders or prevent tampering during the build itself, only after. It also doesn’t provide reproducibility guarantees, meaning the same inputs aren’t guaranteed to produce the same outputs.
Level 3 adds hardened builds with better isolation. Each build run is independent and executed in an isolated environment not affected by other processes. Secrets are protected from user-defined steps. This needs more sophisticated platforms like Tekton or equivalent systems with proper build isolation.
Level 3 provides tamper-resistant builds protecting from unauthorised changes, and secrets used to sign provenance data are protected from user-defined build steps. Reproducibility, however, is not guaranteed at this level.
The proposed Level 4 will focus on hermetic builds, hardware attestation, reproducible builds, and two-party review. Hermetic builds are entirely self-contained and 100% reproducible with no external factors like network access. All dependencies must be declared explicitly.
Two-party review means no single person can introduce code into production. At least two individuals review and verifiably sign off on change proposals before incorporation. This would have stopped the XZ Utils backdoor, which was a three-year campaign by a single maintainer who introduced malicious code in February 2024.
The backdoor got CVE-2024-3094 with a CVSS score of 10.0, the highest possible rating. It could have been the most widespread and effective backdoor ever planted in any software product, according to computer scientist Alex Stamos. Had it remained undetected, it would have given its creators a master key to hundreds of millions of computers running SSH.
Level 4 is where you go when nation-state actors are in your threat model. For most teams, it’s aspirational rather than practical. Achieving Levels 3 and 4 can be resource-intensive and complex, requiring significant changes to CI/CD pipelines and build environments.
Attackers compromised the build processes at SolarWinds, which let them issue a malicious update that looked legitimate. Malicious code was injected into the Orion platform on February 20, 2020.
The suspected nation-state hackers identified as Nobelium by Microsoft gained access to networks, systems, and data of thousands of SolarWinds customers. More than 30,000 public and private organisations use the Orion network management system. More than 18,000 installed the malicious updates.
The threat actors gained unauthorised access to SolarWinds network in September 2019 and tested initial code injection in October 2019. SolarWinds unknowingly started sending out Orion software updates with hacked code on March 26, 2020.
SLSA Level 2 would have detected the tampering through signed provenance verification. When deployment pipelines verify provenance before accepting artefacts, injected code without valid signatures gets rejected. The malicious update would have failed verification because it wouldn’t match the expected provenance from the legitimate build system.
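The core of that verification is comparing the artefact’s digest against the provenance subjects. A real verifier such as slsa-verifier also checks the signature and the builder identity; this sketch covers only the digest comparison:

```python
import hashlib
import json

def artifact_matches_provenance(artifact_bytes: bytes, provenance_json: str) -> bool:
    """Check whether an artefact's sha256 appears among the subjects of an
    in-toto provenance statement. Signature and builder-identity checks,
    which a production verifier performs, are deliberately omitted here."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    statement = json.loads(provenance_json)
    return any(
        subject.get("digest", {}).get("sha256") == digest
        for subject in statement.get("subject", [])
    )
```

If an attacker swaps the artefact after the build, the digest no longer matches and the deployment pipeline rejects it.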
SLSA Level 3 would have prevented the injection entirely through hardened, isolated build environments. With proper isolation, attackers can’t modify the build process even if they compromise other parts of the infrastructure.
The XZ Utils backdoor represents a different attack vector. An account using the name Jia Tan introduced a malicious backdoor to the Linux build of the xz utility in February 2024. The backdoor gives an attacker who possesses a specific Ed448 private key remote code execution through OpenSSH on affected Linux systems.
The campaign to insert the backdoor was the culmination of approximately three years of effort between November 2021 and February 2024. Jia Tan gained the position of co-maintainer through apparent sock puppetry and pressure on the project’s founder. The backdoor was notable for its sophistication and for the high operational security the attacker practised over a long period while working to attain a position of trust.
American security researcher Dave Aitel suggested it fits the pattern attributable to APT29, an advanced persistent threat actor believed to be working on behalf of the Russian Foreign Intelligence Service. Microsoft employee and PostgreSQL developer Andres Freund discovered the backdoor on March 29, 2024.
SLSA Level 3 protections and recursive SLSA audits provide some protection from XZ-style attacks. SLSA Build Level 4 would increase protection through hermetic builds and two-person review requirements that would prevent a single maintainer from introducing malicious code.
SLSA and SBOM are complementary, not competing. SLSA answers “was this built correctly?” while SBOM answers “what is inside this?”
SLSA proves build integrity: that artefacts were built correctly from intended sources without tampering. SBOM provides component transparency: a complete inventory of what’s inside a software artefact. You need both.
A software bill of materials is like a “nutritional label” on packaged food products, clearly showing what’s inside a product. By maintaining an SBOM, organisations can identify security risks, manage third-party dependencies, and strengthen supply chain security.
The Biden Administration mandated SBOMs in May 2021 through executive order. Many organisations implemented SBOM first because of regulatory pressure. That’s fine. SBOM gives you visibility into your dependencies, which helps when vulnerabilities are disclosed.
But SBOM without SLSA means you have an inventory you can’t verify. Someone could tamper with your build, and your SBOM would still show the legitimate components. SLSA provenance can include SBOM as part of build attestation, creating a verified inventory you can trust.
Regulations like the U.S. Executive Order on Cybersecurity and NIST guidelines now mandate SBOM usage for software providers. Those who operate software can use SBOMs to quickly determine whether they are at potential risk of a newly discovered vulnerability.
If you’re starting fresh, implement SBOM first for compliance, then add SLSA for integrity. If you already have SBOM, SLSA is your next step. Our SBOM implementation roadmap covers how these complementary transparency frameworks work together. Integrate SBOM generation into your CI/CD pipeline to automate creation as part of development and gain real-time visibility, then use the SBOM data to continuously assess third-party risk, flagging components with known vulnerabilities.
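As a sketch of that CI/CD integration, SBOM generation can be a single step in an existing GitHub Actions workflow. The generator shown here (anchore/sbom-action) is one option among several, and the exact input names should be checked against its documentation before use:

```yaml
# Illustrative fragment of a GitHub Actions job; the action version and
# input names below are assumptions to verify against the action's docs.
steps:
  - uses: actions/checkout@v4
  - uses: anchore/sbom-action@v0
    with:
      format: spdx-json          # SPDX is one of the two common SBOM formats
      output-file: sbom.spdx.json
  - uses: actions/upload-artifact@v4
    with:
      name: sbom
      path: sbom.spdx.json
```

Publishing the SBOM as a build artefact alongside the binary is what lets you later answer "are we running the vulnerable component?" within minutes of a disclosure.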
Level 2 is an achievable year-one goal for most teams. You don’t need to build custom infrastructure or hire a security team. The tooling already exists and most of it is free.
GitHub Actions, Google Cloud Build, or GitLab CI already satisfy the “dedicated build infrastructure” requirement. If you’re using hosted CI/CD, you’re already running builds on dedicated, protected infrastructure. That’s half of Level 2 done.
The other half is signed provenance. This is where Sigstore removes the traditional barrier of key management infrastructure. Keyless signing via OIDC means you don’t need to generate, store, rotate, or protect cryptographic keys. The signing uses ephemeral keys and OIDC identity tokens, and the signing event is recorded in Rekor, a public transparency log.
Tooling cost is minimal. GitHub Actions provides native SLSA provenance support at no additional charge. Sigstore is free and open source. The primary investment is engineering time: approximately 2-4 weeks for implementation and integration.
Most teams already using CI/CD are closer to Level 2 than they realise. You’re not rebuilding everything from scratch. You’re adding provenance generation and verification to your existing pipeline.
SLSA is designed with ease of adoption in mind. Moving up through Build track levels should feel minimally invasive. SLSA advocates for evaluating each software artefact independently without endorsing transitive trust.
Companies may use SLSA to assess their internal infrastructure, highlight security status, and identify gaps. Don’t aim for the highest level right away. Build incrementally to avoid disruption while achieving meaningful security improvements.
Not all organisations have the infrastructure or technical maturity to implement advanced security practices at Levels 3-4. Start where you are. Level 2 provides significant security value for most threat models.
The implementation follows a straightforward pattern: audit what you have, add provenance generation, add verification. For detailed guidance on implementing SLSA with GitHub Actions, see our comprehensive hardening guide that covers the hosted build platform for achieving SLSA Level 2.
Confirm your builds run on hosted infrastructure like GitHub Actions, GitLab CI, or Google Cloud Build. Check that builds don’t run on developer laptops or unmanaged servers. Hosted infrastructure gives you the dedicated, protected environment Level 2 requires.
Identify all build pipelines producing deployable artefacts. Document your build process. What triggers builds? Where do artefacts get stored? What’s the deployment pipeline? You need this map before you add provenance.
Ensure your build environment follows SLSA standards and is protected from unauthorised access. Use dedicated build infrastructure and avoid running builds on developer workstations. Regularly audit and update build tools to prevent vulnerabilities.
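The audit step above starts with a simple inventory. A minimal sketch, assuming your repositories are checked out locally, walks the `.github/workflows/` directory and maps each workflow file to the actions it references:

```python
import os
import re

def inventory_workflows(repo_root):
    """Map each workflow file in a checked-out repository to the actions
    it references, as a starting point for a build-pipeline audit."""
    pattern = re.compile(r"uses:\s*([\w./-]+@[\w.-]+)")
    inventory = {}
    workflows_dir = os.path.join(repo_root, ".github", "workflows")
    if not os.path.isdir(workflows_dir):
        return inventory  # repo has no GitHub Actions workflows
    for name in sorted(os.listdir(workflows_dir)):
        if not name.endswith((".yml", ".yaml")):
            continue
        with open(os.path.join(workflows_dir, name)) as f:
            inventory[name] = pattern.findall(f.read())
    return inventory
```

Run this across every repository in the organisation and you have the map of pipelines and third-party actions that the rest of the implementation builds on.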
Add slsa-github-generator to your GitHub Actions workflows. This generates provenance attestations that include the output artefact digest, dependencies, and build metadata.
Configure the provenance metadata, including source repository, commit hash, and build parameters. The generator identifies build artefacts and automatically generates, logs, and distributes standardised provenance metadata.
Set up Sigstore cosign for keyless signing using OIDC identity tokens. The signing happens automatically as part of your build workflow. No key management required.
Sign provenance data, attestations, and artefacts with cryptographic signatures for integrity validation. Secure and centralise the build process, ensuring auditability and tamper resistance.
Verify the provenance is being generated successfully. Check the workflow logs and confirm attestations are created and signed for each build. Standardise and document all build processes for auditability.
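Assuming a single artefact and the generic generator, the generation steps above can be wired up roughly as a two-job workflow. The version tag, artefact name, and build command are illustrative; check the slsa-github-generator documentation for current inputs:

```yaml
# Illustrative GitHub Actions workflow; build command and artefact name
# are placeholders for your existing build step.
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      hashes: ${{ steps.hash.outputs.hashes }}
    steps:
      - uses: actions/checkout@v4
      - run: make artifact.tar.gz   # your existing build
      - id: hash
        run: echo "hashes=$(sha256sum artifact.tar.gz | base64 -w0)" >> "$GITHUB_OUTPUT"

  provenance:
    needs: [build]
    permissions:
      actions: read    # read workflow metadata
      id-token: write  # OIDC token for keyless signing
      contents: write  # attach provenance to the release
    uses: slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.0.0
    with:
      base64-subjects: ${{ needs.build.outputs.hashes }}
```

The key design point is that provenance is generated by a separate, trusted reusable workflow rather than by your own build job, so a compromised build step can’t forge its own attestation.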
Add provenance verification to deployment pipelines. This is where SLSA provides security value. Generating provenance without verifying it defeats the purpose.
Reject artefacts without valid signed provenance. Make this a hard gate. If the signature doesn’t verify, the deployment fails. Verification must happen where deployment decisions are made, not just where builds run.
The common pitfall is not verifying provenance at deployment time. Teams implement generation because it’s technically interesting, then forget to enforce verification. That leaves you with transparency but no security.
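A minimal sketch of such a hard gate, assuming the attestation signature has already been verified (for example with cosign) and the payload parsed into a dict in the in-toto style, checks that the artefact digest and builder identity match before allowing deployment:

```python
import hashlib

def gate_deploy(artifact_bytes, provenance, expected_builder):
    """Hard deployment gate: refuse any artefact whose digest or builder
    identity does not match its provenance. Simplified sketch -- the
    attestation signature must be checked separately before this runs."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    subjects = {s["digest"]["sha256"] for s in provenance.get("subject", [])}
    if digest not in subjects:
        raise RuntimeError("deploy rejected: artefact digest not in provenance")
    builder = provenance.get("predicate", {}).get("builder", {}).get("id")
    if builder != expected_builder:
        raise RuntimeError("deploy rejected: unexpected builder identity")
    return True
```

Raising on failure, rather than logging and continuing, is what makes this a hard gate: an unverifiable artefact stops the pipeline.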
Monitor the supply chain to detect inconsistencies. Before implementation, builds have unknown origin and no tamper detection. After implementation, builds have verified source and are tamper-evident.
SLSA provenance verification uses Sigstore cosign to validate attestations. The verification checks the signature matches the expected identity and the provenance matches the artefact digest.
Use cosign verify-attestation to check provenance against expected values like the source repo, builder identity, and build platform. Mechanically, verification means checking the certificate identity and the OIDC issuer.
Verification should occur at every deployment gate. Before an image gets deployed to staging, verify it. Before it goes to production, verify it again. Treat unsigned or unverified artefacts as compromised.
All signatures are recorded in the Rekor transparency log, providing an immutable audit trail. Provenance includes the builder ID, build type, invocation configuration source, and materials (dependencies).
Policy enforcement with tools like Kyverno can prevent deploying images without valid provenance at the cluster level. This gives you defence in depth. Even if someone bypasses your deployment pipeline, Kubernetes won’t run the artefact.
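As an illustration of cluster-level enforcement, a Kyverno image-verification policy might look roughly like the following. The field names are close to Kyverno’s documented verifyImages schema but should be checked against the current Kyverno docs, and the registry path, policy name, and identity patterns are hypothetical:

```yaml
# Illustrative Kyverno policy sketch; verify field names against the
# Kyverno documentation for your installed version.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-provenance   # hypothetical name
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-slsa-provenance
      match:
        any:
          - resources:
              kinds: [Pod]
      verifyImages:
        - imageReferences:
            - "ghcr.io/example-org/*"   # hypothetical registry path
          attestations:
            - type: https://slsa.dev/provenance/v0.2
              attestors:
                - entries:
                    - keyless:
                        subject: "https://github.com/example-org/*"
                        issuer: "https://token.actions.githubusercontent.com"
```

With `Enforce` set, pods whose images lack a matching signed attestation are rejected at admission, regardless of how they reached the cluster.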
Monitor provenance generation success rates. You want 100% of builds generating valid signed provenance. Track verification failures separately. Failures might indicate attacks, or they might indicate configuration problems. Either way, you need to know about them.
Track SLSA compliance with monitoring alerts for images without provenance. Maturity indicators include percentage of artefacts with valid provenance, mean time to detect provenance failures, and verification coverage across deployment targets.
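Those maturity indicators reduce to simple arithmetic over your build records. A sketch, assuming each build is summarised as a dict (the field names here are illustrative, not from any particular tool):

```python
def provenance_metrics(builds):
    """Compute coverage metrics from build records shaped like
    {"id": ..., "has_provenance": bool, "verified": bool}."""
    total = len(builds)
    with_prov = sum(1 for b in builds if b["has_provenance"])
    verified = sum(1 for b in builds if b.get("verified"))
    missing = [b["id"] for b in builds if not b["has_provenance"]]
    return {
        "provenance_coverage_pct": round(100 * with_prov / total, 1) if total else 0.0,
        "verification_coverage_pct": round(100 * verified / total, 1) if total else 0.0,
        "builds_missing_provenance": missing,  # alert on any entries here
    }
```

Wiring `builds_missing_provenance` into an alert gets you to the goal stated above: 100% of builds generating valid signed provenance, with any gap surfaced immediately.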
After you achieve Level 2, plan a 1-3 year progression to Level 3. Level 3 requires build isolation that platforms like Tekton provide. This is a bigger infrastructure change than Level 2, so you need time to plan, test, and migrate.
The timeline depends on existing infrastructure maturity, team expertise, and the number of build pipelines requiring migration. Level 3 requires hardened build platforms that are regularly audited and updated, and automated validation checks to prevent tampering.
Most teams should stay at Level 2 for at least a year while they mature their verification processes and build confidence in the system. Level 3 makes sense when you have the engineering resources to manage more complex infrastructure and your threat model justifies the investment.
Implementing SLSA is a continual effort. Regularly review progress and watch for developments in the specification. Adapt your security strategy as the standards evolve so your software supply chain remains resistant to emerging threats.
SLSA provides the build integrity foundation for a complete supply chain security strategy. For a broader understanding of frameworks, practices, and tools across the entire security landscape, see our software supply chain security overview.
How do you pronounce SLSA?
SLSA is pronounced “salsa” like the condiment or the dance. It stands for Supply-chain Levels for Software Artefacts.
Is SLSA a certification or a framework?
SLSA is a framework, not a certification. There’s no official SLSA certification body. Organisations self-assess their level based on the published requirements.
What does it cost to implement SLSA Level 2?
The tooling cost is minimal to zero. GitHub Actions provides SLSA support at no additional charge, and Sigstore is free. The main cost is engineering time: 2-4 weeks for implementation and integration.
Can you implement SLSA without Sigstore?
Yes, but you’ll need to manage cryptographic keys yourself. Sigstore removes the key management burden through keyless signing with OIDC. You can use traditional signing with long-lived keys, but that adds operational overhead.
Do I need both SBOM and SLSA?
Yes. SBOM gives you component transparency, SLSA gives you build integrity. They’re complementary frameworks that address different aspects of supply chain security.
Which CI/CD platforms support SLSA provenance generation?
GitHub Actions, Google Cloud Build, and GitLab CI all support SLSA provenance generation. GitHub Actions has native support through slsa-github-generator.
How long does it take to progress from Level 2 to Level 3?
Plan for 1-3 years depending on your infrastructure and resources. Level 3 requires more sophisticated build platforms with proper isolation, which is a larger infrastructure change.
What is keyless signing and how does it work?
Keyless signing uses ephemeral keys and OIDC identity tokens instead of long-lived private keys. The signing event is recorded in Rekor, a public transparency log, providing verification without key management overhead.
Responding to the tj-actions GitHub Actions Compromise and Preventing Tag-Based Supply Chain Attacks

If your repositories used tj-actions/changed-files at any point, you need to act now. In March 2025, security researchers found that every version tag of this popular GitHub Action had been redirected to malicious code that dumped secrets into workflow logs. AWS keys, GitHub tokens, npm credentials—all sitting there in plaintext for anyone with read access to see.
The attack exploited how most teams reference GitHub Actions. It demonstrates a weakness in the ecosystem: teams pin to mutable tags instead of immutable commit SHAs. This incident exemplifies the broader challenges we explore in our software supply chain security guide, where CI/CD pipelines have become primary attack vectors.
So here’s what you need to do: work out if you’re affected, rotate any exposed credentials, and fix your workflows so this can’t happen again. This article walks through the incident response process and the preventive controls that actually work.
On March 14, 2025, security vendors including Wiz, Step Security, and Checkmarx reported that tj-actions/changed-files contained malicious code designed to steal secrets from CI/CD runners. This action helps developers detect which files changed in a pull request, and it’s embedded in over 23,000 repositories.
The attack redirected every version tag—v1 through v45—to a single malicious commit. When workflows ran, the compromised code scanned runner memory for sensitive data and dumped it into workflow logs. Anyone with access to those logs could read the secrets.
CVE-2025-30066 was assigned to this incident. While approximately 218 repositories actually exposed secrets during the compromise window, the potential impact was far larger because tj-actions/changed-files was so widely deployed.
The attack exploited a trust model weakness in how the ecosystem references GitHub Actions. The exfiltration route was the workflow logs themselves: they are readable by anyone with repository access, and publicly visible on public repos.
The attack started weeks earlier when adversaries compromised the reviewdog GitHub organisation. They targeted the reviewdog/action-setup action and related tools to steal credentials, including a GitHub Personal Access Token belonging to a tj-actions maintainer.
With that PAT, attackers performed what’s called tag mutation. They redirected every existing version tag in tj-actions/changed-files to point to a malicious commit. Because most users referenced version tags instead of immutable commit SHAs, tens of thousands of workflows unknowingly pulled the attacker’s code.
The attack chain followed this sequence: reviewdog PAT theft, tj-actions tag mutation, runtime memory dumping on CI/CD runners, then log-based exfiltration. Each stage looked legitimate in isolation, which is why it was so hard to detect.
This upstream-to-downstream cascade is a hallmark of supply chain attacks: compromise one trusted component to reach all its dependents. This pattern mirrors the build system compromise attacks where adversaries targeted CI/CD infrastructure to inject malicious code at scale. The reviewdog compromise (CVE-2025-30154) gave attackers the access they needed. tj-actions became the delivery mechanism.
The malicious code used runtime memory dumping to extract secrets from the runner’s process memory and environment variables. It bypassed GitHub’s standard secret masking by encoding credentials in a way the redaction system didn’t catch. The secrets appeared in logs in unredacted form, accessible to all users with read access.
Once executed, the malicious code scanned the GitHub runner’s memory for AWS keys, personal access tokens, npm tokens, and private keys. It dumped these secrets into the workflow logs, exposing them to anyone with repository access.
Here’s what was at risk: GitHub secrets injected into environment variables, the GITHUB_TOKEN, cloud provider keys for AWS, Azure, and GCP, npm and PyPI tokens, RSA private keys, and any other sensitive data in the runner’s process memory.
On public repositories, this exposure was visible to the world. On private repositories, anyone with read access could view the logs.
What this meant for you depended on what types of secrets were exposed. Leaked cloud provider credentials could enable infrastructure access. Leaked deployment tokens could enable supply chain attacks on your customers. Even read-only tokens created risk, though lower than write-capable credentials.
The affected actions weren’t limited to tj-actions/changed-files. The compromise also hit reviewdog/action-setup, reviewdog/action-shellcheck, reviewdog/action-composite-template, reviewdog/action-staticcheck, reviewdog/action-ast-grep, reviewdog/action-typos, and tj-actions/eslint-changed-files.
If you used any of these in your workflows, assume any secret or sensitive information that workflow accessed has been leaked to all users with read access to the repository.
Start by inventorying all repositories in your organisation that reference tj-actions/changed-files in any workflow file. These live in .github/workflows/ directories in files with .yml or .yaml extensions.
Use GitHub’s code search or the Checkmarx detection tool to identify repositories that used the action during the compromise window. Transitive relationships matter. If an otherwise-safe action uses the malicious action, you’re still at risk.
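For a quick first pass, a small script can flag workflow files that reference any of the compromised actions. The list below is a subset; extend it with the full set named later in this article and in the incident advisories:

```python
import re

# Subset of the compromised actions -- extend from the advisories.
AFFECTED = {
    "tj-actions/changed-files",
    "tj-actions/eslint-changed-files",
    "reviewdog/action-setup",
}

def find_affected(workflow_text):
    """Return the lines of a workflow file that reference an affected action."""
    hits = []
    for line in workflow_text.splitlines():
        m = re.search(r"uses:\s*([\w./-]+)@", line)
        if m and m.group(1) in AFFECTED:
            hits.append(line.strip())
    return hits
```

This catches direct references only; transitive use through a composite action still requires inspecting that action’s own source.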
Examine workflow run logs from the affected period for indicators of compromise. Look for unusual output patterns, base64-encoded content, or exposed credential values. The free, open-source tool “2MS (Too Many Secrets)” from Checkmarx Zero can help you quickly identify secrets in downloaded log files.
Check GitHub audit logs for any unusual API activity that might indicate stolen credentials were already exploited. When attackers target public repositories as entry points and those repositories serve as dependencies for thousands of downstream projects, their compromise provides immediate access to hundreds of potential targets.
And don’t forget to check both current and historical workflow configurations. You may have removed the action already, but logs from the compromise window still contain exposed secrets.
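If you want a rough triage pass before running a full scanner, a few regexes over downloaded log files will surface the most obvious credential shapes. These patterns are illustrative only; real scanners such as 2MS or TruffleHog ship far more complete rule sets:

```python
import re

# Illustrative credential shapes -- not a complete detection rule set.
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "base64_blob": re.compile(r"\b[A-Za-z0-9+/]{40,}={0,2}\b"),
}

def scan_log(text):
    """Flag (line number, pattern name) pairs in a workflow log."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for name, pat in PATTERNS.items():
            if pat.search(line):
                findings.append((lineno, name))
    return findings
```

Anything this flags is a rotation candidate; anything a proper scanner flags is a rotation requirement.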
Priority one: rotate all secrets that were available to any workflow using tj-actions/changed-files. This includes GitHub PATs, cloud provider credentials, deployment tokens, npm/PyPI tokens, and any other secrets configured in repository or organisation settings.
If you have public repositories that use one of the compromised actions, assume that any secrets used in GitHub Workflows for those repositories have been leaked and should be rotated. For internal or private repositories, be aware that secrets may have been leaked to any user with any level of access.
Priority two: replace tj-actions/changed-files with a verified alternative. Step Security provides step-security/changed-files as a drop-in replacement. Or you can use native Git commands within your workflow to detect changed files.
Whatever replacement you choose, pin it by commit SHA rather than tag to prevent the same class of attack.
Priority three: review GitHub audit logs and cloud provider access logs for evidence of credential misuse during and after the compromise window. Look for API calls, resource access, or configuration changes you don’t recognise.
Priority four: if your organisation publishes packages or actions that may have been built with compromised credentials, notify downstream consumers. They need to know their supply chain may be affected.
Priority five: document the incident scope and response actions for compliance and post-incident review. Record the timeline, affected systems, response actions, and lessons learned.
You may also need to restrict access to GitHub repositories temporarily, changing public repositories to private or internal for your organisation until you complete the audit and credential rotation.
The primary defence against tag mutation attacks is commit SHA pinning. Reference GitHub Actions by their full 40-character immutable commit hash rather than mutable version tags. For a comprehensive approach to securing your CI/CD environment beyond immediate incident response, see our comprehensive GitHub Actions hardening guide.
SHAs are immutable. Tags aren’t. An attacker who compromises an action’s repository can simply move the v4 tag to point to malicious code, instantly affecting every workflow using that tag.
In August 2024, GitHub announced policy support for blocking actions and enforcing SHA pinning. Administrators can now enforce SHA pinning through the allowed-actions policy: any workflow that attempts to use an action that isn’t pinned will fail.
Because this is a new policy, GitHub administrators must explicitly opt in. Tools available to perform the pinning include the gh-pin-actions extension for the GitHub CLI and RenovateBot.
Once pinned, Dependabot will automatically submit pull requests with the correct new commit SHA when updates are available. Use Dependabot or Renovate to automate SHA-pinned dependency updates so security doesn’t come at the cost of maintenance burden.
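Before adopting an enforcement policy, you can audit your own workflows for unpinned references. A minimal check, treating anything that isn’t a full 40-character hex digest as unpinned:

```python
import re

SHA_PIN = re.compile(r"uses:\s*[\w./-]+@([0-9a-f]{40})\b")
ANY_USE = re.compile(r"uses:\s*[\w./-]+@(\S+)")

def unpinned_actions(workflow_text):
    """Return the refs of `uses:` entries not pinned to a full commit SHA."""
    bad = []
    for line in workflow_text.splitlines():
        m = ANY_USE.search(line)
        if m and not SHA_PIN.search(line):
            bad.append(m.group(1))  # e.g. "v4", "main" -- mutable refs
    return bad
```

In practice, pair the SHA with a trailing comment naming the tag it corresponds to (a convention Dependabot preserves) so reviewers can still see which version is in use.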
Migrate from stored long-lived credentials—PATs, service account keys—to OIDC-based authentication for cloud provider access. OIDC provides short-lived, scoped authentication tokens generated at runtime rather than stored secrets. These tokens are generated on-demand and expire quickly. Because OIDC tokens aren’t stored in GitHub secrets or environment variables, they can’t be exfiltrated through log dumping. All major cloud providers support OIDC integration with GitHub Actions.
Scope GITHUB_TOKEN permissions to the minimum required for each workflow using explicit permissions blocks. Set permissions to read-only by default, and grant write only when necessary. Repositories created before February 2023 likely still have overprivileged GITHUB_TOKEN defaults with read/write permissions, while newer repositories default to read-only.
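Both recommendations come together in a small amount of workflow YAML. As a sketch using AWS (the role ARN and region are hypothetical; Azure and GCP have equivalent actions):

```yaml
# Illustrative fragment: least-privilege token plus OIDC federation.
permissions:
  contents: read    # read-only default for the whole workflow
  id-token: write   # lets the job request a short-lived OIDC token

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # hypothetical role
          aws-region: eu-west-1
      # Later steps use the temporary credentials; nothing long-lived
      # is stored in repository secrets for an attacker to dump.
```

Had this pattern been in place during the tj-actions compromise, the memory-dumping payload would have found only short-lived tokens scoped to a single role.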
Implement a third-party action audit process. Review the action’s source code for suspicious behaviour—network calls, environment variable access, log output. Check the maintainer’s reputation, contribution history, and responsiveness to security issues. Verify the action’s dependency graph for transitive risks. Assess the permissions the action requires and whether they follow least privilege principles.
Store secrets in environment secrets instead of repository secrets. Secrets defined at the repository level are accessible to all workflows with no way to limit access. Environment secrets let you set branch policies to limit which branches have access, and you can require manual approval for production keys.
And here’s a blanket rule: don’t rely on pull_request_target. This trigger runs workflow code with write permissions in the context of the base repository, even when triggered by pull requests from forks. This enables “pwn request” attacks where malicious pull requests can execute arbitrary code with elevated privileges. For most scenarios, the pull_request trigger is sufficient.
The tj-actions compromise demonstrates that CI/CD pipelines are now high-value targets for attackers, not just infrastructure to keep running. You might secure your production systems carefully, then hand the keys to your kingdom to YAML files you barely audit, using actions from strangers on the internet, triggered by events you don’t fully control.
Tag-based supply chain attacks exploit a trust model weakness: the assumption that a version tag always points to the same code. The attack pattern of compromising upstream maintainers to reach downstream targets mirrors tactics seen in the SolarWinds and XZ Utils incidents, suggesting you’re facing a persistent and evolving threat category.
The breach resulted from four misconfigurations colliding: weak event triggers, excessive permissions, mutable dependencies, and poor runtime isolation. If any one of these controls had been enforced—least-privilege tokens, SHA pinning, secret scanning, or ephemeral runners—the attack chain would have broken.
Building resilience requires defence in depth. No single control is sufficient alone, but together SHA pinning, OIDC, permission scoping, and audit procedures reduce the attack surface. These preventive controls are detailed in our hardening GitHub Actions workflows article, which covers implementation specifics for long-term CI/CD security.
You need a dependency governance policy that covers not just application packages but also CI/CD tooling, build actions, and deployment scripts. As attackers increasingly target GitHub Actions, organisations must move beyond reactive security measures. Combine webhook monitoring with audit log analysis to establish visibility into your CI/CD pipeline risks.
As we’ve seen with compromises like this, attackers are investing time to understand your supply chain. It’s time to invest equally in understanding and monitoring your attack surface. For a complete security strategy covering frameworks, operational practices, and tool selection across your software supply chain, see our broader security context guide.
What is CVE-2025-30066?
CVE-2025-30066 refers to the tj-actions/changed-files compromise. It’s critical for any organisation that used the action during the compromise window, as it could expose cloud credentials, deployment tokens, and other sensitive secrets.
What is a tag mutation attack?
Tag mutation attacks exploit the mutable nature of Git tags. An attacker with repository write access can silently redirect a tag like v1 or v45 to point to a malicious commit. Traditional supply chain compromises typically involve injecting malicious code into a new release. Tag mutation is stealthier because no new version is published—the existing trusted version silently changes.
How does SHA pinning differ from tag references?
SHA pinning references an action by its full 40-character commit hash, which is immutable and always points to the exact same code. Tag references like @v4 are mutable and can be redirected to different commits by anyone with write access to the repository. SHA pinning prevents tag mutation attacks but requires tooling to manage updates.
Should we rotate secrets even if we’re not sure they were exposed?
Yes. If any repository in your organisation used tj-actions/changed-files during the compromise window, treat all secrets available to those workflows as potentially exposed. The cost of unnecessary rotation is far lower than the risk of compromised credentials being exploited.
How does OIDC authentication reduce exposure?
OIDC provides short-lived, scoped authentication tokens generated at runtime rather than stored as long-lived secrets. Because OIDC tokens aren’t stored in GitHub secrets or environment variables, they can’t be exfiltrated through log dumping. Major cloud providers support OIDC integration with GitHub Actions.
How do you audit a third-party GitHub Action before adopting it?
Review the action’s source code for suspicious behaviour like network calls, environment variable access, and log output. Check the maintainer’s reputation, contribution history, and responsiveness to security issues. Verify the action’s dependency graph for transitive risks. Assess the permissions the action requires and whether they follow least privilege principles.
Why is the pull_request_target trigger dangerous?
The pull_request_target trigger runs with the permissions of the target branch even when a PR is opened from a fork, granting access to repository secrets. This makes it particularly dangerous when handling untrusted contributions. For most scenarios, the pull_request trigger is sufficient.
How did the attackers bypass GitHub’s secret masking?
The malicious code bypassed GitHub’s process for redacting secrets from run logs by encoding them in ways the masking system didn’t recognise. Secret masking is a useful layer but shouldn’t be relied upon as the sole defence. Combine it with OIDC, permission scoping, and SHA pinning for comprehensive protection.
How does the reviewdog compromise relate to the tj-actions attack?
The reviewdog compromise (CVE-2025-30154) was the initial stage of the attack. Attackers compromised the reviewdog organisation’s actions to steal a GitHub PAT belonging to a tj-actions maintainer. That stolen PAT was then used to mutate all tj-actions/changed-files tags to point to malicious code. The reviewdog compromise was the enabler; tj-actions was the delivery mechanism.
What should replace tj-actions/changed-files?
Step Security provides step-security/changed-files as a drop-in replacement. Or you can use native Git commands within your workflow to detect changed files. Whatever you choose, pin it by commit SHA rather than tag to prevent the same class of attack.
How do we check historical workflow logs for exposed secrets?
Use the GitHub API to retrieve workflow run logs from the compromise window. Scan them with the Checkmarx 2MS scanner or TruffleHog for patterns indicating exposed credentials—base64-encoded strings, key formats matching cloud providers, token patterns. Pay attention to any unusual output that doesn’t match expected workflow behaviour.
Are self-hosted runners safer than GitHub-hosted runners?
Self-hosted runners can provide more control over the execution environment but introduce different risks: persistent state between runs, network access to internal resources, and longer credential exposure windows. Unlike GitHub-hosted runners that are destroyed after each job, self-hosted runners persist and can be permanently compromised. Neither runner type is inherently safer; each requires specific hardening.