May 6, 2026

The Authorisation Gap Every AI Deployment Hits and How to Close It


An AI agent with read access to your CRM retrieves a customer record and forwards it to a shared Slack channel. The channel has members who should never see that data. Nobody explicitly authorised the transfer. Nothing blocked it. The system checked whether the agent could retrieve the data — but never asked whether the people receiving it were allowed to see it.

That failure has a name: the authorisation gap. It is the structural mismatch between authorisation systems designed for predictable, human-initiated access and AI agents that cross trust domains at machine speed with inherited, static credentials. According to Okta research, 80% of organisations have already encountered unexpectedly risky AI agent behaviour. The root cause is not a bug; it is a category mismatch.

This article explains the gap, covers what token exchange, ABAC, and CAEP do in plain language, and gives teams without a dedicated IAM team a concrete starting sequence. It is one chapter in the broader challenge of building an AI-ready API estate — the complete API agent-era architecture overview is in the pillar article.

What is the authorisation gap and why does every AI deployment eventually hit it?

Here is the short version: your authorisation systems were designed for humans. AI agents are not humans. That mismatch is the authorisation gap.

The longer version: OAuth scopes, RBAC roles, and long-lived API keys all assume a bounded, predictable actor. One user, one session, one scope. An AI agent crosses trust domains, executes at machine speed, and holds permissions that persist well beyond any single task. The system was never built for that.

The mechanism that creates the gap is permission inheritance. When you provision an agent using an existing user’s OAuth token or a service account, the agent inherits every permission that user has — including the permissions that have nothing to do with what the agent actually needs to do. That broad token gets presented to every API call, regardless of whether the individual call warrants it.

The result is the Confused Deputy problem: a privileged program manipulated into misusing its authority by a lower-privileged caller. The agent retrieves data using its owner’s broad permissions, then outputs it to a context where recipients have lower permissions. No single actor did anything wrong. The architecture allowed the transfer.

And machine speed makes this dangerous at scale. Human-mediated workflows had implicit delays — approval steps, review cycles, human judgement calls — that authorisation models quietly relied on. An agent strips all of that out.

Three real vulnerabilities from 2025 confirm this is not a theoretical concern:

EchoLeak (Microsoft 365 Copilot, CVE-2025-32711): Hidden prompts in emails triggered silent exfiltration from SharePoint, Teams, and OneDrive. Retrieval: checked. Output destination: not checked.
ForcedLeak (Salesforce Agentforce): Prompt injection via Web-to-Lead forms enabled CRM exfiltration through an expired domain purchased for $5. Retrieval: checked. Output destination: not checked.
BodySnatcher (ServiceNow): A hardcoded secret plus email address allowed impersonation of any user — including administrators — bypassing MFA entirely.

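The common thread in EchoLeak and ForcedLeak: each system checked whether the invoking user could access the data. Neither checked whether all recipients of the output could. BodySnatcher is the adjacent failure: a static, hardcoded credential that let an attacker act as any user.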
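The missing control is a check on the output destination, not only on retrieval. A minimal sketch of that check follows, assuming each record carries a set of principals cleared to read it; the `Record` type, the `allowed_readers` field, and the `forward` function are hypothetical names used for illustration, not part of any product mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    allowed_readers: frozenset[str]  # hypothetical: principals cleared to see this record


def blocked_recipients(record: Record, recipients: set[str]) -> set[str]:
    """Return the recipients who are NOT cleared to see the record."""
    return recipients - record.allowed_readers


def forward(record: Record, channel_members: set[str]) -> str:
    """Forward a record only if every member of the destination may read it."""
    blocked = blocked_recipients(record, channel_members)
    if blocked:
        raise PermissionError(f"output blocked; not cleared: {sorted(blocked)}")
    return f"forwarded {record.record_id} to {len(channel_members)} recipients"
```

In the CRM-to-Slack scenario from the introduction, the channel membership would be passed as `channel_members`, and a single uncleared member would stop the transfer.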

One amplifier worth flagging: shadow APIs that exist because of API sprawl bypass every authorisation control you build. Sprawl and the authorisation gap need to be addressed in parallel, not one after the other.

Why do OAuth, RBAC, and API keys break when AI agents are involved?

OAuth was designed for a specific model: one user approves a scope, one application acts on that scope in a bounded session. Agents do not work like that. A traditional OAuth client forwards a user’s request. An agent reasons about a problem and decides which tools to invoke. Agents generate intent — they do not merely forward it. That is a fundamentally different thing, and OAuth was not built for it.

There are three failure modes worth understanding properly:

1. Privilege escalation. A standard client_credentials OAuth flow grants broad scope. To let a billing agent read invoices, you grant Files.Read.All. Now that agent can read invoices, employee contracts, and patent drafts. WorkOS calls this “God Mode” — an over-permissioned autonomous agent with de-facto root access.

2. Token reuse. A long-lived access token granted for one task gets retained and reused for unrelated subsequent tasks. An OAuth token cannot be narrowed after issuance without going back to the authorisation server, and the model assumes clients will make that round trip. In decentralised agent systems, that assumption breaks routinely.

3. No task-scoped revocation. If a task is cancelled mid-execution, the token issued for it remains valid. There is no native mechanism in basic OAuth or API key infrastructure to revoke a specific task’s access without revoking all access for that agent.

RBAC makes this worse. Roles are static, coarse-grained, and designed around human job functions. RBAC cannot express “this agent may access customer records only while executing task X for user Y.” That is not a role — it is a context, and RBAC has no mechanism to encode it.

API keys are the worst of the three: static credentials, no expiry on task completion, no delegation chain, no per-task revocation.

Zero Trust Architecture is the posture that resolves this. Every API call is evaluated against the specific task and context in which it occurs — not a static role held permanently. It’s a significant shift, but it’s the right one — and it sits at the centre of the zero trust API architecture that authorisation upgrades must serve.

What is token exchange and how does it fix static authorisation for AI agents?

Token exchange is the mechanism that swaps a broad, long-lived user token for a narrow, short-lived token scoped to a single specific task. The agent gets only what it needs, only while it needs it.

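The standard is OAuth Token Exchange, specified in IETF RFC 8693. The RFC defines the exchange mechanics and leaves the scope of the issued token to authorisation-server policy. For agent delegation, that policy should require each token produced by an exchange to be a more constrained version of what was presented. Configured that way, token exchange cannot escalate permissions; it can only narrow them. That is a deliberate and important design choice.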
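For orientation, here is a sketch of the form body an agent might send to a token endpoint for such an exchange. The parameter names and the URN values come from RFC 8693; the function name, the scope string, and the audience URL in the usage note are illustrative assumptions, and nothing here contacts a real identity provider.

```python
from urllib.parse import urlencode

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_request(user_token: str, agent_token: str,
                           audience: str, scopes: list[str]) -> str:
    """Build the x-www-form-urlencoded body for an RFC 8693 token exchange."""
    params = {
        "grant_type": GRANT_TYPE,
        "subject_token": user_token,        # the user the agent acts for
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,         # the agent doing the acting
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,               # the one API this token is for
        "scope": " ".join(scopes),          # the narrow scope being requested
        "requested_token_type": ACCESS_TOKEN_TYPE,
    }
    return urlencode(params)
```

A call such as `build_exchange_request(user_token, agent_token, "https://billing.example.com", ["invoices:read"])` yields the body to POST to the token endpoint. Whether the server grants that scope is decided by its policy, not by the request.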

Built on top is the On-Behalf-Of (OBO) flow — the delegation pattern that carries both the agent’s identity and the delegating user’s identity in a single token. Token exchange is the mechanism (how a broad token becomes a narrow one); OBO is the pattern (how that narrowed token preserves the delegation chain for audit). An OBO token answers the question: “who authorised this, and who is doing it on their behalf?”

Scope Attenuation is what prevents privilege escalation. The agent’s token is the intersection of what the user is allowed to do, what the agent is allowed to do, and what the target API will accept. Asking for broader scope does not get you broader scope.
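The intersection rule can be sketched in a few lines. This is an illustration of the logic only, assuming scopes are plain strings; the `attenuate` function and the scope names are hypothetical, and a real authorisation server would apply its own policy.

```python
def attenuate(requested: set[str], user_scopes: set[str],
              agent_scopes: set[str], api_scopes: set[str]) -> set[str]:
    """Grant only what the user, the agent, and the target API all permit."""
    ceiling = user_scopes & agent_scopes & api_scopes
    return requested & ceiling


granted = attenuate(
    requested={"invoices:read", "contracts:read"},
    user_scopes={"invoices:read", "contracts:read"},
    agent_scopes={"invoices:read"},
    api_scopes={"invoices:read", "invoices:write"},
)
# granted contains only "invoices:read": the agent is not permitted
# "contracts:read", so asking for it changes nothing.
```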

The operational implementation combines Just-in-Time (JIT) provisioning with Zero Standing Privileges (ZSP). When a task triggers, an OBO token is minted — scoped, time-limited, bound to the initiating user’s authority. When the task ends, the token expires. No standing credentials, no persistent exposure.
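The lifecycle can be sketched with an in-memory issuer. This is a toy model under stated assumptions, not an identity provider: the `TaskTokenIssuer` class, its method names, and the five-minute default lifetime are hypothetical. The `sub` and `act` claim names follow the delegation convention in RFC 8693.

```python
import secrets
import time


class TaskTokenIssuer:
    """Toy just-in-time issuer: tokens exist only while a task is running."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl_seconds = ttl_seconds
        self._live = {}  # token id -> claims

    def mint(self, user, agent, task_id, scopes, now=None):
        now = time.time() if now is None else now
        token_id = secrets.token_urlsafe(16)
        self._live[token_id] = {
            "sub": user,              # who authorised the task
            "act": {"sub": agent},    # who is acting on their behalf
            "task_id": task_id,
            "scope": sorted(scopes),
            "exp": now + self.ttl_seconds,
        }
        return token_id

    def end_task(self, task_id):
        """Drop every token minted for a finished or cancelled task."""
        ended = [t for t, c in self._live.items() if c["task_id"] == task_id]
        for token_id in ended:
            del self._live[token_id]
        return len(ended)

    def check(self, token_id, scope, now=None):
        now = time.time() if now is None else now
        claims = self._live.get(token_id)
        return claims is not None and now < claims["exp"] and scope in claims["scope"]
```

Between tasks the issuer holds nothing for the agent, which is the Zero Standing Privileges property. Cancelling a task calls `end_task`, so only that task's access is removed.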

What are DPoP, PKCE, and CAEP — and why does agent authorisation need all three?

Token exchange and OBO give you the right delegation pattern. But there are still attack surfaces open even when delegation is correct. Three OAuth extensions close them.

DPoP (Demonstrating Proof of Possession) — plain language: DPoP makes a stolen token worthless without the key that was used to request it. Standard bearer tokens can be used by anyone who intercepts them. DPoP cryptographically binds the token to the agent’s private key. No key, no access.
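To make the binding concrete, the sketch below assembles the header and claims of a DPoP proof as described in RFC 9449. Signing is deliberately left out because it needs a cryptography library; the function name, the example URL, and the placeholder key in the usage note are assumptions for illustration.

```python
import base64
import hashlib
import secrets
import time


def b64url(data: bytes) -> str:
    """Base64url without padding, as used throughout JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dpop_proof_parts(method: str, url: str, access_token: str,
                     public_jwk: dict, now=None):
    """Return the (header, claims) of a DPoP proof; the caller must still sign them."""
    header = {"typ": "dpop+jwt", "alg": "ES256", "jwk": public_jwk}
    claims = {
        "jti": secrets.token_urlsafe(16),   # unique per proof, limits replay
        "htm": method.upper(),              # HTTP method of this request
        "htu": url,                         # target URI of this request
        "iat": int(time.time() if now is None else now),
        # hash of the access token, tying the proof to that token
        "ath": b64url(hashlib.sha256(access_token.encode("ascii")).digest()),
    }
    return header, claims
```

For example, `dpop_proof_parts("GET", "https://api.example.com/invoices", token, jwk)` produces the parts for one request. Because the proof is signed with the agent's private key and names the method, URI, and token hash, a token copied from the wire is not enough to make a new request.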

PKCE (Proof Key for Code Exchange) — plain language: PKCE lets an agent prove it initiated the authorisation request without needing to store a password. Agents often run as headless processes or serverless containers, and they cannot safely store client secrets. PKCE is already recommended for all public clients; for agents, it is the minimum hardening step before you push anything to production.
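The PKCE values themselves are simple to generate. The sketch below uses the S256 method from RFC 7636 and only the Python standard library; the function names are hypothetical, and the server-side check is shown purely to illustrate what the authorisation server compares.

```python
import base64
import hashlib
import secrets


def make_verifier() -> str:
    """Random code_verifier: 43 URL-safe characters, the RFC 7636 minimum length."""
    return secrets.token_urlsafe(32)


def challenge_for(verifier: str) -> str:
    """S256 code_challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_verifies(verifier: str, stored_challenge: str) -> bool:
    """What the authorisation server does when the code is redeemed."""
    return secrets.compare_digest(challenge_for(verifier), stored_challenge)
```

The agent sends the challenge with the authorisation request and the verifier with the token request. Nothing long-lived is stored, which is why this suits headless and serverless agents.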

CAEP (Continuous Access Evaluation Profile) — plain language: CAEP is the mechanism that lets you revoke an agent’s access the moment something changes, rather than waiting for the token to expire. Even a short-lived 60-minute token at machine speed represents thousands of API calls. CAEP enables the identity provider to push a revocation event to resource servers immediately when a user account is compromised or a task is cancelled. This one is a maturity-layer goal, not a day-one requirement — but plan for it.
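The effect on a resource server can be sketched as follows. This assumes the security event has already been received and its signature verified, and it uses a simplified, hypothetical event shape with a flat `subject` field; real CAEP events are signed Security Event Tokens with a richer subject format. The `ResourceServer` class is illustrative.

```python
SESSION_REVOKED = "https://schemas.openid.net/secevent/caep/event-type/session-revoked"


class ResourceServer:
    """Toy resource server that honours pushed revocation events."""

    def __init__(self):
        self._revoked_subjects = set()

    def on_security_event(self, event: dict) -> None:
        # Assumption: `event` is an already-verified, decoded payload
        # in a simplified shape, not the full wire format.
        if SESSION_REVOKED in event.get("events", {}):
            self._revoked_subjects.add(event["subject"])

    def allow(self, token_claims: dict, now: float) -> bool:
        # A revoked subject is refused even if the token has not expired.
        if token_claims["sub"] in self._revoked_subjects:
            return False
        return now < token_claims["exp"]
```

Without the event, a token with 50 minutes left stays valid for those 50 minutes. With it, the next call after the push is refused.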

Rich Authorisation Requests (RAR, RFC 9396) allow agents to declare structured intent in the token request rather than requesting abstract scopes. This supports fine-grained policy evaluation at the identity provider level, and it makes your audit trail a lot more useful.
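As an illustration of structured intent, the sketch below builds an `authorization_details` value. The field names `type`, `actions`, `locations`, and `identifier` are common fields defined by RFC 9396; the `invoice_access` type, the URL, and the function name are hypothetical, since each API defines its own types.

```python
import json


def invoice_read_details(customer_id: str) -> str:
    """Serialise an authorization_details array for a single read of one customer's invoices."""
    details = [{
        "type": "invoice_access",                               # hypothetical, API-defined type
        "actions": ["read"],                                    # what the agent intends to do
        "locations": ["https://billing.example.com/invoices"],  # where it intends to do it
        "identifier": customer_id,                              # which resource it concerns
    }]
    return json.dumps(details)
```

Compared with a bare scope such as `invoices:read`, the request now records which customer and which endpoint the agent meant to touch, and that is what lands in the audit trail.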

RBAC vs ABAC vs PBAC: which access control model is right for AI agents?

Most organisations start with RBAC because it maps naturally to how a business is organised. The question for AI deployments is not “should we replace RBAC?” It is “where does RBAC stop being sufficient, and what takes over there?”

RBAC assigns permissions to roles; roles are global, static, and coarse-grained. For human users with predictable job functions, this works well. For agents whose access requirements shift by task and runtime context, it does not. RBAC simply cannot express task-specific constraints.

ABAC (Attribute-Based Access Control) makes access decisions by evaluating policy rules that combine subject attributes, resource attributes, and environmental context. A minimal ABAC implementation targeting three attributes — task identity, initiating user, resource owner — covers the most common permission intersection failure modes without requiring complex infrastructure to get started.
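A three-attribute policy of this kind can be sketched without any policy engine. The rules below are assumptions chosen for illustration: a task may touch only listed resource types, the request must come from the user who started the task, and the resource must belong to that user. `Request`, `TASK_RESOURCES`, and `is_allowed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task_id: str
    initiating_user: str
    resource_owner: str
    resource_type: str


# Hypothetical rule table: which resource types each task may touch.
TASK_RESOURCES = {"monthly-billing": {"invoice"}}


def is_allowed(req: Request, active_tasks: dict[str, str]) -> bool:
    """active_tasks maps a running task id to the user who started it."""
    # Attribute 1 and 2: the task must be running, and for this user.
    if active_tasks.get(req.task_id) != req.initiating_user:
        return False
    # The task must be one that is meant to touch this kind of resource.
    if req.resource_type not in TASK_RESOURCES.get(req.task_id, set()):
        return False
    # Attribute 3: the resource must belong to the initiating user.
    return req.resource_owner == req.initiating_user
```

This is the constraint RBAC cannot express: the same agent, with the same role, is allowed or refused depending on the task and the user behind it.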

PBAC (Policy-Based Access Control) uses explicit, versioned policy documents as the central evaluation mechanism. AWS IAM implements PBAC natively. For regulated environments, PBAC provides the audit trail required for GDPR, SOX, and CCPA compliance. If you operate in finance or healthcare, this is where you are probably heading.
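For a sense of what an explicit policy document looks like, here is an IAM-style policy written as a Python dictionary and serialised to JSON. The structure and the `aws:ResourceTag` and `aws:PrincipalTag` condition keys are standard AWS IAM; the choice of Secrets Manager and the `project` tag key are assumptions made for the example.

```python
import json

# Allow reading a secret only when the caller's "project" tag
# matches the "project" tag on the secret itself.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "secretsmanager:GetSecretValue",
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
            }
        },
    }],
}

policy_json = json.dumps(POLICY, indent=2)
```

Because the policy is a document, it can be kept in version control and reviewed like code, which is where the audit trail comes from.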

Here is the comparison at a glance:

RBAC — Role-level granularity, low complexity, poor AI agent fit. Cannot express task-specific scope.

ABAC — Attribute-level granularity, medium complexity, good agent fit. Attribute maintenance is the ongoing cost.

PBAC — Policy-level granularity, higher complexity, good fit for regulated environments.

Beyond PBAC is Fine-Grained Authorisation (FGA), which scopes what the agent can see before retrieval rather than filtering after the fact. The practical migration path: RBAC → ABAC for agent-specific policies → PBAC or FGA as compliance and scale require. Whichever model you choose is the one the policy engine enforces at runtime.

How does zero trust for non-human actors work — and what is SPIFFE/SPIRE?

Zero Trust — “never trust, always verify” — applied to non-human actors requires answering two separate verification questions at every API call. Most implementations only answer one.

Question 1: Does this token have the right scope for this specific request? Token exchange, OBO, and ABAC address this.

Question 2: Is the agent running in the environment I expect? This is machine identity verification, and token exchange alone cannot provide it.

SPIFFE/SPIRE is the answer to Question 2. SPIFFE (Secure Production Identity Framework for Everyone) is the open standard for workload identity; SPIRE is the production implementation. Together they issue short-lived cryptographic identities — SVIDs — to workloads based on their infrastructure context. An SVID proves not just “this token is valid” but “this token is being presented by the workload I provisioned, running where I expect it to run.” A stolen token from an unexpected environment gets rejected. Official documentation: https://spiffe.io/.

SVIDs expire on short timers — minutes to hours — implementing Zero Standing Privileges at the infrastructure level, not just the token level.
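Once an SVID has been cryptographically verified (which SPIRE and the SPIFFE libraries handle), the receiving service still has to decide whether the identity inside it is one it expects. The sketch below covers only that last step, on the SPIFFE ID string; the trust domain, the paths, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID of the form spiffe://trust-domain/path."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"not a valid SPIFFE ID: {spiffe_id!r}")
    return parts.netloc, parts.path


def is_expected_workload(spiffe_id: str, trust_domain: str,
                         allowed_paths: set[str]) -> bool:
    """Accept only workloads from our trust domain with a known path."""
    try:
        domain, path = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return domain == trust_domain and path in allowed_paths
```

An identity such as `spiffe://example.org/agents/billing` would pass for the trust domain `example.org` with `/agents/billing` in the allowed set. The same path presented from a different trust domain would not.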

The emerging category here is Non-Human Identity (NHI): treating agents, services, and pipelines as first-class identity principals. Microsoft Entra Agent ID reports visibility into 500,000 deployed agents. The infrastructure is not keeping up with the deployment pace, and that gap is where risk lives.

The authoritative risk framework for all of these mitigations is the OWASP Top Ten for LLMs and Agentic Applications: https://owasp.org/www-project-top-10-for-large-language-model-applications/.

Where do you start when you have basic OAuth and no dedicated IAM team?

Starting from basic OAuth and RBAC with no dedicated IAM resource is the most common situation. Here is how to reduce the highest-risk exposure first, in order of risk reduction per unit of effort.

Step 1: Audit and isolate agent credentials (week 1–2). Identify every service account and inherited OAuth token used by agents. Document their actual scopes. Any agent with broader scopes than its tasks require is an active risk right now. This costs you no infrastructure — only attention.

Step 2: Replace service accounts with token exchange (month 1–2). For the highest-risk agents — those with access to sensitive data, external APIs, or user PII — implement OAuth Token Exchange (RFC 8693) to replace static service accounts with per-task, short-lived, user-delegated tokens. Even basic OAuth 2.0 with PKCE is a substantial improvement over what you have now. Okta, Auth0, and WorkOS all support token exchange without requiring new infrastructure.

Step 3: Layer ABAC on top of token exchange (month 2–4). Once agents have task-scoped tokens, add ABAC policies targeting three attributes: task identity, initiating user, and resource owner. AWS IAM supports this natively via condition keys. WorkOS FGA and Okta FGA are managed implementations if you do not want to build policy evaluation from scratch.
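The Step 1 audit needs no tooling beyond a list of what each agent has been granted and what its tasks require. A minimal sketch, assuming you have collected that inventory by hand or by export; the agent names, the scope strings, and the `audit` function are hypothetical.

```python
def excess_scopes(granted: list[str], required: list[str]) -> list[str]:
    """Scopes an agent holds but none of its tasks need."""
    return sorted(set(granted) - set(required))


def audit(agents: dict[str, dict[str, list[str]]]) -> dict[str, list[str]]:
    """Return only the agents that are over-permissioned, with their excess scopes."""
    findings = {}
    for name, info in agents.items():
        extra = excess_scopes(info["granted"], info["required"])
        if extra:
            findings[name] = extra
    return findings


inventory = {
    "billing-agent": {"granted": ["Files.Read.All", "invoices:read"],
                      "required": ["invoices:read"]},
    "notes-agent": {"granted": ["notes:read"], "required": ["notes:read"]},
}
findings = audit(inventory)
# findings lists billing-agent with the excess scope "Files.Read.All"
```

The agents that appear in `findings` are the candidates for Step 2.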

Two operational details that often get overlooked:

Agent de-provisioning. When an agent is retired, its credentials frequently survive it — “zombie agents” acting as dormant backdoors. SCIM-compatible identity providers can automate de-provisioning across all connected services, so this does not become a problem you discover after the fact.
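As an illustration of what automated de-provisioning sends, the sketch below builds a SCIM 2.0 PATCH request that deactivates an identity. The `PatchOp` schema URN and the operation format come from RFC 7644. Modelling the agent as a SCIM `Users` resource and the function name are assumptions; your identity provider may represent agents differently.

```python
import json

PATCH_OP_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def deactivate_request(resource_id: str) -> tuple[str, str, str]:
    """Return (method, path, body) for a SCIM PATCH that sets active to false."""
    body = {
        "schemas": [PATCH_OP_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }
    # Assumption: the retired agent is represented as a SCIM User resource.
    return "PATCH", f"/Users/{resource_id}", json.dumps(body)
```

Sending this to each connected service when an agent is retired is what keeps a zombie agent from keeping its access.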

Design OAuth for token exchange from day one. Teams building new agent-facing APIs should design their OAuth scope structure with token exchange in mind from the start. Retrofitting is expensive. See incorporating OAuth design into API-first from day one.

And the sprawl connection closes this out where it opened. Everything above is premised on knowing what APIs your agents can actually reach. Shadow APIs that exist because of API sprawl bypass every control described here. Sprawl remediation and authorisation upgrades are parallel requirements, not sequential ones.

The full architectural context — how these authorisation upgrades fit into a broader AI-ready API estate — is in the complete API agent-era architecture overview.

Frequently Asked Questions

What is the authorisation gap in agentic AI and how do I close it?

The authorisation gap is the mismatch between authorisation systems built for predictable human access and AI agents that operate at machine speed across trust domains with inherited, static credentials. Close it by replacing credential inheritance with token exchange (RFC 8693), adding ABAC policies to encode task context, and implementing CAEP for real-time revocation. The three-step sequence above is where to start.

Do I need to replace my existing OAuth implementation to close the authorisation gap?

No. OAuth Token Exchange (RFC 8693) is an extension, not a replacement. Your existing OAuth infrastructure stays in place; token exchange adds a delegation layer on top of it. Most organisations can implement it with their existing identity provider — Okta, Auth0, and WorkOS all support it out of the box.

What is the difference between token exchange and service accounts for AI agents?

Service accounts answer “what is this agent allowed to do always?” — static, broadly scoped, shared across tasks, difficult to revoke selectively. Token exchange answers “what is this agent authorised to do for this specific task, initiated by this specific user?” — per-task, short-lived, revocable. A compromised service account is a persistent broad-access risk. A compromised task token expires and is useless outside its task context. That is a meaningful difference.

Can you explain what token exchange means for AI agents in plain English?

Token exchange lets an agent borrow a narrowly scoped, time-limited permission slip rather than using a permanent pass-key. When a user triggers a task, the agent gets a token valid only for that task’s duration and scope. When the task ends, the token expires. The user’s broader permissions are never transferred — the agent receives only the subset it needs.

Is ABAC only for large enterprises or can a 100-person company implement it?

ABAC is not inherently enterprise-scale. A minimal implementation targets three attributes: task identity, initiating user, and resource owner. AWS IAM already supports ABAC natively via condition keys — no new infrastructure required if you are already on AWS. WorkOS FGA and Okta FGA are managed options if you are not. Start narrow and expand as patterns emerge.

What is RBAC vs ABAC vs PBAC — which should I use for AI agents?

RBAC is appropriate for human users in defined roles, insufficient for agents whose access requirements vary by task. ABAC can encode the task context RBAC cannot express. PBAC adds versioned, auditable access decisions for regulated environments. Practical starting point: implement ABAC as an overlay on existing RBAC — RBAC governs humans, ABAC governs agents.

What is CAEP and why does it matter for AI agents specifically?

CAEP (Continuous Access Evaluation Profile) enables real-time revocation when risk conditions change. For agents at machine speed, a 60-minute token represents thousands of API calls that may continue after an account compromise or task cancellation. CAEP lets the identity provider push a revocation event to resource servers immediately. Both provider and resource servers need to support the protocol — treat it as a maturity-layer goal rather than a day-one requirement.

What is the Confused Deputy Problem and how does it apply to AI agents?

The Confused Deputy is a classic vulnerability: a privileged program manipulated into misusing its authority by a lower-privileged caller. For agents, it plays out like this — an over-permissioned agent retrieves data the initiating user cannot see, then outputs it somewhere accessible. Prompt injection — malicious instructions embedded in content the agent retrieves — is the most dangerous version of this. Token exchange plus ABAC closes it by ensuring the agent’s permissions are always bounded by the initiating user’s permissions.

What is SPIFFE/SPIRE and do I need it for AI agent security?

SPIFFE is the open standard for workload identity; SPIRE is the production implementation. Together they issue short-lived cryptographic identities (SVIDs) based on infrastructure context — not just a token, but where and how a workload is running. A stolen token from an unexpected environment is rejected. For smaller deployments, SPIFFE/SPIRE is a hardening layer to plan for rather than a day-one requirement. Official documentation: https://spiffe.io/.

What does Zero Standing Privileges mean for AI agents in practice?

Zero Standing Privileges (ZSP) means an agent holds no active permissions between tasks. Permissions are issued just-in-time when a task triggers and expire when it completes. This eliminates the persistent service account an attacker can exploit even when no agent is active. Implementation: combine token exchange (RFC 8693) with CAEP, set short token lifetimes, and eliminate all standing service accounts used by agents.

