Measuring Whether AI Governance Is Working Beyond Usage Counts

Most organisations think they are governing AI because they have a policy document and can count how many staff completed AI training. Here is what Knostic’s AI governance statistics actually show: 75% of organisations have established AI usage policies, but only 36% have adopted a formal governance framework. That gap is governance theatre. You have artefacts — policies, committees, training completion records — but no mechanism to detect whether any of it is changing behaviour or reducing risk.

Usage counts tell you AI is being used. Outcome metrics tell you whether governance is actually working. If your organisation lacks a dedicated governance function, you cannot afford to wait for a failure to establish measurement discipline.

This article gives you a practical governance metrics framework, a concrete breakdown of the key metrics, and a sequenced baseline-building guide for organisations starting from zero. Understanding what AI governance actually requires at the organisational level is the prerequisite — this article is about measuring whether you are achieving it.


Why Are Usage Counts Not AI Governance Metrics?

Usage counts — active users, licence seats, training completion rates, number of AI tools deployed — measure adoption velocity. They do not measure governance health. Whether AI use is producing acceptable outcomes requires a different class of metric entirely.

Think about the input/process/output hierarchy. Usage metrics are inputs. Governance metrics are outputs. Knowing that 85% of your team completed AI training tells you nothing about whether that training changed how they handle sensitive data in AI prompts. Knowing your policy compliance rate tells you whether the training worked.

There is a name for this mistake: the policy existence fallacy. A written Acceptable Use Framework (AUF) is the measurement baseline, not the measure of governance effectiveness. IBM’s Global Leader for Trustworthy AI, Phaedra Boinodiris, put it well: “I measure success by whether ethical AI principles are embedded into strategy, workflows and decision-making, not just written down.” If a significant policy violation occurred today, would your current dashboards alert you? If the answer is “I don’t know,” you are measuring inputs, not governance.

The numbers back this up. Only 25% of organisations have fully implemented AI governance programmes, despite near-universal AI adoption. Only 7% have fully embedded AI governance into development pipelines. And fewer than 20% track well-defined KPIs for their GenAI solutions. That means most organisations cannot compare use cases, tune guardrails, or justify governance investment — because they have no measurement foundation to build on.


What Does a Governance Metrics Framework Actually Look Like?

Governance metrics organise into four practical categories. If you only measure one, you have blind spots.

Operational metrics track what is happening: AI Asset Inventory coverage rate, shadow AI detection rate, and the number of active use cases under formal oversight.

Risk metrics track what risks are materialising: incident cadence, model drift rate, and mean time to detect policy violations.

Compliance metrics track what external requirements are being met: policy compliance rate, audit-readiness score, and remediation lead-time.

Effectiveness metrics track whether behaviour is actually changing: human-in-the-loop override rate, shadow AI trend over time, and near-miss report rate. A rising near-miss reporting rate alongside a declining incident rate signals genuine governance improvement, not just luck.

One distinction worth making before you start: AI governance metrics and AI performance metrics are not interchangeable. Performance metrics — accuracy, latency, F1 score, hallucination rate — measure whether an AI system is doing its technical job correctly. Governance metrics — policy compliance rate, incident cadence, shadow AI detection rate — measure whether your organisational controls are working. A high-performing AI system can still produce data leaks, policy violations, and accountability gaps that performance dashboards will not surface. In smaller organisations the same engineering team often owns both, which is exactly why the distinction needs to be explicit.

Here are the seven metrics worth tracking, what each one measures, and when to worry:

Shadow AI detection rate: the percentage of AI tool usage happening outside your approved inventory. Collect via CASB audit, OAuth review, and network monitoring. A rising trend or anything above 20% of traffic is a red flag.

Policy compliance rate: what percentage of AI interactions conform to your AUF. Collect via gateway logs and DLP hit rate. Below 90% or a month-on-month decline is the warning signal.

Incident cadence: frequency and severity of AI-caused events and near-misses. Collected via an incident log. Watch for a rising incident rate alongside declining near-miss reports.

Model drift rate: deviation of deployed model outputs from their validated baseline. Collected via a continuous monitoring pipeline or MLflow. Any unreviewed drift beyond your agreed threshold is the problem.

Audit-readiness score: percentage of required governance artefacts present and current. Scored against ISO/IEC 42001 or NIST AI RMF. Below 70% is your red flag.

Remediation lead-time: mean time from governance gap detection to verified closure. Tracked in an issue tracker. More than 30 days for a medium-risk issue is too slow.

Human-in-the-loop override rate: what percentage of AI decisions are reversed or reviewed by humans. Collected from system logs against HITL checkpoints. A decline to zero without documented rationale is a governance failure.
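
To make those thresholds operational, here is a minimal red-flag check — the thresholds and directions come from the list above, while the dictionary structure and sample values are illustrative assumptions:

```python
# Red-flag check for the governance metric thresholds named in the text.
# Sample values are illustrative; wire in your real data sources.

metrics = {
    "shadow_ai_detection_rate": 0.24,   # share of AI traffic outside the approved inventory
    "policy_compliance_rate": 0.87,     # share of AI interactions conforming to the AUF
    "audit_readiness_score": 0.65,      # share of required artefacts present and current
    "remediation_lead_time_days": 41,   # mean days from gap detection to verified closure
}

def red_flags(m: dict) -> list[str]:
    flags = []
    if m["shadow_ai_detection_rate"] > 0.20:
        flags.append("shadow AI above 20% of traffic")
    if m["policy_compliance_rate"] < 0.90:
        flags.append("policy compliance below 90%")
    if m["audit_readiness_score"] < 0.70:
        flags.append("audit readiness below 70%")
    if m["remediation_lead_time_days"] > 30:
        flags.append("remediation lead-time over 30 days for medium-risk issues")
    return flags

for flag in red_flags(metrics):
    print("RED FLAG:", flag)
```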

Some of these require tooling — a CASB for shadow AI detection, a continuous monitoring pipeline for drift. Others need nothing new. Policy compliance rate uses existing DLP. Incident cadence works from a shared spreadsheet. For organisations without a dedicated governance function, the constraint is not tooling — it is ownership and cadence.

The NIST AI Risk Management Framework organises governance activities into four functions: Govern, Map, Measure, Manage. The Measure function directly addresses how to establish and track AI risk indicators. ISO/IEC 42001 complements it with a certifiable governance checklist you can use to score audit-readiness.


What Is the Difference Between Static AI Governance and Continuous AI Governance?

Static governance means point-in-time compliance reviews, annual policy updates, and training sign-offs. It creates a snapshot that is outdated the moment it is completed.

Continuous AI governance treats governance as ongoing monitoring, measurement, and enforcement — operational infrastructure, not an audit event. By 2026, AI is embedded in core business workflows and risk is distributed across the entire lifecycle. Static governance is not adequate for that environment.

What continuous governance actually requires is two cadences, not one. Automated monitoring runs continuously — tracking drift in model outputs, detecting bias patterns, flagging prompt anomalies. A formal quarterly review cycle brings together technical metrics and business context to refresh policies, validate risk classifications, and assess maturity. The gap between those two cadences is where problems hide if you only do one or the other.

Without a dedicated governance function, the minimum viable continuous stack is achievable without custom infrastructure:

  1. Automated alerts for shadow AI activity — network monitoring and OAuth review to detect unsanctioned AI tool adoption
  2. Policy compliance rate via DLP integration — most security suites already have the raw data; it requires configuration, not new tooling
  3. Model drift alerting — a continuous monitoring integration (MLflow, Liminal AI, or Microsoft Purview) with a defined threshold
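
A minimal sketch of the third item, assuming the monitoring pipeline already produces a summary statistic per review window — the baseline value, statistic, and threshold here are illustrative, not prescriptive:

```python
# Drift alerting sketch: compare a deployed model's recent output statistic
# against its validated baseline; alert on deviation beyond the agreed threshold.

BASELINE_APPROVAL_RATE = 0.12  # recorded when the model was validated
DRIFT_THRESHOLD = 0.05         # the agreed maximum unreviewed deviation

def check_drift(recent_approval_rate: float) -> None:
    deviation = abs(recent_approval_rate - BASELINE_APPROVAL_RATE)
    if deviation > DRIFT_THRESHOLD:
        # Route this to the named owner of the model drift metric.
        print(f"DRIFT ALERT: deviation {deviation:.3f} exceeds {DRIFT_THRESHOLD}")

check_drift(recent_approval_rate=0.19)  # fires: 0.070 > 0.050
```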

A reasonable cadence for small teams: Policy Compliance Audit quarterly, Technical Controls Review quarterly, Risk Classification Validation semi-annually, Vendor Assessment annually, Governance Maturity Assessment annually. This supplements continuous alerting — it does not replace it.


Where Does Your Organisation Sit on the AI Governance Maturity Model?

The AI Governance Maturity Model runs five stages: Ad-hoc (no formal measurement) → Developing (usage metrics only) → Defined (outcome metrics documented) → Managed (continuous monitoring operational) → Optimising (measurement drives governance investment decisions).

Most mid-market organisations at the 50–500 person scale sit at Developing or early Defined. They have policies and track adoption but lack outcome measurement infrastructure. The Knostic data confirms this: only 28% of organisations report enterprise-wide oversight of AI governance roles and responsibilities.

Here is a quick self-assessment. Three questions, no formal audit required:

  1. If a significant policy violation occurred today, would your current dashboards alert you?
  2. Do you know your current shadow AI detection rate?
  3. Has your incident cadence improved since you implemented governance?

If the answer to any of these is “no” or “I don’t know,” your governance programme has at least partial theatre characteristics.

Moving from Developing to Defined requires three things: establish three to five outcome metrics, assign data ownership for each metric, create a reporting cadence. That is achievable in 60–90 days without a dedicated governance team. Moving from Defined to Managed requires instrumenting continuous monitoring for the highest-risk indicators — shadow AI, policy compliance, drift — and making metrics visible to leadership.

The maturity model also solves the executive investment problem. Without governance KPIs, you cannot demonstrate governance is working or quantify the risk of not investing further. With three months of shadow AI detection rate data, you can say: “X% of our AI activity is happening outside approved controls. The cost to close that gap is Y. The cost of a data breach from ungoverned AI use, per IBM’s 2025 Cost of Data Breach report, is Z.” That is a governance investment argument. Without measurement, you cannot make it.


How Do You Build a Governance Measurement Baseline from Nothing?

Starting from zero does not mean starting everywhere at once. Here is the sequenced approach, ordered by governance value and implementation effort.

Start with the AI Asset Inventory. You cannot measure what you cannot see. The inventory covers all AI in use: approved enterprise tools, shadow AI tools employees are using without approval, custom models in development, third-party AI embedded in purchased software, and AI usage distributed across departments. Without it, every other metric rests on incomplete data. The inventory also produces your first governance metric immediately — shadow AI detection rate emerges from the gap between what you find and what was previously documented.

Make policy compliance rate your second metric. It is the most accessible entry-point governance KPI because most organisations already have the infrastructure — DLP tooling that monitors prompts, flags sensitive data, and logs policy violations. This does not require new tooling. It requires configuration of what is already present.

Set up an incident log. Even a shared spreadsheet with a defined schema is sufficient to begin tracking incident cadence. The value is the trend, not the tooling sophistication. Define what counts as an AI-caused incident, assign someone to maintain the log, review it quarterly.
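
A defined schema needs nothing more than agreed columns. One possible shape, sketched as a Python dataclass — the field names are suggestions, not a standard:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AIIncident:
    """One row in the incident log — spreadsheet columns work just as well."""
    occurred: date
    system: str         # which AI tool or use case was involved
    description: str
    severity: str       # e.g. "low" / "medium" / "high" — define these up front
    near_miss: bool     # near-misses are tracked alongside actual incidents
    owner: str          # the named person maintaining this entry
    status: str         # e.g. "open" / "remediated" / "verified closed"
```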

Build an audit-readiness checklist. Map required governance artefacts — risk assessments, model cards, training logs, vendor contracts, bias test results — against ISO/IEC 42001 or the NIST AI RMF. Score the current state. External audits do not assess whether your AI works well — they assess whether you can demonstrate control over it. The checklist is how you know the answer before the auditor asks.
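
Scoring the checklist is simple arithmetic. A sketch, assuming each artefact reduces to a present-and-current flag — the artefact list comes from the text, the flags are illustrative:

```python
# Audit-readiness score: percentage of required artefacts present and current.

artefacts = {
    "risk_assessments": True,
    "model_cards": False,
    "training_logs": True,
    "vendor_contracts": True,
    "bias_test_results": False,
}

score = 100 * sum(artefacts.values()) / len(artefacts)
warning = " — below the 70% red flag" if score < 70 else ""
print(f"Audit-readiness: {score:.0f}%{warning}")  # Audit-readiness: 60% — below the 70% red flag
```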

Assign metric ownership. A RACI Matrix for AI ensures each metric has a named owner responsible for data collection and reporting. Without ownership, measurement programmes decay.

Once you have three metrics tracked over two to three months, you have data to present to the board. A rising policy compliance rate and a declining shadow AI detection rate is a governance improvement narrative. Measurement infrastructure is inseparable from the operating model design decisions that generate measurable outcomes — and named metric ownership, part of the accountability structures whose effectiveness you need to measure, is the bridge between governance design and governance measurement.

For organisations already using Databricks, Unity Catalog standardises access policies, enforces lineage, and centralises metadata for risk assessment. IBM watsonx.governance and Microsoft Purview provide equivalent capabilities for those ecosystems. Liminal AI provides centralised multi-model access through a single secured interface with built-in policy enforcement and centralised logging. The observability infrastructure that generates governance measurement data is covered separately — but measurement starts with these five priorities before any specialised tooling is in place.


FAQ

What is the difference between AI governance metrics and AI performance metrics?

AI performance metrics — accuracy, latency, F1 score, hallucination rate — measure whether an AI system is doing its technical job correctly. They are engineering concerns.

AI governance metrics — policy compliance rate, incident cadence, shadow AI detection rate, audit-readiness score — measure whether your organisational controls over AI use are working. High-performing AI systems can still produce governance failures — data leaks, policy violations, accountability gaps — that performance dashboards will not surface.

How many AI governance KPIs is the right number to start with?

Three to five. More is not better when you lack a dedicated governance function — poorly-owned metrics decay into noise. Start with shadow AI detection rate, policy compliance rate, and incident cadence. Add audit-readiness score and remediation lead-time once the first three are producing reliable data.

What is governance theatre and how do I tell if my programme has become it?

Governance theatre is when you have governance artefacts — policies, committees, training completion records — but no mechanism to detect whether those artefacts are changing behaviour or reducing risk. Three-question diagnostic: (1) Would your dashboards alert you to a significant policy violation today? (2) Do you know your current shadow AI detection rate? (3) Has your incident cadence improved since you implemented governance? “No” or “I don’t know” to any of these means you have at least partial theatre.

Can I just write an AI policy and call it governance?

No. A written policy is the measurement baseline — its existence is measurable, but its effectiveness is not measured by its existence. A policy tells employees what to do. Governance measures whether they are doing it, detects when they are not, and triggers a response when violations occur. The Acceptable Use Framework is the starting point for governance, not the end point.

What does continuous AI governance actually require in practice?

Automated alerting for the highest-risk signals — shadow AI activity, DLP policy violations, model drift beyond defined thresholds — plus a defined response process when alerts fire. For organisations without a dedicated governance team, an AI gateway proxy (such as Liminal AI) or DLP integration in an existing security suite provides the raw data. Continuous governance does not mean real-time human review of every AI decision — it means automated detection with human escalation thresholds defined in advance.

What is the AI governance maturity model and where do most mid-market companies sit?

Five stages: Ad-hoc → Developing → Defined → Managed → Optimising. Most mid-market organisations sit at Developing — they have policies and track adoption but lack outcome measurement infrastructure. Moving from Developing to Defined requires three to five outcome metrics with named data owners and a reporting cadence — achievable in 60–90 days without a dedicated governance team.

How do I use governance metrics to make the case for governance investment?

When governance is not measured, the executive team cannot see risk, so they do not fund prevention. A rising shadow AI detection rate gives you a concrete argument: X% of AI activity is happening outside approved controls; the cost to close that gap is Y; the cost of a data breach from ungoverned AI use is Z. Present a maturity gap: “We are at Developing. Moving to Defined requires these three investments.” That is the conversation governance metrics make possible.

How does shadow AI detection rate indicate overall governance health?

Shadow AI detection rate is a proxy for the entire governance system’s effectiveness. If employees are circumventing approved AI channels, it simultaneously reveals gaps in policy enforcement, enablement, and cultural adoption. A high rate often signals that sanctioned tools are too restrictive or too slow — employees are not malicious, they are productive. A declining rate following an enablement programme is strong evidence that governance is changing behaviour. That is the definition of effective governance.

What is an audit-readiness score and why does it matter?

Audit-readiness score is the percentage of required governance artefacts — risk assessments, model cards, training logs, vendor contracts, bias test results — that are present, current, and verifiable at any given time. External audits do not assess whether your AI works well. They assess whether you can demonstrate control over it.

How is model drift a governance issue and not just a technical issue?

Model drift — statistical deviation of a deployed model’s outputs from its validated baseline — means the AI system you approved for production is no longer behaving as approved. Without drift monitoring, you have no mechanism to detect that your validated AI has changed — rendering all pre-deployment governance controls retrospectively irrelevant.

How does the NIST AI RMF help with governance measurement?

The NIST AI Risk Management Framework organises governance activities into four functions: Govern, Map, Measure, Manage. The Measure function directly addresses how to establish and track AI risk indicators. For organisations making a board-level investment case or preparing for customer due diligence, NIST AI RMF provides a credible, regulator-accepted framework for justifying metric selection. ISO/IEC 42001 provides a certifiable governance checklist that complements it.


If you have reached this article without a governance measurement programme in place, the starting point is simpler than the enterprise governance literature suggests: an AI Asset Inventory, DLP-derived policy compliance rate, and an incident log. Three metrics, two to three months of data, one spreadsheet. That is enough to move from governance theatre to governance intelligence — and enough to start the board-level conversation about where to invest next.

The broader picture of what AI governance actually requires at the organisational level connects this measurement infrastructure to the operational and accountability structures that generate the data worth measuring. Measurement does not replace governance design — it proves that governance design is working.

Detecting Shadow AI and Creating Sanctioned Pathways for Tool Use

SurveyMonkey found that a quarter of workers use AI at work without telling their manager — including 43% of senior leaders. That is not a rogue employee problem. That is a governance design failure.

Shadow AI grows when official AI pathways are absent, slow, or more trouble than they are worth. When IT approval takes two weeks and a personal ChatGPT account takes two minutes, the unofficial route wins. You cannot ban your way out of it — employees need AI to stay competitive. But unrestricted use creates real exposure: shadow AI incidents account for 20% of all breaches, and IBM’s 2025 Cost of Data Breach Report found shadow AI adds $670,000 to the average breach cost.

The answer is: detect first, then enable. Build official pathways that outcompete the shadow alternatives. This is the practical playbook for a scaling technology company, and it sits within a broader AI governance framework that connects policy to actual employee behaviour.


Why Prohibition Doesn’t Work — and What to Do Instead

Cyberhaven’s research found that 91% of AI tools used in enterprises are completely unmanaged — even at companies with explicit policies prohibiting them. The tools are still there; IT just cannot see them. When 59% of employees use unapproved tools even when they understand the risks, awareness is not the problem. Governance design is.

Here is the strategic reframe: the goal is not to prohibit unauthorised use. The goal is to make sanctioned use easier than the shadow alternative. One organisation found its engineers using fifteen different coding tools simultaneously. What worked was providing GitHub Copilot Enterprise and deploying credential detection across all tools. Credential exposure dropped to zero. The sanctioned pathway won because it was better, not because the shadow one was blocked.

The correct objective is to reduce shadow AI by increasing sanctioned AI adoption — not by increasing restrictions.


Starting with Visibility: How to Build an AI Asset Inventory

You cannot govern what you cannot see. Before detection, policy work, or access controls, you need to know what AI is already in your organisation. That is the AI Asset Inventory: a catalogue of every AI tool in use, sanctioned and suspected unauthorised.

Most companies do not have one. The Reco 2025 State of Shadow AI Report found companies with 11–50 employees averaged 269 unsanctioned AI tools per 1,000 employees.

Building the inventory requires no dedicated security tooling:

1. IT procurement data. What AI subscriptions does the company actually pay for? This is your known baseline.

2. SaaS spend analysis. Review expense reports for AI vendor line items IT did not provision. Employees often expense personal AI subscriptions. They show up in the numbers.

3. SSO logs and OAuth grants. Check your Google Workspace or Microsoft 365 admin panel for OAuth grants to third-party AI services. Every “Sign in with Google” creates a grant. Auditable in minutes.

4. Browser extension review. Scan installed extensions on company-managed devices — frequently missed by network-level monitoring.

5. Employee AI usage survey. Run an anonymous survey before technical discovery — the most reliable method for surfacing AI use in legal, finance, HR, and product, where network monitoring has blind spots.
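
Once those five sources are collected, the gap between known and detected tools falls out of a set comparison — a sketch, assuming each source has been reduced to a list of tool names (the names here are illustrative):

```python
# Inventory gap: known AI estate (procurement) versus detected estate
# (spend analysis, OAuth grants, extensions, survey responses).

known = {"ChatGPT Enterprise", "GitHub Copilot Business"}
detected = {"ChatGPT Enterprise", "GitHub Copilot Business",
            "Personal ChatGPT", "Cursor", "Grammarly AI"}

shadow = detected - known
print(f"Undocumented tools: {sorted(shadow)}")
print(f"Inventory gap: {len(shadow) / len(detected):.0%} of detected tools")
```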

The inventory is a living document. The gap between your known AI estate and your detected AI estate is itself a governance metric.


How to Detect Unauthorised AI Tool Use Without Surveillance Overreach

Effective shadow AI detection combines network-level monitoring, endpoint visibility, and employee self-reporting. Start with what you can implement today, then add tooling as the programme matures.

Firewall and DNS log review (lightweight, free). Check outbound connections against known AI service domains: api.openai.com, claude.ai, gemini.google.com, and equivalents. No new tooling required.
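
A sketch of what that review can look like, assuming a plain-text log with one destination hostname per line — real firewall and DNS log formats vary, so the parsing is the part to adapt:

```python
# Count outbound connections to known AI service domains in a DNS/firewall log.
# Extend AI_DOMAINS with the equivalents relevant to your environment.

AI_DOMAINS = ("api.openai.com", "claude.ai", "gemini.google.com")

def flag_ai_traffic(log_path: str) -> dict[str, int]:
    hits: dict[str, int] = {}
    with open(log_path) as log:
        for line in log:
            host = line.strip().lower()
            for domain in AI_DOMAINS:
                if host == domain or host.endswith("." + domain):
                    hits[domain] = hits.get(domain, 0) + 1
    return hits

print(flag_ai_traffic("outbound_hosts.log"))  # hypothetical log export
```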

SaaS spend and OAuth audit (lightweight, accessible). Review expense reports for AI vendor line items and OAuth grant lists for services IT did not provision. These two reviews surface most AI tool use quickly.

DLP and DDR monitoring (intermediate, requires tooling). Data Detection and Response tools — Cyberhaven is the category leader — monitor what data is being sent to AI endpoints. Where firewall logs show a connection to api.openai.com, DDR tells you whether someone pasted production API keys into the prompt. Harmonic’s 22M prompt analysis found 16.9% of all enterprise AI data exposures happen through personal free-tier accounts.

CASB deployment (comprehensive, requires security team capacity). Cloud Access Security Brokers — Netskope, Microsoft Defender for Cloud Apps, Zscaler — provide full visibility and policy enforcement. Pair with DDR, as CASB often misses AI features embedded within approved SaaS platforms.

Communicate the programme before deployment — cooperation surfaces more risk than covert monitoring. Non-engineering roles — product, legal, finance, and HR — are often the highest-risk AI users and the most underdetected.

For the technical infrastructure details, see our guide to AI monitoring infrastructure.


Writing an Acceptable Use Framework That People Will Actually Follow

An Acceptable Use Framework that employees have never read is documentation, not a behaviour-change mechanism.

Most AUFs are built by lawyers for compliance rather than by operators for adoption. When the policy says “no AI tools without committee approval” and the committee meets monthly, the policy does not prevent shadow AI. It causes it.

Here is what an effective AUF for a mid-market company needs:

Permitted tools list by role category. Name the approved tools for engineering, product, marketing, and operations. Specific tools, specific use cases, specific role groups — not a blanket prohibition.

Data handling rules. What data may be entered into AI tools, and what may not — customer PII, proprietary IP, authentication credentials, regulated data. Researching case law is low risk; uploading client contracts is high risk. The AUF needs to communicate that distinction.

Output review requirements. Where AI outputs require human review — customer-facing content, financial calculations, legal documents — say so. Where they do not — internal drafts, code going through standard review — say that too.

Review cadence and named owner. Review the AUF every six months — the tool landscape changes too fast for annual cycles. One named owner, not a committee. Committees create diffused accountability and slower decisions — both of which generate shadow AI.

This connects to the broader accountability structures in our article on AI governance accountability.

Acceptable Use Framework — Minimum Viable Outline:

  1. Permitted tools list by role category — specific tools, specific use cases, specific role groups
  2. Data handling rules — what may and may not be entered into AI tools
  3. Output review requirements — where human review is mandatory, and where it is not
  4. Incident reporting procedure — what to report, and to whom
  5. Review cadence and named owner — six-monthly review, one accountable person


Role-Based AI Access Controls: A Practical Model for Mid-Market Companies

Role-based AI access controls assign permissions based on job function, seniority, and training completion. The goal is to avoid blanket permissiveness — which guarantees data exposure — and blanket restriction, which guarantees shadow AI.

IBM’s approach is the most concrete implementation model available. Compliance checks are embedded in the workflow; provisioning is automated, not a separate gate. Fast access for trained employees, no manual queue, no bureaucratic drag.

97% of organisations that experienced AI-related security incidents lacked proper AI access controls, per IBM’s 2025 Cost of Data Breach Report. Here is a three-tier model for a 50–500 person company:

Tier 1 — Basic use (all staff). Sanctioned productivity tools: Microsoft Copilot, ChatGPT Enterprise, Claude for Teams. Access requirement: a 15-minute module on data handling rules.

Tier 2 — Advanced use (engineers, product teams). Code generation tools: GitHub Copilot Business, Cursor. Access requirement: Tier 1 plus a module on prompt security and credential handling. Coding tools show a 14x concentration of credential exposure risk.

Tier 3 — Build and deploy (senior engineers, tech leads). Building AI-integrated features and deploying agents. Access requirement: Tier 2 plus named owner sign-off on the specific use case.

The enforcement layer is SSO. Employees without completed training simply do not have credentials for the relevant tools. No manual queues, no governance committee, no additional tooling.
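
What the gate looks like in logic, sketched with hypothetical module and group names — in practice this lives in your identity provider's group rules rather than in code:

```python
# Tier gating: training completion determines which SSO groups (and therefore
# which AI tools) an employee is provisioned into. Tier 3 additionally
# requires named owner sign-off, so it is deliberately absent here.

TIER_REQUIREMENTS = {
    "tier1-ai-basic": {"data-handling-module"},
    "tier2-ai-advanced": {"data-handling-module", "prompt-security-module"},
}

def entitled_groups(completed_modules: set[str]) -> list[str]:
    return [group for group, required in TIER_REQUIREMENTS.items()
            if required <= completed_modules]

print(entitled_groups({"data-handling-module"}))
# ['tier1-ai-basic'] — Tier 2 tools stay unprovisioned until the second module is complete
```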

For guidance on measuring how well this enablement programme is working, see measuring AI enablement programme effectiveness.


Risk-Tiered Governance: Where to Apply Heavy Controls and Where to Stay Out of the Way

Risk-tiered governance applies controls proportional to the actual risk of each AI use case. It is the mechanism that lets you say “yes” quickly to most AI use while reserving effort for cases that actually matter.

Applying the same approval process to “an engineer using Copilot to complete a function” and “deploying an AI agent that processes customer financial data” creates overhead without proportional risk reduction. Design tools represent 9.5% of total AI usage but only 0.06% of sensitive data exposures. Coding tools show a 14x concentration of credential exposure. Uniform governance misallocates effort — risk-tiered governance puts it where the data says it belongs.

Low risk / Light governance. Drafting, summarisation, research, code completion going through standard review. No approval gate beyond Tier 1 training. Minimal friction — this is the majority of AI use.

Medium risk / Standard governance. AI-generated content for external publication, AI-assisted customer communications, AI tools handling internal structured data. Output review against the AUF; AI tool usage logging; periodic spot audits.

High risk / Strong governance. Customer-facing AI agents, AI processing PII or regulated data, AI integrated into financial or legal workflows. Use case approval, named owner, pre-production testing, continuous monitoring, and human-in-the-loop review before outputs reach customers.

Map each use case to its risk tier in the permitted tools list. Employees know immediately what applies — less ambiguity, fewer exception requests, and less of the friction that drives AI use into the shadows.


Frequently Asked Questions

What is shadow AI and how is it different from shadow IT?

Shadow AI is the use of AI tools by employees without formal IT or security approval. The key difference: AI tools can actively process and expose sensitive data in ways traditional unapproved SaaS apps do not. Shadow AI is also harder to detect — AI features are increasingly embedded inside already-approved tools like Notion and Grammarly.

Why do employees use unauthorised AI tools even when they know the risks?

Productivity. AI tools save employees 40–60 minutes per day. When there is no enterprise licence or the approved tools have frustrating limitations, personal accounts fill the gap. Employees are making rational decisions within the constraints their organisation has created.

How much does shadow AI cost in a data breach?

IBM’s Cost of Data Breach 2025 report found shadow AI adds $670,000 to the average breach cost — a 16% increase. Shadow AI breaches disproportionately affect customer PII: 65% versus a 53% global average.

How do I know if my team is using AI tools I don’t know about?

Three lightweight signals to start: firewall and DNS log review for outbound connections to known AI endpoints; OAuth grant lists via your SSO admin panel; and SaaS expense line items for AI accounts IT did not provision. Then run an employee AI usage survey — self-reporting surfaces tools in non-engineering roles that network monitoring misses.

What tools can I use to detect shadow AI without a dedicated security team?

Start free: firewall log analysis, an employee survey, and an OAuth grant review. For intermediate coverage: DLP/DDR tools like Cyberhaven for continuous monitoring. For comprehensive coverage: CASB platforms — Netskope, Microsoft Defender for Cloud Apps, Zscaler.

What should an AI acceptable use policy actually say?

At minimum: the permitted tools list by role, data handling rules, output review requirements for high-risk use cases, the incident reporting procedure, and a review schedule. Avoid blanket prohibitions with committee approval requirements. One page, plain language, specific approved tools by name.

How does AI governance differ for regulated industries versus growth-stage SaaS companies?

Regulated industries face mandatory compliance obligations: GDPR, HIPAA, FCA guidance, EU AI Act requirements for high-risk AI systems. Growth-stage SaaS companies have more discretion but the same shadow AI risks. The minimum viable governance approach here is most directly applicable to growth-stage companies. The risk-tiered approach is valid for both.

What is the fastest way to get AI governance in place without creating bureaucracy?

The minimum viable governance stack: an AI Asset Inventory built from an employee survey and OAuth audit; a one-page Acceptable Use Framework with a named permitted tools list; and a three-tier role-based access model provisioned via SSO. Two to four weeks, using existing HR and SSO infrastructure. No new tooling, no committee, no governance hire.

Who should own AI governance in a company without a dedicated governance function?

One person — not a committee, not a vendor. The technical lead owns the policy and permitted tools list; the engineering lead owns Tier 3 approvals; IT or DevOps owns SSO provisioning and detection tooling. Creating an AI governance committee as a workaround for named ownership is a mistake — committees create diffused accountability and slower decisions.

How do I design an AI enablement programme that makes sanctioned tools more attractive than shadow alternatives?

Apply product thinking to internal tooling. The provisioning process should be faster than signing up for a personal account. The approved tool should cover the use cases employees actually need. Training should be short, practical, and immediately applicable. If employees are not using sanctioned tools, the programme needs redesign — not tighter enforcement.

Can AI governance be automated, or does it always require human oversight?

Detection and enforcement can be substantially automated: DLP/DDR tools monitor data flows; CASB platforms enforce tool access policies; SSO provisioning gates access based on training completion. What cannot be automated: use case risk classification, high-risk tier approvals, incident response, and policy updates. These require named human accountability.

What metrics should I track to know if our shadow AI programme is working?

Four leading indicators: (1) Approved tool adoption rate — are employees using sanctioned tools? (2) Shadow AI detection rate — are new unauthorised tools appearing less frequently? (3) Policy exception request volume — a rising trend signals the permitted list is too short or the policy too restrictive. (4) Training completion rate — is certification keeping pace with headcount growth? Avoid using policy existence as a success metric — that measures governance theatre, not governance effectiveness.


For a complete overview of the enterprise AI governance gap — from operating model design to accountability structures to regulatory obligations — see What AI Governance Actually Requires and Why Most Policies Fall Short.

Runtime AI Governance and Why Observability Is Not Optional

Here is a number worth sitting with: 82% of executives in Gravitee’s 2026 State of AI Agent Security report feel confident their existing policies protect against unauthorised agent actions. Only 47.1% of those same organisations actively monitor their AI agents at runtime. Only 14.4% report that all their agents went live with full security or IT approval.

That is the Confidence Paradox. Your governance policy describes what your AI agents are supposed to do. Runtime observability tells you what they actually do. Most organisations have written the policy. Most have not built the infrastructure.

As AI agents move from pilot projects into operational infrastructure, the absence of runtime observability is not a tooling maturity problem. It is a governance problem.

This article covers the technical layer of the broader AI governance gap that observability helps close: behavioral drift, what AI observability infrastructure actually includes, how observability-driven sandboxing works, and how to design human oversight into an automated system at scale.


Why Is Runtime Governance Different From Policy Governance?

A governance policy is written before deployment. It describes intended behaviour, sets out permitted actions, and establishes accountability. It is a statement of intent — and it has no way of detecting when agent behaviour diverges from that intent at execution time.

Here is the problem. AI agents fail differently from traditional software. A broken API call throws an exception. An agent reasoning failure produces confident, plausible output that is completely wrong — no error, no alert, no log entry unless you have built the infrastructure to generate one. The most expensive failures are silent errors amplified through multi-agent pipelines before anyone notices.

Traditional software governance assumes auditing the code is sufficient. That assumption does not transfer to agentic systems. Agents make decisions at runtime that were never explicitly coded. The Gravitee Confidence Paradox makes this concrete: of organisations in active testing or production, more than half have agents running without any security oversight or logging. 88% confirmed or suspected an AI agent security incident in the last year.

Static AI policy documents describe intended behaviour. Continuous AI governance monitors and enforces actual behaviour in production. Both are required; neither replaces the other. If you have a CI/CD pipeline but no observability for your AI agents, you have automated the build but not the governance. Building enterprise AI governance means addressing both layers.


What Is Behavioral Drift and Why Does It Make AI Harder to Govern Than Traditional Software?

Behavioral drift is the progressive degradation of an AI agent’s decision patterns, tool usage, reasoning pathways, and inter-agent coordination over time in production — without any code change, parameter update, or explicit human action.

This is different from model drift, which is about training data distribution shift. Behavioral drift occurs within a deployed, unchanged model. The weights have not changed. The behaviour in production has.

Research on multi-agent LLM systems identified three distinct manifestations. Semantic Drift is progressive deviation from original intent as context accumulates. Coordination Drift is breakdown in consensus mechanisms over extended interaction sequences. Behavioral Drift in the narrow sense is the emergence of unintended shortcuts.

The numbers are striking. Detectable drift emerged after a median of 73 interactions. Task success rates dropped from 87.3% to 50.6% — a 42% degradation. Human intervention requirements increased 3.2x. These are simulation results, but the directional signal is clear: drift is early-onset and its governance costs compound.

Model version changes are a primary but underacknowledged trigger. Every model upgrade or provider switch should be treated as a re-baselining event — re-establish the baseline and monitor closely in the first production window.

The Agent Stability Index (ASI) is the measurement framework for this. It is a 12-dimensional composite tracking Response Consistency, Tool Usage Patterns, Inter-Agent Coordination, and Behavioral Boundaries over rolling 50-interaction windows. Drift is flagged when ASI drops below 0.75 for three consecutive windows. The rate of decline increases nearly 2.5x between the 0–100 interaction window and the 300–400 window — drift accelerates, making late-stage governance harder.
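
The flagging rule is mechanical once per-window scores exist. A minimal sketch, assuming ASI values are computed upstream — the 0.75 threshold and three-consecutive-windows rule come from the framework; the sample scores are illustrative:

```python
# Flag drift when ASI stays below 0.75 for three consecutive
# rolling 50-interaction windows.

ASI_THRESHOLD = 0.75
CONSECUTIVE_WINDOWS = 3

def drift_flagged(window_scores: list[float]) -> bool:
    run = 0
    for score in window_scores:
        run = run + 1 if score < ASI_THRESHOLD else 0
        if run >= CONSECUTIVE_WINDOWS:
            return True
    return False

print(drift_flagged([0.88, 0.81, 0.74, 0.72, 0.69]))  # True — three sub-threshold windows
```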

Behavioral drift is undetectable without continuous monitoring. Code review, deployment testing, and periodic audits catch nothing here.


What Does AI Observability Infrastructure Actually Include?

That is the gap AI observability infrastructure is designed to close.

AI observability is not the same as APM. APM traces API latency, error rates, and uptime. AI observability traces reasoning chains, decision steps, tool call sequences, inter-agent messages, and policy decisions. As LangChain’s 2026 survey puts it: “In software, the code documents the app. In AI, the traces do.”

There are five components of production-grade AI observability infrastructure.

  1. Execution Tracing: recording each agent execution step as structured OpenTelemetry spans — think of traces as the call stack for an AI system.
  2. Evaluation Pipelines: automated continuous assessment of outputs against quality, safety, and behavioral criteria. Only 37.3% of organisations currently run online evaluations of live agents — this is the core continuous governance capability most haven’t built.
  3. Behavioral Baseline Establishment: recording the first N production interactions as the reference for expected behaviour. Without a baseline, drift is undetectable.
  4. Alerting: notification systems triggered when drift signals, policy violations, or anomalous tool use exceed thresholds. This converts passive recording into active governance.
  5. Visualisation: dashboards and trace explorers that make agent execution inspectable without raw log access.
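
For the first component, here is a minimal execution-tracing sketch using the OpenTelemetry Python SDK — the span names, attributes, and agent/tool identifiers are illustrative conventions, not a mandated schema:

```python
# Execution tracing: each agent step recorded as a structured OpenTelemetry span.
# Requires: pip install opentelemetry-sdk

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("agent.observability")

with tracer.start_as_current_span("agent.run") as run:
    run.set_attribute("agent.id", "refund-assistant")    # hypothetical agent
    with tracer.start_as_current_span("agent.tool_call") as call:
        call.set_attribute("tool.name", "lookup_order")  # hypothetical tool
        call.set_attribute("policy.allowed", True)       # the policy decision, on the span
```

In production the console exporter would be swapped for an OTLP exporter pointed at your trace backend; the span structure is what matters.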

Arize Phoenix is the recommended open-source entry point — more detail in the tooling section below.

The operational cycle runs like this: trace collection → behavioral analysis → drift detection → alert generation → intervention → re-baselining. This is what governance execution requirements look like in practice.

For organisations with scale or compliance requirements, the commercial platform tier includes Arize AX (enterprise, production-scale evaluation and compliance reporting), Fiddler AI (lifecycle focus), DataRobot (unified AI development and governance platform), and Braintrust (evaluation-first, 25+ built-in scorers).


What Is Observability-Driven Sandboxing and How Does It Work?

Monitoring tells you what happened. Observability-driven sandboxing prevents it from happening. The sandbox sits between inference and side effects: the agent plans actions, but execution is gated by explicit policy checks.

Each tool invocation is treated as a capability request. Here is the mechanism:

  1. The agent plans an action — “write to this file path” or “call this external host”
  2. The request is intercepted and evaluated against Policy-as-Code definitions
  3. The allow/deny decision is emitted as an OpenTelemetry span — traceable and auditable
  4. If denied, the agent receives a policy-violation signal recorded in the trace

Three policy classes cover the primary agent execution risk surface: Workspace Enforcement governs which file paths the agent may touch, Network Allowlisting governs which external hosts it may call, and Write Controls define which operations require explicit approval.

Policy-as-Code is the pattern engineers familiar with infrastructure-as-code will recognise immediately. Governance rules defined in code are version-controllable, auditable, and reviewable in a pull request.
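
A minimal sketch of such a policy gate, covering the three policy classes with hypothetical rules and tool names — a production version would emit each decision as an OpenTelemetry span, as described above:

```python
# Policy-as-Code sketch: evaluate each tool invocation against versioned rules.

POLICY = {
    "workspace_paths": ("/workspace/",),           # Workspace Enforcement
    "allowed_hosts": ("api.internal.example",),    # Network Allowlisting
    "write_needs_approval": ("delete_records",),   # Write Controls
}

def evaluate(action: str, target: str) -> str:
    if action == "write_file" and not target.startswith(POLICY["workspace_paths"]):
        return "deny: path outside sandboxed workspace"
    if action == "http_request" and target not in POLICY["allowed_hosts"]:
        return "deny: host not on the allowlist"
    if action in POLICY["write_needs_approval"]:
        return "escalate: human approval required"
    return "allow"

print(evaluate("write_file", "/etc/passwd"))       # deny: path outside sandboxed workspace
print(evaluate("http_request", "api.openai.com"))  # deny: host not on the allowlist
```

Because the rules are plain code, they can be reviewed in a pull request and versioned alongside the agent itself.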

For more on technical infrastructure for detecting unauthorised tool use, see the companion guide on detecting shadow AI and creating sanctioned pathways.


When Should AI Governance Be Automated and When Does It Require Human Review?

Sandboxing handles the preventive layer — but some decisions need a human in the loop, not just a policy check. The design question is which actions get which treatment.

The scale argument against defaulting to human approval everywhere is now empirical. McKinsey operates 25,000 AI agents for 45,000 employees. NVIDIA’s Jensen Huang projects roughly 100 AI agents per employee. At that density, human approval at every decision point is operationally impossible.

Here is how to think about the design decision.

Human-in-the-Loop (HITL) pauses agent execution for human approval. Use it for irreversible, high-liability, or external-facing actions — financial transactions, external communications on behalf of the organisation, sensitive customer data access.

Human-on-the-Loop (HOTL) lets the agent continue while humans monitor via dashboards and receive drift alerts. Use it for reversible, lower-risk actions within a sandboxed scope — file operations, internal data retrieval, draft generation with downstream human review.

HOTL is the scalable supervision model. Humans monitor running agents, receive ASI-driven alerts, and intervene when signals warrant — governance at scale without blocking throughput.

Implementation options for HITL checkpoints: LangGraph (built-in HITL support), CrewAI, HumanLayer (purpose-built approval workflows), Permit.io (fine-grained access control).

Stop Authority — the principle that specific people have explicit authority to halt AI systems — needs technical infrastructure to back it up. That infrastructure is a HITL checkpoint backed by observability dashboards. For more on implementing stop authority through observability infrastructure, see the guide on assigning accountability for enterprise AI.


What Does the AI Observability Tools Market Look Like in 2026?

The market has two tiers: open-source and free-tier tooling for mid-market companies without dedicated MLOps teams, and enterprise commercial platforms for organisations with scale or compliance requirements.

Open-source entry point — Arize Phoenix: zero cost, no budget approval required. It integrates with NVIDIA NeMo Agent Toolkit, LangGraph, and CrewAI. DeepLearning.ai’s course uses Phoenix as the standard observability layer — that is an enterprise credibility signal worth noting. A 50-person company can implement foundational observability — tracing, baseline, alerting — without specialised staffing.

Commercial platform tier: Arize AX for production-scale evaluation and compliance reporting, Fiddler AI for lifecycle coverage, DataRobot for a unified development and governance platform, and Braintrust for evaluation-first workflows with 25+ built-in scorers.

One thing worth being clear on: general APM tools cannot do this job. Datadog, New Relic, and Dynatrace monitor AI systems at the infrastructure layer — latency, throughput, error rates. They cannot trace agent reasoning chains or detect behavioral drift. That category distinction matters when you are doing procurement.

If you are a team without a dedicated MLOps function, here is your evaluation checklist:

  1. Agent-level tracing (reasoning chains, tool call sequences)
  2. OpenTelemetry support — portability and auditability
  3. Evaluation pipeline support — automated assessment, not just logging
  4. Behavioral baseline and drift detection
  5. HITL annotation support
  6. OWASP Agentic Security alignment

As agent density increases, observability infrastructure becomes the de facto governance layer. The EU AI Act’s requirements for high-risk AI systems — documentation, logging, human oversight — are directly addressed by execution traces, continuous monitoring, and HITL checkpoints. The organisations investing in this now are building the foundation that compliance requirements will eventually mandate for everyone. For a complete view of the broader AI governance gap that observability helps close — from shadow AI through to measurement — see the series overview.

For a guide to the governance metrics that observability infrastructure enables, see the companion article on measuring whether AI governance is working beyond usage counts.


FAQ

What is the difference between AI observability and traditional application monitoring?

APM traces infrastructure metrics: latency, error rates, resource utilisation. AI observability traces what agents actually do — reasoning chains, tool selections, inter-agent interactions, and policy conformance. APM monitors performance; AI observability monitors behaviour.

What causes behavioral drift in AI agents?

Behavioral drift occurs when an agent’s decision patterns deviate from baseline without any code change. Primary causes: context accumulation (conversation histories shift response patterns), model version changes (upgrading the model can alter behaviour even when configuration is unchanged), and multi-agent coordination degradation.

What is continuous AI governance and how is it different from a static AI policy?

A static AI policy describes intended behaviour; it has no mechanism for enforcing it during execution. Continuous AI governance runs monitoring, evaluation, drift detection, and enforcement infrastructure permanently in production. Governance as ongoing infrastructure, not a compliance checkbox.

How does observability-driven sandboxing prevent unauthorised agent actions?

Observability-driven sandboxing intercepts agent tool calls before execution and enforces an allow/deny decision before any side effect occurs. The enforcement decision is emitted as an OpenTelemetry trace span — making it auditable. Denied actions generate a policy-violation signal rather than a silent failure.

When should I use human-in-the-loop versus human-on-the-loop governance?

HITL pauses execution for human approval — use it for irreversible, high-liability, or external-facing actions. HOTL allows agents to continue while humans monitor and can intervene — use it for reversible, lower-risk actions within a sandboxed scope. Most production architectures use HOTL as the default and HITL at specifically designated high-risk checkpoints.

What is the Agent Stability Index and how is it used?

The ASI is a 12-dimensional composite metric tracking Response Consistency, Tool Usage Patterns, Inter-Agent Coordination, and Operational Efficiency over rolling 50-interaction windows. Drift is flagged when ASI drops below 0.75 for three consecutive windows.

Can small companies implement AI observability without a dedicated MLOps team?

Yes. Arize Phoenix is open-source and free. It integrates with LangGraph and CrewAI and provides tracing, evaluation pipelines, and drift visualisation out of the box — no specialised staffing required.

What does the EU AI Act require in terms of runtime AI governance?

For high-risk AI systems, the EU AI Act requires technical documentation, logging, and human oversight mechanisms. Runtime observability infrastructure addresses the logging and oversight requirements directly — execution traces provide the documentation artefacts, and HITL checkpoints implement the required human oversight.

What happens to behavioral drift when you update your AI model or switch providers?

Model version changes are a primary drift trigger. Agent behaviour can shift substantially even when code and configuration are unchanged — the underlying model’s response patterns have changed. Treat every model version change as a re-baselining event.

What is Policy-as-Code in the context of AI governance?

Policy-as-Code defines sandbox rules as executable code — making enforcement deterministic, version-controllable, and auditable. The pattern will be familiar to anyone who has worked with infrastructure-as-code. In AI governance it typically covers: Workspace Enforcement (file path access), Network Allowlisting (approved external hosts), and Write Controls (operations requiring approval).

What are the OWASP Agentic Security top risks and how does observability address them?

The OWASP Agentic Security top-10 includes prompt injection, unauthorised tool invocation, data exfiltration, and uncontrolled autonomy. Sandboxing addresses tool invocation and exfiltration by intercepting tool calls before execution. Execution tracing and evaluation pipelines cover prompt injection and autonomy violations.

How do I know if our AI governance programme is actually working?

Check whether your observability infrastructure is generating actionable data. Are you collecting execution traces? Do you have a behavioral baseline? Are drift alerts firing? Are policy violations intercepted by sandboxing rather than discovered post-incident? If the answer to any of these is “no,” the governance programme exists on paper but not in infrastructure.

Assigning Accountability for Enterprise AI Before Something Goes Wrong

Picture this. It’s a Tuesday morning and your AI-powered refund approval system has been running overnight. By the time anyone notices, it has auto-approved several hundred refund requests well outside policy bounds — some legitimate, many not. The finance team calls. The CEO calls. Someone asks: “Who owns this?”

Silence.

That silence is not a technical failure. It is a governance failure, and it lands on you. Not because you wrote the code, but because no one could point to a named person who was accountable for that system’s behaviour in production.

EY’s February 2026 Technology Pulse Poll of 500 US technology executives found that 52% of department-level AI initiatives are operating without formal approval or oversight. For 42%, stopping a production AI system requires board or CEO intervention.

This is a governance failure with direct personal career and legal consequences. The EU AI Act assigns accountability explicitly to deployers — not to the vendors who built the model. When something goes wrong, the accountability question has a legal dimension that goes well beyond your job description.

Here is the practical framework to close that gap: the Enterprise AI Ownership Stack, the Stop Authority concept, a Decision Rights Matrix you can build in a day, and a Minimum Viable Governance package you can implement without building enterprise bureaucracy.

This article is the third in a cluster on why AI governance execution matters — building on the accountability problem from ART001 and the operating model structure from ART002, and leading into the technical enforcement layer in ART004 and the regulatory stakes in ART007.

Who is personally accountable when an AI system causes harm?

Accountability in enterprise AI means one named individual answers for outcomes — not a team, a platform, or a vendor. The Business Owner is that individual for each high-impact AI use case. Accountability cannot be transferred to the AI vendor: under the EU AI Act, the deployer retains full legal responsibility regardless of contract terms.

Most organisations blur the line between accountability and responsibility. They are not the same thing. Accountability is outcome ownership — one person answers for what the AI system did. Responsibility is execution ownership — many people share the work. As Infosys frames it: accountability is who answers for outcomes; responsibility is who does the work; decision rights is who can approve, change, pause, or stop AI. All three are distinct and must be assigned separately.

The governance trigger moment is the pilot-to-production threshold. When AI advises — drafts, predicts, recommends — a human still decides. When AI acts — approves a refund, changes a credit limit, triggers a workflow — accountability must crystallise into a named person. If it does not, you have deployed a decision-making system with no owner.

The vendor accountability trap is where most organisations come unstuck. The logic is intuitive but wrong: “We’re using Vendor X’s model, so Vendor X is accountable.” The EU AI Act closes this loop explicitly. Deployment is ownership. Vendor contracts can define responsibilities but cannot transfer accountability. A contract clause that attempts to do so will not protect you.

What is the Enterprise AI Ownership Stack — and which roles matter first?

The Enterprise AI Ownership Stack distributes accountability, responsibility, and decision rights across nine named roles rather than assigning everything to one team. The three roles to fill first are the Business Owner (accountable for outcomes), AI Product Owner (owns the use case), and Platform Owner (holds stop authority).

The Infosys Enterprise AI Ownership Framework defines the full stack:

Business Owner — Outcomes, risk acceptance, escalation authority. The single person who answers when things go wrong.

AI Product Owner — Acceptance criteria, human-in-the-loop design, escalation rules. Bridges business accountability and technical delivery.

Platform Owner — Model gateways, logging, monitoring, guardrails. Holds operational stop authority; enforces safety at runtime.

Model Owner — Model performance, robustness, drift response.

Data Owner / Data Steward — Data definitions, access approvals, quality SLAs. Most AI failures are disguised as data failures.

AI Risk Owner — Risk assessments, control testing, bias and harm checking.

AI Security Owner — Threat modelling, prompt injection risks, access patterns.

Legal / Compliance / Privacy Owner — Regulatory mapping, privacy and consent, audit readiness.

AI Ops / SRE Owner — Production reliability, run books, on-call, rollback procedures. If the AI fails at 2:00 AM, you need a plan, not a research paper.

Every enterprise AI system in production requires dual ownership — a Business Owner accountable for outcomes, and a System Owner accountable for operability. Business-only ownership creates chaos when things break. IT-only ownership creates irrelevance when risk decisions are being made.

The AI Centre of Excellence anti-pattern is the single most common governance mistake here. A CoE cannot formally accept business risk and becomes a bottleneck as deployments multiply. The CoE defines standards — business units own use cases and outcomes. Governance without ownership becomes documentation; ownership without governance becomes risk.

Fill the Business Owner, AI Product Owner, and Platform Owner roles first. Defer the rest until you have the governance maturity to support them without creating bureaucracy.
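To make that starting point concrete, here is a minimal sketch of the ownership record as structured data — assuming one record per high-impact use case. The field names, validation rule, and people are illustrative, not part of the Infosys framework itself.

```python
# Minimal sketch: one ownership record per high-impact AI use case.
# Names, fields, and the validation rule are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class AIUseCaseOwnership:
    use_case: str
    business_owner: str    # accountable for outcomes and risk acceptance
    ai_product_owner: str  # owns acceptance criteria and escalation rules
    platform_owner: str    # holds operational stop authority

    def validate(self) -> None:
        """Accountability must name a person, not a team or department."""
        for role in ("business_owner", "ai_product_owner", "platform_owner"):
            holder = getattr(self, role)
            if not holder or "team" in holder.lower():
                raise ValueError(f"{role} must be a named individual, got {holder!r}")

record = AIUseCaseOwnership(
    use_case="automated-refund-approval",
    business_owner="Jane Citizen, VP Customer Operations",
    ai_product_owner="Sam Lee",
    platform_owner="Priya Nair",
)
record.validate()  # run before production launch, not after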

Stop authority: the governance test most organisations fail

Stop authority is the pre-assigned right of a named individual to pause, halt, or roll back an AI system in production without requiring board or CEO approval. If you cannot answer “who can stop this?” in ten seconds, you do not own it — you are experimenting with it. EY (2026): only 50% of AI governance leaders have independent halt authority; 42% require board or CEO intervention.

Think about what that data means in practice. A production AI system is causing harm. The person who notices it has no authority to stop it. They must find the right executive — who may be in a meeting or a different time zone. While approval is sought, the system keeps running. When halting a production AI system requires a board meeting, your governance structure cannot protect against real-time harm.

Who holds it: the Platform Owner or AI Ops function closest to the runtime — pre-delegated, documented, and tested before an incident occurs. The CTO should not be the named stop-authority holder for every AI system — that creates a bottleneck. Pre-delegation is the governance act.

Run this test now: “Who can stop our most important AI system, without calling me?” A pause before the answer is a governance gap.
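What pre-delegation can look like in practice — a minimal sketch, assuming your model calls already route through a gateway you control. The flag store, names, and the `call_model` stand-in are all hypothetical; the point is that a halt is a flag flip by a pre-authorised person, not an approval chain.

```python
# Minimal sketch of pre-built stop authority at the gateway layer.
# The flag store and names are hypothetical.
ALLOWED_TO_HALT = {"priya.nair"}        # Platform Owner / AI Ops, pre-delegated
KILL_SWITCH = {"refund-agent": False}   # True = halted

def halt(system: str, requested_by: str) -> None:
    """Flip the kill switch -- no board meeting, no CEO approval."""
    if requested_by not in ALLOWED_TO_HALT:
        raise PermissionError(f"{requested_by} does not hold stop authority")
    KILL_SWITCH[system] = True

def call_model(prompt: str) -> str:
    return "model output"  # stand-in for the real model call

def route(system: str, prompt: str) -> str:
    """The gateway checks the flag on every call, so a halt is immediate."""
    if KILL_SWITCH.get(system, False):
        raise RuntimeError(f"{system} is halted pending incident review")
    return call_model(prompt)
```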

How to build a Decision Rights Matrix for AI in five steps

A Decision Rights Matrix maps five governance decisions — use case approval, production launch, change approval, incident authority, and risk acceptance — to named roles with defined authority levels. It is a document, not a committee: one named decider per decision, with a clear escalation path.

The Decision Rights Matrix is a governance-specific RACI. The critical rule: one Accountable person per decision row, or the matrix does not function.

Step 1: List your current AI systems and planned use cases. Start with systems already in production, not theoretical ones.

Step 2: Define the five decisions:

Use Case Approval — Should we build and deploy this at all? The Business Owner decides; Risk/Compliance are consulted for high-impact cases.

Production Launch — Is this system ready to go live? The Business Owner gives final sign-off; Platform Owner, Model Owner, and Risk/Compliance must also sign off.

Change Approval — Can we modify prompts, models, or tools in production? The Product Owner handles minor changes; Platform Owner plus Model Owner handle major changes.

Incident Authority (Stop Authority) — Who can halt or pause this system right now? Platform Owner / AI Ops for immediate action; Business Owner for post-incident escalation.

Risk Acceptance — We know the residual risk — do we accept it? Business Owner only. This cannot be delegated to the Platform or engineering team — it’s the decision where governance most commonly fails silently.

Step 3: Name the decider and escalation path. Not a role title — a person’s name, or at minimum a role that maps to a single person in your current org chart.

Step 4: Validate with role holders that they accept the authority. A RACI where the Accountable party does not know they hold it is documentation, not governance.

Step 5: Attach the matrix to each AI system’s production launch checklist. A matrix that lives only in a governance document has no operational power. It must be a gate that production launches pass through.
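A minimal sketch of the matrix as a machine-checkable launch gate — assuming it lives as structured data (even a spreadsheet export) rather than prose. The decision keys follow the five decisions above; the people are hypothetical.

```python
# Minimal sketch: the Decision Rights Matrix as a launch gate.
# Decision keys follow the five decisions above; people are hypothetical.
DECISIONS = [
    "use_case_approval",
    "production_launch",
    "change_approval",
    "incident_authority",
    "risk_acceptance",
]

matrix = {
    "use_case_approval":  {"accountable": "Jane Citizen", "escalation": "CEO"},
    "production_launch":  {"accountable": "Jane Citizen", "escalation": "CTO"},
    "change_approval":    {"accountable": "Sam Lee",      "escalation": "Priya Nair"},
    "incident_authority": {"accountable": "Priya Nair",   "escalation": "Jane Citizen"},
    "risk_acceptance":    {"accountable": "Jane Citizen", "escalation": "CEO"},
}

def launch_gate(matrix: dict) -> None:
    """Fail the production launch if any decision row is malformed."""
    for decision in DECISIONS:
        row = matrix.get(decision)
        if row is None:
            raise ValueError(f"no decider assigned for {decision}")
        # Crude check: a comma usually means two names share accountability.
        if not row.get("accountable") or "," in row["accountable"]:
            raise ValueError(f"{decision} needs exactly one named accountable person")
        if not row.get("escalation"):
            raise ValueError(f"{decision} has no escalation path")

launch_gate(matrix)  # attach to every AI system's launch checklist
```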

Align reviews to model retraining cycles — a matrix reflecting last year’s team structure creates false confidence.

The minimum viable ownership package for a new CTO

Minimum Viable Governance (MVG) is the smallest credible accountability package you can implement immediately: one named Business Owner per high-impact AI use case, an AI Product Owner, a Platform Owner with stop authority, and a documented Decision Rights Matrix. This is enough to govern safely without enterprise bureaucracy.

The Infosys minimum viable ownership package defines six items:

1. One named Business Owner per high-impact AI use case. The person who answers when things go wrong. Not a team, not a department — one name. Must exist before production launch.

2. An AI Product Owner per use case. Owns acceptance criteria, escalation rules, human-in-the-loop design, and feedback loops.

3. A Platform Owner with stop authority. Enforces guardrails, holds operational stop authority, owns runtime infrastructure. Without it, stop authority defaults to “whoever notices the problem and can reach an executive.”

4. A lightweight AI Risk Review for each use case. Intended use, failure modes, harm potential, escalation paths — aligned to NIST AI RMF Govern principles. No enterprise risk management machinery required.

5. Explicit, documented stop authority for Ops/Platform. Written down, communicated, and tested. The Platform Owner must know they hold this authority and have the technical mechanisms pre-built.

6. A Decision Rights Matrix — even a simple spreadsheet. The five decisions, named role holders, escalation paths. Attached to each AI system in production.

What MVG does not require: a dedicated AI governance team, an AI Centre of Excellence, a formal compliance certification, months of policy work, or a separate AI ethics committee.

Apply MVG strictly to high-impact AI systems. For low-risk use cases — internal productivity tools, AI-assisted drafting — lighter-touch governance is appropriate. Not everything needs the same treatment.

To close the AI policy and execution gap, start with MVG and build from there. The enterprise AI governance structures described here provide the named owners and documented decisions that make governance real. Measuring whether your accountability structures are working — not just whether they exist — is the next discipline to build.

What do the EU AI Act and NIST AI RMF require of your accountability structure?

The EU AI Act requires deployers to assign accountability, maintain human oversight, and document governance decisions. The NIST AI RMF “Govern” function requires lifecycle accountability: named owners across the full AI lifecycle. ISO/IEC 42001 requires a management system with defined AI roles and responsibilities — implementation provides real governance value without formal certification.

Governance is how you manage AI risk internally. Compliance is demonstrating that management to regulators. You need governance first.

EU AI Act. Core deployer obligations: risk classification, pre-built human oversight mechanisms, technical documentation, named accountability chains, and incident reporting. The compliance timeline is active — bans on unacceptable-risk systems took effect February 2025. You cannot transfer accountability to the LLM vendor through contract terms. If your AI system causes harm to an EU resident, your organisation is accountable regardless of what your vendor agreement says.

NIST AI RMF. The Govern function requires named roles, documented decision rights, and policies for each lifecycle phase. In practice: documented names, documented decisions, documented escalation paths.

ISO/IEC 42001. The first certifiable AI management system standard. For most companies, the relevant question is not whether to certify but whether to implement its governance blueprint. If you already run an ISO/IEC 27001 security management system, AI governance can fold into existing audit cadences.

The six MVG items map directly: named Business Owner covers EU AI Act deployer accountability and NIST AI RMF Govern named roles; the Decision Rights Matrix covers EU AI Act documentation requirements; the Platform Owner with stop authority covers EU AI Act human oversight mechanisms.

For the detailed regulatory stakes and the full compliance timeline, ART007 covers the regulatory requirements that mandate clear accountability in depth.

Frequently asked questions

Who should be responsible for AI decisions — the CTO, the business unit, or a governance team?

The Business Owner (a named executive in the relevant business unit) is accountable for outcomes; the CTO is responsible for the governance infrastructure that makes accountability work. A governance team or AI CoE sets standards but does not own AI systems. Treating the CTO as the accountable owner of every AI use case is one of the most common governance mistakes.

What is the difference between AI governance and AI compliance?

AI governance is how you manage AI risk internally — accountability structures, decision rights, stop authority. AI compliance is demonstrating that governance to external regulators. You need governance first. The most common failure mode: investing in compliance documentation without real governance in place.

Can you outsource accountability for AI to the vendor who built it?

No. Under the EU AI Act, the deployer retains full accountability regardless of vendor contracts. Contracts can define responsibilities but cannot transfer accountability. Before deploying any vendor-supplied AI system, your own governance roles must be assigned — the vendor’s compliance documentation does not cover your obligations.

What is the pilot-to-production threshold and why does it matter for governance?

It’s the moment AI transitions from advising (drafting, predicting) to acting (approving, triggering, executing workflows). At that threshold, accountability must crystallise into named roles — Business Owner, Platform Owner, and Decision Rights Matrix must be in place before go-live, not after.

How do you actually test whether your AI governance is real or governance theatre?

Ask: “Who can stop our most important AI system right now, without calling the board?” If you cannot answer in ten seconds, you have a policy document, not governance. EY (2026): 42% of organisations require board or CEO intervention to halt a high-priority AI project — that’s governance theatre by definition.

What does the NIST AI RMF “Govern” function actually require?

Named roles across the AI lifecycle, documented decision rights, policies for each phase, and feedback loops between governance and operations. It does not require a large team — it requires documented names, documented decisions, and documented escalation paths.

Do I need ISO/IEC 42001 certification to govern AI responsibly?

No. ISO/IEC 42001 provides a governance blueprint — implementing its requirements gives you real value without formal certification. Certification matters when clients or regulators require it. The infrastructure matters more than the certificate.

What is the difference between a Business Owner and a Platform Owner in AI governance?

The Business Owner answers for what the AI does — value, risk acceptance, escalation. The Platform Owner answers for how it runs — guardrails, monitoring, stop authority. Combining them is a governance anti-pattern: stop authority becomes meaningless if the person who benefits from the AI also decides when to halt it.

Why do executives believe AI governance is in place when operational teams say it isn’t?

Executives see policy documents and assume governance. Operational teams see runtime reality and know the distance between the two. Closing the gap means moving from governance as documentation to governance as named people with documented decisions.

How does the EU AI Act affect companies outside the European Union?

Any company deploying AI systems affecting EU residents is subject to EU AI Act requirements regardless of where it’s headquartered. For Australian and Asia-Pacific SaaS companies with EU customers, the deployer obligations apply. Even without EU exposure, the governance structures required for EU compliance are simply good governance practice.

How to Build an AI Operating Model That Goes Beyond Policy Documents

Most organisations trying to get serious about AI governance have a policy document. Very few have an operating model. That distinction matters more than it sounds — it is the difference between governance on paper and governance that actually runs.

Here’s the quickest test: if your team can’t answer “who can stop this AI system right now?” in ten seconds, you don’t have an operating model. You have an experiment with a document attached to it. This guide is part of our comprehensive AI governance gap overview, which covers the full landscape from shadow AI diagnosis through to regulatory compliance. Understanding the operating model design choices is what this article is about.

What does an AI operating model actually include — and what is it not?

An AI operating model is the organisational structure that connects your AI strategy, your AI policy, and your technology stack to actual execution. It is not any one of those three things on its own.

Here’s the three-layer distinction worth getting clear on. Your AI strategy answers what and why. Your AI policy answers what is permitted. Your AI operating model answers how it actually runs — who owns it, what the approval process looks like, who has stop authority, and how investment decisions get made. According to Databricks, “Enterprise AI readiness is ultimately an operating model decision.” Strategy and policy are inputs. The operating model is the machinery.

Every operating model has five structural components: named ownership of AI systems and outcomes; approval, deployment, monitoring, and decommission processes; an AI asset inventory as the visibility foundation; a portfolio management discipline for investment decisions; and governance structures calibrated to your current maturity level.

What it is not: a team name, a set of principles, or a project plan. Alation’s framework requires three layers to function simultaneously — the knowledge layer, the process layer, and the ownership layer. Drop any one of them and you end up with governance theatre: activity that generates documentation without providing any real oversight.

McKinsey finds fewer than 10% of AI use cases make it out of pilot mode or materially influence P&L outcomes. IBM puts it plainly: successful implementation and scaling of enterprise AI is fundamentally a people and operating model challenge, not a technology challenge.

Why does operating model design matter more than model selection?

Organisations consistently over-invest in model evaluation and under-invest in ownership design. The governance failures that follow are not technology failures — they are ownership problems.

Without clear ownership, AI projects pile up. They persist past their useful life because nobody has stop authority. They duplicate effort because nobody has a portfolio view. When incidents happen, the response is ad hoc because the accountability structure doesn’t exist.

Bain’s research on enterprise AI transformation found that assigning accountability to general managers rather than IT leadership is one of the distinguishing factors in organisations achieving meaningful EBITDA gains from AI. Governance authority is most effective when it sits where business outcomes are owned.

The policy document trap is very common. 54% of IT leaders say ensuring AI solutions comply with governance regulations is a top priority for the next 12 months. A policy tells people what they should do. An operating model determines who does it, who checks it, and who stops it when it fails.

For a 50–500 person SaaS company, the gap is sharper because there is no default organisational structure for AI governance. Unlike large enterprises that inherit governance structures from regulated industries, a mid-market tech company has to build it from scratch. Nobody designs it deliberately, so it does not exist.

What is the AI asset inventory and why is it the first step?

You cannot govern AI you cannot see. The AI asset inventory is a living register of all AI systems, models, datasets, integrations, and shadow AI deployments across your organisation. It is not a one-time audit — it is the foundational visibility layer that everything else depends on.

The shadow AI discovery step is where most organisations get uncomfortable. Nearly 60% of employees use unapproved AI tools at work, feeding sensitive company information to unsanctioned products. Shadow AI incidents account for 20% of all breaches, and 27% of organisations report that more than 30% of their AI-processed data contains private information — customer records, trade secrets, financial data. Building the inventory will surface this. That discomfort is the point.

A minimum viable inventory needs seven fields: AI tool name and vendor; business unit using it; data inputs and outputs; business process affected; approval status (sanctioned / unsanctioned / under review); named owner; last review date.
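A minimal sketch of those seven fields as structured data — the field names mirror the list above; the record content is illustrative.

```python
# Minimal sketch: the seven-field inventory record. Values are illustrative.
from dataclasses import dataclass
from datetime import date
from typing import Literal

@dataclass
class AIAssetRecord:
    tool_and_vendor: str
    business_unit: str
    data_inputs_outputs: str
    process_affected: str
    approval_status: Literal["sanctioned", "unsanctioned", "under review"]
    named_owner: str
    last_review: date

inventory = [
    AIAssetRecord(
        tool_and_vendor="ChatGPT Enterprise (OpenAI)",
        business_unit="Customer Support",
        data_inputs_outputs="ticket text in; draft replies out",
        process_affected="tier-1 ticket triage",
        approval_status="sanctioned",
        named_owner="Sam Lee",
        last_review=date(2026, 1, 15),
    ),
]

# The quarterly review falls out of the data: anything stale or
# unsanctioned surfaces immediately.
overdue = [r for r in inventory if (date.today() - r.last_review).days > 90]
unsanctioned = [r for r in inventory if r.approval_status != "sanctioned"]
```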

Nikhil Gupta at ArmorCode puts it well: “The CISO who claims their organization has responsible AI governance should be able to answer three questions immediately: Where is every AI asset deployed right now? Who is accountable for each one? What governed decisions were made about AI risk in the last 30 days? If any of those questions requires a manual scramble, you do not have governance. You have intent.”

The inventory needs a named owner and a quarterly review cadence. SaaS platforms add AI features in routine product updates. Developers call model APIs that never reach procurement. Assign ownership before you publish the first version. And make approved tools easier to use than unsanctioned alternatives — governance that creates friction just drives employees underground.

What is data–AI proximity and how does it signal whether governance is real?

Data–AI proximity is a governance maturity diagnostic from Dael Williamson at Databricks: how close does ownership of data and AI sit to the CEO? The shorter the distance, the more serious the company’s AI posture.

Williamson is direct: “If data and AI are owned directly by or close to the CEO, that signals a high level of strategic importance. More often, ownership sits several layers down, and in many cases data and AI are owned by entirely different groups.” Fragmented ownership produces fragmented governance — regardless of which models you’ve deployed.

45% of IT leaders point to lack of executive sponsorship as a major blocker to AI orchestration. Executive sponsorship without structural proximity is just enthusiasm.

At 150 employees you probably can’t justify a dedicated Chief AI Officer. But you can map who owns AI decisions today and assess the distance from executive authority. If the honest answer is “nobody owns it clearly,” that is the gap to close first.

Two questions worth sitting with before moving on:

Self-assessment question 1: Who in your organisation has named accountability for AI decisions today — and what is their reporting line to the CEO?

Self-assessment question 2: Is your company’s data strategy and AI strategy owned by the same person or team — or are they separate functions with separate reporting lines?

Centralised vs. federated governance — what to build first and when to evolve?

For most 50–500 person SaaS companies, start with a centralised AI governance model. One small team, or one person in a CTO-adjacent role, owns the AI standards, tool approval process, and asset inventory. Centralised governance is simpler to execute, easier to audit, and appropriate for the volume of AI decisions at this scale.

Dataiku’s five-stage maturity taxonomy gives you a staging guide: Siloed → Centre of Excellence → Hub-and-Spoke → Centre for Acceleration → Embedded. Most mid-market companies sit at Siloed or early CoE. Hub-and-Spoke is the realistic near-term target — companies that have actually scaled AI are three times more likely to run a hub-and-spoke structure.

The trigger to move from centralised to federated is specific: when individual business units have developed their own AI capabilities, have named AI owners, and are running AI in production. Covasant’s analysis is clear on the risk of moving too early — without documented standards, decentralisation produces duplication, shadow AI, and compliance blind spots.

Three things must happen before you decentralise: accountability must be explicitly re-assigned to business units; common standards must be documented; and the central function must redefine its role from executor to enabler. For context on why this structure matters beyond the operating model layer, see the shadow AI governance framework that drives these design decisions. Once governance structure is defined, you also need accountability structures within your operating model — who owns which AI systems, and who can stop them.

Self-assessment question 3: Which business unit in your organisation has the most mature AI usage today — and does it have a named AI owner with explicit accountability?

Why does naming a CoE as “AI owner” create a governance vacuum rather than filling it?

The most common anti-pattern at this stage is naming the AI Centre of Excellence as the accountable owner of enterprise AI outcomes. It sounds sensible. It creates a governance vacuum rather than filling one.

A CoE typically has no authority over business unit decisions, no budget ownership for business outcomes, and no named accountability when an AI system produces a harmful result. You’ve delegated accountability to a team that structurally cannot hold it. As the RACI principle makes clear: exactly one person must be Accountable per activity. When two people share accountability, nobody is truly accountable.

IBM’s enterprise AI governance model is cleaner: enterprise AI is owned by the business. The CoE enables standards. The Business Owner — a named individual in the affected business unit — holds outcome accountability.

What a CoE should own: AI tool evaluation standards; shared engineering infrastructure; governance templates and training; risk escalation pathways. What it should not own: business outcomes or production AI decisions.

Self-assessment question 4: Can you name the specific individual in each business unit who is accountable for the AI systems that team uses in production?

How do you build an AI portfolio management discipline at mid-market scale?

AI portfolio management is the discipline of treating AI initiatives as a portfolio of bets with explicit invest, pause, and stop decisions — made at a regular cadence, not by default or budget exhaustion.

Databricks is direct: “They manage AI initiatives as a portfolio, not a pipeline, with discipline around where to invest, pause, or stop. Not every project succeeds. Some need to be paused. Others warrant additional investment.” Without that discipline, projects fail quietly rather than being stopped intentionally.

The minimum viable portfolio review: quarterly, 30–60 minutes, led by the CTO or AI governance owner. Each initiative gets three criteria — current status, measurable outcome versus original intent, and an explicit decision (continue / pause / stop / scale). Define those criteria before the review, not during it.
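A minimal sketch of that review as a forced-decision pass — assuming each initiative carries the three criteria above. The initiatives, statuses, and decisions here are illustrative.

```python
# Minimal sketch: the quarterly portfolio review as a forced-decision pass.
from dataclasses import dataclass
from typing import Literal, Optional

Decision = Literal["continue", "pause", "stop", "scale"]

@dataclass
class AIInitiative:
    name: str
    status: str                    # current status, one line
    outcome_vs_intent: str         # measurable outcome versus original intent
    decision: Optional[Decision] = None  # must end up explicit, never blank

portfolio = [
    AIInitiative("support-triage-bot", "in production",
                 "18% ticket deflection vs 25% target"),
    AIInitiative("contract-summariser", "pilot, month 9",
                 "no measurable outcome defined"),
]

def close_review(portfolio: list) -> None:
    """The review is not closed until every initiative has a decision."""
    undecided = [i.name for i in portfolio if i.decision is None]
    if undecided:
        raise ValueError(f"no explicit decision for: {', '.join(undecided)}")

portfolio[0].decision = "continue"
portfolio[1].decision = "pause"  # no metric after nine months: pause and define one
close_review(portfolio)
```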

Portfolio management has three dependencies: the AI asset inventory must exist; named ownership at the business-unit level must be in place; executive sponsorship must be real enough that pause and stop decisions get enacted without a new approval cycle each time. The technical layer of runtime enforcement — runtime AI governance and the observability infrastructure that makes portfolio decisions auditable — depends on the operating model having defined those owners first.

ISO/IEC 42001 requires documented objectives, performance evaluation, and continual improvement — all of which the portfolio review directly addresses.

Self-assessment question 5: When did your organisation last formally review all active AI initiatives and make an explicit investment, pause, or stop decision on each one?

If the honest answer is “never” or “we’re not sure,” that’s where to start. Not with a new policy document. And once you have reviews running, the question of measuring operating model effectiveness — whether the outcomes you intended are actually materialising — becomes the next governance milestone.

Frequently Asked Questions

What is the difference between an AI policy and an AI operating model?

An AI policy describes rules and principles — what is permitted, what is prohibited, what requires approval. An AI operating model is the organisational structure that determines who executes those rules, who enforces them, and who is accountable when they are violated. A policy tells people what to do. An operating model determines how it actually happens. Most organisations have a policy. Far fewer have a functioning operating model behind it.

What does “data–AI proximity” mean in practice?

Data–AI proximity, a concept from Dael Williamson at Databricks, describes how close ownership of data and AI sits to the CEO. When data and AI ownership is held at senior level, governance decisions get executive authority and budget. When it is fragmented across business units, governance becomes performative. Proximity is a maturity signal, not an org-chart preference.

How do I know which AI governance model is right for my company’s size?

Start with your current AI volume and business-unit maturity. If you have one or two AI tools in production and no business unit with a named AI owner, a centralised model is appropriate. When business units develop their own AI capabilities and named owners, begin planning for Hub-and-Spoke. Dataiku’s five-stage maturity taxonomy provides a practical staging guide.

Can a 300-person SaaS company run a Centre of Excellence?

Yes, but scope it carefully. A CoE at this scale is typically two to five people responsible for AI tool standards, shared infrastructure, governance templates, and risk escalation. The critical discipline: the CoE must never be named as the accountable business owner. It sets standards and enables business units to execute — it does not own outcomes.

What should an AI asset inventory actually contain?

At minimum: AI tool name and vendor, business unit using it, data inputs and outputs, business process affected, approval status (sanctioned / unsanctioned / under review), named owner, and last review date. The inventory is often the first governance artefact that surfaces shadow AI usage — tools employees are using outside IT procurement. Treat it as a living document, not a one-time audit output.

Who should own the AI asset inventory?

The AI governance function — the CTO directly at sub-100 employee companies, or a dedicated governance owner as the company scales. The inventory owner needs cross-functional visibility across IT procurement, business unit tool usage, and security monitoring data. Without cross-functional authority, the inventory will systematically miss shadow AI deployments.

How often should AI governance structures be reviewed and updated?

Operating model design should be reviewed annually, or when a significant organisational change occurs — a new business unit acquiring AI capability, a major new AI system going to production, or a governance incident. The AI asset inventory and portfolio review both require a quarterly cadence.

What is the minimum viable governance structure for a company deploying AI into production for the first time?

Five components: a completed AI asset inventory; named accountability for each production AI system (one individual, not a team); an acceptable use framework; a stop authority assignment (who can pause or roll back without escalation); and a quarterly portfolio review on the calendar.

How does ISO/IEC 42001 relate to building an AI operating model?

ISO/IEC 42001 is the international management system standard for AI. It requires documented objectives, defined responsibilities, performance evaluation, and continual improvement — all of which map directly to operating model design choices. Using it as a design reference produces more rigorous governance than building from scratch.

What happens when AI governance is not in place before a governance failure occurs?

Without a functioning operating model, governance failures follow a predictable pattern: the incident occurs, the responsible party is unclear, the resolution path is undefined, the response is ad hoc. The immediate incident cost is followed by remediation, reputational impact, and regulatory exposure — particularly under the EU AI Act for organisations with European operations.

What is the difference between AI governance and AI compliance?

AI governance is the internal operating discipline — ownership structures, accountability assignments, decision rights, and monitoring processes. AI compliance is the external requirement — adherence to regulations such as the EU AI Act, ISO/IEC 42001, or NIST AI RMF. Governance is the infrastructure that enables compliance. Attempting to satisfy compliance requirements without the governance infrastructure produces compliance theatre rather than risk reduction.

Is there a self-assessment tool for AI governance maturity?

No single standardised self-assessment exists for mid-market SaaS companies. The five diagnostic questions in this article provide a starting framework: Who has named accountability for AI decisions? Are data strategy and AI strategy owned by the same team? Which business unit has the most mature AI usage? Can you name the individual accountable for each production AI system? When did you last formally review all active AI initiatives? Answering those questions honestly surfaces the structural gaps to address first.

The requirements don’t change based on company size. What scales is the organisational infrastructure you build to meet them. For a complete overview of the shadow AI governance challenge — what’s driving the gap, how accountability structures fit in, and what regulatory frameworks require — see What AI Governance Actually Requires and Why Most Policies Fall Short.

Shadow AI and the Governance Gap Enterprises Are Not Measuring

Here is a number worth sitting with: 82% of executives feel confident their policies protect their organisation from unauthorised AI agent actions. Only 14.4% of those same organisations have full security approval for all AI agents currently deployed.

That 68-point gap is the governance story.

EY’s 2026 Technology Pulse Poll surveyed 500 US technology executives and found 52% of department-level AI initiatives operating without formal approval. Three independent surveys converge on the same finding: adoption is running well ahead of governance.

This article is the diagnosis — what shadow AI is, how the governance gap is measured, why it persists, and what it is costing right now. It is part of our complete guide to what AI governance actually requires and why most policies fall short.

For a CTO at a 50–500 person SaaS company, the shadow AI governance gap is a personal liability question.

What is shadow AI — and why is it different from the shadow IT problem you already know?

Shadow AI is the use of AI tools — large language models, automation platforms, AI-powered SaaS applications — by employees without explicit IT or security approval. It is the AI-era evolution of shadow IT, but the analogy breaks down quickly.

Shadow IT meant using Dropbox instead of the approved file share. The risks were bounded: a file in the wrong place. Shadow AI carries all of those risks plus some genuinely new ones.

When employees submit sensitive data as prompts to an external model, that model processes it, may retain it, and in some configurations trains on it. The data has been consumed by infrastructure you do not control, with no audit trail and no retrieval path. That is a categorically different kind of exposure.

Harmonic Security’s analysis of 22 million enterprise AI prompts (January–December 2025) makes this concrete: code, legal documents, and financial data comprise 74.5% of what employees expose through unsanctioned AI tools. Legal documents alone account for 35.0% of exposures — M&A materials, settlement content, litigation strategy. And 12.8% of coding tool exposures contain API keys or tokens.

Cloud access security broker (CASB) solutions cannot adequately address this. They manage access to cloud services, but they cannot assess model behaviour, training data exposure, or hallucination risk.

The employee-side framing is BYOAI — Bring Your Own AI, a term popularised by Microsoft WorkLab. 76% of businesses have active BYOAI use. This is the default state at most organisations that have not built sanctioned pathways for approved AI access. BYOAI matters as a governance design framing because it reframes the problem: this is not a failure of compliance, it is a gap between what employees need and what the organisation provides.

How large is the AI governance gap — and what does the data actually show?

Three independent surveys — a Big Four accounting firm’s executive poll, a security platform’s practitioner survey, and IBM’s cost-of-breach research — arrive at the same conclusion using different methodologies and different populations.

EY (February 2026, 500 US technology executives): 52% of department-level AI initiatives operating without formal approval. 78% of leaders say adoption is outpacing their ability to manage risk.

Gravitee (2026, 900+ practitioners): only 14.4% have full security approval for all AI agents going live. Only 47.1% of deployed agents are actively monitored.

IBM Cost of Data Breach 2025: 20% of organisations have staff using unsanctioned AI tools. 97% of AI-related breaches lacked proper access controls.

The governance gap is not simply the absence of policy. HelpNet Security’s Larridin survey found 69% of organisations report having AI risk and compliance policies — yet only 38% maintain a comprehensive inventory of AI applications actually in use. The average large enterprise operates 23 AI tools, with 45% of adoption occurring outside formal IT procurement.

50% of employees believe their organisation’s AI guidelines are “very clear,” yet 58% have not received formal training on safe AI use. That mismatch — perceived clarity without the training to back it — is where governance breaks down in practice.

Gravitee found 88% of organisations reported confirmed or suspected AI security incidents in the past year. The AI governance gap is not a theoretical risk — it is observed, ongoing harm.

Why does AI adoption keep outpacing governance even when leaders know it’s a problem?

EY Global Technology Sector Leader James Brundage coined the “velocity paradox” to name this structural condition: 85% of technology leaders prioritise speed-to-market; only 15% prioritise exhaustive pre-launch vetting.

Teams that move fast on AI deliver visible short-term gains. Governance requires slowing down in an environment where that speed is perceived as competitive advantage. So governance gets sequenced after adoption rather than alongside it.

Reco’s data shows just how far this sequencing goes: two specific tools in their dataset had median usage durations of over 400 days before formal review. After that long, you are not evaluating a tool — it is core business infrastructure.

If you do not have a dedicated compliance team, the velocity paradox lands directly on you: accountability for both delivery speed and risk outcomes with no structural support. The answer is not slower AI adoption. It is building an AI operating model designed to move at adoption speed.

What is the confidence paradox — and why are executives and frontline managers measuring different things?

The confidence paradox is Gravitee’s framing for the disconnect at the heart of enterprise AI governance: 82% of executives feel confident their existing policies protect against unauthorised AI agent actions — yet only 14.4% of organisations have full security approval for all deployed agents. These two numbers cannot both be right.

The explanation is simple. Executives measure governance by proxy: does a policy exist? Is there a review process? These are inputs. Operational leaders measure governance by outputs: did this specific deployment go through the review process? Can someone name who can stop a misbehaving system right now? Different questions, systematically different answers from the same organisation.

HelpNet Security’s Larridin survey found a 16-point confidence gap between C-suite and directors. The closer you are to execution, the less confident you are in AI visibility.

The confidence paradox compounds the velocity paradox: if executives believe governance is in place, there is no organisational pressure to invest in better governance infrastructure. The gap becomes self-concealing.

EY’s data adds another layer: only 50% of organisations report their AI governance leaders have full independent authority to halt high-priority projects that fail safety guardrails. 42% require board or CEO intervention. When accountability for enterprise AI is unclear — and when those accountable cannot act without board escalation — governance becomes ceremonial.

What are the real business risks when AI tools run without oversight?

The risks are already occurring.

EY’s 2026 survey found 45% of technology executives confirmed or suspected sensitive data leaks from employees using unauthorised generative AI tools in the prior 12 months. 39% confirmed or suspected proprietary IP leaks. These are current-year disclosures, not projections.

Reco found organisations with high shadow AI density added $670,000 to their average breach cost. The exposure pattern is not random — employees submit their most sensitive operational data to AI tools because that is where the productivity gains are biggest. And 4% of enterprise prompts in Harmonic’s dataset went to China-headquartered AI tools, creating jurisdictional data risk that most governance frameworks do not track.

The agent-specific risk profile is qualitatively different. AI agents act: they execute tasks, interact with APIs, and take actions in production systems without human approval at each step. Only 21.9% of teams treat AI agents as independent, identity-bearing entities; 45.6% still rely on shared API keys, making accountability chains impossible to audit. Gravitee documented cases of agents gaining unauthorised write access to databases.

The aggregate risk is not one large breach. It is months of accumulating small, invisible exposures. Detecting shadow AI starts with visibility, and visibility starts with knowing what tools are running.

What closes the governance gap?

Publishing a more detailed policy document does not close the governance gap. The EY and Gravitee data make this clear: organisations with policies already have the gap. What is missing is measurement and operational enforcement.

Four components are identified across the research as necessary to move from policy to practice.

Operating model clarity. Where does AI ownership sit relative to the CEO? Databricks CTO EMEA Dael Williamson found that an AI-serious organisation’s first signal is how close data and AI ownership sits to the CEO. What an AI operating model actually includes is the foundation everything else depends on.

Named accountability structures. Someone must hold the right to approve, change, pause, or stop AI in production. If you cannot name who can stop an AI system in 10 seconds, you do not own it. Accountability for AI decisions is an operational question, not an organisational chart entry.

Sanctioned pathways for employees. Shadow AI thrives where approved alternatives do not exist. Role-based AI enablement — where access, tooling, and training are calibrated to role-specific risk — creates structured access without driving behaviour underground. Detecting shadow AI and creating sanctioned pathways addresses what policy enforcement cannot.

Measurement infrastructure. Counting licences is not governance. Knowing whether AI systems are behaving as intended requires an AI asset inventory, observability tooling, and audit trails. Only 38% of organisations maintain a comprehensive inventory. Measuring whether AI governance is working closes the loop.

These four components are interdependent: operating model clarity assigns ownership; accountability structures define authority; sanctioned pathways address frontline behaviour; measurement confirms whether the other three are functioning.

This article has established the diagnosis. For the full picture of enterprise AI governance — from operating model design through measurement infrastructure — see What AI Governance Actually Requires and Why Most Policies Fall Short.

Frequently Asked Questions

What is the difference between shadow AI and shadow IT?

Shadow IT refers to unauthorised use of software or cloud services without IT approval — the classic example is personal Dropbox for work files. Risks are primarily access control and data residency.

Shadow AI introduces additional risk categories: employees submit sensitive data as prompts to external model infrastructure; the AI processes it, may retain it, and can act autonomously in agentic configurations. The harm mechanism is categorically different. CASB tools, developed to handle shadow IT, cannot assess model behaviour, training data exposure, or hallucination risk.

Is 52% of AI projects running without oversight really that common?

Yes — and the consistency across multiple independent data sources is more credible than any single statistic. EY’s 2026 poll found 52% of department-level AI initiatives operating without formal approval. Gravitee’s 2026 report found only 14.4% have full security approval for all AI agents. IBM’s 2025 report found 20% of organisations have staff using unsanctioned AI tools. Three methodologically different surveys, same finding.

Why do executives think AI governance is in place when operational reality differs?

Executives measure governance by proxy: does a policy exist? Is there a review process? Operational leaders measure governance by practice: did this specific deployment go through the review process? Can someone name who can stop a misbehaving AI system?

Different questions, systematically different answers. The Gravitee confidence paradox — 82% executive confidence versus 14.4% actual approval rate — is the clearest quantification.

What types of sensitive data are most commonly exposed through shadow AI?

Harmonic Security’s analysis of 22 million enterprise prompts (January–December 2025) shows code, legal documents, and financial data comprise 74.5% of what employees expose through unsanctioned AI tools. Legal documents are the largest category at 35.0%, covering M&A materials, settlement content, and litigation strategy. 12.8% of coding tool exposures contain API keys or tokens. 4% of prompts went to China-headquartered AI tools.

What does the velocity paradox mean for a company trying to move fast on AI?

The velocity paradox — EY’s term, coined by James Brundage — names the structural tension: 85% of tech executives prioritise speed-to-market, but governance requires slowing down to assess and approve. Teams bypass governance to deliver, and shadow AI accretes.

The resolution is governance infrastructure designed to move at adoption speed — risk-tiered approval processes that apply lightweight checks to low-risk tools and heavier scrutiny to agentic systems.

What happens when employees use AI tools that IT hasn’t approved?

Short-term: invisible productivity. Medium-term: sensitive data submitted to those tools has already been processed by external model infrastructure with no audit trail and no retrieval path.

EY’s 2026 data shows 45% of tech companies confirmed or suspected sensitive data leaks from unauthorised AI tool use in the prior 12 months. Reco’s breach cost data shows a $670,000 higher average breach cost at high shadow AI density organisations. Reco also shows usage durations of 400+ days before IT identifies tools — by which point they are core business infrastructure.

Can writing an AI policy close the governance gap?

No. Many organisations with governance policies still have the governance gap. The missing capability is measurement — knowing whether what the policy says is happening actually is.

Without measurement infrastructure — AI asset inventory, monitoring, audit trails — there is no way to know whether policy is being followed. Policy documents are a necessary precondition, not a solution.

How many AI tools is the average enterprise running?

Most enterprises do not have a reliable answer — which is the governance problem. HelpNet Security’s Larridin survey found the average large enterprise operates 23 AI tools, with 45% of adoption outside formal IT procurement. Only 38% maintain a comprehensive AI application inventory.

You cannot govern what you have not enumerated. Detection approaches are addressed in the cluster article on sanctioned pathways and shadow AI detection.

What makes AI agents harder to govern than regular AI tools?

AI agents act: they execute tasks, interact with APIs, and take actions in production systems without human approval — unlike a chatbot, which generates text for a human to evaluate. An AI agent with shadow AI characteristics is an action risk, not just an exposure risk.

47.1% of deployed agents are not actively monitored; 88% of organisations reported AI security incidents in the past year. Only 21.9% of teams treat AI agents as independent, identity-bearing entities; 45.6% still rely on shared API keys, making accountability chains impossible to audit.

Should companies block shadow AI tools or create approved alternatives?

Blocking alone does not resolve shadow AI — employees route around restrictions when approved alternatives do not deliver equivalent productivity. Anton Chuvakin at Google Cloud put it plainly: “If you ban AI, you will have more shadow AI and it will be harder to control.”

The effective design is sanctioned pathways: a clear route for employees to use AI tools that meet the organisation’s security requirements, rather than forcing a choice between an approved tool that does not work and an unapproved tool that does.

The Complete Guide to SAP ECC End of Support and What Comes Next

SAP ECC — the ERP backbone for roughly 35,000 organisations worldwide — loses standard maintenance on December 31, 2027. Gartner projects nearly 17,000 will still be running ECC when the deadline hits.

If you are one of them, this guide covers the eight questions you need answered, with links to seven deep-dive articles. This page covers the “what and why.” The deep-dives handle the “how and how much.”

New to the topic? Start with what the deadline actually means.


What is SAP ECC and why is the 2027 deadline forcing action?

SAP ECC (ERP Central Component) is SAP’s legacy on-premise ERP platform — managing finance, logistics, and operations for tens of thousands of organisations worldwide. SAP will end standard maintenance for ECC EHP 6–8 on December 31, 2027. After that date, SAP stops issuing security patches, legal updates, and compliance fixes under standard contracts. The deadline is fixed and, according to SAP, will not be extended again.

Two deadlines exist, and most articles conflate them. EHP 0–5 already lost mainstream maintenance on December 31, 2025. EHP 6–8 — the majority — faces the 2027 cutoff. Knowing which applies to your system changes your urgency.

“End of mainstream maintenance” does not mean the software stops running. SAP ceases to issue patches and corrections, but priority-one support continues. The risk compounds over time rather than triggering on day one. SAP has confirmed no further extension is planned — extended maintenance and the RISE Private Edition Transition Option are the available escalation paths, both on SAP’s terms.

Full breakdown: What SAP ECC End of Support Actually Means and Why 17,000 Companies Are Not Ready.

Why do so many SAP migrations fail before they finish?

The failure rates are striking: only 8% of S/4HANA migrations completed on schedule, more than 60% exceeded budget, and nearly two-thirds reported significant quality deficiencies after completion (Horvath Partners study of 200 companies). Understanding the failure modes is a prerequisite for any realistic migration plan.

Three structural causes account for most failures. Weak Phase Zero governance — as Horvath Partner Christian Daxbock put it: “The complexity of the project and the required resources are underestimated, while organisational competence is overestimated.” Unresolved ABAP customisation debt — years of custom code that assessments routinely undercount. And change management neglect — the most frequently cited challenge was poor integration of IT into the overall project.

Full analysis: Why SAP Migrations Fail at a Rate That Should Concern Every Decision Maker.

What does a SAP migration actually cost and how do you build a business case?

SAP migration costs vary enormously by environment complexity, customisation level, and approach chosen — ranges from $2 million to over $1 billion make published figures misleading. The deeper challenge: 95% of survey respondents say building a positive ROI case for S/4HANA requires significant effort. The financial calculation must account for licence model changes from perpetual to subscription, implementation fees, business disruption, and the vendor lock-in risk of SAP BTP commitments.

The cost picture has three layers. The migration itself — implementation, data migration, custom code remediation. The commercial model shift — RISE with SAP bundles everything into a subscription, and giving up perpetual licences means losing ownership with no path back. And ongoing cost uncertainty — 92% of IT leaders cite escalating subscription costs as a concern.

A defensible business case needs to compare at least three paths: full migration, extended maintenance plus deferred migration, and third-party support. That honest comparison is what most vendor-led analyses leave out.

Full cost framework: The Real Cost of Migrating to SAP S/4HANA and How to Build a Business Case That Holds.

What are the main migration approaches — brownfield, greenfield, or bluefield?

Three migration approaches exist. Brownfield (system conversion) migrates your existing ECC system directly to S/4HANA, preserving customisations and data — faster but carries forward technical debt. Greenfield (new implementation) builds S/4HANA from scratch, eliminating legacy complexity but costing more. Bluefield (selective data transition) migrates specific processes or company codes while redesigning others — the most flexible approach and now the most common at roughly 48% of projects. The right choice depends on your customisation profile, timeline, and transformation ambitions.

The choice comes down to your customisation profile and transformation ambitions. Brownfield minimises retraining but tends to produce weaker long-term outcomes. Greenfield demands substantial change management but aligns with SAP’s “clean core” principle.

Decision framework: Brownfield, Greenfield, or Bluefield: Choosing the Right SAP Migration Path With the Data.

What alternatives exist to SAP’s prescribed migration path?

Three credible alternatives to a full S/4HANA migration exist. SAP extended maintenance extends standard support through December 31, 2030 at approximately a 9% cost premium — buying time, not a strategy. Third-party providers such as Rimini Street offer continued ECC support through 2040 at up to 50% lower cost than SAP fees. Composable ERP retains ECC as a stable core while integrating cloud-native applications for new capabilities. Each path involves trade-offs that deserve honest assessment.

There are trade-offs with each. Third-party support means no access to SAP’s roadmap. Composable ERP requires integration discipline. But 83% of surveyed SAP customers see value in composable approaches for faster access to AI and emerging technologies — that is not a fringe position.
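For intuition on the cost trade-off, a rough and purely illustrative comparison — the 9% premium and “up to 50% lower” figures come from above; the baseline maintenance fee is a hypothetical input you would replace with your own.

```python
# Rough illustration only: annual support cost under each path,
# relative to a hypothetical standard maintenance fee.
standard_fee = 1_000_000  # hypothetical annual SAP standard maintenance (USD)

extended_maintenance = standard_fee * 1.09  # ~9% premium, support to 2030
third_party_support = standard_fee * 0.50   # "up to 50% lower", support to 2040

print(f"extended:    ${extended_maintenance:,.0f}/year")
print(f"third-party: ${third_party_support:,.0f}/year")
```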

Alternatives analysis: The SAP Alternatives SAP Will Not Tell You About: Third-Party Support and Composable ERP.

Do you need to migrate to SAP S/4HANA to access AI value?

SAP’s own narrative ties AI capability to S/4HANA migration — SAP Joule (its generative AI assistant) and embedded analytics both require S/4HANA and a clean-core architecture. But the counter-argument has gained substance: ECC data can be exposed to external AI systems via API gateways without migration, and composable ERP plus agentic workflows can deliver AI-enhanced capabilities on top of a stable ECC core.

As John Burns of Summit BHC puts it: “You can extract the ERP data, for example via SQL into BigQuery, and then develop AI agents that access that transactional data and effectively become the face of the ERP system.” Whether S/4HANA AI features justify migration cost is a business case question, not a technical constraint.
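A minimal sketch of that pattern, assuming ECC tables have already been replicated into BigQuery. The dataset, table, and the `answer_with_llm` stand-in are hypothetical placeholders, not a reference implementation.

```python
# Minimal sketch: AI access to ECC data replicated into BigQuery.
# Dataset, table, and the LLM call are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()  # uses default project credentials

def answer_with_llm(prompt: str) -> str:
    return prompt  # stand-in for a real LLM/agent call

def open_po_summary(vendor_id: str) -> str:
    query = """
        SELECT po_number, amount, currency, created_at
        FROM `erp_mirror.purchase_orders`  -- hypothetical replicated ECC table
        WHERE vendor_id = @vendor AND status = 'OPEN'
        ORDER BY created_at DESC
    """
    job = client.query(
        query,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("vendor", "STRING", vendor_id)
            ]
        ),
    )
    rows = [dict(row) for row in job.result()]
    # The agent becomes "the face of the ERP": natural-language answers
    # over governed, read-only transactional data -- no migration required.
    return answer_with_llm(f"Summarise these open purchase orders: {rows}")
```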

Full analysis: Do You Need to Migrate to SAP S/4HANA to Get AI Value, or Is There Another Way?.

What does the SAP migration crisis signal about the future of enterprise software?

The SAP ECC migration crisis is not just an upgrade project — it is a structural inflection point in enterprise software. The difficulty of moving 35,000 organisations off a platform they depend on reveals the tension between monolithic ERP architecture and the modular, API-first patterns that have reshaped every other software category. The question forming in boardrooms is whether S/4HANA represents a genuine architectural evolution or a 30-year software cycle repeating itself.

Gartner projects more than 13,000 ECC customers will still be on legacy ERP in 2030. The CISQ estimates US software technical debt at $1.52 trillion — the SAP migration crisis is one piece of a systemic challenge where decades of customisation make any platform transition expensive and slow. As Amit Basu, CIO at International Seaways, puts it: “The era of a single monolithic ERP trying to manage every function is being replaced by agile, modular platforms that better reflect how businesses work.”

Strategic analysis: Is Monolithic ERP Architecturally Obsolete, and What the SAP Crisis Signals About Enterprise Software.

Where do you start?

The starting point depends on what you do not yet know. If you are unclear on whether or when the 2027 deadline applies to your system, start with the deadline explainer. If you already understand the urgency and want to evaluate your strategic options, go to the alternatives analysis or cost framework first. If you are in an active migration and need to reduce failure risk, the migration approaches article and failure statistics analysis are your entry points.

New to SAP ECC? Deadline explainer → cost framework → alternatives → migration approaches.

Already evaluating migration? Failure statistics → approach selection → the AI variable.

Questioning the strategic framing? Architectural analysis → AI question → alternatives.

Strategic paths at a glance

Migrate to S/4HANA — 12–36 months. High upfront cost, ongoing subscription. 60% over budget, 8% on schedule. Best fit: organisations ready for transformation.

SAP extended maintenance — Buys time to 2030 at ~9% premium. Delay, not resolution. Best fit: organisations needing 2–3 years to prepare.

Third-party support (e.g. Rimini Street) — Immediate; support to 2040 at up to 50% lower cost. No SAP roadmap access. Best fit: stable ECC, low transformation appetite.

Composable ERP — Phased, multi-year. Variable cost, lower than full migration. Requires integration discipline. Best fit: incremental modernisation.

RISE Private Edition Transition — HANA by 2030; support to 2033. Gated behind RISE contract. Best fit: large, complex organisations.

Detailed cost and trade-off analysis is in the cost framework and the alternatives analysis.


Frequently asked questions

Does the 2027 deadline apply to every SAP ECC system?

No. December 31, 2027 applies to ECC EHP 6–8. EHP 0–5 already lost mainstream maintenance on December 31, 2025 and was automatically converted to customer-specific maintenance — no extended maintenance was offered for those versions. Knowing your EHP version is the first step. Full detail in the deadline explainer.

What actually stops working after December 31, 2027?

Nothing stops working immediately — and that is part of the problem. Your ECC keeps running, but without new security patches or compliance updates. The risk is not a cliff edge on January 1, 2028. It is a gradual degradation that compounds over quarters and years.

Can I stay on SAP ECC after 2027 without paying extra?

Remaining on ECC under standard terms means no new patches or compliance updates. Extended maintenance buys time to 2030 at a cost premium. Customer-specific maintenance keeps the same price but with reduced scope. Third-party providers offer another path entirely. Which fits depends on your risk tolerance and strategic direction. See the alternatives analysis.

What is the difference between SAP ECC and SAP S/4HANA?

ECC runs on traditional relational databases. S/4HANA runs on the HANA in-memory database with a simplified data model and modern Fiori UX. Moving between them is not an upgrade — it is a re-implementation. An upgrade can be scheduled over a weekend; a re-implementation takes months to years.

What is RISE with SAP and is it the only migration path?

RISE with SAP (being rebranded to SAP Cloud ERP) bundles S/4HANA Cloud Private Edition, SAP BTP, and infrastructure into one subscription. It is SAP’s preferred vehicle but not the only path — S/4HANA can run on-premise, and GROW with SAP targets mid-market via public cloud. RISE was restructured in July 2025, so contract details require close scrutiny.

What is composable ERP and is it a real alternative?

Composable ERP keeps your core stable and adds capabilities by integrating best-of-breed cloud applications via APIs. Gartner projects 80% of organisations will adopt modular ERP deployment by 2026. It is a real alternative, but not a reason to avoid planning — it requires its own roadmap and integration discipline. See the alternatives analysis.

How long does an SAP ECC to S/4HANA migration actually take?

18–36 months is typical, but 59% of projects run over schedule. Gartner has seen three- to seven-year projects in complex environments. Resource scarcity is a growing constraint — S/4HANA architects are becoming harder to find as 2027 approaches. See the migration failure analysis.

Do you need S/4HANA to use SAP’s AI features?

SAP Joule requires S/4HANA and clean-core architecture. But ECC data can feed external AI tools via APIs, and partnerships like Rimini Street/ServiceNow are building AI automation on existing ECC estates. Whether S/4HANA AI justifies migration cost is a business decision. Explored in the AI analysis.

Is Monolithic ERP Architecturally Obsolete and What the SAP Crisis Signals About Enterprise Software

On January 29, 2026, SAP’s stock fell roughly 22%. Not from a scandal. Not from a product failure. From a guidance update about how quickly its installed base was converting to cloud software it had spent a decade building.

That sounds like a routine earnings miss. If you actually understand what ERP architecture does, it reads differently — as a market signal about the structural health of an entire software category.

SAP’s ECC end-of-life deadline is forcing a decision on tens of thousands of enterprises. The official answer SAP has been selling since 2015 is S/4HANA. Roughly 39% of SAP’s 35,000 ECC customers have taken it. The question for the other 61% is not just what to do about SAP — it is whether the monolithic ERP model itself is the right architectural foundation for the next decade. For a precise breakdown of what end of mainstream maintenance forces today, see our deadline explainer.

This article is our structured answer to that question. The honest verdict: “under structural pressure” rather than “obsolete” — but the evidence is specific and the argument is real. For the full decision-level detail, see our SAP ECC migration series.

Is the SAP Migration Crisis a Structural Inflection Point or Just an Upgrade Cycle?

Every ERP generation has had its painful upgrade cycle. SAP customers have been through the R/2-to-R/3 transition and every successive ECC version upgrade. Expensive, complained about, and eventually accepted — the installed base followed the path SAP laid out.

This cycle is different. That 39% migration figure across nearly a decade is not a typical adoption curve. Gartner projects 17,000 ECC customers — almost half the installed base — will still be on ECC when mainstream maintenance ends in 2027.

Here is the data point that matters. Bernstein Research analyst Mark Moerdler found that two-thirds of SAP’s new cloud customers are new to SAP entirely — not migrated ECC customers. A Freeform Dynamics survey of 455 organisations found 95% say building a positive ROI case for S/4HANA requires significant effort or is outright challenging. SAP’s own installed base is not following the intended path.

What makes this a structural inflection is that the installed base is routing around the official path. Kingfisher plc — the £13 billion UK retailer that owns B&Q and Screwfix — is the documented example. Rather than migrate to S/4HANA, Kingfisher moved its ECC system to Google Cloud with Rimini Street providing third-party support, then built an independent AI strategy using Google Cloud tools and Databricks.

The relevant analogy is the 2010s cloud disruption of on-premises software: installed base resistance, credible alternative paths, and growing VC investment. Whether this disruption follows the same arc is uncertain. But it is no longer a theoretical question.

Why Does $1.52 Trillion in Technical Debt Make This Moment Different?

The SAP ECC crisis is not a SAP-specific failure. It is one vivid instance of a structural problem running through enterprise software. The IT-CISQ 2022 report estimated accumulated software technical debt in the US at $1.52 trillion. Legacy systems consume resources that could fund modernisation — making modernisation progressively harder.

Monolithic ERP accumulates this debt in a specific way. Business processes are not documented externally — they are encoded inside the system. The finance workflow, the procurement approval chain, the inventory reconciliation logic: none of it lives in a process document. It lives in the ERP’s configuration. When support ends, you cannot simply move the data. You have to re-implement the business.

ECC migration costs range from $2 million for small organisations to over $1 billion for large enterprises. A Horváth study of 200 SAP user companies found migrations taking 30% longer than planned, with only 8% on schedule.

This is the structural trap. The more deeply a business is embedded, the more rational the deferral — until the deadline removes deferral as an option. “Just upgrade” is not a simple answer.

What Does the a16z Investment Thesis Actually Say About the Future of ERP?

In February 2025, Andreessen Horowitz published “The Opportunity for Next-Gen ERPs” — by partners Seema Amble, Eric Zhou, and Marc Andrusko. The thesis: the same conditions that enabled NetSuite to disrupt on-premises ERP are now emerging again, this time with AI-native ERP challenging cloud incumbents.

The NetSuite precedent is worth understanding. NetSuite was built cloud-native for the mid-market when SAP and Oracle dominated on-premises. Oracle acquired it for $9.3 billion in 2016, roughly 18 years after its 1998 launch. a16z argues AI is the equivalent forcing function — targeting two root causes of ERP failure: data ingestion (ERPs were not built for modern APIs) and data reconciliation (data failing to reconcile between sources like Salesforce and billing).

Rillet raised a $70 million Series B from a16z and ICONIQ in August 2025 — a16z putting capital behind its own analysis.

Is the analogy apt? The structural conditions are real. But AI-native ERP faces far more capable incumbents than NetSuite did in 2000, and the thesis depends on whether AI agents need clean modular data access to function — contested, not settled. Directionally credible. Not yet proven at enterprise scale.

Who Are the Next-Generation ERP Challengers and Are They Enterprise-Ready?

The a16z thesis named five AI-native ERP challengers. Doss raised an $18 million Series A (Theory Ventures, April 2025) — AI-native manufacturing and inventory management. Rillet raised $70 million Series B (a16z + ICONIQ, August 2025) — revenue accounting for SaaS companies. Campfire raised $3.5 million at Seed (June 2025) — SMB operations, earliest-stage and furthest from enterprise readiness. Toolkit is a fourth full AI-native rebuild. Endeavor is positioned as an AI engagement layer on top of existing ERP rather than a full rebuild.

The honest assessment: none of these have documented production deployments at the scale of a 500-person SAP ECC customer. Building vertically is architecturally sound — it is how credible challengers disrupt without replicating SAP’s horizontal breadth on day one. But it does mean they are not drop-in replacements.

What this means practically: track these for specific functional modules as part of a composable strategy. Think 3 to 5 years, not now.

For a deeper look at how composable ERP fits a near-term strategy, see composable ERP as the practical strategic alternative and how AI tools change the migration calculus.

SAP vs Oracle: Is Switching Vendors Just Trading One Lock-In for Another?

Oracle ERP Cloud (Oracle Fusion) is the primary incumbent alternative for large enterprises exiting the SAP ecosystem. If two-thirds of SAP’s new cloud customers are new to the vendor, as Moerdler found, then a substantial chunk of the ECC installed base moving to cloud ERP is choosing Oracle and others over S/4HANA. Oracle offers a fresh implementation rather than a constrained re-implementation inside SAP’s ecosystem.

The architectural counterpoint: Oracle ERP Cloud is also a monolithic integrated suite. Switching trades one vendor dependency for another. The architectural question about whether monolithic ERP can serve as a substrate for agentic AI orchestration applies equally to Oracle Fusion.

If your primary driver is escaping the specific complexity of the S/4HANA migration path, Oracle is a proven alternative. But you are solving the vendor lock-in problem specific to SAP, not the monolithic architecture problem.

What Did SAP’s 22% Stock Drop Actually Signal to the Market?

On January 29, 2026, SAP reported Q4 2025 results and guided to decelerating cloud backlog growth. The stock fell approximately 22%. Full-year 2025 revenue was €36.8 billion, up 8% — the company is not in crisis. The drop was the market repricing risk.

Cloud backlog is the leading indicator of SaaS business health. When it decelerates, it signals that forward-contracted revenue — what SAP projected from ECC customers converting to S/4HANA cloud — is not materialising. SAP is acquiring new cloud customers, but its natural ECC migration cohort is not converting. Some are migrating to Oracle. Some are staying on ECC with third-party support.

For anyone thinking about architecture, the significance here is not SAP’s financial health — it is that institutional investors are now pricing the risk that the monolithic ERP conversion thesis will not materialise at scale.

Why Does Agentic AI Make Monolithic ERP Architecturally Problematic?

The architectural argument is real and specific — but the production evidence at scale is not yet there.

Agentic AI coordinates across procurement, finance, HR, and supply chain simultaneously — executing business processes that span multiple data sources and require cross-system action. To do this, it needs clean, modular, API-accessible interfaces.

Monolithic ERP’s structural constraint: data and business logic are deeply coupled within a single application layer. When an AI agent needs data across modules, it must work within the vendor’s own API surface and integration policy. The vendor controls the access layer.

The key framing here is “systems of record” — what monolithic ERP was built to be — versus “systems of action” — what agentic AI actually requires. In a composable architecture, each module exposes its own API surface. No single vendor controls the access layer.

What is settled: enterprises with composable architectures and open API access are better positioned to add agentic AI orchestration than those locked into monolithic systems.
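As a sketch of what that settled point looks like in practice, assuming each module exposes its own REST API (every URL below is hypothetical): the orchestration layer reads from one module and acts in another, with no single vendor gating the flow.

```python
# A minimal sketch of an agentic "system of action" step across a composable
# estate. All URLs are hypothetical; the point is that each module owns its
# own API surface, so no single vendor controls the access layer.
import requests

MODULES = {
    "procurement": "https://procurement.example.com/api",
    "finance": "https://finance.example.com/api",
}

def approve_and_pay(po_number: str) -> dict:
    """One cross-module action: read a PO, then trigger its payment."""
    # Read the purchase order from the procurement module's own API...
    po = requests.get(
        f"{MODULES['procurement']}/purchase-orders/{po_number}", timeout=10
    ).json()
    # ...then act in the finance module, crossing a system boundary.
    payment = requests.post(
        f"{MODULES['finance']}/payments",
        json={"po_number": po_number, "amount": po["amount"]},
        timeout=10,
    ).json()
    return {"po": po, "payment": payment}
```

In a monolithic estate, both calls would have to go through one vendor’s API surface and integration policy.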

Is Monolithic ERP Actually Obsolete or Just Under Structural Pressure?

Three structural forces are converging on monolithic ERP right now.

Commercial pressure: The installed base is resisting the official upgrade path. Freeform Dynamics found 83% of SAP organisations using modular architectures achieve above-average performance, versus 27% for traditional approaches.

Market pressure: VC is betting on AI-native alternatives. Oracle is capturing SAP’s natural migration cohort. The stock market is pricing conversion thesis risk. These are capital allocation decisions — people are voting with money.

Architectural pressure: There is a specific technical incompatibility between how monolithic ERP stores data and how agentic AI needs to access it. The argument is sound; the production proof at scale is still emerging.

What the evidence does not support: that monolithic ERP is definitively obsolete. SAP SE has €36.8 billion in revenue. Gartner’s Mike Tucciarone puts it plainly: “SAP ERP remains a trusted solution… We have not observed widespread movement in this direction yet.”

So here are your three paths today. Each is covered in full in our complete SAP ECC end of support guide.

Upgrade to S/4HANA: Lowest-risk for enterprises deeply embedded in the SAP ecosystem. Buys vendor support through the next decade. Defers the architectural question until the market is clearer.

Composable ERP: Replace individual modules with best-of-breed SaaS while extending ECC life via third-party support. Rimini Street supports SAP ECC through 2040 at 50–90% lower cost than SAP’s own pricing. This is the most architecturally sound path for agentic AI adoption — but it requires sustained architectural discipline.

Explore AI-native alternatives: Appropriate for specific functional domains — revenue accounting with Rillet, manufacturing with Doss. Not a wholesale ERP replacement today.

If you are not in the SAP ecosystem, the pattern still applies. The architectural question — can your core business systems serve as a substrate for agentic AI orchestration? — applies whether you run SAP, Oracle, or NetSuite.

Monolithic ERP is not dead. But the agentic AI argument is not going away.

For the specific migration decisions the SAP crisis forces, see our complete guide to SAP ECC end of support and what comes next.

FAQ

Is composable ERP just hype or a real architectural alternative?

It is operationally real. Kingfisher plc moved SAP ECC to Google Cloud with Rimini Street support and built an independent AI strategy without migrating to S/4HANA at all. Freeform Dynamics found companies using modular architectures achieve above-average performance 83% of the time, versus 27% for traditional approaches. The hype risk is overstatement — composable ERP requires sustained architectural attention, not plug-and-play.

What are the main alternatives to SAP S/4HANA migration?

Three documented paths: upgrade to S/4HANA; composable ERP — extend ECC life via third-party support and replace individual modules with best-of-breed SaaS; or migrate to Oracle ERP Cloud. AI-native ERPs like Doss, Rillet, and Campfire are a fourth option for specific functional domains, but none are proven as full SAP replacements yet.

What does a16z think about the future of ERP?

Andreessen Horowitz published “The Opportunity for Next-Gen ERPs” in February 2025, arguing AI is the same structural forcing function for ERP that cloud computing was in the 2000s. The precedent they point to is Oracle’s $9.3 billion acquisition of NetSuite in 2016. a16z invested in Rillet’s $70 million Series B in August 2025 — this is an active thesis, not just a published opinion.

Why did SAP’s stock drop in January 2026?

On January 29, 2026, SAP guided to decelerating cloud backlog growth — forward-contracted cloud revenue growing more slowly than projected. Full-year 2025 revenue was €36.8 billion, up 8% — not a crisis. Bernstein analyst Mark Moerdler’s finding sums it up: two-thirds of SAP’s new cloud customers are new to the vendor, meaning SAP is not retaining its natural ECC migration cohort at projected rates.

Can enterprises really stay on SAP ECC beyond the 2027 support deadline?

Yes. Kingfisher plc is the documented example. Rimini Street supports SAP ECC through 2040 at 50–90% lower cost than SAP’s own pricing. SAP’s extended maintenance runs to 2030. Staying on ECC beyond 2027 is not theoretical — major enterprises are actively choosing it.

Are Doss, Rillet, and Campfire real competitors to SAP?

Not yet in the sense of full-enterprise displacement. These are early-stage companies building AI-native vertical ERP for specific use cases. The near-term use case is as point solutions within a composable strategy — revenue accounting with Rillet, manufacturing with Doss. For wholesale replacement of a 500-person SAP ECC deployment, they are not yet the answer.

Does moving from SAP to Oracle ERP Cloud solve the architectural problem?

No. Oracle ERP Cloud is also a monolithic integrated suite. Switching trades SAP vendor dependency for Oracle vendor dependency. The agentic AI orchestration constraints apply equally to Oracle ERP Cloud. If your primary driver is escaping the S/4HANA migration path specifically, Oracle is a proven alternative — but you are solving a vendor-specific problem, not the architectural one.

What is the architectural difference between monolithic ERP and composable ERP?

Monolithic ERP is a single integrated codebase — all modules sharing one database, business logic embedded in the system. Composable ERP is modular best-of-breed SaaS applications connected via APIs, each module owning its own data and API surface, replaceable without full rip-and-replace. The agentic AI consequence: composable ERP allows AI orchestration layers to access data through independent API surfaces; monolithic ERP constrains agents to one vendor’s integration policy.

What does the SAP crisis mean for companies not running SAP?

The pattern applies to any monolithic enterprise software dependency. The architectural question — can your core business systems serve as a substrate for agentic AI orchestration? — applies whether you run SAP, Oracle, or any other monolithic ERP. For SMB tech companies on NetSuite, the SAP crisis is a leading indicator of what mid-market cloud ERP incumbents may face as AI-native alternatives mature.

What is the NetSuite precedent and why does a16z use it?

Oracle acquired NetSuite for $9.3 billion in 2016, validating cloud ERP as a category. NetSuite launched in 1998, captured a segment incumbents were not serving, and grew to acquisition scale in roughly 18 years. a16z draws the analogy: AI-native ERP startups may capture vertical niches incumbent cloud ERP vendors do not serve well, and grow to compete at scale within a similar timeframe.

How is agentic AI different from traditional ERP automation?

Traditional ERP automation operates within the system’s process engine — automating steps the system already knows about. Agentic AI is an orchestration layer above the ERP, executing multi-step processes by calling data from multiple systems simultaneously and triggering actions across system boundaries. That is exactly why monolithic ERP creates structural friction for agentic workflows.

Is SAP S/4HANA the future or is it also becoming obsolete?

S/4HANA will be deployed at many large enterprises — it is not imminently obsolete. Structurally, it shares the monolithic architecture’s constraints on agentic AI orchestration. It solves the ECC end-of-life problem but not the longer-term architectural question. It buys 10–15 years of vendor support while the AI-native market matures. That might be exactly what you need.

The SAP Alternatives SAP Will Not Tell You About: Third-Party Support and Composable ERP

SAP’s messaging on ECC end-of-mainstream-support is pretty clear: migrate to S/4HANA via RISE with SAP, or else. What SAP doesn’t go out of its way to tell you is that two other commercially structured paths exist — third-party support and composable ERP. And 83% of SAP ECC customers are not fully aware of either.

This article covers both. The cost savings from third-party support are real. So are the risks that alternative vendors typically gloss over. Kingfisher — the £13 billion retail group behind B&Q and Screwfix — moved SAP ECC to Google Cloud under Rimini Street support and built AI capabilities on ECC data without touching S/4HANA. That case study is the clearest evidence we have that the alternatives actually work at enterprise scale.

For the full landscape of options, see our complete guide to SAP ECC end of support and what comes next.

What are the real alternatives to migrating to SAP S/4HANA?

There are three paths for SAP ECC customers facing the 2027 deadline. Migrate to S/4HANA via RISE or direct upgrade. Stay on ECC under a third-party support contract with someone like Rimini Street or Spinnaker Support. Or go composable — keep ECC as the stable core and modularise around it.

SAP has been pushing migration since S/4HANA launched in 2015. But as of Q4 2024, only 39% of ECC customers had adopted S/4HANA. Gartner projects around 13,000 organisations will still be on ECC in 2030. Gartner’s Fabio Di Capua put it bluntly: “You convinced less than half of your clients to migrate in 15 years. How can you think you will migrate the next 50% in five years?”

The two alternatives aren’t mutually exclusive. Many organisations combine third-party support as the cost-reduction lever with composable ERP as the innovation strategy. That’s exactly what the Kingfisher case study demonstrates, and it’s what independent research suggests delivers above-average performance outcomes.

What does third-party SAP support actually provide and how does it work?

Third-party support replaces SAP Enterprise Support with an independent contract that covers bug fixes, security updates, interoperability patches, regulatory compliance updates, and custom code support — which SAP’s standard offering treats as your problem. SAP Enterprise Support fees typically run at 18–22% of net licence value per year. Third-party support providers charge roughly half that at a fixed rate.

Rimini Street, the largest independent provider of third-party support for SAP and Oracle enterprise software across 45+ countries, has committed to supporting SAP ECC through 2040. Spinnaker Support is the other established option, operating in 25+ countries with a more boutique service model.

The financial picture is pretty straightforward. One documented case: a global retailer paying $3.1M annually to SAP on $14M in ECC licences switched to Rimini Street and dropped support costs to $1.55M annually. Over four years that’s $6.2M in direct savings plus $1.8M in avoided SAP fee increases — $8M total.
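The arithmetic behind that case, as a quick check (figures taken from the case study above, not a general pricing model):

```python
# Reproducing the documented retailer case from the paragraph above.
sap_annual = 3.1    # $M per year, roughly 22% of $14M net licence value
tps_annual = 1.55   # $M per year after switching to third-party support
years = 4

direct_savings = (sap_annual - tps_annual) * years   # 6.2 ($M)
avoided_increases = 1.8                              # $M, avoided SAP fee rises
print(direct_savings + avoided_increases)            # 8.0 ($M total)
```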

TPS providers use virtual patching — a security technique where protective controls are implemented at the operating environment or network layer to address known vulnerabilities, without needing access to SAP’s source code. This is the technically important difference from SAP’s approach, and it’s the source of the most significant TPS risk.
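For readers unfamiliar with the mechanism, here is a toy illustration of the idea, assuming a hypothetical vulnerable URL pattern: the protective control sits in front of the application and blocks exploit traffic while the application code itself stays unpatched. Real TPS controls operate at the network or operating-environment layer and are far more sophisticated than this sketch.

```python
# A toy illustration of virtual patching: block known exploit traffic in
# front of the application instead of changing the application's code.
# The URL signature is hypothetical; real controls are network/OS-layer.
import re

BLOCKED = re.compile(r"/sap/vulnerable_service")  # hypothetical signature

def virtual_patch(app):
    """Wrap a WSGI app with a compensating control for a known vulnerability."""
    def guarded(environ, start_response):
        if BLOCKED.search(environ.get("PATH_INFO", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Blocked by virtual patch"]
        return app(environ, start_response)  # everything else passes through
    return guarded
```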

What are the risks of third-party SAP support that vendors do not mention?

TPS vendors market cost savings hard. They have less incentive to spell out what you give up. Here’s the full picture.

Security patch coverage. SAP issues patches for all newly discovered vulnerabilities in its software. TPS providers issue virtual patches for vulnerabilities they have catalogued, which may lag or miss SAP’s full vulnerability surface. For regulated industries — financial services, healthcare, utilities — with specific patch compliance obligations, this gap requires a formal security risk assessment, not just a cost comparison. Gartner’s Dixie John has noted that while strategies like third-party support and edge innovation have merit, she believes core system upgrades ultimately remain necessary for advanced capabilities. Worth taking seriously.

System freeze. Third-party support means no access to SAP software updates, new functionality releases, or SAP roadmap features including S/4HANA AI capabilities. If your ECC implementation depends on SAP-native capabilities you expect to need updated, TPS locks you out of them.

Reinstatement costs. If you later return to SAP Enterprise Support or migrate to S/4HANA, SAP charges back-maintenance fees for the entire period off support plus approximately a 20% reinstatement surcharge. One documented case: a $2M/year contract off SAP for three years results in a $6M+ reinstatement bill. Model this scenario before you commit; a worked sketch follows this list.

SAP relationship consequences. Michael Bloch of the German SAP user group DSAG was direct: “SAP will not grant you any incentives if you want to move to the SAP Cloud solutions” after switching to third-party support.

Custom code complexity. TPS providers support your existing custom code, but the S/4HANA codebase diverges while you are on ECC. A future migration becomes more complex the longer you stay on TPS — a technical debt dimension that does not appear in the cost savings calculations.
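To make the reinstatement risk concrete, a minimal model under the stated assumptions ($2M/year contract, three years off support, roughly 20% surcharge; actual SAP terms vary by contract):

```python
# Modelling the reinstatement scenario from the risk list above.
annual_support = 2.0   # $M/year SAP contract value before leaving
years_off = 3
surcharge = 0.20       # approximate reinstatement surcharge

back_maintenance = annual_support * years_off   # 6.0 ($M)
print(back_maintenance * (1 + surcharge))       # 7.2 ($M total bill)
```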

What is composable ERP and why is it gaining traction among SAP customers?

Composable ERP is a Gartner-coined architectural strategy: keep ECC for stable, foundational processes — finance, procurement, HR — and build modular, best-of-breed applications around it via APIs. The contrast is with monolithic replacement of the whole ERP stack at once, which is what RISE with SAP’s migration path involves.

A Rimini Street survey found 83% of SAP customers see composable approaches as providing faster access to AI, and 94% highlight the freedom to choose best-fit solutions for each business need. Worth noting: 29% of organisations still on ECC are no longer looking to SAP for innovation at all. Given that 92% of SAP customers cite rising and unpredictable subscription costs as a significant problem, the ability to choose tools independently has obvious commercial appeal.

Joe Locandro, Rimini Street’s global CIO, describes the implementation pattern clearly: “You can have an Oracle, an SAP, or whatever underneath and put new screens developed through ServiceNow or Microsoft and new workflows on top. It’s headless. All the innovation, workflows, and screens look nothing like the old green screens.” ECC as the data and transaction core, an API integration layer connecting it to modular applications, best-of-breed tooling on top.

The technical prerequisite is an API integration layer. The composable model only works if ECC data is accessible to modular applications. Data cleanliness and integration orchestration are where composable ERP transitions most commonly stall — and they are not optional extras.
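As a sketch of what that integration layer can look like, assuming ECC data is already exposed through an OData gateway (the service name, entity set, and credentials below are hypothetical): a thin facade that re-publishes ECC transactions behind a clean REST surface for the modular applications on top. A production estate would add real authentication, caching, and usually an iPaaS or API gateway.

```python
# A minimal sketch of the integration layer: ECC as the transaction core,
# re-exposed through a clean REST facade. Service name, entity set, and
# credentials are hypothetical placeholders.
import requests
from fastapi import FastAPI

app = FastAPI(title="ECC facade")
ECC_ODATA = "https://ecc.example.com/sap/opu/odata/sap/ZPO_SRV"  # hypothetical

@app.get("/purchase-orders/{po_number}")
def purchase_order(po_number: str) -> dict:
    """Read one PO from ECC's OData gateway and re-expose it as clean REST."""
    resp = requests.get(
        f"{ECC_ODATA}/PurchaseOrders('{po_number}')",  # hypothetical entity set
        params={"$format": "json"},
        auth=("SERVICE_USER", "SECRET"),  # placeholder credentials
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()["d"]  # OData v2 wraps payloads in "d"
```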

How did Kingfisher move SAP ECC to Google Cloud without migrating to S/4HANA?

Kingfisher plc — £13 billion turnover, parent of B&Q and Screwfix — is the most prominent enterprise case study of this model. The scale matters. This is not a workaround used by a small business.

What Kingfisher did: moved SAP ECC from on-premise infrastructure to Google Cloud — an infrastructure migration, not a platform migration to S/4HANA. They moved SAP ECC support from SAP Enterprise Support to Rimini Street. Then they built AI and personalisation capabilities directly on ECC data using Google Cloud services and Databricks — personalisation engines, product recommendation systems, flexible pricing models — without waiting for S/4HANA’s AI capabilities.

Kingfisher’s group CTO Chris Blatchford presented this strategy at the Gartner Symposium/ITxpo in Barcelona in November 2025. He explained that Kingfisher had attempted negotiations with SAP to establish clear business value in upgrading and ultimately wasn’t persuaded by SAP’s case. Kingfisher’s approach directly contradicts SAP’s 2023 claim that “newest innovations and capabilities will only be delivered in SAP public cloud and SAP private cloud.”

Cloud infrastructure elasticity, AI capability on existing ECC data, and support cost reduction — all without the disruption, cost, and lock-in risk of a RISE migration. For organisations asking whether AI access requires S/4HANA migration, see our article on what AI is actually achievable without migrating.

Why does perpetual licence preservation matter and what happens if you give it up?

ECC perpetual licences are owned outright — an asset, not a subscription. RISE with SAP converts that asset into a subscription, and once that conversion is made, the perpetual licence is permanently surrendered. It cannot be recovered.

Joe Locandro puts it plainly: “You’ve got an asset on your books that you can keep running till 2040 or beyond, but if you give up perpetual licenses, there’s no going back. Sweat the asset and build outside of it.” Preserving the perpetual licence keeps all three paths open — migration, third-party support, and composable ERP — along with the negotiating leverage to push back on RISE pricing.

The lock-in chain in RISE is specific. The subscription bundles SAP BTP consumption-based pricing (variable, often unpredictable), SAP Signavio for process management, and SAP LeanIX for enterprise architecture — additional cost layers that come with the model. Exiting any component is difficult because software usage, operations, and support are contractually inseparable.

Kingfisher preserved its perpetual licence position by moving to Rimini Street and Google Cloud without converting to RISE. For the total cost of ownership comparison across all three strategic paths, see our business case analysis. For the failure statistics that make the financial case for staying on ECC, see the cost of SAP migration budget overruns.

How do you evaluate whether third-party support or composable ERP is right for your organisation?

Cost comparison is one input. Several others matter more in specific contexts.

For third-party support, the key variables:

Regulated industry and compliance posture. Does your organisation operate under compliance frameworks requiring SAP’s official patch coverage? If yes, virtual patching requires a formal gap assessment — not a standard RFP question.

Strategic dependency on SAP roadmap. Is your ECC implementation built around SAP-native capabilities you expect to need updated? If yes, system freeze risk is higher and TPS may be a bridge rather than a long-term position.

SAP relationship value. Active co-innovation agreements and preferred pricing may be jeopardised by moving off Enterprise Support. Quantify what you would be giving up.

Reinstatement scenario. Model the cost of returning to SAP support. If the reinstatement bill at the likely scale is unacceptable, TPS is effectively a one-way door.

Provider evaluation. Both Rimini Street and Spinnaker Support should go through a detailed RFP — coverage breadth for your modules and versions, security response SLAs, regulatory update timeliness for your jurisdictions. Headline cost comparisons are not sufficient.

For composable ERP, the key prerequisites:

Integration layer readiness. John Burns, senior director of financial systems at Summit BHC, put it plainly: “If your data is messy or your processes are inconsistent, splitting the layers will not fix that. The core still has to be clean, and the front end still has to be governed.” The API integration layer is a real technical prerequisite.

Process modularity. Finance and core procurement are deeply integrated with ECC’s transaction engine — they’re the last processes to modularise. E-commerce, analytics, and customer-facing processes are typically first.

The combined approach — third-party support for cost reduction while building composable ERP capability incrementally — is the Kingfisher pattern and what Freeform Dynamics data suggests delivers above-average performance.

Where do you start with a composable ERP transition alongside existing SAP ECC?

Composable ERP is incremental by design. You add capabilities around a stable core over time, not in one move. Here’s how it typically plays out.

Analytics and reporting are the lowest-disruption starting point. Connecting a modern data platform — Databricks, Snowflake, Power BI — to ECC data via API does not touch transactional systems. Most organisations start here.
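A minimal sketch of that starting point, assuming a hypothetical OData service on the ECC side: batch-extract line items through the gateway and land them as a file that Databricks, Snowflake, or Power BI can ingest, without touching the transactional system.

```python
# Sketch of the lowest-disruption starting point: batch-extract ECC data
# via its OData gateway into a file an analytics platform can ingest.
# Service and entity names are hypothetical.
import pandas as pd
import requests

ECC_ODATA = "https://ecc.example.com/sap/opu/odata/sap/ZFI_SRV"  # hypothetical

def extract_gl_lines() -> pd.DataFrame:
    """Page through an OData v2 entity set and return it as a DataFrame."""
    rows, url = [], f"{ECC_ODATA}/GLLineItems?$format=json"
    while url:
        page = requests.get(url, auth=("SERVICE_USER", "SECRET"), timeout=30).json()
        rows.extend(page["d"]["results"])
        url = page["d"].get("__next")  # OData v2 server-side paging, if any
    return pd.DataFrame(rows)

if __name__ == "__main__":
    # Land the extract as Parquet for platform ingestion (needs pyarrow)
    extract_gl_lines().to_parquet("gl_line_items.parquet")
```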

Customer-facing processes come next. E-commerce, personalisation, and customer service tooling change faster than monolithic ERP can keep up with. Kingfisher’s personalisation and recommendation engines via Google Cloud and Databricks are the case study example.

Workflow and service management is where ServiceNow enters. Replace SAP-native workflow management while leaving ECC’s transactional backbone untouched. Agentic AI fits naturally here — “agentic workflows thrive in architectures with open data access and strong APIs.”

The API integration layer choice has long-term implications. iPaaS platforms like MuleSoft, Boomi, or Azure Integration Services each have different trade-offs. Base the decision on your cloud provider relationship and integration team capability.

What to avoid starting with: finance and core procurement. The switching cost is high, integration complexity is substantial, and you are probably not ready. Start with processes that are genuinely modular and where you have the internal capability to manage the integration.

For deeper analysis of composable ERP as a next-generation architecture signal and whether monolithic ERP is obsolete, see our strategic capstone analysis. This article is part of our complete guide to SAP ECC end of support and what comes next, which maps all strategic paths from the 2027 deadline through migration choices, alternatives, and the AI variable.

Frequently asked questions

Can I get third-party support for SAP ECC after 2027?

Yes. Rimini Street has announced it will support SAP ECC through 2040. Spinnaker Support similarly offers post-2027 coverage. This is a commercially structured alternative — you enter a formal support contract with the TPS provider. SAP’s extended maintenance through 2030 is also available at a two-percentage-point surcharge on maintenance fees if you need a bridge period.

Is Rimini Street a legitimate alternative to SAP’s own support?

Rimini Street is the largest independent provider of third-party support for SAP and Oracle enterprise software, with the Kingfisher case study demonstrating it operating at enterprise scale. It’s a legitimate commercial alternative, but it’s not equivalent to SAP Enterprise Support in all respects — notably in security patch coverage, where virtual patching is used instead of SAP’s official code patches. Organisations in regulated industries should conduct a formal security risk assessment before switching.

What is composable ERP in plain terms?

Keep your existing ERP for stable back-office processes and add specialist software around it for everything else, connected via APIs. Instead of replacing the entire ERP stack with a new monolithic system, you add best-of-breed tools for analytics, customer experience, AI, or HR alongside the core. Faster innovation without touching the transactional backbone.

Is composable ERP real or just hype?

Composable ERP is a genuine architectural strategy. The Kingfisher case study demonstrates it at enterprise scale — see the dedicated section above. Freeform Dynamics research supports the approach: 83% of organisations combining composable architectures with third-party support achieved above-average business performance. The API integration layer is a real technical prerequisite and implementation complexity is non-trivial.

What is the cost of returning to SAP support after using third-party support?

SAP charges back-maintenance fees for the entire period off support, plus approximately a 20% reinstatement surcharge. One documented case puts this at more than $6 million for an organisation on a $2M/year contract off SAP for three years. Model this scenario as part of the initial TPS decision.

How does virtual patching work and is it secure enough?

Virtual patching is a configuration-level security technique. TPS providers implement protective controls around known vulnerabilities at the operating environment or network layer, without applying SAP’s official code patches. The limitation is coverage: SAP patches all vulnerabilities it discovers; TPS providers patch the vulnerabilities they have catalogued, which may lag or not cover SAP’s full vulnerability surface. For organisations with strict patch compliance requirements, a formal gap assessment is necessary before switching.

What is RISE with SAP and what does it include?

RISE with SAP is SAP’s bundled cloud migration offering that converts ECC perpetual licences to subscriptions and moves customers to SAP S/4HANA. The bundle includes SAP S/4HANA (cloud edition), SAP BTP with consumption-based pricing, SAP Signavio (process management), and SAP LeanIX (enterprise architecture). The conversion of perpetual licences to subscriptions is permanent.

Can I stay on SAP ECC and still access AI capabilities?

Yes. The Kingfisher case study demonstrates this directly — see the dedicated section above for the full picture. The approach requires a composable architecture: ECC data exposed via APIs to a modern data platform, which then feeds AI applications. SAP’s S/4HANA offers native AI through SAP BTP, but that requires the RISE subscription model.

Rimini Street vs Spinnaker Support — how do I choose?

Rimini Street covers ECC, S/4HANA, HANA, BW, and CRM across 45+ countries — publicly traded, at roughly 50% of SAP fees. Spinnaker Support covers ECC, S/4HANA, and HANA across 25+ countries with a boutique service model at 50–60% of SAP fees. Both should go through a detailed RFP with specific coverage scope requirements — coverage breadth for your modules, security response SLAs, regulatory update timeliness, and customer references in your industry.