Eighty-eight per cent of enterprise AI agent pilots never reach production. The models are capable enough. The bottleneck is the absence of a production surface: no governed runtime, no identity management, no deployment pipelines, no cost tracking.
Microsoft is spending $150 billion a year on AI infrastructure. Foundry is the software surface designed to make that infrastructure consumable for enterprise workloads. Announced at Build 2026, it is a production infrastructure layer for deploying, governing, and scaling enterprise AI agents, a different category from AI studios and model training platforms.
At the centre of its design is the reliability-first thesis: the strategic asset that persists is the governance and operations layer, not any individual model.
What Is Microsoft Foundry and How Does It Differ from Azure AI Studio?
Microsoft Foundry is a production infrastructure layer for deploying, governing, and scaling enterprise AI agents. Azure AI Studio was a prototyping and experimentation environment. The gap between them is the gap between a development environment and a production cluster.
The naming has not helped. It went from Azure AI Studio to Azure AI Foundry to Microsoft Foundry, and plenty of people assumed it was just another rebrand. The name changed. The architecture changed more: Foundry runs on an entirely different resource architecture, replacing the old Azure ML workspace model that AI Studio relied on.
The architecture separates two concerns. The Foundry Resource handles IT governance: networking, security, model deployments, role-based access control. Projects sit underneath as isolated development environments where you build agents and run evaluations. This layered separation means the control plane (who can deploy, which models are approved, what policies apply) is distinct from the data plane (your agent code, your evaluations). That separation is what makes Foundry an operational surface, not a playground.
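The control-plane and data-plane split can be illustrated with a small model. This is a sketch only: the class names `FoundryResource` and `Project`, their fields, and the model name are hypothetical stand-ins, not the Foundry SDK. It shows a project asking its parent resource for a policy decision instead of making one itself.

```python
from dataclasses import dataclass, field


@dataclass
class FoundryResource:
    """Control plane (hypothetical): who may deploy, which models are approved."""
    approved_models: set[str]
    deployers: set[str]
    projects: dict[str, "Project"] = field(default_factory=dict)

    def create_project(self, name: str) -> "Project":
        project = Project(name=name, resource=self)
        self.projects[name] = project
        return project

    def can_deploy(self, user: str, model: str) -> bool:
        return user in self.deployers and model in self.approved_models


@dataclass
class Project:
    """Data plane (hypothetical): agent code and evaluations live here."""
    name: str
    # Excluded from repr/eq to avoid a resource <-> project reference cycle.
    resource: FoundryResource = field(repr=False, compare=False)
    deployed: list[str] = field(default_factory=list)

    def deploy_agent(self, user: str, agent: str, model: str) -> None:
        # The project never decides policy itself; it asks the resource.
        if not self.resource.can_deploy(user, model):
            raise PermissionError(f"{user} may not deploy {agent} on {model}")
        self.deployed.append(f"{agent}@{model}")
```

Because policy lives on the resource, tightening `approved_models` affects every project at once without touching agent code.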
Foundry assumes models will change. The governance layer, identity layer, and operational tooling are built to persist across model generations. If you swap from GPT-5 to Claude next year, your deployment pipelines and compliance posture do not reset.
What Does Microsoft Foundry’s Agent Service Actually Provide for Production Deployments?
Foundry Agent Service provides a managed runtime that handles scaling, state management, tool orchestration, observability, and cost tracking. Your team focuses on agent logic; the service handles the infrastructure.
Agent types. Hosted agents are stateful, containerised, with managed scaling and memory persistence. They run in sandboxed sessions and are framework-agnostic: you can build them with the Microsoft Agent Framework, LangGraph, the OpenAI Agents SDK, or the Anthropic Agent SDK. Prompt agents are stateless, defined entirely through configuration with no code to maintain.
The rule: choose hosted agents when you need custom orchestration logic; choose prompt agents when instructions and tool attachments are enough.
Memory. Hosted agents are stateful because of the three memory types Foundry provides. Procedural memory governs agent instructions and behaviour patterns; Microsoft reports that configuring it improves benchmark success rates by 7 to 14 percentage points at near-baseline cost, compared with agents that lack it. User memory retains cross-session preferences so agents personalise without re-prompting. Session memory holds in-flight conversation state, preventing context collapse during long-running interactions.
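To make the three memory scopes concrete, here is a minimal sketch. The `AgentMemory` class and its methods are hypothetical and are not Foundry's memory API. The point is the differing lifetimes: procedural and user memory outlive a session, while session memory is discarded when it ends.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical three-tier memory store, for illustration only."""
    procedural: list[str] = field(default_factory=list)            # instructions and behaviour patterns
    user: dict[str, dict[str, str]] = field(default_factory=dict)  # per-user, survives sessions
    session: dict[str, list[str]] = field(default_factory=dict)    # per-session conversation turns

    def remember_preference(self, user_id: str, key: str, value: str) -> None:
        self.user.setdefault(user_id, {})[key] = value

    def add_turn(self, session_id: str, text: str) -> None:
        self.session.setdefault(session_id, []).append(text)

    def end_session(self, session_id: str) -> None:
        # Only session state is dropped; the other two tiers persist.
        self.session.pop(session_id, None)

    def build_context(self, user_id: str, session_id: str) -> str:
        parts = list(self.procedural)
        parts += [f"{k}={v}" for k, v in sorted(self.user.get(user_id, {}).items())]
        parts += self.session.get(session_id, [])
        return "\n".join(parts)
```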
Knowledge grounding. Foundry IQ lets your agents retrieve and reason over enterprise data (Azure Blob Storage, SharePoint, OneLake) with permission-aware, citation-backed answers. Built on Azure AI Search, it provides a single grounding API. An agent cannot access data the user lacks permission to see.
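Permission-aware, citation-backed retrieval can be sketched in a few lines. This is not the Foundry IQ or Azure AI Search API: `Document`, `retrieve`, the group names, and the keyword matching are all simplifying assumptions. What it illustrates is the ordering: the permission filter runs before any matching, so restricted content never reaches the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str                     # illustrative path, used as the citation
    text: str
    allowed_groups: frozenset[str]  # groups permitted to read this document


def retrieve(query: str, docs: list[Document], user_groups: set[str]) -> list[dict]:
    """Return matching passages with citations, limited to what the user may read."""
    terms = query.lower().split()
    results = []
    for doc in docs:
        if not (doc.allowed_groups & user_groups):
            continue  # permission check happens before any matching
        if any(term in doc.text.lower() for term in terms):
            results.append({"text": doc.text, "citation": doc.source})
    return results
```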
The Responses API provides a unified abstraction across all agent types. Switch models without changing agent code.
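The idea of switching models without changing agent code amounts to programming against an interface. The sketch below uses a Python `Protocol` to show the pattern. `ModelBackend`, `BackendA`, `BackendB`, and `run_agent` are hypothetical names and do not reflect the real Responses API surface.

```python
from typing import Protocol


class ModelBackend(Protocol):
    """Hypothetical common interface that every model provider satisfies."""
    def respond(self, prompt: str) -> str: ...


class BackendA:
    def respond(self, prompt: str) -> str:
        return f"[A] {prompt}"


class BackendB:
    def respond(self, prompt: str) -> str:
        return f"[B] {prompt}"


def run_agent(backend: ModelBackend, task: str) -> str:
    # Agent logic depends only on the interface, never on a concrete provider.
    return backend.respond(f"Summarise: {task}")
```

Swapping `BackendA()` for `BackendB()` at the call site changes the provider while `run_agent` stays untouched.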
CI/CD. Foundry applies software deployment discipline to your agents. You create, test, trace, evaluate, publish, and monitor, with evaluation gates and canary deployments built in. GitHub Actions and Azure DevOps integration runs automated evaluations on every commit, with rollback mechanisms if scores degrade.
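An evaluation gate is, at its core, a comparison of scores against minimum thresholds. The following sketch assumes evaluation results arrive as a dictionary of metric names to scores. The metric names and threshold values are invented for illustration and are not Foundry's evaluation framework.

```python
def passes_gate(
    scores: dict[str, float], thresholds: dict[str, float]
) -> tuple[bool, list[str]]:
    """Return (ok, failures). A metric missing from scores counts as a failure."""
    failures = [
        name
        for name, minimum in thresholds.items()
        if scores.get(name, float("-inf")) < minimum
    ]
    return (not failures, failures)


# Illustrative thresholds; real values would come from your own quality bar.
THRESHOLDS = {"groundedness": 0.80, "safety": 0.95}
```

A pipeline step would call `passes_gate` after each evaluation run and block promotion when the first element is `False`, reporting the failing metrics from the second.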
Developer tooling. The VS Code Foundry Toolkit provides templates, GitHub Copilot SDK integration, local debugging with traces, and one-click deployment. The Azure Developer CLI supports one-command deployments with autoscaling, managed identity, and promotion gates.
Identity and governance. Each agent gets a dedicated workload identity through Microsoft Entra ID, authenticating as itself rather than a user proxy. Azure API Management enforces rate limiting, authentication, and policy on every tool call. Semantic Kernel and the Microsoft Agent Framework provide the open-source orchestration runtime beneath the managed service.
Scale evidence. Enterprise deployments already running on Foundry include KPMG, which deployed Agent 365 to 276,000 employees; Standard Chartered, which saw a 40 per cent developer productivity boost; and SoftBank, which cut customer support costs by $150 million. Foundry’s control plane surfaces cost per agent interaction across model providers, with budget alerts and spending caps configurable at the resource level. The production gap was never about model capability. It was always about infrastructure.
What Is Microsoft’s Vendor-Agnostic Model Marketplace Strategy and Why Does It Matter?
Microsoft Foundry’s model marketplace hosts models from Anthropic, OpenAI, Meta, DeepSeek, Mistral, xAI, Fireworks AI, and Microsoft’s own models, all behind a unified governance, identity, and cost-management layer. The strategy eliminates model lock-in: the infrastructure persists regardless of which model you choose.
The marketplace is the architectural expression of the reliability-first thesis. If models commoditise rapidly, the asset that holds value is the layer that manages them. Foundry makes that layer the product.
Consider the roster. Anthropic’s Claude Fable 5 arrived on Foundry on 9 June 2026. Opus 4.8 was available from 29 May 2026. A competitor’s frontier models, hosted as first-class citizens on Microsoft’s infrastructure. Over 10,000 customers have used more than one model on Foundry, and the number using both Anthropic and OpenAI models doubled quarter over quarter.
Deloitte demonstrates why this matters at scale. In October 2025, Deloitte rolled out Claude to more than 470,000 employees globally. They chose a specific model for specific enterprise workflows, through Foundry’s marketplace, where the governance infrastructure was already in place.
What you want from this is model portability. The Foundry Control Plane applies the same content safety filters and cost controls whether an agent calls GPT-5 or Claude. The governance layer makes multi-model choice safe. We cover the governance infrastructure that makes this possible in detail elsewhere; how Foundry compares with AWS Bedrock and Google Vertex AI deserves its own treatment.
What This Means for Your AI Infrastructure
That 88 per cent statistic is not a comment on model quality. It describes a gap in operational infrastructure. Foundry’s architecture is designed to close that gap, and Microsoft’s $150 billion annual spend is the bet behind it.
Every layer makes the same argument: the infrastructure that manages models outlasts the models themselves. The marketplace, the agent service, and the CI/CD pipelines are all bets on where value will accumulate.
If models commoditise as fast as the marketplace strategy assumes, the platform that matters will be the one whose governance, deployment, and identity infrastructure is still standing when today’s models are gone. That may not be the platform with the best models today.
Frequently Asked Questions
How much does Microsoft Foundry cost, and what determines pricing?
Microsoft Foundry pricing is consumption-based and varies by the models and services you use. The platform itself has no fixed licensing fee; costs accrue from model inference tokens, agent hosting compute, knowledge indexing via Foundry IQ, and API management throughput. Foundry’s built-in Agent ROI analytics surface cost per agent interaction across model providers, so you can compare spending across GPT-5, Claude, and other models from a single cost-management pane. Budget alerts and spending caps are configurable at the Foundry Resource level.
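Cost per interaction and a spending cap reduce to simple arithmetic. The sketch below is an assumption-laden illustration: the per-token prices are made up, and `Price`, `interaction_cost`, and `Budget` are hypothetical helpers, not Foundry's cost-management API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_1k: float   # currency units per 1,000 input tokens
    output_per_1k: float  # currency units per 1,000 output tokens


# Made-up prices for two placeholder models.
PRICES = {"model-a": Price(0.002, 0.008), "model-b": Price(0.003, 0.015)}


def interaction_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price = PRICES[model]
    return (input_tokens / 1000) * price.input_per_1k + (
        output_tokens / 1000
    ) * price.output_per_1k


class Budget:
    """Hypothetical spending cap with an alert threshold."""

    def __init__(self, cap: float, alert_ratio: float = 0.8) -> None:
        self.cap = cap
        self.alert_ratio = alert_ratio
        self.spent = 0.0

    def record(self, cost: float) -> str:
        if self.spent + cost > self.cap:
            return "blocked"  # the cap is a hard stop; nothing is recorded
        self.spent += cost
        return "alert" if self.spent >= self.cap * self.alert_ratio else "ok"
```

Running the same token counts through `interaction_cost` for each model gives the kind of side-by-side comparison a single cost pane is meant to provide.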
Do I need to migrate away from Azure AI Studio to use Foundry?
No. Azure AI Studio projects remain fully operational, and Foundry does not force a migration. The two platforms serve different stages of the lifecycle: AI Studio is for experimentation and prototyping, while Foundry is for production deployment and governance. You can continue building and evaluating models in AI Studio and promote agents to Foundry’s managed runtime when they are ready for production. The Project layer within Foundry is designed to receive promoted artefacts from existing AI Studio workspaces.
Can Microsoft Foundry run open-source models, or only marketplace models?
Microsoft Foundry runs both. The model marketplace includes over 1,900 models from providers including Anthropic, OpenAI, Meta, DeepSeek, Mistral, and xAI, spanning commercial frontier models and open-weight models like Llama that you can deploy as managed endpoints. Beyond the marketplace, Foundry also supports custom model endpoints, meaning you can deploy your own fine-tuned or self-hosted models and govern them under the same identity, policy, and cost-management layer as any marketplace model.
Is Microsoft Foundry suitable for small and mid-sized businesses, or is it only for large enterprises?
Foundry is designed for any organisation that needs governed, production-grade AI agent infrastructure, regardless of size. The consumption-based pricing means you pay only for what you use, and the prompt agent type removes the need for custom orchestration code, making it accessible to teams without deep AI engineering resources. That said, the full governance surface (Entra ID workload identities, Azure API Management policies, CI/CD evaluation gates) delivers the most value to organisations managing risk across multiple agents, which skews toward mid-market and enterprise.
How does Foundry’s CI/CD pipeline actually work for AI agents?
Foundry’s CI/CD pipeline applies software deployment discipline to AI agents: agents move through development, staging, and production environments with evaluation gates at each stage. Before promotion, automated evaluations run against quality, safety, and performance criteria using Foundry’s built-in evaluation framework. Canary deployments route a fraction of traffic to a new agent version, and rollback mechanisms restore the prior version if evaluation scores degrade. The Azure Developer CLI supports one-command deployments with autoscaling, managed identity, and promotion rules baked in.
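Canary routing and the rollback decision can be sketched as two small functions. This is one common way to implement the pattern, not a description of Foundry's internals: hashing the request ID is an assumed routing strategy, and `pick_version`, `should_rollback`, and the tolerance value are hypothetical.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Deterministically send roughly canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "canary" if bucket < canary_percent else "stable"


def should_rollback(
    stable_score: float, canary_score: float, tolerance: float = 0.02
) -> bool:
    """Roll back when the canary scores worse than stable by more than tolerance."""
    return canary_score < stable_score - tolerance
```

Hashing the request ID keeps a given request on the same version across retries, which makes traces easier to compare between versions.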
What types of tools can agents use through Foundry, and how are they governed?
Foundry agents can invoke any tool exposed through the Model Context Protocol (MCP), including Azure services, custom APIs, databases, and third-party SaaS connectors. The governance layer sits between the agent and its tools: Azure API Management enforces rate limiting, authentication, and request policies on every tool call, while Microsoft Entra ID ensures the agent authenticates as its own workload identity rather than impersonating a user. This means tool access is auditable, revocable, and subject to the same governance as any other enterprise API surface.
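The governance layer between an agent and its tools can be pictured as a gateway that checks identity, applies a rate limit, and logs every decision. The sketch below is hypothetical throughout: `ToolGateway`, the grant table, and the sliding-window limiter are illustrative and do not represent Azure API Management policy syntax or Entra ID calls. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class ToolGateway:
    """Hypothetical policy layer sitting between agents and tools."""

    def __init__(
        self, grants: dict[str, set[str]], max_calls: int, window_seconds: float
    ) -> None:
        self.grants = grants          # agent identity -> tools it may call
        self.max_calls = max_calls    # allowed calls per window, per agent
        self.window = window_seconds
        self.calls: dict[str, deque] = defaultdict(deque)
        self.audit: list[tuple[float, str, str, bool]] = []

    def authorize(self, agent_id: str, tool: str, now: float) -> bool:
        recent = self.calls[agent_id]
        # Drop timestamps that have aged out of the sliding window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        granted = tool in self.grants.get(agent_id, set())
        allowed = granted and len(recent) < self.max_calls
        if allowed:
            recent.append(now)
        # Every decision is logged, allowed or not, so access stays auditable.
        self.audit.append((now, agent_id, tool, allowed))
        return allowed
```

Revoking access is then a matter of removing a tool from an agent's entry in `grants`; the next call is denied and the denial appears in the audit log.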
Does Microsoft Foundry support multi-agent orchestration, or is it limited to single agents?
Foundry supports multi-agent orchestration natively. The Microsoft Agent Framework and Semantic Kernel, which underpin Foundry’s managed runtime, provide patterns for agent composition, delegation, and multi-agent workflows. You can build hosted agents that coordinate sub-agents for complex, multi-step business processes, and the Responses API provides a unified abstraction so that orchestrator agents call the same interface regardless of whether a sub-agent runs on GPT-5, Claude, or a fine-tuned custom model.
What security and compliance certifications does Microsoft Foundry hold?
Microsoft Foundry inherits the Azure compliance portfolio, which includes over 100 compliance offerings covering standards and regulations such as ISO 27001, SOC 1/2/3, HIPAA, FedRAMP, PCI DSS, and GDPR. Because Foundry operates within the Azure trust boundary, its data handling, encryption, and audit logging are covered by existing Azure compliance attestations. The key addition at the Foundry layer is agent-specific governance: content safety filters, prompt shields, and protected material detection apply uniformly across every model in the marketplace, providing a consistent safety surface regardless of which provider you choose.
Can I bring my own fine-tuned models into Foundry, or am I limited to marketplace offerings?
You can bring fine-tuned models into Foundry as custom model endpoints. The platform supports deploying fine-tuned versions of open-weight models like Llama and also provides a path for fine-tuning within Foundry using your enterprise data. Once deployed as a custom endpoint, your fine-tuned model operates under the same governance, identity, and cost-tracking layer as any marketplace model. This means the reliability-first architecture extends to models you control directly, not just those Microsoft or its partners provide.
How does Foundry handle data residency and sovereignty requirements?
Foundry deploys within your chosen Azure region, and data processed by agents stays within that region’s boundary unless you configure cross-region failover. Knowledge grounding via Foundry IQ indexes enterprise data at rest in your Azure Blob Storage, SharePoint, or OneLake instances, meaning source data never leaves your tenant. For model inference, Foundry’s managed endpoints route requests to model providers from within your selected region where available, and you can restrict which models an agent may call based on regional deployment availability and your organisation’s data sovereignty policies.
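Restricting models by region is essentially a set intersection between where a model is deployable and where your policy allows processing. The sketch below assumes a hand-written availability table. `MODEL_REGIONS`, `allowed_models`, and the model names are hypothetical, and real availability would come from the platform, not a hard-coded dictionary.

```python
# Hypothetical availability table: model name -> regions where it can be deployed.
MODEL_REGIONS = {
    "model-a": {"westeurope", "eastus"},
    "model-b": {"eastus"},
    "model-c": {"westeurope"},
}


def allowed_models(
    agent_region: str,
    permitted_regions: set[str],
    model_regions: dict[str, set[str]],
) -> list[str]:
    """List models an agent may call without leaving its permitted region."""
    if agent_region not in permitted_regions:
        return []  # the agent itself sits outside the sovereignty policy
    return sorted(
        model for model, regions in model_regions.items() if agent_region in regions
    )
```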