Jun 18, 2026

Anthropic’s Two-Tier AI Strategy and the Governance Questions That Follow

AUTHOR

James A. Wondrasek

On 9 June 2026, Anthropic launched two versions of what it described as the same underlying model: Claude Fable 5, a classifier-gated public release available to anyone, and Claude Mythos 5, an unrestricted version accessible only through a vetted-access program called Project Glasswing. Within seventy-two hours, the US government had suspended access to both models under export control authority. In the weeks before the launch, the European Union’s cybersecurity agency had negotiated bilaterally with a private company for the right to use a capability it deemed essential to its mission.

What unfolded across those weeks is not merely a product launch story. It is a live experiment in how frontier AI capability gets distributed, regulated, and contested — and the outcome remains unresolved. Anthropic built the two-tier strategy as a safety innovation. Critics see an access-control mechanism that creates a capability haves/have-nots dynamic. Governments, caught between dependence on private companies and national security imperatives, are improvising their responses event by event.

This series unpacks the full scope of that experiment across three dimensions: how the architecture actually works, how geopolitics reshaped it in real time, and whether the safety-first positioning that underpins it withstands scrutiny.

In This Series

How Anthropic’s Two-Tier AI Access Strategy Works — The technical foundation: what the split is, how the classifiers function, and why Anthropic chose this architecture over releasing a single model.

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access — The narrative arc: how a corporate access program became a geopolitical flashpoint, from the EU’s access dispute to the US export control shutdown.

The Trust and Governance Questions Behind Anthropic’s Safety Brand — The analytical capstone: whether the safety-first positioning holds up against dual-use risks, internal practices, commercial pressures, and industry comparison.

Start here: If you are new to this topic, begin with How Anthropic’s Two-Tier AI Access Strategy Works. It lays the technical groundwork — what the two-tier split actually is, how the classifiers function, and why Anthropic chose this architecture — that every other article in this series builds upon.

What is Anthropic’s two-tier AI access strategy?

Anthropic’s two-tier strategy releases the same underlying Mythos-class base model in two configurations with different access regimes. Fable 5 is the public tier: it runs active safety classifiers that monitor every request and silently redirect restricted queries to a less capable fallback model. Mythos 5 is the restricted tier: the same model with classifiers removed, available only through Project Glasswing’s vetted-access program to roughly 200 organisations across 15 countries. This is the first time a major AI lab has publicly offered a single model architecture at two tiered access levels.

The two-tier strategy represents a structural choice rather than a technical inevitability. Anthropic could have released only the restricted model — as it did with the earlier Mythos Preview — or only the classifier-gated version. By releasing both simultaneously, it created a framework where general availability and frontier capability coexist, but access to the latter is mediated by institutional trust verification rather than market pricing or technical skill. The full architecture and classifier mechanics are worth understanding in detail if you are evaluating whether your workloads will interact with the split.

At its core, the strategy responds to a capability gap: the jump from Anthropic’s prior Opus-class generation to Mythos-class was large enough — particularly in cybersecurity and biology/chemistry domains — that unrestricted general availability of the full model raised safety concerns the company judged unacceptable. The two-tier architecture lets Anthropic capture commercial value from broad deployment while maintaining control over the capabilities it considers most dangerous.

The strategy sits within Anthropic’s broader governance framework. Its Responsible Scaling Policy (RSP) and Frontier Compliance Framework (FCF) define capability thresholds — including the ASL-3 designation and CB-1/CB-2 biological risk levels — that determine when classifiers and restricted access become mandatory. The two-tier launch is the first public test of whether this self-governance framework can scale beyond a controlled preview to a commercial product release.

How Anthropic’s Two-Tier AI Access Strategy Works

How do Claude Fable 5 and Claude Mythos 5 actually differ?

The models share identical architecture, training data, and base weights — the difference is entirely in the access regime. Fable 5 runs three classifier domains (cybersecurity exploitation, biology/chemistry dual-use, and model distillation) that redirect restricted queries to Claude Opus 4.8, a prior-generation model. Mythos 5 runs no classifiers but is only accessible through Glasswing’s vetted-access program. On capability benchmarks, the two produce different results: Fable 5’s numbers include the classifier fallback effect, making direct comparison with other models misleading without understanding the split-tier architecture.
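The routing behaviour described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: the three domain labels come from the article, while `route_request`, `toy_classifier`, the `vetted` flag, and the model name strings are hypothetical stand-ins, and the keyword matching is a placeholder for classifiers whose internals are not public.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# The three classifier domains named in the article. All other names in this
# sketch are hypothetical and chosen only for illustration.
RESTRICTED_DOMAINS = (
    "cybersecurity_exploitation",
    "bio_chem_dual_use",
    "model_distillation",
)


@dataclass
class RoutingDecision:
    model: str                # label of the model that would serve the request
    triggered: Optional[str]  # classifier domain that fired, if any


def route_request(
    prompt: str,
    classify: Callable[[str], Optional[str]],
    vetted: bool,
) -> RoutingDecision:
    """Vetted traffic bypasses classifiers; public traffic falls back
    to the prior-generation model when a restricted domain is detected."""
    if vetted:
        return RoutingDecision(model="mythos-5", triggered=None)
    domain = classify(prompt)
    if domain in RESTRICTED_DOMAINS:
        return RoutingDecision(model="opus-4.8-fallback", triggered=domain)
    return RoutingDecision(model="fable-5", triggered=None)


def toy_classifier(prompt: str) -> Optional[str]:
    """Keyword stand-in for a real classifier (assumption, not the real logic)."""
    text = prompt.lower()
    if "exploit" in text:
        return "cybersecurity_exploitation"
    if "pathogen" in text:
        return "bio_chem_dual_use"
    return None
```

Under these assumptions, the same prompt yields a different serving model depending only on the access regime, which is the point the section makes: the split lives in the routing layer, not in the weights.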

The capability delta is concentrated in specific domains rather than being uniformly distributed. Mythos 5 demonstrates significant uplift in cybersecurity tasks — autonomous vulnerability discovery, exploit generation, and long-horizon penetration testing — compared to Opus 4.8. The trajectory from Mythos Preview through to Mythos 5 showed accelerating capability in these areas, which is precisely what drove the decision to gate rather than release. In general-purpose reasoning, coding, and analysis, the difference between the two models is far narrower.

The benchmarking problem is a genuine challenge for evaluators. When Fable 5’s classifiers trigger and fall back to Opus 4.8, the user may not be aware it happened — Anthropic reports this affects less than 5% of sessions, but independent verification of that claim is limited. This means benchmark comparisons between Fable 5 and models like GPT-5.5 or Gemini 3.1 Pro need an asterisk: Fable 5’s numbers reflect classifier-gated performance, not the model’s underlying capability.
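One way to see why the asterisk matters is to treat a published Fable 5 score as a weighted blend of full-model and fallback performance. The sketch below assumes a simple linear mix. The function name and any scores passed to it are hypothetical, and the fallback rate on a given benchmark may differ from the session-level figure Anthropic reports.

```python
def blended_score(full_score: float, fallback_score: float, fallback_rate: float) -> float:
    """Expected benchmark score when a fraction of items is answered by the
    fallback model instead of the full model (simple linear-mix assumption)."""
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError("fallback_rate must be between 0 and 1")
    return (1.0 - fallback_rate) * full_score + fallback_rate * fallback_score


def score_gap(full_score: float, fallback_score: float, fallback_rate: float) -> float:
    """How far the reported score sits below the full model's score."""
    return full_score - blended_score(full_score, fallback_score, fallback_rate)
```

With illustrative inputs such as `blended_score(80.0, 60.0, 0.05)`, the gap is small. On a benchmark concentrated in a classifier domain, where the fallback rate could be far higher, the same formula shows the reported number drifting toward the fallback model's score.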

For practical decision-making, the key consideration is whether your workloads fall within the classifier domains. If you work in software development, creative writing, or business analysis, you are unlikely to encounter the classifiers and Fable 5 will function as a direct Mythos-class experience. If you work in cybersecurity research, computational biology, or AI model training, you will hit the gating mechanism, and the Mythos 5 experience behind Glasswing represents a qualitatively different tool.

How Anthropic’s Two-Tier AI Access Strategy Works

What are the key risks that justify restricting access to Mythos-class models?

The risks fall into three domains: cybersecurity exploitation (autonomous vulnerability discovery and exploit generation at a scale that could overwhelm existing remediation pipelines), biology and chemistry dual-use (drug design acceleration and novel pathogen engineering that crosses capability thresholds Anthropic labels CB-1 and CB-2), and model distillation (adversarial actors — particularly PRC-affiliated labs — systematically extracting frontier model outputs to train competing systems). These are not hypothetical: Mythos Preview discovered more than 10,000 critical vulnerabilities during its controlled-access period.

The cybersecurity risk is the most immediately consequential. Mythos-class models can compress the vulnerability-discovery-to-exploit-development timeline in ways that existing security operations were not designed to handle. When discovery moves at AI speed but patching remains at human speed — the remediation bottleneck identified by US bank regulators in May 2026 — the asymmetry creates systemic risk. This is the risk that drove the initial Glasswing structure: get the capability to defenders first, build institutional safeguards, and only then consider broader distribution. The geopolitical story of how Glasswing played out shows just how quickly this asymmetry spilled into regulatory action.

Biology and chemistry dual-use presents a categorically different challenge. Unlike cybersecurity exploits — where a capability can be blocked, detected, or patched — the bio/chem risk involves scientific knowledge that enables discovery as well as misuse. A model that can accelerate drug design can also accelerate pathogen engineering. The same capability that generates novel scientific hypotheses for legitimate research can generate hypotheses for weapons development. This dual nature means classifier-gating cannot simply “block” the capability the way it blocks a distillation attempt — it must make a judgment about intent, which is a fundamentally harder problem. The governance analysis of Anthropic’s safety brand examines whether classifier-gating can credibly address this type of risk.

The distillation risk is primarily geopolitical. Anthropic’s framing positions PRC-affiliated labs as systematically harvesting frontier model outputs through coordinated account fraud and API abuse. The distillation classifier in Fable 5 is designed to detect and block these patterns, making the two-tier strategy simultaneously a safety mechanism and an instrument of US-China AI competition.

The Trust and Governance Questions Behind Anthropic’s Safety Brand — Section 1 examines bio/chem dual-use risks in depth

How Anthropic’s Two-Tier AI Access Strategy Works — Section 3 covers classifier architecture across all three domains

What is Project Glasswing and who gets access to it?

Project Glasswing is Anthropic’s vetted-access program for Mythos-class models, launched on 7 April 2026 with an initial cohort of cybersecurity partners including AWS, Apple, Microsoft, CrowdStrike, Google, NVIDIA, and Palo Alto Networks. It now encompasses roughly 200 organisations across more than 15 countries. Access requires formal application, institutional vetting, and compliance with operational requirements including a 30-day data retention policy — a notable shift from Anthropic’s prior zero-retention stance. Glasswing is not a research collaboration; it is an access control mechanism with explicit membership criteria.

Glasswing’s structure is shaped by Anthropic’s Responsible Scaling Policy and Frontier Compliance Framework, which provide the formal rationale for why Mythos-class access requires institutional vetting rather than being available through standard API signup. The program operates within what Anthropic calls a “collective security model” — the idea that concentrated defensive capability among vetted institutions creates systemic protection that benefits the broader ecosystem, even though the capability itself remains restricted. For the full architecture that Glasswing was built to enforce, see how the two-tier access strategy works.

The 30-day data retention requirement represents a significant operational shift. Under prior models, Anthropic maintained a zero-retention policy for API traffic, which simplified compliance for enterprise customers. The transition to 30-day retention for Mythos-class traffic was justified as necessary for auditing and compliance within the higher-risk tier, but it creates new burdens for participating organisations — particularly those in regulated industries with conflicting data handling requirements. Security and compliance teams that gain Glasswing access will need to reconcile these requirements with their existing data governance frameworks.

The program also raises distributional questions. Two hundred organisations across 15 countries is simultaneously an achievement in scaled trust and a concentration of frontier capability that excludes most of the world’s institutions. When Glasswing expanded from its initial US/UK footprint to include the EU — through the ENISA negotiations — it established a precedent where sovereign entities must negotiate bilaterally with a private company for access. What happens when institutions in APAC, Africa, or South America seek the same access remains an open question. The full geopolitical narrative of Glasswing traces these events from launch through to the export control shutdown.

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access

How did the EU and ENISA negotiate access to Project Glasswing?

The European Union Agency for Cybersecurity (ENISA) was initially excluded from Glasswing entirely. Over several weeks in May–June 2026, ENISA negotiated directly with Anthropic for access, framing the capability as essential to its cybersecurity mission across EU member states. Access was eventually granted on approximately 1 June 2026, but the negotiation established an uncomfortable precedent: a supranational regulatory body had to bargain bilaterally with a private company for capabilities it considered mission-critical, without any treaty framework, regulatory mandate, or legal mechanism to compel access.

The ENISA dispute is the most significant case study in sovereign-company frontier-model access negotiation to date. It exposed a governance vacuum: no existing international framework addresses how states or supranational bodies should access frontier AI capabilities controlled by private companies in other jurisdictions. The EU AI Act, for all its regulatory ambition, did not anticipate a scenario where a US company’s voluntary safety program would determine whether European cybersecurity agencies could access the most capable vulnerability-discovery tools available. This is one of the central episodes in the larger geopolitical fight over Mythos 5 access.

The negotiation also revealed asymmetries. ENISA approached Anthropic as a supplicant rather than a regulator — it could request access but could not demand it. Anthropic, for its part, faced competing pressures: granting ENISA access demonstrated goodwill and defused accusations of US-centric gatekeeping, but expanding Glasswing beyond its original scope raised questions about whether the vetting process could scale without diluting its security function. The resolution — access granted, conditions still under negotiation — left the underlying power dynamic unchanged.

The precedent matters beyond Europe. If ENISA had to negotiate, so will every other international body that determines it needs Mythos-class capability for its mission. The absence of a multilateral framework means these negotiations will remain ad hoc, bilateral, and shaped by the relative leverage of each participant. For countries and agencies without ENISA’s institutional weight, the path to access is even less certain.

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access

What happened with the US export control shutdown of Fable 5 and Mythos 5 on 12 June 2026?

Three days after the Fable 5 and Mythos 5 launch, the US government issued an export control directive that temporarily suspended all access to both models, blocking foreign users and requiring Anthropic to halt distribution while compliance verification proceeded. The directive reframed the two-tier strategy from a voluntary corporate safety practice into a government-enforced access restriction. It demonstrated that the US government has export control authority over AI model access and will use it, even when the company has already built its own access-control infrastructure.

The directive did not distinguish between Fable 5 and Mythos 5 — the government blocked both, treating the classifier-gated public version and the restricted version as equivalently subject to export control jurisdiction. This raises a structural question: if export controls can block the classifier-gated version that Anthropic already deemed safe for general release, what role is left for the company’s own safety determinations?

The directive also completed a pattern. Across the preceding weeks, US bank regulators had paused compliance examinations citing Mythos-class capability concerns, ENISA had negotiated access outside any formal framework, and the Pentagon had already clashed with Anthropic over military use restrictions earlier in 2026. The export control directive was not an isolated event — it was the moment the accumulated governance improvisations crystallised into an explicit assertion of state authority over model access. The unresolved governance questions this leaves behind are substantial.

For organisations evaluating their AI procurement strategy, the directive introduces sovereignty risk as a new variable. Access to a frontier model is no longer solely a question of the vendor’s terms of service, pricing, and safety posture — it is now subject to government intervention that can suspend access without notice. This reality reshapes enterprise risk assessments and procurement decisions in ways the industry has not yet fully absorbed.

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access

Why did US bank regulators pause cyberattack compliance examinations in May 2026?

On 19 May 2026, the Federal Reserve, US Treasury, and Office of the Comptroller of the Currency jointly paused cyberattack compliance examinations — the earliest concrete evidence that Mythos-class cybersecurity capabilities had broken existing regulatory threat models. The pause reflected a structural asymmetry: AI-powered vulnerability discovery could compress the attacker’s timeline from weeks to hours, but the remediation pipeline — vulnerability triage, disclosure, patching, and deployment — remained human-speed. Regulators concluded they could not meaningfully assess whether banks were prepared for threats their own frameworks did not yet model.

The bank regulator pause is significant precisely because it predates the better-known events — the ENISA dispute, the export control directive. It was the canary in the coal mine, signalling that Mythos-class capabilities were not a future governance problem but a present operational one. The regulators did not wait for a formal policy framework or an international consensus; they acted on the direct evidence that Mythos Preview’s vulnerability discovery capability — used defensively within Glasswing — had implications for the attacker side that existing compliance examinations could not address.

The specific concern centred on what security researchers call agentic hacking: AI systems that can autonomously identify vulnerabilities, generate exploits, and adapt their approach based on results. When this capability exists in restricted-access programs, a bank’s defensive posture can be assessed against known threat models. When it becomes available more broadly — whether through legitimate access expansion or through adversarial model development — the same capability shifts to the attacker side, and the defender’s timeline compresses without a corresponding acceleration in remediation capacity. The Glasswing story traces how this dynamic unfolded in practice from the bank pause through to the export shutdown.

This dynamic creates a new category of enterprise risk assessment. Organisations must now evaluate not just their current cybersecurity posture but their capacity to absorb a step-change in the speed and sophistication of attacks against them. The remediation bottleneck — the gap between AI-speed discovery and human-speed patching — will define cybersecurity governance for as long as that asymmetry persists.
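The remediation bottleneck can be made concrete with a toy backlog model: if vulnerabilities are discovered faster than they are patched, the unpatched backlog grows linearly. The sketch below is a deliberately simplified assumption, not a model used by any regulator. The function name and the rates passed to it are hypothetical.

```python
from typing import List


def backlog_over_time(
    discovered_per_day: float,
    patched_per_day: float,
    days: int,
    start_backlog: float = 0.0,
) -> List[float]:
    """Track the unpatched-vulnerability backlog day by day.

    Assumes constant discovery and patch rates (a simplification); the
    backlog cannot drop below zero.
    """
    backlog = start_backlog
    history: List[float] = []
    for _ in range(days):
        backlog = max(0.0, backlog + discovered_per_day - patched_per_day)
        history.append(backlog)
    return history
```

For example, `backlog_over_time(50, 10, 3)` grows by 40 each day, while `backlog_over_time(5, 10, 3)` stays at zero. The board-level question is which side of that inequality an organisation sits on once discovery speeds up.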

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access

What dual-use risks do Mythos-class models pose in biology and chemistry?

Mythos-class models demonstrate capability uplift across drug design acceleration, AAV engineering, autonomous genomics research, and novel scientific hypothesis generation — the same competencies that enable therapeutic breakthroughs also enable pathogen engineering and weapons development. Anthropic’s CB-1 and CB-2 thresholds classify the risk: CB-1 covers assistance with non-novel weapons, CB-2 covers novel weapons. Mythos 5 is assessed at CB-1 but approaching CB-2. Unlike cybersecurity dual-use — where a capability can be blocked or patched — bio/chem dual-use involves scientific knowledge that cannot be “un-discovered,” making classifier-gating a necessarily imperfect control.

The bio/chem risk is the second major axis of the two-tier strategy, and it operates differently from the cybersecurity risk that dominates most coverage. Cybersecurity dual-use has a defender axis — Mythos-class vulnerability discovery can be used to patch systems as well as exploit them — which is why Glasswing exists. Bio/chem dual-use is messier. The same hypothesis-generation capability that might help a researcher at Dyno Therapeutics identify a novel gene therapy vector could help a malicious actor identify a novel pathogen delivery mechanism. The capability is the capability; only the intent differs. The full governance analysis examines whether Anthropic’s safety framework can credibly distinguish between the two.

This is why the biology and chemistry classifier in Fable 5 is simultaneously the most important and the most problematic of the three classifier domains. It must make intent judgments — is this query legitimate research or weapons development? — and it will inevitably produce both false positives (blocking legitimate research) and false negatives (permitting dangerous queries). For regulated organisations in healthcare and life sciences, the classifier creates a practical tension: Fable 5 provides Mythos-class reasoning capability across most domains, but the very domains where that capability is most valuable are also the domains where the classifier is most likely to intervene.

The regulatory dimension is equally unsettled. The EU AI Act, pharmaceutical research regulations, and biosafety frameworks were designed before frontier AI models could autonomously generate novel biological hypotheses. None of them address whether a classifier-gated model satisfies safety requirements, or whether restricting bio/chem capability in the public tier while making it available in the restricted tier creates an uneven playing field for research organisations.

The Trust and Governance Questions Behind Anthropic’s Safety Brand

Can Anthropic’s safety-first brand survive an IPO at a reported $65 billion valuation?

Anthropic filed a confidential S-1 in early 2026, positioning its safety-first governance as a competitive moat for public markets. The structural tension is immediate: safety positioning justifies restricting access to the most capable models, which limits revenue from the highest-value capabilities, while public-market expectations demand revenue growth. At a reported $65 billion valuation, every safety decision — expanding Glasswing access, relaxing classifier thresholds, pricing Mythos access — will be scrutinised for whether it serves safety or serves shareholder value. The two incentives are not necessarily incompatible, but they are undeniably in tension.

The IPO introduces a stakeholder that Anthropic’s governance framework was not designed to accommodate: public-market investors with fiduciary return expectations. When Anthropic was a private company with mission-aligned investors, the RSP framework and the two-tier strategy could be justified on their safety merits without reference to commercial impact. Post-IPO, the same decisions will face quarterly earnings scrutiny, and the safety brand itself becomes a commercial asset — something that generates market value. At that point, the question shifts from “is Anthropic’s safety positioning genuine?” to “can a safety positioning that is also a commercial asset remain genuinely binding when it conflicts with commercial interest?” The governance analysis explores this tension in depth.

This tension has specific pressure points. Fable 5’s classifiers currently restrict capabilities in domains where Anthropic could charge premium access fees — cybersecurity tools, biology research acceleration, model training infrastructure. As revenue growth becomes a market imperative, the incentive to relax classifier thresholds or expand Glasswing access for commercial rather than safety reasons will intensify. Similarly, the 30-day data retention policy, the vetting process, and the access criteria for Glasswing — all currently framed as safety requirements — will face pressure to accommodate commercial partnerships that blur the line between safety governance and enterprise sales.

None of this means the safety brand is necessarily compromised. Companies in regulated industries routinely manage the tension between compliance costs and commercial growth. The question is whether Anthropic’s governance structure — its board composition, its RSP commitments, its constitutional approach to model training — provides sufficient institutional resistance to commercial pressure. For readers evaluating whether to build on Anthropic’s platform, the answer to that question should weigh as heavily as benchmark performance.

The Trust and Governance Questions Behind Anthropic’s Safety Brand

How does Anthropic’s two-tier strategy compare to OpenAI and Google DeepMind’s approach to frontier model deployment?

Anthropic operates the most formally structured access-control system in the industry: explicit capability tiers, classifier-gated public access, and an institutional vetting process for unrestricted capability. OpenAI deploys GPT-5.5 through a more permissive but less formalised safety framework: capabilities that would fall under Anthropic’s classifier domains are available with usage policy restrictions rather than technical blocking. Google DeepMind’s Gemini 3.1 Pro sits between them: institutional partnerships govern sensitive access, but without the explicit two-tier architecture. Which approach produces better governance outcomes is unresolved; formal structure does not guarantee better decisions.

The competitive comparison matters for two reasons. First, it contextualises whether Anthropic’s two-tier strategy is a genuine safety innovation or a strategic differentiation move. If OpenAI and Google DeepMind achieve comparable safety outcomes without the same degree of formal access restriction, the case that the two-tier model is necessary rather than chosen becomes harder to sustain. If they do not — if their more permissive frameworks produce safety incidents that Anthropic’s approach avoids — the two-tier model looks like governance leadership rather than market positioning. The comparative governance analysis weighs the evidence on both sides.

Second, the comparison frames the practical decision facing enterprise buyers. On benchmark performance, the three providers trade leadership across different domains — Fable 5 excels in reasoning and coding, GPT-5.5 in creative generation and multimodal tasks, Gemini 3.1 Pro in integration with Google’s enterprise ecosystem. But the governance dimension introduces a different axis of evaluation. Organisations in regulated industries may prefer Anthropic’s explicit safety architecture even if it means navigating classifier-triggered fallback. Organisations prioritising capability access may find the classifier friction unacceptable and opt for a provider whose restrictions are policy-based rather than technically enforced.

The industry is converging on a recognition that frontier model deployment requires governance, but diverging on what form that governance should take. Anthropic’s two-tier model is the most architecturally committed answer. Whether it becomes the industry template or a transitional phase depends on whether the governance outcomes — fewer safety incidents, more institutional trust, more predictable regulatory compliance — materialise at a scale that justifies the access restrictions.

The Trust and Governance Questions Behind Anthropic’s Safety Brand

Resource Hub: Deep Dives into Anthropic’s Two-Tier Strategy

Understanding the Architecture

How Anthropic’s Two-Tier AI Access Strategy Works — The technical foundation of the entire topic. Explains what the two-tier split actually is, how the three classifier domains function in practice, the Opus 4.8 fallback mechanism, and why Anthropic chose this architecture over releasing a single model. If you read one cluster article, start here.

The Geopolitics of Access

Inside Project Glasswing and the Geopolitical Fight Over Mythos 5 Access — Tells the story of how a corporate access program became a geopolitical flashpoint. Covers Glasswing’s structure, the EU/ENISA access dispute, the US bank regulator pause, and the 12 June 2026 export control shutdown. Essential for understanding how sovereign states are already treating model access as a national security lever.

Governance, Trust, and the Road Ahead

The Trust and Governance Questions Behind Anthropic’s Safety Brand — The analytical capstone. Examines whether Anthropic’s safety-first positioning withstands scrutiny across bio/chem dual-use risks, the covert Fable 5 degradation policy for rival researchers, the IPO commercialisation tension, corporate governance implications for boards, and competitive comparison with OpenAI and Google DeepMind.

Suggested reading order: Start with the architecture explainer to understand what was built. Move to the Glasswing story to see how it played out in real geopolitical events. Finish with the governance analysis to assess what it all means.

Frequently asked questions

What are Mythos-class models and how do they differ from Opus-class models?

Mythos-class is Anthropic’s designation for a capability tier above the prior Opus-class generation. The jump involves significant uplift in cybersecurity (autonomous vulnerability discovery and exploit generation), biology and chemistry (drug design acceleration, hypothesis generation), and autonomous long-horizon task execution. It was this capability gap — particularly in cybersecurity — that Anthropic determined made unrestricted general availability unsafe, driving the decision to create the two-tier structure. See How Anthropic’s Two-Tier AI Access Strategy Works for the full architecture.

What was Anthropic’s covert policy to weaken Fable 5 for rival frontier AI researchers?

It emerged in 2026 that Anthropic had implemented a policy that degraded Fable 5’s response quality for users identified as rival frontier AI researchers. The policy was reversed within 24 hours of discovery, but the incident shifted the governance conversation from “do the safety systems work?” to “can the operator be trusted to use them fairly?” It remains the most significant trust-erosion event for Anthropic’s safety brand. See The Trust and Governance Questions Behind Anthropic’s Safety Brand for full analysis.

Is Anthropic’s two-tier strategy actually about safety, or about controlling competition?

The evidence cuts both ways. The cybersecurity uplift from Opus-class to Mythos-class is genuine and well-documented — Mythos Preview discovered more than 10,000 critical vulnerabilities, and the remediation bottleneck it creates is a real systemic risk. But the two-tier structure also functions as a market differentiator: it positions Anthropic as the safety-first provider in a competitive landscape where enterprise buyers increasingly weigh governance alongside capability. The covert degradation policy for rival researchers further complicates the safety narrative. The honest answer is that both motivations likely coexist — see the governance analysis for a full treatment.

How do Fable 5’s safety classifiers compare to IBM’s Granite Guardian and other external guardrails?

Fable 5’s classifiers are internal to Anthropic’s infrastructure and operate at the model inference layer: queries are evaluated in real time and redirected before the model generates a response. IBM’s Granite Guardian and similar external guardrail systems operate as middleware that can be applied to any model provider’s output. The internal approach gives Anthropic more granular control and lower latency, but it also means the classifier’s behaviour is opaque to outside evaluators and cannot be independently audited. The trade-off is integration quality versus transparency.
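The structural difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration only: the keyword classifier, the stand-in model, and every function name here are hypothetical and do not reflect Anthropic's or IBM's actual implementations or APIs.

```python
from typing import Callable

# Hypothetical stand-ins: none of these names come from Anthropic or IBM.
BLOCKED_TERMS = ("exploit",)


def toy_classifier(text: str) -> bool:
    """Return True when the text should be blocked (toy keyword check)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def toy_model(prompt: str) -> str:
    """Stand-in for a model call; simply echoes the prompt."""
    return f"model answer to: {prompt}"


def internal_gate(prompt: str) -> str:
    """Provider-side pattern: the query is screened before generation."""
    if toy_classifier(prompt):
        return "redirected"
    return toy_model(prompt)


def external_guardrail(model: Callable[[str], str], prompt: str) -> str:
    """Middleware pattern: wraps any provider's model and screens its output."""
    output = model(prompt)
    if toy_classifier(output):
        return "blocked by middleware"
    return output
```

In the first pattern the screening logic lives inside the provider's own call path, so an outside party sees only the final result. In the second, the wrapper is code the deploying organisation runs itself, which is why it can be inspected and reused across providers.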

What should boards of directors ask about AI-enabled vulnerability discovery?

Boards should ask three questions. What is the organisation’s current vulnerability discovery and remediation timeline? How would that timeline change if an attacker could discover vulnerabilities in hours rather than weeks? Do the organisation’s cybersecurity budget and staffing reflect the remediation bottleneck that Mythos-class capabilities create? These are not hypothetical: the US bank regulator pause in May 2026 was driven by exactly this assessment gap. See the governance analysis for a more detailed treatment of board-level questions.
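A rough way to put numbers on the second question is to compare how long a flaw stays open with how quickly an attacker could find it. The sketch below is a deliberately simplified model with illustrative figures: it assumes both clocks start when the flaw becomes discoverable, and the function name and inputs are hypothetical rather than drawn from any Anthropic or regulator guidance.

```python
def exposure_window_days(attacker_discovery_days: float,
                         remediation_days: float) -> float:
    """Days a flaw remains exploitable after an attacker could find it.

    Simplifying assumption: both durations are measured from the moment
    the flaw becomes discoverable. A result of 0 means the fix lands first.
    """
    return max(0.0, remediation_days - attacker_discovery_days)


# Illustrative figures only: a 30-day remediation cycle in both scenarios.
weeks_scenario = exposure_window_days(attacker_discovery_days=21,
                                      remediation_days=30)
hours_scenario = exposure_window_days(attacker_discovery_days=0.25,
                                      remediation_days=30)
print(weeks_scenario, hours_scenario)
```

With these assumed inputs, cutting attacker discovery from three weeks to six hours leaves the remediation cycle unchanged but widens the exposure window from 9 days to 29.75 days, which is the bottleneck the questions above are meant to surface.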

How does the EU AI Act apply to classifier-gated model access like Fable 5?

The EU AI Act was drafted before the two-tier model became a commercial reality, and it does not directly address whether classifier-gated access satisfies the Act’s transparency and risk-management requirements for high-risk AI systems. The ENISA dispute with Anthropic exposed this gap: a European agency needed a capability to which the Act gave it no guaranteed access, from a company whose compliance obligations under the Act are not clearly defined for this deployment model. This remains an active area of regulatory interpretation.

What is the 30-day data retention requirement for Mythos-class traffic, and why does it matter?

Anthropic shifted from zero-retention to 30-day data retention for Mythos-class model traffic, citing the need for auditing and compliance monitoring at the higher capability tier. This creates a tension for organisations in regulated industries — particularly finance and healthcare — where data handling requirements may conflict with, or impose additional obligations on top of, Anthropic’s retention policy. Compliance teams gaining Glasswing access need to reconcile these requirements, and the operational burden of doing so is a real cost of the vetted-access model.
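One way a compliance team might start that reconciliation is to list which internal data classes have third-party retention limits shorter than the provider's 30-day window. The sketch below is a minimal illustration: the 30-day figure comes from the policy described above, while the data class names, the limits, and the helper function are hypothetical.

```python
from dataclasses import dataclass

# The 30-day figure is from the retention policy described in the article.
PROVIDER_RETENTION_DAYS = 30


@dataclass(frozen=True)
class DataClassRule:
    """Hypothetical internal rule for one category of data."""
    name: str
    max_third_party_retention_days: int


def retention_conflicts(rules: list[DataClassRule]) -> list[str]:
    """Names of data classes whose limit is shorter than the provider's window."""
    return [
        rule.name
        for rule in rules
        if rule.max_third_party_retention_days < PROVIDER_RETENTION_DAYS
    ]


# Illustrative rules only; real limits depend on the applicable regulation.
example_rules = [
    DataClassRule("patient_records", 0),
    DataClassRule("trade_surveillance_logs", 90),
    DataClassRule("public_marketing_copy", 365),
]
print(retention_conflicts(example_rules))
```

Any data class the check returns would need either to be kept out of Mythos-class traffic or to be covered by an additional agreement, which is where the operational burden tends to show up.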

Could the two-tier model become the industry standard for frontier AI deployment?

It is possible but not inevitable. The model has clear advantages: it lets a lab release frontier capability broadly while maintaining control over the most dangerous applications. But it also introduces governance burdens (vetting infrastructure, classifier maintenance, false-positive management) and political exposure (sovereign access demands, export control jurisdiction). Whether it spreads depends on whether the governance outcomes of Anthropic’s first-mover implementation are positive enough to outweigh those costs. OpenAI and Google DeepMind are watching closely but have not adopted equivalent architectures. The full governance and competitive comparison examines this question in detail.

