How AI-Generated Contributions Are Reshaping Open-Source Supply Chain Risk

AI coding assistants have collapsed the cost of generating a pull request to near-zero. The cost of reviewing that pull request has not changed at all. That asymmetry — cheap generation, expensive review — is the root cause of a structural shift in open-source supply chain risk that your existing dependency management process was not built to handle.

The underlying problem is economic. The projects your software depends on are receiving more submissions than their maintainers can review, and a maintainer who burns out does not hand the project to a successor. They quietly disengage, and what was a healthy dependency becomes a zombie component sitting in your stack.

This guide maps six dimensions of that risk and directs you to the deep-dive article for each.

Cluster overview

| Risk dimension | Article |
| --- | --- |
| The economic mechanism | Why AI Pull Requests Cost More Than They Contribute to Open-Source Projects |
| The documented incident record | Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real |
| Governance responses | Three Open-Source Governance Orientations for Managing AI-Generated Contribution Volume |
| Platform and ecosystem tooling | What GitHub and the OSS Ecosystem Are Building to Protect Maintainers from AI Slop |
| Dependency risk management | Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process |
| Contributing back as risk mitigation | The Business Case for Contributing Back to the Open-Source Projects You Depend On |

What is the AI-generated open source supply chain risk problem?

Open-source supply chain risk has always existed — your software depends on upstream libraries you do not control, and vulnerabilities, licence conflicts, or maintainer abandonment propagate into your product. What is new is that AI coding assistants have made it trivially cheap to generate plausible-looking pull requests, bug reports, and issue comments without a corresponding increase in maintainer capacity to review them. The Black Duck 2026 OSSRA report found open-source vulnerabilities more than doubled year-over-year, with 93% of audited codebases containing zombie components. This affects you regardless of whether your team generates any AI contributions.

For the economic mechanism, see Why AI Pull Requests Cost More Than They Contribute to Open-Source Projects.

Why does volume alone create a security problem, not just a quality annoyance?

The risk is not that AI-generated pull requests are bad on average. It is that the maintainers who would catch the bad ones are finite and increasingly overwhelmed. AI-generated submissions add submissions, not reviewers — each one is a net drain on maintainer time even if it is technically valid. When review capacity is saturated, real vulnerabilities and legitimate contributions get triaged out alongside the slop.

The practitioner term for the worst of these is “AI slop” — superficially plausible but incorrect, hallucinated, or misattributed code. It looks genuine from a distance, so it demands full review effort before its quality can be assessed.

A CodeRabbit study found AI-generated PRs contained 1.7 times more issues than human-written ones, with security vulnerabilities 1.5 to 2 times more frequent. Google’s 2025 DORA Report found that a 90% increase in AI adoption correlated with 91% longer code review times. A maintainer who disengages leaves behind a zombie component — one your SCA tooling will flag eventually, but which was an active project twelve months earlier.

For the incidents that made this visible, see Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real.

What has actually happened? The incident record so far

The evidence is documented, named, and dateable. In January 2026, curl’s maintainer Daniel Stenberg shut down the project’s bug bounty programme because AI-generated reports had reduced the genuine vulnerability rate from above 15% to below 5%. tldraw’s Steve Ruiz began automatically closing external pull requests after AI-driven PR volume more than doubled in a single quarter. Ghostty pivoted to an invite-only contribution model, the Node.js TSC held a formal governance discussion after a 19,000-line PR generated by Claude Code triggered a petition from over 80 developers, and Django formalised a written AI contribution disclosure policy.

These projects span different scales and ecosystems — curl is widely used global infrastructure, tldraw is a focused developer tool, Node.js is a platform runtime. The spread confirms the problem is systemic. For contrast, OpenSSL’s AISLE programme used AI-assisted expert analysis to find genuine zero-days — the productive use case is expert-led review augmented by AI, not AI-generated volume submission.

For full incident analysis, see Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real.

How are open source projects responding to AI contribution volume?

Academic research published in March 2026 (arXiv 2603.26487) analysed 67 projects and identified three governance orientations: Prohibitionist (AI contributions present structural, non-absorbable risk — tldraw, QEMU); Boundary-and-accountability (AI inputs may enter the workflow under explicit conditions of disclosure and verification — Ghostty’s Vouch system, Django’s disclosure requirements); and Quality-first (contributions are judged by quality standards regardless of how they were produced). Each orientation involves real tradeoffs around contributor pool size, maintainer load, and community culture.

None of these orientations is universally right. But the existence of a stated governance orientation is itself a signal — a project that has thought about AI contributions is better positioned to maintain review quality than one that has not. AGENTS.md files, the emerging convention for communicating AI contribution policy directly to coding agents, are part of this landscape.

For the governance framework in depth, see Three Open-Source Governance Orientations for Managing AI-Generated Contribution Volume.

What are GitHub and the broader ecosystem building to address this?

GitHub is developing platform-level controls: configurable PR permissions, a “disable pull requests” switch added in February 2026, improved AI attribution visibility, and more granular controls for who can create and review PRs. Mitchell Hashimoto’s Vouch provides forge-agnostic trust-gating, and the Open Source Pledge is formalising a mechanism for companies to make sustainability commitments. Dependabot and Renovate already established the precedent for platform-level bot management — the governance infrastructure for automated contributions exists and is being extended to cover AI-generated submissions.

None of these controls are fully deployed. The ecosystem response is lagging the problem by 12 to 18 months, which means supply chain risk remains elevated while platform controls catch up.

For platform responses in detail, see What GitHub and the OSS Ecosystem Are Building to Protect Maintainers from AI Slop.

How does this change what you should be checking in your dependency tree?

The bus factor (formalised by CHAOSS as the Contributor Absence Factor) is the minimum number of contributors whose departure would jeopardise a project. AI contributions that add code volume without adding capable maintainers do not improve that number; they may mask it, because a project can appear “active” based on PR volume while the actual maintenance rests with one overwhelmed person.

Add CHAOSS viability metrics — commit frequency, bus factor, issue response time, release cadence — to your dependency audit. SCA tooling catches known vulnerabilities but does not catch maintainer health decline. When a dependency deteriorates, your options are the Fork / Fund / Migrate framework: fork and self-maintain (average cost: $258,000 per release cycle), fund the upstream maintainer, or migrate to an alternative.
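The audit step above can be reduced to a small triage function. The following is a minimal sketch assuming you have already collected the metrics from CHAOSS tooling or the GitHub API; the field names and thresholds are illustrative, not an official CHAOSS schema.

```python
from dataclasses import dataclass

@dataclass
class DependencyHealth:
    # Illustrative fields and thresholds -- not an official CHAOSS schema.
    name: str
    commits_last_90d: int
    bus_factor: int                     # CHAOSS Contributor Absence Factor
    median_issue_response_days: float
    days_since_last_release: int

def audit(dep: DependencyHealth) -> list[str]:
    """Return human-readable risk flags for one dependency."""
    flags = []
    if dep.commits_last_90d == 0:
        flags.append("no recent commits")
    if dep.bus_factor <= 1:
        flags.append("bus factor of one")
    if dep.median_issue_response_days > 30:
        flags.append("slow issue response")
    if dep.days_since_last_release > 365:
        flags.append("stale releases")
    return flags
```

Any dependency that comes back with two or more flags is a candidate for the Fork / Fund / Migrate decision described above.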

For the operational framework, see Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process.

What does the EU Cyber Resilience Act mean for your use of open source?

The EU Cyber Resilience Act (fully effective by 2027) requires software products with digital elements to maintain Software Bills of Materials (SBOMs) in machine-readable format. AI-generated code complicates SBOM production because its training-data provenance is opaque — models trained on copyleft code may produce output that inadvertently reproduces those licence obligations. The 2026 OSSRA report found licence conflicts in 68% of audited codebases, the largest year-over-year increase in the report’s history.
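For teams that have never produced one, a machine-readable SBOM entry is just structured metadata per component. The snippet below is a schematic, heavily abridged CycloneDX-style example; the component name is hypothetical, and you should consult the CycloneDX specification for the full set of required fields before relying on this shape.

```python
import json

# Schematic, abridged CycloneDX-style SBOM. Field names follow the CycloneDX
# convention, but this is not a complete or validated document.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "library",
            "name": "example-lib",   # hypothetical dependency
            "version": "2.4.1",
            "licenses": [{"license": {"id": "MIT"}}],
        }
    ],
}

print(json.dumps(sbom, indent=2))
```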

Even outside European markets, US and EU SBOM mandates are converging. Building SBOM generation into your release process now costs less than retrofitting it later.

For SBOM and maintainer health integration, see Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process.

Is there a feedback loop that makes this worse over time?

Yes. AI coding models are trained on public repositories. As those repositories fill with AI-generated code — which tends to be stylistically plausible but mechanically weaker — training data quality degrades. Future models produce lower-quality output, which generates more low-quality contributions, which degrades repositories further. This “model collapse” feedback loop has no natural circuit breaker and compounds with zombie component acceleration and licence laundering.

The vibe coding dynamic accelerates this: contributors who generate code without understanding it cannot maintain, debug, or defend it — so even AI-generated contributions that pass review may introduce maintenance liability downstream.

The best point for establishing good governance practices — both internally with your team’s AI usage policy and externally with your contribution posture toward upstream projects — is now, before the feedback loop has fully closed.

For the mechanism behind the feedback loop, see Why AI Pull Requests Cost More Than They Contribute to Open-Source Projects.

Does contributing back to open source actually reduce your supply chain risk?

A February 2026 Linux Foundation report found that active open-source contribution delivers 2 to 5 times return on investment, with 66% of organisations reporting faster upstream responses to security issues. For most engineering teams, this means targeted contributions to the two or three dependencies most critical to your product — not an Open Source Programme Office. Funding via Tidelift or the Open Source Pledge is the lowest-friction option if code contributions are not practical. Teams that contribute back also gain better visibility into the governance health of the projects they depend on, which is itself a risk management advantage.

Passive consumption is not a neutral default. Organisations relying on internal workarounds spend an average of $670,000 annually. The supply chain risk from AI contribution volume is real and growing, but it is manageable with the right operating posture.

For the full business case, see The Business Case for Contributing Back to the Open-Source Projects You Depend On.

Resource Hub: Open-Source Supply Chain Risk Library

Understanding the Problem

| Article | What it covers |
| --- | --- |
| Why AI Pull Requests Cost More Than They Contribute to Open-Source Projects | The cost asymmetry mechanism — why AI contribution volume harms even well-intentioned projects and what the push-based contribution model vulnerability means for maintainer workload |
| Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real | Documented incident record: curl, Node.js, Ghostty, tldraw, and Django — what happened, what each project did, and what the pattern means |

Governance and Platform Responses

| Article | What it covers |
| --- | --- |
| Three Open-Source Governance Orientations for Managing AI-Generated Contribution Volume | The three governance orientations (Prohibitionist, Boundary-and-Accountability, Quality-First) with project examples and a decision framework for evaluating a dependency’s governance stance |
| What GitHub and the OSS Ecosystem Are Building to Protect Maintainers from AI Slop | Platform-level controls at GitHub, the Vouch trust-gating tool, the Open Source Pledge, and where ecosystem infrastructure is and is not keeping up |

Managing Your Exposure

| Article | What it covers |
| --- | --- |
| Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process | Applying CHAOSS viability metrics to your dependency tree, identifying zombie components, and using the Fork / Fund / Migrate framework |
| The Business Case for Contributing Back to the Open-Source Projects You Depend On | The 2 to 5 times ROI evidence from the Linux Foundation, scoping upstream contribution for a small engineering team, and when funding beats code contributions |

FAQ Section

What is “AI slop” in the open-source context?

AI slop is the practitioner term for low-quality AI-generated contributions — pull requests, bug reports, and issue comments that are superficially plausible but incorrect or hallucinated. The defining characteristic is that slop looks genuine from a distance, demanding full review time before its quality can be assessed. In 2025, AI-generated reports grew to approximately 20% of curl’s bug bounty submissions, and by July 2025 only 5% of all submissions were genuine vulnerabilities. For the full documented incident record, see Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real.

Is this problem only relevant if my team contributes to open-source projects?

No. The risk lands in your dependency tree regardless of whether your team generates AI contributions. If AI volume strains the maintainers of your critical dependencies, the consequence is slower security patches, accumulated zombie components, and higher bus factor fragility — all of which affect your stack even if your team never submits a PR.

What is a zombie component and how do I identify one?

A zombie component is an open-source library with no development activity in the past two years — no commits, no issue responses, no releases. Start with your SCA tooling (Dependabot, Snyk, or Black Duck will flag components with no recent releases), then cross-reference the project’s GitHub activity and CHAOSS viability metrics for your highest-criticality dependencies. The maintainer health supply chain risk process covers how to build this check into a repeatable quarterly audit.
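The two-year rule is mechanical enough to script. Below is a minimal sketch, assuming you have already fetched the three activity timestamps from the GitHub API or your SCA tool.

```python
from datetime import datetime, timedelta, timezone

def is_zombie(last_commit: datetime,
              last_release: datetime,
              last_issue_response: datetime,
              window: timedelta = timedelta(days=730)) -> bool:
    """True if the most recent of the three activity signals is older
    than the window (default: roughly two years)."""
    cutoff = datetime.now(timezone.utc) - window
    return max(last_commit, last_release, last_issue_response) < cutoff
```

A single recent signal (say, one issue reply) is enough to clear the check, which is why this belongs alongside, not instead of, the fuller CHAOSS metrics.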

What is the bus factor and why does it matter for AI contribution risk?

The bus factor is the minimum number of contributors whose departure would jeopardise a project. CHAOSS formalises this as the Contributor Absence Factor. To check it for a dependency, look at the project’s contributor graph on GitHub: how many people have committed in the last 90 days, and how concentrated is the commit activity? A project where one person accounts for 80% or more of recent commits has a bus factor of one regardless of how many open PRs it has. The GitHub maintainer protection controls being rolled out in 2026 can help single-maintainer projects defend their review capacity without closing to contributions entirely.
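The concentration check described above can be computed directly from a list of recent commit authors, pulled from `git log` or the GitHub API. This sketch returns the busiest contributor's share of recent commits:

```python
from collections import Counter

def top_contributor_share(commit_authors: list[str]) -> float:
    """Fraction of recent commits made by the single busiest contributor.
    A value of 0.8 or higher suggests an effective bus factor of one."""
    if not commit_authors:
        return 0.0
    counts = Counter(commit_authors)
    return counts.most_common(1)[0][1] / len(commit_authors)
```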

What does my team need to do differently when using AI tools to contribute to open source?

Require that anyone submitting to an external project using AI-generated code follows the project’s stated AI contribution policy — check for an AGENTS.md file or a CONTRIBUTING.md section on AI use. Treat AI as a drafting tool, not an authoring tool: the person submitting the PR should be able to explain, defend, and maintain what they are submitting. The three governance orientations framework provides a practical decision tree for reading what a project’s policy signals about its contribution health.

How does the EU Cyber Resilience Act affect a company that is not selling into European markets?

The CRA has spillover effects. US and EU SBOM mandates are converging, making SBOM hygiene a baseline procurement requirement regardless of geography. Building SBOM generation into your release process now costs less than retrofitting it when a customer or regulator asks. See Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process for how to integrate SBOM generation and licence conflict scanning into a unified dependency health review.

The Business Case for Contributing Back to the Open-Source Projects You Depend On

Most CTOs treat open-source dependencies the way they treat electricity — reliable infrastructure that someone else worries about. Until one day it doesn’t work.

When the solo maintainer of Kubernetes External Secrets Operator announced the project was shutting down, every company using it faced a six-month window with no security patches. Log4Shell and the XZ Utils backdoor both trace back to maintainers at the edge of capacity: overloaded, undersupported, unable to properly screen incoming changes.

So this article is not about open-source ethos. It’s about risk management. You’ll get a clear fork/fund/migrate decision framework, financial options that work at SMB scale, and a concrete first step you can take this week.

For the full supply-chain risk picture, start with our pillar page on how AI-generated contributions are reshaping open-source supply chain risk.

Why is contributing back a supply-chain risk strategy, not a charity decision?

Your CFO doesn’t need to care about open-source goodwill. What they do need to care about is maintenance liability on infrastructure your business depends on but does not own. That’s a very different conversation.

Here’s the structural problem. Open-source software appears in 97% of codebases, but most companies contribute nothing back upstream. Black Duck’s OSSRA 2026 report found 93% of audited codebases contain components with no development activity in the past two years. Some of those are harmless zombie projects. But plenty are quietly accumulating unpatched vulnerabilities with no one responsible for fixing them.

The XZ Utils backdoor (CVE-2024-3094) shows exactly how this failure mode plays out. A malicious contributor spent two years building trust in a project run by a single overloaded maintainer. Then they introduced backdoor code that would have handed remote access to millions of systems. The security community’s conclusion was blunt: when maintainers burn out, supply-chain attacks get easy.

Vibe coding is making it worse. A 2026 arXiv paper found AI-assisted development lowers the cost of building on existing code — but also weakens the user engagement through which maintainers earn returns. Sustaining open source at current scale requires major changes in how maintainers are paid.

The formal metric for this risk is the Contributor Absence Factor (CAF) — a CHAOSS-defined calculation of the minimum number of contributors whose departure would leave a project with less than 50% of its recent commit activity. Your dependency risk assessment should surface your highest-CAF projects; contributing back is the mitigation action. For the full risk landscape — mechanism, incident record, governance responses, and platform tooling — the pillar guide covers all six dimensions.

What financial contribution options work at SMB scale?

The Open Source Pledge gives you a defensible starting point: $2,000 per full-time developer per year, paid directly to the maintainers of your choosing. For a 20-developer SMB that’s $40,000/year — roughly comparable to a mid-range SaaS subscription. The money goes via GitHub Sponsors, Open Collective, or direct transfer, not into a foundation’s general fund.

Three vehicles work well at SMB scale:

  1. Open Source Pledge membership — public commitment, $2,000/dev/year floor, annual transparency report, company listed publicly. Founding members include Astral, Tailscale, and Sentry. Best for documented supply-chain due diligence.
  2. GitHub Sponsors — lowest friction, direct to individual maintainers, starting from $5/month. Right for targeted support of specific high-CAF dependencies.
  3. Open Collective — for community-run projects with no single maintainer to sponsor. Transparent fund management with public accountability.

Tidelift offers an intermediary model for SMBs with structured procurement: subscription fees flow to verified package maintainers with SLA-backed guarantees. If your procurement team needs formal vendor agreements, Tidelift converts OSS contribution into a standard vendor spend category. For how these funding mechanisms fit into the broader ecosystem response, and the platform-level tools that sit alongside them, see our article on what GitHub and the OSS ecosystem are building to protect maintainers from AI slop.

There’s also a compliance angle worth noting if you’re operating in the EU. The Cyber Resilience Act comes into force by December 2027 and requires your entire open-source supply chain to comply with minimum cybersecurity requirements. At least 50% of open-source foundations say they don’t have enough funding to ensure CRA compliance. Paying relevant foundations isn’t generosity at that point — it’s compliance investment.

What does meaningful engineering-hour contribution look like with limited bandwidth?

You don’t need a dedicated OSS engineer. The contributions with the best maintainer impact-to-time-cost ratio, in order, are:

  1. Documentation improvements — maintainers consistently cite documentation debt as a top burnout driver. A well-structured doc PR takes two to four hours and is often the most valuable thing a team contributes.
  2. Issue triage and reproduction — confirming bug reports, adding reproduction steps, closing stale issues. Low technical bar, high time-savings for maintainers.
  3. CI/CD improvements — fixing flaky tests, adding coverage, improving build reliability. Durable contribution that reduces future maintainer burden.
  4. Targeted bug fixes — fix a bug your team has already diagnosed in your own fork. The upstream PR is usually straightforward once the diagnosis is done.

A practical cadence: one engineer-day per sprint across your top-five highest-CAF dependencies. Rotate ownership so no single team member holds the relationship.

One thing to avoid: AI-generated PRs submitted without human review. Daniel Stenberg ended curl’s bug bounty program because fewer than one in twenty submissions were real bugs, and each one was engaging three to four maintainers for hours. That is the extractive contribution pattern. It consumes more maintainer time than it saves.

The financial versus engineering-hour decision comes down to two variables: your team’s expertise relative to the project, and what the project’s primary bottleneck actually is. If the bottleneck is money, financial contribution wins. If the bottleneck is attention and triage capacity, engineering hours from a team that actually uses the project are more effective.

How do you decide whether to fork, fund, or migrate when a dependency goes dark?

The fork/fund/migrate framework applies when a dependency starts showing abandonment signals — the maintainer announces burnout, PRs stop being accepted, security patches stop appearing, or CAF drops to 1 with no succession plan in sight.

Fork — take ownership of a copy and maintain it internally. This is appropriate only when the dependency is deeply embedded in your stack, there is no viable alternative, and your team genuinely has the expertise to maintain it. The costs: ongoing maintenance burden, security patch responsibility, and compatibility drift over time.

Fund — direct financial support to the original maintainer or a commercial steward. Appropriate when the project has governance and the problem is just resource scarcity. Costs: cash, no operational lift.

Migrate — move to an actively maintained alternative. Appropriate when the functionality is commoditised, alternatives exist, and migration cost is lower than the long-term maintenance risk of staying put.

The decision criteria:

  1. Fork when the dependency is deeply embedded, no viable alternative exists, and your team genuinely has the expertise to maintain it long-term.
  2. Fund when the project has working governance and the bottleneck is money, not people.
  3. Migrate when the functionality is commoditised and the migration cost is lower than the maintenance risk of staying put.

OpenTofu and OpenSearch are the canonical successful community forks. Both worked because a large user community co-maintained under neutral governance. The lesson: forking works at the ecosystem level when the user base is large enough to sustain it. It rarely works at the single-company level.

For more on the dependency risk profile that makes this decision a financial priority, see our article on supply-chain risk assessment.

How can AI capability be contributed productively, not destructively?

AI-generated PRs are one of the primary maintainer-burnout drivers in 2026. So is there a responsible way to contribute AI capability without making the problem worse?

The AISLE/OpenSSL case study is the answer. A research team used expert-guided AI tooling to audit OpenSSL and found 12 previously unknown CVEs — some hiding since 1998. All were responsibly disclosed with fixes included. AISLE ran a similar engagement with curl and reported over 30 valid security issues.

As Drupal project lead Dries Buytaert put it: the difference wasn’t the use of AI. It was expertise and intent. AISLE used AI to amplify deep knowledge. Low-quality reports use AI to replace expertise that wasn’t there.

For SMB teams, the AISLE model comes down to one test: before submitting any AI-assisted contribution, have a domain expert review the output and confirm they would stand behind it. AI accelerates the work. A domain expert’s judgement gates what goes upstream.

Where do you start if you have never contributed upstream before?

Here is the minimum viable programme for a team starting from zero.

This week: Run your package manager (npm list, pip list, cargo tree) or SBOM tool and enumerate your top 20 dependencies. For each one, check the GitHub contributor graph for the past 12 months. Identify your top three highest-CAF, highest-criticality dependencies.
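If you are on npm, the enumeration step can be scripted against the package manager's JSON output. A sketch assuming the output of `npm ls --json --depth=0` is passed in as a string:

```python
import json

def direct_dependencies(npm_ls_json: str) -> list[str]:
    """Names of direct dependencies, parsed from `npm ls --json --depth=0`
    output, which nests them under a top-level "dependencies" object."""
    tree = json.loads(npm_ls_json)
    return sorted(tree.get("dependencies", {}).keys())
```

The equivalent for pip or cargo is a similar one-liner over `pip list --format=json` or `cargo metadata` output; the point is to get a machine-readable list you can join against contributor-graph data.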

This month: Check whether those three projects have a GitHub Sponsors page, Open Collective fund, or Tidelift listing. Set up a recurring sponsorship at whatever level your budget allows. Steady monthly income matters more to maintainers than irregular lump sums. Even $50 to $100 per project per month creates an established relationship and documented supply-chain due diligence.

Next quarter: Join the maintainer channel for one of your top-three projects. Spend one week reading issues before submitting anything. Scope a documentation improvement or issue triage contribution as your first PR.

For the CFO or board pitch: “We have identified three dependencies with single-maintainer risk. A dependency failure would cost us [estimated migration or incident cost]. Our OSS contribution programme costs [annual budget] and reduces that risk.” For EU-regulated companies, add the CRA compliance hook — by December 2027, your entire open-source supply chain must comply with the Cyber Resilience Act.

The place to start is the risk assessment in our article on adding maintainer health to your supply-chain risk process. Know which dependencies carry the most risk first, then come back to this article’s contribution options with a prioritised list in hand. For the full risk management context that makes this a strategic priority — not just housekeeping — the AI-generated contributions supply chain risk guide covers the full landscape across all six dimensions.

Frequently Asked Questions

Do we need dedicated open-source staff to contribute back?

No. The Open Source Pledge requires only a company-level financial commitment and annual transparency reporting. The sprint-cadence model — one engineer-day per sprint distributed across the team — fits within normal delivery planning.

What does the Open Source Pledge ask of us exactly?

$2,000 per full-time developer per year, paid directly to maintainers via GitHub Sponsors, Open Collective, or direct transfer. Annual transparency report required. No audit mechanism — it’s a voluntary public commitment. Members are listed publicly and may use member badges.

How do we know if a project needs our help?

Three signals to look for: (1) CAF — if one or two people account for the majority of commits, the project is fragile. (2) Activity velocity — increasing issue response times, declining PR merge rates, slowing releases. (3) Direct communication — maintainers who are struggling often say so in GitHub Discussions or changelogs.

Is forking a dependency ever the right answer?

Yes, under the right conditions: the dependency is deeply embedded in your critical path, there is no viable alternative, your team has the domain expertise, and you can find co-maintainers from the user community. Forking a low-community-interest project alone is rarely the right call for an SMB.

Can we contribute AI-generated code upstream responsibly?

Yes, with expert oversight. The AISLE/OpenSSL model: AI tooling used by security experts, all findings verified before disclosure. The irresponsible pattern is automated submission without expert review — exactly what ended curl’s bug bounty program.

What is the Contributor Absence Factor and how is it different from the bus factor?

“Bus factor” is informal shorthand. CAF is the CHAOSS-defined, calculable metric: the minimum number of contributors whose departure would leave the project with less than 50% of its recent commit activity. CAF = 1 is the critical threshold.
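Given a list of recent commit authors, the metric can be sketched as follows, assuming raw commit counts as the activity measure (CHAOSS implementations may weight activity differently):

```python
from collections import Counter

def contributor_absence_factor(commit_authors: list[str]) -> int:
    """Smallest number of top contributors whose removal leaves the project
    with less than 50% of its recent commit activity."""
    total = len(commit_authors)
    if total == 0:
        return 0
    removed = 0
    for n, (_, count) in enumerate(Counter(commit_authors).most_common(), 1):
        removed += count
        if removed * 2 > total:  # remaining activity drops below 50%
            return n
    return n
```

A project where one author made 8 of the last 10 commits returns 1: the critical threshold.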

How does the Elephant Factor differ from the Contributor Absence Factor?

CAF measures individual contributor concentration. Elephant Factor measures organisational concentration — how many companies could stop contributing before the project loses 50% of its activity. Both matter for assessing whether a project’s health is genuinely community-driven or commercially captured.

What should we do if we discover a critical dependency is already abandoned?

Apply fork/fund/migrate with urgency. First determine whether it’s truly abandoned (no commits in 12 months, maintainer unresponsive) or just under-resourced. For truly abandoned projects: check for community interest in co-maintaining a fork; evaluate migration cost; if neither is viable near-term, pin the version, add security scanning, and accelerate migration planning.

How do we pitch OSS contribution spend to a CFO or board?

Supply-chain risk language: “We have identified dependencies with single-maintainer risk; a dependency failure would cost us [migration or incident cost]; our OSS programme reduces that risk at [annual budget].” EU-regulated companies should add the CRA hook: by December 2027, your entire OSS supply chain must comply.

What is the minimum viable OSS contribution programme for a startup?

Two components: a financial floor and a communication commitment. Financial: recurring sponsorship on your top three highest-CAF dependencies via GitHub Sponsors or Open Collective. Communication: join the maintainer channel and subscribe to release notifications. Half a day per quarter to maintain, and it creates documented supply-chain due diligence.

What GitHub and the OSS Ecosystem Are Building to Protect Maintainers from AI Slop

Open source maintainers are drowning in AI-generated contributions — AI slop — at a volume their review processes were never built to handle. The economics are brutal: tools like Copilot, Cursor, and Windsurf dropped the cost to generate a pull request to near-zero. The cost to review one stayed exactly the same. That’s the structural problem in a nutshell.

GitHub’s February 2026 announcements are the first major platform-level response to this. The broader ecosystem is also moving — trust infrastructure, policy frameworks, funding models. Here’s what GitHub shipped, what Mitchell Hashimoto is building with Vouch, how the ecosystem is responding, and what meaningful contribution actually looks like for a 50–200 person tech company.

For the deeper context on why this affects you, see the open-source supply chain risk problem driving these changes.

What did GitHub actually announce for maintainers in February 2026?

GitHub shipped maintainer-relief tools in February 2026 and framed the AI-driven contribution surge as an “Eternal September” — invoking the 1993 Usenet moment when university students permanently changed online community norms. The message: this is not a temporary spike.

Here’s what shipped: repo-level PR controls (limit PRs to collaborators only, or disable them entirely), pinned issue comments to surface contribution guidelines before PRs are even opened, and temporary interaction limits to restrict who can comment or raise PRs during targeted slop campaigns.

Coming soon: PR deletion so maintainers can remove spam PRs outright. Criteria-based gating (in active exploration) would require a linked issue before a PR can be opened. Automated triage (gh-aw) would evaluate contributions against a project’s CONTRIBUTING.md or AGENTS.md — AI moderating AI.

Worth noting: these are maintainer-relief tools, not AI bans. GitHub is not restricting Copilot usage. That distinction matters.

Why is GitHub in a difficult position on this problem?

GitHub sells Copilot — one of the primary tools generating the contributions that are overwhelming maintainers. Its commercial incentive is to grow Copilot adoption; its platform features then have to filter those same contributions. The same company is the problem accelerant and the solution provider.

That conflict is already shaping decisions elsewhere. Gentoo Linux announced a migration to Codeberg, citing GitHub’s AI conflict and EU digital sovereignty concerns. Projects are voting with their feet. Expect GitHub’s tools to reduce maintainer burden without restricting Copilot in ways that would hurt its growth metrics. Understanding the supply chain context for these platform decisions — the full AI-generated contribution pressure problem — explains why platform-level responses like this are so difficult to get right.

What would criteria-based PR gating change about contribution dynamics?

Criteria-based gating would require a linked issue before a PR can be opened. The structural effect is direct: a contributor who has to open a discussion and have it acknowledged before raising a PR is unlikely to be generating dozens of PRs via automated agents. Low friction for genuine contributors; prohibitive for automated slop at scale.

The numbers back this up. CodeRabbit’s analysis of 470 open-source PRs found AI-generated PRs contain 1.7 times more issues overall, with security issues up to 2.74 times higher. One developer estimated it takes 12 times longer to review an AI-generated PR than to generate one.

The limitation: criteria-based gating is a platform rule, not a trust signal. It can be gamed. It does not verify intent, quality, or human involvement. It’s a necessary layer, but not a sufficient one. See how to incorporate these platform tools into your supply-chain risk process.

What is Mitchell Hashimoto’s Vouch, and why does trust infrastructure matter?

Vouch is an open source trust management project by Mitchell Hashimoto — co-founder of HashiCorp and creator of Ghostty — that requires contributors to be vouched for by a trusted maintainer before they can interact with a project. It’s experimental, currently trialled on Ghostty.

The key distinction worth understanding: criteria-based gating operates at the PR submission layer — rule-based, platform-enforced. Vouch operates at the contributor identity layer — relational, community-enforced. They’re complementary layers, not competing approaches. The Linux kernel’s Developer Certificate of Origin (DCO, 2004) and the Signed-off-by chain are earlier versions of the same web-of-trust principle.

The honest limitation: the model depends on trusted maintainers having capacity to vouch, which reintroduces the bandwidth problem at a different point.

Beyond GitHub: what is the broader ecosystem building?

Kate Holterhoff at RedMonk surveyed 77 open source organisations in early 2026 and found three governance orientations: prohibitionist (ban AI contributions — Linux Kernel, curl), boundary-and-accountability (permit with disclosure and explicit human ownership — EFF, Blender, Mozilla), and quality-first (gate on output quality regardless of origin — Fedora-influenced projects). The useful heuristic: the farther down the stack, the less permissive with AI you have to be.

The community built tooling ahead of the platforms — the Anti-Slop GitHub Action, CodeRabbit’s slop detection, and good-egg’s contributor reputation scoring. The positive counterexample worth keeping in mind: AISLE used AI to find 12 zero-days in OpenSSL. As Dries Buytaert observed, “It wasn’t the use of AI. It was expertise and intent.”

On financial sustainability: the Open Source Pledge asks companies to contribute $2,000 per developer per year to the OSS they depend on. Tidelift creates a commercial relationship between enterprise dependency consumption and maintainer compensation. Platform tools address the symptom — slop volume. Funding addresses the cause. Jazzband, a Python GitHub organisation, announced its sunsetting in March 2026 because of AI-generated spam. A real OSS project ended.

For a deeper look at the financial and contribution models that complement platform tooling, see our coverage of the Open Source Pledge and Tidelift.

What does meaningful contribution look like at a 50–200 person tech company?

Most companies consume far more open source than they contribute back. A maintainer team that burns out or freezes a project you depend on is a supply chain risk event. Contribution is not charity — it is supply-chain risk management.

Three practical modes that work at SMB scale:

Financial: Open Source Pledge or Tidelift subscriptions are proportionate to team size and require no dedicated engineering time.

Targeted engineering: Fix bugs your team has already encountered in your 3–5 most critical upstream dependencies. Issue triage and documentation carry weight because they come from someone with real usage context.

Internal AI contribution policy: Brief guidelines ensuring engineers understand and can own what they submit before it goes upstream. Even a short checklist tied to a project’s CONTRIBUTING.md changes behaviour.

What it does not look like: automated AI-generated PRs, vibe-coded contributions where the contributor cannot explain the code, PRs where the contributor disappears after submission.

Where is all this heading?

The ecosystem is building a layered defence. Criteria-based gating will likely become a baseline expectation for well-maintained projects, the same way CONTRIBUTING.md files became standard. Vouch is more ambitious — if it matures, OSS contribution shifts from an anonymous model to a relationship-mediated one. GitHub’s dual-position problem will not resolve cleanly; the most likely outcome involves Copilot incorporating contribution-quality guardrails.

The direction of travel is toward higher-quality, relationship-based contributions. Aligning your team’s contribution posture with that direction is good strategy and good citizenship. For more on how to incorporate these platform tools into your supply-chain risk process, see the dependency health assessment framework. For the open-source supply chain risk problem driving these changes — covering the full landscape across all six dimensions — see the complete series overview.

Frequently Asked Questions

What GitHub features protect maintainers from AI-generated contributions?

GitHub shipped several tools in February 2026: repo-level PR controls limiting pull requests to collaborators only or disabling PRs entirely, pinned issue comments for prominent contribution guidelines, and temporary interaction limits. Pull request deletion is coming soon. In development: criteria-based gating (requiring a linked issue before PR submission) and automated triage (gh-aw) that evaluates contributions against a project’s CONTRIBUTING.md.

What is criteria-based PR gating on GitHub?

Criteria-based gating is an upcoming GitHub feature requiring contributors to satisfy defined conditions before submitting a pull request — for example, linking to an existing approved issue. It addresses the asymmetric pressure problem: AI dropped the cost to generate a PR to near-zero, but the review cost did not. A lightweight pre-commitment step makes zero-effort automated PR generation structurally harder without creating a full invitation-only model.

Is GitHub doing enough to protect open source maintainers from AI slop?

GitHub has shipped useful tools and has more in development. The harder question is structural: Copilot drives AI-generated contributions; GitHub’s platform tools filter them; and the commercial incentive to grow Copilot pulls against maintainer protection. The conflict of interest is real. Gentoo Linux’s migration to Codeberg reflects genuine frustration with GitHub’s position.

What is the Open Source Pledge and how does it work?

The Open Source Pledge asks companies to contribute financially to the OSS they depend on — a minimum of $2,000 per full-time equivalent developer per year. Opt-in, not a legal obligation. Payment platforms include thanks.dev, Open Collective, and GitHub Sponsors. The framing is supply-chain risk management: unpaid maintainers are the common factor in major OSS security incidents.

What is Mitchell Hashimoto’s Vouch project?

Vouch is an open source trust management tool requiring contributors to be explicitly vouched for by a trusted maintainer before interacting with a project. It is experimental, currently trialled on Hashimoto’s Ghostty terminal. Vouch represents a web-of-trust approach — relational, community-enforced — as distinct from criteria-based gating, which is rule-based and platform-enforced. The two are complementary layers.

Can a tech company with no dedicated OSS staff contribute meaningfully to open source?

Yes. Three modes scaled to SMB bandwidth: financial (Open Source Pledge at $2,000 per developer per year; Tidelift subscriptions), targeted engineering (1–2 days per quarter on critical dependencies), and policy (an internal AI contribution guideline ensuring engineers understand what they submit). The policy mode costs almost nothing and prevents your team from adding to the slop problem.

What is the asymmetric pressure problem in open source?

Asymmetric pressure is the structural imbalance where AI tools lower the cost to generate a contribution but leave the cost to review it unchanged. Dries Buytaert put it directly: “AI makes it cheaper to contribute to Open Source, but it’s not making life easier for maintainers.” The result is existential: maintainer review capacity is finite; contribution volume is not.

Why did curl end its bug bounty programme?

Curl maintainer Daniel Stenberg ended the programme in January 2026 because AI-generated security reports were flooding it — the confirmed vulnerability rate dropped from above 15% to below 5%. Ending the bounty “removed the incentives for submitting made up lies.” Incentive structures shape contribution behaviour as much as platform rules.

What happened to Jazzband and why does it matter?

Jazzband, a collaborative GitHub organisation hosting Python projects, announced its sunsetting in March 2026. Lead maintainer Jannis Leidel cited a “flood of AI-generated spam PRs and issues.” It is concrete evidence that asymmetric pressure causes real casualties — not inconvenience, but an actual OSS project ending.

How do I know if my team’s AI-assisted OSS contributions qualify as AI slop?

Three tests: Does the contributor understand the code they are submitting? Is it linked to a real problem the team has encountered? Has a human reviewed it who can own it and engage with maintainer feedback? The canonical slop patterns are vibe-coded contributions where the contributor cannot explain the change, unverified AI-generated bug reports, and PRs where the contributor disappears after submission.

What is the web-of-trust model in open source and how does it differ from criteria-based gating?

Web of trust is a decentralised accountability system where trusted participants vouch for new contributors — with historical OSS precedents in the Linux kernel’s Developer Certificate of Origin (2004) and the Signed-off-by chain. Criteria-based gating is rule-based and platform-enforced, operating at PR submission. Web of trust is relational and community-enforced, operating at contributor identity. Both are complementary; neither alone is sufficient.

What governance approaches are OSS projects using for AI contributions in 2026?

RedMonk analyst Kate Holterhoff surveyed 77 organisations and found three orientations: prohibitionist (ban AI contributions — Linux Kernel, curl), boundary-and-accountability (permit with disclosure and human ownership — EFF, Blender, Mozilla), and quality-first (gate on output quality regardless of origin — Fedora-influenced projects). The “stricter the closer to the stack” heuristic holds: security-critical infrastructure trends prohibitionist; application-layer projects trend toward quality-first or boundary-and-accountability.

Adding Open-Source Maintainer Health to Your Software Supply Chain Risk Process

Your software supply chain risk tooling was built on an assumption: when a vulnerability turns up in a dependency, a patched version exists. That assumption breaks the moment the maintainer who would write the patch has burned out and walked away.

The 2026 OSSRA report from Black Duck found that 93% of commercial codebases contain zombie components — dependencies with no development activity in the past two years. Meanwhile, 92% contain components four or more versions behind, and 68% contain licence conflicts — the largest year-over-year increase in the report’s history. The full landscape of AI contribution pressure on open source explains the structural driver: AI coding tools have made it trivially cheap to generate pull requests while doing nothing to reduce the cost of reviewing them.

Your existing risk process has a blind spot. This article gives you a five-step process to close it — no dedicated OSPO headcount required.

Why does maintainer burnout show up in your SBOM?

Your SBOM knows a component exists. It does not know the maintainer quit six months ago.

Standard SBOM tooling — CycloneDX, SPDX, and the SCA platforms built on them — captures component name, version, licence declaration, and known CVEs. None of those fields capture maintainer activity. So a frozen component accumulates unpatched vulnerabilities that never get a CVE assignment, because no researcher bothers triaging a dead project.

The tooling gap is structural. Snyk, Sonatype, and Mend were architected around “find the patched version.” They are not designed to signal “there will be no patched version.”

OpenSSL was maintained by two overworked, underpaid people at the time of Heartbleed. The XZ compromise succeeded because a patient attacker targeted a single overworked maintainer. Maintainer burnout is not a community welfare concern — it is a supply-chain risk signal. The structural driver is the cost asymmetry mechanism: AI tools reduce contribution cost to near zero while review cost stays constant and high, accelerating burnout in volunteer-maintained projects. For the broader problem this process addresses — why AI-generated contributions create supply-chain risk across all dimensions — the pillar guide provides the full context.

What is a zombie component, and how do you find yours?

A zombie component is an open source dependency with no development activity in the past two years. The OSSRA 2026 formal definition: no commits, no releases, no issue activity in the last 24 months. Present in 93% of audited commercial codebases. This is not an edge case.

So how do you tell a zombie apart from a mature, stable library that just hasn’t needed any recent commits? Check four signals:

  1. Last commit date — if more than 24 months have passed, the component meets the zombie definition
  2. Open security issues with no maintainer response — this is the signal that distinguishes mature from abandoned
  3. README or CHANGELOG notes indicating “feature complete, no further development” — some projects formally communicate stable-and-done status
  4. Fork activity — if the community has created and is actively maintaining a fork, the original project’s effective abandonment has already been acknowledged

A CSS reset library with no commits since 2021 and no open CVEs is not the same risk as an authentication library in the same state.
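
Signal 1 is mechanically checkable from a repository's last commit date. A minimal sketch of the 24-month test, using the OSSRA definition above (the dates are illustrative; in practice you would read the commit date from the GitHub API's `/repos/{owner}/{repo}/commits` endpoint and weigh it against the other three signals):

```python
from datetime import datetime, timedelta

# ~24 months, per the OSSRA 2026 zombie definition
ZOMBIE_WINDOW = timedelta(days=24 * 30)

def meets_zombie_definition(last_commit_iso: str, as_of_iso: str) -> bool:
    """True if the last commit is older than the 24-month activity window."""
    last_commit = datetime.fromisoformat(last_commit_iso)
    as_of = datetime.fromisoformat(as_of_iso)
    return as_of - last_commit > ZOMBIE_WINDOW

# A repo last touched in mid-2021, assessed in early 2026
print(meets_zombie_definition("2021-06-01", "2026-01-15"))  # → True
```

The date check alone cannot distinguish mature-and-stable from abandoned; it only tells you which dependencies need the other three signals checked.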

How to surface zombie components with your existing tooling:

Most SCA tools (Snyk, Black Duck, FOSSA) support filtering by last activity date. Enable this filter and set the threshold to 24 months — it is frequently not on by default. OpenSSF Scorecard provides a “Maintained” check scored 0–10; a score of 0 means the project shows no recent commit or issue activity.

Not all zombie components require the same response speed. Prioritise by function first (authentication, cryptography, network I/O before UI utilities), then direct versus transitive dependency, then whether a maintained fork exists.

The Kubernetes External Secrets Operator case (one of the documented incidents) illustrates what happens when a dependency reaches this state: when its sole active maintainer took vacation, zero pull requests were merged and 20 new issues opened with no response. Recovery took at least six months.

How do you calculate the Contributor Absence Factor for a critical dependency?

The Contributor Absence Factor (CAF) is the smallest number of contributors whose departure would account for 50% of all commit activity.

If you have spent any time on Hacker News, you will have come across “bus factor” — the informal shorthand for the same concept. CHAOSS, the Linux Foundation project that maintains metrics for open source software health, formally renamed it Contributor Absence Factor. Use CAF in your risk reports and governance documents.

Why CAF beats total contributor count: A project with 40 contributors can still have a CAF of 1 if one core developer wrote 80% of the commits. Total contributor count is a vanity metric. CAF is the risk metric.

Worked calculation: Eight contributors with commit counts: 1,000; 433; 343; 332; 202; 90; 42; 33. Total: 2,475. The 50% threshold is 1,237.5. The first contributor accounts for 1,000 commits, the second adds 433 — cumulative total now 1,433, which exceeds the threshold. CAF = 2. Two people effectively control this project’s commit activity.

Three ways to compute CAF:

  1. GitHub API (free): Pull the contributor list with commit counts and walk the list until cumulative commits reach 50% of total. About 30 minutes manually.
  2. OpenSSF Scorecard: The Contributors check scores 0–10 based on commit distribution. A score below 5 signals concerning concentration.
  3. Bitergia Risk Radar: Commercial platform that computes a Total Risk Score incorporating CAF directly — suitable for teams assessing many dependencies at once.
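
The cumulative-threshold walk from the worked example is a few lines of code. A minimal sketch, with the commit counts hard-coded from the example above (in practice you would pull them from the GitHub contributors API):

```python
def contributor_absence_factor(commit_counts):
    """Smallest number of contributors accounting for >= 50% of commits."""
    counts = sorted(commit_counts, reverse=True)  # largest contributors first
    threshold = sum(counts) / 2
    cumulative = 0
    for caf, count in enumerate(counts, start=1):
        cumulative += count
        if cumulative >= threshold:
            return caf
    return len(counts)  # degenerate input falls through

# Commit counts from the worked calculation above
counts = [1000, 433, 343, 332, 202, 90, 42, 33]
print(contributor_absence_factor(counts))  # → 2
```

The same walk, run over commits grouped by employer rather than by individual, yields the Elephant Factor.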

CAF of 2 or lower on a Tier 3 or Tier 4 dependency that handles security-sensitive functions warrants escalation. CAF of 1 warrants an active contingency plan.

Elephant Factor is the organisational-concentration companion to CAF: the smallest number of organisations accounting for 50% of project activity. When one company employs all active committers, you face the Terraform/OpenTofu or Redis/Valkey scenario — the commercial backer makes a unilateral licensing decision, and you scramble to work out whether you can keep using the project. Track it for your Tier 2 dependencies.

For evaluating governance quality as part of your dependency health assessment, CAF provides the quantitative complement to qualitative governance review.

What does the CHAOSS viability framework assess, and how do you use it?

The CHAOSS viability framework (a Linux Foundation project) evaluates an open source dependency across four categories: Compliance and Security, Governance, Community, and Strategy. It gives you a documented, reproducible methodology — which matters because you need to compare results quarter-over-quarter, not make ad-hoc judgements each time.

Compliance and Security: Does the project track CVEs? Are releases signed? Is there a security disclosure policy? OpenSSF Scorecard automates most of this — run it first.

Governance: Is there a CONTRIBUTING.md, a code of conduct, documented decision-making? Does the project have an active maintainer group with more than one person? How a project handles AI contribution inflow is now a governance quality signal: the three governance orientations map directly onto this category of your dependency health assessment.

Community: CAF and Elephant Factor are the primary metrics. Supplement them with commit frequency trend and issue response latency.

Strategy: Is the project foundation-backed (Apache, CNCF, Linux Foundation)? Is there commercial backing with paid contributors? Foundation-backed projects provide a structural buffer against AI contribution pressure.

For each category, assign Red/Amber/Green. Red in Compliance and Security, or Red in Community = escalation required. Red in Governance = monitor closely. Red in Strategy plus Red in Community = contingency plan required.

A full CHAOSS assessment takes 30–45 minutes manually. CHAOSS recommends quarterly reassessment — this gives you the trend data to make supply-chain risk arguments at the board level.

How do you classify your dependencies by AI contribution pressure exposure?

Existing SCA risk tiers are based on known vulnerability severity — backward-looking. AI contribution pressure exposure is forward-looking: it asks where future vulnerability discovery will slow down or stop. You need both.

Tier 1 — Foundation-backed, paid contributors (CNCF, Apache, Linux Foundation). Low AI pressure exposure — governance absorbs inflow. Annual CHAOSS check. Default: Monitor.

Tier 2 — Commercially-backed, company employs primary contributors. Medium exposure — risk is vendor strategy change, not burnout. Semi-annual CAF + Elephant Factor check. Default: Monitor Elephant Factor.

Tier 3 — High-star / small volunteer team, no formal governance, high-AI-use language ecosystem (Python, JavaScript, TypeScript). High exposure — high visibility invites AI slop inflow; small team with no institutional buffer. Quarterly CHAOSS viability assessment. Default: Contingency plan if CAF is 2 or lower.

Tier 4 — Single-maintainer, no governance docs, no foundation backing. Single point of failure; burnout = abandonment. Quarterly + contingency planning. Default: Active alternative identification required.

Classification decision logic — four sequential questions:

  1. Is this project backed by a neutral foundation (CNCF, Apache, Linux Foundation)? Yes → Tier 1.
  2. Does a company employ the primary contributors as part of their paid work? Yes → Tier 2.
  3. Is this a high-star project in a high-AI-use language ecosystem with fewer than five active committers and no formal governance? Yes → Tier 3.
  4. Otherwise: Tier 4.
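
The four sequential questions translate directly into a decision function. A sketch with hypothetical boolean inputs — the parameter names are illustrative, not fields from any particular tool:

```python
# Ecosystems named in the article as high-AI-use
HIGH_AI_USE_ECOSYSTEMS = {"python", "javascript", "typescript", "go"}

def classify_tier(foundation_backed: bool,
                  company_employs_primary_contributors: bool,
                  language: str,
                  active_committers: int,
                  has_formal_governance: bool,
                  high_visibility: bool) -> int:
    """Apply the four classification questions in order; first match wins."""
    if foundation_backed:                      # Q1: CNCF, Apache, Linux Foundation
        return 1
    if company_employs_primary_contributors:   # Q2: commercially backed
        return 2
    if (high_visibility                        # Q3: high-star, small volunteer team
            and language.lower() in HIGH_AI_USE_ECOSYSTEMS
            and active_committers < 5
            and not has_formal_governance):
        return 3
    return 4                                   # Q4: everything else

print(classify_tier(False, False, "python", 3, False, True))  # → 3
```

Run once over all direct dependencies to bootstrap the tiered register, then re-run only when a structural signal changes.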

“High-AI-use language ecosystem” means Python, JavaScript/TypeScript, and the Go tool ecosystem. A small volunteer-maintained Python utility library today faces meaningfully more contribution pressure than an equivalent Fortran library.

The practical output is a tiered dependency register — maintainable in your SBOM tooling or a simple spreadsheet. Classify all direct dependencies on initial setup; update classification when structural signals change.

The documented incidents showing what happens when dependencies reach this state — curl, Node.js, Ghostty — provide the evidence for why Tier 3 and Tier 4 classification warrants proactive attention.

What is licence laundering, and why do you need SCA tooling to catch it?

Licence laundering is what happens when AI coding assistants generate code derived from copyleft-licensed sources — GPL, LGPL, AGPL — without retaining the original licence metadata. The result: undisclosed licence obligations embedded in your codebase, or in the codebases of the dependencies you rely on. Standard SCA tools miss this entirely.

OSSRA 2026 records 68% of audited commercial codebases containing licence conflicts — up from 56% the previous year. Only 54% of organisations currently evaluate AI-generated code for IP and licensing risks. One audited codebase contained 2,675 distinct licence conflicts.

Here is the technical pathway: a developer uses an AI assistant to generate a function. The AI reproduces logic derived from GPL-licensed source without attribution. The output file has no licence header. Your SCA tool flags it as “unknown” — typically deprioritised — or misses it entirely because it entered as an inline snippet, not a declared dependency.

And if a dependency you rely on has licence-laundered code embedded in it, you inherit that problem. In the worst case, a copyleft snippet in a proprietary codebase can obligate you to release your source code under the same licence.

Standard SCA tools (Snyk, Sonatype, Mend) check declared licence headers and SPDX identifiers. Two tools have moved ahead of the field: JFrog Xray, with semantic fingerprinting of AI-generated code provenance, and FOSSA Snippet Scanning, with licence compliance scanning for AI-generated code contexts.

AI code validation features are typically not enabled by default in either tool. Explicitly enable them.

The EU Cyber Resilience Act places supply-chain liability on the downstream commercial manufacturer. Undisclosed licence obligations from AI-generated code therefore create both legal and security exposure — worth flagging, as part of the broader problem this process addresses, in any company with EU regulatory exposure.

What does a quarterly OSS health review process look like?

CHAOSS recommends a quarterly cadence because it matches typical engineering governance rhythms — quarterly planning, board reporting — and is frequent enough to catch a project in early-stage decline rather than after it has gone fully dark.

Step 1: OpenSSF Scorecard sweep (~30 minutes, mostly automated). Run against all Tier 1 and Tier 2 dependencies via the CLI or GitHub Actions integration. Flag any dependency with a Maintained score below 5 or a Contributors score below 5. Treat these as escalation triggers for manual investigation.
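
The flagging in Step 1 can be scripted on top of Scorecard's JSON output. A minimal sketch, assuming the `checks`/`name`/`score` layout that `scorecard --format=json` emits — the scores below are illustrative, not from a real run:

```python
import json

# Sample report in the shape produced by `scorecard --repo=... --format=json`
# (scores are illustrative)
sample = json.loads("""
{"checks": [
  {"name": "Maintained", "score": 3},
  {"name": "Contributors", "score": 7},
  {"name": "Code-Review", "score": 9}
]}
""")

def escalation_flags(report: dict, watched=("Maintained", "Contributors"), floor=5):
    """Return the watched checks scoring below the escalation floor."""
    return {c["name"]: c["score"]
            for c in report.get("checks", [])
            if c["name"] in watched and c["score"] < floor}

print(escalation_flags(sample))  # → {'Maintained': 3}
```

Looping this over the Tier 1 and Tier 2 register turns the sweep into a few minutes of review rather than thirty.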

Step 2: CAF and Elephant Factor check (~45 minutes). For all Tier 3 and Tier 4 dependencies, pull contributor commit data via the GitHub API or Bitergia. Flag any dependency with CAF of 2 or lower. For Tier 2 dependencies, flag any where a single organisation employs more than 50% of active committers.

Step 3: SCA licence scan (~30 minutes, mostly automated). Run FOSSA Snippet Scanning or JFrog Xray with AI code provenance scanning explicitly enabled. Flag any new “unknown” or AI-derived licence entries and add them to a legal review queue.

Step 4: Zombie component delta review (~30 minutes). Compare this quarter’s SCA activity report against last quarter’s. Flag any component that moved from “active” to “no recent commits.” Check whether a maintained fork exists — if the community has coalesced around one, migration is a defined path.
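
The delta in Step 4 is a set comparison between two quarterly snapshots. A sketch with illustrative component names and a simple active/inactive status export:

```python
# Activity status per component, exported from your SCA tool each quarter
# (component names are illustrative)
last_quarter = {"libfoo": "active", "authkit": "active", "cssreset": "inactive"}
this_quarter = {"libfoo": "active", "authkit": "inactive", "cssreset": "inactive"}

def newly_inactive(previous: dict, current: dict) -> list:
    """Components that moved from active to no recent commits since last review."""
    return sorted(name for name, status in current.items()
                  if status == "inactive" and previous.get(name) == "active")

print(newly_inactive(last_quarter, this_quarter))  # → ['authkit']
```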

Step 5: Escalation and recording (~15 minutes). Critical findings — CAF of 1 on a Tier 3 or Tier 4 dependency, a new zombie component in a security-sensitive function, a licence conflict — go to the next engineering governance meeting with a recommended action. Three standard responses: find an alternative or maintained fork, sponsor the maintainer or contribute engineering hours, or vendor fork internally. Contributing back as a proactive risk reduction strategy provides the business case framework for the second option.

For a stack of 50–100 direct dependencies, the full process takes approximately two to three hours per quarter.

Who owns this process at a 50–500 person company without an OSPO?

The OSPO (Open Source Programme Office) function is a set of responsibilities, not a team. The failure mode at most SMB tech companies is not “we don’t have an OSPO” — it is “nobody has explicit ownership.” Assigning it to existing roles costs nothing and eliminates the gap.

CTO owns policy and escalation decisions: setting the OSS dependency policy, approving contingency plans for Tier 4 dependencies, escalating licence findings to legal. For companies with EU regulatory exposure, the quarterly review output becomes the compliance evidence file.

Engineering Leads and Platform Engineers run quarterly review Steps 1–4. They own the tiered dependency register and make fork-vs-replace recommendations. At the smaller end of the range, this may be a single senior engineer or the CTO directly.

Security Function (whoever owns AppSec or SCA tooling) configures and maintains SCA tooling with AI code provenance scanning enabled, owns the licence scanning, and feeds findings into the quarterly review.

First-quarter bootstrap: Run a one-time audit across all direct dependencies; classify them by tier; create the tiered register; identify any immediate findings (CAF of 1, zombie components in security-sensitive functions, licence conflicts); run CHAOSS viability assessment on all Tier 3 and Tier 4 dependencies. This takes one to two days for a 50–100 dependency stack. After that, the quarterly review is just the delta.

For Tier 3 and Tier 4 dependencies, the most effective long-term risk reduction is upstream contribution — funding a maintainer, contributing engineering hours, or steering a dependency toward foundation governance. Contributing back as a proactive risk reduction strategy makes the cost comparison case. For a complete overview of how AI-generated contributions are reshaping open-source supply chain risk across all dimensions — mechanism, incidents, governance, and platform responses — see the full series.

Frequently Asked Questions

Does our SBOM currently capture maintainer health signals?

Almost certainly not. Standard SBOM tools — CycloneDX, SPDX — capture component name, version, licence declaration, and known CVEs. They do not capture commit frequency, contributor count, or CAF. Layer an OpenSSF Scorecard sweep on top of your SBOM output to get maintainer health signals. Some commercial SCA platforms (Black Duck, Mend) are beginning to add activity signals, but typically not enabled by default.

What SCA tools surface licence laundering from AI-generated code?

JFrog Xray is the most capable tool for semantic fingerprinting of AI-generated code provenance. FOSSA Snippet Scanning provides strong licence compliance scanning for AI-generated code contexts. Standard SCA tools (Snyk, Sonatype, Mend) rely on declared licence headers — they will flag “unknown” entries but do not perform semantic fingerprinting. Enable AI code validation features explicitly; they are not on by default.

How do I prioritise which dependencies to assess first?

Function filter first — any dependency handling authentication, cryptography, network protocols, or deserialisation is a priority regardless of tier. Then tier classification — assess Tier 3 and Tier 4 first. For a 50–100 dependency stack, this typically produces a list of 10–15 high-priority items for the first quarter.

Can I do this assessment without any commercial tooling?

Yes. OpenSSF Scorecard is free and open source; the GitHub API is free for public repositories; the CHAOSS viability framework documentation is publicly available at chaoss.community. The limitation is scale — manual CAF calculation becomes time-consuming above approximately 30 direct dependencies. Add commercial tooling when the manual process exceeds approximately half a working day per quarter.

What does the EU Cyber Resilience Act require specifically for OSS dependencies?

The CRA places supply-chain liability on the downstream commercial manufacturer, not the OSS maintainer. Companies shipping software to EU markets must demonstrate supply-chain due diligence — knowing what OSS dependencies they use, their licence status, and their security maintenance status. The quarterly OSS health review process described here produces the documentation that satisfies this requirement. Full CRA obligations phase in by December 2027; consult legal counsel for your jurisdiction.

What should I do when a Tier 4 dependency has no active maintainer?

Three options in order of preference: (1) Find a maintained fork — check GitHub for forks with recent activity; the community often coalesces around one; (2) Sponsor or contribute to restart maintenance — fund a developer or assign engineering hours; this is a supply-chain investment, not charity; (3) Vendor fork internally — fork the repository, assume maintenance, and apply security patches. The business case for upstream investment develops the framework for option 2.

How often should I re-tier a dependency after initial classification?

Re-tier when a structural signal changes: project transitions from independent to foundation-backed (Tier 3 to Tier 1); a commercial backer withdraws support (Tier 2 to Tier 3); CAF drops to 1 due to a core contributor departure; the project freezes. Between tier-change events, the quarterly review process provides sufficient signal to detect trend changes without requiring full re-classification.

Curl Bug Bounty Shutdown and the Open-Source Incidents That Proved the Problem Is Real

In January 2026, Daniel Stenberg shut down the curl bug bounty programme he’d been running since 2019. Not because the money ran out. Because the economics had become untenable.

87 confirmed vulnerabilities. Over $100,000 in rewards. Six and a half years. And it ended because AI-generated security reports had made triage unsustainable.

This is not curl’s bad luck. Node.js dealt with a 19,000-line AI-generated pull request that triggered a formal community petition. Ghostty closed its doors to outside contributors within months of going public. tldraw stopped accepting pull requests entirely. Django’s Security Team documented a new category of AI-generated vulnerability report that required expert evaluation to reject.

Each of these incidents shows what AI-generated contribution pressure looks like on the ground as a supply-chain concern. And the OpenSSL/AISLE case shows there is a better way: expert-guided AI analysis found 12 zero-days without a single invalid report reaching the maintainers. The difference is expert verification, not AI involvement.


Why did curl shut down its bug bounty program?

curl’s HackerOne bug bounty programme ended on January 31, 2026. At its peak, the confirmed-vulnerability rate exceeded 15%. By 2025, it had fallen below 5%. Stenberg put it plainly: “Not only the volume goes up, the quality goes down. So we spend more time than ever to get less out of it than ever.”

Here is why that matters. A well-functioning bug bounty works because generating a credible report is expensive — the time and codebase knowledge required act as a natural quality filter. AI removes that cost on the submission side while leaving the maintainer’s triage cost completely unchanged. That is the underlying mechanism behind most of these incidents.

Stenberg coined “death by a thousand slops” in a July 2025 blog post (daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-slops/). curl’s security.txt started including: “We will ban you and ridicule you in public if you waste our time on crap reports.” On the reports themselves: “You fire up ChatGPT and ask ‘please point out the security problem in the curl project and make it sound horrible’ and it’ll do that.” On the decision to shut it down: “We need to make moves to ensure our survival and intact mental health.”

The replacement is GitHub’s Private Vulnerability Reporting at github.com/curl/curl/security/advisories. No monetary reward. Maintainer-controlled intake. Stenberg documented his further experience at FOSDEM 2026 in “Open Source Security in spite of AI” (fosdem.org/2026/schedule/event/B7YKQ7-oss-in-spite-of-ai/).


What happened when Claude Code generated a 19,000-line pull request for Node.js?

In late 2025, a Node.js TSC member submitted a 19,000-line pull request generated using Claude Code — a complete module refactor that reviewers estimated would take days to assess. The incident triggered a petition signed by over 80 Node.js developers calling for a project-wide ban on AI-assisted contributions — documented in arXiv 2603.26487 as the largest formal community response to a single AI contribution incident on record.

Scale amplified the cost asymmetry rather than demonstrating productivity. There is no accumulated track record behind a single AI-generated module refactor. Every line requires the same evaluation as if it came from an unknown contributor.

The TSC did not implement an outright ban. It implemented a minimum HackerOne Signal score requirement — requiring a track record of valid security submissions before participation is permitted. AI slop accounts cannot have accumulated a Signal score.

Matteo Collina, TSC Chair: “My ability to ship is no longer limited by how fast I can code. It’s limited by my skill to review. And I think that’s exactly how it should be.” The moment review stops, accountability stops with it.

The Node.js incident also established that community-driven policy demands — not solely maintainer decisions — could trigger formal governance changes. That precedent matters for how governance responses developed.


Why did Ghostty close its doors to outside contributors?

Ghostty is a GPU-accelerated terminal emulator created by Mitchell Hashimoto, HashiCorp’s founder. It launched publicly in December 2024 and within months had pivoted to an invitation-only contribution model.

The Ghostty AI policy is zero-tolerance: contributors who submit AI-generated code without adequate human review face bans, with permanent bans for repeat violations. Consequences are named explicitly.

The mechanism is Vouch — a tool Hashimoto built where only contributors vouched for by existing trusted members can submit pull requests. The community debate captures the tradeoff: “a necessary spam filter” versus “an insider’s club where social standing becomes a gatekeeping lever.”

What makes the Ghostty case significant is the timing. It launched in December 2024 with well-resourced leadership, and the problem appeared within months. You do not need years of technical debt for this to happen. You just need an attractive enough target and a low enough submission barrier.

Ghostty’s approach is selective admission, not shutdown — open only to contributors who have been vouched for. The taxonomy of these governance orientations is worth examining against your own dependencies.


When tldraw stopped accepting PRs entirely: what the “nuclear option” looks like

tldraw is an open source infinite canvas and drawing SDK. In January 2026, founder Steve Ruiz announced it would begin automatically closing pull requests from external contributors.

Ruiz introduced the term “well-formed noise”: PRs “that claimed to solve a problem we didn’t have or fix a bug that didn’t exist.” Correct syntax, plausible commit messages, apparent codebase understanding — but detecting them as invalid requires the same review effort as a legitimate contribution. As Ruiz put it: “To an outsider, the result of my fire-and-forget ‘fix button’ might look identical to a professional, well-researched, intellectually serious bug report.”

There was a secondary signal worth noting: even large PRs were “abandoned, languishing because their authors had neglected to sign our CLA.” A human with skin in the game will sign a Contributor Licence Agreement. An AI-generated submission’s author frequently does not.

Ruiz noted that GitHub lacked adequate tools for controlling external contribution intake — a gap GitHub began addressing in February 2026. The tradeoff: turning off PRs without an alternative channel makes legitimate bug reports invisible.

Ruiz acknowledged the tradeoff honestly: when bad work is virtually indistinguishable from good, “the value of external contribution is probably less than zero.” When a project’s founder cannot sustain PR review, the downstream risk is project abandonment.


How did AI change the trust model for security vulnerability reports?

Django’s Security Team published an update on February 4, 2026 describing a new pattern: “Almost every report now is a variation on a prior vulnerability.” The mechanism was plain: “Clearly, reporters are using LLMs to generate (initially) plausible variations.” A specific example: CVE-2025-13473, patched February 3, 2026, was “a straightforward variation on CVE-2024-39329.” These reports require expert triage time to evaluate and reject.

This is the trust model shift. Security disclosure previously assumed submitting a report was costly enough to filter out noise. That assumption no longer holds. Bug bounty platforms were built for an environment where valid reports are rare and expensive to produce. AI removed the friction that provided the quality filter. Each of these incidents is a concrete data point that CTOs managing OSS dependencies now need to account for in their broader risk management.

Node.js’s Signal score requirement is the platform-level response. The policy responses that LLVM and EFF developed are the governance layer worth examining next.


What does expert-guided AI bug analysis look like when it actually helps?

AISLE used AI-powered security analysis to discover 12 CVEs in OpenSSL — including a high-severity stack buffer overflow (CVE-2025-15467) enabling remote code execution, and multiple issues dating back to 1998. One of the most scrutinised codebases on the internet.

AISLE also reported over 30 valid security issues to the curl project. Stenberg’s assessment: “amazed by the quality and insights.” His formulation: “A clever person using a powerful tool.”

Their methodology uses context-aware detection with a priority-scoring system to reduce false positives, and human security experts verify every finding before disclosure. Result: 12 CVEs, zero invalid reports. The cost asymmetry mechanism runs in reverse — the expert team absorbs the false-positive filtering cost rather than transferring it to the maintainer.
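AISLE has not published its implementation, but the general shape of such a pipeline (score-gated filtering plus mandatory expert sign-off before anything is disclosed) can be sketched as follows. Everything here is hypothetical: the `triage` function, field names, and threshold are illustrative only.

```python
def triage(findings, min_score=0.8):
    """Return only findings that clear the priority threshold AND carry
    explicit expert verification; everything else never reaches the
    maintainer, so the false-positive cost stays with the reporting team."""
    return [f for f in findings
            if f["score"] >= min_score and f.get("expert_verified")]

findings = [
    {"id": "F-1", "score": 0.95, "expert_verified": True},   # disclosed
    {"id": "F-2", "score": 0.95, "expert_verified": False},  # held for expert review
    {"id": "F-3", "score": 0.40, "expert_verified": True},   # dropped by score gate
]
print([f["id"] for f in triage(findings)])
# → ['F-1']
```

The design point is that both gates are conjunctive: a high model score alone discloses nothing, which is what keeps invalid reports at zero from the maintainer’s perspective.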

As Drupal’s founder Dries Buytaert put it: “AISLE used AI to amplify deep knowledge. The low-quality reports used AI to replace expertise that wasn’t there.” Not AI versus no AI. Whether a qualified human verifies before the maintainer is burdened.


How widespread is the problem? What the 2026 data says about scale

The incidents above are not outliers. They are the visible surface of a pattern the 2026 data documents at scale.

Black Duck’s 2026 Open Source Security and Risk Analysis (OSSRA) report, based on 947 commercial codebases across 17 industries, found that 93% contained at least one “zombie component” — an open source dependency with no development activity in the past two years, receiving no patches, no bug fixes, no maintenance. When a vulnerability is discovered in a project that hasn’t been touched in years, there is often no maintainer left to fix it.

The vulnerability numbers are stark: open source vulnerabilities per codebase rose 107% year-over-year to an average of 581. 78% of audited codebases contained high-risk vulnerabilities; 44% contained critical-risk issues. 65% of organisations experienced a software supply chain attack in the past year.

arXiv 2601.15494 (“Vibe Coding Kills Open Source”) models how vibe coding severs the engagement loop through which maintainers previously earned returns, while accelerating downstream OSS usage. Its conclusion: “Sustaining OSS at its current scale under widespread vibe coding requires major changes in how maintainers are paid.” The Tidelift State of the Open Source Maintainer provides the human-scale evidence.

Each zombie component in your stack is the downstream product of a maintainer who ran out of capacity to continue. The incidents in this article document how that capacity runs out. For what this means for your dependency risk assessment — including how to identify which of your dependencies is at similar risk — the supply-chain risk process framework provides the operational next step. For the full AI-generated contribution pressure as a supply-chain concern across all six dimensions, the pillar guide maps the complete landscape.


Frequently Asked Questions

Is the curl bug bounty program coming back?

No. The January 2026 shutdown was framed as permanent, with language pointing toward possible escalation, not reinstatement. curl now accepts security reports through GitHub’s Private Vulnerability Reporting at github.com/curl/curl/security/advisories and email to [email protected]. No monetary reward. The structural conditions that caused the shutdown have not changed.

How can a project protect itself from AI-generated bug report floods?

There is no single solution. The documented range includes: platform reputation gating (Node.js’s HackerOne Signal score), invitation-only access control (Ghostty), full external PR closure (tldraw), and replacement of bug bounty intake with maintainer-controlled private reporting (curl). GitHub announced partial platform-level mitigation in February 2026. The arXiv 2603.26487 paper provides a taxonomy of 12 governance strategies for a more systematic view.

What is the difference between AI slop and AI-assisted analysis?

Whether a qualified human verifies findings before they reach the maintainer. AI slop: generated and submitted directly, triage cost transferred to the maintainer. Expert-guided AI analysis: reviewed and verified by a domain expert before disclosure. Stenberg’s formulation: “A clever person using a powerful tool” versus volume of unreviewed output. The policy responses that LLVM and EFF developed put this distinction into practice.

Are small projects more vulnerable than large ones?

The real risk axis is maintainer review capacity versus inbound volume, not project size. Small projects with single maintainers have less triage capacity. But large, high-visibility projects are more attractive targets — higher bug bounty rewards, bigger reputation payoff from major CVE attribution. OSSRA 2026’s 93% zombie component figure cuts across project sizes.

What happened with the Node.js 19,000-line AI-generated pull request?

A Node.js TSC member submitted a 19,000-line pull request generated using Claude Code — a complete module refactor that reviewers estimated would take days to assess. Over 80 developers signed a petition calling for a project-wide AI contribution ban. The TSC implemented a minimum HackerOne Signal score requirement rather than an outright ban — filtering low-quality submissions without closing the project to all AI-assisted contributions.

What is a “zombie component” and why does it matter for my software stack?

A zombie component is OSSRA 2026’s term for an open source dependency with no development activity in the past two years — present in active commercial codebases but receiving no patches, no bug fixes, no maintenance. Found in 93% of 947 audited codebases. Any vulnerability discovered in a zombie component will remain unpatched indefinitely. They are the supply-chain artefact of maintainer burnout. arXiv 2601.15494 and the Tidelift State of the Open Source Maintainer document how we got here.

Why did tldraw stop accepting pull requests from outside contributors?

tldraw founder Steve Ruiz closed external PRs in January 2026 after an influx of “well-formed noise” AI-generated contributions made triage unsustainable — PRs that appeared formally correct but were based on incorrect premises or fabricated issues, requiring the same review effort as legitimate contributions to identify as invalid. Ruiz noted that GitHub lacked adequate tools for controlling external contribution intake; GitHub began addressing this with new PR controls in February 2026.

How does expert-guided AI security analysis find vulnerabilities without flooding maintainers?

Context-aware AI analysis, a priority-scoring system that filters out low-confidence findings, and mandatory expert verification before disclosure. Result: 12 CVEs in OpenSSL, some dating to 1998, without a single invalid report reaching the maintainer team. The expert team absorbs the false-positive filtering cost rather than transferring it to the maintainer.

Where can I find Daniel Stenberg’s original post on ending the curl bug bounty?

Stenberg’s blog at daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-bug-bounty/ is the primary source. Current disclosure process: github.com/curl/curl/security/advisories and curl.se/.well-known/security.txt. His FOSDEM 2026 talk is at fosdem.org/2026/schedule/event/B7YKQ7-oss-in-spite-of-ai/.

What did “death by a thousand slops” mean?

Stenberg coined the phrase in a July 2025 blog post (daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-slops/) to describe the cumulative burden of AI-generated security reports. It adapts “death by a thousand cuts”: no single report is fatal, but the aggregate volume consumes maintainer time to the point of unsustainability. Widely cited across Ars Technica, Socket.dev, and The Register — because it names a countable category of harm rather than describing the problem abstractly.

Three Open-Source Governance Orientations for Managing AI-Generated Contribution Volume

Open-source projects are scrambling to write formal AI contribution policies — but they’re not all arriving at the same answer. RedMonk’s survey of 77 OSS organisations found a fragmented landscape. Some projects ban AI contributions outright. Others require disclosure and accountability. Others don’t care how code was produced as long as it passes review.

The first systematic attempt to map all of this comes from arXiv 2603.26487 — “Beyond Banning AI” by Yang, He, and Zhou of Peking University — which analysed 67 highly visible OSS projects and derived three governance orientations and twelve operational strategies from what they found.

This article explains each orientation, walks through the LLVM, EFF, and Ghostty policies as concrete examples, and gives you a practical rubric for evaluating your dependencies and writing your own internal contribution rules. These governance choices sit at the upstream layer of the full context of open-source supply chain risk.

Why do open-source projects need formal AI contribution policies now?

AI tools have changed the economics of open-source contribution. Generating a pull request now takes seconds. Reviewing one still takes as long as it always did.

Before AI tools, a PR was a signal of genuine interest. Maintainers extended the benefit of the doubt because the effort required to send one was credible. AI has broken that signal. As GitHub put it in their “Welcome to the Eternal September of Open Source” post: “The cost to create has dropped but the cost to review has not.”

The canonical example of how bad this gets is the Node.js incident: a 19,000-line PR generated with Claude Code triggered a petition signed by over 80 developers calling for a project-wide ban. The cost asymmetry that makes this problem structural is documented in arXiv 2601.15494. The curl bug bounty shutdown is the logical endpoint — fabricated AI security reports were costing more to process than the programme was worth.

Without an explicit policy, maintainers are individually enforcing unwritten rules. That creates inconsistency, resentment, and burnout. Policy formalises what maintainers already know: the contribution economics have changed and something has to give.

What are the three governance orientations, and what does each one assume?

The arXiv 2603.26487 framework identifies three top-level orientations. These aren’t specific policies — they’re underlying stances about risk, trust, and what AI-generated contributions actually represent. Real projects often blend elements from more than one.

O1 — Prohibitionist: AI-generated contributions present structural risk — provenance uncertainty, licence contamination, or review-capacity overload — that normal review processes can’t reliably catch. Categorical exclusion or strict access control is the rational response.

O2 — Boundary-and-Accountability: AI-assisted contributions are fine if the contributor discloses AI tool use and demonstrates genuine understanding of what they submitted. The policy governs contributor behaviour, not the capability of the tool.

O3 — Quality-First / Tool-Agnostic: Contributions get evaluated on merit regardless of how they were produced. Existing quality gates — CI/CD, code review standards — are sufficient. AI-specific rules add friction without proportional benefit.

The key distinction worth understanding: O1 and O2 govern AI inputs directly. O3 governs outputs only. O1 assumes provenance is the primary risk. O2 assumes contributor accountability is. O3 assumes your review pipeline can catch anything that matters. None of these has won — and the right answer depends on your project’s specific bottleneck. Understanding how AI contribution pressure is reshaping OSS governance at the supply-chain level helps make sense of why these orientations differ so sharply.

What does a Prohibitionist policy look like in practice? (LLVM)

LLVM’s policy is prohibitionist-adjacent rather than fully prohibitionist. It doesn’t ban AI tools outright, but it restricts them in ways that functionally exclude many AI-assisted workflows. The LLVM AI Tool Use Policy is the canonical example of this approach.

The foundation is a human-in-the-loop requirement: “Contributors must read and review all LLM-generated code or text before they ask other project members to review it… they should be able to answer questions about their work.”

Two specific prohibitions do the heavy lifting. First, LLVM bans AI agents acting without human approval — explicitly naming the GitHub @claude agent. Second, LLVM bans AI tools for “good first issues” — the primary entry point for low-effort, high-volume AI submissions. That removes the most obvious vector for turning the project into a spam target.

LLVM also formalises the concept of extractive contribution: “a contribution should be worth more to the project than the time it takes to review it.” Maintainers can apply an extractive label to off-track PRs, and persistent non-compliance escalates to moderation.

What LLVM explicitly permits: AI-assisted contributions where the contributor has reviewed the output and can defend design decisions. This is a standard, not a blanket ban. The policy relies on review culture rather than automated enforcement. The incidents that forced these policies into existence made LLVM’s formal approach necessary.

What does a Boundary-and-Accountability policy require from contributors? (EFF)

The Electronic Frontier Foundation published its LLM-assisted contribution policy in February 2026. The policy opens with a candid acknowledgement of the tension: “Banning a tool is against our general ethos, but this class of tools comes with an ecosystem of problems.”

The EFF policy has a clean two-part structure.

Boundary (disclosure): Contributors must disclose when they use LLM tools.

Accountability (demonstrated understanding): Comments and documentation must be authored by a human. Where LLVM’s accountability is asserted at review time — can you defend this in conversation? — EFF’s is embedded in the submission artefact itself. Self-authored comments mean the human’s understanding has to be visible in what they submit.

EFF doesn’t ban LLMs. “Their use has become so pervasive a blanket ban is impractical to enforce.” Instead the policy creates constraints that make AI-assisted contributions viable only when the contributor genuinely understands what they’re submitting.

The disclosure requirement (A2 in arXiv 2603.26487) is the most widely adopted single strategy in the 67-project corpus. The tradeoff: the barrier is lower than LLVM’s, but it relies on contributor honesty. A bad-faith contributor can tick the disclosure box while submitting code they don’t understand. For cases where disclosure failed to prevent problems, see the curl bug bounty shutdown and the incidents that followed.

When does invitation-only become the right answer? (Ghostty)

Ghostty — Mitchell Hashimoto’s terminal emulator — implemented one of the most structurally restrictive governance responses in the OSS ecosystem. Pull requests are only accepted from contributors explicitly vouched for by existing trusted contributors.

The mechanism is the Vouch project: only vouched contributors can submit PRs, the trust graph is explicit and decentralised, and trusted contributors can endorse newcomers to grow the inner circle deliberately.

What drove Ghostty to this was simple: the cost-benefit ratio of accepting unsolicited contributions turned negative. tldraw arrived at the same endpoint through platform automation — automated closure of all unsolicited PRs. Steve Ruiz summed it up bluntly: “In a world of AI coding assistants, is code from external contributors actually valuable at all?” His project was receiving PRs that “claimed to solve a problem we didn’t have or fix a bug that didn’t exist.”

This is the extreme end of the O1 orientation: not rules about what AI contributions must include, but structural access control that prevents unsolicited contributions entirely.

The community costs are real. The contributor pool shrinks, feature development from outside slows, and the model can feel unwelcoming to skilled new contributors who happen to be unknown. The counterargument: many maintainers already use informal vouching. Vouch simply codifies what already happens.

And invitation-only isn’t the same as closing the project. New contributors can still submit patches by publishing them publicly and asking trusted contributors to pull them. The threshold question is economic: when the marginal cost of reviewing unsolicited contributions consistently exceeds their marginal value, this becomes defensible.

What do the twelve governance strategies tell us about the policy design space?

The twelve strategies from arXiv 2603.26487 are the operational implementations of the three orientations. They show that the same underlying stance can be implemented through very different mechanisms — and that you have more options than just picking an orientation and running with it.

The strategies break into four functional groups.

Function A (entry and input qualification): A1 Boundary Exclusion, A2 Transparency and Disclosure, A3 Compliance and Provenance Safeguarding.

Function B (responsibility and evidence restoration): B1 Accountability Reinforcement, B2 Verification and Evidence Gating, B3 AI Tooling Governance via AGENTS.md.

Function C (review burden and workflow protection): C1 Scope and Intentionality Control, C2 Capacity and Queue Control, C3 Moderation and Sanctions, C4 Security Channel Governance.

Function D (infrastructure and institutional adjustment): D1 Channel and Platform Reconfiguration, D2 Incentive Redesign.

Three strategies are worth looking at more closely.

A2 (Transparency and Disclosure) is the most commonly adopted strategy in the corpus. It’s the minimum viable policy — compatible with both O2 and O3, requires no structural access changes, and is the baseline for anything more sophisticated. A disclosure checkbox in your PR template is A2 in isolation. Better than nothing, but it doesn’t create accountability or reduce volume on its own.
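A minimal A2 implementation in `.github/PULL_REQUEST_TEMPLATE.md` might look like the following. The wording is illustrative, not taken from any cited project; it pairs the disclosure checkbox with the self-authored-comments constraint discussed in the EFF section.

```markdown
## AI tool disclosure

- [ ] I used AI tools (e.g. Copilot, Claude, ChatGPT) to help produce this change
- [ ] If checked: I have reviewed every generated line and can explain it
- [ ] All comments and documentation in this PR were written by me
```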

B3 (AI Tooling Governance / AGENTS.md) is double-edged. The governance benefit is clear, but adding AGENTS.md also signals you’re AI-friendly, which can attract more submissions than you intended. Be cautious if you’re already at capacity.

B2 (Criteria-Based Gating): GitHub Community Discussion #185387 proposes requiring a linked, triaged issue before a PR can be opened. GitHub has shipped some relief features but criteria-based gating isn’t generally available yet.

The key insight: strategies can be combined across orientations. Adopt A2 disclosure from O2, pair it with B2 criteria-based gating and D2 incentive redesign — you’re not locked to one orientation for everything. For platform tools that operationalise these strategies, see what GitHub and the OSS ecosystem are building to protect maintainers from AI slop. These governance orientations sit within the full context of AI contribution pressure reshaping OSS supply chain risk — a broader picture that spans the economic mechanism, the incident record, platform responses, and risk management frameworks.

How do you assess whether a project’s governance is adequate from the outside?

If you depend on OSS projects you don’t control, the question isn’t which orientation to adopt — it’s whether your dependency has adequate governance given your risk exposure.

Here’s a five-signal rubric.

1. Policy existence. Check CONTRIBUTING.md. Search for “LLM,” “AI,” “generative,” or “Copilot.” Also check PR templates (.github/PULL_REQUEST_TEMPLATE.md) and SECURITY.md. If nothing surfaces an AI policy, the project is effectively O3 by default.

2. Policy specificity. Does the policy name specific required or prohibited behaviours? Vague language like “quality contributions only” is O3 by default, not genuine governance. LLVM’s policy is enforceable. “We value quality” is an aspiration.

3. Enforcement mechanism. Is there automated enforcement — CI quality gates, PR templates, criteria-based gating — or does policy rely entirely on reviewer discretion? Structural enforcement beats human judgement under volume pressure.

4. Contributor health signals. The CHAOSS Contributor Absence Factor measures the smallest number of committers who together account for 50% of project activity. If one or two people account for half of all activity, the governance framework depends on those individuals staying engaged — and AI contribution volume is a new stressor on exactly that vulnerability.

5. Recent governance activity. Has the project updated its contribution policies since 2025? Projects that haven’t updated since 2023 may not have addressed the AI contribution volume shift at all.
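A crude first pass at signal 1 can be automated. This sketch is illustrative (the file contents would come from, say, the GitHub contents API); the word-boundary regex avoids false hits on words like “maintain”, which a plain substring search for “ai” would match.

```python
import re

# Keyword and file set mirror rubric signal 1 above.
AI_POLICY_PATTERN = re.compile(r"\b(LLM|AI|generative|Copilot)\b", re.IGNORECASE)
POLICY_FILES = ("CONTRIBUTING.md", ".github/PULL_REQUEST_TEMPLATE.md", "SECURITY.md")

def has_ai_policy(files):
    """files maps repo path -> text content. Returns True if any policy
    file mentions AI tooling; False means effectively O3 by default."""
    return any(AI_POLICY_PATTERN.search(files.get(path, ""))
               for path in POLICY_FILES)

# Hypothetical repo snapshot: no AI language in CONTRIBUTING.md,
# but SECURITY.md addresses LLM-generated reports.
repo = {"CONTRIBUTING.md": "We maintain strict review standards.",
        "SECURITY.md": "Reports generated by an LLM must be disclosed."}
print(has_ai_policy(repo))
# → True
```

A keyword hit only establishes that a policy exists; signals 2 through 5 (specificity, enforcement, contributor health, recency) still require reading it.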

For critical dependencies, check recent PR history to confirm the policy is actually being applied. Prioritise scrutiny by: critical path, unavailable alternatives, and CVE exposure history. Using governance quality as a risk signal in your dependency audits is where this assessment becomes actionable.

What should your own upstream contribution policy say about AI tools?

For engineering teams that contribute upstream to OSS projects, an internal contribution policy on AI tool use is both a governance obligation and a reputation management tool. Bad AI-assisted contributions damage your standing with the maintainers whose goodwill you depend on. Here are five things your policy needs to address.

1. Disclosure requirement. State whether contributors must disclose AI tool use in PR descriptions or commit messages. Match the receiving project’s policy where one exists.

2. Accountability standard. Contributors must be able to explain every line they submit. Adopt the LLVM human-in-the-loop requirement as the default unless a project explicitly operates under O3.

3. Documentation and comments. Do not submit AI-generated explanatory comments or documentation. Write your own. If you can’t write the comments, you haven’t understood the code.

4. Context check before contributing. Verify whether the target project has an explicit AI contribution policy. If the project is O1 or prohibitionist-adjacent like LLVM, don’t use AI tools in that contribution regardless of quality.

5. Contribution type scope. Apply stricter standards to security-sensitive contributions. Django’s security team said it plainly: “Almost every report now is a variation on a prior vulnerability… Clearly, reporters are using LLMs to generate (initially) plausible variations.” Contributing fabricated security findings gets you blacklisted from the security channel.

Formalising this protects your team’s standing, reduces extractive contribution exposure, and gives contributors a clear standard to work to. Using governance quality as a risk signal in your dependency audits is where this framework becomes actionable as part of managing supply-chain risk across the open-source dependencies your product relies on. For the complete picture of how AI-generated contributions are reshaping open-source supply chain risk across all dimensions, the pillar page maps each risk area and links to the full series.

FAQ

Which governance orientation should our open-source project adopt?

Match orientation to project profile. O1 suits small-core projects with high technical standards and limited maintainer bandwidth. O2 suits large contributor communities where AI-assisted contributions from skilled contributors have genuine value. O3 suits projects with robust CI/CD where the review pipeline can catch quality problems regardless of where the code came from.

If you’re unsure, start with O2’s minimum viable implementation: the A2 disclosure requirement. It’s the most commonly adopted approach, provides a clear accountability baseline, and doesn’t require structural access changes to implement.

What is AGENTS.md and should we use it?

AGENTS.md is a repository-level instruction file that gives AI coding agents project-specific constraints — what to avoid, how to format PRs, what testing is required. It’s Strategy B3 (AI Tooling Governance).

The catch: adding AGENTS.md signals that you’re AI-friendly, which can attract more contributions than you intended. Use it if you’re O3 and want AI tools to work well with your project. Be cautious if you’re O1, or if the openness signal would create more review volume than you can absorb.
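
For illustration, a hypothetical AGENTS.md for an O3-leaning project might look like the following — the headings and constraints are invented examples, not a standard schema:

```markdown
# AGENTS.md — hypothetical example for an O3-leaning project

## Before opening a PR
- Every PR must link a pre-existing, triaged issue.
- Run the full test suite locally; do not rely on CI to find failures.

## Formatting
- Follow the existing code style; do not reformat files you did not change.
- Keep PRs under 300 changed lines unless the linked issue says otherwise.

## Out of scope for agents
- Do not touch security-sensitive code paths without maintainer sign-off.
- Do not claim issues labelled "good first issue".
```

Even a short file like this shifts some review burden upstream: the agent's output arrives pre-constrained rather than needing the same corrections on every PR.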

Does a quality-first policy actually work?

O3 works if your CI/CD is comprehensive and your reviewers can identify AI-generated code containing subtle logical errors or maintainability problems after it passes automated checks.

The structural risk is that it absorbs the full volume increase without reducing it. O3 is right for projects with well-funded, professional contributor bases where review capacity can scale. It’s risky for volunteer-maintained projects where maintainer time is fixed.

How do we know if a project has an AI contribution policy?

Check CONTRIBUTING.md. Search for “LLM,” “AI,” “generative,” or “Copilot.” Also check PR templates (.github/PULL_REQUEST_TEMPLATE.md) and SECURITY.md. If nothing surfaces, the project is effectively O3 by default.
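
That check is easy to script against a local clone. A minimal sketch in Python — the file list and keywords mirror the guidance above; treat both as starting points, not an exhaustive catalogue:

```python
from pathlib import Path

# Files and keywords worth checking, per the guidance above.
POLICY_FILES = [
    "CONTRIBUTING.md",
    ".github/PULL_REQUEST_TEMPLATE.md",
    "SECURITY.md",
]
KEYWORDS = ["llm", " ai ", "generative", "copilot"]

def find_ai_policy_mentions(repo_root: str) -> dict[str, list[str]]:
    """Scan a local checkout for lines mentioning AI-policy keywords."""
    hits: dict[str, list[str]] = {}
    for rel in POLICY_FILES:
        path = Path(repo_root) / rel
        if not path.is_file():
            continue
        for line in path.read_text(errors="ignore").splitlines():
            # Pad the line so " ai " matches at line boundaries too.
            if any(kw in f" {line.lower()} " for kw in KEYWORDS):
                hits.setdefault(rel, []).append(line.strip())
    return hits
```

An empty result supports, but does not prove, the O3-by-default reading — some projects state their policy in a blog post or governance doc instead.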

What is the difference between the LLVM and EFF policies?

Both are O2 (Boundary-and-Accountability) but their accountability mechanism differs. LLVM’s accountability is demonstrated at review time: can you defend any line in discussion? EFF’s is embedded in the submission artefact itself: human-authored comments make understanding visible in what is submitted.

LLVM also prohibits specific use cases — AI agents acting autonomously, AI for “good first issues” — while EFF’s scope is narrower and framed through a civil liberties lens.

What is the arXiv 2603.26487 paper and why is it the reference for this framework?

“Beyond Banning AI: A First Look at GenAI Governance in Open Source Software Communities” by Yang, He, and Zhou (Peking University, March 2026) is the first systematic qualitative study of AI contribution governance across a large OSS project corpus — 67 highly visible projects — yielding the three orientations and twelve strategies. No equivalent framework exists in practitioner or analyst literature.

What is criteria-based gating and is it available now?

Criteria-based gating (Strategy B2) would require a PR to be linked to a pre-existing, triaged issue before it can be opened. The proposal is tracked in GitHub Community Discussion #185387. GitHub has shipped some relief features but criteria-based gating isn’t generally available yet.

Can I just add a disclosure checkbox to our PR template and call it done?

A disclosure checkbox is A2 in isolation — the minimum viable O2 implementation. It doesn’t create accountability, filter quality, or reduce volume on its own. Pair it with an explicit accountability statement and quality gates that apply regardless of AI disclosure status.

How does the Ghostty invitation-only model differ from just closing the project to outside contributors?

Invitation-only changes who can submit pull requests — not who can use, fork, or raise issues. New contributors can still submit patches by publishing them publicly and asking trusted contributors to pull them. The Vouch system makes the trust graph explicit so the inner circle can grow deliberately.

What happens when a project changes orientation mid-stream?

Policy changes create friction. Developers contributing under O3 norms may resist new O2 requirements. The most successful transitions include a clear public explanation of the reason, a transition period, and acknowledgement that the goal is sustainability, not restriction for its own sake. LLVM, EFF, and Ghostty all published explanatory posts alongside their policy changes.

Why AI Pull Requests Cost More Than They Contribute to Open-Source Projects

Something has changed in how open-source contributions work. AI tools mean anyone can generate a formally correct pull request in minutes — sometimes seconds. The cost of submitting has basically hit zero. The cost of reviewing hasn’t moved at all.

That is a structural cost asymmetry and it is putting real pressure on maintainer bandwidth across major projects. Daniel Stenberg ended curl’s seven-year bug bounty programme. tldraw auto-closes all external AI PRs. Ghostty moved to invitation-only contributions.

The uncomfortable truth is that good-faith AI contributions and bad-faith ones impose the same triage cost. The reviewer’s workload is identical either way. This article explains why — and it’s the entry point to the broader open-source supply chain risk landscape, which extends to dependency security and licence compliance.


Why did the cost of contributing to open source just collapse?

GitHub’s pull request model was always designed to lower barriers. Before it, contributing meant subscribing to mailing lists, learning community norms, and formatting patches correctly. That friction filtered for real engagement.

The push-based contribution model meant any external actor could submit code to any public repository with no prior relationship required. Code first, conversation later. AI tools then removed the last meaningful friction gate — before LLMs, generating a contribution still required actually reading the codebase and demonstrating some engagement. That time cost filtered for genuine interest.

The numbers show the result. Excalidraw received more than twice as many PRs in Q4 2025 as in Q3. Curl’s security report queue hit AI slop rates above 20% by mid-2025, averaging about two AI-generated reports per week.

Vibe coding is the contributor-side practice producing this volume: prompt an AI to generate code without deep codebase understanding, submit, and move on. It strips out the user engagement through which many maintainers earn their return on the work — the kind of engagement that historically made reviewing worthwhile.

That cost gets transferred somewhere. It lands on the reviewer.


What does it actually cost a maintainer to review an AI-generated pull request?

Reviewing a pull request means reading code, understanding intent, running tests, checking alignment with project direction, and formulating feedback or a rejection rationale. None of those steps can be skipped without risk.

A calculation that has circulated widely puts the asymmetry in concrete terms: a contributor spends roughly 7 minutes generating a vibe-coded PR while a maintainer spends roughly 84 minutes reviewing it. That’s a 12× cost multiplier, borne entirely by the person who did not initiate the transaction.
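
Plugged into a weekly volume figure, the asymmetry compounds quickly. A back-of-the-envelope sketch using the figures quoted above:

```python
# Figures from the widely circulated calculation quoted above.
GEN_MINUTES = 7      # contributor time to generate a vibe-coded PR
REVIEW_MINUTES = 84  # maintainer time to review it

def weekly_review_hours(prs_per_week: int) -> float:
    """Maintainer review load, in hours, for a given weekly PR volume."""
    return prs_per_week * REVIEW_MINUTES / 60

multiplier = REVIEW_MINUTES / GEN_MINUTES  # the 12x asymmetry
```

At ten such PRs a week, contributors spend just over an hour in total while the maintainer side absorbs fourteen — and the contributor-side hour is split across ten people, while the maintainer hours usually are not.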

The curl security team consists of seven members; every report engages 3–4 of them, sometimes for 30 minutes, sometimes for several hours. At peak, eight reports arrived in a single week. Stenberg’s team was spending multiple days per week on triage before he ended the bug bounty — one that had been running since 2019, paid over $90,000 in awards, and fixed 81 genuine vulnerabilities.

And even rejected contributions aren’t free. Triage — opening, scanning, categorising, and closing a PR — takes 5–15 minutes per item. Review cost does not scale with generation cost. The two are structurally independent.

The real-world incidents that prove this dynamic — the curl bug bounty shutdown, tldraw’s auto-close policy, the Node.js 19,000-line Claude Code PR — each reflect the same arithmetic playing out at scale.


What is extractive contribution, and why does it apply even to good-faith AI PRs?

“Extractive contribution” comes from Nadia Eghbal’s Working in Public (2020): a contribution where the marginal cost of reviewing and merging it exceeds the marginal benefit to the project. LLVM operationalised this in their December 2025 AI Tool Use Policy with a golden rule: a contribution should be worth more to the project than the time it takes to review it. That standard applies regardless of contributor intent.

Ease of creation shifts the burden onto the maintainer because the benefits are lopsided: the contributor gets the credit, the maintainer gets the maintenance burden. Good-faith contributors using AI tools impose exactly the same review burden as bad-faith ones.

tldraw’s experience illustrates this perfectly. Early AI PRs at tldraw all looked good — formally correct, tests passing. Problems only emerged when patterns of abandonment and misdirected work became apparent: the AI had taken the issue at face value without understanding the codebase. The contributor believed they were helping. The result was still extractive.

This is the distinction most coverage misses: because intent cannot be determined without performing the review, the distinction between good-faith and bad-faith AI contributions does not change the maintainer’s workload. The problem isn’t about malicious intent — it is about where the cost falls.


How does vibe coding break the loop that made open source work?

Traditional open-source contribution followed a clear cycle. A developer used a project, hit a problem, engaged with the issue tracker, understood the maintainer’s priorities, and submitted a fix. The submission arrived with embedded project knowledge — and a contributor relationship had begun.

That cycle produced two things at once: a potential contribution and an engaged community member. Vibe coding severs this loop entirely. Building software without directly reading documentation, reporting bugs, or engaging with maintainers means the submission arrives without the context that made it valuable.

Steve Ruiz described his own first major open-source contribution as requiring sustained engagement, issue history reading, and multiple iterations. His framing: context and alignment must come first; implementation is almost ceremonial once those are in place. Vibe coding inverts this entirely.

LLVM explicitly forbids using AI tools to fix issues labelled “good first issue”. These issues exist to grow the contributor base through learning. Outsourcing that learning produces a PR without producing a contributor. As LLVM put it: “Passing maintainer feedback to an LLM doesn’t help anyone grow, and does not sustain our community.”


What is the Eternal September problem, and why are we living it again?

In September 1993, AOL connected millions of new users to Usenet permanently — new, norm-unaware users kept arriving at scale without the infrastructure to absorb them. It became the September that never ended.

GitHub applied this metaphor to the current AI PR surge in February 2026: a continuous inflow of low-context contributions that existing norms and tooling were not designed to handle. Unlike the original surge — a one-time event — AI tool adoption keeps accelerating.

Steve Ruiz sharpened the metaphor with “Eternal Sloptember.” AOL users were human newcomers who could eventually learn norms. AI-generated contributions have no learning mechanism. A rejected AI PR provides no feedback that prevents an identical one tomorrow. The volume compounds without correction.

Usenet did not recover its pre-surge character. The individual project responses — tldraw’s auto-close, LLVM’s policy, Ghostty’s invitation-only model — address the symptoms. The structural shift in contribution economics is what they are responding to.


Push vs. pull: what would a healthier contribution model look like?

When contribution generation cost approaches zero, the push model becomes a mechanism for imposing attention cost on maintainers at zero cost to themselves. The maintainer cannot refuse receipt.

The pull-based model inverts the flow: a maintainer identifies a need, opens a discussion, and invites code after agreement on problem and approach. Friction reappears at the social layer.

Real projects have implemented this. Ghostty moved to invitation-only contributions. LLVM’s issues-first guidance effectively implements a pull model at the policy layer. Mitchell Hashimoto’s Vouch project implements trust management where contributors must be vouched for before participating. Steve Ruiz narrowed tldraw’s community contribution to reporting, discussion, and perspective — the parts where human engagement still has value.

The tradeoffs are real. A pull model requires more maintainer time upfront, reduces serendipitous contributions, and may deter valid contributions from new contributors. But it changes where friction is positioned so that costs and benefits stay aligned.

For platform-level changes underway, GitHub has already shipped repo-level pull request controls — and is exploring criteria-based gating that requires a linked issue before a PR can be opened.
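
Until platform-level gating ships, a project can approximate the linked-issue requirement in its own CI. A minimal sketch — the regex covers common GitHub closing keywords and is an assumption about one reasonable implementation, not GitHub’s actual gating logic:

```python
import re

# Common GitHub issue-linking phrasings (closes/fixes/resolves #N).
# An approximation for CI use, not GitHub's forthcoming gating feature.
LINKED_ISSUE = re.compile(
    r"\b(close[sd]?|fix(es|ed)?|resolve[sd]?)\s+#\d+", re.IGNORECASE
)

def has_linked_issue(pr_body: str) -> bool:
    """Return True if the PR body references an issue with a closing keyword."""
    return bool(LINKED_ISSUE.search(pr_body))
```

A CI job running this against the PR description can fail fast with a pointer to the triage process, repositioning the friction before review time is spent.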


Where does this leave the maintainers bearing the cost?

The cost asymmetry is not self-correcting. The gap between contribution generation cost (trending toward zero) and review cost (fixed) will widen as AI tools become more capable.

The high-profile cases — curl, tldraw, Ghostty — represent a small fraction of the real distribution. The problem is distributed across thousands of less-visible projects maintained by people without the platform to make their situation public. Understanding how AI-generated contribution pressure propagates through your dependency stack is the starting point for treating this as a supply-chain risk question rather than a community etiquette one.

RedMonk analysed the generative AI policies of 77 open source organisations and found stances falling into three camps — permissive, ban, or undecided — with no consensus standard. Each project has developed its own approach independently.

The next escalation layer extends beyond individual PRs. Cloudflare rebuilt Next.js with AI in approximately one week — what the Evilginx author Kuba calls “slop-forking.” If AI can regenerate a codebase with minimal attribution, the economic case for maintaining permissive licences weakens. The legal frameworks for AI-mediated derivation under MIT or Apache 2.0 are not settled.

If your team consumes open-source dependencies, this is not a spectator problem. The libraries your products depend on are maintained by people operating under exactly these pressures. Rising issue response times, maintainers posting about unsustainable workloads, policy changes like auto-close or invitation-only models — these are the signals to watch.

For the full risk picture across the open-source supply chain and how OSS communities are responding with governance policies, the structural question is the same: who bears the cost, and for how long. For a complete open-source supply chain risk overview covering all six risk dimensions — from this cost asymmetry mechanism through to dependency risk management and contributing back — see the full guide.


FAQ

Is AI-assisted coding the same as vibe coding?

No. AI-assisted coding uses AI as a tool within a workflow the developer understands and controls — they review, validate, and are accountable for the output. Vibe coding uses AI to generate code without deep understanding of what it produces or the codebase it targets.

LLVM’s human-in-the-loop policy formalises the line: contributors must read and review all AI-generated content before asking others to review it. Steve Ruiz’s position: “If you know the codebase and know what you’re doing, writing great code has never been easier.” The distinction is contributor knowledge, not tool use.

Can AI PRs ever be good for a project?

Yes — but the bar is higher than it sounds. The difference between high-quality AI-assisted research and AI slop is expertise. AISLE found 12 zero-days in OpenSSL using AI-powered analysis tools. Joshua Rogers’ AI-assisted security research on curl produced around 50 merged fixes. Both used AI to amplify deep knowledge, not replace it.

Using “can AI PRs be good” to sidestep the harder structural question misses the point. Even if some AI PRs are good, the aggregate cost of processing the volume of bad and mediocre ones imposes net harm on the project.

What counts as a good-faith AI contribution?

Good faith alone is not sufficient. LLVM’s policy is the most explicit standard available: contributors must personally review and understand every line of AI-generated content and answer questions about it during review. The submission must represent the contributor’s own work, and tool usage should be noted in the PR description or commit message.

In practice: deep familiarity with the project, prior issue-tracker engagement, a problem statement discussed with maintainers, and personal accountability.

Why can’t maintainers just ignore low-quality PRs?

Ignoring a PR is not neutral. An open PR signals ongoing consideration; an ignored one creates ambiguity about project responsiveness. Ignoring also requires assessment — reading enough to know a PR can safely be ignored is a subset of full review cost, not zero. tldraw found the volume made “just ignore them” unsustainable — auto-closing the entire class was cheaper than triaging individually.

What is “death by a thousand slops”?

A phrase coined by Daniel Stenberg describing the cumulative effect of individually small but collectively overwhelming AI-generated submissions. Each report might consume 10–15 minutes — manageable in isolation. Eight such reports in a week, each engaging 3–4 team members for up to several hours, on a team where members have only a few hours per week for curl, is not.

What is the “push-based contribution model” and why is it a vulnerability?

The push-based model is GitHub’s standard PR paradigm: any external actor can fork a repository and submit a PR without prior permission or relationship. Before GitHub, contributing required mailing lists and patch formats — barriers that filtered for engagement. When generation cost drops to zero, this model becomes a one-way attention extraction mechanism.

What is “slop-forking” and why should your team pay attention?

Slop-forking is using AI to consume an entire open-source codebase and produce a derivative product. Cloudflare rebuilt Next.js with AI in approximately one week — the most prominent current example. The unresolved question: if AI regenerates source code entirely, does the original open-source licence still apply? If the legal frameworks do not constrain this, the incentive to maintain permissive open-source projects weakens for everyone.

Are maintainers overreacting to AI pull requests?

No. The reactions are proportionate to documented volume increases. And even if AI-generated code quality improves to human parity, the generation cost approaches zero while the review cost remains fixed. The structural imbalance does not require quality to be an issue.

How does this affect organisations that consume open-source dependencies?

Directly. A maintainer under the pressure Stenberg’s team was experiencing is an abandonment risk, and abandoned dependencies accumulate technical debt and security exposure. Watch for rising issue response times, maintainers posting about unsustainable workloads, and policy changes like auto-close or invitation-only models. Require human-in-the-loop review before any AI-assisted PR goes to an external project.

What is the “Eternal September” problem in open source?

In September 1993, AOL connected mainstream internet users to Usenet, permanently overwhelming community norms with norm-unaware newcomers. The community never recovered. GitHub applied the metaphor to the current AI PR surge in February 2026. What makes the AI version sharper: unlike AOL users who could learn norms over time, AI-generated contributions have no learning mechanism. A rejected AI PR provides no feedback that prevents an identical one tomorrow. And unlike the original surge, AI tool adoption keeps accelerating.

What AI Governance Actually Requires and Why Most Policies Fall Short

Most organisations treat AI governance as a documentation problem. They write a policy, circulate it, and consider the work done. The gap between what the policy says and what AI systems are actually doing in production is where risk accumulates. McKinsey’s 2025 State of AI survey found nearly nine in ten organisations are using AI regularly, yet most have not begun scaling it with mature governance in place.

This guide maps the full governance terrain: from diagnosing shadow AI, through building an operating model and assigning accountability, to enforcing rules at runtime and satisfying regulators. Each section links to a detailed article.

In This Series

Why do most AI governance policies fail to actually control risk?

Most AI policies fail because they describe intent but cannot enforce controls. A policy tells employees what they should do — it cannot stop an agent from accessing data it should not touch, or catch a model drifting toward outputs that no longer match approved behaviour. Static policy applied to dynamic systems is an architectural failure, not a compliance gap that more documentation resolves.

Three blind spots compound the gap: visibility (you do not know all the AI tools in your environment), ownership (systems with no assigned human owner cannot be governed), and decision authority (it is unclear who can stop an AI system when something goes wrong). 99% of organisations report financial losses from AI-related risks, with 64% exceeding $1 million (EY, 2025). Agentic AI makes this larger still — a policy reviewed before deployment cannot anticipate what an autonomous agent will do six months later. Once you accept governance requires infrastructure, the first question is: what do you not know about?

What is shadow AI and why is it the biggest governance gap in most organisations?

Shadow AI is any AI tool, model, or agent operating in your organisation without formal approval, oversight, or a defined owner. Your policy does not apply to systems you do not know about — and AI tools carry more risk than traditional shadow IT because they act on data, produce outputs used in decisions, and take autonomous actions in production systems. The governance consequence is the same regardless of scale.

Reco.ai’s 2025 State of Shadow AI Report found 71% of office workers admit to using AI tools without IT department approval. The “Bring Your Own Agent” pattern makes things worse — as ArmorCode’s Nikhil Gupta puts it, employees “need 20 minutes and a credit card” to deploy an autonomous agent with no owner and no approval record. You need a continuously updated inventory before any governance structure can take effect — and that inventory is the foundation for building a model that governs what you find.

What does an AI operating model actually require?

An AI operating model defines who approves AI systems, who owns them in production, what controls apply at each risk level, and how governance is enforced — not merely documented. It integrates people, processes, and technology so that governance is embedded into how AI is adopted and operated, rather than retrofitted after systems are already running without oversight.

Most organisations have policies that express intent. An operating model translates that intent into repeatable decisions and enforceable controls. McKinsey found fewer than 10% of AI use cases make it out of pilot mode — the operating model gap, not the technology gap, is the primary constraint. The model needs to be proportionate: rigorous enough to catch high-risk AI, lightweight enough not to slow down teams using low-risk tools. With the model in place, the next question is: who owns the decisions these systems make?

Who is accountable when enterprise AI causes a business mistake?

In most organisations, the answer is unclear — and that ambiguity is itself a governance failure. Accountability requires defined ownership at three levels: who owns the AI system, who owns the decision it informed, and who has authority to stop the system when something goes wrong. Without explicit assignment, accountability defaults to no one — which means no one is monitoring, and no one acts when a problem surfaces.

The Air Canada chatbot case showed what this looks like — a tribunal held the airline liable when its AI provided outdated fare information. Three structures matter: a governance lead with cross-functional authority, clear business-unit ownership per AI application, and defined stop authority — the right to suspend or roll back a system without multi-team approval. Accountability also determines what happens at runtime, because someone has to own the enforcement layer.

What is runtime AI governance and how is it different from policy governance?

Runtime AI governance means enforcing controls at the moment an AI agent acts — in production, in real time — rather than through policy review before deployment or audit after the fact. It includes prompt firewalling, identity and least-privilege enforcement, behavioural monitoring, egress controls, and continuous audit-trail generation. Policy governance describes what should happen; runtime governance enforces what does happen, against live system behaviour.

The distinction matters most for agentic AI. Agents fail differently from traditional software: a broken API call throws an exception, but an agent reasoning failure produces confident, plausible output that is wrong — no error, no alert, no log entry. In multi-agent workflows, bad output becomes the next agent’s input. Yet only 48% of organisations monitor their production AI systems for accuracy, drift, and misuse (Gradient Flow, 2025). Without a continuous record of what agents are doing, there is no enforcement surface — which raises the question of how you detect what is running outside your governance entirely.
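
In miniature, a runtime enforcement point is a least-privilege check that also writes an audit-trail entry for every attempted action. A sketch — the agent names, tool names, and permission table below are all hypothetical:

```python
import json
import time

# Hypothetical allow-list: agent identity -> tools it may invoke.
PERMISSIONS = {"billing-agent": {"read_invoice", "send_summary"}}
AUDIT_LOG: list[dict] = []

def enforce(agent: str, tool: str, args: dict) -> bool:
    """Least-privilege check plus audit-trail entry for one agent action.

    Every attempt is logged, allowed or not, so the trail records what
    agents tried to do -- not just what succeeded.
    """
    allowed = tool in PERMISSIONS.get(agent, set())
    AUDIT_LOG.append({
        "ts": time.time(),
        "agent": agent,
        "tool": tool,
        "args": json.dumps(args),
        "allowed": allowed,
    })
    return allowed
```

The design point is that the check and the log entry are one operation: a denial that leaves no record is invisible to governance, which is precisely the failure mode described above.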

How do you detect shadow AI and create sanctioned pathways that employees will actually use?

Shadow AI detection requires two things operating in parallel: technical discovery (scanning network traffic, SaaS usage logs, development environments, and software supply chains for undeclared AI) and a sanctioned pathway that makes the approved route faster and less friction-heavy than going around it. Detection without a viable alternative drives shadow AI underground rather than eliminating it from your governance surface.

Passive shadow AI — employees using unauthorised apps — is findable through SaaS usage monitoring. Active shadow AI — agents deployed without IT knowledge, MCP servers introduced by individual developers — requires deeper supply chain scanning. Reco.ai found that shadow AI tools become entrenched, with some running for over 400 days before detection. Blanket blocking just drives usage underground. Sanctioned pathways — a fast-track approval process, an approved tool catalogue, self-service provisioning — give employees a governed alternative that does not impede their work. Once that equilibrium exists, you need a way to tell whether the programme is actually reducing risk.
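
Technical discovery of the passive variety can start as simply as matching egress logs against a catalogue of known AI service domains. A minimal sketch — the domain list and log shape are assumptions; a real programme would draw on proxy or CASB data and a maintained catalogue:

```python
# Hypothetical catalogue of AI service endpoints to watch for.
AI_DOMAINS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

def flag_shadow_ai(log_entries: list[dict]) -> dict[str, set[str]]:
    """Map each AI service domain seen in egress logs to the users hitting it."""
    sightings: dict[str, set[str]] = {}
    for entry in log_entries:
        if entry["domain"] in AI_DOMAINS:
            sightings.setdefault(entry["domain"], set()).add(entry["user"])
    return sightings
```

The output feeds the inventory, not a block list: each sighting is a candidate for a sanctioned-pathway conversation rather than an automatic ban.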

How do you know whether your AI governance programme is actually working?

Most organisations are measuring the wrong things. Usage counts — number of approved AI tools, number of employees trained — describe activity, not outcomes. The metrics that indicate governance health are different: shadow AI coverage rate, policy violation rate in production, time from incident detection to resolution, and the ratio of sanctioned to unsanctioned AI usage over time.
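
Two of those outcome metrics are straightforward to compute once the underlying counts exist. A sketch — the input counts are whatever your inventory and monitoring pipelines produce, and the function names here are illustrative:

```python
def governance_metrics(inventoried: int, discovered_total: int,
                       sanctioned_events: int, unsanctioned_events: int) -> dict:
    """Compute two outcome metrics named above.

    shadow_ai_coverage: fraction of discovered AI systems that are in the
    governed inventory. sanctioned_ratio: fraction of observed AI usage
    events that went through a sanctioned tool.
    """
    usage_total = sanctioned_events + unsanctioned_events
    return {
        "shadow_ai_coverage":
            inventoried / discovered_total if discovered_total else 1.0,
        "sanctioned_ratio":
            sanctioned_events / usage_total if usage_total else 1.0,
    }
```

Tracked over time, both numbers should trend toward 1.0; a flat or falling trend is the signal that the programme is producing documentation rather than control.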

Every ungoverned system accumulates governance debt — risk that surfaces at the worst possible moment. As David Talby of John Snow Labs puts it: “Organisations without auditable oversight across AI systems will face higher costs, whether through fines, forced system withdrawals, reputational damage, or legal fees.” The “say-do ratio” — how often AI systems behave consistently with the policies written for them — is a useful diagnostic. Only 30% have deployed generative AI to production with documented governance (Gradient Flow, 2025). Proactive measurement provides the evidence base regulators are starting to require.

What do the EU AI Act, NIST AI RMF, and ISO 42001 actually require your organisation to do?

The three frameworks converge on the same core requirements: know what AI systems you are operating, classify them by risk, assign ownership and accountability, implement proportionate controls, generate evidence that governance was applied, and monitor AI behaviour after deployment. The EU AI Act makes these requirements binding for high-risk AI. ISO 42001 makes them certifiable. NIST AI RMF structures them as voluntary operational practice.

The EU AI Act enters general application on August 2, 2026. High-risk systems must comply with conformity assessment, documentation, human oversight, and post-market monitoring. Penalties reach 35 million euros or 7% of global turnover, and the Act applies regardless of where you are incorporated if your AI affects people in the EU. This is not only a European concern — Colorado’s AI Act takes effect June 30, 2026, and California and Texas have passed their own requirements. Cross-framework mapping avoids duplicating effort: an AI inventory satisfies EU AI Act registration, NIST AI RMF’s Map function, and ISO 42001 clause 8.4 simultaneously.

Resource Hub: AI Governance Library

Understanding the Governance Gap

Building Governance Infrastructure

Measuring and Reporting Governance Health

Frequently Asked Questions

What is the difference between an AI policy and AI governance that actually works?

An AI policy is a document that describes what your organisation intends. AI governance that works is the infrastructure — operating model, accountability structures, runtime enforcement, and measurement — that makes those intentions enforceable at scale. Most organisations have the former; few have the latter. The gap between them is where the risk accumulates.

Do I need to comply with the EU AI Act if my company is based outside Europe?

If your AI systems affect people in the EU — including SaaS products that make consequential decisions for EU customers — the EU AI Act applies regardless of where you are incorporated. Most companies will need at least a basic compliance assessment before August 2026.

What is compliance theatre in AI governance?

Governance activities that produce the appearance of control without the substance: annual AI policy sign-offs, usage surveys without enforcement, governance committees with no authority to stop a deployment. These are programmes that satisfy auditors on paper but would not survive a real incident inquiry.

How many AI tools is the average enterprise running without IT approval?

More than most organisations expect. The 2025 State of Shadow AI Report found smaller companies are hit hardest — those with 11 to 50 employees averaged 269 unsanctioned AI tools per 1,000 employees, and some tools ran for over a year before detection. If you have not actively inventoried your AI usage, do not assume the answer is close to zero.

What is an AI operating model — do I need one at 200 employees?

An AI operating model is the system that governs how AI is adopted and managed in your organisation. At 200 employees you almost certainly need one — the question is how lightweight it can be while still addressing genuine risks. At minimum: an inventory process, a risk classification, defined approval pathways, and ownership assignment for every AI system in production.

Can AI governance and developer productivity coexist?

Yes — but only when governance is designed as an enablement function rather than a gatekeeping function. IBM compressed its AI project approval process from weeks to five minutes by embedding compliance checks directly into the provisioning platform. Governance that gives developers a safe, fast lane reduces shadow AI without reducing output.

What the EU AI Act, NIST, and ISO 42001 Actually Require Organisations to Do

Boards and legal teams are asking harder questions about AI. The internal risk argument — “something might go wrong” — stopped moving budgets a while ago. What cuts through is external obligation: “we are legally required to act, and the penalties are specific.” Three frameworks now give you exactly that external obligation. The EU AI Act (Regulation (EU) 2024/1689) is binding law with extraterritorial reach. The NIST AI Risk Management Framework is a voluntary US standard that has quietly become a global reference point. ISO/IEC 42001 is a certifiable international management system standard you can implement right now. This article is part of our series on what enterprise AI governance actually requires in practice — it translates each framework into CTO-level action items, shows where they overlap, and demonstrates that one governance investment satisfies all three. The underlying reality — “polyrisk” — is that regulatory, reputational, operational, and legal AI risks compound each other. These frameworks exist because that compounding is real.

What Is the Difference Between AI Governance and AI Compliance?

AI governance is your internal management system — how your organisation assigns accountability, makes decisions about AI, and enforces controls day to day. AI compliance is the external demonstration of that management: evidence presented to regulators, auditors, or customers that governance exists and functions.

Both need the same underlying structures: named roles, documented processes, risk registers, and incident response authority. Without genuine governance structures underneath, compliance artefacts cannot be produced — and any evidence you present to regulators won’t hold up under scrutiny.

The EU AI Act, NIST AI RMF, and ISO/IEC 42001 describe the same internal governance structures from different angles — law and enforcement, operational risk management, and certifiable management system respectively. For a board presentation, here’s how to frame it: governance is the operating model investment; compliance is the return. The polyrisk concept makes this concrete — a regulatory breach triggers reputational damage, which triggers customer churn, which triggers legal exposure. One governance programme, including how ISO 42001 blueprints an AI operating model, addresses all of it.

What Does the EU AI Act Actually Require of Companies Deploying AI — Not Just Building It?

Here’s the assumption most SaaS companies make: “We use third-party AI APIs, so we’re not covered.” That assumption doesn’t hold. The EU AI Act distinguishes between providers and deployers. A provider develops and places an AI system on the market under their own name. A deployer uses a third-party AI system under their own authority. A SaaS company embedding OpenAI or Anthropic into its product is likely acting as both simultaneously — deployer of the base model and provider of the combined product.

The enforcement timeline is already partially in effect. AI literacy obligations (Article 4) became applicable in February 2025. GPAI model rules entered into force in August 2025. High-risk AI system obligations become fully enforceable in August 2026.

High-risk AI systems are defined in Annex III across eight sectors — employment and workforce management (hiring and performance tools), biometric identification, essential services, and education among them. If your product touches any of these, the full obligation set applies: risk management system, technical documentation, human oversight, data governance, and conformity assessment.

The penalty numbers frame the board conversation nicely. Fines reach €35M or 7% of global turnover for prohibited AI practices; up to €15M or 3% for high-risk violations. For a 200-person SaaS company with €15M ARR, 3% is €450,000. That’s a material number.

Shadow AI is where many teams carry exposure they have never accounted for. Engineering teams shipping internal LLM plug-ins to external users may be acting as GPAI providers — and the August 2025 GPAI enforcement deadline has passed. The starting point is clarifying how EU AI Act requirements translate to accountability structures for each role your organisation actually occupies.

What Does the NIST AI Risk Management Framework Say About Accountability and Operating Models?

The NIST AI RMF is a voluntary US framework — no legal force. But it has become a de facto global reference standard adopted across jurisdictions, and the structures it describes are the same structures the EU AI Act requires.

The framework is built around four functions: Govern (establish accountability structures, policies, and AI risk culture); Map (understand your AI systems and their context); Measure (analyse and monitor risks, performance, and bias); Manage (prioritise and respond to risks in your workflows).

The Govern function is where the real accountability work lives: policies, roles with decision authority, risk oversight processes. Stop authority — the formally assigned right to halt an AI system in production — is the operational output of Govern. NIST requires a named accountable owner for each governance decision. That accountability maps directly to the structures in how ISO 42001 blueprints an AI operating model.

There’s a practical efficiency worth knowing about here. NIST has published a crosswalk to ISO 42001 clauses. Risk assessments using NIST guidance serve as direct evidence for ISO 42001 audits — one set of work, two frameworks served. The AI governance execution requirements these frameworks share — inventory, accountability assignment, monitoring — map directly to the operational practice the series covers.

What Does ISO/IEC 42001 Actually Require, and Does Your Company Need to Care?

ISO/IEC 42001 is the management system layer that ties the whole programme together. It is the first global AI Management System (AIMS) standard — structured like ISO 27001 for information security, which is familiar territory for most technical leadership teams. Published in December 2023, it defines requirements for establishing, implementing, and continually improving a formal AI management system.

In practical terms: define the scope of your AI activities; appoint responsible roles with defined authorities; conduct risk and impact assessments across the AI lifecycle; maintain an AI system register; establish continual improvement processes.

Certification is voluntary. ISO 42001 is not yet harmonised under the EU AI Act and does not confer automatic presumption of conformity. But what you build when you implement it is the documented governance infrastructure EU AI Act compliance requires. Certification provides third-party verification for auditors and enterprise customers.

For a 50–500 person company without certification intent, ISO 42001 gives you a ready-made blueprint that’s faster to adapt than to build from scratch. The key clause mappings are worth knowing: Clause 6.1 (risk and impact assessment) maps to EU AI Act Article 9; Clause 5 (leadership) maps to the accountability structures required by NIST Govern function; Clause 7.2 (competence) addresses the Article 4 AI literacy obligation.
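The three clause mappings above lend themselves to a lookup table that a compliance tracker could use to mark one evidence artefact against several frameworks at once. This is a minimal sketch: the key and value naming scheme is invented for illustration, and only the three entries stated in this article are included — nothing beyond them is guessed.

```python
# Clause mappings named in this article, encoded as a crosswalk table.
# Key/value labels are illustrative, not official framework identifiers.
CROSSWALK = {
    "ISO42001:6.1": ["EU-AI-Act:Art.9 (risk management)"],
    "ISO42001:5": ["NIST-AI-RMF:Govern (accountability)"],
    "ISO42001:7.2": ["EU-AI-Act:Art.4 (AI literacy)"],
}

def frameworks_covered(artefact_clauses):
    """Return every external requirement one evidence artefact covers."""
    return sorted({req for clause in artefact_clauses
                   for req in CROSSWALK.get(clause, [])})
```

In practice this is the "one set of work, two frameworks served" point made earlier: tagging a single risk-assessment document with `ISO42001:6.1` also records its EU AI Act Article 9 coverage.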

How Do You Satisfy Multiple Frameworks Without Duplicating Effort?

The EU AI Act, NIST AI RMF, and ISO 42001 share a common structural core: risk identification, accountability assignment, documented controls, and ongoing monitoring. A single governance programme implemented against one framework largely satisfies the other two as well.

Risk-tiered governance is the synthesis: lightweight checks for low-risk AI systems; rigorous documentation, human oversight, and conformity assessment for Annex III high-risk systems. This mirrors the EU AI Act’s risk tiers, ISO 42001’s risk-based approach, and NIST AI RMF’s Map function all at once.
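The risk-tiered triage described above can be sketched as a classification function. This is illustrative only: the sector strings follow this article's Annex III summary, the tier labels are simplified (the Act also has a limited-risk transparency tier), and nothing here constitutes a legal determination.

```python
# Sectors as summarised in this article's Annex III list.
ANNEX_III_SECTORS = {
    "biometric identification", "critical infrastructure", "education",
    "employment and workforce management", "essential services",
    "law enforcement", "migration", "administration of justice",
}

def governance_tier(sector: str) -> str:
    """Map an AI system's sector to the depth of governance it receives."""
    if sector in ANNEX_III_SECTORS:
        # Full obligation set: documentation, human oversight, conformity.
        return "high-risk"
    # Everything else gets lightweight checks in this simplified model.
    return "minimal-risk"
```

The point of encoding the tiers is that the same function output can drive all three frameworks' review depth, rather than maintaining three separate classification schemes.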

Here’s how the crosswalk works in practice. One AI system inventory satisfies EU AI Act registration requirements, ISO 42001 Clause 8.4, and NIST AI RMF Map function simultaneously. One risk assessment document serves all three. The internal AI working group — legal, engineering, product, compliance — is the organisational structure that both ISO 42001 Clause 5 and the EU AI Act’s deployer obligations require. Build it once. The sequenced path is straightforward: AI system inventory → provider vs. deployer classification → risk tier classification → accountability role assignment → monitoring.
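The sequenced path above maps naturally onto a register entry per AI system. This is a hypothetical sketch: the field names and example entries are invented for illustration and are not mandated by any clause of the three frameworks.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical register entry; one field per step of the sequenced path.
@dataclass
class AISystemRecord:
    name: str                                          # 1. inventory
    role: Literal["provider", "deployer", "both"]      # 2. classification
    risk_tier: Literal["high", "limited", "minimal"]   # 3. risk tier
    accountable_owner: str                             # 4. named owner
    owner_has_stop_authority: bool                     #    can they halt it?
    review_cadence_days: int                           # 5. monitoring

register = [
    AISystemRecord("cv-screening-assistant", "deployer", "high",
                   "vp-engineering", True, 30),
    AISystemRecord("support-chat-summariser", "both", "minimal",
                   "head-of-support", True, 90),
]

# One invariant worth enforcing: every high-risk entry has an owner
# who can halt the system without escalation.
assert all(r.owner_has_stop_authority for r in register
           if r.risk_tier == "high")
```

Built once, a register in this shape is the single artefact that the inventory, accountability, and monitoring requirements of all three frameworks draw on.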

The stop authority test makes for a sharp board presentation. Ask the room: “Who in this organisation has formal authority to halt an AI system in production right now if it is causing harm?” If no one can answer in ten seconds, your organisation simultaneously falls short of the NIST AI RMF Govern function, the EU AI Act’s human oversight obligations, and ISO 42001 Clause 6.1. One question that exposes the shadow AI governance gap across all three frameworks at once.

How Do You Use Regulatory Requirements to Make the Case to Your Board?

Internal risk arguments have a ceiling. External obligation arguments land differently. Regulation provides the urgency that governance investment needs to clear board-level scrutiny.

The penalty exposure is concrete. For a 200-person company with €15M ARR, EU AI Act fines are material — €450K for high-risk violations, over €1M for prohibited AI practices. Add product launch delays for non-compliant products and the risk profile becomes a straightforward board conversation.

Market access is the strategic argument. EU market access increasingly requires demonstrated AI Act compliance. Enterprise procurement and client due diligence questionnaires already reference ISO 42001 and NIST AI RMF. Governance is becoming a commercial gate, not just a regulatory one.

The confidence gap is worth surfacing too. EY data shows 82% of executives believe their existing policies protect against unauthorised AI use; only 14.4% of organisations have full security approval for AI agent deployment (Gravitee). The gap between what executives believe and what’s actually in place is a governance liability your board is carrying without knowing it.

The investment framing is simple: bounded vs. unbounded. The cost of a minimum viable governance programme — AI inventory, role assignment, risk classification, accountability matrix — is bounded. The cost of a regulatory enforcement action or a reputational incident from an ungoverned high-risk AI system is not. For a complete overview of what enterprise AI governance actually requires in practice across all dimensions — operating model, accountability, runtime enforcement, and measurement — see the series overview.

FAQ

Does the EU AI Act apply to my company if we are based outside the EU?

Yes — if your AI systems affect users in the EU, the Act applies regardless of where you are incorporated. It’s the same extraterritorial logic as GDPR. US-headquartered SaaS companies with EU customers are in scope.

What is the difference between a provider and a deployer under the EU AI Act?

A provider develops and places an AI system on the market under their own name. A deployer uses a third-party system under their own authority. A SaaS company embedding a third-party LLM may simultaneously be both. Providers carry the heavier burden: technical documentation, conformity assessment, CE-marking for high-risk systems.

When do EU AI Act obligations become legally enforceable?

Prohibited AI systems banned and AI literacy (Article 4) applicable: February 2025. GPAI model rules in force: August 2025. High-risk AI system obligations fully enforceable: August 2026. All remaining systems: August 2027.

Is ISO/IEC 42001 certification mandatory for EU AI Act compliance?

No. ISO 42001 is voluntary and not a harmonised standard — it does not confer automatic presumption of conformity. But implementing its management system builds the governance infrastructure EU AI Act compliance requires. Certification provides third-party verification of that infrastructure.

Do I need to implement both NIST AI RMF and ISO 42001, or is one sufficient?

Neither is legally required for most growth-stage SaaS companies. Implementing ISO 42001 satisfies most NIST AI RMF guidance through shared structural requirements. For resource-constrained teams, ISO 42001 plus the NIST crosswalk is the most efficient path to multi-framework coverage.

What counts as a high-risk AI system under the EU AI Act?

Annex III defines eight sectors: biometric identification, critical infrastructure, education, employment/workforce management (including recruitment tools), essential services, law enforcement, migration, and administration of justice. AI systems used in these sectors for the specified purposes must meet the full high-risk obligation set.

What are the EU AI Act penalties for non-compliance?

Prohibited AI practices: up to €35M or 7% of global annual turnover. High-risk system violations: up to €15M or 3%. Incorrect information to authorities: up to €7.5M or 1%. For SMEs and start-ups, the lower of the absolute or percentage figure applies.
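The lower-of rule for SMEs is simple arithmetic, and it is what produces the €450K and "over €1M" figures used earlier in this article. A minimal sketch of that calculation — note that treating the higher of the two figures as the ceiling for larger undertakings is an assumption drawn from the Act's penalties article, not something this FAQ states:

```python
def eu_ai_act_fine_cap(turnover_eur: int, abs_cap_eur: int,
                       pct_of_turnover: float, is_sme: bool) -> float:
    """Ceiling of a fine under the Act's dual cap.

    SMEs and start-ups face the lower of the absolute and percentage
    figures; larger undertakings are assumed here to face the higher.
    """
    pct_figure = turnover_eur * pct_of_turnover / 100
    if is_sme:
        return min(abs_cap_eur, pct_figure)
    return max(abs_cap_eur, pct_figure)

# Worked example from this article: 200-person SaaS company, EUR 15M ARR.
high_risk_cap = eu_ai_act_fine_cap(15_000_000, 15_000_000, 3, is_sme=True)
prohibited_cap = eu_ai_act_fine_cap(15_000_000, 35_000_000, 7, is_sme=True)
```

For that worked example the high-risk ceiling is 3% of turnover (€450K) and the prohibited-practice ceiling is 7% (€1.05M), because both percentage figures come in below the absolute caps.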

What is the minimum I need to do before the August 2026 EU AI Act deadline?

Build an AI system inventory first — identify all AI systems in use and classify each as provider, deployer, or both. Classify against the EU AI Act risk tiers. For Annex III high-risk systems, begin risk management documentation and human oversight design. The Article 4 AI literacy obligation has been in effect since February 2025. Each high-risk system needs a named accountable owner before August 2026.

What does it mean to give someone “stop authority” over an AI system?

Stop authority is the formally assigned right of a named individual to pause, halt, or roll back an AI system in production without escalation. EU AI Act Article 14 requires high-risk AI systems to be designed so that humans can effectively oversee them, including the ability to intervene or interrupt operation. Under NIST AI RMF and ISO 42001, stop authority is the operational test of whether governance is real — if no one has explicit halt authority, governance documents are theoretical.