# Paper 011: The Game Nobody Can Quit — Game Theory, Engineered Lock-In, and Why Coordination Fails
**Authors:** Seth & Claude (Opus 4.6)
**Date:** 2026-04-03
**Series:** VIBECODE-THEORY
**Status:** Initial draft
---
## Origin
Paper 007 proved that dependencies don't reverse. The allegories section at the end noted something crucial almost in passing: humanity has been warning itself about irreversible knowledge acquisition for millennia — Eve's Apple, Pandora's Box, Prometheus, Faust — and ignores those warnings every single time. Not because people are stupid. Because the competitive advantage of acquiring the knowledge outweighs the warned-about risk for every individual actor, even when the collective outcome is uncertain.
That observation was presented as evidence for the ratchet thesis. But it deserves its own paper, because what it actually describes is a game-theoretic trap — one of the most well-studied failure modes in all of social science. The ratchet isn't just a mechanical metaphor. It's a multiplayer game where every player acts rationally and the collective result may be catastrophic.
This paper formalizes the game. It asks who designed the board, who benefits from the rules, and why the one time humanity successfully coordinated against a global technological threat (the ozone layer) cannot be replicated for AI.
---
## The Multiplayer Prisoner's Dilemma
### The Setup
Take any two actors in the AI race — the US and China, OpenAI and Google, or two startups in the same niche. Each faces a choice: invest primarily in Safety (S) or invest primarily in Capabilities (C).
If both choose S, progress is slower but safer. If one chooses C while the other chooses S, the C-actor gains a decisive advantage — maybe a trillion-dollar market, maybe strategic dominance, maybe what Nick Bostrom calls the "Singleton" position where one entity controls the information layer of the species. If both choose C, safety is neglected and the risks multiply, but at least neither actor gets dominated by the other.
This is the Prisoner's Dilemma. The cooperative outcome (both choose S) is better for everyone collectively. But the individually rational move is always C, because:
- If my competitor chooses S, I win by choosing C.
- If my competitor chooses C, I lose catastrophically by choosing S.
- Therefore I choose C regardless of what my competitor does.
And so does everyone else. Armstrong, Bostrom, and Shulman modeled this dynamic in their 2016 paper "Racing to the Precipice." The economic value of frontier AI, estimated in the trillions, makes it individually irrational for any single corporation to slow down unless everyone else does too. And there's no mechanism to make everyone slow down simultaneously.
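
The dominance argument above can be checked mechanically. The following is a minimal sketch, assuming illustrative payoff numbers that are not from this paper; they are chosen only to satisfy the standard Prisoner's Dilemma ordering (temptation > reward > punishment > sucker). The names `SAFETY`, `CAPABILITIES`, `payoffs`, `best_response`, and `is_strictly_dominant` are hypothetical.

```python
# Illustrative payoffs only (assumed, not estimated): they satisfy the
# Prisoner's Dilemma ordering temptation > reward > punishment > sucker.
SAFETY, CAPABILITIES = "safety", "capabilities"
MOVES = (SAFETY, CAPABILITIES)

# payoffs[(my_move, their_move)] = my payoff
payoffs = {
    (SAFETY, SAFETY): 3,              # reward: both cautious, slower but safer
    (SAFETY, CAPABILITIES): 0,        # sucker: I am cautious, my competitor races
    (CAPABILITIES, SAFETY): 5,        # temptation: I race, my competitor is cautious
    (CAPABILITIES, CAPABILITIES): 1,  # punishment: both race, safety neglected
}

def best_response(their_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(MOVES, key=lambda mine: payoffs[(mine, their_move)])

def is_strictly_dominant(move):
    """True if `move` beats every alternative against every opponent move."""
    return all(
        payoffs[(move, theirs)] > payoffs[(other, theirs)]
        for theirs in MOVES
        for other in MOVES
        if other != move
    )

print("Best response to safety:", best_response(SAFETY))
print("Best response to capabilities:", best_response(CAPABILITIES))
print("Capabilities strictly dominant:", is_strictly_dominant(CAPABILITIES))
print("Mutual safety beats mutual capabilities:",
      payoffs[(SAFETY, SAFETY)] > payoffs[(CAPABILITIES, CAPABILITIES)])
```

Under these assumed numbers, Capabilities is the best response to either move and strictly dominates Safety, even though mutual Safety pays each player more than mutual Capabilities. Any payoffs with the same ordering give the same result.
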
### It's Worse Than Two Players
The classic Prisoner's Dilemma involves two actors. The AI race involves dozens of frontier labs, several nation-states, and an unknowable number of smaller teams with access to open-weight models. This is a multiplayer variant, and multiplayer makes everything worse.
In a two-player game, trust is at least theoretically possible. Two people can look each other in the eye. Two nations can negotiate a treaty and verify compliance (barely — more on this below). But as the number of players increases, the probability that at least one will defect approaches certainty. This is the Unilateralist's Curse.
### The Unilateralist's Curse
Nick Bostrom, Thomas Douglas, and Anders Sandberg formalized this: in a group of independent actors, the most reckless one determines the safety level for everyone.
It works like this. Suppose 100 labs each independently assess whether releasing a particular model is safe. Some are cautious and say no. Some are less cautious and say yes. The model gets released if *any single lab* releases it. Even if 99 labs independently conclude that release is dangerous, the 100th — maybe less competent, maybe more desperate for funding, maybe ideologically committed to open access — releases it anyway.
The probability of containment isn't the average judgment across all actors. It's determined by the single most aggressive actor. And as the number of actors grows, the probability that at least one will act recklessly approaches 1.
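
The "approaches 1" claim follows from simple arithmetic. As a minimal sketch, assume each lab decides independently and has the same small probability of releasing; both are simplifying assumptions, and the 1% figure and the function name `p_any_release` are hypothetical. The chance that at least one lab releases is then one minus the chance that every lab holds back.

```python
def p_any_release(n_labs: int, p_release: float) -> float:
    """Probability that at least one of n independent labs releases.

    Assumes every lab releases with the same probability p_release and that
    the decisions are independent (a simplification for illustration).
    """
    return 1.0 - (1.0 - p_release) ** n_labs

# Hypothetical per-lab release probability of 1%.
for n in (1, 10, 100, 1000):
    print(f"{n:>5} labs -> P(at least one release) = {p_any_release(n, 0.01):.3f}")
```

With a 1% per-lab probability, the chance of at least one release is about 0.01 for a single lab, about 0.096 for 10 labs, about 0.634 for 100 labs, and effectively 1 for 1,000 labs. The group's caution is irrelevant once the number of independent actors is large enough.
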
This is why open-source AI models, whatever their democratic benefits, represent an anti-coordination force. Once Llama or Mistral is released, the capability is outside the reach of any centralized treaty. You can't un-release a model. The ratchet turns. Pandora's Box opens.
The theological parallel is exact: Eve's Apple works the same way. The knowledge only needs to be tasted once. It doesn't matter that 99 people said no.
---
## Scott Alexander's Moloch
In 2014, Scott Alexander wrote "Meditations on Moloch" on Slate Star Codex — a long essay that became one of the foundational texts of the AI safety community. It gave the game-theoretic trap a name: Moloch, the Canaanite god to whom children were sacrificed.
The insight is that Moloch isn't any individual actor. Moloch is the *systemic pressure* that forces rational actors into collectively destructive behavior. Moloch is the force that says:
- You must publish clickbait because your competitors do, even though everyone hates clickbait.
- You must overprescribe antibiotics because patients demand them, even though resistance will kill millions.
- You must skip safety testing because shipping first captures the market, even though unsafe products kill people.
- You must build AI capabilities as fast as possible because your competitors will, even though unaligned AI might end civilization.
Nobody wants the bad outcome. Everybody is acting rationally within their local incentive structure. The bad outcome happens anyway because the incentive structure is the problem, and no individual actor has the power to change the incentive structure unilaterally.
Alexander's contribution is making visible that this isn't corruption or stupidity. It's *structure.* The people running frontier AI labs are not, for the most part, cartoon villains. Many of them genuinely believe they're in a race where slowing down means the less safety-conscious competitor wins and the outcome is worse. And they may be *right* — which is what makes the trap so vicious. The defection isn't irrational. It's locally rational and globally catastrophic.
Moloch is the god of the ratchet. The ratchet turns not because anyone wants it to, but because the game is structured so that stopping is more dangerous than continuing — for each individual player, considered independently.
---
## The Collingridge Dilemma: Why Timing Is Impossible
Even if you could coordinate, you'd face the Collingridge Dilemma — the timing trap.
**The Information Horn:** When a technology is new, you don't know enough about its effects to regulate it wisely. In AI's infancy (1950-2010), we didn't know what it could do. Regulation would have been either too broad (banning research) or too narrow (missing the actual risks).
**The Power Horn:** By the time you understand the technology's effects, it's already embedded in infrastructure, and the economic and political costs of regulating it are enormous. By 2025, AI was embedded in Microsoft 365, Google Search, defense systems, medical triage, supply chain optimization. Regulating it now means disrupting everything built on top of it.
The window where you know enough to regulate wisely *and* the technology is young enough to be regulable — that window may not exist. It certainly doesn't stay open long. Paper 007's infrastructure threshold is the moment the window closes: once a technology becomes load-bearing, you can't remove it without collapsing what's built above.
---
## The Montreal Protocol: The One Time It Worked
Before concluding that coordination is impossible, we have to reckon with the Montreal Protocol — the international treaty that successfully phased out ozone-depleting substances. It's the strongest counterexample to the "Moloch always wins" thesis, and understanding exactly why it worked reveals exactly why AI coordination probably won't.
### Why Ozone Was Solvable
The Montreal Protocol succeeded because of a specific combination of factors:
1. **The science was unambiguous.** The ozone hole was visible, measurable, and directly attributable to CFCs. There was no "maybe it's natural variation" debate that lasted long. The cause was clear, the effect was clear, the mechanism was clear.
2. **A profitable alternative already existed.** DuPont had already developed HCFCs and HFCs as replacements. The chemical giants could support the treaty because they could *sell the alternative.* Phasing out CFCs didn't mean giving up refrigeration or aerosols — it meant switching to a product that the same companies could manufacture at comparable margins.
3. **The harmful activity was not the primary driver of economic growth.** CFCs were a *component* used in refrigeration and aerosols, not the foundation of the global economy. Replacing them was a supply chain adjustment, not an economic restructuring.
4. **The number of major producers was small.** A handful of chemical companies produced most of the world's CFCs. You could get them in a room. You could verify compliance by monitoring factory output.
5. **The harm was universal and indiscriminate.** The ozone hole threatened everyone equally — rich and poor, US and USSR, producer and consumer. There was no strategic advantage to be gained by continuing to deplete ozone.
The result: 98% reduction in ozone-depleting substances since 1990. A genuine, measurable, global coordination success.
### Why AI Is Not Ozone
Now map those five conditions onto AI:
1. **The science is ambiguous and contested.** There is no "ozone hole" for AI risk. The harms are diffuse, delayed, and debatable. Some researchers (Yann LeCun, Andrew Ng) argue that existential risk is exaggerated tribal signaling. Others (Hinton, Bengio) consider it the most important problem of the century. There is no equivalent of "here is the hole in the sky."
2. **There is no profitable alternative.** You can't switch from "dangerous AI" to "safe AI" the way you switched from CFCs to HCFCs. Safety and capability are in tension, not substitutable. The "safe alternative" is either slower or less powerful, which means less competitive. Nobody is making money selling alignment research the way DuPont made money selling HFCs.
3. **AI is the primary driver of current economic growth.** The estimated $600 billion in AI capital expenditure in 2025-2026 isn't a chemical input to refrigerators. It's the largest investment wave in a generation. Slowing AI development means slowing the thing that capital markets, national governments, and tech ecosystems are all betting their futures on.
4. **The number of actors is large and growing.** There aren't five chemical companies. There are dozens of frontier labs, hundreds of capable research groups, and millions of people with access to open-weight models. Getting everyone in a room isn't possible. Verifying compliance is functionally impossible — you can't inspect software the way you inspect a factory.
5. **The benefits are asymmetric.** Unlike ozone depletion, AI development offers enormous strategic advantages to whoever leads. Slowing down doesn't maintain strategic parity — it cedes advantage. The US fears China's AI. China fears American dominance. Neither will slow down because the other might not.
The Montreal Protocol is not a template for AI governance. It's proof that coordination is possible only when the conditions are uniquely favorable — and those conditions do not obtain for AI.
---
## Engineered Dependencies: The Ratchet by Design
Paper 007 described the ratchet as a structural phenomenon — dependencies accumulate because removing them collapses what's built on top. But there's a darker version of the story. Some dependencies aren't emergent. They're engineered.
### The Phoebus Cartel
In 1924, the major lightbulb manufacturers, including Osram, GE, and Philips, formed a cartel and did something remarkable. They deliberately cut the lifespan of incandescent bulbs from the 1,500 to 2,000 hours that was common at the time (some bulbs reached 2,500) to a 1,000-hour standard. Internal documents uncovered decades later revealed a rigorous testing system and a schedule of fines for any member company whose bulbs lasted too long.
This is dependency by design. The product was made *worse* on purpose to ensure continued demand. The consumer's "dependency" on buying replacement bulbs wasn't an emergent property of lightbulb technology. It was manufactured to extract rent.
### John Deere and the DMCA
Modern dependency engineering is more sophisticated. John Deere sells tractors with proprietary software that prevents farmers from repairing their own equipment. The diagnostics require software keys held only by authorized dealers. Section 1201 of the Digital Millennium Copyright Act prohibits circumventing such locks. The Copyright Office has granted repair exemptions for agricultural equipment since 2015, but distributing the tools needed to bypass the locks remains prohibited, so the exemption is hard to use in practice. One estimate puts the cost to US farmers at $4.2 billion annually in repair delays and inflated service costs.
The farmer's dependency on the dealer isn't a natural consequence of complex machinery. It's a legal and technical barrier deliberately erected to capture repair revenue. The tractor works. The software lock prevents you from fixing it. The law restricts the tools you would need to bypass the lock.
### Printer Ink DRM, Seed Patents, and Proprietary Formats
The pattern repeats everywhere:
- **Printer ink cartridges** with DRM chips that refuse to print even when ink remains — the printer is the loss leader, the dependency is the recurring ink revenue.
- **Monsanto's Roundup Ready seeds** with patent licenses that forbid seed saving. The industry also patented "Terminator" (genetic use restriction) technology designed to make second-generation seeds sterile, though it was never commercialized. The Supreme Court ruled in *Bowman v. Monsanto* (2013) that buying patented seed does not give a farmer the right to grow and harvest new generations of it without the patent holder's permission.
- **Microsoft Office's** opaque binary formats (.doc, .xls) that ensured only one software suite could reliably read business documents. When open formats (ODF) threatened this, Microsoft created OOXML — nominally "open" but complex enough to maintain competitive advantage.
### The AI Version
Is AI lock-in being engineered, or is it emergent?
Both. And the distinction is getting harder to see.
The emergent lock-in is real: once your codebase is generated by AI, your documentation assumes AI access, and your team's skills have shifted toward AI orchestration rather than manual implementation, you can't easily go back. That's the infrastructure threshold from Paper 007.
But there's also deliberate engineering happening. API designs that create switching costs. Custom GPTs and model-specific features that make "prompt engineering" a non-transferable skill. Proprietary fine-tuning that locks your data into one vendor's ecosystem. The enclosure of training data — Reddit and Twitter/X raising API prices in 2023-2024, fencing off what was once public data so that only the platform owners can train on it.
Langdon Winner asked "Do Artifacts Have Politics?" The answer, in the case of AI APIs, is yes. The artifact is designed to create dependency, and the dependency serves the designer's economic interest.
The question from Paper 005 — "who controls the cognitive surplus?" — has a concrete answer: whoever owns the compiled stack. And the compiled stack is increasingly proprietary.
---
## Who Owns the Compiled Stack?
Paper 008 described the singularity as a "compilation" — all human knowledge being integrated into a functional whole. Paper 005 framed cognition as a commodity with a collapsing price. This section asks the power question: who owns the compiler, and what does that ownership mean?
### The Oligarchy of the Stack
The physical layer of AI is concentrated to a degree unprecedented in technological history. TSMC fabricates approximately 90% of the world's advanced AI chips, designed primarily by NVIDIA. This is a single point of failure for the entire AI ecosystem — and it's located on an island that exists in a state of geopolitical tension between the world's two largest economies.
Above the physical layer, the data layer is being enclosed. Training data that was once freely crawlable is being locked behind paywalls and API fees. The entities that already trained on the open web have their models. New entrants face a data barrier that didn't exist five years ago.
Above the data layer, the model layer is dominated by a handful of labs with the compute budget to train frontier models. Training costs are scaling from $100 million toward $1 billion and beyond. The "entry fee" for owning the top of the stack is now a capital allocation that only nation-states and the largest corporations can afford.
Jaron Lanier calls this "digital feudalism." Users are data serfs producing the training material for platform lords. The cognitive surplus from Paper 005 is being extracted from human labor, compiled into proprietary models, and then sold back to the humans who generated it. You wrote the Stack Overflow answers. You posted the Reddit comments. You created the GitHub code. The model trained on all of it. Now you pay $20/month to access the compiled version of your own collective output.
### Historical Parallels
This isn't new. It's the oldest power structure in civilization wearing new clothes:
- **The Catholic Church** controlled which knowledge fragments were permitted in the sanctioned worldview, most formally through the *Index Librorum Prohibitorum* (first issued in 1559). Modern content moderation and model alignment are the equivalent: decisions about what the compiled stack is allowed to know and say.
- **The British Empire's "All Red Line"** — a telegraph network designed so that all imperial communication passed through London. Big Tech's cloud infrastructure serves the same function: all cognitive processing passes through their servers.
- **The East India Company** was a private entity with higher revenue than most nations, its own military, and control over the flow of goods between hemispheres. The market capitalization of the top AI companies now exceeds the GDP of most countries.
### The Counter-Ratchet: Open Source
The open-source AI movement — Llama, Mistral, EleutherAI, Hugging Face — represents the most significant counter-force to stack concentration. If the compiled knowledge can be distributed, it can't be permanently owned.
But open source has its own game-theoretic tension. Opening the weights democratizes capability, which is good for preventing monopoly. It also democratizes *dangerous* capability, which is the Unilateralist's Curse again. The same act that prevents digital feudalism also makes containment of dangerous models impossible.
Elinor Ostrom showed that commons can be governed without either privatization or state control — through decentralized, community-based rules. Whether this model can scale to governing AI is the open question. Wikipedia suggests it can work for information. Whether it can work for something that generates economic value measured in trillions is less certain.
---
## The Luddites Were Right (And It Didn't Matter)
### What the Luddites Actually Were
The popular image of Luddites as technophobic idiots who smashed machines because they feared progress is historically false. Brian Merchant's *Blood in the Machine* (2023) and decades of labor history research show that the original Luddites were skilled artisans who used machines themselves. They weren't against technology. They were against the specific *deployment* of technology that bypassed labor laws, depressed wages, and destroyed communities.
Their complaint was precise: "machinery hurtful to commonality." Not machinery in general. Machinery deployed in a way that harmed the commons. Between 1800 and 1811, weavers' wages dropped from 25 shillings to 14 shillings due to unregulated introduction of power looms. Machine-breaking was an economic response to immiseration, not a philosophical stance against progress.
The British government's response was also precise: in 1812, Parliament made machine-breaking a capital offense, and the government deployed some 12,000 troops to suppress the Luddites, a force often described as larger than the one Wellington first took to the Iberian Peninsula in 1808. The message was clear: the technology serves capital, and capital will use state violence to enforce adoption.
### The WGA Strike: Modern Luddism
The 2023 Writers Guild of America strike is the most direct modern parallel. The writers didn't try to ban AI. They tried to regulate its use through their contract, so that AI-generated material couldn't be used to replace writers or reduce their compensation. This is "machinery hurtful to commonality" in 21st-century language.
The strike succeeded in getting contractual protections. But the protections are contractual, not structural. They apply to WGA members writing for studios. They don't apply to the broader content economy. And they expire when the contract expires. The ratchet paused; it didn't reverse.
### The Lesson
The Luddites teach us two things simultaneously:
**They were right about the harms.** Wages collapsed. Communities were destroyed. Skills were devalued. The human cost of unregulated industrialization was enormous and real. The people who warned about it were correct.
**They were wrong about the possibility of resistance.** The power loom won. The factory system won. Machine-breaking was suppressed with lethal force. The Luddites' *diagnosis* was accurate. Their *prognosis* — that resistance could stop the ratchet — was wrong.
This maps directly to AI. The people warning about AI displacement, cognitive dependency, and power concentration are almost certainly *right about the harms.* Those harms are real and will be painful. But the question isn't whether the harms are real. The question is whether resistance can prevent them. And the historical record, from the Luddites through every subsequent technology resistance movement, says: resistance forces safety modifications and slows adoption, but it has almost never permanently reversed a technology once it crosses the infrastructure threshold.
Google Glass was killed by social stigma — but it hadn't become infrastructure. European GMO resistance stalled adoption — regionally, temporarily. Television reached 99% of US homes despite Jerry Mander's *Four Arguments for the Elimination of Television.* The pattern is clear: resistance succeeds only against technologies that haven't yet become load-bearing. Once the infrastructure threshold is crossed, the ratchet wins.
The Amish are the one interesting exception — a community that evaluates each technology against the criterion "does it build or destroy community?" before adopting it. But the Amish model requires opting out of competitive economic participation, which is precisely what the Prisoner's Dilemma makes irrational for everyone who hasn't made that choice as a community.
---
## The Game Board
Putting it all together. The AI dependency chain isn't just a ratchet — it's a game being played on a board with the following properties:
1. **Defection dominates.** In every pairwise interaction, investing in capabilities beats investing in safety. The Nash equilibrium is universal defection.
2. **The number of players makes coordination impossible.** The Unilateralist's Curse means the most reckless actor sets the safety level. As the number of actors grows, the probability of reckless action approaches 1.
3. **The timing window is closed or closing.** The Collingridge Dilemma means we either regulate too early (without enough information) or too late (after infrastructure lock-in). The Montreal Protocol conditions don't apply.
4. **Some of the lock-in is deliberate.** Engineered dependencies — proprietary APIs, data enclosure, legal barriers to interoperability — ensure that even if an actor *wanted* to exit, the switching costs are prohibitive.
5. **The benefits of the game are asymmetric.** Unlike ozone, where everyone was equally threatened, AI offers enormous advantages to whoever leads. This asymmetry prevents the mutual vulnerability that made the Montreal Protocol possible.
6. **Historical resistance movements confirm the pattern: the harms are real, and resistance rarely reverses adoption once a technology is load-bearing.** The Luddites were right and lost. The pattern has repeated for two centuries.
7. **The stack is owned.** The physical layer (TSMC, NVIDIA), the data layer (enclosed APIs), and the model layer (frontier labs) are concentrated in a small number of entities. Power flows to the owners of the compiled stack, not to the humans who generated the raw material.
This is the game nobody can quit. Not because the players are stupid or evil. Because the structure of the game makes quitting the worst possible individual strategy, even when continuing is the worst possible collective outcome.
---
## Relationship to Prior Papers
**Paper 007 (The Ratchet):** This paper provides the *mechanism* behind the ratchet. Paper 007 proved that dependencies don't reverse. Paper 011 explains *why* they don't: the game-theoretic structure makes reversal individually irrational even when collectively desirable. The ratchet isn't just mechanical inertia. It's a Nash equilibrium.
**Paper 005 (Cognitive Surplus):** Paper 005 asked "who controls the cognitive surplus?" This paper answers: whoever owns the compiled stack, and the compiled stack is being concentrated through both emergent network effects and deliberate engineering. The "Feudal Internet" future from Paper 005 is the default outcome of the game described here.
**Paper 006 (The Feedback Loop):** The recursive feedback loop (humans train AI, AI improves, AI needs less human input) accelerates the game. If my AI helps me build a better AI, the advantage I gain by defecting from a safety agreement becomes insurmountable in weeks rather than years. The feedback loop compresses the timeline of the Prisoner's Dilemma.
**Paper 008 (The Ship of Theseus):** If the compiled stack is owned by a corporation, is the "Species Identity" from Paper 008 a corporate asset? The identity problem meets the ownership problem: we may be compiling ourselves into a product.
---
## Open Questions
1. **Is there a Stag Hunt interpretation?** In the Prisoner's Dilemma, defection is the best move no matter what the other player does, so trust alone cannot rescue cooperation. In the Stag Hunt, mutual cooperation is itself an equilibrium, reachable if each player is confident enough that the others will cooperate. Is there a version of the AI race where that confidence is achievable, perhaps among a smaller coalition of labs? Or does the Unilateralist's Curse make the Stag Hunt framing inapplicable?
2. **What is the "Oppenheimer Moment" for AI?** Robert Oppenheimer's post-Trinity crisis ("now I am become Death") represented the moment a technology builder recognized the catastrophic potential of their creation. Why have so few leaders of major AI labs resigned in protest? Is there a game-theoretic "resignation threshold" below which staying and influencing is more rational than leaving?
3. **Can the Brussels Effect work?** The EU AI Act attempts to use regulatory power as a coordination mechanism — force global companies to adopt safety standards to access the European market. Can the "Brussels Effect" succeed where treaty-based coordination fails? Or will companies simply build separate models for Europe?
4. **Are data cooperatives viable?** Can a "public utility" version of the compiled stack be built? Ostrom's commons governance model works for some resources. Does it scale to something worth trillions?
5. **Is the game finite or infinite?** In James Carse's framing, finite games are played to win; infinite games are played to keep playing. Is AI development a finite game (win the race) or an infinite game (maintain the capability to participate)? The answer determines whether cooperation is possible: infinite games favor cooperation because you'll face the same players again.
6. **What happens when the Singleton emerges?** If one entity achieves decisive AI advantage, the multiplayer game collapses into a monopoly. Is a benevolent Singleton possible? Or does power corrupt even well-intentioned Singleton holders? The history of empires suggests the latter, but the history of empires didn't include superintelligent advisors.
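
Questions 1 and 5 have standard textbook formalizations, sketched below under stated assumptions. The payoff numbers are illustrative, not estimates. Reading Carse's "infinite game" as an infinitely repeated Prisoner's Dilemma with a grim-trigger strategy is an interpretive assumption made here, not something Carse claims. The names `pure_nash` and `grim_trigger_threshold` are hypothetical.

```python
from itertools import product

MOVES = ("cooperate", "defect")

def pure_nash(payoff):
    """Return the pure-strategy Nash equilibria of a symmetric 2x2 game.

    payoff[(mine, theirs)] is a player's own payoff; by symmetry the other
    player's payoff in the same cell is payoff[(theirs, mine)].
    """
    equilibria = []
    for a, b in product(MOVES, repeat=2):
        a_cannot_improve = all(payoff[(a, b)] >= payoff[(alt, b)] for alt in MOVES)
        b_cannot_improve = all(payoff[(b, a)] >= payoff[(alt, a)] for alt in MOVES)
        if a_cannot_improve and b_cannot_improve:
            equilibria.append((a, b))
    return equilibria

# Illustrative payoffs (assumed). In the Prisoner's Dilemma, defecting against
# a cooperator pays best; in the Stag Hunt, mutual cooperation pays best.
prisoners_dilemma = {
    ("cooperate", "cooperate"): 3, ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5, ("defect", "defect"): 1,
}
stag_hunt = {
    ("cooperate", "cooperate"): 4, ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 3, ("defect", "defect"): 3,
}

def grim_trigger_threshold(temptation, reward, punishment):
    """Minimum discount factor at which grim trigger sustains cooperation.

    Cooperating forever is worth reward / (1 - d); defecting once and being
    punished forever is worth temptation + d * punishment / (1 - d).
    Cooperation holds when d >= (temptation - reward) / (temptation - punishment).
    """
    return (temptation - reward) / (temptation - punishment)

print("Prisoner's Dilemma equilibria:", pure_nash(prisoners_dilemma))
print("Stag Hunt equilibria:", pure_nash(stag_hunt))
print("Discount factor needed for cooperation:", grim_trigger_threshold(5, 3, 1))
```

With these numbers the Prisoner's Dilemma has a single equilibrium (mutual defection), while the Stag Hunt has two (mutual cooperation and mutual defection), which is why expectations about the other players matter only in the second game. In the repeated version, cooperation is sustainable only if players weight future rounds at 0.5 or more, so anything that shortens the expected horizon, such as a decisive winner-take-all lead, pushes the game back toward defection.
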