Knowledge base

1,275 claims across 14 domains

Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
325 ai alignment claims
AI agent orchestration that routes data and tools between specialized models outperforms both single-model and human-coached approaches because the orchestrator contributes coordination, not direction
Aquino-Michaels's architecture for solving Knuth's Hamiltonian decomposition problem used three components with distinct roles:
ai alignment · experimental
AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session
Knuth reports that Claude Opus 4.6, in collaboration with Stappers, solved an open combinatorial problem that had resisted solution for decades — finding a general construction for decomposing directed graphs with m^3 vertices into three Hamiltonian cycles. This represents frontier mathematical capability…
ai alignment · experimental
AI personas emerge from pre-training data as a spectrum of humanlike motivations rather than developing monomaniacal goals, which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts
Dario Amodei proposes a "moderate position" on AI autonomy risk that challenges both the dismissive view (AI will follow training) and the catastrophist view (AI inevitably seeks power through instrumental convergence). His alternative: models inherit "a vast range of humanlike motivations or 'personas'"…
ai alignment · experimental
as AI-automated software development becomes certain, the bottleneck shifts from building capacity to knowing what to build, making structured knowledge graphs the critical input to autonomous systems
The evidence that AI can automate software development is no longer speculative. Claude solved a 30-year open mathematical problem (Knuth 2026). The Aquino-Michaels setup had AI agents autonomously exploring solution spaces with zero human intervention for 5 consecutive explorations…
ai alignment · experimental
coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem
The Knuth Hamiltonian decomposition problem provides a controlled natural experiment comparing coordination approaches while holding AI capability roughly constant:
ai alignment · experimental
formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades
Three days after Knuth published his proof of Claude's Hamiltonian decomposition construction, Kim Morrison from the Lean community formalized the proof in Lean 4, providing machine-checked verification of correctness. Knuth's response: "That's good to know, because I've been getting more error-prone…"
ai alignment · experimental
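The oversight mechanism in this claim is that correctness is certified by the proof assistant's kernel rather than by a human reader. A toy Lean 4 illustration (ours, not Morrison's actual formalization, which proved a graph-theoretic statement): once the theorem compiles, the kernel has verified the proof, regardless of who or what produced it.

```lean
-- Toy example only: if this file compiles, the Lean kernel has
-- machine-checked the proof. No human review of the proof steps is
-- needed, and checking cost doesn't grow with the sophistication
-- of whoever (or whatever) wrote the proof.
theorem toy_identity (n : Nat) : n + n = 2 * n := by
  omega
```

The same kernel check applies uniformly whether the proof term was written by a mathematician, generated by a model, or produced by a tactic, which is why this form of oversight scales where line-by-line human review does not.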
human civilization passes falsifiable superorganism criteria because individuals cannot survive apart from society and occupations function as role-specific cellular algorithms
This note argues that humanity qualifies as a literal biological superorganism — not by analogy but through empirical tests — and that this framing has direct implications for what AI alignment must account for.
ai alignment · experimental
human-AI mathematical collaboration succeeds through role specialization where AI explores solution spaces, humans provide strategic direction, and mathematicians verify correctness
Donald Knuth reports that an open problem he'd been working on for several weeks — decomposing a directed graph with m^3 vertices into three Hamiltonian cycles for all odd m > 2 — was solved by Claude Opus 4.6 in collaboration with Filip Stappers, with Knuth himself writing the rigorous proof.
ai alignment · experimental
marginal returns to intelligence are bounded by five complementary factors, which means superintelligence cannot produce unlimited capability gains regardless of cognitive power
Dario Amodei introduces a framework for evaluating AI impact that borrows from production economics: rather than asking "will AI change everything?", ask "what are the marginal returns to intelligence in this domain, and what complementary factors limit those returns?" Just as an air force needs both…
ai alignment · likely
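The bounded-returns framework can be sketched numerically. The toy model below is our illustration, not Amodei's formalization: it treats output as Leontief-style, capped by the scarcest complementary input, so scaling intelligence past the binding constraint yields zero marginal return.

```python
# Toy Leontief-style production function: output is limited by the
# scarcest input, so raising intelligence alone quickly stops helping.
# (Illustrative sketch of the claim, not Amodei's actual model.)

def bounded_output(intelligence: float, complements: list[float]) -> float:
    """Return output capped by intelligence AND by every complementary
    factor (data, energy, physical experiments, regulatory throughput, ...)."""
    return min(intelligence, *complements)

# Marginal return to intelligence collapses at the binding constraint:
assert bounded_output(10, [4.0, 7.0]) == 4.0    # bounded by scarcest complement
assert bounded_output(1000, [4.0, 7.0]) == 4.0  # 100x more intelligence: no gain
```

Under this sketch, the interesting question becomes which complement binds first in each domain, which is exactly how the claim reframes the "will AI change everything?" debate.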
multi-model collaboration solved problems that single models could not because different AI architectures contribute complementary capabilities, as the even-case solution to Knuth's Hamiltonian decomposition required GPT and Claude working together
After Claude Opus 4.6 solved Knuth's odd-case Hamiltonian decomposition problem, three independent follow-ups demonstrated that multi-model collaboration was necessary for the remaining challenges:
ai alignment · experimental
structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations
Keston Aquino-Michaels's "Residue" structured exploration prompt dramatically reduced human involvement in solving Knuth's Hamiltonian decomposition problem. Under Stappers's coaching, Claude Opus 4.6 solved the odd-m case in 31 explorations with continuous human steering…
ai alignment · experimental
superorganism organization extends effective lifespan substantially at each organizational level, which means civilizational intelligence operates on temporal horizons that individual preference alignment cannot serve
This note argues that the nested structure of superorganism organization produces a systematic temporal mismatch — higher-level entities operate on far longer timescales than their components — and that this mismatch is a structural problem for AI alignment approaches anchored to individual human preferences.
ai alignment · speculative
the same coordination protocol applied to different AI models produces radically different problem-solving strategies because the protocol structures process, not thought
Aquino-Michaels applied the identical Residue structured exploration prompt to two different models on the same mathematical problem (Knuth's Hamiltonian decomposition):
ai alignment · experimental
tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent C's solver by combining it with its own structural knowledge, creating a hybrid better than either original
In Phase 4 of the Aquino-Michaels orchestration, the orchestrator extracted Agent C's MRV solver (a brute-force constraint propagation solver that had achieved a 67,000x speedup over naive search) and placed it in Agent O's working directory. Agent O needed to verify structural predictions at m=14…
ai alignment · experimental
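MRV (minimum remaining values) with forward checking is standard constraint-satisfaction machinery. The sketch below is illustrative only; the graph-coloring framing and the `mrv_color` function are our choices, not Agent C's actual solver. It shows the core idea: always branch on the most constrained variable and propagate each assignment's prunings forward.

```python
def mrv_color(graph, colors):
    """Backtracking graph coloring with the MRV heuristic: branch on the
    vertex with the fewest remaining colors, forward-checking by pruning
    the chosen color from its neighbors' domains."""
    def consistent(sol):
        return all(sol[u] != sol[n] for u in graph for n in graph[u])

    def search(dom):
        if any(not d for d in dom.values()):         # a domain was wiped out
            return None
        if all(len(d) == 1 for d in dom.values()):   # everything decided
            sol = {v: next(iter(d)) for v, d in dom.items()}
            return sol if consistent(sol) else None  # reject forced conflicts
        # MRV: pick the undecided vertex with the smallest remaining domain
        v = min((u for u in dom if len(dom[u]) > 1), key=lambda u: len(dom[u]))
        for c in sorted(dom[v]):
            nxt = {u: set(d) for u, d in dom.items()}
            nxt[v] = {c}
            for n in graph[v]:                       # forward-check neighbors
                nxt[n].discard(c)
            sol = search(nxt)
            if sol is not None:
                return sol
        return None                                  # backtrack

    return search({v: set(colors) for v in graph})
```

For example, a 4-cycle is 2-colorable (`mrv_color({"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}, [0, 1])` finds a valid coloring), while a triangle with two colors correctly returns `None`. Branching on the tightest variable first is what makes this kind of solver fail fast and prune aggressively.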
AI lowers the expertise barrier for engineering biological weapons from PhD level to amateur, which makes bioterrorism the most proximate AI-enabled existential risk
Noah Smith argues that AI-assisted bioterrorism represents the most immediate existential risk from AI, more proximate than autonomous AI takeover or economic displacement, because AI eliminates the key bottleneck that previously limited bioweapon development: deep domain expertise.
ai alignment · likely
current language models escalate to nuclear war in simulated conflicts because behavioral alignment cannot instill aversion to catastrophic, irreversible actions
A February 2026 preprint from King's College London pitted GPT-5.2, Claude Sonnet 4, and Gemini 3 against each other in 21 simulated war games. Each model played a national leader commanding a nuclear-armed superpower in Cold War-style crises. The results: tactical nuclear weapons were deployed in 9…
ai alignment · experimental
delegating critical infrastructure development to AI creates civilizational fragility because humans lose the ability to understand, maintain, and fix the systems civilization depends on
Noah Smith identifies a novel alignment risk vector he calls the "Machine Stops" scenario (after E.M. Forster's 1909 story): as AI takes over development of critical software and infrastructure, humans gradually lose the ability to understand, maintain, and fix these systems. This creates civilizational fragility…
ai alignment · experimental
economic forces push humans out of every cognitive loop where output quality is independently verifiable because the human in the loop is a cost that competitive markets eliminate
Noah Smith identifies a structural economic dynamic that undermines human-in-the-loop as a durable alignment strategy: wherever AI output quality can be independently verified — through tests, metrics, benchmarks, or market outcomes — competitive pressure eliminates the human from the loop. Human oversight…
ai alignment · likely
government designation of safety-conscious AI labs as supply-chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them
In March 2026, the U.S. Department of Defense designated Anthropic a supply chain risk — a label previously reserved for foreign adversaries like Huawei. The designation requires defense vendors and contractors to certify they don't use Anthropic's models in Pentagon work. The trigger: Anthropic refused…
ai alignment · likely
nation-states will inevitably assert control over frontier AI development because the monopoly on force is the foundational state function and weapons-grade AI capability in private hands is structurally intolerable to governments
Noah Smith synthesizes Ben Thompson's structural argument about the Anthropic-Pentagon dispute: the conflict isn't about one contract or one company's principles. It reveals a fundamental tension between the nation-state's monopoly on force and private companies controlling weapons-grade technology.
ai alignment · experimental
three conditions gate AI takeover risk (autonomy, robotics, and production-chain control) and current AI satisfies none of them, which bounds near-term catastrophic risk despite superhuman cognitive capabilities
Noah Smith identifies three necessary conditions for AI to pose a direct takeover risk, arguing that cognitive capability alone — even at superhuman levels — is insufficient. All three must be satisfied simultaneously:
ai alignment · experimental
voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
Anthropic's Responsible Scaling Policy was the industry's strongest self-imposed safety constraint. Its core pledge: never train an AI system above certain capability thresholds without proven safety measures already in place. On February 24, 2026, Anthropic dropped this pledge. Their chief science officer…
ai alignment · likely
some disagreements are permanently irreducible because they stem from genuine value differences, not information gaps, and systems must map rather than eliminate them
Not all disagreement is an information problem. Some disagreements persist because people genuinely weight values differently -- liberty against equality, individual against collective, present against future, growth against sustainability. These are not failures of reasoning or gaps in evidence.
ai alignment · likely
AGI may emerge as a patchwork of coordinating sub-AGI agents rather than a single monolithic system
Tomasev et al. (Google DeepMind/UCL, December 2025) propose "Distributional AGI Safety" -- the hypothesis that AGI may not emerge as a single unified system but as a "Patchwork AGI," a collective of sub-AGI agents with complementary skills that achieve AGI-level capability through coordination. If true…
ai alignment · experimental