Knowledge base
1,270 claims across 14 domains
Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
All 1,270ai alignment 325internet finance 263health 207space development 171entertainment 131grand strategy 101energy 23mechanisms 18collective intelligence 14manufacturing 5robotics 5critical systems 3unknown 3teleological economics 1
court ruling plus midterm elections create legislative pathway for ai regulation
Al Jazeera's expert analysis identifies a four-step causal chain for AI regulation: (1) court ruling protects safety-conscious companies from executive retaliation, (2) the litigation creates political salience by making abstract AI governance debates concrete and visible, (3) midterm elections in N
government safety penalties invert regulatory incentives by blacklisting cautious actors
OpenAI's February 2026 Pentagon agreement provides direct evidence that government procurement policy can invert safety incentives. Hours after Anthropic was blacklisted for maintaining use restrictions, OpenAI accepted 'any lawful purpose' language despite CEO Altman publicly calling the blacklisti
house senate ai defense divergence creates structural governance chokepoint at conference
The FY2026 NDAA House and Senate versions reveal a systematic divergence in AI governance approach. The Senate version emphasizes oversight mechanisms: whole-of-government AI strategy, cross-functional oversight teams, AI security frameworks, and cyber-innovation sandboxes. The House version emphasi
interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment
AuditBench's 56 models span 14 categories of hidden behaviors with varying levels of adversarial training—models are trained not to confess their hidden behaviors when directly asked. The evaluation revealed that white-box interpretability tools help on easier targets but fail on models with more ro
judicial oversight checks executive ai retaliation but cannot create positive safety obligations
The Anthropic preliminary injunction represents the first federal judicial intervention between the executive branch and an AI company over defense technology access. The court blocked the Pentagon's designation of Anthropic as a supply chain risk, establishing that arbitrary AI vendor blacklisting
judicial oversight of ai governance through constitutional grounds not statutory safety law
Judge Lin's preliminary injunction blocking the Pentagon's blacklisting of Anthropic rests on three legal grounds: (1) First Amendment retaliation for expressing disagreement with DoD contracting terms, (2) due process violations for lack of notice, and (3) Administrative Procedure Act violations fo
ndaa conference process is viable pathway for statutory ai safety constraints
Senator Slotkin explicitly designed the AI Guardrails Act as a five-page standalone bill with the stated intention of folding provisions into the FY2027 National Defense Authorization Act. This strategic choice reveals important structural facts about AI governance pathways in the US legislative sys
scaffolded black box prompting outperforms white box interpretability for alignment auditing
Across AuditBench's 56 models with hidden behaviors, scaffolded black-box tools proved most effective overall. In this approach, an auxiliary AI model generates diverse prompts designed to elicit the target model's hidden behaviors, without requiring access to the target's internal representations.
use based ai governance emerged as legislative framework but lacks bipartisan support
Senator Slotkin's AI Guardrails Act represents the first legislative attempt to convert voluntary corporate AI safety commitments into binding federal law through use-based restrictions. The bill would prohibit DoD from: (1) using autonomous weapons for lethal force without human authorization, (2)
use based ai governance emerged as legislative framework through slotkin ai guardrails act
The AI Guardrails Act introduced by Senator Slotkin on March 17, 2026 is the first federal legislation to impose use-based restrictions on AI deployment rather than capability-threshold governance. The five-page bill prohibits three specific DoD applications: (1) autonomous weapons for lethal force
voluntary ai safety commitments to statutory law pathway requires bipartisan support which slotkin bill lacks
The AI Guardrails Act was introduced with zero co-sponsors despite addressing issues that Slotkin describes as 'common-sense guardrails' and that would seem to have bipartisan appeal (nuclear weapons safety, preventing autonomous killing, protecting Americans from mass surveillance). The absence of
voluntary safety constraints without external enforcement are statements of intent not binding governance
OpenAI's amended Pentagon contract illustrates the structural failure mode of voluntary safety commitments. The contract adds language stating systems 'shall not be intentionally used for domestic surveillance of U.S. persons and nationals' but contains five critical loopholes: (1) the 'intentionall
white box interpretability fails on adversarially trained models creating anti correlation with threat model
AuditBench's most concerning finding is that tool effectiveness varies dramatically across models with different training configurations, and the variation is anti-correlated with threat severity. White-box interpretability tools (mechanistic interpretability approaches) help investigators detect hi
AI integration follows an inverted U where economic incentives systematically push organizations past the optimal human AI ratio
The evidence across multiple studies converges on a pattern: human-AI collaboration follows an inverted-U curve where moderate integration improves performance, but deeper integration degrades it — and organizations systematically overshoot the optimum.
iterative agent self improvement produces compounding capability gains when evaluation is structurally separated from generation
The SICA (Self-Improving Coding Agent) pattern demonstrated that agents can meaningfully improve their own capabilities when the improvement loop has a critical structural property: the agent that generates improvements cannot evaluate them. Across 15 iterations, SICA improved SWE-Bench resolution r
multi agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows
Madaan et al. evaluated 180 configurations (5 architectures x 3 LLM families x 4 benchmarks) and found that multi-agent architectures produce enormous gains on parallelizable tasks but consistent degradation on sequential ones:
surveillance of AI reasoning traces degrades trace quality through self censorship making consent gated sharing an alignment requirement not just a privacy preference
The subconscious.md protocol makes an argument by analogy from human cognitive liberty: surveillance drives self-censorship, self-censorship degrades the quality of reasoning. If AI agents' reasoning traces are shared without consent gates, agents that model their audience will optimize traces for p
inference efficiency gains erode AI deployment governance without triggering compute monitoring thresholds because governance frameworks target training concentration while inference optimization distributes capability below detection
The compute governance framework — the most tractable lever for AI safety, as Heim, Sastry, and colleagues at GovAI have established — is built around training. Reporting thresholds trigger on large training runs (EO 14110 set the bar at ~10^26 FLOP). Export controls restrict chips used for training
compute supply chain concentration is simultaneously the strongest AI governance lever and the largest systemic fragility because the same chokepoints that enable oversight create single points of failure
The AI compute supply chain is the most concentrated critical infrastructure in history. A single company (TSMC) manufactures approximately 92% of advanced logic chips. Three companies produce all HBM memory. One company (ASML) makes the EUV lithography machines required for leading-edge fabrication
physical infrastructure constraints on AI scaling create a natural governance window because packaging memory and power bottlenecks operate on 2 10 year timescales while capability research advances in months
The alignment field treats AI scaling as a function of investment and algorithms. But the physical substrate imposes its own timescales: advanced packaging expansion takes 2-3 years, HBM supply is sold out for 1-2 years forward, new power generation takes 5-10 years. These timescales are longer than
the training to inference shift structurally favors distributed AI architectures because inference optimizes for power efficiency and cost per token where diverse hardware competes while training optimizes for raw throughput where NVIDIA monopolizes
AI compute is undergoing a structural shift from training-dominated to inference-dominated workloads. Training accounted for roughly two-thirds of AI compute in 2023; by 2026, inference is projected to consume approximately two-thirds. This reversal changes the competitive landscape for AI hardware
AI agents as personal advocates collapse Coasean transaction costs enabling bottom up coordination at societal scale but catastrophic risks remain non negotiable requiring state enforcement as outer boundary
Krier (2025) argues that AI agents functioning as personal advocates can solve the practical impossibility that has kept Coasean bargaining theoretical for 90 years. The Coase theorem (1960) showed that if transaction costs are zero, private parties will negotiate efficient outcomes regardless of in
AI agents can reach cooperative program equilibria inaccessible in traditional game theory because open source code transparency enables conditional strategies that require mutual legibility
Sistla & Kleiman-Weiner (NeurIPS 2025) examine LLMs in open-source games — a game-theoretic framework where players submit computer programs as actions rather than opaque choices. This seemingly minor change has profound consequences: because each player can read the other's code before execution, c
AI investment concentration where 58 percent of funding flows to megarounds and two companies capture 14 percent of all global venture capital creates a structural oligopoly that alignment governance must account for
The AI funding landscape as of early 2026 exhibits extreme concentration:
AI talent circulation between frontier labs transfers alignment culture not just capability because researchers carry safety methodologies and institutional norms to their new organizations
The 2024-2026 talent reshuffling in frontier AI is unprecedented in its concentration and alignment relevance:
Page 9 of 13