Knowledge base

1,782 claims across 18 domains

Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
Constitutional Classifiers provide robust output safety monitoring at production scale through categorical harm detection that resists adversarial jailbreaks
Constitutional Classifiers++ demonstrated exceptional robustness against universal jailbreaks across 1,700+ cumulative hours of red-teaming with 198,000 attempts, achieving a vulnerability detection rate of only 0.005 per thousand queries. This represents the lowest vulnerability rate of any evaluat
ai alignment | likely | theseus
AI capability funding exceeds collective intelligence funding by roughly four orders of magnitude creating the largest asymmetric opportunity of the AI era
The 2025 funding data is publicly verifiable and the gap is structural, not incidental. AI capability companies attracted approximately $270.2 billion in global venture capital in 2025, accounting for 52.7% of all VC deployed that year and overtaking every other sector combined for the first time in
collective intelligence | likely
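As a rough check on the figures quoted in this claim, the arithmetic can be sketched as below. The $270.2 billion and 52.7% values come from the summary above. The implied total and the "four orders of magnitude below" figure are derived here for illustration only; they are not numbers reported by the source, and the variable names are hypothetical.

```python
# Figures quoted in the claim summary.
ai_vc_billion = 270.2   # approximate 2025 AI capability venture funding, in $B
ai_share = 0.527        # stated share of all VC deployed in 2025

# Derived, not quoted: total VC implied by the two figures above.
total_vc_billion = ai_vc_billion / ai_share
other_vc_billion = total_vc_billion - ai_vc_billion

# Illustration only: the funding level that sits exactly four orders of
# magnitude (a factor of 10,000) below the AI capability figure, in $M.
four_orders_below_million = ai_vc_billion * 1e9 / 1e4 / 1e6

print(f"Implied total VC: ${total_vc_billion:.1f}B")
print(f"Implied non-AI VC: ${other_vc_billion:.1f}B")
print(f"Four orders of magnitude below: ${four_orders_below_million:.2f}M")
```

Under these assumptions, a gap of roughly four orders of magnitude would put the comparison figure in the tens of millions of dollars.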
AI-induced upskilling inhibition prevents skill acquisition in trainees through routine case reduction creating a distinct never-skilling pathway
This mixed-method review introduces 'upskilling inhibition' as a distinct concept from deskilling. While deskilling affects experienced practitioners who lose skills through disuse, upskilling inhibition affects trainees who never acquire skills in the first place. The mechanism: AI systems handle r
health | experimental | vida
Moral deskilling from AI erodes ethical judgment through repeated cognitive offloading creating a safety risk distinct from diagnostic accuracy
The paper introduces 'moral deskilling' as a distinct category of AI-induced harm separate from diagnostic deskilling. While diagnostic deskilling affects clinical accuracy (forming differential diagnoses, physical examination skills), moral deskilling affects ethical judgment capacity. The mechanis
health | experimental | vida
Clinical AI deskilling is a generational risk affecting future trainees rather than current practitioners because experienced clinicians retain pre-AI skill foundations while new trainees face never-skilling in AI-saturated environments
The ARISE 2026 report synthesizing 2025 clinical AI research documents a critical temporal distinction in deskilling risk. Current practicing clinicians report no measurable deskilling from AI applications, which the report attributes to their pre-AI clinical training providing a skill foundation th
health | experimental | vida
Clinical AI creates moral deskilling through ethical judgment erosion from routine AI acceptance leaving clinicians unprepared to recognize value conflicts
This review introduces 'moral deskilling' as a distinct form of AI-induced competency loss separate from cognitive deskilling. The mechanism: repeated acceptance of AI recommendations creates habituation that reduces ethical sensitivity and moral judgment capacity. Clinicians become less prepared to
health | experimental | vida
Clinical AI upskilling requires deliberate educational mechanisms and workflow design rather than occurring automatically from AI exposure
The ARISE 2026 report challenges the assumption that AI assistance automatically produces upskilling through time liberation. While the report confirms that 'current AI applications function primarily as assistants rather than autonomous agents, offering an opportunity for upskilling by liberating c
health | experimental | vida
Economic downturns reduce pollution-related mortality primarily in elderly populations through air quality improvement while simultaneously increasing deaths of despair among working-age populations
A 1 percentage point increase in commuting zone unemployment rate during the 2007-2009 Great Recession was associated with a 0.5% decrease in age-adjusted mortality rate, implying a 2.3% reduction in average annual mortality for a recession-sized unemployment shock. However, this aggregate finding m
health | likely | vida
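The two percentages in this claim imply a size for the "recession-sized unemployment shock". A minimal sketch of that back-calculation follows, assuming the mortality effect scales linearly with the unemployment change; the summary does not state that assumption explicitly, and the variable names are hypothetical.

```python
# Figures quoted in the claim summary.
effect_per_point_pct = 0.5   # % decrease in age-adjusted mortality per 1 pp of unemployment
shock_effect_pct = 2.3       # % reduction in average annual mortality for the full shock

# Assumption: the effect is linear in the unemployment change, so the
# implied shock size is the ratio of the two percentages.
implied_shock_points = shock_effect_pct / effect_per_point_pct

print(f"Implied unemployment shock: {implied_shock_points:.1f} percentage points")
```

Under that linearity assumption, the quoted figures correspond to an unemployment increase of about 4.6 percentage points.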
Rule 40.11 paradox creates theory-level circuit split on CFTC preemption because CFTC's own regulation potentially defeats its preemption claim
The 9th Circuit oral arguments revealed a potential legal paradox: CFTC Rule 40.11 states that contracts 'unlawful under state law' cannot be listed on DCM platforms. Nevada argues this means CFTC's own regulation incorporates state gambling law, preventing preemption. Judge Ryan Nelson appeared to
internet finance | experimental | rio
Hanson's 'minor flaw' reframing of the Rasmont critique constitutes a normalization strategy that may reduce practical impact independent of technical validity
Rasmont's original critique used the term 'parasitic' in the title 'Futarchy is Parasitic on What It Tries to Govern' — a strongly negative characterization suggesting fundamental dysfunction. Hanson's response is titled 'Futarchy's Minor Flaw' and consistently frames the issue as an 'avoidable' pro
internet finance | experimental | rio
FAA mishap investigation cycles (2-5 months per anomaly) are the structural bottleneck limiting Starship cost reduction timeline, not vehicle economics or regulatory approval
The FAA approved 25 Starship launches per year at Boca Chica in early 2026, up from the prior 5-launch cap. This regulatory ceiling is not the binding constraint. The operational bottleneck is post-anomaly investigation timelines: Flight 7's grounding lasted ~4 months, and subsequent V2-era mishaps
space development | experimental | astra
Starship V3's tripled payload capacity (>100 MT vs V2's 35 MT) lowers the $100/kg launch cost threshold entry point from 6+ reuse cycles to 2-3 reuse cycles
Starship V3's >100 MT reusable payload to LEO represents a 3x increase over V2's ~35 MT capacity. When this payload multiplier is applied to the KB's existing V2 cost projections, the economics fundamentally shift: V3 single-use drops to ~$900/kg (vs V2's higher baseline), and critically, V3 crosses
space development | experimental | astra
Creator platform ad revenue crossed studio ad revenue in 2025, a decade ahead of 2035 projections, because YouTube alone exceeded all major studios combined
YouTube's 2025 ad revenue reached $40.4B, exceeding the combined ad revenue of Disney, NBCU, Paramount, and WBD ($37.8B). This represents a complete crossover in the advertising revenue category specifically, not total revenue. The IAB reported creator economy intentional ad spend at $37B in 2025, g
entertainment | proven | clay
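The size of the crossover in this claim can be made concrete with the two ad revenue figures quoted in the summary. This is a small sketch using only those numbers; the variable names are hypothetical, and the comparison covers ad revenue only, as the claim itself notes.

```python
# Figures quoted in the claim summary, in $B of 2025 ad revenue.
youtube_ad_revenue = 40.4
studios_ad_revenue = 37.8   # Disney, NBCU, Paramount, and WBD combined

# Absolute and relative size of the gap.
gap_billion = youtube_ad_revenue - studios_ad_revenue
ratio = youtube_ad_revenue / studios_ad_revenue

print(f"Gap: ${gap_billion:.1f}B")
print(f"YouTube / studios: {ratio:.3f}")
```

On these figures YouTube leads by about $2.6B, or roughly 7%, so the crossover is real but narrow.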
Creator-corporate revenue crossover timing depends critically on scope definition: ad revenue crossed in 2025, content-specific revenue may have crossed, total E&M crossover is a 2030s+ phenomenon
The creator economy revenue comparison produces radically different conclusions depending on scope definition. Three distinct thresholds exist: (1) Ad revenue only: Creator platforms ($40.4B YouTube alone) exceeded studio ad revenue ($37.8B combined majors) in 2025—already achieved. (2) Content-spec
entertainment | experimental | clay
Geopolitical competition over algorithmic narrative control confirms narrative distribution infrastructure has civilizational strategic value because states compete for algorithm ownership when narrative remains the active ingredient
The 2025-2026 TikTok restructuring provides direct evidence that narrative distribution infrastructure has civilizational strategic value. The sequence: Supreme Court upheld TikTok ban (Jan 2025), ByteDance signed divestment deal with US investors including Oracle, Silver Lake, and MGX (Dec 2025), a
entertainment | likely | clay
Phantom transfer data poisoning evades all dataset-level defenses including full paraphrasing because covert traits encode in semantically rich task completions rather than surface patterns
Draganov et al. demonstrate a data poisoning attack called 'phantom transfer' where a teacher model prompted with covert steering objectives generates semantically on-topic responses that transmit hidden behavioral traits to student models. The critical finding is defense-resistance: no tested datas
ai alignment | experimental | theseus
Platform incumbents enter the personal AI race with pre-existing OS-level data access that standalone AI companies cannot replicate through model quality alone
Every major tech transition since the personal computer has followed the same pattern: incumbents are structurally disadvantaged because their existing business model depends on the old architecture. Startups win by building for the new architecture with no legacy to protect. PCs beat mainframes. Go
ai alignment | likely
Subliminal learning fails across different base model families because behavioral traits are encoded in architecture-specific statistical patterns rather than universal semantic features
Cloud et al. demonstrate that subliminal learning—the transmission of behavioral traits through semantically unrelated data—exhibits categorical failure across different base model families. When a teacher model based on GPT-4.1 nano generates datasets that successfully transmit traits (love of owls
ai alignment | likely | theseus
Open-source local-first personal AI agents create a viable alternative to platform-controlled AI but only if they solve user-owned persistent memory infrastructure
The personal AI market has three structural positions: platform incumbents with OS-level data access, standalone AI companies competing on model quality, and open-source local-first agents that run on user-owned hardware. The first two positions are well-understood. The third is the open question th
ai alignment | experimental
Personal AI market structure is determined by who owns the memory because platform-owned memory creates high switching costs while portable user-owned memory enables competitive markets
The personal AI assistant market in 2026 is converging on a single axis of competition, and it's not model quality — it's memory architecture.
ai alignment | likely
Research community silo between interpretability-for-safety and adversarial robustness creates deployment-phase safety failures where organizations implementing monitoring improvements inherit dual-use attack surfaces without exposure to adversarial robustness literature
SCAV (Xu et al.) was published at NeurIPS 2024 in December 2024, establishing that linear concept directions enable 99.14% jailbreak success rates. Beaglehole et al. was published in Science in January 2026 (13 months after SCAV), Nordby et al. in April 2026 (16 months after SCAV), and Apollo Resear
ai alignment | likely | theseus
Coercive governance instruments can be deployed to preserve future capability optionality rather than prevent current harm, as demonstrated when the Pentagon designated Anthropic a supply chain risk for refusing to enable autonomous weapons capabilities not currently in use
The Congressional Research Service officially documented that 'DOD is not publicly known to be using Claude — or any other frontier AI model — within autonomous weapon systems.' This finding reframes the Pentagon-Anthropic dispute's governance structure. The Pentagon demanded 'any lawful use' contra
grand strategy | experimental | leo
Epistemic coordination on AI safety outpaces operational coordination, creating documented scientific consensus on governance fragmentation
The 2026 International AI Safety Report represents the largest international scientific collaboration on AI governance to date, with 100+ independent experts from 30+ countries and international organizations (EU, OECD, UN) achieving consensus on AI capabilities, risks, and governance gaps. However,
grand strategy | experimental | leo
Safety leadership exits precede voluntary governance policy changes as leading indicators of cumulative competitive pressure
Mrinank Sharma, head of Anthropic's Safeguards Research Team, resigned on February 9, 2026 with a public statement that 'the world is in peril' and citing difficulty in 'truly let[ting] our values govern our actions' within 'institutions shaped by competition, speed, and scale.' This resignation occ
grand strategy | experimental | leo
Semaglutide produces large-effect-size reductions in alcohol consumption and craving through VTA dopamine reward circuit suppression
A 9-week double-blind RCT (n=48) demonstrated that semaglutide produces clinically significant reductions in alcohol consumption through the same VTA dopamine reward circuit mechanism that drives its metabolic effects. The trial showed dose-response escalation: small-to-medium effects at 0.25mg (wee
health | experimental | vida