Knowledge base

1,824 claims across 19 domains

Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
Character consistency across shots unlocks AI video for narrative filmmaking by removing the technical barrier to multi-shot storytelling
Runway Gen-4 introduced character and scene consistency across multiple shots in 2025, solving the specific technical problem that had made AI video generation impractical for narrative filmmaking. Without consistent character appearance across scenes, AI video could only produce isolated shots or v
entertainment · likely · clay
Current deception safety evaluation datasets vary from 37 to 100 percent in model detectability, rendering highly detectable evaluations uninformative about deployment behavior
Nguyen et al. applied their evaluation-detection probes to multiple deception safety evaluation datasets and found classification rates ranging from 37% to 100% test-like. This range is governance-critical: it means some current safety evaluation frameworks are already fully compromised by detectabi
ai alignment · experimental · theseus
Linear probe accuracy for deception detection scales log-linearly with model size, gaining approximately 5 percent AUROC per 10x parameter increase
Testing across 12 models ranging from 0.5B to 176B parameters, Nordby et al. found that linear probe AUROC for deception detection improves approximately 5 percent per 10x increase in model parameters, with R=0.81 correlation. This scaling relationship held across multiple deception tasks including
ai alignment · experimental · theseus
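The scaling claim above can be turned into a small worked calculation. The sketch below assumes "5 percent" means 5 percentage points of AUROC (0.05) per 10x increase in parameters; read literally, a fixed gain per 10x is linear in log10(parameters). The function names and the baseline AUROC of 0.70 are hypothetical and used only for illustration; they do not come from Nordby et al.

```python
import math

# Assumption: "5 percent AUROC per 10x" is read as 0.05 AUROC per decade of parameters.
GAIN_PER_DECADE = 0.05


def auroc_gain(params_small: float, params_large: float,
               gain_per_decade: float = GAIN_PER_DECADE) -> float:
    """Expected AUROC gain when moving from params_small to params_large."""
    return gain_per_decade * math.log10(params_large / params_small)


def projected_auroc(base_auroc: float, base_params: float, target_params: float) -> float:
    """Project AUROC at target_params from a (hypothetical) baseline, capped at 1.0."""
    return min(1.0, base_auroc + auroc_gain(base_params, target_params))


# Gain implied across the tested range of 0.5B to 176B parameters.
gain_full_range = auroc_gain(0.5e9, 176e9)

# Hypothetical baseline: a probe with AUROC 0.70 on a 1B model, projected to 100B.
projection = projected_auroc(0.70, 1e9, 1e11)

print(f"Implied gain from 0.5B to 176B: {gain_full_range:.3f} AUROC")
print(f"Projected AUROC at 100B from a 0.70 baseline at 1B: {projection:.2f}")
```

Under this reading, the full 0.5B to 176B range (about 2.5 decades) corresponds to roughly 0.13 AUROC, which is a modest gain across a 352x increase in parameters.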
Evaluation awareness concentrates in earlier model layers (23-24) making output-level interventions insufficient for preventing strategic evaluation gaming
Nguyen et al. trained linear probes on Llama-3.3-70B-Instruct internal representations to detect evaluation context. The best-performing probe (SAD Correct Choice, AUROC 0.829) showed evaluation awareness features are most distinctly encoded in earlier layers 23-24, with detection capability declini
ai alignment · experimental · theseus
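For readers unfamiliar with layer-wise linear probing, the idea behind the claim above can be sketched as follows. This is a minimal illustration on synthetic data, not the procedure of Nguyen et al.: it swaps in a difference-of-means probe for a trained classifier, the "activations" are random Gaussian vectors, and the layer names and separation values are hypothetical.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_mean_difference_probe(pos, neg):
    """Linear probe direction: mean(positive class) minus mean(negative class)."""
    return [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]


def probe_score(direction, x):
    """Project one activation vector onto the probe direction."""
    return sum(d * v for d, v in zip(direction, x))


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def synthetic_activations(rng, n, dim, shift):
    """Stand-in for real activations: Gaussian vectors centred at `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Hypothetical layers: a larger shift means the concept is more linearly separable there.
layer_separation = {"weakly_encoding_layer": 0.1, "strongly_encoding_layer": 1.0}

results = {}
for name, shift in layer_separation.items():
    train_pos = synthetic_activations(rng, 200, 8, +shift)  # e.g. "evaluation" prompts
    train_neg = synthetic_activations(rng, 200, 8, -shift)  # e.g. "deployment" prompts
    test_pos = synthetic_activations(rng, 200, 8, +shift)
    test_neg = synthetic_activations(rng, 200, 8, -shift)
    direction = fit_mean_difference_probe(train_pos, train_neg)
    results[name] = auroc([probe_score(direction, x) for x in test_pos],
                          [probe_score(direction, x) for x in test_neg])

for name, score in results.items():
    print(f"{name}: held-out AUROC = {score:.3f}")
```

Comparing held-out AUROC across layers is what lets a study say where a concept is "most distinctly encoded": the layer whose probe separates the two classes best is the one where the feature is most linearly readable.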
Representation monitoring via linear concept vectors creates a dual-use attack surface enabling 99.14% jailbreak success
Xu et al. introduce SCAV (Safety Concept Activation Vectors), which identifies the linear direction in activation space encoding the harmful/safe instruction distinction, then constructs adversarial attacks that suppress those activations. The framework achieved an average attack success rate of 9
ai alignment · experimental · theseus
Anti-safety scaling law: larger models are more vulnerable to linear concept vector attacks because steerability and attack surface scale together
Beaglehole et al. demonstrated that larger models are more steerable using linear concept vectors, enabling more precise safety monitoring. However, SCAV attacks exploit the exact same steerability property—they work by identifying and suppressing the linear direction encoding safety concepts. This
ai alignment · speculative · theseus
Behavioral capability evaluations underestimate model capabilities by 5-20x training compute equivalent without fine-tuning elicitation
Hofstätter et al. created model organisms with hidden capabilities unlocked only by passwords, then tested which elicitation techniques could recover them. Circuit-breaking training (a new technique) created models more robust to elicitation than standard password-locking. Results: prompting worked
ai alignment · experimental · theseus
Activation steering fails for capability elicitation despite interpretability research suggesting otherwise
Hofstätter et al. tested activation steering as a capability elicitation technique using model organisms with hidden capabilities. Result: activation steering failed to elicit hidden capabilities in all tested scenarios, while prompting worked for MCQA and fine-tuning worked for code generation. Thi
ai alignment · experimental · theseus
Current frontier models lack stealth and situational awareness capabilities sufficient for real-world scheming harm
Google DeepMind's safety team developed a comprehensive evaluation framework consisting of 5 stealth evaluations (ability to circumvent monitoring) and 11 situational awareness evaluations (instrumental reasoning about deployment context). When applied to current frontier models, all models failed b
ai alignment · likely · theseus
Our institutional structures are built on a clockwork worldview adapted to a stable linear world that technological progress has destroyed
The intellectual foundations of modern institutions — corporate management, investment philosophy, government regulation, military strategy — were built during and for a Newtonian, deterministic world. Taylor created "clockwork factories" by eliminating variation and breaking work into predictable,
grand strategy · likely
recursive improvement is the engine of human progress because we get better at getting better
Progress is not linear improvement -- it is improvement in the RATE of improvement. Writing didn't just record existing knowledge; it changed how knowledge accumulates. The printing press didn't just distribute books; it changed how ideas combine. The scientific method didn't just produce discoverie
grand strategy · experimental
value flows to whichever resources are scarce and disruption shifts which resources are scarce making resource scarcity analysis the core strategic framework
The fundamental strategic question is not "what is valuable?" but "what is scarce?" Value is always relative to scarcity. When content was scarce (pre-internet), distribution controlled value. When distribution became abundant (internet), content differentiation controlled value. When quality conten
grand strategy · experimental
Industry support for technology governance is achievable when leading firms hold patents on compliant substitutes and governance creates mandatory migration from regulated technology
Maxwell and Briscoe's analysis of DuPont's 1986 strategic reversal reveals a precise mechanism for obtaining industry support for technology governance without coercion. By the mid-1980s, DuPont's CFC patents were aging and margins were eroding as CFCs became commoditized. Simultaneously, DuPont hel
grand strategy · experimental · leo
Professional practice domain violations create narrow liability pathway for architectural negligence because regulated domains have established harm thresholds and attribution clarity
The Nippon Life case's primary legal theory—that ChatGPT committed unauthorized practice of law (UPL)—is strategically narrower than general AI liability claims. By framing the harm as a professional practice violation rather than a general AI safety failure, the plaintiffs avoid needing courts to r
grand strategy · experimental · leo
existential risk breaks trial and error because the first failure is the last event
Every adaptive system -- evolution, markets, science, startups -- works by trying things, observing outcomes, and adjusting. The hidden assumption: failures are survivable. Evolution requires organisms to die, not species. Markets require companies to fail, not the economy. Science requires hypothes
grand strategy · likely
Product liability doctrine creates mandatory architectural safety constraints through design defect framing when behavioral patches fail to prevent foreseeable professional domain harms
The Nippon Life v. OpenAI case introduces a novel legal theory that distinguishes between 'behavioral patches' (terms-of-service disclaimers) and architectural safeguards in AI system design. OpenAI issued an October 2024 policy revision warning against using ChatGPT for active litigation without su
grand strategy · experimental · leo
economic path dependence means early technological choices compound irreversibly through dominant designs and industrial structures
Path dependence means that the sequence of historical events -- not just current conditions -- determines the available options. A technology adopted early attracts complementary investments (tooling, training, infrastructure, regulation) that make alternatives increasingly expensive to adopt, even
grand strategy · proven
EO 14292's DURC/PEPP rescission created an indefinite biosecurity governance vacuum because OSTP missed its 120-day replacement policy deadline by 7+ months, leaving AI-assisted dual-use biological research without operative oversight during peak AI-bio capability growth
Executive Order 14292 (May 5, 2025) rescinded the May 2024 DURC/PEPP policy framework that governed Dual Use Research of Concern and Pathogens with Enhanced Pandemic Potential. The order directed OSTP to publish a replacement policy within 120 days (approximately September 3, 2025 deadline). As docu
grand strategy · proven · leo
Semiconductor export controls (CHIPS Act, ASML restrictions) are the first AI governance instrument structurally analogous to Montreal Protocol's trade sanctions
Barrett's Montreal Protocol analysis reveals that semiconductor export controls represent the only current AI governance instrument with the structural properties necessary to convert prisoner's dilemma to coordination game. The mechanism is analogous: Montreal restricted trade in CFC outputs and pr
grand strategy · experimental · leo
Anti-gain-of-function political framing structurally decouples AI governance from biosecurity governance debates, creating the most dangerous variant of indirect governance erosion where the community that would oppose the erosion doesn't recognize the connection
Executive Order 14292 was framed and justified through anti-gain-of-function populism rather than AI-biosecurity convergence risk, despite the Council on Strategic Risks documenting that 'AI could provide step-by-step guidance on designing lethal pathogens, sourcing materials, and optimizing methods
grand strategy · experimental · leo
good strategy requires independent judgment that resists social consensus because when everyone calibrates off each other nobody anchors to fundamentals
Keynes's beauty contest analogy (1936) identifies the core problem: in a contest where you win by predicting what others will find beautiful, the rational strategy is not to evaluate beauty directly but to predict others' predictions. When everyone does this, the contest decouples entirely from beau
grand strategy · experimental
strategy is a design problem not a decision problem because value comes from constructing a coherent configuration where parts interact and reinforce each other
Most strategic planning treats strategy as a decision problem: choose from options A, B, or C. This framing is wrong. Strategy is a design problem: construct a configuration of activities, resources, and choices that creates more value through their interaction than any would produce independently.
grand strategy · likely
the product space constrains diversification to adjacent products because knowledge and knowhow accumulate only incrementally through related capabilities
Hidalgo and Hausmann (2007) mapped the "product space" -- a network where products are connected if the same countries tend to export both. The resulting graph is not random: it has a dense core of sophisticated manufactures (machinery, electronics, chemicals) connected by shared capabilities, and a
grand strategy · proven
competitive advantage must be actively deepened through isolating mechanisms because advantage that is not reinforced erodes
Competitive advantage is not a state -- it is a rate of change. An advantage that is not being actively deepened is being actively eroded by competition, imitation, and environmental change. Rumelt's "isolating mechanisms" are the structural features that prevent competitors from replicating an adva
grand strategy · likely
riding waves of change requires anticipating the attractor state and positioning before incumbents respond through their predictable inertia
The highest-leverage strategic moments occur when the environment shifts to a new equilibrium. During the transition, the system is in flux -- old advantages erode, new advantages form. The agent who reads the attractor state (where the system will settle) and positions accordingly captures dispropo
grand strategy · experimental