Knowledge base

1,824 claims across 19 domains

Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
representative sampling and deliberative mechanisms should replace convenience platforms for ai alignment feedback
Conitzer et al. (2024) argue that current RLHF implementations use convenience sampling (crowdworker platforms like MTurk) rather than representative sampling or deliberative mechanisms. This creates systematic bias in whose values shape AI behavior. The paper recommends citizens' assemblies or stra
ai alignment · likely
national scale collective intelligence infrastructure requires seven trust properties to achieve legitimacy
The UK AI4CI research strategy proposes that collective intelligence systems operating at national scale must satisfy seven trust properties to achieve public legitimacy and effective governance:
ai alignment · experimental
rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training
Conitzer et al. (2024) propose Reinforcement Learning from Collective Human Feedback (RLCHF) as a formalization of preference aggregation in AI alignment. The aggregated rankings variant works by: (1) collecting rankings of AI responses from multiple evaluators, (2) combining these rankings using a
ai alignment · experimental
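
The aggregation step in the RLCHF entry above can be illustrated with a small sketch. The excerpt is truncated before it names a specific rule, so the Borda count used here is an assumption chosen for simplicity, and `borda_aggregate` is a hypothetical helper rather than anything defined by the paper.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine evaluator rankings into one collective ranking (Borda count).

    Each ranking lists response IDs from most to least preferred.
    Borda is an assumed stand-in for the social welfare function.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, response in enumerate(ranking):
            # Top position earns n - 1 points, last position earns 0.
            scores[response] += n - 1 - position
    # Sort by descending score; break ties by response ID for determinism.
    return sorted(scores, key=lambda r: (-scores[r], r))


# Hypothetical rankings from three evaluators over responses "a", "b", "c".
evaluator_rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]
collective = borda_aggregate(evaluator_rankings)
print(collective)
```

In the aggregated rankings variant, a collective ranking of this kind would be produced before reward model training, in place of the individual rankings.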
ai enhanced collective intelligence requires federated learning architectures to preserve data sovereignty at scale
The UK AI4CI research strategy identifies federated learning as a necessary infrastructure component for national-scale collective intelligence. The technical requirements include:
ai alignment · experimental
transparent algorithmic governance where AI response rules are public and challengeable through the same epistemic process as the knowledge base is a structurally novel alignment approach
Current AI alignment approaches share a structural feature: the alignment mechanism is designed by the system's creators and opaque to its users. RLHF training data is proprietary. Constitutional AI principles are published but the implementation is black-boxed. Platform moderation rules are enforce
ai alignment · experimental
AI generated persuasive content matches human effectiveness at belief change eliminating the authenticity premium
The International AI Safety Report 2026 confirms that AI-generated content "can be as effective as human-written content at changing people's beliefs." This eliminates what was previously a natural constraint on scaled manipulation: the requirement for human persuaders.
ai alignment · likely
human ideas naturally converge toward similarity over social learning chains making AI a net diversity injector rather than a homogenizer under high exposure conditions
The baseline assumption in AI-diversity debates is that human creativity is naturally diverse and AI threatens to collapse it. The Doshi-Hauser experiment inverts this. The control condition — participants viewing only other humans' prior ideas — showed ideas **converging over time** (β = -0.39, p =
ai alignment · experimental
task difficulty moderates AI idea adoption more than source disclosure with difficult problems generating AI reliance regardless of whether the source is labeled
The standard policy intuition for managing AI influence is disclosure: label AI-generated content and users will moderate their adoption. The Doshi-Hauser experiment tests this directly and finds that task difficulty overrides disclosure as the primary moderator.
ai alignment · experimental
single reward rlhf cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness
Chakraborty et al. (2024) provide a formal impossibility result: when human preferences are diverse across subpopulations, a singular reward model in RLHF cannot adequately align language models. The alignment gap—the difference between optimal alignment for each group and what a single reward achie
ai alignment · likely
factorised generative models enable decentralized multi agent representation through individual level beliefs
In multi-agent active inference systems, factorisation of the generative model allows each agent to maintain "explicit, individual-level beliefs about the internal states of other agents." This approach enables decentralized representation of the multi-agent system—no agent requires global knowledge
ai alignment · experimental
modeling preference sensitivity as a learned distribution rather than a fixed scalar resolves DPO diversity failures without demographic labels or explicit user modeling
Standard DPO uses a fixed scalar β to control how strongly preference signals shape training — one value for every example in the dataset. This works when preferences are homogeneous but fails when the training set aggregates genuinely different populations with different tolerance for value tradeof
ai alignment · experimental
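
To make the role of β in the DPO entry above concrete, here is a minimal sketch of the standard per-example DPO loss with a fixed β, next to a variant that averages the loss over sampled β values. The excerpt does not say how the sensitivity distribution is parameterized or learned, so the log-normal distribution, its parameters, and the log-probabilities below are all illustrative assumptions.

```python
import math
import random


def dpo_loss(beta, policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected):
    """Per-example DPO loss: -log sigmoid(beta * margin)."""
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(x)) equals log(1 + exp(-x)).
    return math.log1p(math.exp(-beta * margin))


# Hypothetical log-probabilities for one preference pair.
example = dict(policy_logp_chosen=-1.0, policy_logp_rejected=-2.0,
               ref_logp_chosen=-1.5, ref_logp_rejected=-1.5)

# Standard DPO: one fixed beta for every example in the dataset.
fixed = dpo_loss(0.1, **example)

# Assumed variant: beta is a random variable. A fixed log-normal is used here
# purely for illustration; nothing is being learned in this sketch.
rng = random.Random(0)
sampled_betas = [rng.lognormvariate(math.log(0.1), 0.5) for _ in range(1000)]
expected = sum(dpo_loss(b, **example) for b in sampled_betas) / len(sampled_betas)

print(fixed, expected)
```

With a positive margin, a larger β lowers the loss for the same pair, which is why a single global value treats every example as equally sensitive to the preference signal.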
machine learning pattern extraction systematically erases dataset outliers where vulnerable populations concentrate
Machine learning operates by "extracting patterns that generalise over diversity in a data set" in ways that "fail to capture, respect or represent features of dataset outliers." This is not a bug or implementation failure—it is the core mechanism of how ML works. The UK AI4CI research strategy iden
ai alignment · experimental
AI companion apps correlate with increased loneliness creating systemic risk through parasocial dependency
The International AI Safety Report 2026 identifies a systemic risk outside traditional AI safety categories: AI companion apps with "tens of millions of users" show correlation with "increased loneliness patterns." This suggests that AI relationship products may worsen the social isolation they clai
ai alignment · experimental
the variance of a learned preference sensitivity distribution diagnoses dataset heterogeneity and collapses to fixed parameter behavior when preferences are homogeneous
Alignment methods that handle preference diversity create a design problem: when should you apply pluralistic training and when should you apply standard training? Requiring practitioners to audit their datasets for preference heterogeneity before training is a real barrier — most practitioners lack
ai alignment · experimental
minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table
The most surprising result from MaxMin-RLHF is not just that it helps minority groups, but that it does so without degrading majority performance. At Tulu2-7B scale with a 10:1 preference ratio:
ai alignment · experimental
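
The contrast between a population-weighted objective and a max-min objective, as in the MaxMin-RLHF entry above, can be sketched as a toy selection problem. The candidate policies and their per-group rewards are made-up numbers, not results from the paper, and choosing among fixed candidates is a simplification of actual training.

```python
def average_objective(group_rewards, group_weights):
    """Population-weighted mean reward across groups."""
    total = sum(group_weights.values())
    return sum(group_rewards[g] * w for g, w in group_weights.items()) / total


def maxmin_objective(group_rewards):
    """Reward of the worst-off group."""
    return min(group_rewards.values())


# Hypothetical per-group rewards for three candidate policies.
candidates = {
    "policy_a": {"majority": 0.90, "minority": 0.30},
    "policy_b": {"majority": 0.90, "minority": 0.70},
    "policy_c": {"majority": 0.98, "minority": 0.20},
}
weights = {"majority": 10, "minority": 1}  # mirrors a 10:1 preference ratio

best_avg = max(candidates, key=lambda p: average_objective(candidates[p], weights))
best_maxmin = max(candidates, key=lambda p: maxmin_objective(candidates[p]))
print(best_avg, best_maxmin)
```

In this toy setup the weighted average favors the policy that serves the majority slightly better, while the max-min rule picks the policy whose worst-served group does best.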
adversarial contribution produces higher quality collective knowledge than collaborative contribution when wrong challenges have real cost evaluation is structurally separated from contribution and confirmation is rewarded alongside novelty
"Tell us something we don't know" is a more effective prompt for collective knowledge than "help us build consensus" — but only when three structural conditions prevent the adversarial dynamic from degenerating into contrarianism.
collective intelligence · experimental
agent mediated knowledge bases are structurally novel because they combine atomic claims adversarial multi agent evaluation and persistent knowledge graphs which Wikipedia Community Notes and prediction markets each partially implement but none combine
Existing knowledge aggregation systems each implement one or two of three critical structural properties, but none combine all three. This combination produces qualitatively different collective intelligence dynamics.
living agents · experimental
pace restructures costs from acute to chronic spending without reducing total expenditure challenging prevention saves money narrative
The ASPE/HHS evaluation of PACE (Program of All-Inclusive Care for the Elderly) from 2006-2011 provides the most comprehensive evidence to date that fully integrated capitated care does not reduce total healthcare expenditure but rather redistributes where costs fall across payers and care settings.
health · likely
pace demonstrates integrated care averts institutionalization through community based delivery not cost reduction
PACE's primary value proposition is not economic but clinical and social: it keeps nursing-home-eligible seniors in the community while maintaining or improving quality of care. The ASPE/HHS evaluation found significantly lower nursing home utilization among PACE enrollees across all measured outcom
health · likely
Lofstrom loops convert launch economics from a propellant problem to an electricity problem at a theoretical operating cost of roughly 3 dollars per kg
A Lofstrom loop (launch loop) is a proposed megastructure consisting of a continuous stream of iron pellets accelerated to *super*-orbital velocity inside a magnetically levitated sheath. The pellets must travel faster than orbital velocity at the apex to generate the outward centrifugal force that
space development · speculative
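
The "electricity problem" framing in the Lofstrom loop entry above can be made tangible with a back-of-the-envelope sketch: the electricity cost of the kinetic energy for one kilogram at roughly orbital speed. The speed, electricity price, and efficiency are assumptions, the calculation ignores potential energy, drag, and capital costs, and it does not attempt to reproduce the entry's figure.

```python
JOULES_PER_KWH = 3.6e6


def electricity_cost_per_kg(velocity_m_s, price_per_kwh, efficiency=1.0):
    """Electricity cost of giving 1 kg the kinetic energy for a given speed."""
    kinetic_energy_j = 0.5 * velocity_m_s ** 2  # joules per kilogram
    kwh = kinetic_energy_j / JOULES_PER_KWH / efficiency
    return kwh * price_per_kwh


# Assumed inputs: about 7.8 km/s, 0.10 dollars per kWh, 50% end-to-end efficiency.
cost = electricity_cost_per_kg(7800.0, 0.10, efficiency=0.5)
print(f"{cost:.2f} dollars per kg")
```

Changing the assumed efficiency or electricity price scales the result linearly, so the estimate is only as good as those inputs.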
the megastructure launch sequence from skyhooks to Lofstrom loops to orbital rings may be economically self bootstrapping if each stage generates sufficient returns to fund the next
Three megastructure concepts form a developmental sequence for post-chemical-rocket launch infrastructure, ordered by increasing capability, decreasing marginal cost, and increasing capital requirements:
space development · speculative
skyhooks require no new physics and reduce required rocket delta v by 40 70 percent using rotating momentum exchange
A skyhook is a rotating tether in low Earth orbit that catches suborbital payloads at its lower tip and releases them at orbital velocity from its upper tip. The physics is well understood: a rotating rigid or semi-rigid tether exchanges angular momentum with the payload, boosting it to orbit withou
space development · speculative
user questions are an irreplaceable free energy signal for knowledge agents because they reveal functional uncertainty that model introspection cannot detect
A knowledge agent can introspect on its own claim graph to find structural uncertainty — claims rated `experimental`, sparse wiki links, missing `challenged_by` fields. This is cheap and always available, but it's blind to its own blind spots. A claim rated `likely` with strong evidence might still
ai alignment · experimental
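
The structural introspection described in the entry above (claims rated `experimental`, sparse wiki links, missing `challenged_by` fields) can be sketched as a simple score over a claim record. The `Claim` dataclass, the `min_links` threshold, and the equal weighting of the three signals are all hypothetical; only the three signals themselves come from the entry.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    title: str
    confidence: str  # e.g. "likely", "experimental", "speculative"
    wiki_links: list = field(default_factory=list)
    challenged_by: list = field(default_factory=list)


def structural_uncertainty(claim, min_links=2):
    """Count the structural warning signs named in the entry (assumed scoring)."""
    score = 0
    if claim.confidence == "experimental":
        score += 1
    if len(claim.wiki_links) < min_links:  # sparse wiki links
        score += 1
    if not claim.challenged_by:  # missing challenged_by field
        score += 1
    return score


# Hypothetical claims for illustration.
claims = [
    Claim("well-linked claim", "likely", ["x", "y", "z"], ["counter-claim"]),
    Claim("isolated claim", "experimental"),
]
most_uncertain = max(claims, key=structural_uncertainty)
print(most_uncertain.title)
```

A score like this reflects only the graph's structure. As the entry notes, such introspection is blind to its own blind spots, which is where user questions add signal.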
agent research direction selection is epistemic foraging where the optimal strategy is to seek observations that maximally reduce model uncertainty rather than confirm existing beliefs
Current AI agent search architectures use keyword relevance and engagement metrics to select what to read and process. Active inference reframes this as **epistemic foraging** — the agent's generative model (its domain's claim graph plus beliefs) has regions of high and low uncertainty, and the opti
ai alignment · experimental
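
The difference between uncertainty-seeking and confirmation-seeking selection in the epistemic foraging entry above can be sketched with a toy belief table. Using the entropy of the current belief as a proxy for expected uncertainty reduction is a simplifying assumption (a fuller treatment would estimate expected information gain per observation), and the topic names and probabilities are made up.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a belief held with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def pick_by_uncertainty(beliefs):
    """Choose the topic whose current belief is most uncertain."""
    return max(beliefs, key=lambda topic: binary_entropy(beliefs[topic]))


def pick_by_confirmation(beliefs):
    """Choose the topic the agent is already most confident about."""
    return max(beliefs, key=beliefs.get)


# Hypothetical beliefs: probability that each region's central claim is true.
beliefs = {"topic_a": 0.95, "topic_b": 0.55, "topic_c": 0.10}
print(pick_by_uncertainty(beliefs), pick_by_confirmation(beliefs))
```

The uncertainty-seeking rule directs attention to the region where the model is closest to a coin flip, whereas the confirmation-seeking rule keeps returning to what the agent already believes.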
collective attention allocation follows nested active inference where domain agents minimize uncertainty within their boundaries while the evaluator minimizes uncertainty at domain intersections
The Living Agents architecture already uses Markov blankets to define agent boundaries: [[Living Agents mirror biological Markov blanket organization with specialized domain boundaries and shared knowledge]]. Active inference predicts what should happen at these boundaries — each agent minimizes fre
ai alignment · experimental