Category:AI safety
For pages related to AI safety and alignment. These pages should eventually be moved to a dedicated AI safety wiki.
Pages in category "AI safety"
The following 116 pages are in this category, out of 116 total.
C
- Can the behavior of approval-direction be undefined or random?
- Carl Shulman
- Changing selection pressures argument
- Christiano's operationalization of slow takeoff
- Coherence and goal-directed agency discussion
- Comparison of AI takeoff scenarios
- Comparison of terms related to agency
- Competence gap
- Content sharing between AIs
- Continuous takeoff
- Corrigibility
- Corrigibility may be undesirable
- Counterfactual of dropping a seed AI into a world without other capable AI
L
- Laplace's rule of succession argument for AI timelines
- Late 2021 MIRI conversations
- Late singularity
- List of arguments against working on AI safety
- List of big discussions in AI alignment
- List of breakthroughs plausibly needed for AGI
- List of critiques of iterated amplification
- List of disagreements in AI safety
- List of success criteria for HRAD work
- List of teams at OpenAI
- List of technical AI alignment agendas
- List of terms used to describe the intelligence of an agent
- List of thought experiments in AI safety
- Lumpiness
S
- Scaling hypothesis
- Science argument
- Second species argument
- Secret sauce for intelligence
- Secret sauce for intelligence vs specialization in intelligence
- Selection effect for successful formalizations
- Selection effect for who builds AGI
- Short-term preferences-on-reflection
- Simple core
- Simple core of consequentialist reasoning
- Single-architecture generality
- Single-model generality
- Soft-hard takeoff
- Something like realism about rationality
- Statistical analysis of expert timelines argument for AI timelines
- Stupid questions
- Sudden emergence