Search results
Page title matches
- ...ties arising from the subject matter itself, without reference to the [[AI safety community]] ... * [[AI safety has many prerequisites]] (950 bytes, 132 words; 18:25, 18 July 2021)
- [[Category:AI safety]] (172 bytes, 19 words; 20:55, 18 March 2022)
Page text matches
- ...telligence/capability and values can vary orthogonally; a superintelligent AI need not realize that "making paperclips is stupid" and decide to maximize... * [[instrumental convergence]]: even if an AI isn't deliberately trying to hurt us (as a terminal value), it will still p... (2 KB, 344 words; 23:29, 27 July 2020)
- In discussions of AI risk, evolution (especially hominid evolution, as it is the only example we... [[Category:AI safety]] (799 bytes, 112 words; 23:46, 19 May 2021)
- '''Will there be significant changes to the world prior to some critical AI capability threshold being reached?''' This is currently one of the questio... [[Eliezer]]: "And the relative rate of growth between AI capabilities and human capabilities, and the degree to which single investm... (2 KB, 251 words; 02:40, 28 March 2021)
- ...s. For instance, [[distributional shift]] is an AI safety problem where an AI trained in one environment will behave poorly when deployed in an unfamilia... ! AI safety problem !! Human safety problem (3 KB, 379 words; 20:36, 11 November 2021)
- [[Category:AI safety]] (159 bytes, 19 words; 04:53, 30 March 2021)
- * [[rapid capability gain]] seems to refer to how well a ''single'' AI system improves over time, e.g. [[AlphaGo]] going from "no knowledge of any... * AGI progress refers to progress of AI systems in general, i.e. if you plot the state of the art over time (774 bytes, 121 words; 19:17, 27 February 2021)
- This is a '''list of technical AI alignment agendas'''. * [[Comprehensive AI services]] (586 bytes, 56 words; 21:29, 2 April 2021)
- The most important posts for AI strategy are: ...129 My current take on the Paul-MIRI disagreement on alignability of messy AI] (300 bytes, 42 words; 19:10, 27 February 2021)
- what are the possibilities for prosaic AI? i.e. if prosaic AI happened, then what are some possible reasons for why this happened? some i... ...xisting ML systems (or straightforward tweaks to them) somehow produces an AI that fully "understands" things and can do everything humans can; (2) there... (2 KB, 247 words; 19:10, 27 February 2021)
- ...sagreements in AI safety''' which collects the list of things people in AI safety seem to most frequently and deeply disagree about. ...d/1wI21XP-lRa6mi5h0dq_USooz0LpysdhS/view Clarifying some key hypotheses in AI alignment].</ref> (there are more posts like this, i think? find them) (21 KB, 3,254 words; 11:00, 26 February 2022)
- [[Category:AI safety]] (508 bytes, 62 words; 19:12, 27 February 2021)
- ...echnical AI alignment. The term has since been used in contexts outside of AI alignment as well. [[Category:AI safety]] (346 bytes, 55 words; 19:18, 27 February 2021)
- People in [[AI safety]] tend to [[List of disagreements in AI safety|disagree about many things]]. However, there is also wide agreement about s... * advanced AI will have a huge impact on the world (2 KB, 272 words; 01:33, 13 May 2020)
- '''Wei Dai''' is an AI safety researcher. In the technical AI safety community he is most well-known as the inventor of [[updateless decision th... * [[human safety problem]]s (387 bytes, 51 words; 20:47, 11 August 2021)
- [[Category:AI safety]] (1 KB, 195 words; 19:08, 27 February 2021)
- ...mple core''' is a term that has been used to describe various things in AI safety. I think the "simple" part is intended to capture that the thing is human-u... * Simple core to [[corrigibility]]<ref>https://ai-alignment.com/corrigibility-3039e668638</ref><ref>https://www.greaterwrong.... (824 bytes, 120 words; 21:32, 2 April 2021)
- ...ssions in AI alignment|big discussions]]" that has been taking place in AI safety in 2018–2020 is about [[coherence argument]]s and [[goal-directed]] a... https://www.greaterwrong.com/posts/tHxXdAn8Yuiy9y2pZ/ai-safety-without-goal-directed-behavior (1 KB, 183 words; 19:09, 27 February 2021)
- [[Category:AI safety]] (782 bytes, 108 words; 20:56, 26 March 2021)
- * does it pursue omohundro's "basic AI drives"? (i.e. it is the subject of instrumental convergence) * mindblind AI (4 KB, 534 words; 19:09, 27 February 2021)
- * "The first AI systems capable of pivotal acts will use good consequentialist reasoning." * "The default AI development path will not produce good consequentialist reasoning at the to... (4 KB, 702 words; 03:57, 26 April 2020)