AI safety field consensus
People in AI safety tend to disagree about many things. However, there is also wide agreement on some points (points that people outside the field often dispute).
- Orthogonality thesis
- Instrumental convergence
- Edge instantiation
- Patch resistance
- Goodhart problems, i.e. awareness of Goodhart's law and general attention to/wariness of it (a toy sketch follows this list)
- AGI is possible in principle (as in, it is virtually certain that humans can create AGI)
- Advanced AI will have a huge impact on the world
- Counterfactual of dropping a seed AI into a world without other capable AI (?) (even Robin Hanson agrees)
- Discontinuities in usefulness of whole brain emulation technology (?) (even Robin Hanson agrees)
- From the Hanson-Yudkowsky AI-Foom Debate ebook (https://intelligence.org/files/AIFoomDebate.pdf#page=517):
  - "Machine intelligence would be a development of almost unprecedented impact and risk, well worth considering now."
  - "Feasible approaches include direct hand-coding, based on a few big and lots of little insights, and on emulations of real human brains."
  - "Machine intelligence will, more likely than not, appear within a century, even if the progress rate to date does not strongly suggest the next few decades."
  - "Math and deep insights (especially probability) can be powerful relative to trend fitting and crude analogies."
  - "Long-term historical trends are suggestive of future events, but not strongly so."
  - "Some should be thinking about how to create “friendly” machine intelligences."
See also the "Background AI safety intuitions" section in https://agentfoundations.org/item?id=1129.
One operationalization might be something like: what are the things relevant to AI safety that all of Eliezer Yudkowsky, Paul Christiano, Robin Hanson, Rohin Shah, Dario Amodei, and Wei Dai agree on?