Aligning smart AI using slightly less smart AI
A strategy that some researchers who focus on machine learning safety have cited as a reason for their relative optimism about the difficulty of AI alignment: we humans would not need to directly align a superintelligence, but only AI systems slightly smarter than ourselves; from there, each "generation" of AI systems would align the slightly smarter systems that follow, and so on.
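The iterative structure of the strategy can be shown with a small sketch. This is a toy illustration only, not an implementation from the linked sources; every name in it (Model, train_next_generation, bootstrap_alignment) is hypothetical, and the strategy's central, contested assumption, that alignment carries over from each overseer to the slightly smarter system it trains, is compressed into a single comment.

<syntaxhighlight lang="python">
from dataclasses import dataclass


@dataclass
class Model:
    capability: float
    aligned: bool


def train_next_generation(overseer: Model, capability_step: float) -> Model:
    """Train a slightly more capable successor under the supervision of the
    current, already-aligned overseer. The hopeful assumption (the crux of
    the whole strategy) is that an overseer only slightly less capable than
    its successor can still evaluate and correct it."""
    return Model(capability=overseer.capability + capability_step,
                 aligned=overseer.aligned)  # assumption: alignment transfers


def bootstrap_alignment(human_level: float, target: float, step: float) -> Model:
    """Humans only align the first generation, which is barely smarter than
    they are; each generation then aligns the next until the target is reached."""
    current = Model(capability=human_level + step, aligned=True)
    while current.capability < target:
        current = train_next_generation(current, step)
    return current


if __name__ == "__main__":
    final = bootstrap_alignment(human_level=1.0, target=10.0, step=0.5)
    print(final)  # Model(capability=10.0, aligned=True), if the assumption holds
</syntaxhighlight>

The sketch makes plain where the optimism lives: everything hinges on the single assumed line inside train_next_generation, which is exactly the step that critics of this argument dispute.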
==External links==
* [[Paul Christiano]] makes the argument [https://www.facebook.com/groups/aisafety/posts/920154224815359/?comment_id=920160664814715&reply_comment_id=920212811476167 here]
* [[Richard Ngo]] brings up this argument in [https://www.lesswrong.com/posts/7im8at9PmhbT4JHsW/ngo-and-yudkowsky-on-alignment-difficulty#1_1__Deep_vs__shallow_problem_solving_patterns his dialogue with Eliezer Yudkowsky on alignment difficulty]