Difference between revisions of "Late 2021 MIRI conversations"

From Issawiki

Revision as of 23:24, 31 July 2022

https://www.lesswrong.com/s/n945eovrA3oDueqtq

Title: Ngo and Yudkowsky on AI capability gains (https://www.lesswrong.com/s/n945eovrA3oDueqtq/p/hwxj4gieR7FWNwYfa)
Summary: This conversation jumps between many topics, including both object-level points and meta-level points (about how one ought to reason in these kinds of situations). The main topics covered are: (1) recursive self-improvement and consequentialism (Eliezer defends the position that these are useful abstractions that apply to the real world, whereas Richard thinks they are wrong, or at least messier to apply to the real world than Eliezer believes, or at the very least that Eliezer has not comprehensively argued for their validity); (2) a meta-disagreement about what Eliezer got wrong in the Hanson-Yudkowsky debate (Eliezer thinks he made a 'correct argument about a different subject' rather than an 'incorrect argument about the correct subject', and thinks he didn't properly take into account the law of earlier failure: Hanson's arguments failed much more prosaically than Eliezer expected. Richard, on the other hand, believes Eliezer's error was that the abstractions of consequentialism and recursive self-improvement are messier to apply to the real world than Eliezer expected); (3) the concept of utility (Eliezer defends the concept as useful, coherent, and applicable to analyzing advanced AI; Richard is skeptical and repeatedly presses the point about advance predictions).
My thoughts: I found this conversation frustrating to read.

Title: Ngo's view on alignment difficulty
Summary: Richard Ngo puts forth his own case for why he is more optimistic than Eliezer Yudkowsky about humanity handling the creation of AGI well. Ngo's case relies on several points: (1) he expects a continuous takeoff in which more and more tasks are automated (including the ability "only to answer questions", but to do so at human level) without achieving AGI; (2) achieving AGI is difficult (he distinguishes between task-based reinforcement learning and open-ended reinforcement learning, and says the latter is what leads to AI catastrophe, but also that the latter is much more difficult because real-world feedback is slow and sufficiently rich artificial environments are hard to create); (3) "The US and China preventing any other country from becoming a leader in AI requires about as much competent power as banning chemical/biological weapons"; (4) there is enough competent power at the level of 'banning chemical/biological weapons'; (5) this competent power will be used to halt progress on AI outside a US-China collaboration (?) (the optimism here relies on (1): continuous takeoff allows compelling cases of misalignment to occur and convince governments). So his actual case (which is only implicit in the post, so I am reading between the lines here) is something like: the difficulty of AGI buys us some time (2). Meanwhile, progress on task-based/narrow AI will continue (1) and will produce compelling cases of AI misalignment, leading the US and China to halt progress on AI outside a US-China collaboration (3, 4, 5).
My thoughts: I did not like how this document was written; I was expecting a clear case for AI optimism to be stated explicitly, but instead it was a meandering sequence of points, most of them not obviously related to AI optimism. At the end I was left thinking, "What even is the case here?" and had to backtrack and skip around the post a few more times to figure out what the argument seems to be (and even now, I am not sure I fully understand it).
Further reading/keywords: ASML

See also

What links here