Secret sauce for intelligence (also known as one algorithm,[1] simple core algorithm, lumpy AI progress, intelligibility of intelligence (which may be a distinct concept),[2] small number of breakthroughs needed for AGI, and many other phrases) is a disagreement, or a cluster of disagreements, within AI safety. Resolving it is important for thinking about the shape of AI takeoff.
The common forms/framings of the disagreement are:
- whether progress in AI is "lumpy", i.e. whether it comes in a small number of chunks that each contribute greatly to AI capabilities; in other words, whether a small number of discrete insights is required to create an AGI, rather than many small, messy insights
- whether intelligence is "simple" in some sense, e.g. in the sense of having low Kolmogorov complexity (see the sketch after this list)
- whether the core function of intelligence is understandable/intelligible in some sense
- whether it is possible to specialize in intelligence itself (i.e. whether intelligence can be considered a single narrow technology, like a car, rather than something that is an umbrella term for a whole collection of separate narrow technologies and cannot be specialized in, like a city or a country)
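One way to make the "low Kolmogorov complexity" reading of the second point precise is sketched below. This is only one possible formalization, not something the cited sources commit to, and the symbols (the universal machine U, the programs π) are introduced here purely for illustration.

```latex
% Kolmogorov complexity of a string x, relative to a fixed universal machine U:
% the length (in bits) of the shortest program p that makes U output x.
\[ K_U(x) = \min \{\, |p| : U(p) = x \,\} \]

% The "simple core algorithm" claim, stated loosely: some program \pi_{AGI}
% implementing general intelligence is far shorter than the combined
% description of the many narrow systems it would replace:
\[ K_U(\pi_{\mathrm{AGI}}) \ll \sum_{i} K_U(\pi_{\mathrm{narrow},\,i}) \]
```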
Proponents of the secret sauce view believe that progress is lumpy, and that intelligence is simple and understandable.
Unlike the missing gear for intelligence view, the secret sauce argument says that the final piece leading to AGI is a big breakthrough rather than a minor design tweak.
History
This disagreement was a core part, perhaps the core part, of the Hanson-Yudkowsky AI Foom debate.
Evidence
What kinds of evidence would shift our beliefs to one side of the disagreement or the other? This section lists considerations that have been offered as reasons for or against believing in a secret sauce.
- List of breakthroughs plausibly needed for AGI: We can enumerate discrete advances in our understanding of AI, like Bayes nets and Judea Pearl's work on causality. The idea is to extrapolate from here, and say that building an AGI requires a few more of these things.
- Looking at lumpiness of innovations in general, e.g. "Hanson answers that there’s a large literature on economic and ecological innovations, basically saying that the vast majority of innovation consists of small gains. It’s lots of little steps over time that slowly make various fields better."[3]
- Looking at citation lumpiness in CS/AI/ML (specifically, searching for a deviation from other fields), and deviations in innovation lumpiness (relative to e.g. software in general). "Presumably the basis for this claim is that some people think they see a different distribution among some subset of AI software, perhaps including machine learning software. I don’t see it yet, but the obvious way for them to convince skeptics like me is to create and analyze a formal dataset of software projects and innovations. Show us a significantly-deviating subset of AI programs with more economic scope, generality, and lumpiness in gains. Statistics from such an analysis could let us numerically estimate the chances of a single small team encompassing a big fraction of AGI software power and value."[4]
- Hominid evolution
  - Comparing scientific ability in chimps vs humans suggests that some final piece was added to chimp cognition in a relatively short span of time to greatly improve the level of general intelligence, suggesting some kind of secret sauce. (This can also suggest a missing gear, if the final piece was just some minor tweaks.)
  - The difference in general intelligence between humans and chimps suggests evolution discovered intelligence relatively quickly[5]
  - Evolution argument: human intelligence was built by evolution, suggesting that AGI can also be built incrementally/iteratively[6]
  - The information content of the human genome is about 750 megabytes,[7] which gives one upper bound on how much information must be baked into an AI (rather than learned by interacting with the environment). Taking into account the fact that most of the DNA is junk and that the genes for the brain are only a part of the entire genome reduces the bound further.[8][9] (A rough version of this calculation is sketched after this list.)
- The science argument
- Looking at things like GPT-2 and GPT-3 [https://www.greaterwrong.com/posts/ZFtesgbY9XwtqqyZ5/human-psycholinguists-a-critical-appraisal] [https://www.gwern.net/newsletter/2020/05#gpt-3]
- Prosaic AI argument: it's plausible that we can already build prosaic AI, suggesting either that there aren't big insights left to be discovered, or that any further insights will be refinements to what we already know rather than totally new breakthroughs[10][11]
  - A related idea: if prosaic AI turns out to be achievable, then much of the progress toward it seems to come from minor innovations rather than huge breakthroughs in understanding. And even if prosaic AI isn't achievable, this suggests that in the world where something big does get discovered, it would still take years of tinkering afterwards to get it to work well.
- Content argument: looking at historical AI development, key insights have played a relatively small role[12]
- Some reason to suspect that there's a common core to a bunch of intellectual activities[13]
- "Second, he asserts that the brain “just doesn’t look all that complicated” in comparison to human-made pieces of technology such as computer operating systems (p.444)." [https://docs.google.com/document/d/1lgcBauWyYk774gBwKn8P_h8_wL9vLLiWBr6JMmEd_-I/edit#]
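As a rough, back-of-the-envelope version of the genome bound mentioned above: the raw arithmetic follows from the genome size and two bits per base pair, while the "functional fraction" and "brain-relevant fraction" below are purely illustrative assumptions, not figures taken from the cited sources.

```python
# Back-of-the-envelope sketch of the genome information bound.
# The genome size and 2-bits-per-base figure are standard; the two
# fractions further down are illustrative assumptions only.

base_pairs = 3.1e9           # approximate number of base pairs in the human genome
bits_per_base_pair = 2       # 4 possible bases -> 2 bits each

raw_bits = base_pairs * bits_per_base_pair
raw_megabytes = raw_bits / 8 / 1e6
print(f"raw genome content: ~{raw_megabytes:.0f} MB")    # ~775 MB, same ballpark as the cited ~750 MB

functional_fraction = 0.1    # assumption: only ~10% of the genome is functional (not junk)
brain_fraction = 0.5         # assumption: only ~half of that is relevant to building the brain

tightened_megabytes = raw_megabytes * functional_fraction * brain_fraction
print(f"tightened bound: ~{tightened_megabytes:.0f} MB")  # ~39 MB under these assumptions
```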
Discovery by human or AI
I think in a standard visualization of "secret sauce", the secret sauce is imagined as being discovered by human AI researchers. This makes sense because if there is a secret sauce to be discovered, and it hasn't been discovered yet, then the AI that exists at that point in time is not that smart, so it can't discover the secret sauce.
But if a secret sauce exists, then it was already discovered once by a blind search (namely, evolution), so an alternative visualization is that it will be discovered by machine learning systems using search- or gradient-descent-like mechanisms.
See also: prosaic AI.
Is a secret sauce sufficient for a discontinuity?
One consideration is the historical lack of discontinuities, even for "simple" things.[14] Note that this point doesn't really say anything about whether a secret sauce exists; rather, it says that even given a secret sauce, we shouldn't expect a discontinuity.
Even if there is a secret sauce, AI takeoff could still be continuous if hardware is a bottleneck (i.e. we might discover all the important facts about intelligence early on but not have enough computing power to run it cheaply).
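As a toy illustration of this hardware-bottleneck point, the sketch below assumes a fixed compute requirement and a smooth hardware price-performance trend; every number in it is made up for illustration and none come from the cited sources.

```python
# Toy illustration of the hardware-bottleneck argument: even if the
# "secret sauce" algorithm were known from year 0, the cost of running it
# would fall smoothly with hardware trends, so capabilities would arrive
# gradually rather than discontinuously. All numbers are made up.

flops_per_run = 1e25            # assumed compute needed for one useful run of the algorithm
dollars_per_flop_year0 = 1e-17  # assumed hardware cost at year 0
halving_time_years = 2.5        # assumed price-performance halving time

for year in range(0, 21, 5):
    dollars_per_flop = dollars_per_flop_year0 * 0.5 ** (year / halving_time_years)
    cost_per_run = flops_per_run * dollars_per_flop
    print(f"year {year:2d}: ~${cost_per_run:,.0f} per run")
```

Under these assumptions the cost per run halves every 2.5 years, so the point at which running the algorithm becomes affordable is set by a smooth hardware curve rather than by the date of the algorithmic discovery.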
The prosaic AI argument is also relevant here; we might have already discovered the secret sauce.
See also
- Secret sauce for intelligence vs specialization in intelligence
- Simple core of consequentialist reasoning
- Prosaic AI
- Missing gear vs secret sauce
- Progress in self-improvement
- Textbook test for AI theory
Notes
To what extent is this the same debate as something like realism about rationality? It seems like if intelligence/rationality is simple, then there will be a secret sauce. But if it's not simple it could go either way (maybe there's a final essential "gear" that needs to be added to make things really work, in which case there is a secret sauce, or maybe everything just gradually improves and there's nothing like a "last gear"). So my initial thought is that intelligibility of intelligence implies secret sauce, but not conversely. Now, about rationality realism specifically, I'm not sure.
https://jacoblagerros.wordpress.com/2018/03/09/brains-and-backprop-a-key-timeline-crux/
https://web.archive.org/web/20200218080005/https://lw2.issarice.com/posts/4Q5s8qGyCtzfYtCZX/is-there-a-compute-efficient-algorithm-for-agency (I guess this one argues against)
http://benjaminrosshoffman.com/openai-makes-humanity-less-safe/#comment-128508
https://srconstantin.wordpress.com/2017/02/21/strong-ai-isnt-here-yet/
from https://arbital.com/p/general_intelligence/
An Artificial General Intelligence would have the same property; it could learn a tremendous variety of domains, including domains it had no inkling of when it was switched on.
More specific hypotheses about how general intelligence operates have been advanced at various points, but any corresponding attempts to define general intelligence that way, would be theory-laden. The pretheoretical phenomenon to be explained is the extraordinary variety of human achievements across many non-instinctual domains, compared to other animals.
[…]
To the extent one credits the existence of 'significantly more general than chimpanzee intelligence', it implies that there are common cognitive subproblems of the huge variety of problems that humans can (learn to) solve, despite the surface-level differences of those domains. Or at least, the way humans solve problems in those domains, the cognitive work we do must have deep commonalities across those domains. These commonalities may not be visible on an immediate surface inspection.
'But in general, the hypothesis of general intelligence seems like it should cash out as some version of: "There's some set of new cognitive algorithms, plus improvements to existing algorithms, plus bigger brains, plus other resources--we don't know how many things like this there are, but there's some set of things like that--which, when added to previously existing primate and hominid capabilities, created the ability to do better on a broad set of deep cognitive subproblems held in common across a very wide variety of humanly-approachable surface-level problems for learning and manipulating domains. And that's why humans do better on a huge variety of domains simultaneously, despite evolution having not preprogrammed us with new instinctual knowledge or algorithms for all those domains separately."' -- this doesn't really tell us which of those things it was that helped the most.
Rob Bensinger makes the same argument here: https://www.greaterwrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity/comment/awsEzHzgD5Rv2YGPo
See also the discussion at https://www.facebook.com/yudkowsky/posts/10154018209759228?comment_id=10154018937319228
"Yes, IF there are just one or two insights that can create a very general AGI which is far more capable than previous systems, and if that fact is unanticipated, then it might happen that a small team creates this AGI, and it stays better than other systems for a sufficient time to have a big differential effect. So as I've said our key dispute is about the lumpiness and number of key insights needed to create a general capable AGI." [https://www.facebook.com/yudkowsky/posts/10154018209759228?comment_id=10154018937319228&reply_comment_id=10154019078714228]
Robin Hanson:

You might claim that once we have enough good simple tools, complexity will no longer be required. With enough simple tools (and some data to crunch), a few simple and relatively obvious combinations of those tools will be sufficient to perform most all tasks in the world economy at a human level. And thus the first team to find the last simple general tool needed might “foom” via having an enormous advantage over the entire rest of the world put together. At least if that one last tool were powerful enough. I disagree with this claim, but I agree that neither view can be easily and clearly proven wrong.[15]
References
1. https://aiimpacts.org/likelihood-of-discontinuous-progress-around-the-development-of-agi/#One_algorithm
2. https://intelligence.org/files/HowIntelligible.pdf
3. https://intelligence.org/files/AIFoomDebate.pdf#page=556
4. http://www.overcomingbias.com/2016/03/how-different-agi-software.html
5. https://aiimpacts.org/likelihood-of-discontinuous-progress-around-the-development-of-agi/#One_algorithm
6. https://sideways-view.com/2018/02/24/takeoff-speeds/
7. https://en.wikipedia.org/wiki/Human_genome#Information_content
8. https://docs.google.com/document/pub?id=17yLL7B7yRrhV3J9NuiVuac3hNmjeKTVHnqiEa6UQpJk
9. https://www.lesswrong.com/posts/jd9LajtGWv93NC8uo/source-code-size-vs-learned-model-size-in-ml-and-in-humans
10. https://sideways-view.com/2018/02/24/takeoff-speeds/
11. https://aiimpacts.org/likelihood-of-discontinuous-progress-around-the-development-of-agi/#One_algorithm
12. https://sideways-view.com/2018/02/24/takeoff-speeds/
13. https://aiimpacts.org/likelihood-of-discontinuous-progress-around-the-development-of-agi/#One_algorithm
14. https://aiimpacts.org/likelihood-of-discontinuous-progress-around-the-development-of-agi/#One_algorithm
15. https://www.greaterwrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity