Goalpost for usefulness of HRAD work

When thinking about the question of "How useful is [[HRAD]] work?", what standards/goalposts should we use? There's a pattern I see where:

* people advocating [[HRAD]] research bring up historical cases like Turing, Shannon, etc. where formalization worked well. There is also the [[deconfusion research]] framing, where just understanding what's going on better is a form of progress.
 
* people arguing against HRAD research talk about how "complete axiomatic descriptions" haven't been useful so far in AI, and how they aren't used to describe machine learning systems

It seems like there's a question of what the relevant goalpost is for deciding whether HRAD work is useful.

This is an example of what I mean when I say that MIRI sets the goalpost at an easier spot: "Techniques you can actually adapt in a safe AI, come the day, will probably have very simple cores — the sort of core concept that takes up three paragraphs, where any reviewer who didn’t spend five years struggling on the problem themselves will think, “Oh I could have thought of that.” Someday there may be a book full of clever and difficult things to say about the simple core — contrast the simplicity of the core concept of causal models, versus the complexity of proving all the clever things Judea Pearl had to say about causal models. But the planetary benefit is mainly from posing understandable problems crisply enough so that people can see they are open, and then from the simpler abstract properties of a found solution — complicated aspects will not carry over to real AIs later." [https://intelligence.org/files/OpenPhil2016Supplement.pdf#page=13]

In contrast, the kind of goalpost [[Daniel Dewey]] sets in [https://eaforum.issarice.com/posts/SEL9PW8jozrvLnkb4/my-current-thoughts-on-miri-s-highly-reliable-agent-design] seems much harder/more restrictive.

----
  
 
* will early advanced AI systems be understandable in terms of HRAD's formalisms? [https://eaforum.issarice.com/posts/SEL9PW8jozrvLnkb4/my-current-thoughts-on-miri-s-highly-reliable-agent-design#3__What_do_I_think_about_HRAD_]
** lack of historical precedent at applying "complete axiomatic descriptions of AI systems" to help design AI systems [4]
** lack of success so far at using complete axiomatic descriptions for modern ML systems [5]
** what will early advanced AI systems look like?
* how convincing historical examples are (e.g. Shannon, Turing, Bayes, Pearl, Kolmogorov, null-terminated strings in C [6], [7] [8]; Eliezer also brings up the Shannon vs Poe chess example). See also [[selection effect for successful formalizations]].
 
==See also==

* [[List of success criteria for HRAD work]]
 
* [[Something like realism about rationality]]
 

==External links==

* https://www.greaterwrong.com/posts/BGxTpdBGbwCWrGiCL/plausible-cases-for-hrad-work-and-locating-the-crux-in-the
  
 
[[Category:AI safety]]
 

Latest revision as of 20:17, 26 June 2020

