Difference between revisions of "Goalpost for usefulness of HRAD work"
Revision as of 09:30, 27 May 2020
When thinking about the question of "How useful is HRAD work?", what standards/goalposts should we use? There's a pattern I see where:
- people advocating HRAD research bring up historical cases like Turing, Shannon, etc. where formalization worked well. There is also the deconfusion research framing, where just understanding what's going on better is a form of progress.
- people arguing against HRAD research talk about how "complete axiomatic descriptions" haven't been useful so far in AI, and how they aren't used to describe machine learning systems
So there seems to be a question of what the relevant goalpost is for deciding whether HRAD work is useful.
Here is an example of what I mean when I say that MIRI sets the goalpost at an easier spot: "Techniques you can actually adapt in a safe AI, come the day, will probably have very simple cores — the sort of core concept that takes up three paragraphs, where any reviewer who didn’t spend five years struggling on the problem themselves will think, “Oh I could have thought of that.” Someday there may be a book full of clever and difficult things to say about the simple core — contrast the simplicity of the core concept of causal models, versus the complexity of proving all the clever things Judea Pearl had to say about causal models. But the planetary benefit is mainly from posing understandable problems crisply enough so that people can see they are open, and then from the simpler abstract properties of a found solution — complicated aspects will not carry over to real AIs later." [1]
In contrast, the kind of goalpost Daniel Dewey sets in [2] seems much harder to meet and more restrictive.
- will early advanced AI systems be understandable in terms of HRAD's formalisms? [3]
- how convincing are the historical examples (e.g. Shannon, Turing, Bayes, Pearl, Kolmogorov, null-terminated strings in C [6], [7], [8]; Eliezer also brings up the Shannon vs. Poe chess example)? See also selection effect for successful formalizations.