Difference between revisions of "List of success criteria for HRAD work"

Revision as of 00:41, 3 June 2020

This page lists success criteria that have been proposed for HRAD (highly reliable agent design) work. Most of these criteria are correlated, so they should not be read as independent ways in which HRAD could succeed. The aim is to spell out more concrete ways in which HRAD work would be useful.

resembles the work of Turing, Shannon, Bayes, etc.
helps AGI programmers avoid mistakes analogous to the use of null-terminated strings in C
early advanced AI systems will be understandable in terms of HRAD's formalisms [1] (need to clarify what it means to be understandable in terms of a formalism)
helps AGI programmers fix problems in early advanced AI systems
helps AGI programmers predict problems in early advanced AI systems
helps AGI programmers postdict/explain problems in early advanced AI systems
ideas from HRAD will be a "useful source of inspiration" for ML/AGI work [2]
when applying HRAD to actual systems, there will be "theoretically satisfying approximation methods" that make this application possible [3]
when applying HRAD to actual systems, the approximation methods used will preserve the important desirable properties of HRAD work [4]
the conceptual framework chosen in HRAD work and the conceptual framework that best describes early advanced AI systems will be compatible enough for it to be enlightening to use HRAD to describe these systems [5]
helps with "broadly understanding how the system is reasoning about the world" [6]
helps verify that AI systems are aligned

@@ Line 11: / Line 11: @@
 * when applying HRAD to actual systems, the approximation methods used will preserve the important desirable properties of HRAD work [https://eaforum.issarice.com/posts/SEL9PW8jozrvLnkb4/my-current-thoughts-on-miri-s-highly-reliable-agent-design]
 * the conceptual framework chosen in HRAD work and the conceptual framework that best describes early advanced AI systems will be compatible enough for it to be enlightening to use HRAD to describe these systems [https://eaforum.issarice.com/posts/SEL9PW8jozrvLnkb4/my-current-thoughts-on-miri-s-highly-reliable-agent-design]
+* helps for "broadly understanding how the system is reasoning about the world" [https://eaforum.issarice.com/posts/SEL9PW8jozrvLnkb4/my-current-thoughts-on-miri-s-highly-reliable-agent-design#Z6TbXivpjxWyc8NYM]
+* helps for verifying that the AI systems are aligned
 ==See also==
