Latest revision as of 21:32, 2 April 2021
Simple core is a term that has been used to describe various things in AI safety. I think the "simple" part is intended to capture that the thing is human-understandable, or even describable by a few mathematical equations, while the "core" part is intended to suggest that other, more complicated things may exist in addition to it.
- Simple core to corrigibility[1][2]
- Simple core of consequentialist reasoning
- Simple core to AI safety techniques[3]
- Simple core for impact measures

References:

[1] https://ai-alignment.com/corrigibility-3039e668638
[2] https://www.greaterwrong.com/posts/Djs38EWYZG8o7JMWY/paul-s-research-agenda-faq/comment/79jM2ecef73zupPR4