MIRI vs Paul research agenda hypotheses
from "The concern" in https://agentfoundations.org/item?id=1220
- "The first AI systems capable of pivotal acts will use good consequentialist reasoning."
- "The default AI development path will not produce good consequentialist reasoning at the top level."
- "Therefore, on the default AI development path, the first AI systems capable of pivotal acts will have good consequentialist subsystem reasoning but not good consequentialist top-level reasoning."
- "Consequentialist subsystem reasoning will likely come “packaged with a random goal” in some sense, and this goal will not be aligned with human interests."
- "Therefore, the default AI development path will produce, as the first AI systems capable of pivotal acts, AI systems with goals not aligned with human interests, causing catastrophe."
Key hopes listed in https://www.greaterwrong.com/posts/HCv2uwgDGf5dyX5y6/preface-to-the-sequence-on-iterated-amplification:
- "If you have an overseer who is smarter than the agent you are trying to train, you can safely use that overseer’s judgment as an objective."
- "We can train an RL system using very sparse feedback, so it’s OK if that overseer is very computationally expensive."
- "A team of aligned agents may be smarter than any individual agent, while remaining aligned."