MIRI vs Paul research agenda hypotheses


from "The concern" in https://agentfoundations.org/item?id=1220

  • "The first AI systems capable of pivotal acts will use good consequentialist reasoning."
  • "The default AI development path will not produce good consequentialist reasoning at the top level."
  • "Therefore, on the default AI development path, the first AI systems capable of pivotal acts will have good consequentialist subsystem reasoning but not good consequentialist top-level reasoning."
  • "Consequentialist subsystem reasoning will likely come “packaged with a random goal” in some sense, and this goal will not be aligned with human interests."
  • "Therefore, the default AI development path will produce, as the first AI systems capable of pivotal acts, AI systems with goals not aligned with human interests, causing catastrophe."

Taking Owen's suggestion,[1] we can change this to:

  • "The first AI systems capable of pivotal acts will use good consequentialist reasoning."
  • "The default AI development path will not produce good consequentialist reasoning at the top level."
  • "Consequentialist subsystem reasoning will likely come “packaged with a random goal” in some sense, and this goal will not be aligned with human interests."
  • AI systems capable of pivotal acts with goals not aligned with human interests will cause catastrophe.

Key hopes listed in https://www.greaterwrong.com/posts/HCv2uwgDGf5dyX5y6/preface-to-the-sequence-on-iterated-amplification:

  • "If you have an overseer who is smarter than the agent you are trying to train, you can safely use that overseer’s judgment as an objective."
  • "We can train an RL system using very sparse feedback, so it’s OK if that overseer is very computationally expensive."
  • "A team of aligned agents may be smarter than any individual agent, while remaining aligned."

[1] https://agentfoundations.org/item?id=1242