Stupid questions

  • there are a bunch of different considerations that people talk about (different takeoff scenarios, comparisons to nuclear arms control, etc.), and it's unclear to me how the answers to these questions should influence our actions. even if we hammer out these strategy questions, would that change anything we actually do? e.g. if we suddenly knew with 100% certainty that three big insights are needed to go from chimpanzee brains to human brains (but not what those insights are), what would that mean for what to do about AI safety?
  • what is the minimal set of background assumptions/parameters that are needed to characterize the debate between eliezer and paul? (i am thinking of each person's views as being "emergent" from some set of background assumptions.) e.g. [1] captures some of these, but i don't think this is a minimal set (there are several others that i think are missing, and also some of the hypotheses listed here might be redundant/irrelevant.)
  • in paul's iterated amplification scheme, i didn't understand why we can't just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying? -- i figured out the answer: i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see my comment here for more. (a schematic sketch of the amplify/distill loop is included at the end of this list.)
  • what is the difference between informed oversight and reward engineering?
  • what are some "easy"/doable open problems in agent foundations research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)
  • what happened to intelligence amplification? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, whole brain emulation, cognitive enhancement drugs, brain implants). The idea was that intelligence amplification would give us aligned entities that are smarter than us, which would help us eventually get a friendly AI. Intelligence amplification was one of the three "main" paths discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays, when you look at who is working on AI safety full time, you have the technical researchers, who work on different agendas but are all focused on aligning de novo AI, and then you have the policy people, who work on coordination.
  • is there an easy problem of corrigibility? if so, what is it? if not, why did Eliezer introduce the hard problem of corrigibility?
  • AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?
  • does being a subagent imply being a mesa-optimizer? (also, in the mesa-optimizers paper, the "learned algorithm" doesn't seem like it actually needs to be learned, which is confusing)
  • i'm confused about whether/how the distilled agent in IDA is producing explanations of its own outputs, and also what these explanations look like.
  • was the timeline discrepancy between eliezer and carl ever resolved? if so, what was the resolution/new estimate?
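
for the iterated amplification question above, here is a minimal schematic sketch of the amplify/distill loop, just to make "stop after the first iteration" concrete. this is an illustrative toy in Python, not Paul's actual scheme; amplify, distill, and human below are hypothetical stand-ins for the real components. the point it illustrates: stopping after one round only gets you a fast imitation of the initial overseer-plus-weak-agent system, and the later rounds are what are supposed to push capability past that.

  # illustrative toy of the iterated amplification/distillation loop;
  # amplify, distill, and human are stand-ins, not the actual scheme.
  from typing import Callable

  Agent = Callable[[str], str]

  def human(question: str) -> str:
      # stand-in for the human overseer answering a (sub)question
      return f"human answer to: {question}"

  def amplify(agent: Agent) -> Agent:
      # the overseer decomposes the task and consults the current agent on
      # subquestions, so the amplified system is somewhat more capable than
      # either the human or the agent alone
      def amplified(question: str) -> str:
          sub_answer = agent(f"subquestion of: {question}")
          return human(f"{question} (given: {sub_answer})")
      return amplified

  def distill(slow_system: Agent) -> Agent:
      # stand-in for training a fast model to imitate the slow amplified
      # system; here it is just a wrapper
      def distilled(question: str) -> str:
          return slow_system(question)
      return distilled

  agent: Agent = lambda question: "no answer yet"  # weak initial agent
  for round_number in range(3):
      # after round 1 the agent only imitates the human-plus-weak-agent
      # system; further rounds are what are supposed to go beyond that
      agent = distill(amplify(agent))
  print(agent("example question"))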