<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.issarice.com/index.php?action=history&amp;feed=atom&amp;title=Stupid_questions</id>
	<title>Stupid questions - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.issarice.com/index.php?action=history&amp;feed=atom&amp;title=Stupid_questions"/>
	<link rel="alternate" type="text/html" href="https://wiki.issarice.com/index.php?title=Stupid_questions&amp;action=history"/>
	<updated>2026-04-10T07:48:39Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.31.6</generator>
	<entry>
		<id>https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=2104&amp;oldid=prev</id>
		<title>Issa at 20:58, 26 March 2021</title>
		<link rel="alternate" type="text/html" href="https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=2104&amp;oldid=prev"/>
		<updated>2021-03-26T20:58:24Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 20:58, 26 March 2021&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l10&quot; &gt;Line 10:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 10:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in [[IDA]] is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in [[IDA]] is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between [[eliezer]] and [[carl]] ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between [[eliezer]] and [[carl]] ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:AI safety]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Issa</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=2103&amp;oldid=prev</id>
		<title>Issa at 20:58, 26 March 2021</title>
		<link rel="alternate" type="text/html" href="https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=2103&amp;oldid=prev"/>
		<updated>2021-03-26T20:58:11Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 20:58, 26 March 2021&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* there&amp;#039;s a bunch of different considerations that people talk about (like different takeoff &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;scenarios&lt;/del&gt;, comparisons to nuclear arms control, etc.) and it&amp;#039;s unclear to me how the answers to these questions should influence our actions. even if we hammer out these strategy questions, would that change any of our actions? like if we suddenly knew with 100% certainty that there are three big insights needed to go from chimpanzee brains to human brains (but we wouldn&amp;#039;t know the content of the insights), what does that mean, in terms of what to do about AI safety?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* there&amp;#039;s a bunch of different considerations that people talk about (like different &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;takeoff &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;scenario]]s&lt;/ins&gt;, comparisons to nuclear arms control, etc.) and it&amp;#039;s unclear to me how the answers to these questions should influence our actions. even if we hammer out these strategy questions, would that change any of our actions? like if we suddenly knew with 100% certainty that there are three big insights needed to go from chimpanzee brains to human brains (but we wouldn&amp;#039;t know the content of the insights), what does that mean, in terms of what to do about AI safety?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the minimal set of background assumptions/parameters that are needed to characterize the debate between eliezer and paul? (i am thinking of each person&amp;#039;s views as being &amp;quot;emergent&amp;quot; from some set of background assumptions.) e.g. [https://lw2.issarice.com/posts/mJ5oNYnkYrd4sD5uE/clarifying-some-key-hypotheses-in-ai-alignment] captures some of these, but i don&amp;#039;t think this is a minimal set (there are several others that i think are missing, and also some of the hypotheses listed here might be redundant/irrelevant.)&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the minimal set of background assumptions/parameters that are needed to characterize the debate between eliezer and paul? (i am thinking of each person&amp;#039;s views as being &amp;quot;emergent&amp;quot; from some set of background assumptions.) e.g. [https://lw2.issarice.com/posts/mJ5oNYnkYrd4sD5uE/clarifying-some-key-hypotheses-in-ai-alignment] captures some of these, but i don&amp;#039;t think this is a minimal set (there are several others that i think are missing, and also some of the hypotheses listed here might be redundant/irrelevant.)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* &amp;lt;strike&amp;gt;in paul&amp;#039;s iterated amplification scheme, i don&amp;#039;t understand why we can&amp;#039;t just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying?&amp;lt;/strike&amp;gt; -- i figured out the answer. i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see [https://lw2.issarice.com/posts/HqLxuZ4LhaFhmAHWk/iterated-distillation-and-amplification-1#GeBLKN5FJjzDmdGqn my comment here] for more.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* &amp;lt;strike&amp;gt;in paul&amp;#039;s iterated amplification scheme, i don&amp;#039;t understand why we can&amp;#039;t just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying?&amp;lt;/strike&amp;gt; -- i figured out the answer. i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see [https://lw2.issarice.com/posts/HqLxuZ4LhaFhmAHWk/iterated-distillation-and-amplification-1#GeBLKN5FJjzDmdGqn my comment here] for more.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the difference between informed oversight and reward engineering?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the difference between &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;informed oversight&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;and &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;reward engineering&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what are some &amp;quot;easy&amp;quot;/doable open problems in [[agent foundations]] research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what are some &amp;quot;easy&amp;quot;/doable open problems in [[agent foundations]] research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what happened to intelligence amplification? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, [[whole brain emulation]], cognitive enhancement drugs or implanting technology). The idea is that intelligence amplification will give us aligned entities that are smarter than us, which will help us to eventually get a friendly AI. Intelligence amplification was one of the three &amp;quot;main&amp;quot; paths that were discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays when you look at sort of &amp;quot;who is working on AI safety full time?&amp;quot;, you have the technical researchers, who work on different agendas but all of them focused on aligning de novo AI, and then you have the policy people, who are looking at coordination.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what happened to &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;intelligence amplification&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, [[whole brain emulation]], cognitive enhancement drugs or implanting technology). 
The idea is that intelligence amplification will give us aligned entities that are smarter than us, which will help us to eventually get a friendly AI. Intelligence amplification was one of the three &amp;quot;main&amp;quot; paths that were discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays when you look at sort of &amp;quot;who is working on AI safety full time?&amp;quot;, you have the technical researchers, who work on different agendas but all of them focused on aligning de novo AI, and then you have the policy people, who are looking at coordination.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* is there an easy problem of corrigibility? if so, what is it? if not, why did &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;eliezer &lt;/del&gt;introduce the hard problem of corrigibility?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* is there an easy problem of &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;corrigibility&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;? if so, what is it? if not, why did &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[Eliezer]] &lt;/ins&gt;introduce the hard problem of corrigibility?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* does subagent imply mesa-optimizer? (also in the mesa-optimizers paper, &amp;quot;learned algorithm&amp;quot; doesn&amp;#039;t seem like it needs to be learned, which is confusing)&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* does subagent imply &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;mesa-optimizer&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;? (also in the mesa-optimizers paper, &amp;quot;learned algorithm&amp;quot; doesn&amp;#039;t seem like it needs to be learned, which is confusing)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in [[IDA]] is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in [[IDA]] is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between [[eliezer]] and [[carl]] ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between [[eliezer]] and [[carl]] ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Issa</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=4&amp;oldid=prev</id>
		<title>Issa at 23:36, 17 February 2020</title>
		<link rel="alternate" type="text/html" href="https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=4&amp;oldid=prev"/>
		<updated>2020-02-17T23:36:02Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 23:36, 17 February 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l3&quot; &gt;Line 3:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 3:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* &amp;lt;strike&amp;gt;in paul&amp;#039;s iterated amplification scheme, i don&amp;#039;t understand why we can&amp;#039;t just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying?&amp;lt;/strike&amp;gt; -- i figured out the answer. i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see [https://lw2.issarice.com/posts/HqLxuZ4LhaFhmAHWk/iterated-distillation-and-amplification-1#GeBLKN5FJjzDmdGqn my comment here] for more.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* &amp;lt;strike&amp;gt;in paul&amp;#039;s iterated amplification scheme, i don&amp;#039;t understand why we can&amp;#039;t just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying?&amp;lt;/strike&amp;gt; -- i figured out the answer. i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see [https://lw2.issarice.com/posts/HqLxuZ4LhaFhmAHWk/iterated-distillation-and-amplification-1#GeBLKN5FJjzDmdGqn my comment here] for more.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the difference between informed oversight and reward engineering?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what is the difference between informed oversight and reward engineering?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what are some &amp;quot;easy&amp;quot;/doable open problems in agent foundations research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what are some &amp;quot;easy&amp;quot;/doable open problems in &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;agent foundations&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what happened to intelligence amplification? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, whole brain emulation, cognitive enhancement drugs or implanting technology). The idea is that intelligence amplification will give us aligned entities that are smarter than us, which will help us to eventually get a friendly AI. Intelligence amplification was one of the three &amp;quot;main&amp;quot; paths that were discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays when you look at sort of &amp;quot;who is working on AI safety full time?&amp;quot;, you have the technical researchers, who work on different agendas but all of them focused on aligning de novo AI, and then you have the policy people, who are looking at coordination.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* what happened to intelligence amplification? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;whole brain emulation&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]]&lt;/ins&gt;, cognitive enhancement drugs or implanting technology). 
The idea is that intelligence amplification will give us aligned entities that are smarter than us, which will help us to eventually get a friendly AI. Intelligence amplification was one of the three &amp;quot;main&amp;quot; paths that were discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays when you look at sort of &amp;quot;who is working on AI safety full time?&amp;quot;, you have the technical researchers, who work on different agendas but all of them focused on aligning de novo AI, and then you have the policy people, who are looking at coordination.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* is there an easy problem of corrigibility? if so, what is it? if not, why did eliezer introduce the hard problem of corrigibility?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* is there an easy problem of corrigibility? if so, what is it? if not, why did eliezer introduce the hard problem of corrigibility?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* does subagent imply mesa-optimizer? (also in the mesa-optimizers paper, &amp;quot;learned algorithm&amp;quot; doesn&amp;#039;t seem like it needs to be learned, which is confusing)&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* does subagent imply mesa-optimizer? (also in the mesa-optimizers paper, &amp;quot;learned algorithm&amp;quot; doesn&amp;#039;t seem like it needs to be learned, which is confusing)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in IDA is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* i&amp;#039;m confused about whether/how the distilled agent in &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;IDA&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;is producing explanations of its own outputs, and also what these explanations look like.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between eliezer and carl ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* was the timeline discrepancy between &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;eliezer&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;and &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[[&lt;/ins&gt;carl&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]] &lt;/ins&gt;ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Issa</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=3&amp;oldid=prev</id>
		<title>Issa: Created page with &quot;* there&#039;s a bunch of different considerations that people talk about (like different takeoff scenarios, comparisons to nuclear arms control, etc.) and it&#039;s unclear to me how t...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.issarice.com/index.php?title=Stupid_questions&amp;diff=3&amp;oldid=prev"/>
		<updated>2020-02-17T23:34:43Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;* there&amp;#039;s a bunch of different considerations that people talk about (like different takeoff scenarios, comparisons to nuclear arms control, etc.) and it&amp;#039;s unclear to me how t...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;* there&amp;#039;s a bunch of different considerations that people talk about (like different takeoff scenarios, comparisons to nuclear arms control, etc.) and it&amp;#039;s unclear to me how the answers to these questions should influence our actions. even if we hammer out these strategy questions, would that change any of our actions? like if we suddenly knew with 100% certainty that there are three big insights needed to go from chimpanzee brains to human brains (but we wouldn&amp;#039;t know the content of the insights), what does that mean, in terms of what to do about AI safety?&lt;br /&gt;
* what is the minimal set of background assumptions/parameters that are needed to characterize the debate between eliezer and paul? (i am thinking of each person&amp;#039;s views as being &amp;quot;emergent&amp;quot; from some set of background assumptions.) e.g. [https://lw2.issarice.com/posts/mJ5oNYnkYrd4sD5uE/clarifying-some-key-hypotheses-in-ai-alignment] captures some of these, but i don&amp;#039;t think this is a minimal set (there are several others that i think are missing, and also some of the hypotheses listed here might be redundant/irrelevant.)&lt;br /&gt;
* &amp;lt;strike&amp;gt;in paul&amp;#039;s iterated amplification scheme, i don&amp;#039;t understand why we can&amp;#039;t just stop after the first iteration and use the human-level AI to do things; why do we have to keep amplifying?&amp;lt;/strike&amp;gt; -- i figured out the answer. i was mistaken about how capable the first round of IDA is, because the writeup itself was confusing. see [https://lw2.issarice.com/posts/HqLxuZ4LhaFhmAHWk/iterated-distillation-and-amplification-1#GeBLKN5FJjzDmdGqn my comment here] for more.&lt;br /&gt;
* what is the difference between informed oversight and reward engineering?&lt;br /&gt;
* what are some &amp;quot;easy&amp;quot;/doable open problems in agent foundations research? (if someone was doing a PhD in agent foundations, what problems would their advisor suggest for them?)&lt;br /&gt;
* what happened to intelligence amplification? in the early days of AI safety, people talked a lot about various intelligence amplification methods for navigating the singularity (e.g. cloning human geniuses, whole brain emulation, cognitive enhancement drugs, or implanted technology). the idea is that intelligence amplification will give us aligned entities that are smarter than us, which will eventually help us get a friendly AI. intelligence amplification was one of the three &amp;quot;main&amp;quot; paths that were discussed (along with technical alignment work, aka FAI theory, and coordination). nowadays when you look at &amp;quot;who is working on AI safety full time?&amp;quot;, you have the technical researchers, who work on different agendas but are all focused on aligning de novo AI, and then you have the policy people, who are looking at coordination.&lt;br /&gt;
* is there an easy problem of corrigibility? if so, what is it? if not, why did eliezer introduce the hard problem of corrigibility?&lt;br /&gt;
* AI safety prepping: what can individuals do to maximize their chances of surviving the singularity?&lt;br /&gt;
* does subagent imply mesa-optimizer? (also in the mesa-optimizers paper, &amp;quot;learned algorithm&amp;quot; doesn&amp;#039;t seem like it needs to be learned, which is confusing)&lt;br /&gt;
* i&amp;#039;m confused about whether/how the distilled agent in IDA is producing explanations of its own outputs, and also what these explanations look like.&lt;br /&gt;
* was the timeline discrepancy between eliezer and carl ever resolved? if so, what was the resolution/new estimate?&lt;/div&gt;</summary>
		<author><name>Issa</name></author>
		
	</entry>
</feed>