Search results

Create the page "AI safety" on this wiki! See also the search results found.

Page title matches

List of disagreements in AI safety
...sagreements in AI safety''' which collects the list of things people in AI safety seem to most frequently and deeply disagree about. ...d/1wI21XP-lRa6mi5h0dq_USooz0LpysdhS/view Clarifying some key hypotheses in AI alignment].</ref> (there are more posts like this, i think? find them)

21 KB (3,254 words) - 11:00, 26 February 2022
AI safety field consensus
People in [[AI safety]] tend to [[List of disagreements in AI safety|disagree about many things]]. However, there is also wide agreement about s * advanced AI will have a huge impact on the world

2 KB (272 words) - 01:33, 13 May 2020
AI will solve everything argument against AI safety
...(all of our other problems are so pressing that we're willing to gamble on AI working out by default). I don't think this argument makes much sense. ...at’s one of the reasons why I’m focusing on AI safety, rather than bio-safety.</p>

1 KB (261 words) - 20:02, 23 June 2021
My current thoughts on the technical AI safety pipeline (outside academia)
* [[AI safety technical pipeline does not teach how to start having novel thoughts]] * [[AI safety is not a community]]

931 bytes (138 words) - 01:30, 20 May 2020
AI safety technical pipeline does not teach how to start having novel thoughts
Currently, the [[AI safety community]] does not have an explicit mechanism for teaching new people how [[Category:AI safety meta]]

1 KB (225 words) - 02:28, 28 March 2021
AI safety is not a community
...lly, I think I've been in communities before, and being a part of the [[AI safety community]] does not feel like that. * AI safety has left the "hobbyist stage". People can actually now get paid to think ab

894 bytes (159 words) - 02:28, 28 March 2021
AI safety lacks a space to ask stupid or ballsy questions
[[Category:AI safety meta]]

1 KB (198 words) - 02:28, 28 March 2021
Timeline of my involvement in AI safety
[[Category:AI safety meta]]

283 bytes (42 words) - 21:27, 18 May 2020
AI safety is harder than most things
...nice thank you letter from a different person. In contrast, working on AI safety feels like .... there's absolutely no feedback on whether I'm doing anythin When you've been at AI safety for too long, you're so used to just "[staring] at a blank sheet of paper u

1 KB (190 words) - 02:28, 28 March 2021
Nobody understands what makes people snap into AI safety
...g else to work on AI safety''. Cue all the rationalists who "believe in AI safety" but don't do anything about it. ..., and the people who are actually spending their full-time attention on AI safety? I think this is a very important question, and I don't think anybody under

2 KB (268 words) - 02:35, 28 March 2021
Newcomers in AI safety are silent about their struggles
...tty hard to find people openly complaining about how to get involved in AI safety. You can find some random comments, and there are occasional Facebook threa # For people who want to do technical AI safety research, how are they deciding between MIRI vs Paul vs other technical age

1 KB (195 words) - 02:35, 28 March 2021
Mass shift to technical AI safety research is suspicious
...rough existing evidence and getting better evidence (e.g. from progress in AI research), rather than due to less virtuous reasons like groupthink/prestig [[Category:AI safety meta]]

910 bytes (138 words) - 02:35, 28 March 2021
List of thought experiments in AI safety
...[[Counterfactual of dropping a seed AI into a world without other capable AI]] [[Category:AI safety]]

448 bytes (72 words) - 07:27, 20 May 2020
List of arguments against working on AI safety
This is a '''list of arguments against working on AI safety'''. Personally I think the only one that's not totally weak is opportunity ...out how to affect the long-term future. See also [[Pascal's mugging and AI safety]].

8 KB (1,245 words) - 00:29, 24 July 2022
List of AI safety projects I could work on
* writing some sort of overview of my beliefs regarding AI safety. like, if i was explaining things from scratch to someone, what would that * my current take on [[AI timelines]] (vacation tier)

6 KB (927 words) - 14:25, 4 February 2022
How meta should AI safety be?
I often go back and forth between the following two approaches to AI safety: ...s/conferences. Trust that making a better community will lead to better AI safety work being done.

873 bytes (144 words) - 02:33, 28 March 2021
Pascal's mugging and AI safety
...to [[Pascal's mugging]]. The critic of AI safety argues that working on AI safety has a very small probability of a very big payoff, which sounds suspicious. * Argue that reducing x-risk from AI safety is more like a 1% chance than like an astronomically small chance.

1 KB (147 words) - 22:16, 17 November 2020
Pascal's mugging argument against AI safety
#redirect [[Pascal's mugging and AI safety]]

44 bytes (6 words) - 23:24, 12 November 2020
Unintended consequences of AI safety advocacy argument against AI safety
...as to cause the creation of DeepMind and OpenAI, and to accelerate overall AI progress. I’m not saying that he’s necessarily right, and I’m not say * [[List of arguments against working on AI safety]]

2 KB (353 words) - 21:23, 6 November 2021
Is AI safety no longer a scenius?
...rofessionalized and prestigious. As Nielsen says (abstractly, not about AI safety in particular): "A field that is fun and stimulating when 50 people are inv ...as scenius? Or try to work on [[mechanism design]] so that the larger [[AI safety community]] is more functional than existing "eternal september" type event

2 KB (337 words) - 02:34, 28 March 2021
Emotional difficulties of AI safety research
...ties arising from the subject matter itself, without reference to the [[AI safety community]] * [[AI safety has many prerequisites]]

950 bytes (132 words) - 18:25, 18 July 2021
Doomer argument against AI safety
[[Category:AI safety]]

172 bytes (19 words) - 20:55, 18 March 2022

Page text matches

Main Page
* [[:Category:AI safety]] -- notes on AI safety strategy

572 bytes (84 words) - 21:22, 19 March 2021
Stupid questions
...ent of the insights), what does that mean, in terms of what to do about AI safety? ...lw2.issarice.com/posts/mJ5oNYnkYrd4sD5uE/clarifying-some-key-hypotheses-in-ai-alignment] captures some of these, but i don't think this is a minimal set

3 KB (528 words) - 20:58, 26 March 2021
Carl Shulman
[[Category:AI safety]]

315 bytes (43 words) - 21:52, 28 November 2020
Whole brain emulation
...standing is that MIRI people/other smart people have prioritized technical AI alignment over WBEs because while WBEs would be safer if they came first, p * is there anything else relevant to AI strategy that i should know about?

6 KB (850 words) - 19:16, 27 February 2021
Secret sauce for intelligence
...n AI safety. Resolving this is important for thinking about the shape of [[AI takeoff]]. ...mpy]]", i.e. coming in a small number of chunks that contribute greatly to AI capabilities; there are a small number of discrete insights required to cre

13 KB (1,917 words) - 23:45, 19 May 2021
Different senses of claims about AGI
...rstand good consequentialist reasoning in order to design a highly capable AI system, I’d be less worried by a decent margin." the general MIRI view th ...ve this for aligned AI systems, but not believe it for unaligned/arbitrary AI systems.

1 KB (212 words) - 22:14, 28 April 2020
List of teams at OpenAI
* Safety [https://80000hours.org/podcast/episodes/danny-hernandez-forecasting-ai-progress/] ...nking that enable humans to meaningfully understand, supervise and control AI systems." [http://webcache.googleusercontent.com/search?q=cache:WxgzREJyPTk

6 KB (882 words) - 05:20, 7 October 2020
Hardware-driven vs software-driven progress
https://aiimpacts.org/how-ai-timelines-are-estimated/ ...strategy is the best approach." [https://www.technologyreview.com/s/615181/ai-openai-moonshot-elon-musk-sam-altman-greg-brockman-messy-secretive-reality/

649 bytes (76 words) - 03:59, 26 April 2020
Soft-hard takeoff
...e there is a stereotypical "soft takeoff" until around the point where the AI has somewhat-infra-human level general intelligence, and then once it cross ...the prior emergence of potentially-strategically-decisive AI — that is, AI capabilities that are potentially decisive when employed by some group of i

8 KB (1,206 words) - 01:43, 2 March 2021
Simple core of consequentialist reasoning
...e is a small core of good consequentialist reasoning that is important for AI capabilities and that can be discovered through theoretical research." http [[Category:AI safety]]

1 KB (217 words) - 19:11, 27 February 2021
Analyzing disagreements
...tions about disagreements, particularly disagreements in ai safety about [[AI timelines]], [[takeoff speed]], [[simple core algorithm of agency]], and so ...t strong arguments on multiple sides.) Given this theory, it feels like AI safety should be a one-sided debate; it's a simple matter of fact, so we shouldn't

7 KB (1,087 words) - 22:52, 8 February 2021
Hardware argument for AI timelines
In the context of [[AI timelines]], the '''hardware argument''' is a common argument structure for ...0/10/why_early_singularities_are_softer.html and https://aiimpacts.org/how-ai-timelines-are-estimated/

5 KB (740 words) - 00:24, 12 July 2021
Future planning
** AI safety vs something else? right now AI safety seems like the best candidate for the biggest/soonest change, but i want to ** if AI safety, then what technical agenda seems best? this matters for (1) deciding what

2 KB (285 words) - 18:53, 7 September 2020
How doomed are ML safety approaches?
I want to understand better the MIRI case for thinking that ML-based safety approaches (like [[Paul Christiano]]'s agenda) are so hopeless as to not be # a highly intelligent AI would see things humans cannot see, can arrive at unanticipated solutions,

5 KB (765 words) - 02:32, 28 March 2021
My understanding of how IDA works
...ned/doing things that are "good for Hugh" in some sense; (2) the resulting AI is competitive; (3) Hugh doesn't have a clue what is going on. Many explana ...arial training, verification, transparency, and other measures to keep the AI aligned fit into the scheme. This is a separate confusion I (and [https://w

9 KB (1,597 words) - 00:27, 6 October 2020
Minimal AGI vs task AGI
[[Category:AI safety]]

770 bytes (115 words) - 19:10, 27 February 2021
Counterfactual of dropping a seed AI into a world without other capable AI
...self-improvement." [https://lw2.issarice.com/posts/5WECpYABCT62TJrhY/will-ai-undergo-discontinuous-progress] "When we build AGI we will be optimizing the chimp-equivalent-AI for usefulness, and it will look nothing like an actual chimp (in fact it w

3 KB (426 words) - 20:51, 15 March 2021
Comparison of AI takeoff scenarios
...rategic advantage? / Unipolar outcome? (i.e. not distributed) Can a single AI project get massively ahead (either by investing way more effort into build ...an does our economy today. The issue is the relative rate of growth of one AI system, across a broad range of tasks, relative to the entire rest of the w

4 KB (635 words) - 00:50, 5 March 2021
AlphaGo
...d its successor '''AlphaGo Zero''' are used to make various points in [[AI safety]]. * a single architecture / basic AI technique working for many different games ([[single-architecture generalit

5 KB (672 words) - 20:19, 11 August 2021
Iterated amplification
* [[prosaic AI]] [[Category:AI safety]]

1 KB (125 words) - 03:58, 26 April 2020
Paperclip maximizer
...telligence/capability and values can vary orthogonally; a superintelligent AI need not realize that "making paperclips is stupid" and decide to maximize * [[instrumental convergence]]: even if an AI isn't deliberately trying to hurt us (as a terminal value), it will still p

2 KB (344 words) - 23:29, 27 July 2020
Evolution
In discussions of AI risk, evolution (especially hominid evolution, as it is the only example we [[Category:AI safety]]

799 bytes (112 words) - 23:46, 19 May 2021
Will there be significant changes to the world prior to some critical AI capability threshold being reached?
'''Will there be significant changes to the world prior to some critical AI capability threshold being reached?''' This is currently one of the questio [[Eliezer]]: "And the relative rate of growth between AI capabilities and human capabilities, and the degree to which single investm

2 KB (251 words) - 02:40, 28 March 2021
Human safety problem
...s. For instance, [[distributional shift]] is an AI safety problem where an AI trained in one environment will behave poorly when deployed in an unfamilia ! AI safety problem !! Human safety problem

3 KB (379 words) - 20:36, 11 November 2021
Value learning
[[Category:AI safety]]

159 bytes (19 words) - 04:53, 30 March 2021
Rapid capability gain vs AGI progress
* [[rapid capability gain]] seems to refer to how well a ''single'' AI system improves over time, e.g. [[AlphaGo]] going from "no knowledge of any * AGI progress refers to progress of AI systems in general, i.e. if you plot the state of the art over time

774 bytes (121 words) - 19:17, 27 February 2021
List of technical AI alignment agendas
This is a '''list of technical AI alignment agendas'''. * [[Comprehensive AI services]]

586 bytes (56 words) - 21:29, 2 April 2021
Jessica Taylor
The most important posts for AI strategy are: ...129 My current take on the Paul-MIRI disagreement on alignability of messy AI]

300 bytes (42 words) - 19:10, 27 February 2021
Prosaic AI
what are the possibilities for prosaic AI? i.e. if prosaic AI happened, then what are some possible reasons for why this happened? some i ...xisting ML systems (or straightforward tweaks to them) somehow produces an AI that fully "understands" things and can do everything humans can; (2) there

2 KB (247 words) - 19:10, 27 February 2021
List of disagreements in AI safety
...sagreements in AI safety''' which collects the list of things people in AI safety seem to most frequently and deeply disagree about. ...d/1wI21XP-lRa6mi5h0dq_USooz0LpysdhS/view Clarifying some key hypotheses in AI alignment].</ref> (there are more posts like this, i think? find them)

21 KB (3,254 words) - 11:00, 26 February 2022
The Uncertain Future
[[Category:AI safety]]

508 bytes (62 words) - 19:12, 27 February 2021
Deconfusion
...echnical AI alignment. The term has since been used in contexts outside of AI alignment as well. [[Category:AI safety]]

346 bytes (55 words) - 19:18, 27 February 2021
AI safety field consensus
People in [[AI safety]] tend to [[List of disagreements in AI safety|disagree about many things]]. However, there is also wide agreement about s * advanced AI will have a huge impact on the world

2 KB (272 words) - 01:33, 13 May 2020
Wei Dai
'''Wei Dai''' is an AI safety researcher. In the technical AI safety community he is most well-known as the inventor of [[updateless decision th * [[human safety problem]]s

387 bytes (51 words) - 20:47, 11 August 2021
Agent foundations
[[Category:AI safety]]

1 KB (195 words) - 19:08, 27 February 2021
Simple core
...mple core''' is a term that has been used to describe various things in AI safety. I think the "simple" part is intended to capture that the thing is human-u * Simple core to [[corrigibility]]<ref>https://ai-alignment.com/corrigibility-3039e668638</ref><ref>https://www.greaterwrong.

824 bytes (120 words) - 21:32, 2 April 2021
Coherence and goal-directed agency discussion
...ssions in AI alignment|big discussions]]" that has been taking place in AI safety in 2018–2020 is about [[coherence argument]]s and [[goal-directed]] a https://www.greaterwrong.com/posts/tHxXdAn8Yuiy9y2pZ/ai-safety-without-goal-directed-behavior

1 KB (183 words) - 19:09, 27 February 2021
List of terms used to describe the intelligence of an agent
[[Category:AI safety]]

782 bytes (108 words) - 20:56, 26 March 2021
Comparison of terms related to agency
* does it pursue omohundro's "basic AI drives"? (i.e. it is the subject of instrumental convergence) * mindblind AI

4 KB (534 words) - 19:09, 27 February 2021
MIRI vs Paul research agenda hypotheses
* "The first AI systems capable of pivotal acts will use good consequentialist reasoning." * "The default AI development path will not produce good consequentialist reasoning at the to

4 KB (702 words) - 03:57, 26 April 2020
Selection effect for who builds AGI
[[Category:AI safety]]

815 bytes (133 words) - 19:18, 27 February 2021
List of big discussions in AI alignment
[[Category:AI safety]]

212 bytes (21 words) - 19:10, 27 February 2021
UDASSA
[[Category:AI safety]]

14 KB (2,432 words) - 09:13, 8 January 2023
Late singularity
...ctually better? if AGI is developed in 200 years, what does this say about ai xrisk? this could happen for several reasons: ...l be much harder than building agi", then this might push you to think "ai safety is basically impossible for humans without intelligence enhancement to solv

1 KB (248 words) - 21:33, 4 April 2021
AI prepping
'''AI prepping''' refers to selfish actions one can take in order to survive when It's not clear whether any really good actions for AI prepping exist. Some reasons for optimism are:

6 KB (968 words) - 04:20, 26 November 2022
Lumpiness
...ften used by [[Robin Hanson]] to describe things like innovation, secrets, AI progress, citations. ...ot lumpy. Aren't power laws lumpy? actually maybe he's only saying that if AI progress is lumpy, then its citation patterns should be even lumpier than u

4 KB (648 words) - 06:53, 3 June 2020
Laplace's rule of succession argument for AI timelines
The '''Laplace's rule of succession argument for AI timelines''' uses [[wikipedia:Rule of succession|Laplace's rule of successi ....issarice.com/posts/Ayu5im98u8FeMWoBZ/my-personal-cruxes-for-working-on-ai-safety#AI_timelines

1 KB (218 words) - 02:04, 5 April 2021
Statistical analysis of expert timelines argument for AI timelines
...analysis of experts' AI timelines to come up with some overall estimate of AI timelines. It punts the question of "but where did the experts get their op ..."AI researchers attending X conference", "AI researchers in general", "AI safety researchers").

641 bytes (101 words) - 05:10, 9 April 2021
Kasparov window
I think [[Eliezer]]'s point is that when there's more hardware behind an AI project, the Kasparov window is narrower. ...series of blog posts from [[AI Impacts]] https://aiimpacts.org/?s=time+for+ai+to+cross

320 bytes (55 words) - 22:02, 4 January 2021
Selection effect for successful formalizations
....com/posts/6skeZgctugzBBEBw3/ai-alignment-podcast-an-overview-of-technical-ai-alignment] [[Category:AI safety]]

746 bytes (115 words) - 20:41, 12 April 2021
AI will solve everything argument against AI safety
...(all of our other problems are so pressing that we're willing to gamble on AI working out by default). I don't think this argument makes much sense. ...at’s one of the reasons why I’m focusing on AI safety, rather than bio-safety.</p>

1 KB (261 words) - 20:02, 23 June 2021
Richard Ngo
...f>[https://forum.effectivealtruism.org/users/richard_ngo richard_ngo]. "AI safety research engineer at DeepMind (all opinions my own, not theirs). I'm from N [[Category:AI safety]]

584 bytes (88 words) - 19:11, 27 February 2021
Incremental reading in Anki
Incremental reading provides feeling of emotional safety (which is something that [[Anki]] does in general, but where I think increm feeling like i should maybe ankify some of my ai safety reading from LW, but it's been hard to think of what to even put in. some t

4 KB (687 words) - 01:10, 17 July 2021
Spaced proof review
...eal. This is possible with math, but i'm not sure how to do this with [[AI safety]] (it's not like there's problems i can solve).

8 KB (1,497 words) - 00:01, 2 August 2021
Architecture
...architecture''' is used to mean ... something like the basic design of the AI system (like what kind of machine learning is being used in what way, what ..."mental architecture", "cognitive architecture", the "architecture of the AI"

7 KB (1,128 words) - 23:18, 23 June 2020
One wrong number problem
[[Category:AI safety]]

1 KB (170 words) - 06:54, 3 June 2020
Discontinuities in usefulness of whole brain emulation technology
* [[Counterfactual of dropping a seed AI into a world without other capable AI]] [[Category:AI safety]]

1 KB (180 words) - 09:49, 6 May 2020
Something like realism about rationality
...alism about rationality''' is a topic of debate among people working on AI safety. The "something like" refers to the fact that the very topic of ''what the ...uct to achieve an agreed-upon aim, namely helping to detect/fix/ensure the safety of AGI systems.)

7 KB (1,110 words) - 20:24, 26 June 2020
Competence gap
...nd its ability to solve alignment problems (i.e. design better ''aligned'' AI systems). ...r is imagining some big leap/going from just humans to suddenly superhuman AI, whereas paul is imagining a more smooth transition that powers his optimis

3 KB (477 words) - 00:01, 30 May 2020
Secret sauce for intelligence vs specialization in intelligence
...t [[missing gear]]'/'one wrong number' dynamic, AND each insight makes the AI a little better), then you can't specialize in "intelligence". [[Category:AI safety]]

577 bytes (96 words) - 23:01, 6 July 2020
Narrow window argument against continuous takeoff
...the AI will not hit the human timescale keyhole." From our perspective, an AI will either be so slow as to be bottlenecked, or so fast as to be FOOM. Whe ...challenge time in advance, rather than challenging at a point where their AI seemed just barely good enough, it was improbable that they'd make *exactly

4 KB (728 words) - 16:56, 24 June 2020
List of critiques of iterated amplification
[[Category:AI safety]]

1 KB (130 words) - 19:55, 31 May 2021
Intelligence amplification
[[Category:AI safety]]

61 bytes (8 words) - 01:10, 18 May 2020
There is room for something like RAISE
Self-studying all of the technical prerequisites for [[technical AI safety research]] is hard. The most that people new to the field get is a list of ...pessimism: If hiring capacity is limited at AI safety orgs and mainstream AI orgs only want to hire ML PhDs then new people entering the field will basi

3 KB (447 words) - 18:34, 18 July 2021
My current thoughts on the technical AI safety pipeline (outside academia)
* [[AI safety technical pipeline does not teach how to start having novel thoughts]] * [[AI safety is not a community]]

931 bytes (138 words) - 01:30, 20 May 2020
AI safety technical pipeline does not teach how to start having novel thoughts
Currently, the [[AI safety community]] does not have an explicit mechanism for teaching new people how [[Category:AI safety meta]]

1 KB (225 words) - 02:28, 28 March 2021
AI safety is not a community
...lly, I think I've been in communities before, and being a part of the [[AI safety community]] does not feel like that. * AI safety has left the "hobbyist stage". People can actually now get paid to think ab

894 bytes (159 words) - 02:28, 28 March 2021
Mixed messaging regarding independent thinking
I think the [[AI safety community]] and [[effective altruism]] in general has some mixed messaging [[Category:AI safety meta]]

866 bytes (142 words) - 20:38, 18 May 2020
There is pressure to rush into a technical agenda
AI safety has a weird dynamic going on where: * There are discussions of things like AI timelines, assumptions of various technical agendas, etc., which reveals th

3 KB (435 words) - 02:38, 28 March 2021
AI safety lacks a space to ask stupid or ballsy questions
[[Category:AI safety meta]]

1 KB (198 words) - 02:28, 28 March 2021
Timeline of my involvement in AI safety
[[Category:AI safety meta]]

283 bytes (42 words) - 21:27, 18 May 2020
My take on RAISE
* I'm not sure about the value of explaining things better in AI safety in general: it seems like this would significantly lower the bar to entry ( ...ing for "this is the most complete and coherent curriculum of technical AI safety learning in the world". I actually think it isn't too hard to just cobble t

1 KB (218 words) - 21:33, 18 May 2020
AI safety is harder than most things
...nice thank you letter from a different person. In contrast, working on AI safety feels like .... there's absolutely no feedback on whether I'm doing anythin When you've been at AI safety for too long, you're so used to just "[staring] at a blank sheet of paper u

1 KB (190 words) - 02:28, 28 March 2021
Nobody understands what makes people snap into AI safety
...g else to work on AI safety''. Cue all the rationalists who "believe in AI safety" but don't do anything about it. ..., and the people who are actually spending their full-time attention on AI safety? I think this is a very important question, and I don't think anybody under

2 KB (268 words) - 02:35, 28 March 2021
Giving advice in response to generic questions is difficult but important
[[Category:AI safety meta]]

1 KB (180 words) - 02:32, 28 March 2021
Newcomers in AI safety are silent about their struggles
...tty hard to find people openly complaining about how to get involved in AI safety. You can find some random comments, and there are occasional Facebook threa # For people who want to do technical AI safety research, how are they deciding between MIRI vs Paul vs other technical age

1 KB (195 words) - 02:35, 28 March 2021
It is difficult to find people to bounce ideas off of
[[Category:AI safety meta]]

188 bytes (31 words) - 02:34, 28 March 2021
It is difficult to get feedback on published work
...ities that make a person better: has spent a lot of time thinking about AI safety (or is willing to spend a lot of time to catch up), not afraid to dig into ...e there is no consensus about what is "good"; both of these are true in AI safety

602 bytes (108 words) - 02:34, 28 March 2021
Ongoing friendship and collaboration is important
I think one of the reasons that [[AI safety is not a community]] is that it's difficult to find these deep connections. [[Category:AI safety meta]]

1 KB (195 words) - 02:35, 28 March 2021
Mass shift to technical AI safety research is suspicious
...rough existing evidence and getting better evidence (e.g. from progress in AI research), rather than due to less virtuous reasons like groupthink/prestig [[Category:AI safety meta]]

910 bytes (138 words) - 02:35, 28 March 2021
List of thought experiments in AI safety
...[[Counterfactual of dropping a seed AI into a world without other capable AI]] [[Category:AI safety]]

448 bytes (72 words) - 07:27, 20 May 2020
Content sharing between AIs
* content sharing rarely happens in AI [[Category:AI safety]]

229 bytes (33 words) - 07:59, 20 May 2020
Science argument
...en with AI as well: that there is some sort of core insight that allows an AI to suddenly have much more control over the world, rather than gaining capa [[Category:AI safety]]

1 KB (166 words) - 07:09, 15 June 2021
List of breakthroughs plausibly needed for AGI
...is also important for understanding Eliezer's view about what progress in AI looks like. see https://www.lesswrong.com/posts/5WECpYABCT62TJrhY/will-ai-undergo-discontinuous-progress#The_Conceptual_Arguments

2 KB (269 words) - 07:11, 17 June 2020
Resource overhang
==Resource overhang and AI takeoff== ...so.<ref>https://www.greaterwrong.com/posts/N6vZEnCn6A95Xn39p/are-we-in-an-ai-overhang</ref> See [[scaling hypothesis]].

867 bytes (132 words) - 03:19, 24 February 2021
Hardware overhang
[[Category:AI safety]]

329 bytes (40 words) - 20:58, 27 July 2020
Corrigibility
'''Corrigibility''' is a term used in AI safety with multiple/unclear meanings. I think the term was originally used by [[MIRI]] to mean something like an AI that allowed human programmers to shut it off.

773 bytes (110 words) - 23:29, 8 November 2021
Goalpost for usefulness of HRAD work
...about how "complete axiomatic descriptions" haven't been useful so far in AI, and how they aren't used to describe machine learning systems ...et at an easier spot by MIRI: "Techniques you can actually adapt in a safe AI, come the day, will probably have very simple cores — the sort of core co

3 KB (480 words) - 20:17, 26 June 2020
Highly reliable agent designs
[[List of disagreements in AI safety#Highly reliable agent designs]] [[Category:AI safety]]

342 bytes (56 words) - 06:21, 27 May 2020
Online question-answering services are unreliable
...ts.stackexchange.com/users/273265/riceissa?tab=questions</ref><ref>https://ai.stackexchange.com/users/33930/riceissa?tab=questions</ref><ref>https://biol [[Category:AI safety meta]]

1 KB (159 words) - 02:36, 28 March 2021
Unreliability of online question-answering services makes it emotionally taxing to write up questions
[[Category:AI safety meta]]

428 bytes (69 words) - 02:39, 28 March 2021
AI timelines
'''AI timelines''' refers to the question of when we will see advanced AI technology. For now, see [[List of disagreements in AI safety#AI timelines]]

691 bytes (93 words) - 01:36, 5 April 2021
List of success criteria for HRAD work
* early advanced AI systems will be understandable in terms of HRAD's formalisms [https://eafor * helps AGI programmers fix problems in early advanced AI systems

4 KB (521 words) - 20:18, 26 June 2020
Missing gear vs secret sauce
...big breakthrough? !! Nature of final piece !! Found by humans or found by AI? !! Length of lead time prior to final piece !! Number of pieces !! Explana ...essarily || Restricts the final piece to be about understanding, where the AI goes from "not understanding" to "understanding" something. ||

2 KB (329 words) - 21:16, 9 June 2020
Missing gear for intelligence
...gence]], the missing gear argument does not require that the final part of AI development be a huge conceptual breakthrough; instead, the final piece is ...lly helpful (because it can finally automate some particular part of doing AI research). Then the first project that gets to that point can suddenly grow

8 KB (1,275 words) - 21:42, 30 June 2020
Politicization of AI
...icization of AI''' is a hypothetical concern that discussions about AI and AI risk will become politicized/politically polarized, similar to how discussi ...://www.greaterwrong.com/posts/x4tyb9di28b4n9EE2/trying-for-five-minutes-on-ai-strategy/comment/aPqz3GBypvreaMqnT

1 KB (194 words) - 22:06, 11 July 2021
Fractional progress argument for AI timelines
...the '''extrapolation argument for AI timelines''' takes existing trends in AI progress as well as some cutoff level for AGI to extrapolate when we will g "How capable are the best AI systems today compared to AGI? At the current rate of progress, how long wi

933 bytes (143 words) - 00:19, 12 July 2021
Progress in self-improvement
...adual: before there is an AI that is good at self-improvement, there is an AI that is somewhat good at self-improvement, and so on. ...e moving fast already."<ref>https://meteuphoric.com/2009/10/16/how-far-can-ai-jump/</ref>

3 KB (471 words) - 17:15, 24 June 2020
Kanzi
[[Category:AI safety]]

1 KB (203 words) - 21:30, 30 June 2020
AlphaGo as evidence of discontinuous takeoff
...o all kinds of incremental technological precursors to AlphaGo in terms of AI technology, but they wouldn't be smooth precursors on a graph of Go-playing ...ugh before they make it out to the world? this question matters mostly for AI systems that produce a large qualitative shift in how good they are (like a

3 KB (456 words) - 23:35, 2 August 2022
Short-term preferences-on-reflection
...in hindsight. The logic here is that the user always knows better than the AI." [https://docs.google.com/document/d/11QGpURtFF-JFnWjkdIybW9Q-mI3fiKJjhxc0 [[Category:AI safety]]

890 bytes (143 words) - 23:00, 26 August 2020
Meta-execution
[[Category:AI safety]]

240 bytes (29 words) - 19:02, 23 September 2020
Hyperbolic growth
...ation of takeoff shape has more to do with the inside view details of what AI will look like, and doesn't have anything to do with whether or not the Ind ...t: This is one reason why we shouldn’t use GDP extrapolations to predict AI timelines. It’s like extrapolating global mean temperature trends into th

2 KB (282 words) - 00:07, 2 March 2021
Second species argument
"the second species argument that sufficiently intelligent AI systems could become the most intelligent species, in which case humans cou [[Category:AI safety]]

334 bytes (42 words) - 04:22, 22 October 2020
Sudden emergence
...l systems, followed by a sudden jump to extremely capable/[[Transformative AI|transformative]] systems. Another way to phrase sudden emergence is as "a d The term was coined by [[Ben Garfinkel]] in "[[On Classic Arguments for AI Discontinuities]]".<ref>https://docs.google.com/document/d/1lgcBauWyYk774gB

1 KB (151 words) - 23:23, 25 May 2021
AI takeoff
...n or gradual, how quickly economic activity will accelerate after advanced AI systems appear, and so on). * [[List of disagreements in AI safety#Takeoff dynamics]]

734 bytes (101 words) - 01:01, 5 March 2021
List of arguments against working on AI safety
This is a '''list of arguments against working on AI safety'''. Personally I think the only one that's not totally weak is opportunity ...out how to affect the long-term future. See also [[Pascal's mugging and AI safety]].

8 KB (1,245 words) - 00:29, 24 July 2022
List of AI safety projects I could work on
* writing some sort of overview of my beliefs regarding AI safety. like, if i was explaining things from scratch to someone, what would that * my current take on [[AI timelines]] (vacation tier)

6 KB (927 words) - 14:25, 4 February 2022
How meta should AI safety be?
I often go back and forth between the following two approaches to AI safety: ...s/conferences. Trust that making a better community will lead to better AI safety work being done.

873 bytes (144 words) - 02:33, 28 March 2021
Pascal's mugging and AI safety
...to [[Pascal's mugging]]. The critic of AI safety argues that working on AI safety has a very small probability of a very big payoff, which sounds suspicious. * Argue that reducing x-risk from AI safety is more like a 1% chance than like an astronomically small chance.

1 KB (147 words) - 22:16, 17 November 2020
Pascal's mugging argument against AI safety
#redirect [[Pascal's mugging and AI safety]]

44 bytes (6 words) - 23:24, 12 November 2020
Continuous takeoff
...er Yudkowsky]]'s [[FOOM]] scenario, arguing that the transition from early AI systems to superintelligent systems will not be so immediate, but these vie ...y; even in a [[hard takeoff]] (i.e. "discontinuous takeoff") scenario, the AI system's capability can be modeled as a continuous function.

773 bytes (106 words) - 22:16, 1 March 2021
OpenAI
[[Category:AI safety]] [[Category:AI safety organizations]]

105 bytes (14 words) - 19:54, 22 March 2021
Asymmetric institution
[[Category:AI safety]]

3 KB (415 words) - 21:51, 12 March 2021
Spaced inbox ideas
...t thing is to be able to have separate topics, like my default inbox vs ai safety vs day job.

2 KB (305 words) - 01:17, 17 July 2021
Will it be possible for humans to detect an existential win?
...ct, it seems difficult to tell whether we've "won" or not. For example, an AI might convincingly explain to us that things are going well even when they I think a big part of why I am more pessimistic than most people in the [[AI safety community]] is that others think detecting an "[[existential win]]" will be

691 bytes (115 words) - 02:40, 28 March 2021
Existential win
[[Category:AI safety]]

305 bytes (48 words) - 01:02, 1 December 2020
Unintended consequences of AI safety advocacy argument against AI safety
...as to cause the creation of DeepMind and OpenAI, and to accelerate overall AI progress. I’m not saying that he’s necessarily right, and I’m not say * [[List of arguments against working on AI safety]]

2 KB (353 words) - 21:23, 6 November 2021
Changing selection pressures argument
...ers will always be optimizing for something like the scientific ability of AI systems. [[Category:AI safety]]

1 KB (178 words) - 04:17, 30 August 2021
Scaling hypothesis
...osaic AI? I think the scaling hypothesis implies prosaic AI, but a prosaic AI can make use of lots of different algorithms? * https://www.greaterwrong.com/posts/N6vZEnCn6A95Xn39p/are-we-in-an-ai-overhang/comment/jbD8siv7GMWxRro43

886 bytes (118 words) - 00:45, 12 March 2021
Christiano's operationalization of slow takeoff
...this operationalization has been cited by many others in discussions of [[AI takeoff]]. [[Category:AI safety]]

1,018 bytes (118 words) - 23:48, 25 February 2021
Paul Christiano
'''Paul Christiano''' is an AI safety researcher, previously at [[OpenAI]] and currently for some undisclosed pro [[Category:AI safety]]

234 bytes (27 words) - 23:00, 25 February 2021
HCH
[[Category:AI safety]]

416 bytes (61 words) - 23:03, 25 February 2021
Importance of knowing about AI takeoff
...of knowing about AI takeoff''' is about the "so what?" of knowing which [[AI takeoff]] scenario will happen. How will our actions change if we expect a ...ent) or short-term consumption. In contrast, with more continuous takeoff, AI prepping becomes relatively more important.

1 KB (181 words) - 02:13, 5 March 2021
Debates shift bystanders' beliefs
...useful isn't to shift MIRI or paul; it's so that new people coming into AI safety will pick the "correct" agenda to work on with higher probability. [[Category:AI safety]]

274 bytes (44 words) - 02:30, 28 March 2021
Emotional difficulties of spaced repetition
* [[Emotional difficulties of AI safety research]]

1 KB (160 words) - 18:21, 18 July 2021
Scenius
* [[Is AI safety no longer a scenius?]] [[Category:AI safety meta]]

557 bytes (79 words) - 12:44, 7 February 2022
Is AI safety no longer a scenius?
...rofessionalized and prestigious. As Nielsen says (abstractly, not about AI safety in particular): "A field that is fun and stimulating when 50 people are inv ...as scenius? Or try to work on [[mechanism design]] so that the larger [[AI safety community]] is more functional than existing "eternal september" type event

2 KB (337 words) - 02:34, 28 March 2021
Emotional difficulties of AI safety research
...ties arising from the subject matter itself, without reference to the [[AI safety community]] * [[AI safety has many prerequisites]]

950 bytes (132 words) - 18:25, 18 July 2021
Corrigibility may be undesirable
* a point originally made by [[Wei Dai]] is that if an AI is corrigible to its human operators, then it may have to forgo certain kin [[Category:AI safety]]

271 bytes (45 words) - 02:30, 28 March 2021
Reference class forecasting on human achievements argument for AI timelines
...able mathematical conjectures" to get an [[outside view]] probability of [[AI timelines]]. The [[Open Philanthropy]] report on semi-informative priors is [[Category:AI safety]]

824 bytes (111 words) - 00:14, 12 July 2021
Textbook test for AI theory
...ld the AI in less than a year (i.e. not including any of the data that the AI will use to learn from)? [[Category:AI safety]]

523 bytes (98 words) - 23:32, 18 June 2021
Explosive aftermath
...esis that once the first highly capable AI system is developed, thereafter AI systems will extremely rapidly improve to the level of [[superintelligence] The term was coined by [[Ben Garfinkel]] in "[[On Classic Arguments for AI Discontinuities]]".<ref>https://docs.google.com/document/d/1lgcBauWyYk774gB

1 KB (150 words) - 23:22, 25 May 2021
Robin Hanson
...n''' is an economist who has also written a lot about the future including AI stuff. [[Category:AI safety]]

122 bytes (20 words) - 23:33, 18 June 2021
Single-architecture generality
[[Category:AI safety]]

500 bytes (68 words) - 20:18, 11 August 2021
Single-model generality
[[Category:AI safety]]

358 bytes (48 words) - 20:19, 11 August 2021
Can the behavior of approval-direction be undefined or random?
...be convinced to give high approval to basically ''every'' action. Once the AI becomes good enough at persuasion, it wouldn't necessarily be malicious, bu Maybe the answer is something like "But you gradually train the AI, so at every point it's seeking the approval of a smarter amplified system,

846 bytes (132 words) - 17:24, 15 October 2021
Number of relevant actors around the time of creation of AGI
[[Category:AI safety]]

242 bytes (35 words) - 23:37, 8 November 2021
Weird recursion
[[Category:AI safety]]

184 bytes (23 words) - 23:38, 8 November 2021
Philosophical difficulty
* [[Human safety problem]] * [[Difficulty of AI alignment]]

243 bytes (33 words) - 10:59, 26 February 2022
Difficulty of AI alignment
[[Aligning smart AI using slightly less smart AI]] [[Category:AI safety]]

141 bytes (17 words) - 11:10, 26 February 2022
Aligning smart AI using slightly less smart AI
...tems slightly smarter than ourselves, and from there, each "generation" of AI systems will align slightly smarter systems, and so on. [[Category:AI safety]]

846 bytes (111 words) - 11:21, 26 February 2022
Pivotal act
...ct is an important part of the plan for preventing [[existential doom from AI]]. * Make progress on the full (i.e. not restricted to a limited AI system like present-day systems or [[minimal AGI]]) alignment problem faste

2 KB (218 words) - 15:15, 26 February 2022
Minimal AGI
[[Category:AI safety]]

244 bytes (26 words) - 14:42, 26 February 2022
Doomer argument against AI safety
[[Category:AI safety]]

172 bytes (19 words) - 20:55, 18 March 2022
How similar are human brains to chimpanzee brains?
[[Category:AI safety]]

144 bytes (25 words) - 22:25, 12 April 2022
Late 2021 MIRI conversations
...lesswrong.com/s/n945eovrA3oDueqtq/p/hwxj4gieR7FWNwYfa Ngo and Yudkowsky on AI capability gains] ...hether there will be a period of rapid economics progress from "pre-scary" AI before "scary" cognition appears (Eliezer doesn't think this is likely, but

6 KB (948 words) - 21:27, 1 August 2022

Search results

Page title matches

Page text matches

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools