List of arguments against working on AI safety

From Issawiki

This is a list of arguments against working on AI safety. Personally I think the only one that's not totally weak is opportunity cost (in the de dicto sense that it's plausible that a higher priority cause exists, not in the de re sense that I actually have in mind a concrete higher priority cause), and for that I plan to continue to read somewhat widely in search of better cause areas.

  • Short-term altruist argument against AI safety: focusing on long-term issues (e.g. ensuring the survival of humanity over the long term) turns out not to be important, or it turns out to be too difficult to figure out how to affect the long-term future. See also Pascal's mugging and AI safety.
  • Safety by default argument against AI safety: AI will be more or less aligned to human interests by default, possibly by analogy to things like bridges and airplanes (i.e. it's bad if bridges randomly fall down, so engineers work hard by default to ensure bridges are safe), or because the alignment problem is actually very easy (e.g. instrumental convergence does not hold so AIs will not try to manipulate humans). A special case is the AGI skepticism argument against AI safety. I think arguments like "AI takeover sounds like sci-fi so it can't happen"[1] also fit under here; if you unpack words like "sci-fi", it's just saying that AI will be safe and that it is silly to worry about unsafe AI. Specific naive proposals to align AI (like "just put it in a box" or "just merge with the AI") also belong here.[1]
  • Doomer argument against AI safety: we are so screwed that it's not even worth working on AI safety. A variant combines this with the safety by default argument against AI safety: there are various worldviews about AI safety; in the more optimistic ones things will very likely go right, or additional effort has no effect on the probability of existential catastrophe, so it's not worth working on it; and in the more pessimistic ones things are almost sure to fail, so there is no point in working on it.
  • Objective morality argument against AI safety: all sufficiently intelligent beings converge to some objective morality (either because moral realism is true, or due to acausal trade as discussed in "The Hour I First Believed"), so there is no need to worry about superintelligent AI going against human values (or, put differently, if the AI goes against human values, it is because humans are wrong to have those values, so nothing is lost in a cosmic sense). In other words, this argument explicitly denies the orthogonality thesis.
  • Perpetual slow growth argument against AI safety: explosive growth (such as recursive self-improvement or an em economy) is not possible, so there is no need to worry about the world changing rapidly once AGI arrives.
  • AI will solve everything argument against AI safety
  • Pascal's mugging and AI safety: AI safety work is sketchy because it's hoping for a huge payoff that has very tiny probability, and this kind of reasoning doesn't seem to work well as demonstrated by the Pascal's mugging thought experiment. Related to the short-term altruist argument against AI safety.
  • Unintended consequences of AI safety advocacy argument against AI safety: AI safety is important, but working on it now or advocating for people to work on it has bad effects like more people going into AI capabilities research, or people thinking AI safety is full of crackpots, or worsening race dynamics around the development of AGI.
  • Opportunity cost argument against AI safety: AI safety is important, but there is some more pressing problem for humanity (e.g. some other x-risk like biorisks; basically something that is even more likely to kill us or arriving even sooner) or maybe some other intervention like values spreading that is more cost effective. This could be true for several reasons: AI timelines are long so something else that's big is likely to happen before then, some other concrete risk looks more promising to work on, working on AI safety is unproductive for now without advanced AI systems to test safety techniques on, or some sort of 'unknown unknowns' argument that there is some Cause X that is yet to be discovered. All of the other arguments also agree with the opportunity cost argument in a sense: if you believe AI safety is not a top priority, then you believe there is some other thing that is of higher priority. So in order for the opportunity cost argument to not collapse to one of the other arguments, it seems important to believe in the importance of AI safety to at least some extent.
  • AI won't kill everyone argument against AI safety: AI may cause a lot of destruction if unaligned, but even an unaligned AI won't be able to kill literally everyone. Therefore, selfishly prepping for AI makes more sense than working to align AI. A subset of this might be "AI will keep us as pets (even as it drastically changes the universe)".[1]

Buck lists a few more, but I don't actually think those are such good counter-arguments.

TODO: add this [1] (is it opportunity cost? or something separate like "ideas bottleneck"?)

TODO: go through this list

TODO: go through


  • Roman V. Yampolskiy. "AI Risk Skepticism". 2021. -- This paper provides a taxonomy of reasons AI safety skeptics bring up. However, I don't really like the way the arguments are organized in this paper, and many of them are very similar (I think most of them fit under what I call safety by default argument against AI safety).
  1. 1.0 1.1 1.2
  2. "Namely, we can’t take into account the fantastically chaotic and unpredictable reactions of humans. And we can’t program a system that has complete knowledge of the physical universe without allowing it to do experiments and acquire empirical knowledge, at a rate determined by the physical world. Exactly the infirmities that prevent us from exploring the entire space of behavior of one of these systems in advance is the reason that it’s not going to be superintelligent in the way that these scenarios outline."