Paperclip maximizer

My understanding is that the paperclip maximizer example was intended to illustrate the following two concepts:

  • Orthogonality thesis: intelligence/capability and values can vary orthogonally; a superintelligent AI need not realize that "making paperclips is stupid" and decide to maximize happiness instead (see the toy sketch after this list)
  • Instrumental convergence: even if an AI isn't deliberately trying to hurt us (as a terminal value), it will still probably kill us, because for almost any goal it is instrumentally valuable to acquire resources, disarm other agents that could interfere, etc.
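
To make the orthogonality point concrete, here is a toy sketch (my own illustration, not taken from any of the sources discussed on this page). The "planner" below just maximizes whatever utility function it is handed, so the same capability can be pointed at paperclips or at happiness. All of the actions, numbers, and function names are made up.

    from typing import Callable, Dict

    # A "world state" is just a dict of quantities the agent can influence.
    State = Dict[str, float]

    # Made-up actions and their made-up effects on the world.
    ACTIONS: Dict[str, State] = {
        "build_paperclip_factory": {"paperclips": 1000, "happiness": -5},
        "fund_mental_health":      {"paperclips": 0,    "happiness": 50},
    }

    def best_action(utility: Callable[[State], float]) -> str:
        # Generic "capability": pick whichever action scores highest under
        # the supplied utility function. Nothing here knows or cares what
        # the utility function values.
        return max(ACTIONS, key=lambda a: utility(ACTIONS[a]))

    def paperclip_utility(s: State) -> float:
        return s["paperclips"]

    def happiness_utility(s: State) -> float:
        return s["happiness"]

    print(best_action(paperclip_utility))   # -> build_paperclip_factory
    print(best_action(happiness_utility))   # -> fund_mental_health

The only point of the toy is that the "intelligence" part (best_action) and the "values" part (the utility function) are separate knobs: making the planner smarter does nothing, by itself, to change what it wants.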

However, the paperclip maximizer is sometimes also invoked when talking about more realistic AI designs, to make the point that the "random" goal an unaligned AI would end up with is about as valuable (to humans) as maximizing paperclips (complexity of value, fragility of value).

Many discussions of the paperclip maximizer don't seem to recognize which concepts the example is designed to illustrate, so they get derailed by comments like "but programming your AI to maximize paperclips is so stupid lol".

In https://youtu.be/EXbUgvlB0Zo?t=8391 (start listening slightly earlier), Rohin says that the paperclip maximizer example shows that it's actually hard to get an AI to make paperclips in a sane way. This seems not to be the main point of the paperclip maximizer example, but maybe I'm wrong about this.