Short-term preferences-on-reflection
I need to summarize this at some point.
Is this right? "Details: “Short-term preferences” does not mean “preferences over what can be achieved in the short term” but rather “things that I would still endorse in the short term”: an AI system optimizing for short-term preferences could still do long-term planning and act accordingly, but it does so because that is what the human wants it to do at that moment. If we changed our minds, the AI would change its actions too. It would not take actions that we currently oppose, even if ten years later we would judge them in hindsight to have been good. The logic here is that the user always knows better than the AI." [1]
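To make the distinction concrete for myself, a toy sketch (my own illustration, not anything from [1]; `make_plan`, `goal_locked_agent`, and `deferential_agent` are all made-up names): both agents can plan over a long horizon, but only the second one replans from whatever the human endorses *right now*, so a change of mind immediately changes its behavior.

```python
def make_plan(goal, horizon):
    """Stub planner: a sequence of actions aimed at `goal`."""
    return [f"step {t} toward {goal}" for t in range(horizon)]

def goal_locked_agent(goals_over_time, horizon):
    # Commits to the goal endorsed at t=0 and never re-checks it,
    # even if the human later objects -- the behavior ruled out above.
    return make_plan(goals_over_time[0], horizon)

def deferential_agent(goals_over_time, horizon):
    # Still plans far ahead, but replans each step from the human's
    # *current* endorsement and only executes the next endorsed step.
    actions = []
    for t in range(horizon):
        plan = make_plan(goals_over_time[t], horizon - t)
        actions.append(plan[0])
    return actions

# The human endorses goal A for two steps, then changes their mind to B.
goals = ["A", "A", "B", "B"]
print(goal_locked_agent(goals, 4))  # keeps pursuing A regardless
print(deferential_agent(goals, 4))  # switches to B as soon as the human does
```

The point of the sketch: "short-term" constrains *whose current judgment is authoritative* (the human's, at each moment), not *how far ahead the agent may plan*.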