Coherence and goal-directed agency discussion
One of the "big discussions" that has been taking place in AI safety in 2018–2020 is about coherence arguments and goal-directed agency. Some of the topics in this discussion are:
- What point was Eliezer trying to make when bringing up coherence arguments?
- Will the first AGI systems we build be goal-directed?
- Are utility maximizers goal-directed?
- What are the differences and implication structure among intelligence, goal-directedness, expected utility maximizer, coherent, "EU maximizer + X" (for various values of X like "can convert extra resources into additional utility", "has a typical reward function", "has a simple reward function", "wasn't specifically optimized away from goal-directedness")?
https://www.greaterwrong.com/posts/9zpT9dikrrebdq3Jf/will-humans-build-goal-directed-agents
https://www.greaterwrong.com/posts/tHxXdAn8Yuiy9y2pZ/ai-safety-without-goal-directed-behavior
https://www.greaterwrong.com/posts/TE5nJ882s5dCMkBB8/conclusion-to-the-sequence-on-value-learning