Difference between revisions of "Iterated amplification"

From Issawiki
 
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
'''Iterated amplification''' (also called '''iterated distillation and amplification''', and abbreviated '''IDA''') is the technical alignment agenda that [[Paul Christiano]] works on.
 
 +
 +
Terminology (not all of it specific to IDA; these are terms that Paul frequently uses):
 +
 +
* [[informed oversight]]
 +
* [[adequate oversight]]
 +
* [[overseer]]
 +
* [[bandwidth of the overseer]], [[high bandwidth oversight]], [[low bandwidth oversight]]
 +
* [[reward engineering]]
 +
* [[HCH]], [[Strong HCH]], [[Weak HCH]], [[Humans consulting HCH]]
 +
* [[amplification]]
 +
* [[capability amplification]]
 +
* [[distillation]]
 +
* [[factored cognition]], [[factored evaluation]], [[factored generation]]
 +
* [[corrigibility]]
 +
* [[benign]]
 +
* [[aligned]]
 +
* [[robustness]]
 +
* [[red teaming]]
 +
* [[ALBA]]
 +
* [[optimization daemon]]s
 +
* [[act-based agent]] vs [[goal-directed agent]]
 +
* [[approval-directed agent]]
 +
* [[steering problem]]
 +
* [[prosaic AI]]
 +
* [[bootstrapping]]
 +
* [[catastrophe]]
 +
* [[reliability amplification]]
 +
* [[security amplification]]
 +
* [[universality]]
 +
* [[narrow value learning]] vs [[ambitious value learning]]
 +
* [[learning with catastrophes]], [[optimizing worst-case performance]]
  
 
==See also==
 
Line 6: Line 37:
  
 
[[Category:Iterated amplification]]
 
 +
[[Category:AI safety]]

Latest revision as of 03:58, 26 April 2020