Search results

Create the page "Overseer" on this wiki! See also the search results found.

My understanding of how IDA works
...is trying to train. So we won't have something like the agent pushing the overseer over so that it can press the "I approve" button.

9 KB (1,597 words) - 00:27, 6 October 2020
Iterated amplification
* [[overseer]]
* [[bandwidth of the overseer]], [[high bandwidth oversight]], [[low bandwidth oversight]]

1 KB (125 words) - 03:58, 26 April 2020
MIRI vs Paul research agenda hypotheses
...is smarter than the agent you are trying to train, you can safely use that overseer’s judgment as an objective." ...We can train an RL system using very sparse feedback, so it’s OK if that overseer is very computationally expensive."

4 KB (702 words) - 03:57, 26 April 2020
