Competence gap

From Issawiki
Revision as of 05:13, 27 May 2020 by Issa (talk | contribs)

Competence gap is the gap between an AI system's ability to design better (not necessarily aligned) AI systems and its ability to solve alignment problems (i.e. to design better aligned AI systems).

History

The term seems to have first been used online by Daniel Dewey, who credits Nick Bostrom for the term. [1]

It's not clear when the concept (under different terms, or without a term being introduced) was first discussed.

Notes

To what extent does Paul's approach look like humans trying to align arbitrarily large black boxes ("corralling hostile superintelligences"), versus humans plus pretty smart aligned AIs trying to align slightly larger black boxes? (This is somewhat analogous to Rapid capability gain vs AGI progress, where again Eliezer imagines some big leap, going from just humans to suddenly superhuman AI, whereas Paul imagines a smoother transition, which powers his optimism.) In other words, how much easier is it to align large black boxes if we have pretty smart aligned AIs to help us? [2] [3]

In a situation where AI algorithms are creating other AI algorithms (this includes recursive self-improvement, but is also more general/relaxed), to what extent will the AI be helping with alignment (rather than just pushing forward capabilities)? How big will the "competence gap" be? [4] [5] If there is a big competence gap, this leads to the situation Nate described [6]: "your team runs into an alignment roadblock and can easily tell that the system is currently misaligned, but can’t figure out how to achieve alignment in any reasonable amount of time." For example, Paul's approach gets as far as aligned IQ 80 AIs, but when it tries to train IQ 81 AIs, alignment problems appear; the IQ 80 AIs can't really help us align the IQ 81 AIs, and humans can't solve the problem in a reasonable amount of time either.