
Stuart Russell co-authored "Artificial Intelligence: A Modern Approach" (with Peter Norvig), the standard AI textbook used in universities worldwide, which gives his warnings about AI risk unusual authority. His 2019 book "Human Compatible" argues that the standard model of AI, in which a machine optimizes a fixed objective, is fundamentally flawed and dangerous: a sufficiently capable optimizer will pursue a misspecified objective to harmful extremes. Russell proposes an alternative: AI systems that are uncertain about human preferences and actively seek to learn them, deferring to people rather than overriding them. He calls this approach "provably beneficial AI," and it represents one of the most rigorous attempts to solve the alignment problem. Russell has been particularly effective at communicating AI safety concerns to policymakers and the general public, helping move these ideas from the fringe to the mainstream.
“The problem of controlling superhuman AI is not one we can afford to get wrong.”
paraphrased · 2019