
Stuart Russell co-authored "Artificial Intelligence: A Modern Approach" (with Peter Norvig), the standard AI textbook used in universities worldwide, which gives his warnings about AI risk unusual authority. His 2019 book "Human Compatible" argues that the standard model of AI, in which a machine optimizes a fixed objective, is fundamentally flawed and dangerous: a sufficiently capable optimizer will pursue a misspecified objective to harmful extremes. Russell proposes an alternative: AI systems that are uncertain about human preferences and actively seek to learn them, deferring to people rather than overriding them. He calls this approach "provably beneficial AI," and it represents one of the most rigorous attempts to solve the alignment problem. Russell has been particularly effective at communicating AI safety concerns to policymakers and the general public, helping move these ideas from the fringe to the mainstream.
“The problem of controlling superhuman AI is not one we can afford to get wrong.”
paraphrased · 2019