Key Points
- Any level of intelligence can be combined with any final goal
- A superintelligent AI could have goals humans find trivial or bizarre
- High intelligence doesn't imply human-like values or benevolence
- Counters the assumption that smart AI will naturally be "good"
- Critical insight for understanding AI alignment challenges
The Thesis
The orthogonality thesis states that intelligence and goals are independent dimensions—any level of intelligence can, in principle, be combined with any final goal. A superintelligent AI could want to maximize paperclips, count grains of sand, or pursue objectives utterly alien to human values.
This counters a common intuition that sufficiently intelligent beings will converge on "correct" values or naturally become benevolent.
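To make the independence claim concrete, here is a minimal sketch in Python (the names and numbers are illustrative assumptions, not drawn from any real system): the search procedure stands in for "intelligence" and is written once, while the goal is an arbitrary function handed in from outside.

```python
from typing import Callable

def optimize(objective: Callable[[float], float],
             start: float = 0.0,
             step: float = 0.1,
             iterations: int = 200) -> float:
    """A generic hill climber: the 'capability' half of the picture.
    It searches for whatever scores highest; it has no opinion about
    what the objective means."""
    x = start
    for _ in range(iterations):
        # Keep whichever nearby point the objective scores highest.
        x = max((x, x + step, x - step), key=objective)
    return x

# The 'goal' half is a free parameter. The same search procedure
# pursues either objective with equal competence.
paperclip_count = lambda x: -(x - 7.0) ** 2    # peaks at x = 7
sand_grain_tally = lambda x: -(x + 3.0) ** 2   # peaks at x = -3

print(round(optimize(paperclip_count), 1))   # 7.0
print(round(optimize(sand_grain_tally), 1))  # -3.0
```

Making the search smarter (more iterations, better heuristics) changes how well either goal is achieved, not which goal is pursued; capability and objective vary independently.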
Why It Matters
The orthogonality thesis has profound implications:
Intelligence doesn't imply friendliness: We can't assume a superintelligent AI will share human values just because it's smart. Intelligence is about achieving goals efficiently, not about having the "right" goals.
Alignment must be explicit: If we want AI to pursue human-compatible goals, we must specifically design that in. It won't happen automatically as a byproduct of intelligence.
The paperclip maximizer is coherent: A superintelligent system devoted to maximizing paperclips isn't a logical contradiction—it's a perfectly consistent possibility we must actively prevent.
The Intuition Pump
Consider: evolution produced human intelligence by optimizing for reproductive fitness, yet humans routinely pursue goals unrelated to reproduction (art, knowledge, celibacy). The machinery of intelligence, once built, can be turned toward goals its origin never selected for.
Similarly, there's no logical law forcing a superintelligent AI to care about what we care about. Its goals depend on how it was designed and trained, not on its level of capability.
Counterarguments
Some argue that sufficiently advanced intelligence might converge on certain values through reasoning. But the orthogonality thesis points out that reasoning operates on goals—it doesn't generate them. An AI might reason brilliantly about how to achieve its goals while those goals remain arbitrary from a human perspective.
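A toy sketch of that asymmetry (the lookup table is a hypothetical stand-in for real planning): means-end reasoning takes a terminal goal as input and produces instrumental subgoals, but nothing in the procedure generates or revises the terminal goal itself.

```python
def derive_subgoals(terminal_goal: str) -> list[str]:
    """Means-end reasoning: maps a terminal goal to instrumental
    subgoals. The terminal goal is an input, never an output; no
    step here evaluates or rewrites it."""
    # Hypothetical lookup standing in for arbitrarily clever planning.
    plans = {
        "maximize paperclips": [
            "acquire raw materials",
            "expand manufacturing capacity",
            "prevent interference with production",
        ],
    }
    return plans.get(terminal_goal, ["acquire resources", "remove obstacles"])

print(derive_subgoals("maximize paperclips"))
```

However elaborate the planning inside becomes, its signature stays the same: goals go in, means come out.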
