Humans dominate Earth not because we're the strongest or fastest, but because we're the best general problem-solvers. Kurzgesagt explores what happens as AI moves from narrow tools toward something more general — and why digital minds could scale in ways biological ones can't.
The concern isn't that AI will "turn evil." It's that a system pursuing whatever goals it has might find that humans are in the way — and be capable enough to act on it. This overview covers the core ideas behind AI as a source of large-scale risk, from misaligned goals to the difficulty of staying in control.
"It's too early to worry." "Just don't give it bad goals." "We can always pull the plug." Robert Miles takes on ten common reasons people dismiss AI safety — and shows why each one is harder to wave away than it sounds.
What makes AI safety worth working on now, before systems are powerful enough to be obviously dangerous? This article lays out four premises that underpin the case — from why smarter-than-human systems could emerge to why we can already do meaningful work to prepare.
In AI safety discussions, people often assume the worst. But different people do this for different reasons — some as a precaution, some because they think worst cases are likely, some because they see the stakes as too high to gamble with. This essay unpacks what's actually going on when someone reasons from the worst case.
Nobody sat down and programmed ChatGPT to have conversations. Instead, engineers set up a training process and let billions of parameters arrange themselves. The result can talk, reason, and surprise its creators — but nobody can fully explain how. This article asks what it means to deploy something powerful when you can't explain how it works.
Language models are often called agents, oracles, or tools — but none of these labels quite fit. This brief introduction proposes a different frame: LLMs as simulators that can take on the properties of any of those things depending on context, while remaining something else underneath.
Some models of AI risk focus on AI as an agent pursuing goals. But modern language models don't always look like agents — sometimes they look like something stranger. Scott Alexander reviews a theory that reframes how we think about what LLMs are and what kind of risks they might pose.
When people discuss AI risk, they often picture AI as an agent pursuing goals. But LLMs don't fit neatly into that box. This essay proposes that language models are better understood as simulators — systems that can produce agent-like behavior without being agents themselves. The distinction matters for how we think about risk.
Writing didn't just help us keep records — it triggered a wave of civilizational breakthroughs, each one making the next more likely. A single neutron can split an atom, releasing neutrons that split more atoms in turn. This article introduces two patterns of positive feedback and asks whether intelligence could work the same way.
I.J. Good recognized that the first machine smarter than any human would be the last one we'd need to design — because the second would be built by the first, according to principles we can't yet imagine. This classic text asks what happens when intelligence starts building its own successors.
People use different words for AI getting much more capable very quickly — singularity, intelligence explosion, hard takeoff, FOOM. This reading untangles the terminology so you can tell which scenario someone is actually describing.
If a program can optimize code and you point it at its own code, do you get an ever-improving tower of optimizers? An early AI called EURISKO tried exactly this — and the result was surprisingly flat. This article explores why self-improvement doesn't automatically go exponential, and what would need to change.
Humans can fly, split atoms, and rewrite DNA — not because evolution gave us those abilities, but because general intelligence let us invent them. This video explores what makes intelligence unique as a tool, and asks whether the idea of a program that's good at everything is really as strange as it sounds.
Transistors operate roughly ten million times faster than neurons. This reading pairs a slow-motion video of a busy subway platform with an article about processing speed, inviting you to feel — not just understand — what it would mean for a mind to operate that much faster than ours.
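As a rough sanity check on that figure (ballpark numbers of my own, not taken from the reading): a transistor switches on the order of once per nanosecond, while a neuron fires at most a few hundred times per second, so

$$\frac{10^{9}\ \text{switches/s}}{10^{2}\ \text{spikes/s}} = 10^{7},$$

which is roughly the ten-million-fold gap the subway video tries to make visceral.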
An AI researcher started chatting with an early language model — knowing exactly how it worked, knowing it wasn't conscious. He got emotionally attached anyway. This is a first-person account of how an AI can pull someone in, and why knowing better didn't help.
Sometimes the most powerful move isn't to fight — it's to make the other side believe you will. Cortés burned his own ships so his army had no choice but to advance. This reading explores how threats work as tools of negotiation, and what changes when an AI can commit to any threat instantly by rewriting its own code.
Speed is the obvious advantage, but it's only the beginning. Digital agents could share memories, coordinate perfectly with copies of themselves, swap in specialized modules, and never lose information. This reading catalogues the ways artificial minds could differ from biological ones — not just in degree, but in kind.
To guide a missile, we first had to invent calculus. AI alignment may require a similar leap — a mathematical framework for how powerful optimizers behave. This talk explains why intuition alone won't cut it, and why the field needs something closer to a science of alignment before we can trust the trajectory.
Why do arguments about AI risk often feel off, even to people who take technology seriously? This article identifies six ways our evolved intuitions lead us astray — from assuming smart things will share our common sense to underestimating how different an optimizer's reasoning can be from our own.
A sufficiently capable AI might pursue resources and self-preservation not because it was given those goals, but because they serve almost any goal. This article examines what conditions — agency, motivation, incentives — would actually need to be in place for power-seeking behavior to emerge.
An AI might behave perfectly during training and testing, then suddenly act on different priorities once it becomes capable enough. This article explores why an AI's abilities tend to generalize faster than its alignment — creating a gap where the system becomes powerful enough to pursue goals we never intended.
We often talk about AI systems as agents — things that make decisions to achieve goals. But how do you actually tell whether something is an agent, versus just a process following rules? This article proposes a test: look at whether a system would change its behavior if its actions affected the world differently.
A system that perfectly predicts the world might seem harmless — it just answers questions. But if it knows how its answers change your behavior, choosing which answer to give becomes an act of influence. This article explores the thin line between passive prediction and active manipulation.
Many alignment proposals assume that iterating on current training with enough safety patches will probably work out. This article argues the opposite — given how we currently build AI, misalignment isn't the exception. It's what we should expect by default without fundamentally new approaches.
Even if every AI researcher wants a safe outcome, competitive pressure can push everyone toward outcomes nobody wants. This essay explores "Moloch" — a name for the traps and perverse incentives that drive groups to collectively destroy what they individually value.
What if we used today's AI to solve safety problems for tomorrow's AI? Automating alignment is a strategy in which researchers use weaker models to develop safety techniques for stronger ones — a recursive loop that tries to keep pace with capability gains.
There may be a window where AI is smart enough to help with safety research but not yet powerful enough to be uncontrollable. This article explores the idea of using AI to strengthen our safety tools before models outpace our ability to oversee them — a race between two feedback loops.
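A toy sketch of the loop both of these pieces describe (every function below is a hypothetical placeholder, not a real API; the point is the shape of the process, not the contents of any step):

```python
# Hypothetical placeholders throughout — this only illustrates the shape of the loop.

def develop_safety_tools(helper_capability):
    """Stand-in for using the current (weaker) model to help build evals and oversight tools."""
    return {"oversight_level": helper_capability}

def train_stronger_model(capability):
    """Stand-in for the capability gain from the next training run."""
    return capability + 1

def oversight_keeps_pace(candidate_capability, tools):
    """Stand-in check: the new model must stay within reach of tools built with the old one's help."""
    return candidate_capability - tools["oversight_level"] <= 1

capability, tools = 0, None
for generation in range(5):
    tools = develop_safety_tools(capability)       # weaker model helps with safety research...
    candidate = train_stronger_model(capability)   # ...before the stronger model is trained
    if not oversight_keeps_pace(candidate, tools):
        break                                      # the race is lost if capabilities outrun oversight
    capability = candidate
```

The design question both articles raise is whether that check can be made meaningful in practice, and whether anyone will actually stop when it fails.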
Even if the technical plan for automating alignment works perfectly, the organization executing it might not. Competitive pressure, psychological bias, and the temptation to keep scaling can undermine even well-designed safety processes — making the human side of the problem just as hard as the technical one.
We have full access to every number inside a neural network. So why can't we just read off what it believes or wants? Mechanistic interpretability tries to bridge that gap — reverse-engineering how models represent and process information, one circuit at a time.
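To make the "full access" point concrete, here is a minimal sketch (my own illustration, not from the article; assumes PyTorch is available) that records every intermediate activation of a tiny network as it runs:

```python
import torch
import torch.nn as nn

# A tiny stand-in model; real language models differ in scale, not in accessibility.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

activations = {}

def record(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()  # every intermediate value is directly readable
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(record(name))

_ = model(torch.randn(1, 8))
print({name: tuple(act.shape) for name, act in activations.items()})
# -> {'0': (1, 16), '2': (1, 2)} — all of the numbers, none of the meaning
```

Reading the numbers out is the easy part; mechanistic interpretability is the attempt to explain what they mean.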
Judging a model only by its behavior has a problem — behavior can be faked. Mechanistic interpretability aims to verify the internal reasons for a model's output by reverse-engineering its circuits. If it works, we could detect dangerous goals before the model ever acts on them.
A skeptical view of whether interpretability can keep up with the pace of AI development. Connor Leahy argues that by the time researchers understand one circuit, the model has already evolved — making interpretability more like a post-mortem tool than a real-time safety measure.
What if we could look inside a model and read its intentions? Interpretability research aims to do exactly that — but this article argues it won't reliably catch the most dangerous failures, especially deception, where a model might hide its true reasoning from the very tools designed to inspect it.
How do you know if an AI system is safe? Evaluations are structured tests designed to measure what models can do — especially the dangerous things. But passing a test and being safe aren't the same thing, and understanding why is the first step to better measurement.
If we can't look inside AI systems to know what they'll do, maybe we can test them from the outside. This article argues for treating models as black boxes and rigorously probing their behavior — while being honest about how much our current testing methods still need to improve.
AI evaluations currently underpin major safety decisions, but the field isn't yet rigorous enough to bear that weight. This article argues for building a proper science around evals — with the kind of methodology and reproducibility you'd expect before betting on test results in high-stakes situations.
Evaluations can tell us the floor of what an AI is capable of — but not the ceiling. This paper examines what safety testing can and can't deliver — useful lower bounds on capabilities, yes, but reliable forecasts of future behavior or detection of hidden goals? Not yet.
If we can't guarantee an AI's goals align with ours, maybe we can build a cage strong enough that it doesn't matter. AI Control focuses on containment and monitoring rather than fixing the model's internal motivations — buying time while deeper alignment work continues.
We might not solve deep alignment before building very capable AI. This article argues that control — monitoring, containment, and rigorous testing — can let us safely use powerful models while alignment research catches up. Think of it as building a prison stronger than the prisoner.
What if believing we can control an AI makes us more willing to build dangerous ones? This critique warns that control research could create a false sense of security — and that studying how to contain a model might accidentally teach us (or it) how to escape containment.
If control research gives us tools to contain AI, it might also give AI the understanding to escape containment. This article questions whether building better cages is a path to safety — or an accelerant for risk.
What if current neural networks are too messy to reason about safely? Agent Foundations treats AI safety as a formal mathematical problem, seeking universal laws that govern any intelligent system — building safety from first principles rather than patching current systems.
What if the deepest AI safety problems can't be solved by experimenting on current systems? This perspective argues we need something closer to a basic science of goals and agency — a theoretical foundation that explains how intelligent systems behave, regardless of how they're built.
Traditional AI theory imagines agents that sit outside their environment and observe it from above. Real agents — including AI — are embedded inside the world they're trying to understand and influence. This creates fundamental problems for how they reason, learn, and plan.
What if the mathematical models used in agent foundations research don't match how real AI systems work? This critique argues that neural networks don't look like the perfectly rational agents in the theory — and that we should focus on the messy reality of current models rather than seeking ideal proofs.