The Bitter Lesson · Oslo Vibe Coding

A plain-language summary, so you can get the gist without leaving this page.

In this short 2019 essay, the reinforcement learning researcher Rich Sutton points to a pattern that keeps repeating across decades of AI. General methods that ride on raw computing power tend to win, and clever human-crafted shortcuts tend to lose.

What it is

The essay is a reflection on the history of artificial intelligence by someone who lived through much of it. Sutton looks back at fields like chess, the board game Go, speech recognition, and computer vision, and notices the same story playing out again and again. Researchers spend years carefully encoding human knowledge into a system, only to be overtaken by simpler approaches that just learn from data and search through possibilities, given enough computing power.

He calls it bitter because it stings. The handcrafted, knowledge-rich approaches are the ones humans feel proud of. They reflect real insight and hard work. Yet over the long run they keep getting beaten by methods that lean less on human cleverness and more on letting computation do the heavy lifting.

The core idea in plain terms

The lesson rests on one steady trend: the cost of a unit of computation keeps falling, year after year, which Sutton describes as a generalization of Moore's law. So a method that improves automatically as you give it more compute will eventually outrun a method that depends on a fixed amount of human-designed cleverness. The first kind keeps climbing as hardware grows. The second kind stalls at whatever the designers managed to put in.
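To make the crossover concrete, here is a minimal sketch with made-up numbers. Nothing in it comes from Sutton's essay: the scores, the gain of 5 points per doubling, and the two-year doubling period are all hypothetical, chosen only to show the shape of the argument.

```python
import math

# All numbers below are hypothetical, chosen only for illustration.
FIXED_SCORE = 70.0          # handcrafted system: good today, but it never improves
START_SCORE = 41.0          # general method: worse today
GAIN_PER_DOUBLING = 5.0     # assumed gain each time available compute doubles
DOUBLING_YEARS = 2.0        # assumed time for affordable compute to double


def scaling_method_score(compute: float) -> float:
    """Score of the general method, given compute relative to year 0."""
    return START_SCORE + GAIN_PER_DOUBLING * math.log2(compute)


def first_year_scaling_wins(horizon: int = 40):
    """Return the first year in which the scaling method beats the fixed one."""
    for year in range(horizon + 1):
        compute = 2 ** (year / DOUBLING_YEARS)
        if scaling_method_score(compute) > FIXED_SCORE:
            return year
    return None  # no crossover within the horizon


if __name__ == "__main__":
    print("Scaling method overtakes the fixed method in year",
          first_year_scaling_wins())
```

Under these assumptions the general method starts far behind and still overtakes the handcrafted one. Changing the numbers moves the crossover year, but as long as compute keeps growing and the fixed score stays fixed, the crossover still happens.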

Sutton points to two general methods that scale especially well: search, which means letting the machine explore many options, and learning, which means letting the machine improve from experience and data. Both get better simply by being given more computation. By contrast, hard-coding human assumptions about how the world works often feels satisfying in the short term but becomes a ceiling later, because the system can only be as good as the rules people wrote for it.
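As a rough illustration of search improving with compute, the toy sketch below compares a fixed, designer-chosen answer with a plain random search given larger and larger budgets. The objective, the range, and the function names are all hypothetical and are not taken from the essay. Random search stands in here for the much more sophisticated search methods used in real systems.

```python
import random


def hidden_score(x: float) -> float:
    """Toy objective: higher is better, with a peak at x = 3.7."""
    return -(x - 3.7) ** 2


def handcrafted_guess() -> float:
    """A fixed rule of thumb: 'the middle of the range is usually good'."""
    return 5.0


def random_search(budget: int, seed: int = 0) -> float:
    """Try `budget` random candidates in [0, 10] and keep the best one."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    candidates = [rng.uniform(0.0, 10.0) for _ in range(budget)]
    return max(candidates, key=hidden_score)


def compare(budgets=(10, 100, 1000, 10000)):
    """Return the fixed rule's score and the search score for each budget."""
    fixed = hidden_score(handcrafted_guess())
    results = {b: hidden_score(random_search(b)) for b in budgets}
    return fixed, results


if __name__ == "__main__":
    fixed, results = compare()
    print(f"handcrafted rule:          score {fixed:.4f}")
    for budget, score in results.items():
        print(f"search with {budget:>6} tries: score {score:.6f}")
```

The handcrafted guess never changes, so its score is stuck. The search gets no smarter either, but because it can spend more evaluations, its best result can only stay the same or improve as the budget grows.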

Why it matters

This essay quietly explains a lot about modern AI. Large language models and other recent systems are, in a sense, the bitter lesson in action: relatively general architectures, trained on enormous amounts of data with enormous amounts of compute, rather than tightly specified rules about grammar or meaning. Understanding this makes it easier to see why so much recent progress has come from scaling general methods rather than from hand-built rules.

For people building with AI today, it carries a humble warning. Be careful before you bake too many of your own assumptions into a system. Sometimes the durable move is to set up a general method, feed it good data and enough compute, and let it find patterns you would not have thought to write down. For our community, it is a foundational mental model: respect human insight, but do not bet against scale.

Key points

A 2019 essay by reinforcement learning researcher Rich Sutton, reflecting on 70 years of AI history.
The recurring pattern: general methods powered by computation beat handcrafted, knowledge-heavy systems over time.
It works because compute keeps getting cheaper, so methods that scale with compute keep improving while fixed human rules stall.
Search and learning are the two general methods that scale best, since both improve with more computation.
It helps explain why modern AI relies on large-scale data and compute rather than hand-written rules, and it cautions builders against over-encoding their own assumptions.

Original source: "The Bitter Lesson" by Rich Sutton (2019).

