What Is AI?

Module 1 · Lesson 2

A Brief History of AI

From Turing to ChatGPT.

In the summer of 1956, a handful of mathematicians and scientists gathered at Dartmouth College to launch a field that didn't yet exist. Their proposal called it artificial intelligence and budgeted a single two-month study to crack its core problems.

They did not solve it.

That confidence, and that gap, is the entire history of AI in miniature.

Why this matters

If you don't know the history, you will fall for every hype cycle. Every "this time is different" pitch is a rerun of one we have seen before. Understanding the pattern — breakthrough, overclaim, disappointment, quiet progress, next breakthrough — is what lets you read the 2026 moment for what it actually is instead of what anyone is selling it as.

The pattern that repeats

Strip the dates and the history of AI reads as the same paragraph on loop.

Phase | What it looks like | How long it lasts
Breakthrough | A technique suddenly works. Benchmarks shatter. Papers get weaponized as press releases. | 6–24 months of euphoria.
Overclaim | Investors pour in. Founders promise AGI in five years. "This time is different" is said in public by serious people. | Coincides with the breakthrough.
Disappointment | The technique hits the wall of real-world messiness. Production deployments fail. Funding contracts. Think-pieces turn cynical. | 2–10 years. Sometimes called a "winter."
Quiet progress | Researchers who stayed in the field do unglamorous work on data, compute, and architecture while no one is watching. | This is when everything that matters gets built.
Next breakthrough | The quiet work compounds. A new technique lands on top of it. Phase 1 begins again. | Unpredictable. Always "obvious in retrospect."

This has happened at least three times. It will happen again.

The timeline, compressed

  • 1950 — Turing's question. Alan Turing asks not "can machines compute?" but "can machines think?" His paper in Mind proposes what we now call the Turing Test. It is less a scientific instrument than a philosophical provocation dressed as one.
  • 1956 — Dartmouth. McCarthy, Minsky, Rochester and Shannon name the field and promise they'll solve it over the summer.
  • 1970s — First AI winter. Early symbolic systems collapse outside curated environments. Funding evaporates.
  • 1980s — Expert systems boom. Rule-based programs, building on 1970s research systems like MYCIN, which recommended treatments for bacterial infections, show genuine competence in narrow domains.
  • Late 1980s — Second AI winter. Expert systems are too brittle and too expensive to maintain. Funding evaporates again.
  • 2012 — AlexNet. A deep neural network wins ImageNet by a margin that doesn't look like a win, it looks like a different contest. Deep learning stops being a curiosity.
  • 2016 — AlphaGo. DeepMind's system beats Lee Sedol at Go, a game considered computationally uncrackable. A machine becomes a teacher.
  • 2017 — "Attention Is All You Need." The transformer paper is published. The title was slightly tongue-in-cheek. The field took it literally.
  • 2020 — GPT-3. Scale stops being an engineering parameter and becomes a philosophical one. "Emergent behavior" becomes the term for capabilities that appear without being designed.
  • November 2022 — ChatGPT. 100 million users in two months. The interface, not the model, was the revolution.
  • 2024–2026 — Reasoning models and the efficiency era. System 2 models deliberate before answering. Small specialist models beat large generalists on narrow tasks.

The two winters nobody warns you about

They were not failures

Both AI winters were preceded by researchers confidently predicting general intelligence within a decade. Current researchers are making similar predictions. The winters were not caused by the technology being wrong — they were caused by the gap between what the technology could do and what was promised. That gap is where funding dies. Keep that gap small.

The winters were not wasted time. They were when the unglamorous work happened: better data pipelines, faster hardware, mathematical tools that would later become load-bearing. Every spring is built on a winter's quiet labor.

The transformer moment

Everything about the current AI boom — GPT, Claude, Gemini, Llama, every model you have heard of — sits on top of a single 2017 paper: Attention Is All You Need, by Vaswani, Shazeer, Parmar and colleagues at Google.

Before and after transformers

Pre-transformer (RNNs, LSTMs) | Transformer-based
Sequential processing: each word waits for the previous one. | Parallel processing: every word attends to every other word at once.
Vanishing gradients. Long-range context degrades badly. | Attention lets the model reach across arbitrary distances in the input.
Translation was the best-case outcome. | Translation, reasoning, code, images, audio, and protein folding all collapsed into the same architecture.

A single architectural paper reshaped an entire field in five years. That is not normal in computer science. It happens once a decade, maybe.
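
To make "every word attends to every other word at once" concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under loud assumptions, not the paper's full architecture: the two-dimensional word vectors are invented, the function names are hypothetical, and the learned query, key, and value projections that a real transformer applies are skipped, so each vector plays all three roles.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention over equal-length vectors.

    Assumption: queries, keys, and values are the input vectors
    themselves; a real transformer applies learned projections first.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, all_weights = [], []
    # Each pass through this loop is independent of the others, which is
    # why real implementations compute all positions in parallel.
    for query in vectors:
        # Score this position against every position, near or far.
        scores = [dot(query, key) / scale for key in vectors]
        weights = softmax(scores)
        # The output is a weighted average of all the value vectors.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Toy vectors, invented for illustration only.
tokens = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word draws on every word in the input. The weights depend on how similar the vectors are, not on how far apart the words sit, which is the contrast with the sequential models in the comparison above.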

The ChatGPT discontinuity

GPT-3 had been available to developers through an API since 2020, more than two years before ChatGPT launched in November 2022. The underlying model family was not new. The interface was. A free conversational wrapper turned a developer API into a consumer phenomenon in about 60 days.

The real lesson of ChatGPT

The capability had been available to developers long before. Almost nobody outside that community cared. The interface was the invention. That pattern is as old as the graphical user interface, and forgetting it means you will miss the next one, because you will be watching the models instead of the surfaces users actually touch.

Try it yourself

Calibrate against the cycle

Think back to the first time you used a conversational AI tool. Write down what you expected it to be like before you tried it, and what it actually felt like.

What most people describe

Two priors show up consistently: either Hollywood's all-knowing sinister intelligence, or a clunky chatbot that mishears your customer-service request. The actual experience lands somewhere neither prior predicted — genuinely useful in some respects, confidently wrong in others, strange in ways that don't map onto either the utopian or dystopian script.

That strangeness is the signal. It means the technology is doing something that doesn't fit the existing narrative — and narratives are what hype cycles run on. If the thing in front of you doesn't match the pitch, the pitch is the problem, not the technology.

Where we are right now

We are in a spring. The question nobody can honestly answer is whether the current wave will hold, collapse into another winter, or finally break the pattern because the combination of transformers, scale, and inference-time compute for reasoning models is genuinely different from what came before.

The bet worth making

Invest your time in what outlasts the hype — the skills of prompting precisely, choosing models by problem shape, and reasoning clearly about cost and reliability. Those skills compound through winters. The specific model you use this month probably won't.

Key takeaway

AI progress runs on a loop: breakthrough, overclaim, disappointment, quiet progress, next breakthrough. The winters were not failures — they were where the next spring was built. You are in a spring right now. Act accordingly: use the tools, but invest your learning in the fundamentals that survive a winter.

Check your understanding

Quick Check 1

According to the lesson, what was the core problem that caused the AI winters?

Quick Check 2

What does the lesson identify as the most accurate way to understand AI progress over time?

Key Takeaway

AI progress does not look like a steady climb. It looks like long plateaus interrupted by sudden breakthroughs that seem obvious in retrospect and were invisible the day before — and we are in a spring right now.

Try It

The Model Atlas timeline on this site lets you plot any current tool against its architectural ancestors, so the next breakthrough lands on a map you can actually read.