Defining Artificial Intelligence

In 1997, Deep Blue beat Garry Kasparov at chess and the world called it artificial intelligence. Then the engineers explained how it worked: brute-force search across two hundred million positions per second, a hand-tuned evaluation function, no learning, no adaptation. Kasparov lost to a very fast calculator. The public quietly updated its definition of AI and moved on.

This keeps happening. Every time a machine does a thing we used to think required a mind, we inspect the mechanism, decide it doesn't count, and move the goalposts. That is not a bug in how we talk about AI. It is the defining feature of how we talk about AI.

Why this matters

If you cannot say what AI is, you cannot evaluate claims about it, pick the right tool for a job, or tell when someone is selling you vaporware. The goal of this lesson is not to give you a crisp definition — the field doesn't have one. The goal is to give you a usable map so you can stop asking "is this really AI?" and start asking the question that actually matters: what kind of system is this, and what is it good for?

The moving-goalposts problem

The formal birth of the field is usually dated to 1956, when John McCarthy, Marvin Minsky and a handful of others gathered at Dartmouth College for a summer workshop. McCarthy had coined the term artificial intelligence in the 1955 proposal for that workshop. Their founding hypothesis: every aspect of learning or any other feature of intelligence can, in principle, be described so precisely that a machine can be made to simulate it.

That is not a definition. That is a bet. Seventy years of AI research is largely the story of discovering which parts of that bet were right, which were wrong, and which were right for reasons nobody predicted.

Decade	The thing we called "AI" at the time	What happened next
1960s	Symbolic logic, theorem provers	Worked in toy domains, collapsed in the real world.
1980s	Expert systems, hand-coded rules	Worked narrowly, were brittle and unmaintainable.
1990s	Chess engines (Deep Blue)	Beat humans. Public decided chess wasn't intelligence.
2010s	Deep learning (AlexNet, AlphaGo)	Crushed benchmarks. Public decided pattern recognition isn't intelligence.
2020s	Transformer-based LLMs (GPT, Claude)	Pass the bar exam, write code, pass medical licensing exams. Public is currently deciding whether this counts.

The pattern is consistent enough to name: intelligence becomes automation the moment we figure out how it works.

The uncomfortable reframe

Every time we automate a cognitive task — chess, translation, image recognition, essay writing — we conclude that task wasn't really intelligence after all. If you keep doing that for seventy years, you end up with a definition of "intelligence" that only includes things machines can't yet do. That is a strange thing to build a category around.

Three useful lenses

Forget the binary question "is this AI?" It won't help you pick a tool or judge a claim. Use these three lenses instead.

Lens 1: how it learned

Learning styles

Paradigm	What you feed it	What it learns
Rule-based (symbolic AI)	Hand-coded if/then rules	Whatever you wrote down. Nothing more.
Supervised learning	Thousands of labelled examples	To generalize from labels to new inputs.
Unsupervised learning	Unlabelled data	Latent structure — clusters, embeddings, similarity.
Reinforcement learning	A reward signal and an environment	Behaviors that maximize cumulative reward over time.
Self-supervised (modern LLMs)	Trillions of tokens, no explicit labels	To predict the next token — and, accidentally, a great deal else.
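
The gap between the first two rows of the table can be made concrete with a minimal sketch. The fraud-flagging scenario, the 1000.0 cut-off, the function names, and the six labelled transactions are all made up for illustration; the "learner" is a one-feature threshold search, chosen because it is about the simplest thing that still counts as supervised learning.

```python
from typing import List, Tuple

# Rule-based: the system knows exactly what a person wrote down, nothing more.
def rule_based_flag(amount: float) -> bool:
    return amount > 1000.0  # hypothetical hand-coded rule


# Supervised: the cut-off is chosen from labelled examples instead.
def fit_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Return the cut-off that misclassifies the fewest labelled examples.

    Assumes at least two examples with distinct amounts.
    """
    amounts = sorted(amount for amount, _ in examples)
    # Candidate cut-offs: midpoints between neighbouring amounts.
    candidates = [(lo + hi) / 2 for lo, hi in zip(amounts, amounts[1:])]

    def errors(threshold: float) -> int:
        # Count examples where "amount above threshold" disagrees with the label.
        return sum((amount > threshold) != label for amount, label in examples)

    return min(candidates, key=errors)


# Made-up labelled data: (transaction amount, was it fraud?)
labelled = [
    (20.0, False), (75.0, False), (310.0, False),
    (480.0, True), (900.0, True), (2500.0, True),
]

learned = fit_threshold(labelled)


def learned_flag(amount: float) -> bool:
    return amount > learned
```

On this toy data a 600.0 transaction slips past the hand-written rule but is flagged by the learned one, because the labels place the boundary lower than the rule's author guessed. Change the labels and the learned boundary moves with them; the rule stays where it was written.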

Lens 2: what shape of problem it was built for

Classification is mature. Generation is recent and improving. Reasoning is genuinely emerging. A fraud-detection system and a language model are both "AI," but they solve very different shapes of problem and should not be compared as if they were the same thing.

These maturity labels are rough. The shape is what matters. If someone sells you a reasoning product as though it had the reliability of a long-mature classifier, you should notice.

Lens 3: System 1 vs. System 2

Borrowed from Kahneman's cognitive psychology, the distinction has become useful for understanding what kind of compute the model is doing:

System 1 models are fast, intuitive next-token predictors. Claude Haiku, GPT-4o-mini, Gemini Flash. They pattern-match. They answer in hundreds of milliseconds.
System 2 models spend more compute at inference time, essentially thinking before answering. OpenAI's o-series, Anthropic's extended-thinking Claudes, Google's Gemini "Thinking" variants. Multi-step problems are their home ground.

This is not a new architecture. It is a deliberate choice to spend more compute at the moment you ask the question, not only during training.

The practical shortcut

If the task is "classify this fast," use a small System 1 model. If the task is "work through a multi-step problem where step three depends on step two," pay for System 2. Mixing them up is one of the most common mistakes in AI-product design.
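
As a rough sketch, the shortcut amounts to a one-line routing decision. Everything named here is hypothetical: the Task fields, the tier labels (which are placeholders, not real model identifiers), and the choice to count dependent steps as the signal. A real system would need a better way to estimate how much reasoning a task requires.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    dependent_steps: int  # steps that rely on the result of an earlier step


# Placeholder tier labels, not real model names.
FAST_TIER = "small-system-1-model"
DELIBERATE_TIER = "system-2-reasoning-model"


def choose_tier(task: Task) -> str:
    """Route to the slower tier only when later steps depend on earlier ones."""
    return DELIBERATE_TIER if task.dependent_steps > 0 else FAST_TIER


ticket = Task("label a support ticket as billing or technical", dependent_steps=0)
plan = Task("reconcile three ledgers, then explain the discrepancy", dependent_steps=2)

print(choose_tier(ticket))
print(choose_tier(plan))
```

The point of writing it down is that the decision depends on the task's structure, not on which model is newest or most expensive.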

The efficiency era

The early-2020s narrative was "bigger is better." The real 2026 story is more interesting: specificity can beat size. A 7-billion-parameter model fine-tuned on medical literature can outperform a 70-billion-parameter generalist on clinical reasoning benchmarks. A domain-tuned legal model can embarrass a frontier generalist on contract review.

Intelligence, as measured by benchmark performance, is not only a function of scale. That is a more radical finding than the headlines make it sound.

Try it yourself

Two-minute calibration

Pick two AI tools you used this month. For each, write down:

What shape of problem was it actually solving? (classification, generation, retrieval, reasoning, multi-modal understanding)
What learning paradigm underpins it? (rule-based, supervised, unsupervised, self-supervised, reinforcement)
Was it the right tool for the shape, or were you forcing a mismatch?

What you're looking for

Most people discover that at least one of the "AI tools" they used this month was doing a job a much simpler system could have done — a regex, a traditional ML classifier, a keyword search. The reverse also happens: you used a lightweight tool for a task that genuinely needed multi-step reasoning and got garbage.

The label "AI" told you nothing about fit. The shape of the problem told you everything. Once you see this distinction, you cannot un-see it — which is the entire point of this lesson.

What it costs you not to understand this

The real failure mode

The risk is not that you'll call the wrong thing "AI." The risk is that you'll deploy a System 1 pattern-matcher on a problem that needs System 2 reasoning, or a large generalist on a domain where a small specialist would outperform at a tenth the cost. Mislabeling is a trivia problem. Mismatching is a budget and a correctness problem.

The honest answer to "what is AI?"

The most honest answer may be: a collection of techniques for getting machines to do things we previously thought required human cognition, paired with a perpetual renegotiation of what "cognition" means. That is not a satisfying definition. It is, however, a true one — and the three lenses above (learning paradigm, problem shape, System 1 vs. System 2) will serve you better in practice than any tidier version.

Key takeaway

AI is not a stable category. Stop asking "is this really intelligent?" — the question moves every time the answer becomes yes. Ask instead: what learning paradigm, what problem shape, what inference-time budget? Those three questions will tell you what you are looking at and whether it is the right tool for your job.