
Proof-Assistants: How Lean, Coq and Isabelle are changing mathematical proofs


Software has now joined in, and it is spotting errors that no human reviewer had caught.

For a long time, a quiet rule applied in mathematics: a proof was only as solid as the best minds in the field believed it to be. That assumption is starting to falter. More and more leading researchers are asking programs such as Lean, Coq and Isabelle to verify their theorems line by line. The lone genius gives way to networked teams, and personal trust is replaced by logic that can be checked in code.

From lone genius to networked project

For centuries, mathematical research tended to follow a familiar pattern: an individual or small group works out an idea, writes up a proof, submits it to a journal, and then peers read it over for months. If you are lucky, nobody finds a gap. If you are unlucky, someone discovers, perhaps years later, a flaw that overturns everything.

Peter Scholze, one of Germany’s best-known mathematicians and a Fields Medal winner, felt that uncertainty first-hand. In 2019 he presented, together with Dustin Clausen, an extremely intricate proof about so-called “liquid vector spaces”, formulated in their new and highly abstract framework of condensed mathematics. Only a tiny number of people worldwide could even follow the argument. Scholze himself was not entirely confident that a minute mistake had not slipped in somewhere.

Rather than requesting yet more referee reports, he opted for a radically different approach: in December 2020 he publicly launched the “Liquid Tensor Experiment”. The plan was straightforward: anyone fluent in the proof software Lean was invited to help formalise the central theorem of the proof in that language. No more loosely phrased prose, only strictly structured code a machine can understand and verify.

"A theorem is only accepted in this new setting once not only people, but also a strict algorithm, signs off every single line."

After roughly six months, an international team reported success: around 180,000 lines of Lean code covered the whole argument, without a single logical gap. For Scholze, this amounted to a level of assurance no traditional peer review could match. For the wider community, it became a turning point: a craft that had existed for millennia suddenly looked like a collaborative, computer-supported enterprise.

Software makes supposedly “uncheckable” proofs checkable

Scholze’s case was not an isolated one. Another high-profile example is the Ukrainian mathematician Maryna Viazovska, who in 2016 cracked the long-standing problem of the densest sphere packing in eight dimensions, an extremely abstract question that had resisted every earlier attack. That breakthrough earned her the Fields Medal in 2022.

Her proof was brilliant, yet so compressed that a complete manual verification would have taken years. A group of researchers therefore chose to translate the work into Lean. For months they broke each section into ever smaller logical moves until the entire proof existed as a program. In 2024 the full code was published openly on GitHub, and the proof was thus secured in a formal, machine-readable form as well.

This is where the real disruptive potential sits: arguments once dismissed as “too long”, “too technical” or “practically impossible to check” can suddenly be divided into manageable sub-projects.

  • Very large theorems can be split into many small building blocks.
  • Teams spread across multiple continents can work in parallel on different parts.
  • In the end, the machine links all the pieces and verifies the overall logic.
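This modular workflow can be sketched in Lean itself; the names and toy statements below are invented purely for illustration:

```lean
-- Two small "building blocks", each of which a different
-- contributor could prove and check independently.
theorem step1 (n : Nat) : n + 0 = n := Nat.add_zero n

theorem step2 (n : Nat) : 0 + n = n := Nat.zero_add n

-- The final theorem only assembles the verified pieces;
-- Lean's kernel re-checks the entire chain when this file compiles.
theorem combined (n : Nat) : n + 0 = 0 + n := by
  rw [step1, step2]
```

In a real project the building blocks number in the thousands, but the principle is the same: each lemma stands alone, and the machine guarantees that the assembly is sound.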

A key component is Mathlib, Lean’s large standard library. It now contains more than a million lines of formalised definitions and proved results. New proofs can build on this expanding base rather than restating everything from scratch. That speeds projects up dramatically and lowers the barrier to entry.
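What “building on Mathlib” looks like in practice can be hinted at in a few lines (a sketch, assuming a Lean 4 project with Mathlib as a dependency):

```lean
import Mathlib

-- Small facts can be discharged by library decision procedures...
example : Nat.Prime 101 := by norm_num

-- ...and substantial results are already proved and ready to reuse:
#check Nat.exists_infinite_primes
-- Nat.exists_infinite_primes (n : ℕ) : ∃ p, n ≤ p ∧ p.Prime
```

Instead of re-deriving elementary number theory, a new formalisation simply imports it and starts where the library ends.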

When the computer corrects Fields Medal winners

These programs do not merely rubber-stamp proofs that are already correct. They can also expose weak points that even specialists miss. In 2021, researchers formalised a result that had already been honoured: the work was respected in the field, a prize had been awarded, and reputations were on the line.

While translating the proof into code, Lean halted at an intermediate construction: a required assumption was missing, so the logical chain did not properly hold. Not a single human referee had spotted that inconsistency beforehand. The authors had to adjust their argument and express it with greater precision.

That episode captures what these tools are like in practice. A human reader can tire partway through a 100-page proof or skim over a step out of habit; the software does not. Every variable must be defined, every inference must be justified exactly. The result is fewer informal shortcuts and more resilient, demonstrable logic.

"The machine does not negotiate; it demands completeness - or it simply refuses to approve the next step."

How proof-assistants are changing everyday mathematics

For a long time, these systems were seen as toys for theoretical computer scientists. Anyone who wanted to use them needed programming skills, a great deal of patience, and a certain tolerance for frustration. That is now changing quickly.

Modern interfaces and AI-supported assistants are removing many of the obstacles. Language models propose Lean code when researchers describe part of a handwritten proof. Interactive environments show in real time whether a step is formally valid or whether additional hypotheses are still missing. PhD students can learn, step by step, how to translate their ideas into precise code.

What Lean, Coq and Isabelle actually do

All of these tools belong to the category known as proof-assistants. Their basic principle works like this:

  • Mathematical statements are converted into a strict formal language.
  • The software operates with a fixed rule set of logic and permitted inference rules.
  • Each step of a proof must be derivable under those rules.
  • If there is a leap or a gap anywhere, the proof process stops.

Rather than automatically “inventing” an entire proof, these programs guide people through the construction. They suggest partial routes, check hypotheses, or offer alternatives when a line of attack leads to a dead end. At best, this becomes a dialogue between intuition on one side and formal rigour on the other.
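A minimal illustration of that strictness: in a Lean `calc` block, every step in the chain must carry its own justification, and removing any one of them makes the checker stop at exactly that point rather than accept the proof.

```lean
-- Each equality step names its justification; delete either `rw`
-- and Lean rejects the proof instead of silently accepting it.
example (a b : Nat) (h : a = b) : a + a = b + b := by
  calc a + a = b + a := by rw [h]
    _ = b + b := by rw [h]
```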

Opportunities, risks and open questions

The upsides are obvious: greater confidence that published results truly hold, faster checking of extremely complex projects, and more transparency, because every step is explicit in code.

At the same time, a sensitive question arises: how far should the community rely on this software? Will researchers eventually do little more than check that the computer reports “green”, without understanding each step? Some already warn about a kind of “autopilot mathematics”, where only a small group of specialists can properly scrutinise the tools’ own code.

There is also the dependency on particular platforms and programming languages. Anyone who builds a career on Lean-based proofs ties themselves to an ecosystem. What happens if the community shifts to a different system one day? These issues are increasingly appearing in specialist debates.

What changes for students and lecturers

At many universities, courses on formal proofs and proof-assistants are entering the curriculum. Students learn not only classical proof strategies but also how to encode arguments formally. That sharpens understanding: being forced to spell out every seemingly “obvious” statement quickly reveals where one previously relied on intuition rather than genuine comprehension.

Lecturers see an opportunity to create more transparency. Exam questions, for instance, can be accompanied by simple Lean scripts that let learners test whether their approaches are logically rigorous enough. The often mystical notion of a “proof” becomes a clearly structured process that can be practised step by step.
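Such a course script can be tiny; the following is a hypothetical exercise in that spirit, with the statement fixed and the proof left open:

```lean
-- The exercise sheet ships with `sorry` as a placeholder; Lean flags
-- the file as incomplete until the student supplies a real proof.
theorem exercise (n : Nat) : n ≤ n + 1 := by
  sorry

-- One possible solution: `Nat.le_succ` is exactly this statement.
theorem exercise_solved (n : Nat) : n ≤ n + 1 :=
  Nat.le_succ n
```

The student’s task is to replace `sorry` with a valid argument; the moment the file compiles without warnings, the answer is provably correct.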

What happens next: human creativity, machine rigour

Many researchers expect the coming years to bring a clearer division of labour: humans coin new concepts, make bold conjectures, and sketch high-level strategies, while the detailed work moves into the proof-assistant, supported by AI that recognises useful patterns across millions of lines of existing code.

Especially at the frontier of knowledge, where proofs run to several hundred pages or thousands of lines of code, this combination could push the discipline forward substantially. Projects once considered “too risky” or “too time-consuming” become more realistic. Theories may emerge whose complexity goes far beyond what any single mind could fully survey, yet still count as secure, because every line of formal logic is checkable.

That also shifts what a proof is understood to be. It is no longer only an elegant article in a journal, but a structure made from text, code and jointly maintained libraries. The old image of the solitary genius at a desk makes room for connected teams working with software at the edge of what can be proved in mathematics.

