Programming a Mini-Lean in Julia’s Type System

[Image: Las Meninas, Diego Velázquez, 1656 — a painter at a large canvas in a palace room, surrounded by attendants, a mirror on the far wall reflecting the royal couple.]

This article is based on Guillermo Angeris’s talk “Programming a (mini-)Lean in Julia’s type system”.

A theorem prover, stripped to its engine, is a small trusted kernel, a type checker, and a boundary between the two.

Guillermo Angeris makes that claim concrete by live-coding a toy theorem-proving kernel inside Julia that illustrates how Lean works architecturally. It is not Lean reimplemented in Julia. It is a tiny kernel that makes the trust boundary visible: if you accept the kernel, then anything built on top of it has to pass through the type checker.

This is the third article in the Theorem Proving series. The first article covers the Curry-Howard correspondence. The second implements a tiny prover explicitly in Python. This article embeds those same ideas inside a host language’s type system.

#The Architecture

You could build a theorem prover by treating proofs as strings and implementing symbol-rewriting rules on top of them. That works, but it fights the host language. Curry-Howard gives a better route for programming languages: make propositions into types, proofs into values, and let the type checker enforce the rules.

Every real theorem prover has the same basic structure:

  1. A small trusted kernel that defines the rules of the game
  2. User proof code that builds on those rules
  3. A type checker that enforces the boundary

The kernel is the only code you have to trust. Everything outside it, every proof, every theorem, can only combine results that the kernel produced. If the kernel is correct, then any proof that type-checks is correct, no matter how complex.

This is sometimes called the de Bruijn criterion: make the kernel small enough that a human can read every line, convince themselves it’s correct, and then trust everything the system derives. Lean’s kernel is a few thousand lines of C++. The mini-Lean’s is 61 lines of Julia.

#Building the Kernel: Peano Arithmetic in Julia’s Type System

Angeris encodes Peano arithmetic entirely within Julia’s type system. Peano arithmetic is a formal way of defining natural numbers (0, 1, 2, 3, …) from scratch using two simple rules.

#Natural Numbers as Types

Natural numbers are defined using just two rules:

  1. Zero exists and is a natural number.
  2. Successor: for any natural number n, there exists S(n), which is also a natural number.

This gives you: 0 = Zero, 1 = Succ{Zero}, 2 = Succ{Succ{Zero}}, 3 = Succ{Succ{Succ{Zero}}}, and so on. Every natural number is either zero or the successor of some other natural number. There is no third option.

In Julia, this becomes an abstract type Nat with two subtypes: Zero <: Nat (a type with no data, just representing the concept of zero) and Succ{N} <: Nat where N must itself be a Nat. So Succ{Zero} is 1, Succ{Succ{Zero}} is 2, and so on.

These exist purely at the type level. Julia never creates runtime values for them. The type checker does all the work during compilation.
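A minimal sketch of this encoding (the names follow the article's description; the exact code from the talk may differ):

```julia
# Natural numbers encoded as types, never as runtime values.
abstract type Nat end
struct Zero <: Nat end          # the number 0
struct Succ{N<:Nat} <: Nat end  # the successor of N

# 1 and 2 are just nested type applications:
const One = Succ{Zero}
const Two = Succ{Succ{Zero}}
```

The constraint `N<:Nat` does real work here: `Succ{Succ{Zero}}` is a valid subtype of `Nat`, while nonsense like `Succ{Int}` is rejected the moment you write it down.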

#Equality as a Type

Equality between two type-level numbers is a type called Eq{A, B}. The two type parameters are the two sides of the equation. So the proposition “1 + 1 = 2” is the type Eq{Add{Succ{Zero}, Succ{Zero}}, Succ{Succ{Zero}}}. That looks noisy, but all it says is “the sum of 1 and 1 equals 2.”

Here’s the crucial trick: the constructor for Eq is not exported to users. You can write down the type Eq{A, B} (you can state a proposition), but you can’t create a value of that type directly (you can’t fabricate a proof). The only way to get an Eq value is through the axiom functions that the kernel provides.
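A hedged sketch of how this might look (in this sketch the kernel-only constructor is purely a convention, since Julia has no true access control):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end

# Add is an uninterpreted symbol at the type level;
# the kernel's axioms give it meaning through Eq.
struct Add{M<:Nat, N<:Nat} <: Nat end

# A value of Eq{A, B} is a proof that A equals B.
# By convention, only kernel axiom functions may construct one.
struct Eq{A, B} end

# The proposition "1 + 1 = 2" is a type; stating it costs nothing:
const OnePlusOneIsTwo = Eq{Add{Succ{Zero}, Succ{Zero}}, Succ{Succ{Zero}}}
```

Note the asymmetry: naming the type `OnePlusOneIsTwo` states the proposition, but producing a value of that type — a proof — is supposed to require the kernel's axioms.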

#Addition at the Type Level

Addition is defined recursively, the same way you’d teach a child to add by counting up: zero plus n is just n, and Succ(m) plus n is Succ(m plus n).
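A sketch of those two rules as dispatch on types (the function name `add` is my own assumption, not necessarily the talk's):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end

# Rule 1: 0 + n = n
add(::Type{Zero}, ::Type{N}) where {N<:Nat} = N
# Rule 2: Succ(m) + n = Succ(m + n)
add(::Type{Succ{M}}, ::Type{N}) where {M<:Nat, N<:Nat} = Succ{add(M, N)}
```

Calling `add(Succ{Zero}, Succ{Zero})` peels one `Succ` off the left operand per recursive step and returns the type `Succ{Succ{Zero}}` — the number 2.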

Julia uses multiple dispatch to pick the right rule based on the types. The type system evaluates these reductions during compilation. By the time your code runs (if there’s anything to run), the types have already been fully computed.

#The Six Axioms

The kernel exports exactly six axiom functions. These are the only legitimate ways to construct equality proofs:

1. Reflexivity. For any n, you can construct Eq{n, n}. Anything equals itself. This is the starting point, the simplest proof that exists.

2. Symmetry. Given Eq{a, b}, produce Eq{b, a}. Equality is a two-way street.

3. Transitivity. Given Eq{a, b} and Eq{b, c}, produce Eq{a, c}. This is the critical chaining axiom. Most proofs are sequences of equalities connected by transitivity: “A = B, and B = C, and C = D, therefore A = D.”

4. Successor congruence. Given Eq{a, b}, produce Eq{Succ{a}, Succ{b}}. Applying the same operation to both sides of an equality preserves equality. This lets you “lift” a proof through a layer of successor.

5. Addition base case: Zero + n = n. Produces Eq{Add{Zero, N}, N}. This directly encodes the first clause of the recursive definition of addition.

6. Addition recursive case: Succ(m) + n = Succ(m + n). Produces Eq{Add{Succ{M}, N}, Succ{Add{M, N}}}. This encodes the second clause.

That’s the entire foundation. Six functions plus the type definitions for Nat, Zero, Succ, Eq, and Add. About 61 lines of Julia, whitespace included.

These six axiom functions are the only code allowed to create Eq values from scratch. Everything outside the kernel can only combine Eq values that the axioms produced. The trust boundary from the previous section is now concrete: the 61 lines of kernel are the only code you need to audit.
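Put together, the six axioms might look like this sketch (the axiom names are my own assumptions; the substance is the list above):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Add{M<:Nat, N<:Nat} <: Nat end
struct Eq{A, B} end  # by convention, only the six axioms below construct this

# 1. Reflexivity: n = n
refl(::Type{N}) where {N<:Nat} = Eq{N, N}()
# 2. Symmetry: a = b implies b = a
symm(::Eq{A, B}) where {A, B} = Eq{B, A}()
# 3. Transitivity: a = b and b = c imply a = c
trans(::Eq{A, B}, ::Eq{B, C}) where {A, B, C} = Eq{A, C}()
# 4. Successor congruence: a = b implies Succ(a) = Succ(b)
succ_cong(::Eq{A, B}) where {A, B} = Eq{Succ{A}, Succ{B}}()
# 5. Addition base case: 0 + n = n
add_zero(::Type{N}) where {N<:Nat} = Eq{Add{Zero, N}, N}()
# 6. Addition recursive case: Succ(m) + n = Succ(m + n)
add_succ(::Type{M}, ::Type{N}) where {M<:Nat, N<:Nat} =
    Eq{Add{Succ{M}, N}, Succ{Add{M, N}}}()
```

Transitivity shows the type checker doing the enforcement: the parameter `B` appears in both arguments, so Julia's dispatch only accepts two equalities whose middle terms match exactly.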

#Proving 1 + 1 = 2

With the kernel in place, the first goal is to prove that 1 + 1 = 2. Formally, this means constructing a value whose type is:

Eq{Add{Succ{Zero}, Succ{Zero}}, Succ{Succ{Zero}}}

The proof is a chain of axiom applications:

Step 1. Apply addition rule 2 (the recursive case). Succ{Zero} + Succ{Zero} reduces to Succ{Zero + Succ{Zero}}. This gives us an Eq between Add{Succ{Zero}, Succ{Zero}} and Succ{Add{Zero, Succ{Zero}}}. We’ve peeled off one layer of successor from the left operand.

Step 2. Apply addition rule 1 (the base case) to the inner term. Zero + Succ{Zero} reduces to Succ{Zero}. This gives us Eq{Add{Zero, Succ{Zero}}, Succ{Zero}}.

Step 3. Apply successor congruence to step 2. Since Zero + Succ{Zero} = Succ{Zero}, wrapping both sides in Succ gives us Succ{Zero + Succ{Zero}} = Succ{Succ{Zero}}.

Step 4. Apply transitivity to connect steps 1 and 3. Step 1 says Succ{Zero} + Succ{Zero} = Succ{Zero + Succ{Zero}}. Step 3 says Succ{Zero + Succ{Zero}} = Succ{Succ{Zero}}. Transitivity chains them: Succ{Zero} + Succ{Zero} = Succ{Succ{Zero}}. That’s 1 + 1 = 2.
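The four steps above, sketched in code against the six-axiom kernel described in the previous section (definitions repeated so the snippet stands alone; names are assumptions):

```julia
# --- the kernel's types and axioms (names assumed) ---
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Add{M<:Nat, N<:Nat} <: Nat end
struct Eq{A, B} end
trans(::Eq{A, B}, ::Eq{B, C}) where {A, B, C} = Eq{A, C}()
succ_cong(::Eq{A, B}) where {A, B} = Eq{Succ{A}, Succ{B}}()
add_zero(::Type{N}) where {N<:Nat} = Eq{Add{Zero, N}, N}()
add_succ(::Type{M}, ::Type{N}) where {M<:Nat, N<:Nat} =
    Eq{Add{Succ{M}, N}, Succ{Add{M, N}}}()

# --- the proof of 1 + 1 = 2 ---
const One = Succ{Zero}
step1 = add_succ(Zero, One)   # 1 + 1 = Succ(0 + 1)
step2 = add_zero(One)         # 0 + 1 = 1
step3 = succ_cong(step2)      # Succ(0 + 1) = 2
proof = trans(step1, step3)   # 1 + 1 = 2
```

Swap the arguments to `trans`, or chain `step1` with `step2` directly, and the middle terms no longer match — dispatch fails and the "proof" is rejected.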

The final value has exactly the goal type. Julia’s type checker verifies every intermediate step. If you pass the wrong Eq to transitivity, or if the types of two chained equalities don’t share a common middle term, compilation fails.

The proof is verbose and mechanical. That’s the point. Every step is independently checkable, and the compiler is the referee.

The entire derivation happens during compilation, at the type level. The proof exists purely as a chain of Eq values whose types constrain each other into a valid derivation. If you mess up any link in the chain, the code won’t compile. The type system catches the error before anything runs, the same way it would catch you trying to add a string to an integer.

#Universal Quantification: For All n, n + 1 = Succ(n)

Proving something about all natural numbers requires a different tool: a function type.

The statement “for all n, n + 1 = Succ(n)” means: no matter what number you pick, adding 1 gives you the next number. Under Curry-Howard, this becomes a function type: given any natural number type N, return a value of type Eq{Add{N, Succ{Zero}}, Succ{N}}.

A specific proof like “1 + 1 = 2” is like checking one input. A universal proof like “for all n, n + 1 = Succ(n)” is like writing a function that the checker verifies for every possible input at once. You’re not testing, you’re proving.

The proof is a function that takes an abstract type parameter N <: Nat, not a specific number but a placeholder for any number. Inside the body, Angeris chains the same axioms as before, but now they operate on the generic N instead of a concrete Succ{Zero}.

The key insight: the proof function works by structural recursion on N. For Zero, you apply the base case of addition. For Succ{M}, you use the recursive addition rule, apply the proof recursively for M, then use congruence and transitivity to assemble the result.

The fact that this function compiles is the proof. Julia’s type checker has verified that for any possible N, the return type is Eq{Add{N, Succ{Zero}}, Succ{N}}. You can instantiate it: calling the function with Succ{Zero} (the number 1) recovers the specific fact that 1 + 1 = 2. Calling it with Succ{Succ{Zero}} gives you 2 + 1 = 3. But the function itself, the generic version, proves the universal statement.
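A sketch of that universal proof as structural recursion over the type parameter (built on the same assumed kernel names; definitions repeated so the snippet stands alone):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Add{M<:Nat, N<:Nat} <: Nat end
struct Eq{A, B} end
trans(::Eq{A, B}, ::Eq{B, C}) where {A, B, C} = Eq{A, C}()
succ_cong(::Eq{A, B}) where {A, B} = Eq{Succ{A}, Succ{B}}()
add_zero(::Type{N}) where {N<:Nat} = Eq{Add{Zero, N}, N}()
add_succ(::Type{M}, ::Type{N}) where {M<:Nat, N<:Nat} =
    Eq{Add{Succ{M}, N}, Succ{Add{M, N}}}()

# For all n: n + 1 = Succ(n), i.e. Eq{Add{N, Succ{Zero}}, Succ{N}}
# Base case: 0 + 1 = 1 is exactly the addition base axiom.
add_one(::Type{Zero}) = add_zero(Succ{Zero})
# Recursive case: peel a Succ, recurse, then lift and chain.
function add_one(::Type{Succ{M}}) where {M<:Nat}
    ih   = add_one(M)                # M + 1 = Succ(M)          (induction hypothesis)
    step = add_succ(M, Succ{Zero})   # Succ(M) + 1 = Succ(M + 1)
    trans(step, succ_cong(ih))       # Succ(M) + 1 = Succ(Succ(M))
end
```

Instantiating it recovers the specific facts: `add_one(Succ{Zero})` has exactly the type of “1 + 1 = 2”, and `add_one(Succ{Succ{Zero}})` has the type of “2 + 1 = 3”.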

“For all” doesn’t need special machinery. It’s just a generic function, the same concept you already use when you write code that works on “any list” or “any comparable type.”

#A Small Julia-Specific Bonus

Angeris also shows an extra Julia-specific trick: the subtype lattice ends up reflecting part of the relationship between generic and specific proofs. The real point is still the kernel, the proof terms, and the checker boundary.

#Proving Negation: Zero Does Not Equal One

So far, every proof has been about showing two things are equal. But mathematics also needs negation, the ability to prove that something is false. How do you represent “0 ≠ 1” as a type?

#False as an Uninhabitable Type

The trick is elegant. Define a type called FalseProp that has no public constructor. It’s a type that can never have a value. You can write down the type (you can state “False”), but you can never create a value of it (you can never prove False).

Then define “not P” as “a function from P to FalseProp.” In other words: if assuming P lets you produce a value of FalseProp, something went wrong, because FalseProp values can’t exist. So P must be false.

This is how constructive mathematics works: “not P” means “P leads to a contradiction.”

Julia doesn’t have real access control, so this is a convention. Users could technically call the internal constructor and cheat. (More on this in the limitations section.) But the disciplined version is: the kernel never exports FalseProp’s constructor, so legitimate proofs can never produce one.

#The Last Peano Axiom

The final Peano axiom encodes a basic fact about natural numbers: no successor is zero. 1 is not 0. 2 is not 0. 47 is not 0. You can always count forward, but you never wrap back around to zero.

In the kernel, this axiom is a function with type Eq{Succ{N}, Zero} → FalseProp for any N. It says: give me a proof that some successor equals zero, and I’ll give you a proof of False. Since FalseProp values can never be legitimately created, this means no legitimate proof of Succ(n) = 0 can exist either. The axiom function is allowed to use the internal FalseProp constructor because it’s part of the trusted kernel.

#Proving 0 ≠ 1

To prove that Succ{Zero} ≠ Zero, i.e. 1 ≠ 0, you write a function of type Eq{Succ{Zero}, Zero} → FalseProp:

  1. Assume a “bad assumption,” a hypothetical proof that Succ{Zero} = Zero.
  2. Feed this assumption into the Peano axiom succ_ne_zero.
  3. The axiom returns FalseProp.

The function compiles and type-checks. Its type signature is the statement “1 = 0 implies False,” which is the statement “1 ≠ 0.” Proof complete.
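The whole negation setup, sketched (as above, the kernel-only discipline on FalseProp's constructor is a convention, and axiom names are assumptions):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Eq{A, B} end    # kernel-only constructor, by convention
struct FalseProp end   # no legitimate way to construct one outside the kernel

# Kernel axiom: a proof that Succ(n) = 0 yields False, for any n.
succ_ne_zero(::Eq{Succ{N}, Zero}) where {N<:Nat} = FalseProp()

# User proof: "1 = 0 implies False", which is the statement 1 ≠ 0.
one_ne_zero(bad::Eq{Succ{Zero}, Zero}) = succ_ne_zero(bad)
```

The function's existence is the proof. It can never legitimately be called, because no honest `Eq{Succ{Zero}, Zero}` value can ever be produced to feed it.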

This is proof by contradiction, one of the oldest techniques in mathematics, encoded as a function. You assume something (1 = 0), show that it leads somewhere impossible (FalseProp), and conclude the assumption was wrong. “Impossible” is a type with no values. “Leads to impossible” is a function that returns that type.

A complete system would also need the principle of explosion: “from False, anything follows.” If you somehow had a FalseProp value, you could produce a value of any type. This sounds absurd, but it’s safe because FalseProp values can never exist in the first place.

#Induction: Recursion as Proof

Mathematical induction is how you prove things about all natural numbers. The idea: if something is true for 0, and being true for any number means it’s also true for the next number, then it’s true for all numbers.

Under Curry-Howard, induction doesn’t need to be a separate rule. It is recursive programming. An inductive proof is a recursive function that:

  1. Base case: given Zero, return a proof of P(Zero).
  2. Inductive step: given Succ{N} and a proof of P(N) (obtained by recursion), construct a proof of P(Succ{N}).

If this function compiles, the induction is valid. The compiler verifies that every case is handled and every return type matches the goal.

The universal proof of “for all n, n + 1 = Succ(n)” already uses this pattern. It recurses on the structure of N: handle the zero case, then handle the successor case by calling yourself on a smaller number. Induction is just what recursion looks like when your return types carry logical content.

Angeris starts coding a standalone induction combinator but runs short on time. The idea is a higher-order function (a function that takes other functions as arguments) that accepts a base case proof and a step function, then composes them recursively.
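A sketch of where that combinator was headed, under the same assumed kernel names (this is my reconstruction of the idea, not the talk's unfinished code):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Add{M<:Nat, N<:Nat} <: Nat end
struct Eq{A, B} end
trans(::Eq{A, B}, ::Eq{B, C}) where {A, B, C} = Eq{A, C}()
succ_cong(::Eq{A, B}) where {A, B} = Eq{Succ{A}, Succ{B}}()
add_zero(::Type{N}) where {N<:Nat} = Eq{Add{Zero, N}, N}()
add_succ(::Type{M}, ::Type{N}) where {M<:Nat, N<:Nat} =
    Eq{Add{Succ{M}, N}, Succ{Add{M, N}}}()

# Induction as a higher-order function: a base-case proof plus a
# step function, composed by structural recursion on the type.
induct(base, step, ::Type{Zero}) = base
induct(base, step, ::Type{Succ{M}}) where {M<:Nat} =
    step(M, induct(base, step, M))

# Reproving "n + 1 = Succ(n)" through the combinator:
base = add_zero(Succ{Zero})                                      # 0 + 1 = 1
step = (M, ih) -> trans(add_succ(M, Succ{Zero}), succ_cong(ih))  # lift and chain
two_plus_one = induct(base, step, Succ{Succ{Zero}})              # 2 + 1 = 3
```

Notice what Julia cannot express here: nothing in `induct`'s signature says the result proves anything about the `N` you passed in. That connection between argument value and return type is exactly what dependent types provide.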

This is where Julia starts to creak. In Lean, function return types can depend on argument values. That’s what “dependent types” means: the type of the output changes based on the input. In Julia, you can simulate this with parametric types and multiple dispatch, but it’s indirect. The type system is powerful enough to encode the proofs, but the ergonomics are not designed for it.

#Where Julia Falls Short

The mini-Lean works, but Julia was not designed for this, and it shows.

#No Private Constructors

Julia has no private fields or constructors. Everything is accessible by name. Angeris demonstrates the problem live by directly typing Eq{Succ{Zero}, Zero}(), fabricating a proof that 1 = 0 without going through any axiom.
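The cheat in code form — with an `Eq` type like the one the article describes, this runs without complaint:

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end
struct Eq{A, B} end   # "kernel-only" constructor, but only by convention

# Nothing stops a user from calling the constructor directly:
bogus = Eq{Succ{Zero}, Zero}()   # a fabricated "proof" that 1 = 0
```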

Once you have one false statement, you can derive anything. (That’s the principle of explosion from earlier, except now it’s working against you.) The entire system collapses if users cheat. In Lean, the kernel is a hard trust boundary with genuinely private constructors, so this can’t happen. In Julia, it’s an honor system.

#No Tactics, Only Manual Proofs

Lean has a tactic language that automates routine proof steps. ring solves ring equalities, simp simplifies expressions, omega handles linear arithmetic. When you need to prove 2 + 2 = 4 in Lean, it’s one line.

In the mini-Lean, every proof is assembled by hand, axiom by axiom. Extending the 1+1=2 proof to 2+1=3 means more unrolling, more congruences, more transitivities. Angeris hits type errors live and spends minutes debugging, muttering “I have no idea what the hell is wrong here.” He gets it working without fully understanding what changed, shrugs, and moves on.

This is the best demonstration of what tactic automation buys you. In Lean, you’d write by omega and the proof would be done. The tactic system knows how to solve arithmetic equalities automatically. It can verify 100 + 100 = 200 just as easily as 1 + 1 = 2, because it has algorithms instead of manual unrolling. Lean also has simp (simplify using known facts), norm_num (verified numeric computation), and the ability to write custom tactics.

The gap between the mini-Lean and production Lean is almost entirely in automation, not in the foundational ideas. The engine is the same. Lean just has power steering.

#Type Error Messages Are Unhelpful

When a proof step fails in Lean, you see your proof state: what you’re trying to prove, what you already know, and where things went wrong. In Julia, you get a generic type mismatch error with no context. Debugging means sprinkling typeof() calls everywhere and staring at intermediate values. It’s printf debugging, but for types.

#Performance: The @generated Trick

One practical note: Julia recomputes type-level operations on every call by default. For proofs with deep type nesting (imagine proving something about the number 100, which is Succ wrapped 100 times), this gets slow. Julia’s @generated functions can cache type-level computations so the proof verification only happens once per unique set of type parameters.
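A sketch of the idea on a toy type-level computation (the helper `to_int` is my own illustration, not from the talk; in practice Julia's compiler may already constant-fold the naive recursion, so treat this as making the caching explicit rather than as a guaranteed speedup):

```julia
abstract type Nat end
struct Zero <: Nat end
struct Succ{N<:Nat} <: Nat end

# Naive version: walks the whole Succ tower recursively.
to_int(::Type{Zero}) = 0
to_int(::Type{Succ{N}}) where {N<:Nat} = 1 + to_int(N)

# @generated version: the loop below runs once per concrete type, at
# method-generation time; the compiled body is just a literal integer.
@generated function to_int_fast(::Type{T}) where {T<:Nat}
    n, S = 0, T
    while S !== Zero
        S = S.parameters[1]  # unwrap one Succ layer
        n += 1
    end
    return :($n)
end
```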

#What Production Provers Add

This exercise shows what theorem provers are at the kernel level, and what the layers above buy you.

#The Kernel Is the Same

Lean, at its core, works exactly like this mini-Lean. Small trusted kernel, untrusted proof code on top, type checker enforcing the boundary. The difference is scale and scope: Lean’s kernel handles dependent types, universe polymorphism, and inductive types. The mini-Lean handles Peano arithmetic. But the architecture is the same.

Lean’s 200,000+ verified mathematical declarations all reduce to checks against a kernel of a few thousand lines of C++.

#What Lean Adds on Top

| Layer | Mini-Lean | Lean |
| --- | --- | --- |
| Kernel | 61 lines, Peano arithmetic | ~4,000 lines C++, dependent type theory |
| Trust boundary | Convention (honor system) | Enforced (private constructors) |
| Proof style | Manual axiom chaining | Tactics (simp, omega, ring, rw) |
| Automation | None | Decision procedures, simplifier, custom tactics |
| Error messages | Generic type mismatch | Proof state: goals, hypotheses, context |
| Expressiveness | Natural numbers, equality | Arbitrary mathematics |
| Library | None | mathlib (200,000+ declarations) |
| Tooling | Julia REPL | VS Code, LSP, widgets, documentation |

The automation layer is the biggest gap. Tactics let you work at the level of mathematical reasoning instead of mechanical axiom chaining. by omega replaces twenty lines of manual unrolling. simp [lemma1, lemma2] replaces long chains of rewrites. Custom tactics let library authors package proof strategies so users don’t have to reinvent them.

#The Universe Tower

Angeris hints at deeper possibilities. Julia has a Type type, the type of types. And Type itself has a type (Type{Type}), and so on. This creates a hierarchy that type theorists call a universe tower, where each level can talk about things at the level below.

Why does this matter? Without it, you can create paradoxes. If you allow a “set of all sets,” you get Russell’s paradox: does the set of all sets that don’t contain themselves contain itself? (If it does, it doesn’t. If it doesn’t, it does.) The universe tower prevents this by saying: types at level 1 can only talk about types at level 0, types at level 2 can only talk about types at level 1, and so on. No level can talk about itself.

In Lean, this hierarchy is carefully designed. Prop (propositions) lives at one level. Type (data types) lives at another. Type 1 (the universe containing Type) lives above that. Julia has hints of the same structure, you can write Type{Type} and it’s a valid expression, but it’s more of an accident of the implementation than a deliberate logical framework.

#Some Host Type Systems Can Do This Too

This works in Julia because Julia’s type system is unusually rich in the right ways: parametric types, abstract supertypes, and multiple dispatch that can drive type-level reduction. Some other languages can host fragments of the same idea too. Haskell with GADTs can get close. TypeScript can encode surprising pieces of it. But this is not something you get automatically from “having generics.”

The lesson is specific: once a host type system is expressive enough, you can start smuggling logical reasoning into it. Julia happens to be just strong enough to make the architecture visible.

#Conclusion

From scratch, with no macros and no dependencies, Angeris builds a working theorem prover in 61 lines of Julia. The kernel defines natural numbers, equality, addition, and six axioms. From these primitives alone, he proves 1+1=2, proves n+1=Succ(n) for all n, and proves 0≠1.

The construction makes visible what is usually hidden: theorem provers are not magic. They are a small trusted kernel, a type checker, and a sharp boundary between the two. Everything else, the tactics, the libraries, the tooling, is engineering built on that foundation.

The next article in this series puts that engine to work: writing real proofs in Lean, with all the automation the mini-Lean is missing.