Zulip Chat Archive

Stream: Lean for teaching

Topic: A high-school exam question solved in Lean

David Wood (Aug 27 2025 at 10:19):

I'm formalising some high-school exam questions in Lean for educational purposes. I thought I might share my results in this channel in case it invites discussion or commentary by others interested in using Lean this way.

My goal is not necessarily to reach an efficient solution, but to elucidate the reasoning involved as simply and clearly as possible. My own knowledge of mathematics is basically stuck at high-school level, so this is probably easier for me than for an expert more burdened by the curse of knowledge.

Here's an example question:
image.png

Here's my solution:

import Mathlib.Data.Nat.Prime.Defs
import Mathlib.Tactic.IntervalCases

def two_is_prime := Nat.prime_two

theorem even_of_dvd_two (h: Even a) : 2 ∣ a := even_iff_two_dvd.mp h

theorem two_even_prime (h: Nat.Prime a) (h2: Even a): a = 2 :=
   Eq.symm $
    (Nat.prime_dvd_prime_iff_eq two_is_prime h).mp (even_of_dvd_two h2)

-- Lots of permutations of basic algebra for inequalities...
theorem div_le_iff_le_mul_right (hb : b ≠ 0) (hba : b ∣ a): a / b ≤ c ↔ a ≤ c * b :=
  Nat.div_le_iff_le_mul_of_dvd hb hba

theorem div_le_iff_le_mul_left (hb : b ≠ 0) (hba : b ∣ a): a / b ≤ c ↔ a ≤ b * c := by
  rewrite [mul_comm]
  exact div_le_iff_le_mul_right hb hba

theorem le_div_iff_mul_le_right (hb : b ≠ 0):  a ≤ c / b ↔ a * b ≤ c :=
  Nat.le_div_iff_mul_le (Nat.zero_lt_of_ne_zero hb)

theorem le_div_iff_mul_le_left  (hb : b ≠ 0):  a ≤ c / b ↔ b * a ≤ c := by
  rewrite [mul_comm]
  exact le_div_iff_mul_le_right hb

theorem q5b:
    -- LHS: Problem
    (
      a * b = c ∧
      Nat.Prime a ∧
      Nat.Prime b ∧
      50 ≤ c ∧ c ≤ 60
      ∧ Even c
    ) ↔
    -- RHS: Solution
    (
      c = 58 ∧ (
        (a = 2 ∧ b = 29) ∨
        (a = 29 ∧ b = 2)
      )
    ) := by
  constructor
  -- Forward pass: solve the problem
  ·
    intro ⟨abc, pa, pb, lb, ub, ec⟩

    -- Insight 1: An even product must have an even factor
    have a_or_b_even := by rwa [←abc, Nat.even_mul] at ec

    -- Insight 2: The only even prime (factor) is 2
    have a_or_b_eq_two := a_or_b_even.imp
      (fun even_a => two_even_prime pa even_a)
      (fun even_b => two_even_prime pb even_b)

    rewrite [← abc, mul_comm] at lb ub

    rcases a_or_b_eq_two with rfl | rfl
    ·
      have lb := (div_le_iff_le_mul_right (by decide) (by decide)).mpr lb
      have ub := ((le_div_iff_mul_le_right (by decide)).mpr ub)
      simp at lb ub

      interval_cases h: b

      -- Primality check
      any_goals absurd pb; decide

      simp at abc
      symm at abc

      exact ⟨abc, Or.inl ⟨rfl, rfl⟩⟩
    ·
      have lb := (div_le_iff_le_mul_left (by decide) (by decide)).mpr lb
      have ub := ((le_div_iff_mul_le_left (by decide)).mpr ub)
      simp at lb ub

      interval_cases h: a

      -- Primality check
      any_goals absurd pa; decide

      simp at abc
      symm at abc

      exact ⟨abc, Or.inr ⟨rfl, rfl⟩⟩
  -- Backwards pass: check your solution
  ·
    intro ⟨hc, hab⟩
    subst hc
    -- We could show more working out here, but the steps are all familiar.
    -- Once you have your answer, it's just a matter of substitution and
    -- calculation.
    rcases hab with ⟨rfl, rfl⟩ | ⟨rfl, rfl⟩ <;> decide

A particularly awkward feature of this question is that the reasoning splits trivially into two cases based on an arbitrary choice of which factor you infer is equal to 2. I guess on paper you would probably leave it implicit that this makes no difference to the result. Maybe I could have implemented that better in Lean by eliminating the disjunction somehow?

It's also possibly not ideal to use interval_cases and any_goals to iteratively rule out non-prime solutions. I've already found the backtracking features of tactics based on try can be a little confusing. On paper, you would obviously write out each case manually, and since there are only a few here (especially if you restrict the interval before splitting into cases), it might be better to do the same with Lean.

I'll note my workflow for using Lean to calculate an initially-unknown answer (as opposed to proving it, once known) is that I started by formulating the question with a placeholder False for the RHS of the ↔. This allowed me to perform all the same reasoning from my premises, eventually finding my solution in interactive-mode, allowing me to define my goal. The mpr direction of the proof then gives me some assurance that my solution is correct—I assume whatever placeholder I use (be it True - trivial - or False - impossible) for my goal initially won't allow me to prove the reverse direction, until it likely represents a true solution to the stated problem.

Kenny Lau (Aug 27 2025 at 10:25):

David Wood said:

elucidate the reasoning involved as simply and clearly as possible

I would have a bunch of have just so that I can clearly see the intermediate steps, which you kinda did with say have lb and have ub, but you didn't actually specify their types

David Wood (Aug 27 2025 at 10:36):

Right, I guess I'm leaning (har har) quite hard on the infoview in interactive mode. I could get into the habit of inlining more of my type annotations, even when Lean originally inferred them for me.

Kevin Buzzard (Aug 27 2025 at 14:05):

yeah my first reaction also is that it's easier to read if it says things like

    -- Insight 1: An even product must have an even factor
    have a_or_b_even : Even a ∨ Even b := by
      rwa [← abc, Nat.even_mul] at ec

i.e. state the type as well as naming the term.

David Wood (Aug 27 2025 at 15:35):

Another, simpler one.

image.png

This time, the reasoning is quite basic (not to mention branchless), so I tried harder to document the calculation tricks you might use in an exam where calculators aren't allowed. The result may seem quite excessive, but maybe educational.

-- All this to avoid a fraction of work calculating 200 / 25... 🥲
theorem div_div_eq_div_mul {a b c : Nat} (h1 : c ∣ b) (h2 : b ∣ a) (h3: 0 < b / c):
    a / (b / c) = (a / b) * c := by
  have: a = a / b * c * (b / c) := by calc
    a = a / b * b := (Nat.div_mul_cancel h2).symm
    _ = (a / b) * ((b / c) * c) := by rw [Nat.div_mul_cancel h1]
    _ = (a / b) * c * (b / c) := by rw [Nat.mul_assoc, Nat.mul_comm (b / c)]

  exact Nat.div_eq_of_eq_mul_left h3 this

theorem q7: (2^a * 3 * 5^2 = 600) ↔ a = 3 := by
  constructor
  ·
    intro eq

    -- Rearrange to isolate term with variable
    rewrite [Nat.mul_assoc] at eq
    have two_pow_a_isolated : 2^a = 600 / (3 * 5^2) :=
      Nat.eq_div_of_mul_eq_left (by decide) eq

    -- Technique: Extract variable from exponent using log of base
    have log_of_equation : (2^a).log2 = (600 / _).log2 :=
      congrArg Nat.log2 two_pow_a_isolated

    calc (a : Nat)
      a = (2^a).log2 := Nat.log2_two_pow.symm
      _ = (600 / (3 * 5 ^ 2)).log2 := log_of_equation
      _ = (600 / 3 / 5 ^ 2).log2 := by rw [← Nat.div_div_eq_div_mul]
      _ = (200 / 5 ^ 2).log2 := by simp

      -- Incredible detour to avoid barely-tricky mental arithmetic (200 / 25)
      _ = (200 / 25).log2 := by simp
      _ = (200 / (4 * 25 / 4)).log2 := by rw [← @Nat.mul_div_cancel_left 25 4 (by decide)]
      _ = (200 / (100 / 4)).log2 := by simp
      _ = (200 / 100 * 4).log2 := by rw [div_div_eq_div_mul (by decide) (by decide) (by decide)]
      _ = (2 * 4).log2 := by simp

      _ = (2 * 2 * 2).log2 := by simp
      _ = (2^3).log2 := by simp
      _ = 3 := by rw [Nat.log2_two_pow]

  ·
    intro h_eq_3
    subst h_eq_3
    calc
      2 ^ 3 * 3 * 5 ^ 2 = 2 * 2 * 2 * 3 * 5 * 5 := by simp
      _ = 2 * 2 * 2 * 3 * 25 := by simp
      _ = 2 * 2 * 2 * 75 := by simp
      _ = 2 * 2 * 150 := by simp
      _ = 2 * 300 := by simp
      _ = 600 := by simp

The first lemma is quite ridiculous and unnecessary, but I admit I was surprised I couldn't find a built-in equivalent. I'm very bad at algorithms for long division and multiplication, so I've always relied on tricks like these to algebraically simplify certain problems in mental arithmetic. I could certainly never justify why such a shortcut is valid. I'm not sure if I'm surprised that the proof turns out to be quite incomprehensible (though it's not unlikely that's a fault with this particular implementation).

Anyway, the result I'm most satisfied with is ending up with an explicit list of techniques that were necessary to solve a given problem. In this one, most of the work is simple calculation, but the key steps are shown to involve the relationship between exponents and logarithms, which Lean can fully explain for us in definitions like log2_two_pow. Well done, Lean! I only wish I had this when I was facing high-school exams.

David Wood (Aug 28 2025 at 09:02):

Next, a simplification problem.

image.png

theorem q8 (x: Int):  5 * (3 * x + 4) - 2 * (x - 1) = 13 * x + 22 := by
  calc
    _ = 5 * (3 * x + 4) - 2 * (x - 1) := rfl
    _ = 5 * (3 * x) + 5 * 4 - _ := by rw [Int.mul_add 5 (3 * x) 4]
    _ = 5 * 3 * x + 5 * 4 - _ := by rw [Int.mul_assoc]
    _ = 15 * x + 20 - 2 * (x - 1) := by simp
    _ = 15 * x + 20 + -(2 * (x - 1)) := by rw [Int.sub_eq_add_neg]
    _ = 15 * x + 20 + -1 * (2 * (x - 1)) := by rw [←Int.neg_one_mul]
    _ = 15 * x + 20 + -1 * (2 * x - 2 * 1) := by rw [Int.mul_sub]
    _ = 15 * x + 20 + -1 * (2 * x - 2) := by rw [Int.mul_one]
    _ = 15 * x + 20 + (-1 * (2 * x) - -1 * 2) := by rw [Int.mul_sub]
    _ = 15 * x + 20 + (-1 * 2 * x - -1 * 2) := by rw [Int.mul_assoc]
    _ = 15 * x + 20 + (-2 * x - -1 * 2) := by rw [Int.neg_one_mul]
    _ = 15 * x + 20 + (-2 * x - -1 * 2) := by rw [Int.neg_one_mul]
    _ = 15 * x + 20 + (-2 * x - -2) := by rw [Int.neg_one_mul]
    _ = 15 * x + 20 + (-2 * x + 2) := by rw [Int.sub_neg]
    _ = 15 * x + 20 + (-(2 * x) + 2) := by rw [Int.neg_mul]
    -- Finally, one of the rewrites fails without additional direction :'(
    _ = 15 * x + 20 + (2 + -(2 * x)) := by rw (occs := .pos [3]) [Int.add_comm]
    _ = 15 * x + 20 + (2 - 2 * x) := by rw [←Int.sub_eq_add_neg]
    _ = 15 * x + 20 + 2 - 2 * x := by rw [←Int.add_sub_assoc]
    _ = 15 * x + (20 + 2) - 2 * x := by rw [Int.add_assoc]
    _ = 15 * x + 22 - 2 * x := by simp
    _ = 22 + 15 * x - 2 * x := by rw [Int.add_comm]
    _ = 22 + (15 * x - 2 * x) := by rw [Int.add_sub_assoc]
    _ = 22 + (15 - 2) * x := by rw [←Int.sub_mul]
    _ = 22 + 13 * x := by simp
    _ = 13 * x + 22 := by rw [Int.add_comm]

I wasn't sure how to formulate this at first because the question only gives an expression to start with, not a Prop. To avoid putting a spurious placeholder solution in like {expression} = 0, I used an existential quantifier to work on this problem: ∃ solution: Int, 5 * (3 * x + 4) - 2 * (x - 1) = solution, but I don't think this was practically helpful at all. I'll need to look for better workflows for exam-style calculational proofs.

As for the proof itself, I got a lot of practice using calc. There's so much work involved, I think it's actually quite good to teach the fundamental rules of high-school algebra. Rearranging equations at a low-level demands a lot of attention paid to associativity and commutativity. Hard work, but probably good to internalise. My informal knowledge of high-school algebra was surely prone to errors due to glossing over these details in inappropriate situations.

I tried relying on placeholders to omit irrelevant sub-expressions for some calc steps, but I have a feeling this only led to some confusion and unpredictable results of Lean's type inference, especially since I didn't have a clearly stated goal for my calculation to begin with.

David Wood (Aug 28 2025 at 13:34):

The indivisibility of some terms made it necessary to use rational numbers for this one.
image.png

-- Seemingly random output of #min_import...
import Mathlib.Analysis.Normed.Field.Lemmas

-- Let s, b, and j be the ages of Sunita, Beth, and Joel
variable {s b j : ℚ}

theorem q10:
  (
    b = s - 1 ∧
    j = 2 * s ∧
    (s + b + j) / 3 = 5 -- Mean age is 5
  ) ↔ (j = 8 ∧ s = 4 ∧ b = 3) := by
    constructor
    ·
      intro ⟨hb, hj, hmean⟩

      have rearranged_mean: s + b + j = 5 * 3 :=
        (div_eq_iff (by decide)).mp hmean

      have hs : 15 = 4 * s - 1 := calc
        15 = 5 * 3 := by norm_num
        _ = s + b + j := rearranged_mean.symm
        _ = s + b + 2 * s := by rw [hj]
        _ = 2 * s + (s + b) := by rw [add_comm]
        _ = 2 * s + s + b := by rw [add_assoc]
        _ = (2 + 1) * s + b := by rw [←add_one_mul]
        _ = 3 * s + b := by norm_num
        _ = 3 * s + (s - 1) := by rw [hb]
        _ = 3 * s + s - 1 := by rw [add_sub_assoc']
        _ = (3 + 1) * s - 1 := by rw [←add_one_mul]
        _ = 4 * s - 1 := by norm_num
      symm at hs

      have hs := eq_add_of_sub_eq hs
      norm_num at hs
      rw [mul_comm] at hs
      have hs := eq_div_of_mul_eq (by decide) hs
      norm_num at hs
      rw [hs] at hb hj
      norm_num at hb hj

      exact ⟨‹j = 8›, ‹s = 4› , ‹b = 3›⟩

    ·
      intro ⟨hj, hs, hb⟩;
      subst hj hs hb
      refine ⟨?hb, ?hj, ?hmean⟩

      -- Again, I will be slightly lazy in the reverse step and let Lean do the
      -- arithmetic
      all_goals norm_num

David Wood (Aug 28 2025 at 13:41):

It was a struggle to find basic algebra theorems for rational numbers that are equivalent to those I'm becoming familiar with for Nat. I tried using Nat at first, and after being unable to prove divisibility for certain steps, I had to abandon that solution and write a lot of it from scratch to accomodate rational numbers. That said, this solution was simpler in the end because you can get away without supplying every lemma with a proof that everything is positive and divisible.

One other surprise was that decide failed to prove a lot of basic arithmetic statements for rational numbers, even though an instance of Decidable was often available. As a result, I had to use norm_num as my default tactic for proving trivial facts.

Sébastien Gouëzel (Aug 28 2025 at 14:05):

Are you aware of linarith, which can solve basic linear arithmetic equalities and inequalities? For example, after intro ⟨hb, hj, hmean⟩, I guess exact ⟨by linarith, by linarith, by linarith⟩ should solve the goal.

David Wood (Aug 28 2025 at 14:17):

Yes, I was vaguely aware—it seems equivalent to norm_num in these cases. I only use these "clever" tactics for arithmetic facts that would be truly obvious to all high-school students, or learned by rote (e.g. multiplication tables), which often incidentally also seem particularly hard to prove manually.

David Wood (Aug 28 2025 at 14:21):

I was surprised that simp, for instance, doesn't work for things like this with rational numbers:

example: 2 * 2 = 4 := by simp -- Goals accomplished 🎉
example: 2 * 2 = (4: ℚ) := by simp -- simp made no progress

I see norm_num and linarith both work, however:

example: 2 * 2 = (4: ℚ) := by linarith -- Goals accomplished 🎉
example: 2 * 2 = (4: ℚ) := by norm_num -- Goals accomplished 🎉

Sébastien Gouëzel (Aug 28 2025 at 14:38):

The whole proof above is also done just by grind.

David Wood (Aug 28 2025 at 14:43):

Though presumably it cannot also explain why it works, like simp? can?

Last updated: Feb 28 2026 at 14:05 UTC