Zulip Chat Archive

Stream: general

Topic: Proof size

Patrick Massot (Dec 02 2019 at 18:33):

This is a meta-question. How would you measure the size of a proof? Consider for instance rat.mul_one, from data/rat/basic.lean. This is a two lines, 111 characters long tactic proof, including a call to simp.

protected theorem mul_one : a * 1 = a :=
num_denom_cases_on' a $ λ n d h,
by change (1:ℚ) with 1 /. 1; simp [h]

Then try:

open tactic
run_cmd
do env ← get_env,
   decl ← env.get `rat.mul_one,
   trace (sizeof $ to_string decl.value)

run_cmd
do env ← get_env,
   decl ← env.get `rat.mul_one,
   proof ← pp decl.value,
   trace (sizeof $ to_string proof)

The first command says this proof is 7751 character long, and the second one says it pretty prints to 2015 characters. How long would you say this proof is? Do you have a better way to ask Lean what it thinks about this?

Mario Carneiro (Dec 02 2019 at 21:39):

I think the best measure would be something like "number of nodes in the expr"

Patrick Massot (Dec 02 2019 at 21:40):

And how would you compute that?

Mario Carneiro (Dec 02 2019 at 21:46):

namespace level
meta def size : level → ℕ
| (succ l)     := size l + 1
| (max l₁ l₂)  := size l₁ + size l₂ + 1
| (imax l₁ l₂) := size l₁ + size l₂ + 1
| _            := 1
end level

namespace expr
meta def size : expr → ℕ
| (var n) := 1
| (sort l) := l.size + 1
| (const n ls) := list.sum (level.size <$> ls) + 1
| (mvar n m t)   := size t + 1
| (local_const n m bi t) := size t + 1
| (app e f) := size e + size f + 1
| (lam n bi e t) := size e + size t + 1
| (pi n bi e t) := size e + size t + 1
| (elet n g e f) := size g + size e + size f + 1
| (macro d args) := list.sum (size <$> args) + 1
end expr

#eval do
  d ← tactic.get_decl ``rat.mul_one,
  tactic.trace d.value.size -- 2256

Mario Carneiro (Dec 02 2019 at 21:47):

Then again, this has a tendency to overcount duplicate subproofs. A better count would deduplicate those

Patrick Massot (Dec 02 2019 at 21:50):

Would it be easy to deduplicate these? Maybe using Simon's traversing technology and collecting a list of subproofs? I don't know what I'm talking about.

Mario Carneiro (Dec 02 2019 at 22:10):

import data.rat

namespace level
meta def size : level → ℕ
| (succ l)     := size l + 1
| (max l₁ l₂)  := size l₁ + size l₂ + 1
| (imax l₁ l₂) := size l₁ + size l₂ + 1
| _            := 1

meta def dedup_size_aux : level → state (native.rb_set level) ℕ | l := do
  s ← get,
  if s.contains l then return 1 else
  match l with
  | (succ l)     := do
    n ← dedup_size_aux l,
    return (n + 1)
  | (max l₁ l₂)  := do
    n₁ ← dedup_size_aux l₁,
    n₂ ← dedup_size_aux l₂,
    return (n₁ + n₂ + 1)
  | (imax l₁ l₂) := do
    n₁ ← dedup_size_aux l₁,
    n₂ ← dedup_size_aux l₂,
    return (n₁ + n₂ + 1)
  | l            := return 1
  end <* modify (λ s, s.insert l)

meta def dedup_size (l : level) : ℕ :=
prod.fst $ (dedup_size_aux l).run $
@native.mk_rb_set _ ⟨λ a b : level, a.lt b⟩ _

end level

namespace expr
meta def size : expr → ℕ
| (var n) := 1
| (sort l) := l.size + 1
| (const n ls) := list.sum (level.size <$> ls) + 1
| (mvar n m t)   := size t + 1
| (local_const n m bi t) := size t + 1
| (app e f) := size e + size f + 1
| (lam n bi e t) := size e + size t + 1
| (pi n bi e t) := size e + size t + 1
| (elet n g e f) := size g + size e + size f + 1
| (macro d args) := list.sum (size <$> args) + 1

meta def lift {α} : state (native.rb_set level) α → state (native.rb_set level × native.rb_set expr) α
| ⟨f⟩ := ⟨λ ⟨s, t⟩, let ⟨a, s'⟩ := f s in ⟨a, s', t⟩⟩

meta def dedup_size_aux : expr → state (native.rb_set level × native.rb_set expr) ℕ
| e := do
  s ← get,
  if s.2.contains e then return 1 else
  match e with
  | (sort l) := do
    n ← lift l.dedup_size_aux,
    return (n + 1)
  | (const n ls) := do
    ns ← list.traverse (lift ∘ level.dedup_size_aux) ls,
    return (list.sum ns + 1)
  | (mvar n m t) := do
    n ← t.dedup_size_aux,
    return (n + 1)
  | (local_const n m bi t) := do
    n ← t.dedup_size_aux,
    return (n + 1)
  | (app e f) := do
    n₁ ← e.dedup_size_aux,
    n₂ ← f.dedup_size_aux,
    return (n₁ + n₂ + 1)
  | (lam n bi e t) := do
    n₁ ← e.dedup_size_aux,
    n₂ ← t.dedup_size_aux,
    return (n₁ + n₂ + 1)
  | (pi n bi e t) := do
    n₁ ← e.dedup_size_aux,
    n₂ ← t.dedup_size_aux,
    return (n₁ + n₂ + 1)
  | (elet n g e f) := do
    n₁ ← g.dedup_size_aux,
    n₂ ← e.dedup_size_aux,
    n₃  ← f.dedup_size_aux,
    return (n₁ + n₂ + n₃ + 1)
  | (macro d args) := do
    ns ← list.traverse dedup_size_aux args,
    return (list.sum ns + 1)
  | _ := return 1
  end <* modify (λ ⟨s₁, s₂⟩, ⟨s₁, s₂.insert e⟩)

meta def dedup_size (e : expr) : ℕ :=
prod.fst $ (dedup_size_aux e).run
(@native.mk_rb_set _ ⟨λ a b : level, a.lt b⟩ _,
 @native.mk_rb_set _ ⟨λ a b : expr, a.lt b⟩ _)
end expr

#eval do
  d ← tactic.get_decl ``rat.mul_one,
  tactic.trace d.value.size -- 2256

#eval do
  d ← tactic.get_decl ``rat.mul_one,
  tactic.trace d.value.dedup_size -- 633

Patrick Massot (Dec 02 2019 at 22:33):

Thank you very much! I'll need time to decipher this, but first I must sleep...

Oliver Nash (Jun 16 2022 at 15:13):

I'd be mildly interested to see a timeseries of the median proof size in Mathlib for some longish history (sampled monthly, say). Is that easy to generate?

Oliver Nash (Jun 16 2022 at 15:14):

(I emphasise mildly, nobody should go to trouble over this.)

Arndt Schnabel (Jun 17 2022 at 06:57):

I once did something similar for Metamaths' set.mm, it definitely looks cool: https://github.com/void4/mmplot#readme log of expanded (including duplicate) number of proof steps vs number of subtheorems :)

Arndt Schnabel (Jun 17 2022 at 08:58):

Going through the natural number game, it might be neat to visualize what the "induction depths" of the proofs are! Say, the expression 0^succ(0) seems to require 4 levels (definition of natural numbers, addition, multiplication and power). What "level of induction" are modern mathematical proofs on?!

Mario Carneiro (Jun 17 2022 at 09:03):

Some proofs do require nested inductions (the commutativity of addition is a famous example), but anything about concrete natural numbers like 0 ^ 1 doesn't because we can just use the definitional equations a bunch of times. (On the other hand, sometimes induction can be used to speed up a proof that can be done without it: 0 ^ 1000 can be evaluated to 1 with no inductions by using pow_succ 1000 times, or with one level of induction by proving 0 ^ (n+1) = 1.)

Arndt Schnabel (Jun 17 2022 at 09:16):

Not sure how easily this is possible in Lean, but seeing the number of induction tactics the verifier encounters while checking the full (deduplicated) dependency DAG would be interesting to plot for all proofs in the database!

Mario Carneiro (Jun 17 2022 at 09:35):

yes, that sounds doable

Mario Carneiro (Jun 17 2022 at 11:17):

Here's an approximate implementation of how many "nested inductions" are needed to prove a theorem. It's too slow to run on all of mathlib, but I've run it on some of the core theorems from NNG:

import tactic.core
open tactic

meta mutual def get_expr_depth, get_const_depth
  (env : environment) (consts : ref (name_map (bool × ℕ))) (exprs : ref (expr_map (bool × ℕ)))
with get_expr_depth : expr → tactic (bool × ℕ)
| e := do
  s ← read_ref exprs,
  match s.find e with
  | some n := pure n
  | none := do
    let (f, es) := e.get_app_fn_args,
    b ← es.mfoldl (λ x e, (max x ∘ prod.snd) <$> get_expr_depth e) 0,
    v ← match f with
    | (expr.const n _) := do (i, a) ← get_const_depth n, pure (i, if i then a + b else max a b)
    | (expr.lam _ _ _ e) := do (i, a) ← get_expr_depth e, pure (i, max a b)
    | (expr.elet _ _ e f) := do
      (_, a₁) ← get_expr_depth e,
      (_, a₂) ← get_expr_depth f,
      pure (ff, max (max a₁ a₂) b)
    | _ := pure (ff, 0)
    end,
    v <$ write_ref exprs (s.insert e v)
  end
with get_const_depth : name → tactic (bool × ℕ)
| n := do
  m ← read_ref consts,
  match m.find n with
  | some sp := pure sp
  | none := do
    d ← env.get n,
    (b, r) ← match d with
    | (declaration.defn n us t v _ _) := get_expr_depth v
    | (declaration.thm n us t v) := get_expr_depth v.get
    | (declaration.cnst n us t _) := pure (if env.is_recursor n then (tt, 1) else (ff, 0))
    | (declaration.ax n us t) := pure (if env.is_recursor n then (tt, 1) else (ff, 0))
    end,
    pure (b && d.type.pi_codomain.get_app_fn.is_var, r)
  end

run_cmd
  using_new_ref mk_name_map $ λ m, using_new_ref (expr_map.mk _) $ λ r, do
    env ← tactic.get_env,
    [``nat.rec, ``nat.rec_on, ``nat.cases_on, ``nat.below, ``nat.brec_on,
      ``nat.succ, ``nat.pred,
      ``nat.add, ``nat.add_zero, ``nat.add_succ, ``nat.zero_add, ``nat.succ_add,
      ``nat.add_comm, ``nat.add_assoc, ``nat.add_left_comm,
      ``nat.mul, ``nat.mul_zero, ``nat.mul_succ, ``nat.zero_mul, ``nat.succ_mul,
      ``nat.right_distrib, ``nat.mul_comm,
      ``nat.left_distrib, ``nat.mul_assoc
    ].mmap' (λ d, do
      (_, n) ← get_const_depth env m r d,
      trace (d, n))

(nat.rec, 1)
(nat.rec_on, 1)
(nat.cases_on, 1)
(nat.below, 1)
(nat.brec_on, 2)
(nat.succ, 0)
(nat.pred, 1)
(nat.add, 2)
(nat.add_zero, 2)
(nat.add_succ, 2)
(nat.zero_add, 4)
(nat.succ_add, 4)
(nat.add_comm, 5)
(nat.add_assoc, 10)
(nat.add_left_comm, 10)
(nat.mul, 4)
(nat.mul_zero, 4)
(nat.mul_succ, 4)
(nat.zero_mul, 11)
(nat.succ_mul, 15)
(nat.right_distrib, 15)
(nat.mul_comm, 18)
(nat.left_distrib, 18)
(nat.mul_assoc, 21)

Mario Carneiro (Jun 17 2022 at 11:20):

These numbers are probably a lot larger than you might expect. Part of that is because the method is pessimistic when it comes to definitions that look like recursors with arguments in weird places that have more recursors in them, and the other part is that lean uses some "unnecessary" recursions to do some of its definitions, especially anything written with the equation compiler - these always go through nat.brec_on which itself requires a nested induction to define, which means for example that nat.add needs two nested inductions to define

Patrick Johnson (Jun 17 2022 at 12:29):

Mario Carneiro said:

Some proofs do require nested inductions (the commutativity of addition is a famous example).

Your algorithm doesn't agree with that :)

lemma nat.add_comm' {m n : ℕ} : m + n = n + m := let r := @nat.rec_on in
begin
  apply r n,
  { apply r m,
    { refl },
    { rintro n h, change n.succ = (0 + n).succ, rw ←h, refl }},
  { have h₁ : ∀ (m n : ℕ), m.succ + n = (m + n).succ,
    { rintro m n, apply r n,
      { refl },
      { rintro n h, change (m.succ + n).succ = _, rw h, refl }},
    rintro n h, rw [h₁, ←h], refl},
end

It says (nat.add_comm', 1)

Mario Carneiro (Jun 17 2022 at 13:45):

That's why I said it's an approximation. It's actually pretty hard to get a straight answer to how many inductions there are in a thing when you can create higher order functions that might be hiding an induction and apply them an unbounded number of times

Mario Carneiro (Jun 17 2022 at 13:46):

It is a lot easier to do this analysis over, say, PA where you have to explicitly apply the induction axiom (although it is still hard to count whether one induction proof is "nested inside" another)

Mario Carneiro (Jun 17 2022 at 13:48):

Gentzen's proof of consistency of PA actually does this kind of accounting, but the nesting depths correspond not to natural numbers but to ordinals less than $\epsilon_0$

Mario Carneiro (Jun 17 2022 at 13:49):

In lean I think you can make much more fearsome inductive proofs, which unfold to a transfinitely nested application of the induction axiom

Arndt Schnabel (Jun 17 2022 at 15:57):

Another question: With algorithms, a space-time tradeoff is often possible. Has something similar been observed with proofs (e.g. for proof length or depth), with more abstract 'resources' as tradeable parameters (e.g. like the number of induction tactics used)?

Arndt Schnabel (Jun 17 2022 at 16:11):

I'm not sure if such noob/crank deliberations are welcome here, but I've always wondered how mathematicians reason about "proof space", and if anything can be metamathematically proven about it.

Mathematicians reason about proof space in practice somehow, but problems like

Q: Given a set of proofs, is it possible to estimate the deduction distance and other properties of yet unproven statements, saving resources during the proof search?

are in the most general case impossible to automate.

A: For all but the simplest systems and properties, the problem of telling whether a formula has that property with respect to that system is undecidable; and even for very simple systems and properties, this problem is almost always computationally infeasible.

from: https://math.stackexchange.com/questions/3477810/estimating-meta-mathematical-properties-of-conjectures

Yet, mathematicians do it. Intuitions like "this unsolved problem is hard" can be approximated through social knowledge ('many have tried and failed') or own failed attempts ('it's difficult to approach/reason about this', 'i'm lacking knowledge'). Unlike programming, there are no syntactic or semantic limits, so it must be extremely difficult to judge something like the proof complexity/depth of an unsolved problem in advance. Yet, could there be provably accurate heuristics?

Gabriel Ebner (Jun 17 2022 at 16:33):

Mario Carneiro said:

It is a lot easier to do this analysis over, say, PA where you have to explicitly apply the induction axiom (although it is still hard to count whether one induction proof is "nested inside" another)

IIRC you only need a single (obviously non-nested) induction in PA (though you're getting more quantifiers in the induction formula as a tradeoff).

Mario Carneiro (Jun 17 2022 at 17:21):

Hm, that's interesting. Do you have a reference, or a sketch for how to e.g. prove the ackermann function exists without using two nested inductions?

Mario Carneiro (Jun 17 2022 at 17:53):

Ah, I got it:

constant A : ℕ → ℕ → Prop
axiom A_0 : A 0 0
axiom A_0_succ {n} : A 0 n → A 0 (n + 1)
axiom A_succ_0 {m} : A m 1 → A (m + 1) 0
axiom A_succ_succ {m n} : (∀ n, A m n) → A (m + 1) n → A (m + 1) (n + 1)

theorem ack (m n) : A m n :=
begin
  have : ∀ n,
    A 0 n ∧
    (∀ a, (∀ b, A a b) → A (a + 1) n) ∧
    ((∀ b, A 0 b) → (∀ a, (∀ b, A a b) → ∀ b, A (a + 1) b) → ∀ b, A n b),
  { intro n, induction n with n IH,
    { exact ⟨A_0, λ a h1, A_succ_0 (h1 _), λ h1 h2, h1⟩ },
    { obtain ⟨IH1, IH2, IH3⟩ := IH,
      exact ⟨A_0_succ IH1, λ a h1, A_succ_succ h1 (IH2 _ h1), λ h1 h2, h2 _ (IH3 h1 h2)⟩ } },
  exact (this _).2.2 (λ n, (this n).1) (λ a h1 b, (this b).2.1 _ h1) _
end

Mario Carneiro (Jun 17 2022 at 17:56):

for comparison here's the "obvious" proof using nested inductions:

theorem ack (m n) : A m n :=
begin
  induction m with m IH generalizing n,
  { induction n with n ih,
    { exact A_0 },
    { exact A_0_succ ih } },
  { induction n with n ih,
    { exact A_succ_0 (IH _) },
    { exact A_succ_succ IH ih } },
end

Gabriel Ebner (Jun 17 2022 at 19:17):

It's a fairly trivial observation. You just make a big conjunction (essentially). My advisor used to point this out all the time, here's a paper with that observation among other stuff: https://lmcs.episciences.org/4071

Mario Carneiro (Jun 17 2022 at 20:06):

The part that I was struggling with was how to do a bunch of inductions where the result of one depends on having done another part of the proof. The solution is to add these parts as additional hypotheses in the implication (these are the hypotheses in the third conjunct of the IH in the example) and stitch everything back together after the induction proof.

Patrick Johnson (Jun 18 2022 at 04:25):

Interesting. It means the commutativity of addition can also be proved using only one induction without "cheating"

lemma nat.add_comm' {m n : ℕ} : m + n = n + m :=
begin
  have h : ∀ a, (0 + a = a) ∧ (∀ (b : ℕ), b.succ + a = (b + a).succ) ∧
    ∀ (b : ℕ), (∀ a, 0 + a = a) → (∀ (a : ℕ), ∀ (b : ℕ), b.succ + a = (b + a).succ) →
    b + a = a + b,
  { intro a, induction a with a ih,
    { exact ⟨rfl, λ b, rfl, λ b h _, (h _).symm⟩ },
    { rcases ih with ⟨ih₁, ih₂, ih₃⟩, refine ⟨_, λ b, _, λ b h₁ h₂, _⟩,
      { change (0 + a).succ = _, rw ih₁ },
      { change (b.succ + a).succ = _, rw ih₂, refl },
      { rw [h₂, ←ih₃ _ h₁ h₂], refl }}},
  exact (h _).2.2 _ (λ _, (h _).1) (λ _ _, (h _).2.1 _),
end

Although the algorithm says (nat.add_comm', 10), probably because we use local proof h multiple times.

Mario Carneiro (Jun 18 2022 at 05:44):

I think the issue might be that eq.rec counts as a recursion that is nested inside the nat.rec proof

Mario Carneiro (Jun 18 2022 at 05:45):

using a hypothesis multiple times does not count against you in that method

Last updated: May 02 2025 at 03:31 UTC