Zulip Chat Archive

Stream: general

Topic: Proofs by computation

Shubakur Mitra (Nov 14 2025 at 18:35):

In order to deal with computation-heavy proofs we can either use decide, native_decide or metaprogramming.

decide is inefficient because it literally uses kernel reductions and defeqs, and dealing with well-founded recursion is impractical
native_decide cannot be trusted because it avoids kernel altogether
metaprogramming requires a lot of work to deal with each specific proof

Are there any plans to create a tactic that is somewhere between decide and native_decide, that is, instead of relying on kernel reductions, the elaborator constructs the proof of equality while executing the compiled code, handles well-founded recursion efficiently and has only a linear slowdown compared to native_decide?

Kenny Lau (Nov 14 2025 at 18:39):

that's what norm_num does?

Shubakur Mitra (Nov 14 2025 at 18:51):

That's an example, yes. But here we are proposing a tactic that works for arbitrary computable decidability instances.

Kenny Lau (Nov 14 2025 at 18:53):

I don't underestand why everyone fixates on decide and computability. There's so much more to maths than algorithms.

Eric Wieser (Nov 14 2025 at 19:00):

But here we are proposing a tactic that works for arbitrary computable decidability instances.

I think the only reasonable specification of such a tactic is that it is exactly decide

Shubakur Mitra (Nov 14 2025 at 19:06):

decide performs reductions in the kernel. It also reduces terms inside proofs. It cannot handle well-founded recursion efficiently. On the other hand, elaborator can skip reducing proofs and use structural recursion instead of well-founded recursion.

Shubakur Mitra (Nov 14 2025 at 19:08):

native_decide does all sorts of optimizations, but it accepts the result blindly, instead of constructing the proof of equality while executing the code.

Kenny Lau (Nov 14 2025 at 19:09):

Eric Wieser said:

I think the only reasonable specification of such a tactic is that it is exactly decide

I think it's reasonable to have a tactic that does "implemented by" but is safer, like say if someone formalises the AKS primality test in a file way later than where Nat.Prime was defined

Jakub Nowak (Nov 14 2025 at 19:48):

Kenny Lau said:

I think it's reasonable to have a tactic that does "implemented by" but is safer, like say if someone formalises the AKS primality test in a file way later than where Nat.Prime was defined

You can create instance Decidable Nat.Prime that uses AKS primality test.

Jakub Nowak (Nov 14 2025 at 19:52):

So basically, you want native_decide that doesn't use unsafe things, like @[implemented_by], but still executes code natively, and somehow produces the proof along the way?

Jakub Nowak (Nov 14 2025 at 19:59):

Or you want something like decide that produces the proof unsafely, but faster, and kernel only checks the proof at the end?

Jakub Nowak (Nov 14 2025 at 20:00):

I think both things I described are kinda the same. .-.

Aaron Liu (Nov 14 2025 at 20:10):

Shubakur Mitra said:

decide performs reductions in the kernel. It also reduces terms inside proofs. It cannot handle well-founded recursion efficiently. On the other hand, elaborator can skip reducing proofs and use structural recursion instead of well-founded recursion.

I don't get it. Aren't both sides the same in this regard? Elaborator and kernel are both slow in reducing well-founded recursion I thought

Aaron Liu (Nov 14 2025 at 20:11):

this is also one of the reasons I try to make all my definitions use structural recursion instead of well-founded recursion, so that they will be fast in the elaborator and in the kernel

Shubakur Mitra (Nov 14 2025 at 20:28):

By "elaborator" we mean any sort of meta program that constructs a proof and sends it to the kernel (as opposed to the kernel reducing the proof term while performing typechecking). It would certainly be possible to implement a tactic that would simulate computation by performing structural recursion in the same way that is done by a compiled program. The tactic would act like simp, but more constrained; optimized to unfold functions and reduce match patterns with explicit constructors, and to generate a proof term along the way.

Adrian Marti (Nov 16 2025 at 18:29):

I am certainly far from being an expert, but something like native_decide, except that it produces an output proof object that can be checked by the kernel should be possible in theory, no?

Henrik Böving (Nov 16 2025 at 18:53):

Adrian Marti said:

I am certainly far from being an expert, but something like native_decide, except that it produces an output proof object that can be checked by the kernel should be possible in theory, no?

You have just described what a "normal" tactic like simp or grind does :P

Shubakur Mitra (Nov 16 2025 at 18:59):

The question was whether the community or FRO is planning to implement the described functionality. The four-color theorem, BB(5), chromatic number of the plane, Thue-Morse infinite words, and many other theorems are proved by reducing to a large-scale computation. It is true that computation is relevant only to a subset of math problems, and it is also true that reducing an abstract problem to a computable form requires a lot of reasoning, but once the problem is reduced to an algorithm, the prover should assist in executing the algorithm in a reliable and efficient manner.

Shubakur Mitra (Nov 16 2025 at 18:59):

Here is an example of how a relatively efficient computation can be achieved using simp only and typeclasses. Consider this function:

def fstdig (n : ℕ) : ℕ :=
  if n < 10 then n else fstdig (n / 10)

example : fstdig 7654321 = 7 := by
  decide -- fails

decide fails because fstdig uses wf recursion. But if we implement some type classes and instances, we can utilize simp to simulate the computation:

Long code

import Mathlib

def fstdig (n : ℕ) : ℕ :=
  if n < 10 then n else fstdig (n / 10)

class Eval {α : Type*} (x : α) (y : outParam α) : Prop where
  eq : x = y

@[simp] def C := @id
@[simp] def D := @id

theorem evalC {α : Type*} {x y : α} [H : Eval (C x) y] : C x = y := H.eq
theorem evalD {α : Type*} {x y : α} [H : Eval (D x) y] : D x = y := H.eq

attribute [simp] instOfNatNat
instance {P : Prop} [H : Decidable P] : Decidable (C P) := H
instance : Eval (D True) True := ⟨rfl⟩
instance : Eval (D False) False := ⟨rfl⟩
instance {α : Type*} {x y : α} {P : Prop} [H : Decidable P] :
  Eval (C $ if P then x else y) (C $ D $ if C P then x else y) := ⟨rfl⟩
instance {α : Type*} {x y : α} : Eval (D $ if True then x else y) x := ⟨rfl⟩
instance {α : Type*} {x y : α} : Eval (D $ if False then x else y) y := ⟨rfl⟩
instance : Eval (C (0 : ℕ)) 0 := ⟨rfl⟩
instance {n : ℕ} : Eval (C ofNat(n + 1)) n.succ := ⟨rfl⟩
instance {n m : ℕ} : Eval (C $ n = m) (D $ C n = C m) := ⟨rfl⟩
instance : Eval (D $ 0 = 0) True := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ 0 = n.succ) False := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ n.succ = 0) False := ⟨by simp⟩
instance {n m : ℕ} : Eval (D $ n.succ = m.succ) (D $ n = m) := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ 0 = ofNat(n + 1)) False := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ ofNat(n + 1) = 0) False := ⟨by simp⟩
instance {n m : ℕ} : Eval (D $ ofNat(n + 1) = ofNat(m + 1)) (D $ n = m) := ⟨by simp⟩
instance {n m : ℕ} : Eval (C $ n < m) (D $ C n < C m) := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ n < 0) False := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ 0 < n.succ) True := ⟨by simp⟩
instance {n m : ℕ} : Eval (D $ n.succ < m.succ) (D $ n < m) := ⟨by simp⟩
instance {n : ℕ} : Eval (D $ 0 < ofNat(n + 1)) True := ⟨by simp⟩
instance {n m : ℕ} : Eval (D $ ofNat(n + 1) < ofNat(m + 1)) (D $ n < m) := ⟨by simp⟩
instance {n m : ℕ} : Eval (C $ n + m) (D $ C n + C m) := ⟨rfl⟩
instance {n : ℕ} : Eval (D $ n + 0) n := ⟨rfl⟩
instance {n m : ℕ} : Eval (D $ n + ofNat(m + 1)) (D $ n + m).succ := ⟨rfl⟩
instance {n : ℕ} : Eval (C $ fstdig n) (D $ fstdig $ C n) := ⟨rfl⟩
instance {n : ℕ} : Eval (D $ fstdig n) (C $ if n < 10 then n else fstdig (n / 10)) :=
  ⟨by rw [fstdig]; rfl⟩

Shubakur Mitra (Nov 16 2025 at 18:59):

And now we can prove it easily:

example : fstdig 7654321 = 7 := by
  change C _; simp only [evalC, evalD]

It's faster than decide, but slower than native_decide and requires manual definitions of instances. Functions C and D act as markers for simp. Marker C first reduces the expression to a structure of pure constructors, then D traverses the expression and performs reductions of the registered functions by pattern matching on their arguments. The reduction is performed in the bottom-up direction (eager evaluation), except for if-then-else in which case it is lazy.

metakuntyyy (Nov 16 2025 at 19:03):

Kenny Lau said:

Eric Wieser said:

I think the only reasonable specification of such a tactic is that it is exactly decide

I think it's reasonable to have a tactic that does "implemented by" but is safer, like say if someone formalises the AKS primality test in a file way later than where Nat.Prime was defined

Hilariously. I formalised that one in metamath: https://us.metamath.org/mpeuni/aks5.html

Last time I asked the general consensus was that there are better tactics to decide if a number is prime. I can port my proof to mathlib, I don't know if it would fit the library though. I have started the proof few months ago and have about 400 LoC, but that one will take a while to formalise. It took me if I'd have to guess around 200-300 hours to get it into metamath.

Alfredo Moreira-Rosa (Nov 16 2025 at 20:18):

@Shubakur Mitra I think, like always, when you have an idea and people don't see the benefit, write a solution yourself and share it.
If people find it useful, they'll use it. And if it's successful it may endup integrated if that's your goal.

Chris Bailey (Nov 16 2025 at 21:02):

There was a gadget added recently for instructing the kernel to eagerly reduce terms that seems to be intended for this kind of reflection: https://leanprover-community.github.io/mathlib4_docs/Init/Core.html#eagerReduce

I do not think it's a simple matter of implementation and man hours to make something like Coq/Rocq's vm_compute work for Lean, because reduction in Lean relies on type inference and definitional equality. Rocq's implementation (at least as far as it's explained in the vm_compute paper) are more amenable to this, even though vm_compute is probably more comparable to native_decide in that it greatly expands the TCB.

I spent a while working out how to adapt a krivine machine with eager annotations to work for kernel reduction in Lean, but even with annotations you end up needing to quote/read-back too frequently for there to be any advantage. Caching and equality checking with closures ends up being a disaster.

Last updated: Dec 20 2025 at 21:32 UTC