Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: great 8b models for generating lean proofs?
Bulhwi Cha (Feb 18 2026 at 14:12):
Are there language models with eight billion parameters on HuggingFace that produce Lean proofs without errors?
Bulhwi Cha (Feb 18 2026 at 14:16):
Yong Lin said:
Our small model, Goedel-Prover-V2-8B, reaches 83.3% on MiniF2F test set at Pass@32, matching the performance of prior state-of-the-art DeepSeek-Prover-V2-671B while being nearly 100 times smaller in model size.
Bulhwi Cha (Feb 18 2026 at 14:22):
I don't have enough money to pay for ChatGPT Plus or Claude Pro, so I'll try using Goedel-Prover-V2-8B.
Bulhwi Cha (Feb 18 2026 at 15:09):
namespace Sequence
theorem const_of_rec (f : Nat → α) (hcon : ∀ (n : Nat), f (n + 1) = f n) (n : Nat) : f n = f 0 := by
have h : ∀ (m : Nat), f m = f 0 := by
intro m
induction m with
| zero => rfl
| succ m ih =>
rw [hcon, ih]
exact h n
theorem const_of_strongRec (f : Nat → α) {c : Nat}
(hcon : ∀ {n : Nat}, c ≤ n → f (n + 1) = f n) {n : Nat} (hle : c ≤ n) : f n = f c := by
have h : ∀ (k : Nat), f (c + k) = f c := by
intro k
induction k with
| zero => rfl
| succ k ih =>
have h₁ : c ≤ c + k := Nat.le_add_right c k
have h₂ : f (c + k + 1) = f (c + k) := hcon (n := c + k) (by simpa [Nat.add_assoc] using h₁)
rw [Nat.add_assoc] at *
simpa [Nat.add_assoc, Nat.add_comm, Nat.add_left_comm] using h₂.trans ih
have h₁ : f n = f c := by
have h₂ : f (c + (n - c)) = f c := h (n - c)
have h₃ : c + (n - c) = n := by
have h₄ : c ≤ n := hle
omega
rw [h₃] at h₂
exact h₂
exact h₁
end Sequence
My proofs
Bulhwi Cha (Feb 18 2026 at 15:10):
I'm satisfied with the result. My desktop computer can run Goedel-Prover-V2-8B, so it's free in a sense!
Bulhwi Cha (Feb 18 2026 at 15:51):
Sadly, it failed to prove the following theorems:
-- this code is written in Lean v4.27.0
import Mathlib.Data.Set.Basic
namespace Set
open Function
universe u
theorem ext' {α : Type u} {A B : Set α} (h : ∃ (f : A → B), Surjective f ∧
∀ (a : A), ∃ (haB : a.val ∈ B), f a = ⟨a.val, haB⟩) : A = B := by
sorry
theorem ext_iff' {α : Type u} {A B : Set α} : A = B ↔ ∃ (f : A → B), Surjective f ∧
∀ (x : A), ∃ (hxB : x.val ∈ B), f x = ⟨x, hxB⟩ := by
sorry
end Set
Bulhwi Cha (Feb 18 2026 at 16:15):
Thread
Here are my proofs:
-- this code is written in Lean v4.26.0
import Mathlib.Data.Set.Basic
namespace Set
open Function
universe u
theorem ext' {α : Type u} {A B : Set α} (h : ∃ (f : A → B), Surjective f ∧
∀ (a : A), ∃ (haB : a.val ∈ B), f a = ⟨a.val, haB⟩) : A = B := by
let ⟨f, hsur, heq⟩ := h
ext
constructor
· next x =>
intro hxA
let ⟨(hxB : x ∈ B), (hfx : f ⟨x, hxA⟩ = ⟨x, hxB⟩)⟩ := heq ⟨x, hxA⟩
exact hxB
· next y =>
intro hyB
let ⟨⟨x, hxA⟩, hfa₁⟩ := hsur ⟨y, hyB⟩
let ⟨(hxB : x ∈ B), (hfa₂ : f ⟨x, hxA⟩ = ⟨x, hxB⟩)⟩ := heq ⟨x, hxA⟩
have : x = y := by
refine Subtype.mk.inj (show ⟨x, hxB⟩ = (⟨y, hyB⟩ : B) from ?_)
rw [← hfa₁, hfa₂]
subst y
exact hxA
theorem ext_iff' {α : Type u} {A B : Set α} : A = B ↔ ∃ (f : A → B), Surjective f ∧
∀ (x : A), ∃ (hxB : x.val ∈ B), f x = ⟨x, hxB⟩ := by
constructor
· intro h
exists fun x ↦ ⟨x.val, h ▸ x.property⟩
constructor
· intro y
exists ⟨y.val, h.symm ▸ y.property⟩
· intro x
exists (h ▸ x.property)
· exact ext'
end Set
Bulhwi Cha (Feb 18 2026 at 16:39):
Goedel-Prover-V2-8B also failed to prove the following theorem:
-- this code is written in Lean v4.27.0
@[simp] theorem List.isSuffixOf_iff_IsSuffix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
isSuffixOf l₁ l₂ ↔ l₁ <:+ l₂ := by
sorry
Thread
Bulhwi Cha (Feb 18 2026 at 16:43):
Here's my proof:
namespace List
@[simp] theorem isPrefixOf_iff_IsPrefix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
isPrefixOf l₁ l₂ ↔ l₁ <+: l₂ := by
match l₁, l₂ with
| [], _ => simp [nil_prefix]
| _::_, [] => simp
| a::as, b::bs =>
constructor
· intro h
simp only [isPrefixOf, Bool.and_eq_true, beq_iff_eq] at h
let ⟨t, ht⟩ := isPrefixOf_iff_IsPrefix.mp h.2
exists t
simpa [h.1]
· intro ⟨t, ht⟩
simp only [cons_append, cons.injEq] at ht
have hpf : as <+: bs := ⟨t, ht.2⟩
simpa [isPrefixOf, ht.1] using isPrefixOf_iff_IsPrefix.mpr hpf
@[simp] theorem isSuffixOf_iff_IsSuffix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
isSuffixOf l₁ l₂ ↔ l₁ <:+ l₂ := by
match l₁, l₂ with
| [], _ => simp [nil_suffix, isSuffixOf]
| _::_, [] => simp [isSuffixOf, isPrefixOf]
| a::as, b::bs =>
constructor
· intro h
simp only [isSuffixOf, reverse_cons] at h
let ⟨t, ht⟩ := isPrefixOf_iff_IsPrefix.mp h
exists t.reverse
simpa using congrArg reverse ht
· intro ⟨t, ht⟩
have hpf : (as.reverse ++ [a]) <+: (bs.reverse ++ [b]) := by
exists t.reverse
simpa using congrArg reverse ht
simpa [isSuffixOf, reverse_cons] using isPrefixOf_iff_IsPrefix.mpr hpf
end List
ChatGPT 5.2 Thinking's Lean code
Bulhwi Cha (Feb 18 2026 at 17:07):
What do we need to do for these relatively small language models to be much better at generating Lean proofs? Do we need more formalized theorems that are readily available to the public?
Bulhwi Cha (Feb 18 2026 at 17:25):
Are handmade natural language and formal language pairs still worthwhile?
Last updated: Feb 28 2026 at 14:05 UTC