Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: great 8b models for generating lean proofs?


Bulhwi Cha (Feb 18 2026 at 14:12):

Are there language models with eight billion parameters on HuggingFace that produce Lean proofs without errors?

Bulhwi Cha (Feb 18 2026 at 14:16):

Yong Lin said:

Our small model, Goedel-Prover-V2-8B, reaches 83.3% on MiniF2F test set at Pass@32, matching the performance of prior state-of-the-art DeepSeek-Prover-V2-671B while being nearly 100 times smaller in model size.

Bulhwi Cha (Feb 18 2026 at 14:22):

I don't have enough money to pay for ChatGPT Plus or Claude Pro, so I'll try using Goedel-Prover-V2-8B.

Bulhwi Cha (Feb 18 2026 at 15:09):

namespace Sequence

theorem const_of_rec (f : Nat  α) (hcon :  (n : Nat), f (n + 1) = f n) (n : Nat) : f n = f 0 := by
  have h :  (m : Nat), f m = f 0 := by
    intro m
    induction m with
    | zero => rfl
    | succ m ih =>
      rw [hcon, ih]
  exact h n

theorem const_of_strongRec (f : Nat  α) {c : Nat}
    (hcon :  {n : Nat}, c  n  f (n + 1) = f n) {n : Nat} (hle : c  n) : f n = f c := by
  have h :  (k : Nat), f (c + k) = f c := by
    intro k
    induction k with
    | zero => rfl
    | succ k ih =>
      have h₁ : c  c + k := Nat.le_add_right c k
      have h₂ : f (c + k + 1) = f (c + k) := hcon (n := c + k) (by simpa [Nat.add_assoc] using h₁)
      rw [Nat.add_assoc] at *
      simpa [Nat.add_assoc, Nat.add_comm, Nat.add_left_comm] using h₂.trans ih
  have h₁ : f n = f c := by
    have h₂ : f (c + (n - c)) = f c := h (n - c)
    have h₃ : c + (n - c) = n := by
      have h₄ : c  n := hle
      omega
    rw [h₃] at h₂
    exact h₂
  exact h₁

end Sequence

My proofs

Bulhwi Cha (Feb 18 2026 at 15:10):

I'm satisfied with the result. My desktop computer can run Goedel-Prover-V2-8B, so it's free in a sense!

Bulhwi Cha (Feb 18 2026 at 15:51):

Sadly, it failed to prove the following theorems:

-- this code is written in Lean v4.27.0

import Mathlib.Data.Set.Basic

namespace Set

open Function

universe u

theorem ext' {α : Type u} {A B : Set α} (h :  (f : A  B), Surjective f 
     (a : A),  (haB : a.val  B), f a = a.val, haB) : A = B := by
  sorry

theorem ext_iff' {α : Type u} {A B : Set α} : A = B   (f : A  B), Surjective f 
     (x : A),  (hxB : x.val  B), f x = x, hxB := by
  sorry

end Set

Bulhwi Cha (Feb 18 2026 at 16:15):

Thread

Here are my proofs:

-- this code is written in Lean v4.26.0

import Mathlib.Data.Set.Basic

namespace Set

open Function

universe u

theorem ext' {α : Type u} {A B : Set α} (h :  (f : A  B), Surjective f 
     (a : A),  (haB : a.val  B), f a = a.val, haB) : A = B := by
  let f, hsur, heq := h
  ext
  constructor
  · next x =>
    intro hxA
    let (hxB : x  B), (hfx : f x, hxA = x, hxB) := heq x, hxA
    exact hxB
  · next y =>
    intro hyB
    let ⟨⟨x, hxA, hfa₁ := hsur y, hyB
    let (hxB : x  B), (hfa₂ : f x, hxA = x, hxB) := heq x, hxA
    have : x = y := by
      refine Subtype.mk.inj (show x, hxB = (y, hyB : B) from ?_)
      rw [ hfa₁, hfa₂]
    subst y
    exact hxA

theorem ext_iff' {α : Type u} {A B : Set α} : A = B   (f : A  B), Surjective f 
     (x : A),  (hxB : x.val  B), f x = x, hxB := by
  constructor
  · intro h
    exists fun x  x.val, h  x.property
    constructor
    · intro y
      exists y.val, h.symm  y.property
    · intro x
      exists (h  x.property)
  · exact ext'

end Set

Bulhwi Cha (Feb 18 2026 at 16:39):

Goedel-Prover-V2-8B also failed to prove the following theorem:

-- this code is written in Lean v4.27.0

@[simp] theorem List.isSuffixOf_iff_IsSuffix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
    isSuffixOf l₁ l₂  l₁ <:+ l₂ := by
  sorry

Thread

Bulhwi Cha (Feb 18 2026 at 16:43):

Here's my proof:

namespace List

@[simp] theorem isPrefixOf_iff_IsPrefix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
    isPrefixOf l₁ l₂  l₁ <+: l₂ := by
  match l₁, l₂ with
  | [],   _  => simp [nil_prefix]
  | _::_, [] => simp
  | a::as, b::bs =>
    constructor
    · intro h
      simp only [isPrefixOf, Bool.and_eq_true, beq_iff_eq] at h
      let t, ht := isPrefixOf_iff_IsPrefix.mp h.2
      exists t
      simpa [h.1]
    · intro t, ht
      simp only [cons_append, cons.injEq] at ht
      have hpf : as <+: bs := t, ht.2
      simpa [isPrefixOf, ht.1] using isPrefixOf_iff_IsPrefix.mpr hpf

@[simp] theorem isSuffixOf_iff_IsSuffix [BEq α] [LawfulBEq α] {l₁ l₂ : List α} :
    isSuffixOf l₁ l₂  l₁ <:+ l₂ := by
  match l₁, l₂ with
  | [],   _  => simp [nil_suffix, isSuffixOf]
  | _::_, [] => simp [isSuffixOf, isPrefixOf]
  | a::as, b::bs =>
    constructor
    · intro h
      simp only [isSuffixOf, reverse_cons] at h
      let t, ht := isPrefixOf_iff_IsPrefix.mp h
      exists t.reverse
      simpa using congrArg reverse ht
    · intro t, ht
      have hpf : (as.reverse ++ [a]) <+: (bs.reverse ++ [b]) := by
        exists t.reverse
        simpa using congrArg reverse ht
      simpa [isSuffixOf, reverse_cons] using isPrefixOf_iff_IsPrefix.mpr hpf

end List

ChatGPT 5.2 Thinking's Lean code

Bulhwi Cha (Feb 18 2026 at 17:07):

What do we need to do for these relatively small language models to be much better at generating Lean proofs? Do we need more formalized theorems that are readily available to the public?

Bulhwi Cha (Feb 18 2026 at 17:25):

Are handmade natural language and formal language pairs still worthwhile?


Last updated: Feb 28 2026 at 14:05 UTC