Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: chatgpt proved a theorem in 6 minutes and 20 seconds


Bulhwi Cha (Feb 23 2026 at 17:21):

/-- `isMin_below min a` means that `min` is a minimal element of the set `{y : α | r y a}` with
respect to `r`. -/
def Relation.isMin_below {α : Sort u} (r : α  α  Prop) (min a : α) : Prop :=
  r min a   y : α⦄, r y a  ¬r y min

open Relation

/-- If `a` is accessible through a binary relation `r`, then for every descending chain `f` starting
from `a`, it is not false that `f` ends at a minimal element of the set `{y : α | r y a}` with
respect to `r`.  -/
theorem Acc.not_not_descending_chain_ends_at_min_of_acc {α : Sort u} {r : α  α  Prop} {a : α}
    (acc : Acc r a)
    {f : Nat  α}
    (hsta : f 0 = a)
    (hcon :  n : Nat⦄, isMin_below r a (f n)  f (n + 1) = f n)
    (hdes :  n : Nat⦄, ¬isMin_below r a (f n)  r (f (n + 1)) (f n)) :
    ¬¬∃ (c : Nat), isMin_below r a (f c)   m : Nat⦄, c  m  f m = f c :=
  sorry

ChatGPT 5.2 Thinking proved the original version of the above theorem in 6 minutes and 20 seconds; I spent 11 hours, 11 minutes, and 45 seconds to prove it.

I dropped out of Korea Aerospace University in 2023, and I don't know much about undergraduate mathematics, although I've been using the Lean theorem prover for four years.

I'm not sure whether I'll be able to prove undergraduate-level theorems as fast as the state-of-the-art AI agents, even after I become more knowledgeable about undergraduate mathematics.

However, I'll keep proving theorems by myself for the following reasons:

  1. While trying to prove a theorem, I find out which lemmas are important for achieving my goal.
  2. I understand a theorem much better after I prove it without looking at an AI agent's proof.
  3. Reviewing an AI agent's proof isn't fun.

Still, it's impressive that GPT-5.3-Codex, Claude Sonnet 4.6, Claude Opus 4.6, and Gemini 3.1 Pro can write Lean 4 code to prove basic theorems about induction and recursion. Personally, I don't want to pay money for using these models, so I'll try to find ways to use them for free.

Thomas Waring (Feb 23 2026 at 23:35):

perhaps interestingly the gpt proof uses Classical.choice (and Quot.sound) where yours does not, which i assume was intended judging from the double negation

Bulhwi Cha (Feb 24 2026 at 01:23):

That's right. I had a feeling that I could write a constructive proof of some modified version of the theorem I intended to prove.


Last updated: Feb 28 2026 at 14:05 UTC