Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: gptf generating theorems

Kevin Buzzard (Apr 29 2021 at 15:30):

I was showing one of my sons gptf (in the lean-liquid repo) and he suggested that instead of training GPT to generate proofs, could one use it to generate theorem statements? I am not a computer scientist but this sounded like a reasonable enough question to me!

Jesse Michael Han (Apr 29 2021 at 15:45):

yeah this would be a fun experiment! this has been studied elsewhere: the N2Formal team had some experiments with unconditional generation in HOL Light

currently the model is trained on theorem naming: lean statement -> theorem_name_in_mathlib, one could reverse the role of the names and statements and then try seeing what happens when you supply a reasonable-looking theorem name it's never seen during training

one way you can try playing with (goal-conditioned) conjecturing now is to use gptf {prefix := "have :"} to force it to synthesize a have-statement

Alex J. Best (Apr 29 2021 at 15:46):

If you visit https://beta.openai.com/playground?model=formal-lean-wm-to-tt-m1-m2-v4-c4 and sign in (assuming you signed up for the beta they announced some time back) you can play around with just clicking submit and having the model generate statements and then try and prove them. Basically given any prefix gptf will try and complete it, so you can start with something like

<|endoftext|>LN] GOAL α : Type u_1, _inst_1 : group α,  g h : G ⊢ g * h

and click submit to see what gptf thinks this term should be, I just tried this a couple of times and got g * h = g * h by rfl, but more interestingly also the theorem

g * h * h⁻¹ = g PROOFSTEP rw [mul_assoc, ← gpow_add, add_right_cancel_iff, one_mul]

not sure that the proof works in this case :grinning:

Jesse Michael Han (Apr 29 2021 at 15:51):

neat, thanks Alex! yeah the playground at beta.openai.com might be your best bet for playing with the model while avoiding all the gptf theorem proving machinery

Jason Rute (Apr 29 2021 at 15:58):

And if you want top level theorems with no assumptions, use the prefix GOAL ⊢ or TYPE (and whatever boilerplate you need at the beginning). The TYPE keyword might be better even since that was just trained on Lean top level theorems (and definitions, maybe).

Jason Rute (Apr 29 2021 at 16:01):

See Figure 4 in https://arxiv.org/pdf/2102.06203.pdf for all the available keywords.

Last updated: May 02 2025 at 03:31 UTC