Zulip Chat Archive
Stream: general
Topic: Decomposing pattern matching tactics for AI
Frederick Pu (Dec 25 2024 at 16:04):
how would you suggest aesop discharge the current goal into 2 new goals using cases with pattern matching? i.e.
cases x with | => | =>
Frederick Pu (Dec 25 2024 at 16:18):
So does pantograph support doing tree search on proofs that involve pattern matching? Like what would happen if you try to trace a theorem with the following tactic?
cases x with
| inl l => _
| inr r => _
Jason Rute (Dec 25 2024 at 16:19):
This particular issue has come up many times. I brought it up when Lean 4 was under development. I think mathlib has a one-line tactic which does this, and this is what I assume Lean Dojo, LLMLean, and ABEL are likely to suggest. Another approach suggested by Daniel Selsam was to use cases x with | => _? | => _? or something like that (I'm forgetting the correct syntax). It's ugly but it works, and he showed how you can decompose Lean proofs into this form when collecting data. I don't know how AlphaProof handles this because I thought they did have a number of nested proofs. I'd be curious. Also, in the long term, the way DeepSeek-Prover v1.5 and others do it, where they don't consider a proof to be atomic tactics but just text where you have certain break points (or so I understand it), might be a better approach in the end. It would work equally well on term proofs and writing code in general.
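For concreteness, here is a minimal sketch of the holed form being described, assuming the intended Lean 4 syntax is the ?_ synthetic hole rather than _? (the hypothesis x and the names l, r are just illustrative):
example (x : True ∨ True) : True := by
  -- the cases step itself only performs the split; each branch body is left as a hole
  cases x with
  | inl l => ?_
  | inr r => ?_
  -- the two holes then become separate goals that can be closed by atomic tactics
  . exact l
  . exact r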
Frederick Pu (Dec 25 2024 at 16:21):
so at each step it predicts a template that it needs to fill in later?
Jason Rute (Dec 25 2024 at 16:21):
What is “it”? I mentioned many approaches.
Jason Rute (Dec 25 2024 at 16:26):
The approach suggested by Selsam is a valid Lean tactic (when I get the syntax correct) that introduces goals for each _?. You could just leave it like that. It looks ugly, but it isn't wrong. I would think of the DeepSeek-Prover v1.5 approach more like text generation where they fix a proof at the first place it breaks. But it is all a matter of perspective, and you could think of some of these approaches, including the common mathlib tactics, as making "templates".
Frederick Pu (Dec 25 2024 at 16:27):
sorry, I meant deepseek prover
Frederick Pu (Dec 25 2024 at 16:28):
so if it reached a goal state where it wanted to use cases, would it just spit out
cases x with
| inl l =>
| inr r =>
which would give the tree search algorithm two errors, which it then treats like goals?
Jason Rute (Dec 25 2024 at 16:30):
Like I said, if you use the mathlib version of cases it does this. Or if you use the Selsam approach, it also does this. Or in the DeepSeek MCTS case, if you generate a whole proof and the second case breaks, then you just fix the second case, leaving the rest the same. None are exactly templates, but they have the same effect.
Jason Rute (Dec 25 2024 at 16:31):
I’m on my phone so I can’t give specific examples.
Frederick Pu (Dec 25 2024 at 16:42):
so the deepseek llm takes the previous partial proof and all the error messages and outputs an entirely new version of the same proof. So for example if the previous proof was
cases x with
| inl l =>
| inr r =>
its output would be
cases x with
| inl l => linarith
| inr r =>
that is, it doesn't just predict modifications, but rewrites the entire proof again at each search step?
Frederick Pu (Dec 25 2024 at 16:47):
also for the selsam approach, would that mean you need to preprocess all the lean proofs/programs and replace all of the pattern matching code with holes?
Frederick Pu (Dec 25 2024 at 16:49):
for the selsam approach, would each tactic have the option to be assigned a "linearizer" function that is capable of converting the tactic application into its corresponding hole command version?
Jason Rute (Dec 25 2024 at 17:20):
Just to be clear none of the approaches I mentioned literally output
cases x with
| inl l =>
| inr r =>
The DeepSeek approach, I think, writes a full proof and then fixes it at the first error, keeping the rest of the proof the same. I don't quite understand what they would do if, say, the first case errored but the second case worked. @Huajian Xin?
As for the Selsam approach, that tactic I wrote is a valid tactic (if I got the syntax right) that behaves just like mathlib's version of cases. It is a single tactic which changes the current goal into two subgoals. End of story. The advantage is that you can take a real instance of the nested cases tactic in Lean and decompose it into multiple atomic tactics instead of one super tactic, where the first tactic is `cases x with | => _? | => _?`. (If this is still confusing, maybe we should wait until tomorrow when I have my computer to write you some code showing what I mean.)
Frederick Pu (Dec 25 2024 at 17:26):
kk thx
Jason Rute (Dec 26 2024 at 12:56):
I moved the conversation with @Frederick Pu here from #general > What do you hope AI can help you achieve with Lean in 2025?
Jason Rute (Dec 26 2024 at 13:25):
Here is my longer response. Say you have a theorem of this form in your library you are extracting data from.
example (a : ℕ) (h : a = 0 ∨ a = 1) : a * a = a := by
cases h with
| inl hl => rw [hl]
| inr hr => rw [hr]
I don't exactly know how all the data extraction libraries do it, but possibly you could get the following three proofstate-tactic pairs:
a : ℕ
h : a = 0 ∨ a = 1
⊢ a * a = a
cases h with | inl hl => rw [hl] | inr hr => rw [hr]
a : ℕ
hl : a = 0
⊢ a * a = a
rw [hl]
a : ℕ
hr : a = 1
⊢ a * a = a
rw [hr]
Notice the first proofstate-tactic pair is the full cases tactic including all subtactics. Some libraries may not record that one, since there are more atomic tactics inside.
Jason Rute (Dec 26 2024 at 13:25):
Now for an AI that works by predicting one tactic at a time and records data as above, I think it would end up predicting the following in a setting like this:
- Either it would predict the full long nested tactic cases h with | inl hl => rw [hl] | inr hr => rw [hr], which could possibly have errors since it is so long, or
- it would predict a variant with cases', something like cases' h with hl hr, since that is more common in Mathlib (see the sketch after this list).
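For reference, the cases' variant of the running example would look something like this (a sketch assuming Mathlib's cases' tactic is available; the split is a single atomic step that leaves two separate goals):
import Mathlib.Tactic  -- provides cases'

example (a : ℕ) (h : a = 0 ∨ a = 1) : a * a = a := by
  -- one atomic tactic: split the disjunction and name the two hypotheses
  cases' h with hl hr
  . rw [hl]
  . rw [hr]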
Jason Rute (Dec 26 2024 at 13:25):
Neither is great, since it isn't capturing an atomic tactic which covers just the case split. It is either getting lucky, since there are similar tactics in mathlib which are atomic, or it is just predicting the full nested tactic, which is long and more error-prone.
Jason Rute (Dec 26 2024 at 13:25):
So Daniel Selsam, in a conversation from many years ago (#Machine Learning for Theorem Proving > Lean gym for Lean 4), showed that it is possible to break up the above proof automatically into the following while recording (and the method is applicable anytime you have tactics inside tactics).
example (a : ℕ) (h : a = 0 ∨ a = 1) : a * a = a := by
cases h with | inl hl => ?_ | inr hr => ?_
. rw [hl]
. rw [hr]
I don't think any of the public recording libraries do this, and it leads to slightly ugly proofs. (I suppose one could predict a proof in this form, but then use metaprogramming to get rid of the ?_ in the final result, although I don't know if it would fully work in practice.)
Jason Rute (Dec 26 2024 at 13:25):
I think DeepSeek-Prover v1 and other whole proof generation methods just train on whole proofs and predict whole proofs, so again the prediction might be something like:
cases h with
| inl hl => rw [hl]
| inr hr => simp
This is great if it works, since the model can use full Lean tactic syntax, but again there may be errors in some parts of the proof (as there is in my example with simp). DeepSeek-Prover v1.5, as well as many other theorem provers for Isabelle and Coq (Lyra, PALM, Cobblestone), try to repair the proof where it is broken. I think for the above proof that would mean just replacing simp with a different proof using the goal at that point in the proof. I'm not sure what happens if the proof was broken in multiple places. For example:
cases h with
| inl hl => simp --broken
| inr hr => simp --broken
I'm pretty sure Cobblestone would separately fix both simp instances independently. I don't know about DeepSeek-Prover v1.5. Would it try to fix each simp independently, or just start generating text at the location of the first broken simp, or just start over at the location of cases ...? @Huajian Xin? (This is my same question above, but hopefully stated more clearly.)
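For illustration, if both broken simp calls above were repaired independently, each from the goal at its own branch, the result would just be the original proof again (a sketch, not any particular system's actual output):
example (a : ℕ) (h : a = 0 ∨ a = 1) : a * a = a := by
  cases h with
  | inl hl => rw [hl]  -- first simp replaced, using the goal a * a = a with hl : a = 0
  | inr hr => rw [hr]  -- second simp replaced independently, using hr : a = 1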
David Renshaw (Dec 26 2024 at 14:00):
possible to break up the above proof automatically into the following while recording
Just to demonstrate that this is indeed possible, I'll point out that animate-lean-proofs automatically decomposes compound tactics:
example.gif
Adam Topaz (Dec 26 2024 at 14:08):
@David Renshaw can you give a pointer to the code that does this decomposition?
David Renshaw (Dec 26 2024 at 14:10):
It happens here.
Adam Topaz (Dec 26 2024 at 14:10):
Thanks!
David Renshaw (Dec 26 2024 at 14:10):
You know the syntax span of the current tactic step, and you can compute the syntax spans of its children, so you can then subtract them.
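A minimal sketch of that span-subtraction idea (this is not the animate-lean-proofs code, just an illustration; the name subtractChildSpans is made up here):
import Lean
open Lean

/-- Given the source range of a compound tactic and the ranges of its child tactic
blocks, return the leftover segments: the parent tactic's own text with the
children subtracted out. The ranges themselves can be obtained from
`Syntax.getRange?` on the corresponding `Syntax` nodes. -/
def subtractChildSpans (parent : String.Range) (children : Array String.Range) :
    Array String.Range := Id.run do
  -- sort children by start position so we can sweep left to right
  let sorted := children.qsort (fun a b => a.start.byteIdx < b.start.byteIdx)
  let mut segs : Array String.Range := #[]
  let mut cur := parent.start
  for c in sorted do
    if cur.byteIdx < c.start.byteIdx then
      segs := segs.push ⟨cur, c.start⟩
    cur := c.stop
  if cur.byteIdx < parent.stop.byteIdx then
    segs := segs.push ⟨cur, parent.stop⟩
  return segs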
Eric Wieser (Dec 26 2024 at 14:25):
Note that not every compound tactic permits ?_ contents.
David Renshaw (Dec 26 2024 at 14:27):
Yeah, I think my code might be somewhat sloppy about whether the ?_ is replacing something in term mode or in tactic mode.
David Renshaw (Dec 26 2024 at 14:28):
Also, I'm willing to massage proofs a bit before passing them to my tool.
David Renshaw (Dec 26 2024 at 14:29):
@Eric Wieser can you give an example of a tactic you specifically have in mind?
Frederick Pu (Dec 26 2024 at 14:52):
what about something like all_goals?
Frederick Pu (Dec 26 2024 at 15:00):
what's CobbleStone?
Frederick Pu (Dec 26 2024 at 15:05):
i wonder how aesop deals with pattern matching, does it just use the non-pattern-matching version of cases?
Frederick Pu (Dec 26 2024 at 15:09):
at the end of the day, i think there needs to be a way for each tactic to say how it wants to be traced (maybe using an attribute tag like @[trace_rule]), since, as @Mario Carneiro said, in Lean there is no way to truly distinguish between macros and tactics.
Jason Rute (Dec 26 2024 at 15:48):
@Frederick Pu You can find Cobblestone (an AI theorem prover for Rocq/Coq) discussed on the Rocq/Coq Zulip here: https://coq.zulipchat.com/#narrow/stream/252087-Machine-learning-and-automation/topic/Cobblestone/near/479935600. There are also other relevant papers in that channel.
Frederick Pu (Dec 26 2024 at 16:27):
Frederick Pu said:
at the end of the day, i think there needs to be a way for each tactic to say how it wants to be traced (maybe using an attribute tag like @[trace_rule]), since, as Mario Carneiro said, in Lean there is no way to truly distinguish between macros and tactics.
actually on second thought it would probably be better to do that at the syntax level. For example we could have
syntax inductionAlt := ppDedent(ppLine) inductionAltLHS + " => " (tracePoint << hole <|> syntheticHole <|> tacticSeq)
where tracePoint is just a dummy parser that reduces to the empty string. (pls excuse if the syntax is slightly wrong, but I hope my general point makes sense)
Frederick Pu (Dec 26 2024 at 16:51):
this would provide a more general way of doing what @Daniel Selsam did: https://github.com/dselsam/lean4/commit/f577b4c6e64b0f823ac91b7c67eef6b30f3862cf
Frederick Pu (Dec 26 2024 at 16:54):
although it will probably require changing how the syntax command works, since Syntax is basically an opaque type as of now. Maybe we could have a secondary syntax command syntaxTrace that's used to make the traced version of the syntax; that way we don't need to build everything off of a custom fork of lean4 just to get the correct tracing properties.
Eric Wieser (Dec 26 2024 at 17:03):
David Renshaw said:
Eric Wieser can you give an example of a tactic you specifically have in mind?
I think things like ring <;> (simp; ?_) and maybe next x y => ?_.
Frederick Pu (Jan 04 2025 at 23:09):
if we can get aesop to generate holed tactics based on input strings, would we be able to replace the RMax search algorithm used in DeepSeek-Prover with a Lean metaprogram? Since RMax search seems to be best-first search.
Last updated: May 02 2025 at 03:31 UTC