Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Building proof trees from tactic proofs

Matěj Kripner (Jan 28 2025 at 20:55):

Do you know about any way to parse tactic-style proofs into proof trees? I.e. a node using induction tactic on Nat should have two independent outgoing branches (up to potential metavariable coupling) corresponding to zero and succ cases. This is how I think about proofs in Lean. However, it is not how they are represented in Lean's AST and elaboration. Instead, induction simply has one output containing two new goals (and possibly some other unchanged goals) which are used as inputs to some other tactic down the line. That leaves the problem of pairing output goals (from Elab.TacticInfo.goalsAfter) with input goals (goalsBefore) via the appropriate tactics which is not trivial because

IDs of the goal metavariables can change between tactics and cannot be used for identifying a goal over time (as far as I understand),
not all goals have a userName,
the exact syntax of goals can change over time because of metavariables,
it's unclear which and how many goals were affected by a tactic (consider try, repeat, any_goals, ...) nor how many new goals were created.

Additionally, some tactics do not operate on the main goal or even go against the tree formulation by operating on multiple goals but can be handled manually when building the tree (relatively easily?), e.g. all_goals, switch, case, rotate_left, <;>, have, ...

Metavariable coupling means that goals cannot always be split into independent branches, so the objective is to split them into minimal "connected components", i.e. as much as possible.

The only time something like this has been done that I know of is in the paper HyperTree Proof Search for Neural Theorem Proving (@Guillaume) where they interfaced directly with the Lean C++ kernel but they unfortunately didn't port it to Lean 4. Another option is to use the syntax-level InfoTrees extracted from elaboration using a Lean script (approach used by LeanDojo @Kaiyu Yang). Yet another is to actually run the tactics in Lean and report their result (used by Pantograph @Leni Aniva). Using neither of these for proof tree extraction seem straightforward so I'm going to try my luck here before I spend a few months on trial-and-error :) Any suggestions or ideas are welcome.

My goal with this is to compute for each state the number of its descendants - something approximating the difficulty of going from that state to proof end. Such data can then be used for a ML-based critic guiding a tree search.

Leni Aniva (Jan 28 2025 at 20:58):

I think the biggest problem is that the tactics essentially are free to do whatever they want with the syntax tree going in, and the common protocol of all the tactics is they modify the current set of goals in some way. In this respect it would be impossible to write a system that pairs Lean source code with tactics, unless you write specific handlers for each tactic. Moreover its very easy to mix tactic and expression style proofs, so the question of pairing goalsAfter and goalsBefore may not be defined in some cases.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Jan 28 2025 at 21:47):

One approximation to 'compute subgoals created by a tactic T from a given goal G' is to look at the term that G's metavariable has been assigned to. Then we define the subgoals of G as the set of metavariables appearing in the term t where G := t is the metavariable assignment created by T. In practice you have to filter this set a bit, and as Leni said the general problem isn't solvable, but it works reasonably well for many tactics. You can try it out here.

Jason Rute (Jan 29 2025 at 00:31):

Like the hypertree proof search paper, I did this also in Lean 3 (if I understand the question). I didn't use the kernel, but I used that there was a particular piece of Lean 3 code that ran before and after each tactic (in Lean 3). If I modified that, then I could:

record goal states and positions every time each tactic was run (which could be multiple times per tactic)
I could use the tactic environment to keep counters and stacks to figure out where I was in the proof tree. (Here by "proof tree" I mean the environment tree, where a state extends another state if it is in a later branch of the environment.)
I think I also keep track of goal mvarids and hashes.

Jason Rute (Jan 29 2025 at 00:31):

So for tactics like rw [foo]; repeat {intro <|> simp} I could see the goal state before and after the full tactic command, the goal state before and after rw [foo], the goal state before and after each invocation of repeat {...}, the goal state before and after intro (if it succeeded), and the goal state before and after simp if it ran.

Jason Rute (Jan 29 2025 at 00:31):

Then I found with this data, I could usually piece together the needed information to make a tree. There may have been exceptions, but I think 90 percent of cases it worked.

Jason Rute (Jan 29 2025 at 00:31):

I have no idea if anything like this is applicable to Lean 4. But if it helps, my code is https://github.com/jasonrute/lean_proof_recording

Jason Rute (Jan 29 2025 at 00:31):

But my recommendation would be to gather the data and look at it closely for patterns. For example, https://github.com/kim-em/lean-training-data is very easy to modify to get extra data like metavariable ids, hashes, and other data which might help. Then when you have enough data you could see if you could extract a tree in 90 percent of cases.

Jason Rute (Jan 29 2025 at 00:31):

Also, to address your bullets:

I would test that metavariable ID assumption at scale. (It doesn't sound like this is usual, but I don't know Lean 4 well so maybe it is.) If they change, why? Are the goals the same expressions? If so, you could use hashes to compare? If they are somehow fundamentally different, then maybe that should be taken into account.
Yeah, userName is probably not the best unique id, but if it does exist it could help.
If there is metavariable coupling, then you may not want to treat the coupled goals as different, since changing one goal will change its coupled goals.
As for tactic combinators, is this not tracked with tools like https://github.com/kim-em/lean-training-data ? At least, in that you can see the before and after for each atomic tactic application inside repeat, try, any_goals even if that tactic is run multiple times? (Actually for any_goals and similar tactics, wouldn't you be able to see the exact before goal for each tactic application?)

Jason Rute (Jan 29 2025 at 00:31):

Also, maybe ask @Fabian Glöckle, since he extended HTPS to Lean 4. Or talk to @David Renshaw since he has tools that show each step of a proof. (I thought his tools were able to distinguish what goal a tactic is being applied to, but maybe that is a hack.)

David Renshaw (Jan 29 2025 at 00:55):

That leaves the problem of pairing output goals (from Elab.TacticInfo.goalsAfter) with input goals (goalsBefore) via the appropriate tactics which is not trivial because
* IDs of the goal metavariables can change between tactics and cannot be used for identifying a goal over time (as far as I understand)

animate-lean-proofs performs a multi-stage transformation of an InfoTree to get a linear sequence of actions that can be animated. It links up tactic steps by matching MVarIds of output goals and input goals. This works well, at least for the proofs that I've tried to animate so far. It currently does not work for proofs that are not properly focused (i.e. those that would not pass the multiGoal linter), but that might be fixable.

Michael Rothgang (Jan 29 2025 at 08:13):

I believe @Damiano Testa thought about such questions (matching goals before and after a tactic) in the context of mathlib's multi-goal and flexible linters

Damiano Testa (Jan 29 2025 at 08:14):

Yes, and even more so in the flexibleTactic linter.

Damiano Testa (Jan 29 2025 at 08:16):

Ultimately, it mostly boils down to seeing if an MVarId name stayed the same, and then making the assumption that the names that do not appear before and after the tactic are matched.

Damiano Testa (Jan 29 2025 at 09:04):

I think that PaperProof also does the same thing.

Last updated: May 02 2025 at 03:31 UTC