Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: request for help on the REPL

Kim Morrison (Jan 08 2025 at 01:32):

I've been busy, and have not yet upgraded the REPL to v4.15.0-rc1. Would anyone be interested in looking into this?

There is an open PR at https://github.com/leanprover-community/repl/pull/61, and the CI errors show what needs to be fixed.

The underlying causes are

lean#6051 (making Meta.Context.config private) and
lean#5835 feat: structure auto-completion & partial InfoTrees

Bryan Gin-ge Chen (Jan 08 2025 at 02:10):

I've pushed a commit fixing the second bulletpoint, but I'm at a loss for the first one.

llllvvuu (Jan 08 2025 at 04:27):

A bit sketchy but here's a workaround for the first bulletpoint: https://github.com/leanprover-community/repl/pull/64/files

Kim Morrison (Jan 08 2025 at 05:50):

Thanks @Bryan Gin-ge Chen and @llllvvuu, putting those together gets us to v4.15.0-rc1. I merged, and cut tags for v4.15.0-rc1 and v4.15.0.

Kim Morrison (Jan 14 2025 at 05:35):

Anyone, perhaps @Bryan Gin-ge Chen or @llllvvuu, interested in having a look at https://github.com/leanprover-community/repl/pull/67, the failing REPL bump to v4.16.0-rc1? Proof states are being duplicated.

llllvvuu (Jan 14 2025 at 07:12):

I can take a look around this time tomorrow

llllvvuu (Jan 15 2025 at 10:20):

repl#68

Kim Morrison (Jan 15 2025 at 11:30):

@llllvvuu, nice, thank you!

Kim Morrison (Apr 03 2025 at 10:15):

I've just put up https://github.com/leanprover-community/repl/pull/79, which attempts to move the repl up to v4.19.0-rc2, but this still has some failing tests.

Kim Morrison (Apr 03 2025 at 10:15):

Would anyone be interested in taking a look at this?

Kim Morrison (Apr 03 2025 at 10:17):

Separately, a few people have contacted me recently about wanting to help fix various known problems with the REPL. May I suggest that discussion about this happens here? I don't have a lot of time for the REPL at the moment, and I hope that others who are using it can take over.

Pinging @Auguste Poiroux and @Matěj Kripner as two people who have expressed such interest recently! :-)

Lenny Taelman (Apr 03 2025 at 15:10):

I find it very valuable to have REPL included in the lean core release cycle, always up to date with the latest lean release. So as one of the REPL 'users' I wanted to say thank you to everybody who has helped and will help making this possible!

Auguste Poiroux (Apr 03 2025 at 15:27):

A quick fix to pass the failing tests is to revert the change of default behavior for importModules(see https://github.com/leanprover/lean4/pull/6325). I made a pull request in the Lean REPL repo here.

Sebastian Ullrich (Apr 03 2025 at 15:38):

It is problematic that unpickling can be done more than once in a single process but this likely is not worse than in previous Lean versions. Otherwise the PR looks good to me.

Kim Morrison (Apr 03 2025 at 23:32):

Thank you @Auguste Poiroux for the quick fix. The REPL is now available on v4.19.0-rc2 (including with a toolchain tag v4.19.0-rc2.

Matěj Kripner (Apr 09 2025 at 12:05):

@Kim Morrison Thank you for the ping! Unfortunately, the REPL sometimes accepts incorrect proofs - see these examples:

https://github.com/leanprover-community/repl/issues/44#issuecomment-2694743268
https://github.com/leanprover-community/repl/issues/44#issuecomment-2789411256

I will repost some of them here:

example : 1 = 0 := by
  cases 1
  rfl
  apply ?succ

example : 1 = 0 := by
  cases' exists_ne 2 with n h
  simpa [] using h _

theorem self_application : 1 = 0 := by
  rw [self_application]

I'm currently reading through the REPL code, but so far I have no idea how to fix this. Any help is appreciated!

Kim Morrison (Apr 09 2025 at 12:07):

Yes, the basic problem is that proofs done in "tactic mode" in the REPL are never sent to the kernel for checking.

Kim Morrison (Apr 09 2025 at 12:08):

I think you do these in "command" mode (i.e. give it the whole declaration at once) then it is sent to the kernel, and the error is correctly reported. Does that agree with your understanding?

Matěj Kripner (Apr 09 2025 at 12:12):

Kim Morrison said:

Yes, the basic problem is that proofs done in "tactic mode" in the REPL are never sent to the kernel for checking.

I did not realize that this is the case. Do you think it is doable to send the proof to kernel after each tactic invocation in tactic mode? Not using the kernel at all seems problematic.

Matěj Kripner (Apr 09 2025 at 12:13):

Kim Morrison said:

I think you do these in "command" mode (i.e. give it the whole declaration at once) then it is sent to the kernel, and the error is correctly reported. Does that agree with your understanding?

Yes, the errors are reported in command mode but not in tactic mode.

Kim Morrison (Apr 09 2025 at 12:14):

The problem is that you can only send complete proofs --- in tactic mode, if you are not finished yet, then the proof term has metavariables still (and so the kernel will reject it).

Matěj Kripner (Apr 09 2025 at 12:32):

Kim Morrison said:

The problem is that you can only send complete proofs --- in tactic mode, if you are not finished yet, then the proof term has metavariables still (and so the kernel will reject it).

I see. Do you think there is a way to reject these incorrect proofs in tactic mode without calling the kernel? Or is it necessary to think of a way to make the kernel verify even incomplete proofs?

At first glance it seems that even incomplete proofs could be sent to the kernel if we add sorry at the end - that is what we do all the time when writing proofs in Lean.

Kim Morrison (Apr 09 2025 at 12:39):

Yes, the "right" fix (the REPL is, at this point, a pile of hacks, so what is one more?) is perhaps to have a custom tactic (probably make use of the existing done tactic) that the REPL intercepts, and sends the currently proof term to the kernel for checking.

Kim Morrison (Apr 09 2025 at 12:39):

Actually, sorry, that is dumb.

Kim Morrison (Apr 09 2025 at 12:40):

The right thing is just to have the REPL detect when there are no more goals (/no more metavariables in the proof term) and send it to the kernel then. Currently there is just no step in tactic mode that does this.

Auguste Poiroux (Apr 09 2025 at 12:50):

In the LeanDojo repl, they indeed send the proof to the kernel once there are no more goals. See this file. LeanDojo repl is unfortunately not compatible with Lean >= 4.12.0 (issue). I started implementing a similar idea for the Lean REPL in this PR a few months ago, but I got stuck because of my limited understanding of the problem ^^

Matěj Kripner (Apr 09 2025 at 13:33):

Kim Morrison said:

Yes, the "right" fix (the REPL is, at this point, a pile of hacks, so what is one more?) is perhaps to have a custom tactic (probably make use of the existing done tactic) that the REPL intercepts, and sends the currently proof term to the kernel for checking.

I think this first idea is nicer, because it rejects invalid tactics immediately instead of waiting for the proof to finish.

Adam Topaz (Apr 09 2025 at 13:34):

All done does is check whether there are any unsolved goals. I don’t think it involves the kernel.

Adam Topaz (Apr 09 2025 at 13:35):

docs#Lean.Elab.Tactic.done

Matěj Kripner (Apr 09 2025 at 13:43):

Auguste Poiroux said:

In the LeanDojo repl, they indeed send the proof to the kernel once there are no more goals. See this file. LeanDojo repl is unfortunately not compatible with Lean >= 4.12.0 (issue). I started implementing a similar idea for the Lean REPL in this PR a few months ago, but I got stuck because of my limited understanding of the problem ^^

The PR looks very promising! I think we should try to finish it. Do you have time to discuss the approach you tried and why it does not work yet?
If you prefer, we can arrange a call - maybe @Kim Morrison and anyone interested can join.

Auguste Poiroux (Apr 09 2025 at 14:58):

I just updated the PR with new tests corresponding to @Matěj Kripner minimal examples. I have found that the code in the PR correctly catches all these cases. It also catches this example from @RexWang:

{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}

Apparently it wasn't caught before (see issue).
Error messages might be different from the ones provided in the command mode or in the Lean IDE. The code in the PR runs some checks before doing a kernel check, resulting sometimes in an uninformative error Incomplete: contains metavariable(s).
However, the code still doesn't catch the following self-application issue:

theorem ex : False := by
   exact?

It is also the only case I know where it fails to catch the issue so far.

Matěj Kripner (Apr 09 2025 at 16:40):

Auguste Poiroux said:

I just updated the PR with new tests corresponding to Matěj Kripner minimal examples. I have found that the code in the PR correctly catches all these cases. It also catches this example from RexWang:
{"cmd": "import Mathlib"}

{"cmd": "theorem amc12a_2002_p12 (f : ℝ → ℝ) (k : ℝ) (a b : ℕ) (h₀ : ∀ x, f x = x ^ 2 - 63 * x + k)\n    (h₁ : f a = 0 ∧ f b = 0) (h₂ : a ≠ b) (h₃ : Nat.Prime a ∧ Nat.Prime b) : k = 122 := by sorry", "env": 0}

{"tactic": "apply (mul_right_inj' (sub_ne_zero.2 ?_)).1", "proofState": 0}

{"tactic": "ring_nf at h₁ h₂ ⊢", "proofState": 1}
Apparently it wasn't caught before (see issue).
Error messages might be different from the ones provided in the command mode or in the Lean IDE. The code in the PR runs some checks before doing a kernel check, resulting sometimes in an uninformative error Incomplete: contains metavariable(s).
However, the code still doesn't catch the following self-application issue:
theorem ex : False := by
   exact?
It is also the only case I know where it fails to catch the issue so far.

This looks great! I think this is generally the correct approach and it's well-tested. Even if it fails for exact? and similar, it still dramatically decreases the "attack surface" of the REPL.

@Kim Morrison do you think it would be possible to merge it?

Matěj Kripner (Apr 09 2025 at 16:42):

I also like the idea of rootGoals. For theorem provers, it removes the need to manually replace by blocks with sorry when starting a proof.

Kim Morrison (Apr 10 2025 at 00:23):

Tests are currently failing, but yes, after that, happy to take a look and hopefully merge.

Auguste Poiroux (Apr 10 2025 at 08:11):

Tests are now passing. Tests for cases mentioned in this issue are implemented in the PR, except the exact? self-application issue, which is now removed.

Auguste Poiroux (Apr 10 2025 at 08:12):

On a related note, I think it could be nice to store somewhere tests that don't pass yet, either in a separate branch or in a folder.

Kim Morrison (Apr 10 2025 at 08:24):

There are in fact some disabled tests -- if there is no matching .expected.out file then the test framework doesn't check anything.

Kim Morrison (Apr 10 2025 at 08:25):

I've merged the PR. Thank you, @Auguste Poiroux, for fixing this for us all!

Auguste Poiroux (Apr 10 2025 at 14:00):

We might have merged the previous PR a bit too fast. It turns out that this PR is rejecting a large class of valid proofs. I created a new PR fixing this, along with new tests to check for slightly more complex proofs (all tests are passing): link.

Matěj Kripner (Apr 10 2025 at 16:32):

@Auguste Poiroux Thank you, this is a great step forward. The new PR looks sensible.

One practical issue I came across is that proofStatus is "Incomplete" even when the goals introduced by sorry are solved. E.g. when using have, you would send something like:

{"tactic": "have h : x > 0 := by sorry", "proofState": 0}

(as in https://github.com/leanprover-community/repl/blob/master/test/name_generator.in)
Even after finishing the proof, proofStatus stays "Incomplete" (see https://github.com/leanprover-community/repl/blob/master/test/name_generator.expected.out).

Auguste Poiroux (Apr 10 2025 at 16:41):

This is a good point. At the moment, only linear sequences of proofsteps, with each proofstep depending on the previous one, are checked. Let's call this a branch in the proof search. Each branch are independent from each other, and thus cannot influence each other proofStates. The thing with "have ... := sorry" is that this creates a new branch. So when you send proofSteps to this branch, the main branch is not aware of it. The main branch will always have the "have ... := sorry" proofstep.
Hmm, I don't know how other REPL-like tools solve this ^^'

Matěj Kripner (Apr 10 2025 at 16:55):

Auguste Poiroux said:

This is a good point. At the moment, only linear sequences of proofsteps, with each proofstep depending on the previous one, are checked. Let's call this a branch in the proof search. Each branch are independent from each other, and thus cannot influence each other proofStates. The thing with "have ... := sorry" is that this creates a new branch. So when you send proofSteps to this branch, the main branch is not aware of it. The main branch will always have the "have ... := sorry" proofstep.
Hmm, I don't know how other REPL-like tools solve this ^^'

Yeah, I don't see a way around that either. The branches have to be independent. But sorries are already reported by the REPL, so maybe we can just remove the pf.hasSorry check and not report the "Incomplete: contains sorry" error?

Auguste Poiroux (Apr 10 2025 at 16:59):

I think we prefer to have false negatives rather than false positives when doing proof verification, so I wouldn't remove the pf.hasSorry check. But you can consider that "Incomplete: the proof contains sorries" is "Completed" for your use case, if you know that you have a correct proof for your branch.
Another external workaround would be to first finish the proof associated to the sorry introduced by have, and then to send {"tactic": "have h : x > 0 := by <your successful proof here>", "proofState": 0}. You can then continue with the main proof and you won't have a sorry issue.

Matěj Kripner (Apr 10 2025 at 17:10):

Auguste Poiroux said:

I think we prefer to have false negatives rather than false positives when doing proof verification, so I wouldn't remove the pf.hasSorry check. But you can consider that "Incomplete: the proof contains sorries" is "Completed" for your use case, if you know that you have a correct proof for your branch.
Another external workaround would be to first finish the proof associated to the sorry introduced by have, and then to send {"tactic": "have h : x > 0 := by <your successful proof here>", "proofState": 0}. You can then continue with the main proof and you won't have a sorry issue.

In that case can we move the pf.hasSorry down so that it is the last check before "Completed"? Because otherwise it is not safe to ignore it as it might shadow e.g. the kernel check.

Auguste Poiroux (Apr 10 2025 at 17:28):

I see your point, ok doing that in the current PR :+1:

Auguste Poiroux (Apr 10 2025 at 18:09):

I updated the PR, the exact? self-application issue is now resolved as well. I believe we don't have any invalid proofs that are accepted in tactic mode now.

As mentioned by @llllvvuu, the exact? issue has already been partially resolved in this commit. It is a question of which environment is provided to the tactic. The previous commit solved the problem for all commands starting from an existing environment state. In the PR, we now solve it for the initial commands as well.

Kim Morrison (Apr 11 2025 at 00:04):

merged, thank you!

Auguste Poiroux (Apr 11 2025 at 12:13):

For people interested in having all the latest features and bug fixes of Lean REPL backported to older Lean versions, I am maintaining a fork doing that. Essentially, I’m decoupling Lean versions from REPL versions.
For instance, in the latest version v1.0.6, the recently merged proof validation feature has been backported and tested for all Lean versions between v4.7.0-rc1 and v4.19.0-rc2.

Auguste Poiroux (Apr 11 2025 at 12:13):

How it works: each REPL version lives in its own branch, and each commit within that branch corresponds to a specific Lean version. Whenever the official REPL is updated, I copy my latest branch and rebase it on the new REPL using stacked-git. This workflow is generally efficient and only requires a few manual fixes in most cases.

Matěj Kripner (Apr 14 2025 at 15:24):

@Auguste Poiroux I think there is still one more issue with the pf.hasExprMVar check in getProofStatus. It can be seen here:

https://github.com/leanprover-community/repl/blob/master/test/name_generator.in
https://github.com/leanprover-community/repl/blob/master/test/name_generator.expected.out

When we execute {"tactic": "have h : x > 0 := by sorry", "proofState": 0} and descend into the branch given in sorries, we can never get to a "Completed" state (I think) because goals in the main branch will be unsolved from our perspective. This is again the problem of isolation between branches. However, while "contains sorries" can perhaps be safely ignored, "contains metavariables" cannot, because then some invalid proofs might get accepted.

Do you see a way how this could be resolved?

Auguste Poiroux (Apr 14 2025 at 15:47):

This check is not really necessary since addDecl a few line later fails in a similar way (aside that it is probably more efficient to do this check before doing a kernel check?). In the name_generator.in example, we get "Error: kernel type check failed: (kernel) declaration has metavariables '[anonymous]'" instead of "Incomplete: contains metavariable(s)" if we remove the pf.hasExprMVar check. I am not sure why we get this error instead of "Incomplete: contains sorry". People with more experience will probably know better than me about the combination of addDecl method with have ... := sorry :)

Auguste Poiroux (Apr 14 2025 at 16:07):

For this branching problem, on top of the {"tactic": "have h : x > 0 := by <your successful proof here>", "proofState": 0} strategy, another way to enforce complete proof check would be to change the repl behavior when a sorry is run through tactic mode.
The general idea would be to remember all open goals in one repl proofstate once we are in tactic mode, in particular including those opened by have-like tactics starting new subproofs (calc is another example I am thinking of). This is by opposition to the current way it works where, for each sorry encountered in tactic mode, a new proofstate is instantiated. This way, we remove the branching problem.
From a user point of view, that would be quite nice I believe, however implementation-wise I am not sure how hard it would be to have this.

Matěj Kripner (Apr 15 2025 at 15:00):

@Auguste Poiroux I think the problem stems from the fact that getProofStatus attempts to verify the whole proof (ie whether the whole proof term currently type-checks against the goal in proofState.rootGoals). This way, when we are in the sorry branch of have ... := by sorry, the whole proof term will always have a metavariable in the place of the main proof branch.
Wouldn't it make more sense to instead verify only the new assignment created in the current step? Ie in createProofStepResponse, look at which goals are newly assigned in proofState which weren't assigned in old, and run type checks for them. If all these assignments are correct, the final proof term should be correct as well.
As a bonus, this approach might be faster since the size of the type-checked term is not growing linearly.
Do you think this approach makes sense?

Auguste Poiroux (Apr 15 2025 at 15:24):

Interesting idea, so you mean that in getProofStatus, instead of checking the whole proof up to the rootGoals here, we instead compare to the previous goal? I am not sure how this will behave, but this is definitely worth a try. I am a bit worried that it will introduce false positives. Do you have time to test this idea?

Matěj Kripner (Apr 16 2025 at 16:50):

I think the verification of step (old :: ProofSnapshot, new :: ProofSnapshot) could work something like this:

(1) Look at all goals in old which no longer exist in new. For each one:
- (2) Retrieve it's assignment in new. For each metavariable in the assignment:
  - (3) If it is a goal which exists in new, it's fine and we do nothing since we can assume it will be solved in future steps. In this case, we replace it with sorry in the assignment so that the expression can be sent to kernel.
  - (4) Otherwise, it's an error since there is a metavariable which won't be assigned in the future.
- (5) Send the assignment with metavariables replaced by sorry to kernel for a type check.

I created an initial pull request here: https://github.com/leanprover-community/repl/pull/85, but it's very much work in progress (doesn't work yet). Any improvements are welcome.

Do you think this is a valid approach? If it works, it's probably the cleanest solution and avoids false negatives like https://github.com/leanprover-community/repl/blob/master/test/name_generator.in

Matěj Kripner (Apr 16 2025 at 17:00):

Currently, the problem is that I probably don't understand how terms are assigned during a tactic proof execution. I hoped that when I apply a tactic to a goal G and get a list of goals G_1, ..., G_n, what it means is that we assign to G an expression which contains metavariables G_1, ..., G_n. However, when I use MetavarContext.getExprAssignmentCore to get the expression assigned to G after the tactic and then Meta.collectMVars on the result, I get many more metavariables than how many goals are created.

Auguste Poiroux (Apr 16 2025 at 20:12):

This looks like a good idea to me, but I must admit I will be of limited help here as this reaches my knowledge limits. One thing though is that metavariables are used for other things than just goals, hence I get many more metavariables than how many goals are created.

Matěj Kripner (Apr 17 2025 at 10:28):

Here is a concrete example. REPL input:

{"cmd": "example : 1 = 0 := by sorry"}
{"tactic": "cases 1", "proofState": 0}

The initial goal from sorry is _uniq.65. After the tactic is executed, the two new goals are _uniq.67.56 (zero) and _uniq.67.70 (succ), and the old goal has this assignment (obtained using proofState.metaState.mctx.getExprAssignmentCore?):

_uniq.65 = ?_uniq.67.23 (OfNat.ofNat.{0} Nat 1 (instOfNatNat 1))

Running Meta.collectMVars on this expression yields:

#[_uniq.67.23, _uniq.67.41, _uniq.67.47, _uniq.67.59, _uniq.67.52, _uniq.67.56, _uniq.67.62, _uniq.67.67, _uniq.67.70]

Note that the two new goals (_uniq.67.56 and _uniq.67.70) are listed as expected (and have neither assignment nor delayed assignment). Also note that all other listed mvars except these two have a delayed assignment, specifically:

Delayed assignment for _uniq.67.23: _uniq.67.22
Delayed assignment for _uniq.67.41: _uniq.67.40
Delayed assignment for _uniq.67.47: _uniq.67.46
Delayed assignment for _uniq.67.59: _uniq.67.58
Delayed assignment for _uniq.67.52: _uniq.67.51
Delayed assignment for _uniq.67.62: _uniq.67.61
Delayed assignment for _uniq.67.67: _uniq.67.66

So an updated version of the algorithm looks at all mvars in the assignment E of a goal G which have no assignment or delayed assignment, checks that they are goals in the new proofSnapshot (so that we know they will be assigned later if the proof finishes) and replaces them with sorry so that E can be checked by the kernel.
The problem is that the mvars are not explicitly visible in E - in our caseE = ?_uniq.67.23 (OfNat.ofNat.{0} Nat 1 (instOfNatNat 1)). So we cannot easily replace them with sorry. They are only retrieved by Meta.collectMVars, traversing all the delayed assignments.

My issue is that I don't understand delayed assignment. Maybe we are retrieving ContextInfo incorrectly? How do we "instantiate" the delayed assignments?

@Kim Morrison Do you know someone who could help us here? TLDR is that we are trying to type-check each new goal assignment after each tactic execution so that at the end of the proof we can be sure the whole proof term is correct.

RexWang (Apr 17 2025 at 21:17):

Auguste Poiroux

I think we prefer to have false negatives rather than false positives when doing proof verification, so I wouldn't remove the pf.hasSorry check.

I have a different perspective on this matter.
For validation purposes, command mode can be used to double-check proofs and prevent false positives once the proof is complete. However, false negatives are less acceptable as they might prevent exploration of correct proof paths.
Meanwhile, tactic mode provides considerable performance benefits. We should weigh the trade-off between rigorous validation and computational efficiency.
Benchmark results would help determine whether the additional validation effort justifies the performance cost.

Auguste Poiroux (Apr 17 2025 at 21:36):

I see what you mean. One option is to just ignore the proofStatus attribute I guess, right? Or is your idea to add an option to completely skip the kernel check?

RexWang (Apr 17 2025 at 22:05):

The current changes looks good to me with low overhead. I just wanted to emphasize that we should also take efficiency into consideration while addressing the FPs.

Auguste Poiroux (Apr 17 2025 at 22:30):

I do agree with what you said in the first message. We went from having a few false positives to having a few false negatives instead, which is a bit annoying. Hopefully we can fix them.
Regarding efficiency, I don't know if we can make the kernel check faster. But I do think that having an option to skip the kernel check sounds reasonable.

Matěj Kripner (Apr 18 2025 at 08:10):

I agree that there should be an option to skip the kernel check to preserve low overhead. I also agree that the false negatives should be resolved. Currently, false negatives occur basically every time there is branching (e.g. using have ... := by ...). Concretely, for input:

{"cmd": "example : 1 = 1 := by sorry"}
{"tactic": "have : 2 = 2 := by sorry", "proofState": 0}
{"tactic": "rfl", "proofState": 1}
{"tactic": "rfl", "proofState": 2}

The first rfl closes the have branch, but we get "proofStatus": "Incomplete: contains metavariable(s)" (because the main branch is not resolved in the have branch - note that this would also be the case if we swapped the two rfls since the branches are independent). The second rfl closes the main branch, but we get "proofStatus": "Incomplete: contains sorry" (which can safely be ignored, so that's OK).

I see two ways to resolve this. Either we keep the current approach of type checking the whole proof term, but then we need to substitute sorry for each side branch that gets created so that the proof term does not contain their metavariables. Or we can switch to verifying only the goal assignments that happen in the current step, thus ensuring that the whole proof term is valid as well.

Auguste Poiroux (Apr 18 2025 at 11:22):

To me, regarding the branching issue, there is also a third option we should consider. The idea is to linearize the proofstates, i.e. applying a tactic should yield at most one proofstate. How? By stacking all the goals in one proofState. Concretely, in the example above, after running

{"tactic": "have : 2 = 2 := by sorry", "proofState": 0}

we would get

{'proofStatus': 'Incomplete: open goals remain', 'proofState': 1, 'goals': ['this : 2 = 2\n⊢ 1 = 1', '⊢ 2 = 2']}

instead of

{'sorries': [{'proofState': 1, 'goal': '⊢ 2 = 2'}], 'proofStatus': 'Incomplete: open goals remain', 'proofState': 2, 'goals': ['this : 2 = 2\n⊢ 1 = 1']}

Beyond the potential difficulty of implementing this, there is also the fact that the command mode should act in a similar way. For instance, if a given proof inside a command contains several sorries, then all these sorries must be collected in one proofstate. In short, we should have exactly one proofstate per declaration.

Auguste Poiroux (Apr 18 2025 at 11:28):

Oh I just realized, I think this is roughly what Pantograph does if I am not mistaken @Leni Aniva right? (among other new amazing features)

Leni Aniva (Apr 18 2025 at 16:10):

Auguste Poiroux said:

Oh I just realized, I think this is roughly what Pantograph does if I am not mistaken Leni Aniva right? (among other new amazing features)

yes this is one of the reasons why pantograph exists

Leni Aniva (Apr 18 2025 at 16:30):

Pantograph has a solution for the problem discussed above, and it can verify the proof state at the end as well.

Auguste Poiroux (Apr 21 2025 at 10:58):

@Matěj Kripner @RexWang here is a draft PR attempting to solve the branching problem. It uses an approach similar to Pantograph (I am in fact using the sorryToHole method from Pantograph to achieve that). Here is an example:
Input:

{"cmd": "example : 1 = 1 := by sorry"}

{"tactic": "have h : 2 = 2 := by sorry", "proofState": 0}

{"tactic": "rfl", "proofState": 1}

{"tactic": "rfl", "proofState": 2}

Output:

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 22},
   "goal": "⊢ 1 = 1",
   "endPos": {"line": 1, "column": 27}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 0},
   "endPos": {"line": 1, "column": 7},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 1,
 "goals": ["this : 2 = 2\n⊢ 1 = 1", "⊢ 2 = 2"]}

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 2,
 "goals": ["⊢ 2 = 2"]}

{"proofStatus": "Completed", "proofState": 3, "goals": []}

This is work in progress, but feel free to try it and suggest improvements :)

Auguste Poiroux (Apr 21 2025 at 15:09):

I was wondering how it is going to behave with all_goals, and it seems to work better than I expected (unless I am missing something):
Input:

{"cmd": "example : 1 = 1 := by sorry"}

{"tactic": "have : 2 = 2 := by sorry", "proofState": 0}

{"tactic": "all_goals rfl", "proofState": 1}

Output:

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 22},
   "goal": "⊢ 1 = 1",
   "endPos": {"line": 1, "column": 27}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 0},
   "endPos": {"line": 1, "column": 7},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 1,
 "goals": ["this : 2 = 2\n⊢ 1 = 1", "⊢ 2 = 2"]}

{"proofStatus": "Completed", "proofState": 2, "goals": []}

Simon Sorg (Apr 21 2025 at 17:33):

Another way to address branching would be disentangling individual goals completely (apart from Metavariables), and copying goals if needed similar to Aesop. I believe this was the idea used in ABEL.

Auguste Poiroux (Apr 21 2025 at 18:33):

Yes indeed, if I understood well, @Matěj Kripner is trying to implement that. I believe there is room and interest for both approaches. We can probably have both in the REPL, with an attribute to toggle one or the other. What do you think @Kim Morrison?

Auguste Poiroux (Apr 22 2025 at 16:21):

An idea for later: we can use Lean Task to implement parallelized processing of multiple requests. I imagine this could be very useful in tactic mode for example. We could run n tactics in parallel on a goal within a single REPL instance.

Auguste Poiroux (Apr 23 2025 at 23:25):

Just finished implementing the collection of sorry goals into one proof state per declaration in command mode in the draft PR.
Concretely, this means that we should be able to implement Draft, Sketch, and Prove using Lean REPL and applying it to actual Lean projects in the wild.

Example: let's say we have the following file:

File content

Then using {"path": "sketch_file.lean", "sorries": "grouped"}, we get one proof state with 4 goals for add_comm_proved_formal_sketch and one proof state with one goal for add_rfl2.

Auguste Poiroux (Apr 23 2025 at 23:26):

REPL-output

{"sorries":
 [{"proofState": 0,
   "pos": {"line": 1, "column": 70},
   "parentDecl": "add_comm_proved_formal_sketch",
   "goals":
   ["m : Nat\n⊢ 0 + m = m",
    "m : Nat\nh_base : 0 + m = m\nh_symm : m + 0 = m\n⊢ 0 + m = m + 0",
    "m n : Nat\nih : n + m = m + n\n⊢ n + 1 + m = m + (n + 1)",
    "m : Nat\nh_base : 0 + m = m\n⊢ m + 0 = m"],
   "endPos": {"line": 8, "column": 24}},
  {"proofState": 1,
   "pos": {"line": 11, "column": 49},
   "parentDecl": "add_rfl2",
   "goals": ["⊢ ∀ (n m : Nat), n + m = n + m"],
   "endPos": {"line": 11, "column": 57}}],
 "messages":
 [{"severity": "warning",
   "pos": {"line": 1, "column": 8},
   "endPos": {"line": 1, "column": 37},
   "data": "declaration uses 'sorry'"},
  {"severity": "info",
   "pos": {"line": 10, "column": 0},
   "endPos": {"line": 10, "column": 62},
   "data": "Goals accomplished!"},
  {"severity": "warning",
   "pos": {"line": 11, "column": 8},
   "endPos": {"line": 11, "column": 16},
   "data": "declaration uses 'sorry'"}],
 "env": 0}

We can then prove, with kernel check at the end, the two theorems as follows:

Input-commands

{"tactic": "apply Nat.zero_add", "proofState": 0}

{"tactic": "rw [h_base, h_symm]", "proofState": 2}

{"tactic": "rw [Nat.succ_add, Nat.add_succ, ih]", "proofState": 3}

{"tactic": "apply Nat.add_zero", "proofState": 4}

{"tactic": "(\nintros; rfl)", "proofState": 1}

Output

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 2,
 "goals":
 ["m : Nat\nh_base : 0 + m = m\nh_symm : m + 0 = m\n⊢ 0 + m = m + 0",
  "m n : Nat\nih : n + m = m + n\n⊢ n + 1 + m = m + (n + 1)",
  "m : Nat\nh_base : 0 + m = m\n⊢ m + 0 = m"]}

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 3,
 "goals":
 ["m n : Nat\nih : n + m = m + n\n⊢ n + 1 + m = m + (n + 1)",
  "m : Nat\nh_base : 0 + m = m\n⊢ m + 0 = m"]}

{"proofStatus": "Incomplete: open goals remain",
 "proofState": 4,
 "goals": ["m : Nat\nh_base : 0 + m = m\n⊢ m + 0 = m"]}

{"proofStatus": "Completed", "proofState": 5, "goals": []}

{"proofStatus": "Completed", "proofState": 6, "goals": []}

Since the tactic mode also support sketching (i.e. collecting sorry goals), we can do recursive sketching, i.e. applying the Draft, Sketch, and Prove approach recursively on the sub-lemmas it creates.

Frederick Pu (Apr 28 2025 at 15:20):

I created a batch repl command that supports multithreading via the Task monad: https://github.com/leanprover-community/repl/pull/93

Last updated: May 02 2025 at 03:31 UTC