Zulip Chat Archive
Stream: new members
Topic: Why is using Aesop considered a correct or complete proof in
Matherian (Dec 03 2025 at 01:03):
I would like to understand why a proof using aesop is considered correct or complete in Lean. Since aesop automatically searches for a proof, how does Lean ensure that the result is fully verified and not merely heuristic?
metakuntyyy (Dec 03 2025 at 01:07):
Well aesop is a tactic, if the tactic solves the goal then the kernel guarantees that it's correct. You can print the proof of an aesop proof with #print theoremname. I guess the beauty of a small kernel is precisely that you don't have to worry about correctness, if the tactic closes the goal and no goals are left to be proven then you have completed the proof.
Matherian (Dec 03 2025 at 01:58):
metakuntyyy said:
Well aesop is a tactic, if the tactic solves the goal then the kernel guarantees that it's correct. You can print the proof of an aesop proof with #print theoremname. I guess the beauty of a small kernel is precisely that you don't have to worry about correctness, if the tactic closes the goal and no goals are left to be proven then you have completed the proof.
Thank you for your reply. But what if the infoview shows, "aesop: failed to prove the goal after exhaustive search"? Thanks.
metakuntyyy (Dec 03 2025 at 02:05):
Internally a tactic either succeeds or fails, however success and failure are states for further tactics. A tactic that usually fails also fails to close the goal. In that case aesop can't close the goal you are working on. Other tactics, like grind, give you more details why the tactic invocation failed.
I'm not that familiar how aesop works, but I think it can do some normalisation and it can call simp lemmas. If you have a working aesop you could try aesop? to see what aesop calls.
The beauty of tactics is that you don't have to worry too much about it. If a tactic fails you have to manually find lemmas that can simplify the goal, or you can try other tactics, like grind.
If you have a specific example in mind feel free to share it so that we can give you better answers.
TongKe Xue (Dec 03 2025 at 02:08):
Matherian said:
I would like to understand why a proof using
aesopis considered correct or complete in Lean.
My very limited understanding is that
- using
aesopis sound, not because of anything aesop does, but because, as @metakuntyyy mentioned, the Lean Kernel only accepts correct proofs - I do not believe that
aesopis complete; in that I'm 99.99% sure there are statements that are correct and whichaesopwill not generate a proof for you. Example: Fermat's Last Theorem. :-)
Matherian (Dec 03 2025 at 02:43):
metakuntyyy said:
Internally a tactic either succeeds or fails, however success and failure are states for further tactics. A tactic that usually fails also fails to close the goal. In that case aesop can't close the goal you are working on. Other tactics, like grind, give you more details why the tactic invocation failed.
I'm not that familiar how aesop works, but I think it can do some normalisation and it can call simp lemmas. If you have a working aesop you could try aesop? to see what aesop calls.
The beauty of tactics is that you don't have to worry too much about it. If a tactic fails you have to manually find lemmas that can simplify the goal, or you can try other tactics, like grind.
If you have a specific example in mind feel free to share it so that we can give you better answers.
Thank you for your reply. For example, the infoview at this link shows
aesop: failed to prove the goal after exhaustive search.
in many places in the Lean code. Why is it claimed to be proved in Lean? Thanks.
Chris Henson (Dec 03 2025 at 02:54):
aesop can fail to close the entire proof but still make some progress that when followed by further tactics forms a valid proof. This is typically discouraged in code reviews as this is brittle, but it is common in some AI models.
metakuntyyy (Dec 03 2025 at 03:01):
So here's my guess. The proof is correct but written in a very bad style. Aesop is used in a non-terminating way which does something to a goal, or all goals, but it leaves a warning. It appears to be compiling with warnings.
I've tried the code with 4.26-rc2 and it compiles with only one error. So it seems to be more stable than I thought.
But it compiles and that's the important part.
Marx (Dec 03 2025 at 09:22):
I don't know if it helps but I have rewritten the last part of the following proof from the link:
lemma u_seq_monotone {k : ℕ} {d : Fin k → ℕ} (hk : k ≠ 0) (h_ge : ∀ i, 2 ≤ d i) : Monotone (u_seq d) := by
-- By definition of `u_seq`, we know that `u_seq d n` is the minimum value of `d i ^ e_seq d n i` over all `i`.
have h_min : ∀ n, u_seq d n = (Finset.univ.image (fun i => d i ^ (e_seq d n i))).min' (by
exact ⟨ _, Finset.mem_image_of_mem _ ( Finset.mem_univ ⟨ 0, Nat.pos_of_ne_zero hk ⟩ ) ⟩) := by
unfold u_seq; aesop;
refine' le_antisymm _ _ <;> simp_all +decide [ Finset.min' ];
· exact fun i => Classical.choose_spec ( Finset.exists_min_image Finset.univ ( fun i => d i ^ e_seq d n i ) ( Finset.univ_nonempty_iff.mpr ( Fin.pos_iff_nonempty.mp ( Nat.pos_of_ne_zero hk ) ) ) ) |>.2 i ( Finset.mem_univ i );
· exact ⟨ _, le_rfl ⟩
generalize_proofs at *;
-- By definition of `next_e`, we know that `e_seq d (n + 1)` is obtained by increasing one of the components of `e_seq d n` by 1.
have h_next_e : ∀ n i, e_seq d (n + 1) i ≥ e_seq d n i := by
-- By definition of `next_e`, we know that `e_seq d (n + 1)` is obtained by increasing one of the components of `e_seq d n` by 1. Therefore, for any `i`, `e_seq d (n + 1) i` is either `e_seq d n i + 1` (if `i` is the minimum index) or `e_seq d n i` (otherwise).
intros n i
simp [next_e];
rw [ show e_seq d ( n + 1 ) = next_e d ( e_seq d n ) by rfl ] ; unfold next_e; aesop;
rw [ Function.update_apply ] ; aesop;
intro m n hmn; induction hmn <;> aesop;
exact le_trans ( a_ih a_1 ) ( pow_le_pow_right₀ ( by linarith [ h_ge a_1 ] ) ( h_next_e _ _ ) )
to:
lemma u_seq_monotone {k : ℕ} {d : Fin k → ℕ} (hk : k ≠ 0) (h_ge : ∀ i, 2 ≤ d i) : Monotone (u_seq d) := by
-- By definition of `u_seq`, we know that `u_seq d n` is the minimum value of `d i ^ e_seq d n i` over all `i`.
have h_min : ∀ n, u_seq d n = (Finset.univ.image (fun i => d i ^ (e_seq d n i))).min' (by
exact ⟨ _, Finset.mem_image_of_mem _ ( Finset.mem_univ ⟨ 0, Nat.pos_of_ne_zero hk ⟩ ) ⟩) := by
unfold u_seq; aesop;
refine' le_antisymm _ _ <;> simp_all +decide [ Finset.min' ];
· exact fun i => Classical.choose_spec ( Finset.exists_min_image Finset.univ ( fun i => d i ^ e_seq d n i ) ( Finset.univ_nonempty_iff.mpr ( Fin.pos_iff_nonempty.mp ( Nat.pos_of_ne_zero hk ) ) ) ) |>.2 i ( Finset.mem_univ i );
· exact ⟨ _, le_rfl ⟩
generalize_proofs at *;
-- By definition of `next_e`, we know that `e_seq d (n + 1)` is obtained by increasing one of the components of `e_seq d n` by 1.
have h_next_e : ∀ n i, e_seq d (n + 1) i ≥ e_seq d n i := by
-- By definition of `next_e`, we know that `e_seq d (n + 1)` is obtained by increasing one of the components of `e_seq d n` by 1. Therefore, for any `i`, `e_seq d (n + 1) i` is either `e_seq d n i + 1` (if `i` is the minimum index) or `e_seq d n i` (otherwise).
intros n i
simp [next_e];
rw [ show e_seq d ( n + 1 ) = next_e d ( e_seq d n ) by rfl ] ; unfold next_e; aesop;
rw [ Function.update_apply ] ; aesop;
intro m n hmn
-- induction without aesop
induction hmn with
| refl => simp
| step hmn' a_ih =>
rw [h_min, h_min]
rw [h_min, h_min] at a_ih
simp
intros a
simp at *
-- the previous lines do the same as
-- aesop
exact le_trans ( a_ih a ) ( pow_le_pow_right₀ ( by linarith [ h_ge a ] ) ( h_next_e _ _ ) )
As you can see at the last induction, all aesop is doing in this case is rewriting hypotheses, simping and introduction of a. But since it can't close the goal, it prints out the warning that aesop did not successfully close the goal entirely and that the user still has to do stuff to complete the proof.
Notification Bot (Dec 03 2025 at 09:51):
This topic was moved here from #lean4 > Why is using Aesop considered a correct or complete proof in by Patrick Massot.
Kevin Buzzard (Dec 03 2025 at 10:24):
aesop: failed to prove the goal after exhaustive search.
just means "I tried a tactic, it maybe did something but it didn't prove the goal so there is still some more proving to do; we now continue with the proof by trying more tactics". It doesn't mean "I tried a tactic, it didn't prove the goal so we illegally mark the goal as proved".
Matherian (Dec 03 2025 at 10:45):
metakuntyyy said:
So here's my guess. The proof is correct but written in a very bad style. Aesop is used in a non-terminating way which does something to a goal, or all goals, but it leaves a warning. It appears to be compiling with warnings.
I've tried the code with 4.26-rc2 and it compiles with only one error. So it seems to be more stable than I thought.
But it compiles and that's the important part.
Thank you for your reply! How can one get the 4.26-rc2 version you mentioned? Thanks.
Kim Morrison (Dec 03 2025 at 11:12):
Put leanprover/lean4:v4.26.0-rc2 in your lean-toolchain file, and then lake build.
Kim Morrison (Dec 03 2025 at 11:12):
If your project depends on other projects, you will also need to edit your lakefile.toml, and change your dependencies to point at tags that are using that toolchain, and then run lake update.
Kim Morrison (Dec 03 2025 at 11:13):
Finally, if your project depends on Mathlib, you should run lake exe cache get before lake build.
Alfredo Moreira-Rosa (Dec 04 2025 at 14:39):
TongKe Xue said:
Matherian said:
I would like to understand why a proof using
aesopis considered correct or complete in Lean.
- I do not believe that
aesopis complete; in that I'm 99.99% sure there are statements that are correct and whichaesopwill not generate a proof for you. Example: Fermat's Last Theorem. :-)
I don't know where you got the information that aesop is complete. Obviously there are lemmas that it can't prove. But when it finds a proof, it's garantied by the kernel that it's correct.
And you can extract the autogenarated proof terms using show_terms
Ruben Van de Velde (Dec 04 2025 at 17:13):
I think we're using "complete" in different ways here. One is the technical meaning of a tactic being complete in the sense that it solves all goals within its domain. The other is that if it succeeds, the proof is complete
Ruben Van de Velde (Dec 04 2025 at 17:16):
And I understand the underlying question to be: if aesop does proof search, how can we be convinced that if it finds a proof today, it'll still find a (the same) proof tomorrow?
Last updated: Dec 20 2025 at 21:32 UTC