Zulip Chat Archive

Stream: general

Topic: aesop failures

Floris van Doorn (Oct 04 2024 at 10:47):

I want to like aesop and start using it more often. But usually when I do there is still some very low-level reasoning that aesop doesn't do which I expected it to do. I thought it might be good to start a list, which can be hopefully used to improve aesop.

here is a first one:

import Mathlib.Tactic.Common
example (P Q : Prop) (a b : ℕ) (h : ¬ b = a) (h2 : ¬ a = b → P ∧ Q) : P := by aesop

Floris van Doorn (Oct 04 2024 at 11:51):

probably quite tricky

import Mathlib.Tactic.Common

example (P : ℕ → Prop) (a b c : ℕ) (h : ¬ a = c) (h2 : ¬ a = b → P a ∧ P b)
    (h3 : ¬ b = c → P b ∧ P c) : P a ∧ P c := by
  -- rw [← or_iff_not_imp_left] at h2 h3; -- aesop works with this uncommented
  aesop

Floris van Doorn (Oct 04 2024 at 11:55):

import Mathlib.Tactic.Common

example (a b c : ℕ) (h : a ≤ b) (h2 : b ≤ c) : a ≤ c := by
  aesop

Kevin Buzzard (Oct 04 2024 at 13:29):

I feel the same way, and my recent example is here.

Bhavik Mehta (Oct 04 2024 at 19:08):

import Aesop

example {α : Type} {p : α → Prop} {y : α} (h : ∀ x, p x → x = y) {x : α} (hy : p x) : p y := by aesop

I sometimes have things like this and I'm surprised aesop struggles, not sure if I'm expecting too much of it though.

Bhavik Mehta (Oct 07 2024 at 13:31):

Another example of a similar form, except this one is minimised from one that game up in practice:

import Aesop

example {α β : Type} {s : α → Prop} {t : β → Prop} {f : α → β} {x : β} (hx : t x)
    (ht : ∀ x, t x → ∃ a, f a = x) :
    ∃ a, t (f a) ∧ f a = x := by
  aesop

Note that doing specialize ht x hx before the aesop makes it work. I wonder if an unsafe 5% specialize would fix these two without breaking much else

Jannis Limperg (Oct 08 2024 at 15:29):

Thank you for the examples! User feedback like this is very valuable. I don't have time right now to make bigger changes to Aesop (other than the ones I'm currently working on), but I'll save this thread and come back to it.

@Floris van Doorn the first two examples could be solved by teaching the rule that applies local hypotheses to apply conjunctions. This is definitely doable. I'll make a student project out of this or implement it when I have a moment.

The third example can be solved by locally adding a low-priority transitivity rule (aesop (add 1% le_trans)). This is not a good global rule because it applies to any goal x < y and introduces metavariables, which slow down Aesop. But adding it locally should be fine. Alternatively, le_trans could be added as a forward rule. Then any hypotheses a < b and b < c in the context would be combined into a < c. This seems generally sensible, but only works for local hyps.

@Bhavik Mehta these examples would be solved with a rule that performs forward reasoning with all hypotheses in the context (up to some depth limit). I'm currently working on a big overhaul of Aesop's forward reasoning and this is something I've wanted to do anyway, so it's going to happen eventually.

Kevin Buzzard (Oct 08 2024 at 15:37):

It seems that for many questions, there's a trick of the form "do aesop (add some trick here)". But the comment above seems to indicate that it's not wise to make a PR to make aesop always do this trick (at least in the example being talked about). My conclusion is that we as aesop users are supposed to start learning these tricks (which is fine by me, indeed right now this is the key take-home message I've learnt from asking questions of the form "why doesn't aesop do this?").

Julian Berman (Oct 08 2024 at 15:41):

Maybe this is the same question as Kevin's, or maybe this is precisely the wrong thread for this, but is there a good reference for the exact opposite of these failures, i.e. "here's some kinds of goals you should expect aesop to handle perfectly in the context of Mathlib"? I know the README covers ideas behind it, and I can of course jump back and forth looking for places where aesop is used in Mathlib, but given I don't feel like I have good intuition for when I should tryaesop I'd love to see like 5 or 6 "this is exactly what it's made for" kinds of goals.

Bhavik Mehta (Oct 08 2024 at 16:19):

I often write things like aesop (add simp [...]), where the [...] are filled in with 1-3 simple lemmas which shouldn't be simp in general, but are what I'd rewrite/apply with if I had to prove this manually. And today I also wrote aesop (add unsafe apply le_of_lt), which is a bad aesop rule for the same le_trans reason as above, but was convenient in my particular case

Heather Macbeth (Oct 08 2024 at 16:45):

Bhavik Mehta said:

Another example of a similar form, except this one is minimised from one that game up in practice:

import Aesop

example {α β : Type} {s : α → Prop} {t : β → Prop} {f : α → β} {x : β} (hx : t x)
    (ht : ∀ x, t x → ∃ a, f a = x) :
    ∃ a, t (f a) ∧ f a = x := by
  aesop

Note that this is solved (quickly) by duper [*] (@Josh Clune's superposition prover). Maybe aesop is just the wrong general-purpose tactic for this kind of problem?

Jannis Limperg (Oct 08 2024 at 17:05):

Yes, duper ought to be much better on pure logic + equality. That's what its calculus excels at.

Jannis Limperg (Oct 08 2024 at 17:12):

Kevin Buzzard said:

My conclusion is that we as aesop users are supposed to start learning these tricks [...].

Yes. By design, stock Aesop is fairly weak. It becomes much stronger if Mathlib adds a bunch of rules, either globally (as is currently done in parts of the library, notably category theory) or locally for specific proofs.

If I can help with this, please let me know! I don't currently have time to experiment with different global rules in Mathlib, but I'd be very happy to advise. (One of these days, I'll also make the Aesop docs more comprehensible. Having everything in one README is maybe not ideal.)

Jannis Limperg (Oct 08 2024 at 17:18):

Julian Berman said:

"here's some kinds of goals you should expect aesop to handle perfectly in the context of Mathlib"?

As a somewhat crude heuristic: if a theorem can be proved by a sequence of

apply foo (apply rules)
have := foo a ... z (forward/destruct rules)
simp_all (simp rules)
cases (cases rules)
subst
ext
rfl
split

then it should be possible to make Aesop prove it.

However, Mathlib could also experiment with additional tactic-based rules. For example, Aesop could be taught to apply linarith, omega or positivity on goals of a specific form. The only question is whether this makes Aesop too slow.

Julian Berman (Oct 08 2024 at 17:30):

And the a..z in the have example are expected to be somehow destructed from the current goal state right? aesop isn't going to synthesize terms that don't appear somewhere?

Jannis Limperg (Oct 08 2024 at 17:44):

The a ... z are hypotheses that already appear in the context (or that Aesop has added to the context through other rules).

Kevin Buzzard (Oct 08 2024 at 18:18):

Jannis Limperg said:

Kevin Buzzard said:

My conclusion is that we as aesop users are supposed to start learning these tricks [...].

Yes. By design, stock Aesop is fairly weak. It becomes much stronger if Mathlib adds a bunch of rules, either globally (as is currently done in parts of the library, notably category theory) or locally for specific proofs.

Yeah I hadn't realised this; I thought that the idea was that people just slowly tagged every lemma in mathlib with aesop and then it could prove everything, so I've never paid too much attention to the actual instructions, I've just typed aesop and then moaned when it didn't work. I had not properly understood how to use the tool so this thread has been very helpful!

Patrick Massot (Oct 08 2024 at 19:48):

I think having actual examples of aesop calls with parameters would be very helpful.

Patrick Massot (Oct 08 2024 at 19:49):

Last time I checked the README, it was non-trivial to understand how to tell aesop about a bunch of lemmas it may use.

Bhavik Mehta (Oct 09 2024 at 11:33):

Heather Macbeth said:

Note that this is solved (quickly) by duper [*] (Josh Clune's superposition prover). Maybe aesop is just the wrong general-purpose tactic for this kind of problem?

Perhaps, although this example came from something where aesop did quite a bit of work (in the form Jannis described earlier), but then "got stuck" at a goal like this, so actually using duper here would mean writing a bunch of tactics before it gets down to a pure logic + equality goal

Last updated: May 02 2025 at 03:31 UTC