Zulip Chat Archive

Stream: lean4

Topic: aesop gets stuck


Scott Morrison (Aug 17 2023 at 11:57):

There appears to be exactly one declaration in Mathlib on which aesop runs forever (apparently without triggering a deterministic timeout).

In Mathlib.GroupTheory.Nilpotent, replacing the proof of nilpotent_center_quotient_ind with := by aesop seem to run forever.

Scott Morrison (Aug 17 2023 at 11:57):

I can investigate, but I also see @Jannis Limperg online and am hoping I can tempt them. :-)

Jannis Limperg (Aug 17 2023 at 11:58):

Am tempted, will investigate. ;)

Jannis Limperg (Aug 17 2023 at 14:01):

This is tricky to debug: various trace options never print anything; set_option maxHeartbeats 1 doesn't do anything (so I guess the infinite loop doesn't allocate). Would gdb be useful, given that Aesop is not compiled? Any other ideas, other than peppering the code with dbg_trace?

Sebastian Ullrich (Aug 17 2023 at 15:03):

May be easiest to temporarily enable precompilation for gdb? I hope nothing broke there...

Jannis Limperg (Aug 17 2023 at 15:54):

I've enabled precompilation and I'm running

lake env gdb --args $(elan which lean) <testfile>.lean

but no Aesop functions show up in the backtrace (it's all from libleanshared.so). What's the invocation that makes them show up?

Jannis Limperg (Aug 17 2023 at 15:56):

Mathlib-free test file btw:

import Aesop

class Group (G : Type _)

structure Subgroup (G : Type _) [Group G]

class HasQuotient (A : outParam <| Type u) (B : Type v)

def HasQuotient.Quotient (A : outParam <| Type u) {B : Type v}
    [HasQuotient A B] (b : B) : Type max u v :=
  sorry

notation:35 G " ⧸ " H:34 => HasQuotient.Quotient G H

instance instHasQuotientSubgroup {α} [Group α] : HasQuotient α (Subgroup α) :=
  sorry

variable [Group G]

instance {H : Subgroup G} : Group (G  H) := sorry

class IsNilpotent (G : Type _) [Group G] : Prop

instance (priority := 100) isNilpotent_of_subsingleton [Subsingleton G] : IsNilpotent G :=
  sorry

instance nilpotent_quotient_of_nilpotent (H : Subgroup G) [_h : IsNilpotent G] :
    IsNilpotent (G  H) :=
  sorry

def center (G : Type _) [Group G] : Subgroup G :=
  sorry

theorem nilpotent_center_quotient_ind {P :  (G) [Group G] [IsNilpotent G], Prop}
    (G : Type _) [Group G] [IsNilpotent G]
    (hbase :  (G) [Group G] [Subsingleton G], P G)
    (hstep :  (G) [Group G] [IsNilpotent G],  _ih : P (G  center G), P G) : P G := by
  aesop (rule_sets [-default])

Sebastian Ullrich (Aug 17 2023 at 16:20):

You need to copy the arguments enabling precompilation from lake build -v

Jannis Limperg (Aug 17 2023 at 18:05):

Ah okay, thanks!

Jannis Limperg (Aug 17 2023 at 18:54):

Sebastian Ullrich said:

You need to copy the arguments enabling precompilation from lake build -v

This worked like a charm. :tada:

The underlying issue, it turns out, was not an infinite loop. Rather, the hypothesis P (G / center G) -> P G can always be applied and doubles the size of the goal, so we get extremely large goals before Aesop runs into any of its limits. I could implement additional safeguards against stuff like this, but I don't think that's super necessary.

Scott Morrison (Aug 17 2023 at 22:20):

Is there a cheap way to detect the goal size?

Scott Morrison (Aug 17 2023 at 22:22):

I would like to get a handle on this. I would really like to be able to reliably run aesop even against "bad goals".

Scott Morrison (Aug 17 2023 at 22:23):

The fact that aesop "succeeds" against every other declaration in the library was very encouraging, and makes me enthusiastic to get over the line.

Scott Morrison (Aug 17 2023 at 22:23):

I guess the alternative is that we implement try_for in core, which others have wanted to a while as well.

Scott Morrison (Aug 17 2023 at 23:35):

In any case I'd be curious if you had specific additional safeguards in mind.

Jannis Limperg (Aug 18 2023 at 09:15):

Goal size is linear in the size of the goal (hehe), but caching should help a lot because goals are usually similar. However, there may be other failure modes which goal size doesn't address.

A timeout would be the most complete solution, but it introduces machine dependence. Nevertheless, I think for your mass-testing use case you'll want a timeout sooner or later anyway. E.g. when you test decision procedures or heavier tactics, they might run into pathological goals as well.


Last updated: Dec 20 2023 at 11:08 UTC