Zulip Chat Archive
Stream: mathlib4
Topic: Slowly-failing solve-by-elim call
Heather Macbeth (Mar 19 2023 at 16:39):
I have somehow written a very slowly-failing decreasing_tactic
extension, featuring a solve-by-elim call. Does an expert (@Mario Carneiro @Scott Morrison ?) have time to debug it for me?
import Mathlib.Algebra.Order.Group.Abs
import Mathlib.Tactic.SolveByElim
open Lean Meta Elab Mathlib Tactic SolveByElim
@[default_instance] instance : SizeOf Int := ⟨Int.natAbs⟩
syntax "apply_decreasing_rules" : tactic
elab_rules : tactic
  | `(tactic| apply_decreasing_rules) => do
    let cfg := { Config.noBackTracking with failAtMaxDepth := false }
    liftMetaTactic fun g => solveByElim.processSyntax cfg false false [] [] #[mkIdent `dummy_label_attr] [g]
theorem Int.sizeOf_lt_sizeOf_iff (m n : ℤ) : sizeOf n < sizeOf m ↔ |n| < |m| := sorry
theorem termination1a (n d : Int) (h1 : ¬d ≤ 0) (h2 : n < 0) :
    abs (2 * (n + d) - d + 1) < abs (2 * n - d + 1) := by
  sorry

theorem termination2a (n d : Int) (h1 : ¬d ≤ 0) (h2 : ¬n < 0) (h3 : n ≥ d) :
    abs (2 * (n - d) - d + 1) < abs (2 * n - d + 1) := by
  sorry

theorem termination1b (n d : Int) (h1 : ¬d ≤ 0) (h2 : n < 0) :
    sizeOf (2 * (n + d) - d + 1) < sizeOf (2 * n - d + 1) := by
  rw [Int.sizeOf_lt_sizeOf_iff]
  apply termination1a _ _ h1 h2

theorem termination2b (n d : Int) (h1 : ¬d ≤ 0) (h2 : ¬n < 0) (h3 : n ≥ d) :
    sizeOf (2 * (n - d) - d + 1) < sizeOf (2 * n - d + 1) := by
  rw [Int.sizeOf_lt_sizeOf_iff]
  apply termination2a _ _ h1 h2 h3
-- /-! # Fails-slowly version (uncomment only one of these) -/
-- attribute [dummy_label_attr] termination1a termination2a
-- macro_rules
-- | `(tactic| decreasing_tactic) => `(tactic| simp_wf ; (try rw [Int.sizeOf_lt_sizeOf_iff]) ; apply_decreasing_rules)
/-! # Succeeds-quickly version (uncomment only one of these) -/
attribute [dummy_label_attr] termination1b termination2b
macro_rules
| `(tactic| decreasing_tactic) => `(tactic| simp_wf ; apply_decreasing_rules)
/-! # Test -/
def mod (n d : Int) : Int :=
  if d ≤ 0 then
    n
  else if n < 0 then
    mod (n + d) d
  else if n ≥ d then
    mod (n - d) d
  else
    n
termination_by _ n d => 2 * n - d + 1
Heather Macbeth (Mar 19 2023 at 16:40):
(Uncomment the "Fails-slowly version" for the problem, uncomment the "Succeeds-quickly version" for correct code that explains why I thought it should work.)
Mario Carneiro (Mar 19 2023 at 17:25):
minimized:
import Mathlib.Algebra.Order.Group.Abs
import Mathlib.Tactic.SolveByElim
theorem termination1a (n d : Int) : abs (n + d - d + 1) < abs (n - d + 1) := sorry
example (n d : Int) : abs (n - d - d + 1) < abs (n - d + 1) := by
  solve_by_elim only [termination1a]
Scott Morrison (Mar 19 2023 at 21:36):
Okay, I'll take a look. :-)
Scott Morrison (Mar 19 2023 at 22:39):
Not solve_by_elim's fault: the problem is there just with apply:
import Mathlib.Algebra.Order.Group.Abs
theorem termination1a (n d : Int) : abs (n + d - d + 1) < abs (n - d + 1) := sorry
set_option profiler true
example (n d : Int) : abs (n - d - d + 1) < abs (n - d + 1) := by
  apply termination1a -- fails, but takes 16.5 seconds to do so
Scott Morrison (Mar 19 2023 at 22:46):
With
set_option profiler true
set_option trace.Meta.isDefEq true
one sees everything is taking alarmingly long.
Scott Morrison (Mar 19 2023 at 22:47):
it's a bit hard to tell who the culprit is, although
[0.145704s] ❌ -(?n + ?d - ?d + 1) =?= -(n - d - d + 1)
occurs many many times.
Heather Macbeth (Mar 20 2023 at 03:30):
Thank you both! Partial further minimization:
import Mathlib.Init.Algebra.Order
import Mathlib.Init.Data.Int.Basic
class Sup (α : Type) where
  sup : α → α → α

instance (priority := 100) LinearOrder.toLattice {α : Type} [LinearOrder α] : Sup α :=
  { sup := max }

instance : LinearOrder ℤ where
  le := (·≤·)
  le_refl := sorry
  le_trans := sorry
  le_antisymm := sorry
  lt := (·<·)
  lt_iff_le_not_le := sorry
  le_total := sorry
  decidable_eq := by infer_instance
  decidable_le := by infer_instance
  decidable_lt := by infer_instance

class Abs (α : Type) where
  abs : α → α

instance (priority := 100) Neg.toHasAbs [Neg α] [Sup α] : Abs α :=
  ⟨fun a => Sup.sup a (-a)⟩

theorem termination1a (n d : Int) : Abs.abs (n + d - d + 1) < n := sorry

set_option profiler true

example (n d : Int) : Abs.abs (n - d - d + 1) < n := by
  apply termination1a -- fails, but takes 44.3 seconds to do so
Heather Macbeth (Mar 20 2023 at 03:52):
Mathlib-free, taking 49.9 seconds to fail:
theorem foo (n d : Int) :
    Max.max (n + d - d + 1) (-(n + d - d + 1)) < n :=
  sorry

set_option profiler true

example (n d : Int) :
    Max.max (n - d - d + 1) (-(n - d - d + 1)) < n := by
  apply foo
Scott Morrison (Mar 20 2023 at 08:28):
Even
theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry
set_option profiler true
set_option trace.Meta.isDefEq true
example (n d : Int) : (-(- d - d + 1)) < n := by
  apply foo
is ridiculously slow.
Scott Morrison (Mar 20 2023 at 08:37):
The "main branch" of the isDefEq
trace looks like:
[Meta.isDefEq] [9.167012s] ❌ -(?d - ?d + 1) < ?n =?= -(-d - d + 1) < n ▼
[] [0.087260s] ❌ -(?d - ?d + 1) =?= -(-d - d + 1) ▶
[] [9.079658s] ❌ Int.instLTInt.1 (-(?d - ?d + 1)) ?n =?= Int.instLTInt.1 (-(-d - d + 1)) n ▼
[] [9.079602s] ❌ Int.lt (-(?d - ?d + 1)) ?n =?= Int.lt (-(-d - d + 1)) n ▼
[] [0.085383s] ❌ -(?d - ?d + 1) =?= -(-d - d + 1) ▶
[] [8.994142s] ❌ -(?d - ?d + 1) + 1 ≤ ?n =?= -(-d - d + 1) + 1 ≤ n ▼
[] [0.412086s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [8.581948s] ❌ Int.instLEInt.1 (-(?d - ?d + 1) + 1) ?n =?= Int.instLEInt.1 (-(-d - d + 1) + 1) n ▼
[] [8.581897s] ❌ Int.le (-(?d - ?d + 1) + 1) ?n =?= Int.le (-(-d - d + 1) + 1) n ▼
[] [0.407221s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [8.174606s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1)) ▼
[] [8.174537s] ❌ ?n - (-(?d - ?d + 1) + 1) =?= n - (-(-d - d + 1) + 1) ▼
[] [0.000031s] ✅ ?n =?= n ▶
[] [0.415747s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [7.758657s] ❌ instHSub.1 ?n (-(?d - ?d + 1) + 1) =?= instHSub.1 n (-(-d - d + 1) + 1) ▼
[] [7.758614s] ❌ Sub.sub ?n (-(?d - ?d + 1) + 1) =?= Sub.sub n (-(-d - d + 1) + 1) ▼
[] [0.000013s] ✅ ?n =?= n ▶
[] [0.409153s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [7.349361s] ❌ Int.instSubInt.1 ?n (-(?d - ?d + 1) + 1) =?= Int.instSubInt.1 n (-(-d - d + 1) + 1) ▼
[] [7.349338s] ❌ Int.sub ?n (-(?d - ?d + 1) + 1) =?= Int.sub n (-(-d - d + 1) + 1) ▼
[] [0.000014s] ✅ ?n =?= n ▶
[] [0.410425s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [6.938839s] ❌ ?n + -(-(?d - ?d + 1) + 1) =?= n + -(-(-d - d + 1) + 1) ▼
[] [0.000015s] ❌ Int =?= Nat ▶
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.649717s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [5.288936s] ❌ instHAdd.1 ?n (-(-(?d - ?d + 1) + 1)) =?= instHAdd.1 n (-(-(-d - d + 1) + 1)) ▼
[] [5.288893s] ❌ Add.add ?n (-(-(?d - ?d + 1) + 1)) =?= Add.add n (-(-(-d - d + 1) + 1)) ▼
[] [0.000016s] ❌ Int =?= Nat ▶
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.538192s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [3.750569s] ❌ Int.instAddInt.1 ?n (-(-(?d - ?d + 1) + 1)) =?= Int.instAddInt.1 n (-(-(-d - d + 1) + 1)) ▼
[] [3.750549s] ❌ Int.add ?n (-(-(?d - ?d + 1) + 1)) =?= Int.add n (-(-(-d - d + 1) + 1)) ▼
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.401068s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [2.349408s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n)) ▼
[] [0.000000s] ✅ fun m n => Int =?= fun m n => Int
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.234129s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [0.000000s] ✅ fun m n => Int =?= fun m n => Int
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.115088s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[onFailure] [0.000006s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n))
[onFailure] [0.000005s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n))
[onFailure] [0.000006s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1))
[onFailure] [0.000004s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1))
Scott Morrison (Mar 20 2023 at 08:37):
Note that over and over again we try (and fail) to solve -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1.
Sebastian Ullrich (Mar 20 2023 at 09:33):
This looks like the same issue underlying lean4#2055: the heuristic that tries to unify application arguments before resorting to unfolding the function can lead to exponential behavior without proper caching
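(Editorial illustration, not from the thread: the blow-up depends on the outer function being unfoldable at all. If the relation is replaced by an opaque one, the same argument mismatch is attempted only once rather than once per layer of <, Int.lt, Int.le, Int.sub, Int.add and the final match as in the trace above, so the failure should be fast. The names R and bar below are made up.)
-- opaque stand-in for `<`: there is nothing for isDefEq to unfold here
axiom R : Int → Int → Prop
axiom bar (n d : Int) : R (-(d - d + 1)) n

set_option profiler true
example (n d : Int) : R (-(- d - d + 1)) n := by
  apply bar -- still fails, but with nothing to unfold there is no retry cascade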
Sebastien Gouezel (Mar 20 2023 at 10:09):
If this is indeed solved by lean4#2152, could this very example be added as a test to std4, to avoid future regressions?
Sebastian Ullrich (Mar 20 2023 at 10:20):
It still takes 5.5s on my machine using lean4 master, so the redundant work must be outside of typeclass synthesis in this case. Currently there is no caching of nested defeq problems.
Sebastian Ullrich (Mar 20 2023 at 13:05):
Here is a reference to the identical problem in Coq, apparently unsolved: https://github.com/coq/coq/issues/2264#issuecomment-377633737
Heather Macbeth (Mar 20 2023 at 14:13):
That particular example fails quickly in Lean 3. Would you expect that there do exist examples with this bad behaviour in Lean 3? If so, why are the evil examples not the same between Lean 3 and Lean 4?
Sebastian Ullrich (Mar 20 2023 at 14:19):
Is it also fast in Lean 4? succ is probably special-cased here by Lean, so this is not a representative example.
Sebastian Ullrich (Mar 20 2023 at 14:21):
Scott's example takes about 1s for me in Lean 3, so noticeably faster, probably because Lean 3 has a cache for mvar-free defeq problems, but the fundamental issue of exponential behavior should be the same
Sebastian Ullrich (Mar 20 2023 at 14:53):
Do you have a sense for how critical this issue is? I'm assuming it affects mostly incomplete in-progress definitions, not committed code in mathlib
Heather Macbeth (Mar 20 2023 at 14:55):
There is a usability issue in having things that mysteriously fail slowly, but I think the concern for mathlib would be how it affects tactics built on top of apply_rules or solve_by_elim or aesop, which consist of throwing a family of lemmas at a goal. That's how I encountered it.
Heather Macbeth (Mar 20 2023 at 14:55):
import Aesop
@[aesop safe] theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry
@[aesop unsafe 70%] theorem bar (n d : Int) : (-(-d - d + 1)) < n := sorry
set_option profiler true
example (n d : Int) : (-(- d - d + 1)) < n := by aesop -- 5.67 seconds to succeed
(edit: more pointed example)
Heather Macbeth (Mar 20 2023 at 14:56):
Are there particular classes of lemmas which we should avoid tagging for aesop/continuity/etc. for this reason?
Sebastian Ullrich (Mar 20 2023 at 15:06):
Very hard to say in general. It depends on how deeply nested the mismatch between the theorem and your goal is. So "flatter" theorems should generally be safer.
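(Editorial illustration with made-up lemma names: flat exposes only variables under <, so unification with a goal is decided near the head, whereas nested can agree with a goal at the head and differ only deep inside the arithmetic, which is exactly the shape that triggered the slow cascade above.)
theorem flat (a b : Int) (h : a ≤ b) (h' : a ≠ b) : a < b := sorry
theorem nested (a b : Int) : -(-(a - b + 1) + 1) < b := sorry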
Sebastian Ullrich (Mar 20 2023 at 15:09):
Apart from heuristics and optimizations, a more fundamental solution would be to make sure none of these functions can be unfolded in the first place -- the code becomes instantaneous with with_reducible(_and_instances). I don't know what the correct reducibility default for Aesop should be /cc @Jannis Limperg.
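For concreteness, a minimal check along these lines (a sketch based on Scott's example above; timings are machine-dependent):
theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry

set_option profiler true
example (n d : Int) : (-(- d - d + 1)) < n := by
  with_reducible apply foo -- fails quickly: at reducible transparency, < on Int is never unfolded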
Sebastian Ullrich (Mar 20 2023 at 15:11):
Probably it depends very much on the topic, with mathematicians being less reliant on reduction than computer scientists
Sebastian Ullrich (Mar 20 2023 at 15:12):
We don't have a profiler metric for Aesop yet btw, do we?
Heather Macbeth (Mar 20 2023 at 15:40):
Sebastian Ullrich said:
Probably it depends very much on the topic, with mathematicians being less reliant on reduction than computer scientists
One data point against that generalization, IIUC: I recently had a bug from trying to use positivity at reducible transparency, and Gabriel told me that I could only expect that tactic to work at default transparency (so it can unfold instance arguments to get them to match).
Sebastian Ullrich (Mar 20 2023 at 15:50):
Hmm, I would have hoped that "reducible and instances" would be enough for that, but I don't know what declarations are involved
Jannis Limperg (Mar 20 2023 at 20:56):
Sebastian Ullrich said:
Apart from heuristics and optimizations, a more fundamental solution would be to make sure none of these functions can be unfolded in the first place -- the code becomes instantaneous with with_reducible(_and_instances). I don't know what the correct reducibility default for Aesop should be /cc Jannis Limperg.
Also unclear to me. In principle I'm not opposed to what I would call the general Lean philosophy: perform reduction explicitly via simp, and reducible is the transparency of choice. But I imagine the reality is much more messy. It would be interesting to do some experiments with stuff like this now that there is a decent collection of Aesop calls.
Jannis Limperg (Mar 20 2023 at 20:56):
Sebastian Ullrich said:
We don't have a profiler metric for Aesop yet btw, do we?
Not that I know. Would be much appreciated of course!
Sebastian Ullrich (Mar 21 2023 at 09:04):
You simply need to include a profileitM call :), see e.g. https://github.com/leanprover-community/mathlib4/blob/fc708bb9da8a15e1fd7666931f264738aca9f7c5/Mathlib/Tactic/NormNum/Core.lean#L521
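For illustration, a minimal sketch of such a wrapper (hypothetical tactic name, not Aesop's actual entry point; the pattern mirrors the norm_num call linked above):
import Lean
open Lean Elab Tactic

-- hypothetical example tactic: wrapping its body in profileitM makes the
-- running time show up under its own category in the profiler output
elab "profiled_trivial" : tactic => do
  profileitM Exception "profiled_trivial" (← getOptions) <|
    evalTactic (← `(tactic| trivial))

set_option profiler true
example : True := by profiled_trivial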
Sebastian Ullrich (Mar 21 2023 at 09:05):
Note though that any nested existing metrics such as "typeclass inference" will be subtracted from the Aesop metric. I'd like to have a better hierarchical model at some point.
Jannis Limperg (Mar 21 2023 at 10:08):
Will do, thanks!