Zulip Chat Archive
Stream: mathlib4
Topic: Slowly-failing solve-by-elim call
Heather Macbeth (Mar 19 2023 at 16:39):
I have somehow written a very slowly-failing decreasing_tactic
extension, featuring a solve-by-elim call. Does an expert (@Mario Carneiro @Scott Morrison ?) have time to debug it for me?
import Mathlib.Algebra.Order.Group.Abs
import Mathlib.Tactic.SolveByElim
open Lean Meta Elab Mathlib Tactic SolveByElim
@[default_instance] instance : SizeOf Int := ⟨Int.natAbs⟩
syntax "apply_decreasing_rules" : tactic
elab_rules : tactic
  | `(tactic| apply_decreasing_rules) => do
    let cfg := { Config.noBackTracking with failAtMaxDepth := false }
    liftMetaTactic fun g => solveByElim.processSyntax cfg false false [] [] #[mkIdent `dummy_label_attr] [g]
theorem Int.sizeOf_lt_sizeOf_iff (m n : ℤ) : sizeOf n < sizeOf m ↔ |n| < |m| := sorry
theorem termination1a (n d : Int) (h1 : ¬d ≤ 0) (h2 : n < 0) :
    abs (2 * (n + d) - d + 1) < abs (2 * n - d + 1) := by
  sorry

theorem termination2a (n d : Int) (h1 : ¬d ≤ 0) (h2 : ¬n < 0) (h3 : n ≥ d) :
    abs (2 * (n - d) - d + 1) < abs (2 * n - d + 1) := by
  sorry

theorem termination1b (n d : Int) (h1 : ¬d ≤ 0) (h2 : n < 0) :
    sizeOf (2 * (n + d) - d + 1) < sizeOf (2 * n - d + 1) := by
  rw [Int.sizeOf_lt_sizeOf_iff]
  apply termination1a _ _ h1 h2

theorem termination2b (n d : Int) (h1 : ¬d ≤ 0) (h2 : ¬n < 0) (h3 : n ≥ d) :
    sizeOf (2 * (n - d) - d + 1) < sizeOf (2 * n - d + 1) := by
  rw [Int.sizeOf_lt_sizeOf_iff]
  apply termination2a _ _ h1 h2 h3
-- /-! # Fails-slowly version (uncomment only one of these) -/
-- attribute [dummy_label_attr] termination1a termination2a
-- macro_rules
-- | `(tactic| decreasing_tactic) => `(tactic| simp_wf ; (try rw [Int.sizeOf_lt_sizeOf_iff]) ; apply_decreasing_rules)
/-! # Succeeds-quickly version (uncomment only one of these) -/
attribute [dummy_label_attr] termination1b termination2b
macro_rules
| `(tactic| decreasing_tactic) => `(tactic| simp_wf ; apply_decreasing_rules)
/-! # Test -/
def mod (n d : Int) : Int :=
  if d ≤ 0 then
    n
  else if n < 0 then
    mod (n + d) d
  else if n ≥ d then
    mod (n - d) d
  else
    n
termination_by _ n d => 2 * n - d + 1
Heather Macbeth (Mar 19 2023 at 16:40):
(Uncomment the "Fails-slowly version" for the problem, uncomment the "Succeeds-quickly version" for correct code that explains why I thought it should work.)
Mario Carneiro (Mar 19 2023 at 17:25):
minimized:
import Mathlib.Algebra.Order.Group.Abs
import Mathlib.Tactic.SolveByElim
theorem termination1a (n d : Int) : abs (n + d - d + 1) < abs (n - d + 1) := sorry
example (n d : Int) : abs (n - d - d + 1) < abs (n - d + 1) := by
  solve_by_elim only [termination1a]
Scott Morrison (Mar 19 2023 at 21:36):
Okay, I'll take a look. :-)
Scott Morrison (Mar 19 2023 at 22:39):
Not solve_by_elim's fault: the problem is there just with apply:
import Mathlib.Algebra.Order.Group.Abs
theorem termination1a (n d : Int) : abs (n + d - d + 1) < abs (n - d + 1) := sorry
set_option profiler true
example (n d : Int) : abs (n - d - d + 1) < abs (n - d + 1) := by
  apply termination1a -- fails, but takes 16.5 seconds to do so
Scott Morrison (Mar 19 2023 at 22:46):
With
set_option profiler true
set_option trace.Meta.isDefEq true
one sees everything is taking alarmingly long.
Scott Morrison (Mar 19 2023 at 22:47):
it's a bit hard to tell who the culprit is, although
[0.145704s] ❌ -(?n + ?d - ?d + 1) =?= -(n - d - d + 1)
occurs many many times.
Heather Macbeth (Mar 20 2023 at 03:30):
Thank you both! Partial further minimization:
import Mathlib.Init.Algebra.Order
import Mathlib.Init.Data.Int.Basic
class Sup (α : Type) where
  sup : α → α → α

instance (priority := 100) LinearOrder.toLattice {α : Type} [LinearOrder α] : Sup α :=
  { sup := max }

instance : LinearOrder ℤ where
  le := (·≤·)
  le_refl := sorry
  le_trans := sorry
  le_antisymm := sorry
  lt := (·<·)
  lt_iff_le_not_le := sorry
  le_total := sorry
  decidable_eq := by infer_instance
  decidable_le := by infer_instance
  decidable_lt := by infer_instance

class Abs (α : Type) where
  abs : α → α

instance (priority := 100) Neg.toHasAbs [Neg α] [Sup α] : Abs α :=
  ⟨fun a => Sup.sup a (-a)⟩

theorem termination1a (n d : Int) : Abs.abs (n + d - d + 1) < n := sorry

set_option profiler true

example (n d : Int) : Abs.abs (n - d - d + 1) < n := by
  apply termination1a -- fails, but takes 44.3 seconds to do so
Heather Macbeth (Mar 20 2023 at 03:52):
Mathlib-free, taking 49.9 seconds to fail:
theorem foo (n d : Int) :
    Max.max (n + d - d + 1) (-(n + d - d + 1)) < n :=
  sorry

set_option profiler true

example (n d : Int) :
    Max.max (n - d - d + 1) (-(n - d - d + 1)) < n := by
  apply foo
Scott Morrison (Mar 20 2023 at 08:28):
Even
theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry
set_option profiler true
set_option trace.Meta.isDefEq true
example (n d : Int) : (-(- d - d + 1)) < n := by
  apply foo
is ridiculously slow.
Scott Morrison (Mar 20 2023 at 08:37):
The "main branch" of the isDefEq
trace looks like:
[Meta.isDefEq] [9.167012s] ❌ -(?d - ?d + 1) < ?n =?= -(-d - d + 1) < n ▼
[] [0.087260s] ❌ -(?d - ?d + 1) =?= -(-d - d + 1) ▶
[] [9.079658s] ❌ Int.instLTInt.1 (-(?d - ?d + 1)) ?n =?= Int.instLTInt.1 (-(-d - d + 1)) n ▼
[] [9.079602s] ❌ Int.lt (-(?d - ?d + 1)) ?n =?= Int.lt (-(-d - d + 1)) n ▼
[] [0.085383s] ❌ -(?d - ?d + 1) =?= -(-d - d + 1) ▶
[] [8.994142s] ❌ -(?d - ?d + 1) + 1 ≤ ?n =?= -(-d - d + 1) + 1 ≤ n ▼
[] [0.412086s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [8.581948s] ❌ Int.instLEInt.1 (-(?d - ?d + 1) + 1) ?n =?= Int.instLEInt.1 (-(-d - d + 1) + 1) n ▼
[] [8.581897s] ❌ Int.le (-(?d - ?d + 1) + 1) ?n =?= Int.le (-(-d - d + 1) + 1) n ▼
[] [0.407221s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [8.174606s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1)) ▼
[] [8.174537s] ❌ ?n - (-(?d - ?d + 1) + 1) =?= n - (-(-d - d + 1) + 1) ▼
[] [0.000031s] ✅ ?n =?= n ▶
[] [0.415747s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [7.758657s] ❌ instHSub.1 ?n (-(?d - ?d + 1) + 1) =?= instHSub.1 n (-(-d - d + 1) + 1) ▼
[] [7.758614s] ❌ Sub.sub ?n (-(?d - ?d + 1) + 1) =?= Sub.sub n (-(-d - d + 1) + 1) ▼
[] [0.000013s] ✅ ?n =?= n ▶
[] [0.409153s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [7.349361s] ❌ Int.instSubInt.1 ?n (-(?d - ?d + 1) + 1) =?= Int.instSubInt.1 n (-(-d - d + 1) + 1) ▼
[] [7.349338s] ❌ Int.sub ?n (-(?d - ?d + 1) + 1) =?= Int.sub n (-(-d - d + 1) + 1) ▼
[] [0.000014s] ✅ ?n =?= n ▶
[] [0.410425s] ❌ -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1 ▶
[] [6.938839s] ❌ ?n + -(-(?d - ?d + 1) + 1) =?= n + -(-(-d - d + 1) + 1) ▼
[] [0.000015s] ❌ Int =?= Nat ▶
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.649717s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [5.288936s] ❌ instHAdd.1 ?n (-(-(?d - ?d + 1) + 1)) =?= instHAdd.1 n (-(-(-d - d + 1) + 1)) ▼
[] [5.288893s] ❌ Add.add ?n (-(-(?d - ?d + 1) + 1)) =?= Add.add n (-(-(-d - d + 1) + 1)) ▼
[] [0.000016s] ❌ Int =?= Nat ▶
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.538192s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [3.750569s] ❌ Int.instAddInt.1 ?n (-(-(?d - ?d + 1) + 1)) =?= Int.instAddInt.1 n (-(-(-d - d + 1) + 1)) ▼
[] [3.750549s] ❌ Int.add ?n (-(-(?d - ?d + 1) + 1)) =?= Int.add n (-(-(-d - d + 1) + 1)) ▼
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.401068s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [2.349408s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n)) ▼
[] [0.000000s] ✅ fun m n => Int =?= fun m n => Int
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.234129s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[] [0.000000s] ✅ fun m n => Int =?= fun m n => Int
[] [0.000013s] ✅ ?n =?= n ▶
[] [1.115088s] ❌ -(-(?d - ?d + 1) + 1) =?= -(-(-d - d + 1) + 1) ▶
[onFailure] [0.000006s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n))
[onFailure] [0.000005s] ❌ match ?n, -(-(?d - ?d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n =>
Int.negSucc (Nat.succ (m + n)) =?= match n, -(-(-d - d + 1) + 1) with
| Int.ofNat m, Int.ofNat n => Int.ofNat (m + n)
| Int.ofNat m, Int.negSucc n => Int.subNatNat m (Nat.succ n)
| Int.negSucc m, Int.ofNat n => Int.subNatNat n (Nat.succ m)
| Int.negSucc m, Int.negSucc n => Int.negSucc (Nat.succ (m + n))
[onFailure] [0.000006s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1))
[onFailure] [0.000004s] ❌ Int.NonNeg (?n - (-(?d - ?d + 1) + 1)) =?= Int.NonNeg (n - (-(-d - d + 1) + 1))
Scott Morrison (Mar 20 2023 at 08:37):
Note that over and over again we try (and fail) to solve -(?d - ?d + 1) + 1 =?= -(-d - d + 1) + 1.
Sebastian Ullrich (Mar 20 2023 at 09:33):
This looks like the same issue underlying lean4#2055: the heuristic that tries to unify application arguments before resorting to unfolding the function can lead to exponential behavior without proper caching
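(Editorial illustration, not from the thread: the blow-up depends on the outer function being unfoldable at all. If the relation is replaced by an opaque one, the same argument mismatch is attempted only once rather than once per layer of <, Int.lt, Int.le, Int.sub, Int.add and the final match as in the trace above, so the failure should be fast. The names R and bar below are made up.)
-- opaque stand-in for `<`: there is nothing for isDefEq to unfold here
axiom R : Int → Int → Prop
axiom bar (n d : Int) : R (-(d - d + 1)) n

set_option profiler true
example (n d : Int) : R (-(- d - d + 1)) n := by
  apply bar -- still fails, but with nothing to unfold there is no retry cascade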
Sebastien Gouezel (Mar 20 2023 at 10:09):
If this is indeed solved by lean4#2152, could this very example be added as a test to std4, to avoid future regressions?
Sebastian Ullrich (Mar 20 2023 at 10:20):
It still takes 5.5s on my machine using lean4 master, so the redundant work must be outside of typeclass synthesis in this case. Currently there is no caching of nested defeq problems.
Sebastian Ullrich (Mar 20 2023 at 13:05):
Here is a reference to the identical problem in Coq, apparently unsolved: https://github.com/coq/coq/issues/2264#issuecomment-377633737
Heather Macbeth (Mar 20 2023 at 14:13):
That particular example fails quickly in Lean 3. Would you expect that there do exist examples with this bad behaviour in Lean 3? If so, why are the evil examples not the same between Lean 3 and Lean 4?
Sebastian Ullrich (Mar 20 2023 at 14:19):
Is it also fast in Lean 4? succ is probably special-cased here by Lean, so this is not a representative example.
Sebastian Ullrich (Mar 20 2023 at 14:21):
Scott's example takes about 1s for me in Lean 3, so noticeably faster, probably because Lean 3 has a cache for mvar-free defeq problems, but the fundamental issue of exponential behavior should be the same
Sebastian Ullrich (Mar 20 2023 at 14:53):
Do you have a sense for how critical this issue is? I'm assuming it affects mostly incomplete in-progress definitions, not committed code in mathlib
Heather Macbeth (Mar 20 2023 at 14:55):
There is a usability issue in having things that mysteriously fail slowly, but I think the concern for mathlib would be how it affects tactics built on top of apply_rules or solve_by_elim or aesop, which consist of throwing a family of lemmas at a goal. That's how I encountered it.
Heather Macbeth (Mar 20 2023 at 14:55):
import Aesop
@[aesop safe] theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry
@[aesop unsafe 70%] theorem bar (n d : Int) : (-(-d - d + 1)) < n := sorry
set_option profiler true
example (n d : Int) : (-(- d - d + 1)) < n := by aesop -- 5.67 seconds to succeed
(edit: more pointed example)
Heather Macbeth (Mar 20 2023 at 14:56):
Are there particular classes of lemmas which we should avoid tagging for aesop/continuity/etc. for this reason?
Sebastian Ullrich (Mar 20 2023 at 15:06):
Very hard to say in general. It depends on how deeply nested the mismatch between the theorem and your goal is. So "flatter" theorems should generally be safer.
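(Editorial illustration with made-up lemma names: flat exposes only variables under <, so unification with a goal is decided near the head, whereas nested can agree with a goal at the head and differ only deep inside the arithmetic, which is exactly the shape that triggered the slow cascade above.)
theorem flat (a b : Int) (h : a ≤ b) (h' : a ≠ b) : a < b := sorry
theorem nested (a b : Int) : -(-(a - b + 1) + 1) < b := sorry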
Sebastian Ullrich (Mar 20 2023 at 15:09):
Apart from heuristics and optimizations, a more fundamental solution would be to make sure none of these functions can be unfolded in the first place -- the code becomes instantaneous with with_reducible(_and_instances). I don't know what the correct reducibility default for Aesop should be /cc @Jannis Limperg.
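For concreteness, a minimal check along these lines (a sketch based on Scott's example above; timings are machine-dependent):
theorem foo (n d : Int) : (-(d - d + 1)) < n := sorry

set_option profiler true
example (n d : Int) : (-(- d - d + 1)) < n := by
  with_reducible apply foo -- fails quickly: at reducible transparency, < on Int is never unfolded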
Sebastian Ullrich (Mar 20 2023 at 15:11):
Probably it depends very much on the topic, with mathematicians being less reliant on reduction than computer scientists
Sebastian Ullrich (Mar 20 2023 at 15:12):
We don't have a profiler metric for Aesop yet btw, do we?
Heather Macbeth (Mar 20 2023 at 15:40):
Sebastian Ullrich said:
Probably it depends very much on the topic, with mathematicians being less reliant on reduction than computer scientists
One data point against that generalization, IIUC: I recently had a bug from trying to use positivity at reducible transparency, and Gabriel told me that I could only expect that tactic to work at default transparency (so it can unfold instance arguments to get them to match).
Sebastian Ullrich (Mar 20 2023 at 15:50):
Hmm, I would have hoped that "reducible and instances" would be enough for that, but I don't know what declarations are involved
Jannis Limperg (Mar 20 2023 at 20:56):
Sebastian Ullrich said:
Apart from heuristics and optimizations, a more fundamental solution would be to make sure none of these functions can be unfolded in the first place -- the code becomes instantaneous with with_reducible(_and_instances). I don't know what the correct reducibility default for Aesop should be /cc Jannis Limperg.
Also unclear to me. In principle I'm not opposed to what I would call the general Lean philosophy: perform reduction explicitly via simp, and reducible is the transparency of choice. But I imagine the reality is much more messy. It would be interesting to do some experiments with stuff like this now that there is a decent collection of Aesop calls.
Jannis Limperg (Mar 20 2023 at 20:56):
Sebastian Ullrich said:
We don't have a profiler metric for Aesop yet btw, do we?
Not that I know. Would be much appreciated of course!
Sebastian Ullrich (Mar 21 2023 at 09:04):
You simply need to include a profileitM call :), see e.g. https://github.com/leanprover-community/mathlib4/blob/fc708bb9da8a15e1fd7666931f264738aca9f7c5/Mathlib/Tactic/NormNum/Core.lean#L521
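For illustration, a minimal sketch of such a wrapper (hypothetical tactic name, not Aesop's actual entry point; the pattern mirrors the norm_num call linked above):
import Lean
open Lean Elab Tactic

-- hypothetical example tactic: wrapping its body in profileitM makes the
-- running time show up under its own category in the profiler output
elab "profiled_trivial" : tactic => do
  profileitM Exception "profiled_trivial" (← getOptions) <|
    evalTactic (← `(tactic| trivial))

set_option profiler true
example : True := by profiled_trivial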
Sebastian Ullrich (Mar 21 2023 at 09:05):
Note though that any nested existing metrics such as "typeclass inference" will be subtracted from the Aesop metric. I'd like to have a better hierarchical model at some point.
Jannis Limperg (Mar 21 2023 at 10:08):
Will do, thanks!