Zulip Chat Archive

Stream: general

Topic: Mystery: import breaks simp


Malvin Gattinger (Jan 22 2025 at 16:06):

This is continuing the simp problem from @Dhyan Aranha yesterday. We have sort of an MWE, but not for live.lean because we need two files.

Create a file A.lean with this:

import Mathlib

local notation "ℝ²" => EuclideanSpace  (Fin 2)
local notation "Triangle" => Fin 3  ²
local notation "Segment" => Fin 2  ²

def closed_simplex (n : )  : Set (Fin n  ) := {α | ( i, 0  α i)   i, α i = 1}
def closed_hull {n : } (f : Fin n  ²) : Set ² := (fun α   i, α i  f i) '' closed_simplex n

def det (T : Triangle) : 
  := (T 0 1 - T 1 1) * (T 2 0) + (T 1 0 - T 0 0) * (T 2 1) + ((T 0 0) * (T 1 1) - (T 1 0) * (T 0 1))

lemma linear_combination_det_last {n : } {x y : ²} {P : Fin n  ²} {α : Fin n  }
    ( :  i, α i = 1) :
  det (fun | 0 => x | 1 => y | 2 => ( i, α i  P i)) =
   i, (α i * det (fun | 0 => x | 1 => y | 2 => (P i))) := by sorry

A second file called B.lean with this:

import Mathlib

-- Comment this to get a working version
import FormalBook.A

local notation "ℝ²" => EuclideanSpace  (Fin 2)
local notation "Triangle" => Fin 3  ²
local notation "Segment" => Fin 2  ²

-- This code replaces "import FormalBook.A" and makes the last simp work.
/-
def closed_simplex (n : ℕ)  : Set (Fin n → ℝ) := {α | (∀ i, 0 ≤ α i) ∧ ∑ i, α i = 1}
def closed_hull {n : ℕ} (f : Fin n → ℝ²) : Set ℝ² := (fun α ↦ ∑ i, α i • f i) '' closed_simplex n

def det (T : Triangle) : ℝ
  := (T 0 1 - T 1 1) * (T 2 0) + (T 1 0 - T 0 0) * (T 2 1) + ((T 0 0) * (T 1 1) - (T 1 0) * (T 0 1))

lemma linear_combination_det_last {n : ℕ} {x y : ℝ²} {P : Fin n → ℝ²} {α : Fin n → ℝ}
    (hα : ∑ i, α i = 1) :
  det (fun | 0 => x | 1 => y | 2 => (∑ i, α i • P i)) =
  ∑ i, (α i * det (fun | 0 => x | 1 => y | 2 => (P i))) := by sorry
-/

lemma linear_combination_det_first {n : } {y z : ²} {P : Fin n  ²} {α : Fin n  }
    ( :  i, α i = 1) :
  det (fun | 0 => ( i, α i  P i) | 1 => y | 2 => z) =
   i, (α i * det (fun | 0 => (P i) | 1 => y | 2 => z)) := by sorry

lemma linear_combination_det_middle {n : } {x z : ²} {P : Fin n  ²} {α : Fin n  }
    ( :  i, α i = 1) :
  det (fun | 0 => x | 1 => ( i, α i  P i) | 2 => z) =
   i, (α i * det (fun | 0 => x | 1 => (P i) | 2 => z)) := by sorry

 lemma det_zero_perm {T : Triangle} (hT  : det T = 0) :
     i j k, det (fun | 0 => T i | 1 => T j | 2 => T k) = 0 := by sorry

lemma det_0_triangle_imp_triv {T : Triangle} (hT : det T = 0) :
     x y z, x  closed_hull T  y  closed_hull T  z  closed_hull T 
      det (fun | 0 => x | 1 => y | 2 => z) = 0 := by
  intro x y z ⟨_, ⟨_, hαx , hx ⟨_, ⟨_, hαy , hy ⟨_, ⟨_, hαz , hz
  rw [hx,  hy, hz, linear_combination_det_first hαx]
  simp [linear_combination_det_middle hαy]
  simp [linear_combination_det_last hαz] -- this simp breaks
  simp [det_zero_perm hT, mul_zero]

Then note that the second-to-last line simp is not working.

Now, comment out theimport FormalBook.A line and uncomment the "This code replaces ..." block. then it magically works, even though A.lean contains exactly those definitions. :thinking:

Malvin Gattinger (Jan 22 2025 at 16:09):

My first guess was that local notation is the culprit, but it seems not related.

Malvin Gattinger (Jan 22 2025 at 17:56):

Curiously, replacing the last two simp lines with the code that show_term simp ... gives for them works, also with the import. So @Dhyan Aranha if you just want to have a compiling version, use this, but it really should be shortened I guess:

lemma det_0_triangle_imp_triv {T : Triangle} (hT : det T = 0) :
     x y z, x  closed_hull T  y  closed_hull T  z  closed_hull T 
      det (fun | 0 => x | 1 => y | 2 => z) = 0 := by
  intro x y z ⟨_, ⟨_, hαx , hx ⟨_, ⟨_, hαy , hy ⟨_, ⟨_, hαz , hz
  rw [hx,  hy, hz, linear_combination_det_first hαx]
  simp only [linear_combination_det_middle hαy]
  refine
    Eq.mpr
      (id
        (congrArg (fun x  x = 0)
          (Finset.sum_congr (Eq.refl Finset.univ) fun x a 
            congrArg (HMul.hMul (_))
              (Finset.sum_congr (Eq.refl Finset.univ) fun x_1 a 
                congrArg (HMul.hMul (_))
                  ((fun x_0 x_1 x_2 
                      (fun x_0 x_1 x_2  linear_combination_det_last hαz) x_0 x_1 x_2)
                    (T x) (T x_1) T)))))
      ?_ -- this was the breaking simp
  exact
    of_eq_true
      (Eq.trans
        (congrArg (fun x  x = 0)
          (Eq.trans
            (Finset.sum_congr (Eq.refl Finset.univ) fun x a 
              Eq.trans
                (congrArg (HMul.hMul (_))
                  (Eq.trans
                    (Finset.sum_congr (Eq.refl Finset.univ) fun x_1 a 
                      Eq.trans
                        (congrArg (HMul.hMul (_))
                          (Eq.trans
                            (Finset.sum_congr (Eq.refl Finset.univ) fun x_2 a 
                              Eq.trans
                                (congrArg (HMul.hMul (_))
                                  ((fun i j k  det_zero_perm hT i j k) x x_1 x_2))
                                (mul_zero (_)))
                            Finset.sum_const_zero))
                        (mul_zero (_)))
                    Finset.sum_const_zero))
                (mul_zero (_)))
            Finset.sum_const_zero))
        (eq_self 0))

Johan Commelin (Jan 24 2025 at 08:49):

Very confusing! I can reproduce this locally.

Johan Commelin (Jan 24 2025 at 09:00):

I pushed a slight variant to branch#weird-import-simp-breakage

Malvin Gattinger (Jan 24 2025 at 09:01):

Ah, thanks! Could this be a bug or is simp treating lemmas in the same file differently than those imported (via .oleans)?

Notification Bot (Jan 24 2025 at 09:35):

This topic was moved here from #Proofs from the book > Mystery: import breaks simp by Johan Commelin.

Johan Commelin (Jan 24 2025 at 09:36):

I'm moving this thread to a more public channel, to get a few more eyes on it.
I've not yet minimized this behavior so that it does not depend on mathlib. But I doubt that mathlib plays an essential role.

Kevin Buzzard (Jan 24 2025 at 19:12):

The goals just before the breaking simp is applied are different in the two situations. With pp.all true, the local context in each case is 698 lines long in both cases, of which 697 are identical and one of them is different: the diff is

533c533
<             linear_combination_det_first.match_1.{1}
---
>             linear_combination_det_last.match_1.{1}

This is the 13th line of the 177-line goal when printed with pp.all on. It's data so this could explain the difference.

Kevin Buzzard (Jan 24 2025 at 19:17):

With pp.all off, the goals look identical:

 ( x : Fin 3,
    w✝² x *
       i : Fin 3,
        w✝¹ i *
          det fun x_1 
            match x_1 with
            | 0 => T x
            | 1 => T i
            | 2 =>  x : Fin 3, w x  T x) =
  0

but I give the first 15 or so lines of the pp.all output below with an indication of where the divergence takes place.

 @Eq.{1} Real
  (@Finset.sum.{0, 0} (Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) Real Real.instAddCommMonoid
    (@Finset.univ.{0} (Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3)))
      (Fin.fintype (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))))
    fun (x : Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) 
    @HMul.hMul.{0, 0, 0} Real Real Real (@instHMul.{0} Real Real.instMul) (w✝² x)
      (@Finset.sum.{0, 0} (Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) Real Real.instAddCommMonoid
        (@Finset.univ.{0} (Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3)))
          (Fin.fintype (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))))
        fun (i : Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) 
        @HMul.hMul.{0, 0, 0} Real Real Real (@instHMul.{0} Real Real.instMul) (w✝¹ i)
          (det fun (x_1 : Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) 
            linear_combination_det_last.match_1.{1} ---- <-- **this is the change**
              (fun (x : Fin (@OfNat.ofNat.{0} Nat 3 (instOfNatNat 3))) 
                EuclideanSpace.{0, 0} Real (Fin (@OfNat.ofNat.{0} Nat 2 (instOfNatNat 2))))
...

Mauricio Collares (Jan 24 2025 at 19:46):

Great catch! I can confirm that with both functions on the same file, I get an error on the second line of

#print linear_combination_det_last.match_1
#print linear_combination_det_first.match_1

while both commands work when the functions are in separate files (the functions have type (motive : Fin 3 → Sort u) → (x : Fin 3) → (Unit → motive 0) → (Unit → motive 1) → (Unit → motive 2) → motive x). Lean seems to be cleverly reusing an autogenerated function when both definitions are on the same file, probably with the (reasonable but apparently wrong) expectation that it wouldn't have visible side effects.

Kyle Miller (Jan 24 2025 at 20:30):

Yeah, the matcher generation infrastructure has a cache.

It tends to make it so that match in simp lemma statements fairly fragile.

Johan Commelin (Jan 25 2025 at 04:45):

@Kyle Miller Aha, should simp warn about that? Or at least @[simp] (although that wouldn't help here).

Dhyan Aranha (Jan 26 2025 at 14:54):

Sorry, Im really interested in what's going on here but I'm very much a lean novice. Here are a couple of questions i'd like to know the answers to:

1) What does "pp.all" do and when should I use it?
2) What does the sentence "the matcher generation infrastructure has a cache" mean?

Sebastian Ullrich (Jan 26 2025 at 14:59):

A higher-level definition of the behavior is that two match terms with the same structure are not considered definitionally equal if they are defined in independent modules. This usually works out when we use a match in a def and then use it again in theorems about that def, but in other circumstances you should consider wrapping the match in a new def and putting that one in a common module.

Sebastian Ullrich (Jan 26 2025 at 15:02):

This gets more interesting with parallelism because then we can have independently elaborated parts even within a single file. However, as long as we only parallelize proof elaboration (and not e.g. theorem statement elaboration), it should be fine as matches in different proofs should not depend on each other

Johan Commelin (Feb 06 2025 at 16:20):

Sebastian Ullrich said:

A higher-level definition of the behavior is that two match terms with the same structure are not considered definitionally equal if they are defined in independent modules. This usually works out when we use a match in a def and then use it again in theorems about that def, but in other circumstances you should consider wrapping the match in a new def and putting that one in a common module.

Is this expected behaviour, that those two match statements will not be defeq when in independent modules, but will be defeq when in dependent modules? That seems somewhat fragile to me.
Is there something that could be done about this?

Sebastian Ullrich (Feb 06 2025 at 17:00):

For sure the most robust implementation would be for no two match statements to be defeq even if they are syntactically equal. Which as mentioned would also make parallelism easier. But I don't think people would be happy with that either

Johan Commelin (Feb 06 2025 at 17:03):

But could they also always be defeq? I understand this would require special support for match statements.
Could this be some sort of reducibility setting?

Sebastian Ullrich (Feb 06 2025 at 17:07):

I don't know, that would require new code in critical modules with unclear performance impliciations

Eric Wieser (Feb 06 2025 at 19:34):

Sebastian Ullrich said:

But I don't think people would be happy with that either

This would make it impossible to use a calc block or the show tactic if a match is in the goal, right? Because the match in the tactic would not be defeq to the one in the goal state.

Malvin Gattinger (Feb 06 2025 at 19:45):

I tried to make a more minimal example and stumbled upon the following. Is this related or something else? This is just in one file. Naively I would expect that whenever unfold bla; simp works then also simp [bla] should work?

def A : Nat  Nat := fun n => match n with | 0 => 0 | k => k + 10

def B : Nat  Nat := fun n => match n with | 0 => 0 | k => k + 10

theorem same : A = B := rfl -- works fine

theorem same' : A = B := by unfold A; unfold B; simp -- works fine

theorem same_simp_not_working : A = B := by simp [A, B] -- simp made no progress

(Turning the defs into abbrevs makes it work.)

Kyle Miller (Feb 06 2025 at 20:10):

I think this is different. simp makes use of equation lemmas, but unfold uses an unfold lemma.

You can mimic unfold like this:

theorem same_simp_not_working : A = B := by simp [A.eq_unfold, B.eq_unfold]

Last updated: May 02 2025 at 03:31 UTC