Zulip Chat Archive

Stream: lean4

Topic: Vector refactor

Wrenna Robson (Dec 04 2024 at 11:35):

I see there has been a lot of work refactoring the definition of Vector in recent months. This has totally broken some code I had but it is a major improvement, so that's exciting. Is there a plan to do more work here, e.g. add more API to Vector?

François G. Dorais (Dec 04 2024 at 13:13):

Yes, Vector will be moving to Lean 4 core soon.

Wrenna Robson (Dec 04 2024 at 13:13):

Thanks :)

Wrenna Robson (Dec 04 2024 at 14:34):

If it is of interest, I have a fairly complete treatment of the following:

/--
A `VectorPerm n` is a permutation on `n` elements represented by two vectors, which we can
think of as an array of values and a corresponding array of indexes which are inverse to
one another. (One can flip the interpretation of indexes and values, and this is essentially
the inversion operation.)
It is designed to be a more performant version of `Equiv.Perm`.
-/
structure VectorPerm (n : ℕ) where
  /--
  Gives the `VectorPerm` as an vector of size `n`.
  -/
  protected fwdVector : Vector ℕ n
  /--
  Gives the inverse of the `VectorPerm` as a vector of size `n`.
  -/
  protected bwdVector : Vector ℕ n
  protected getElem_fwdVector_lt' :
    ∀ {i : ℕ} (hi : i < n), fwdVector[i] < n := by decide
  protected getElem_bwdVector_lt' :
  ∀ {i : ℕ} (hi : i < n), bwdVector[i] < n := by decide
  protected left_inv' : ∀ {i : ℕ} (hi : i < n),
      bwdVector[fwdVector[i]]'(getElem_fwdVector_lt' hi) = i :=
    by decide

Eric Wieser (Dec 04 2024 at 18:00):

Strictly speaking this is just the efficient subset of equiv.perm, right?

Wrenna Robson (Dec 05 2024 at 12:21):

I suppose you could put it like that, yes - certainly the idea was that it is to Equiv.Perm (Fin n) as Fin n -> ℕ is to Vector ℕ n.

Wrenna Robson (Dec 05 2024 at 12:22):

I have a constructor which just takes in a single vector along with a proof of its members being bounded by n and have no duplicates (both of which are decidable): obviously that means it has to construct the inverse but it is convenient for testing.

Wrenna Robson (Dec 05 2024 at 12:24):

Notably because of the finiteness you only have to show it's the inverse one way.

Wrenna Robson (Dec 05 2024 at 12:26):

Actually getElem_bwdVector_lt' might also be unncessary.

Wrenna Robson (Dec 05 2024 at 17:06):

(Yes, it is.)

Eric Wieser (Dec 05 2024 at 17:11):

I guess I'd be tempted to define the fields above as Vector (Fin n) n so as to drop the need for two of the fields

Wrenna Robson (Dec 05 2024 at 17:12):

That is true - there was a good reason why I didn't do that to start with but I can't remember it now.

Wrenna Robson (Dec 05 2024 at 17:12):

Oh yes - it just makes it a LOT more annoying to manipulate stuff because you get caught in casting hell.

Wrenna Robson (Dec 05 2024 at 17:12):

Also as I say this is sufficient:

structure VectorPerm (n : ℕ) where
  /--
  Gives the `VectorPerm` as an vector of size `n`.
  -/
  protected fwdVector : Vector ℕ n
  /--
  Gives the inverse of the `VectorPerm` as a vector of size `n`.
  -/
  protected bwdVector : Vector ℕ n
  getElem_fwdVector_lt :
    ∀ {i : ℕ} (hi : i < n), fwdVector[i] < n := by decide
  getElem_bwdVector_getElem_fwdVector : ∀ {i : ℕ} (hi : i < n),
      bwdVector[fwdVector[i]]'(getElem_fwdVector_lt hi) = i := by decide

Wrenna Robson (Dec 05 2024 at 17:13):

i.e. you don't need to bound bwdVector, that is a theorem.

Wrenna Robson (Dec 05 2024 at 17:15):

But for example, suppose you want to cast using n = m to VectorPerm m. You can just use the same fwdArray and bwdArray - you don't have to worry about casting the fin values inside them.

Wrenna Robson (Dec 05 2024 at 17:16):

I appreciate I am just saying "trust me" - but some experience with this has convinced me that this is the right way.

Eric Wieser (Dec 05 2024 at 17:21):

At least for defining the basic API it doesn't seem too bad:

import Mathlib

structure VectorPerm (n : ℕ) where
  protected fwd : Vector (Fin n) n
  protected rev : Vector (Fin n) n

  rev_getElem_fwd_getElem (i : Fin n) : rev[fwd[i]] = i := by decide

@[simps!]
def VectorPerm.toPerm {n : ℕ} (e : VectorPerm n) : Equiv.Perm (Fin n) :=
  Equiv.ofLeftInverseOfCardLE
    le_rfl
    (e.fwd[·])
    (e.rev[·])
    e.rev_getElem_fwd_getElem

theorem VectorPerm.fwd_getElem_rev_getElem {n : ℕ} (e : VectorPerm n) (i : Fin n) :
    e.fwd[e.rev[i]] = i :=
  e.toPerm.right_inv i

def VectorPerm.cast {m n : ℕ} (h : m = n) (a : VectorPerm m) : VectorPerm n where
  fwd := (a.fwd.cast h).map <| Fin.cast h
  rev := (a.rev.cast h).map <| Fin.cast h
  rev_getElem_fwd_getElem i := by
    cases h
    simp [Vector.cast]
    erw [a.rev_getElem_fwd_getElem] -- oops, this isn't simp-normal!

Wrenna Robson (Dec 05 2024 at 17:21):

shrug

Wrenna Robson (Dec 05 2024 at 17:21):

I spent a couple months working on this back in the summer - I tried it a bunch of different ways.

Eric Wieser (Dec 05 2024 at 17:22):

I guess my hesitancy is whether the casting hell is the fault of the design, or is the fault of something in Lean/mathlib that we should be fixing instead of working around

Wrenna Robson (Dec 05 2024 at 17:22):

As I say, there's no problem in the basics, it's later you have issues.

Wrenna Robson (Dec 05 2024 at 17:23):

Essentially what I found is that faiiirly often I wanted to re-interpret nats without keeping track of the bounds I had on them.

Wrenna Robson (Dec 05 2024 at 17:23):

however this was all before Vector.

Wrenna Robson (Dec 05 2024 at 17:24):

Also, as I say your definition is stronger than the second one I give there - because you don't need to specify that both are bounded.

Wrenna Robson (Dec 05 2024 at 17:25):

Oh, a good example of the more complicated stuff I mean is when I was working with permutations on Fin (2^n). You really want to avoid doing too much casting with this - in particular because Fin (2^(n + 1)) isn't def eq to everything you might wish it was, casting comes up quite a lot.

Wrenna Robson (Dec 05 2024 at 17:26):

As an aside - man I wish there was a typeclass for "SMul on the right" that wasn't just fiddling around with MulOpposite

Eric Wieser (Dec 05 2024 at 17:31):

Edited above with cast. Note that if you wrap a.rev.cast h).map <| Fin.cast h with a custom definition, you can csimp into a call to (another wrapper around) _root_.cast which is free at runtime, unlike map which is $O(n)$

Wrenna Robson (Dec 05 2024 at 17:32):

Yeah. I mean I do just feel this is nice:

def cast (h : n = m) (a : VectorPerm n) : VectorPerm m where
  fwdVector := a.fwdVector.cast h
  bwdVector := a.bwdVector.cast h
  getElem_fwdVector_lt := fun _ => by
    simp_rw [Vector.getElem_cast, h.symm, getElem_fwdVector, getElem_lt]
  getElem_bwdVector_getElem_fwdVector :=
    fun hi => a.getElem_inv_getElem (hi.trans_eq h.symm)

Wrenna Robson (Dec 05 2024 at 17:33):

But I haven't got time to hash it out now. At some point I am going to make a big PR and I can have the discussion there.

Eric Wieser (Dec 05 2024 at 17:33):

Eric Wieser said:

or is the fault of something in Lean/mathlib that we should be fixing instead of working around

Indeed, we are missing basic things like Vector.cast rfl x = x and Vector.map id x = x

Wrenna Robson (Dec 05 2024 at 17:33):

Oh I mean Vector is missing a LOT.

Wrenna Robson (Dec 05 2024 at 17:34):

We don't even have this:

@[simp] theorem getElem_map {n i : ℕ} (hi : i < n) (f : α → β) (v : Vector α n) :
    (v.map f)[i] = f v[i] := by
  cases v ; simp_rw [map_mk, getElem_mk, Array.getElem_map]

Eric Wieser (Dec 05 2024 at 17:34):

My suggestion would be to contribute things like this to Batteries, and let core upstream them when ready

Eric Wieser (Dec 05 2024 at 17:34):

That way you also don't need to wait for a lean release to use them in your project

Wrenna Robson (Dec 05 2024 at 17:34):

Oh I wouldn't dream of contributing this to Core, yeah.

Wrenna Robson (Dec 05 2024 at 17:35):

Batteries for most of it, some of it is more Mathlib-y I think.

Wrenna Robson (Dec 05 2024 at 17:35):

But you get some nice stuff - i.e. VectorPerm n has a natural (right) action on Vector \alpha n.

Wrenna Robson (Dec 05 2024 at 17:36):

(And of course it ALSO has a nice left action on Vector Nat n).

Eric Wieser (Dec 05 2024 at 17:36):

In the same way that Equiv.perm α and DomMulAct (Equiv.perm ι) already have natural actions on ι → α

Wrenna Robson (Dec 05 2024 at 17:37):

Right, and presumably you could write down some theorem that links that action to the one you would define here.

Eric Wieser (Dec 05 2024 at 17:37):

Or you could just work in the abstract setting once you want to start talking about actions, rather than working with vector

Wrenna Robson (Dec 05 2024 at 17:37):

But the motivation for all this was some absolutely terrible performance on some code that used Equivs, when I realised - aha.

Wrenna Robson (Dec 05 2024 at 17:38):

No, no, I needed this stuff in the concrete setting :)

Eric Wieser (Dec 05 2024 at 17:38):

Right, but you can convert it to the abstract setting mid-proof

Eric Wieser (Dec 05 2024 at 17:38):

I guess if your definitions want to use the actions then you're right

Wrenna Robson (Dec 05 2024 at 17:38):

Yes, but what I needed was to define something that used the abstraction - yes.

Wrenna Robson (Dec 05 2024 at 17:39):

Specifically there's a key algorithm that only gets decent speed by the fact that, well, essentially for a given permutation ir sort of performs the calculations on all values at once and thus ammortizes the cost.

Wrenna Robson (Dec 05 2024 at 17:39):

If you do it in the naive abstracted way, every time you make a call you re-calculate the whole thing, catastrophic.

Eric Wieser (Dec 05 2024 at 17:43):

You can avoid that with

def Equiv.Perm.finCacheImpl {n : ℕ} (a : Equiv.Perm (Fin n)) : Equiv.Perm (Fin n) where
  toFun i := (Vector.ofFn a)[i]
  invFun i := (Vector.ofFn a.symm)[i]
  left_inv i := by simp
  right_inv i := by simp

/-- Cache a fin permutation for runtime efficiency. -/
@[simp]
def Equiv.Perm.finCache {n : ℕ} (a : Equiv.Perm (Fin n)) : Equiv.Perm (Fin n) := a

-- csimp is undocumented, maybe this is backwards
@[csimp]
theorem Equiv.Perm.finCache_eq : @Equiv.Perm.finCache = @Equiv.Perm.finCacheImpl := by
  ext n a i
  simp [finCacheImpl]

Wrenna Robson (Dec 05 2024 at 17:44):

That's quite nice, yes.

Wrenna Robson (Dec 05 2024 at 17:44):

What is csimp?

Eric Wieser (Dec 05 2024 at 17:45):

@[csimp] theorem foo : A = B means "use A/B when compiling B/A", though I forget which way around it is

Wrenna Robson (Dec 05 2024 at 17:56):

Eric Wieser said:

In the same way that Equiv.perm α and DomMulAct (Equiv.perm ι) already have natural actions on ι → α

Though this isn't quite the same (or rather, it naturally acts on Vector Nat n rather than just on Vector (Fin n) n)!

Eric Wieser (Dec 05 2024 at 17:59):

That's the DomMulAct example as ι := Fin n and α := Nat

Wrenna Robson (Dec 05 2024 at 18:01):

Right, but I'm talking about the action on values not indices (I guess that might be the first one?)

Eric Wieser (Dec 05 2024 at 18:01):

I would argue that the action on Vector Nat n that acts on the values less than n and ignores the rest is not natural

Wrenna Robson (Dec 05 2024 at 18:01):

If you have a vector of Nats, you can either shuffle the indices, or map on the values.

Wrenna Robson (Dec 05 2024 at 18:02):

Hmm, I'm not so sure - I think a lot of people would say that, say, "the permutation that switches 0 and 1" is something that can be applied to all the naturals, not just Fin 2

Wrenna Robson (Dec 05 2024 at 18:03):

Now I mean you could argue that there's what, n + 1 different ways of embedding the symmetry group of Fin n into Fin (n + 1), and this makes one of them "special". I would agree with that. On the other hand, the one it makes special is special.

Wrenna Robson (Dec 05 2024 at 18:04):

But I think it's ultimately a philosophical point. Regardless, in my context I do often need to, say, reinterpret a permutation on 2^n elements as a permutation on 2^(n + 1) elements - so it worked for me.

Eric Wieser (Dec 05 2024 at 18:04):

I think that reinterpretation would probably best be spelt explicitly

Wrenna Robson (Dec 05 2024 at 18:05):

def natPerm (n : ℕ) : VectorPerm n →* Perm ℕ :=
  (Perm.extendDomainHom Fin.equivSubtype).comp (finPerm n : VectorPerm _ →* Equiv.Perm (Fin n))
```

Wrenna Robson (Dec 05 2024 at 18:05):

You can do something like this.

Wrenna Robson (Dec 05 2024 at 18:06):

Where finPerm, to be clear, is the actual isomorphism.

Wrenna Robson (Dec 05 2024 at 18:31):

Here is an example of something which I think might be something more of a headache to do if you do use Fin n

def castIfgetElemLtOfLt (a : VectorPerm n) (ha : ∀ {i : ℕ}, i < m → ∀ {hi : i < n}, a[i] < m) :
    VectorPerm m where
  fwdVector := (Vector.range m).map (fun i => a[i]?.getD i)
  bwdVector := (Vector.range m).map (fun i => a⁻¹[i]?.getD i)
  getElem_fwdVector_lt := by
    simp_rw [Vector.getElem_map, Vector.getElem_range, getElem?_def]
    intro i him
    rcases lt_or_le i n with hin | hin
    · simp only [hin, dite_true, Option.getD_some, ha him]
    · simp only [hin.not_lt, dite_false, Option.getD_none, him]
  getElem_bwdVector_getElem_fwdVector := by
    simp_rw [Vector.getElem_map, Vector.getElem_range, getElem?_def]
    intro i _
    rcases lt_or_le i n with hin | hin
    · simp only [hin, dite_true, Option.getD_some, getElem_lt, getElem_inv_getElem]
    · simp only [hin.not_lt, dite_false, Option.getD_none]

Kim Morrison (Dec 06 2024 at 01:24):

Wrenna Robson said:

We don't even have this:

@[simp] theorem getElem_map {n i : ℕ} (hi : i < n) (f : α → β) (v : Vector α n) :
    (v.map f)[i] = f v[i] := by
  cases v ; simp_rw [map_mk, getElem_mk, Array.getElem_map]

Apologies about the state of Vector, I am trying to fill it in asap. This lemma (all the getElem_X lemmas for basic operations) is in https://github.com/leanprover/lean4/pull/6324.

François G. Dorais (Dec 06 2024 at 02:32):

As illustrated by lean4#6234, the by cases <that_vector>; simp strategy should work in most cases to prove missing lemmas. Of course, please PR such missing lemmas to Batteries (they may then be moved to core, as needed).

Wrenna Robson (Dec 06 2024 at 06:48):

Kim Morrison said:

Wrenna Robson said:
We don't even have this:
@[simp] theorem getElem_map {n i : ℕ} (hi : i < n) (f : α → β) (v : Vector α n) :
    (v.map f)[i] = f v[i] := by
  cases v ; simp_rw [map_mk, getElem_mk, Array.getElem_map]
Apologies about the state of Vector, I am trying to fill it in asap. This lemma (all the getElem_X lemmas for basic operations) is in https://github.com/leanprover/lean4/pull/6324.

It's no trouble - I figured it was being worked on!

Wrenna Robson (Dec 06 2024 at 09:42):

@François G. Dorais Is there a specification somewhere for what fits in Batteries, what fits in Mathlib, and what is (aimed to) fit into core?

Wrenna Robson (Dec 06 2024 at 09:58):

Kim Morrison said:

Wrenna Robson said:
We don't even have this:
@[simp] theorem getElem_map {n i : ℕ} (hi : i < n) (f : α → β) (v : Vector α n) :
    (v.map f)[i] = f v[i] := by
  cases v ; simp_rw [map_mk, getElem_mk, Array.getElem_map]
Apologies about the state of Vector, I am trying to fill it in asap. This lemma (all the getElem_X lemmas for basic operations) is in https://github.com/leanprover/lean4/pull/6324.

I've added a single comment to this based on a thing I just noticed - in my experience it seems to be essential that the bounds in getElem lemmas match defeq to what is needed, otherwise you can have issues in automation. I just had simp choke on your getElem_take lemma for this reason.

François G. Dorais (Dec 07 2024 at 01:21):

Wrenna Robson said:

François G. Dorais Is there a specification somewhere for what fits in Batteries, what fits in Mathlib, and what is (aimed to) fit into core?

It's not that well delineated. The advantage of submitting to Batteries is that it will be available right after merge whereas with Lean core you need to wait for the appropriate release (usually a month right now but possibly longer in the future).

Much like Lean core, Batteries has stricter standards than Mathlib. Code in Batteries (and Lean core) should be efficient and suitable for "industrial use". That said, don't be shy and submit a PR to Batteries if you think it's suitable. Maintainers will work with you to get it up to par before merging, or they will explain why it is not actually suitable for Batteries and recommend a suitable place for your code.

James Gallicchio (Feb 18 2025 at 07:01):

I think there is a typo in a lemma from lean4#6324 , specifically Vector.getElem_take. The type of hi : i < min n m when it should be hi : i < min m n. In the lemma definition, get_elem_tactic fearlessly fills in a nontrivial proof, which means hi is not resolved by unification! Annoyingly, this only becomes noticeable when the goal is hard enough for get_elem_tactic to fail.

Not-very-minimized example:

example (k₁ k₂ : Vector (Fin (s+2)) (n+2)) {j₁} (h : j₁ < n+1) :
  (k₁.take (n + 1))[j₁]'(by simp_all) = (k₂.take (n + 1))[j₁]'(by simp_all) := by
  simp only [Vector.getElem_take] -- no progress

@[simp] theorem getElem_take' (a : Vector α n) (m : Nat) (hi : i < min m n) :
    (a.take m)[i] = a[i] := by
  cases a
  simp

example (k₁ k₂ : Vector (Fin (s+2)) (n+2)) {j₁} (h : j₁ < n+1) :
  (k₁.take (n + 1))[j₁]'(by simp_all) = (k₂.take (n + 1))[j₁]'(by simp_all) := by
  simp only [getElem_take'] -- progress

Maybe we should stylistically prefer getElem simp lemmas to always use the [...]'... notation, to avoid this kind of bug? It seems lint-able, which could be worthwhile given the huge number of getElem simp lemmas.

James Gallicchio (Feb 24 2025 at 19:44):

reported at lean4#7212 so it doesn't get lost

Kim Morrison (Mar 12 2025 at 03:53):

Fixed (at least the surface symptom) at lean#7449.

Last updated: May 02 2025 at 03:31 UTC