Zulip Chat Archive

Stream: new members

Topic: Code review: proof about Array map

Number Eighteen (Jul 28 2024 at 13:20):

I made a simple implementation of map for FloatArray that needs to be as fast as it can be:

@[specialize, inline]
def FloatArray.mapAux (A C : FloatArray) (f : Float → Float)
                      (k : Nat) (k_le_s : k ≤ A.size) :
    FloatArray :=
  if sz_eq : k = A.size then C else
    FloatArray.mapAux A (C.push (f A[k])) f k.succ
      (Nat.lt_of_le_of_ne k_le_s sz_eq)

@[inline]
def FloatArray.map (A : FloatArray) (f : Float → Float) :
    FloatArray :=
  A.mapAux (FloatArray.mkEmpty A.size) f 0 (Nat.zero_le _)

Now I want to prove that map does not change the size of the array:

theorem float_array_map_size : ∀ (A : FloatArray)
    (f : Float → Float),
    (A.map f).size = A.size := sorry

I suspect things like this are proven by some kind of induction tactic, but I am not very familiar with tactics. So I gave the following proof:

First, understand how push affects size:

theorem arr_inc : ∀ (A : FloatArray) (f: Float),
    (A.push f).size = A.size.succ :=
  fun A f ↦ by
    unfold FloatArray.push FloatArray.size
    simp only [Array.size_push]

Next, prove a recursive theorem about mapAux:

theorem float_array_map_size_aux :
    ∀ (A C : FloatArray) (f : Float → Float) (k : Nat),
    (k_le_s : k ≤ A.size) →
    (A.mapAux C f k k_le_s).size = C.size + (A.size - k) :=
  fun A C f k k_le_s =>
    if k_eq : k = A.size then by
      unfold FloatArray.mapAux
      split <;> simp_all only [Nat.sub_self, Nat.add_zero]
    else
      have k_lt_s  : k < A.size      := Nat.lt_of_le_of_ne k_le_s k_eq
      have sk_le_s : k.succ ≤ A.size := Nat.succ_le_of_lt k_lt_s
      have next_k :=
        float_array_map_size_aux A (C.push (f A[k])) f k.succ sk_le_s
      have : C.size + 1 + (A.size - k.succ) = C.size + (A.size - k) := by omega
      by rw [arr_inc C (f A[k]), this] at next_k
         unfold FloatArray.mapAux
         split <;> simp_all only [Nat.le_refl]

Finally apply the recursive theorem to prove what we want:

open FloatArray in
theorem float_array_map_size : ∀ (A : FloatArray)
    (f : Float → Float), (A.map f).size = A.size :=
  fun A f =>
  have aux_eq := float_array_map_size_aux A
    (mkEmpty A.size) f 0 (Nat.zero_le A.size)
  have : (mkEmpty A.size).size = 0 := rfl
  by rw [this, Nat.zero_add] at aux_eq; apply aux_eq

I think there is a much simpler, more streamlined way of proving such things but I could not find it in the docs. Can someone with experience try to simplify the above procedure and suggest tactics/methods for such recursively defined functions?

Michal Wallace (tangentstorm) (Jul 28 2024 at 17:15):

Not really what you're asking for, but here are some small golfs.

for arr_inc:

you generally replace : ∀ x:T , fun x ↦ with just (x:T) : (giving the theorem itself a parameter)
the unfolding isn't necessary. (that wasn't obvious to me, but i deleted it and the proof still worked)

theorem arr_inc (A : FloatArray) (f: Float) : (A.push f).size = A.size.succ :=
  Array.size_push A.data f

for float_arry_map_size:

same thing about the parameters
you can replace some of the explicit parameters with _
apply aux_eq to solve the goal works but exact aux_eq is a little more explicit
then rw [...] at foo; exact foo can be replaced with rwa [...]

theorem float_array_map_size (A : FloatArray) (f : Float → Float)
: (A.map f).size = A.size :=
  open FloatArray in
  have aux_eq := float_array_map_size_aux A _ f 0 _
  have : (mkEmpty A.size).size = 0 := rfl
  by rwa [this, Nat.zero_add] at aux_eq

Number Eighteen (Jul 28 2024 at 17:28):

Thank you! I kept the unfolding because Lean complained in tactic mode, but I should have realized I could just turn to term mode. I also did not know about rwa.

I'd still like to know if there is any way to bypass this pattern completely and give a simple "induction" style proof rather than the explicit recursive call.

Michal Wallace (tangentstorm) (Jul 28 2024 at 17:29):

for float_array_map_size_aux, you can pull out the duplicated unfold and split and put them before the if:

theorem float_array_map_size_aux
    (A C : FloatArray) (f : Float → Float) (k : Nat) (k_le_s : k ≤ A.size)
  : (A.mapAux C f k k_le_s).size = C.size + (A.size - k) := by
    unfold FloatArray.mapAux; split <;>
    if k_eq: k = A.size then
      simp_all only [Nat.sub_self, Nat.add_zero]
    else
      have k_lt_s  : k < A.size      := Nat.lt_of_le_of_ne k_le_s k_eq
      have sk_le_s : k.succ ≤ A.size := Nat.succ_le_of_lt k_lt_s
      have next_k :=
        float_array_map_size_aux A (C.push (f A[k])) f k.succ sk_le_s
      have : C.size + 1 + (A.size - k.succ) = C.size + (A.size - k) := by omega
      rw[arr_inc C (f A[k]), this] at next_k
      simp_all only [Nat.le_refl]

I think you're right about an inductive proof, but I'm not confident that I can produce one. I may give it a shot later. :)

Michal Wallace (tangentstorm) (Jul 28 2024 at 21:55):

So... You could do induction on the List inside the Array, or you could find some natural number that's decreasing and do induction on that.

Your list is growing at each recursive step instead of shrinking, so it's kind of hard to write it as an induction. I think if you rewrote the algorithm to be more amenable to induction, you'd just wind up with map.

On the other hand, you also don't have any explicit numbers that are decreasing... but there is an implicit one: the difference between the length of C and the length of A.

So I rewrote your algorithm slightly. It still has the same logic, but instead of a k that is equal to C.size at each step (and thus growing), I used a d that is equal to A.size - C.size.

Now we can do induction on d. Since mapAux is recursive, the trick is to unfold it a second time and then the induction hypothesis (ih) does the rest of the work.

Here's the complete proof:

@[specialize, inline]
def FloatArray.mapAux (A C:FloatArray) (f:Float→Float) (d:Nat) (hd:C.size+d=A.size)
  : FloatArray :=
    if h: d = 0 then C
    else
      let C' := C.push <| f (A[C.size])
      have : C'.size = C.size.succ := Array.size_push C.data _
      have : C'.size + (d-1) = A.size := by omega
      FloatArray.mapAux A C' f (d-1) this

@[inline]
def FloatArray.map (A:FloatArray) (f:Float→Float) : FloatArray :=
  let C := (FloatArray.mkEmpty A.size)
  have : C.size = 0 := by rfl
  A.mapAux C f A.size (by omega)

open FloatArray

theorem size_mapAux (A: FloatArray) (f: Float→Float) (d:Nat)
  : (C:FloatArray) → (hd:C.size+d=A.size) → (A.mapAux C f d hd).size = A.size := by
    induction d
    unfold mapAux
    case zero => simp
    case succ d' ih =>
      unfold mapAux
      simp_all

theorem size_map (A : FloatArray) (f: Float→Float)
  : A.size = (A.map f).size := by
    unfold FloatArray.map
    have : (mkEmpty A.size).size = 0 := by rfl
    have : (mkEmpty A.size).size + A.size = A.size := by omega
    have aux_eq := size_mapAux A f A.size (mkEmpty A.size) this
    rw [aux_eq]

Michal Wallace (tangentstorm) (Jul 28 2024 at 21:58):

Out of curiosity, is this actually faster than the native map? and if so, why?

Number Eighteen (Jul 29 2024 at 12:46):

FloatArray does not have a map instance; it has ForIn, but it is very hard for me to prove things monadically.

Number Eighteen (Jul 29 2024 at 12:48):

Very nice refactoring, by the way; thanks!

Michal Wallace (tangentstorm) (Jul 29 2024 at 17:48):

No problem.... But it might be worth benchmarking against this:

def FloatArray.map' (x : FloatArray) (f: Float->Float): FloatArray := ⟨ x.data.map f ⟩
def x : FloatArray := ⟨#[0.5, 0.324, 0.65345345]⟩
#eval x.map' (·+1)

Henrik Böving (Jul 29 2024 at 17:49):

Both of these implementations are inherently suboptimal because they allocate even when RC=1. For an optimal implementation you would want to basically do what Array.map is doing.

Michal Wallace (tangentstorm) (Jul 29 2024 at 17:54):

@[inline]
def map {α : Type u} {β : Type v} (f : α → β) (as : Array α) : Array β :=
  Id.run <| as.mapM f

@Henrik Böving huh. I don't know what this means yet, but is there any way to just automatically derive WhateverArray. map (and presumably other Array functions)?

Henrik Böving (Jul 29 2024 at 17:55):

You have to look deeper

Henrik Böving (Jul 29 2024 at 17:55):

https://github.com/leanprover/lean4/blob/702c31b8071269f0052fd1e0fb3891a079a655bd/src/Init/Data/Array/Basic.lean#L280-L309

Henrik Böving (Jul 29 2024 at 17:56):

the mapMUnsafe is the optimal implementation as it is able to modify the array directly in place

Henrik Böving (Jul 29 2024 at 17:57):

And no there is not such a way. Note that FloatArray.data allocates. FloatArray is actually a real unboxed float array (similar to how ByteArray is a unboxed byte array), unlike Array which is an array of pointers for ABI reasons. That's why both calling data or doing the push implementation is suboptimal, both will allocate even if unnecessary.

Michal Wallace (tangentstorm) (Jul 29 2024 at 18:00):

Ah! In looking into this yesterday, I noticed that plain Array is written as a wrapper for List but also annotated with @[extern] something-or-other. Is the idea that the list-based implementation sort of a semantic definition for proofs, and then at compile time, that implementation gets replaced with a completely different implementation in C that the dev team is asserting has the same semantics?

Henrik Böving (Jul 29 2024 at 18:00):

Yes

Henrik Böving (Jul 29 2024 at 18:01):

Similar for FloatArray and ByteArray

Michal Wallace (tangentstorm) (Jul 29 2024 at 18:11):

Interesting.. and in FloatArray and ByteArray I see: -- TODO: avoid code duplication in the future after we improve the compiler. Does this TODO predate the macro system, or is there something planned for the compiler that we're still waiting on?

Henrik Böving (Jul 29 2024 at 18:12):

Well, ideally Array's ABI would be smart enough to not use an array of pointers to uint8_t when it can just be an array of uint8_t directly wouldn't it? That's what this comment is waiting for to happen.

Michal Wallace (tangentstorm) (Jul 29 2024 at 18:23):

Oh I see. So if Array X were more powerful, XArray probably wouldn't need to exist at all, provided X had some way of mapping directly to a chunk of raw memory. Then Array could use an array of some primitive C type under the hood if possible, and fall back to pointers if not.

I would be willing to implement this if someone could help get me oriented and tell me exactly what needs to be done.

(I have fairly extensive experience with this sort of thing. I work professionally with the J and K array languages, and have done a lot of work binding them to other native languages (rust, pascal, godot game engine)... I'm also pretty familiar with compilers)

Kim Morrison (Jul 29 2024 at 23:54):

Thanks for the offer Michal, but this is probably not a good first-Lean-internals project. For projects touching essential internals like this the FRO would either plan to do it internally, or in conjunction with a contributor with an already established track record of collaboration. There are lots of small Array adjacent things that could be better places to start. There is Array adjacent work happening at Batteries, and PRs adding doc-strings to Array functions in Lean would be great. Good benchmarks demonstrating the problems with the current implementation would also be helpful.

Michal Wallace (tangentstorm) (Jul 29 2024 at 23:59):

@Kim Morrison fair enough. I will take a look at the issues list after I wrap up my current project.

Wrenna Robson (Jul 30 2024 at 14:49):

@Kim Morrison out of interest: I've been doing a lot of work with some Array stuff recently and I would be interested in contributing to Batteries-level proofs or adding docstrings and the like. Is there a good place where these efforts are being coordinated or tracked?

Shreyas Srinivas (Jul 30 2024 at 14:56):

there is the #batteries channel.

Kim Morrison (Jul 30 2024 at 23:39):

At present little is being merged to Batteries, but I agree that channel is still a good place to discuss.

Kim Morrison (Jul 30 2024 at 23:40):

I'm soon going to start work on Array in Lean core, and am happy to have pointers to basic material that is conspicuously missing (whether present in Batteries or other libraries).

Shreyas Srinivas (Jul 31 2024 at 12:45):

Will the maintainer bottleneck in batteries be resolved soon?

Number Eighteen (Jul 31 2024 at 14:27):

Henrik Böving said:

Both of these implementations are inherently suboptimal because they allocate even when RC=1. For an optimal implementation you would want to basically do what Array.map is doing.

But the main loop in Array.map is about NonScalar; how would I adapt it to FloatArray?

Henrik Böving (Jul 31 2024 at 14:29):

The main point of the implementation of Array.map is not the non scalar trick, that's just an additional implementation detail because you have ref counted elements. The main point is that it uses set instead of push (or accessing the data, which is even worse). This means that when you have an Array with RC = 1 it will not get copied but instead modified in place. All implementations provided for FloatArray.map in this thread so far will forcibly allocate a new FloatArray which is not necessary.

Number Eighteen (Jul 31 2024 at 14:31):

Henrik Böving said:

The main point of the implementation of Array.map is not the non scalar trick, that's just an additional implementation detail because you have ref counted elements. The main point is that it uses set instead of push (or accessing the data, which is even worse). This means that when you have an Array with RC = 1 it will not get copied but instead modified in place. All implementations provided for FloatArray.map in this thread so far will forcibly allocate a new FloatArray which is not necessary.

Last time I tried using set instead of push in my code, I got tremendous slowdowns. I will try again. Also, is this relevant? https://github.com/leanprover/lean4/issues/4699

Number Eighteen (Jul 31 2024 at 14:39):

Right, I cannot get uget and uset to perform in the same vicinity as consecutive pushes; I just get tremendous slowdowns. I'll try to dig deeper and see if I am making mistakes or give a benchmark.

JJ (Aug 29 2024 at 21:07):

Henrik Böving said:

Well, ideally Array's ABI would be smart enough to not use an array of pointers to uint8_t when it can just be an array of uint8_t directly wouldn't it? That's what this comment is waiting for to happen.

i am curious: i have been running into a lot of stuff that is implemented on Array UInt8, but not ByteArray. should i bother with making PRs to lean/batteries, or is ByteArray expected to be replaced with an optimized Array UInt8 in the near future?

Henrik Böving (Aug 29 2024 at 21:22):

We don't even have a Byte type in Lean core so I don't know where you would've seen that.

JJ (Aug 29 2024 at 21:23):

sorry, my bad, i meant Array UInt8. i was working on some code with an abbrev Byte := UInt8 in it earlier today.

Henrik Böving (Aug 29 2024 at 21:23):

I can't comment on batteries. If you do know about code like this in core and can show that it makes a performance difference it might be interesting, otherwise probably just leave it alone.

Henrik Böving (Aug 29 2024 at 21:25):

but yeah looking it up in core there is literally no API that uses it apart from ByteArray itself

JJ (Aug 29 2024 at 21:30):

oh, i mean, would it be desired to implement Array functionality for ByteArray? i would think that if ByteArray is sticking around it might be helpful, but if the goal is to remove it (soonish) in favour of an optimized Array UInt8 then it probably wouldn't be worth the bother

JJ (Aug 29 2024 at 21:30):

given the lack of internal usage sounds like the answer is probably the latter

Henrik Böving (Aug 29 2024 at 21:31):

I don't think so. We're still figuring out the API for Array itself anyways

Last updated: May 02 2025 at 03:31 UTC