Zulip Chat Archive
Stream: std4
Topic: tailrec pmap
James Gallicchio (Jul 04 2023 at 16:36):
I am using pmap in one of my projects that has a relatively large list constant. Since pmap is not tailrec it causes eval to stack overflow. I know this is one of the functions that should be compiled to the identity function, but thoughts on adding a tailrec implementation with a csimp lemma?
James Gallicchio (Jul 04 2023 at 16:43):
std4#176 in case it is desired
Mac Malone (Jul 04 2023 at 20:52):
@James Gallicchio  This inspired me and led me to a generalization of pmap. Like ite vs dite, we can define a memMap function that carries the knowledge that each mapped a is a \mem l and then pmap is just a specific application of that function. Below is a PoC. I am curious what you think of it.
namespace List
/--
`memMap` is like `map` but with a proof that `f` is only applied to elements
of the list.
-/
@[specialize] def memMap (l : List α) (f : (a : α) → a ∈ l → β) :  List β :=
  match l with
  | [] => []
  | a::as =>
    f a (mem_cons.mpr <| .inl rfl) ::
    memMap as fun a h => f a (mem_cons.mpr <| .inr h)
/-- Tail-recursive version of `memMap`. -/
@[inline] def memMapTR (l : List α) (f : (a : α) → a ∈ l → β) :  List β :=
  loop l f []
where
  @[specialize] loop (l : List α) (f : (a : α) → a ∈ l → β) (acc : List β) : List β :=
    match l with
    | [] => acc.reverse
    | a::as =>
      loop as
      (fun a h => f a <| mem_cons.mpr <| .inr h)
      (f a (mem_cons.mpr <| .inl rfl) :: acc)
@[csimp] theorem memMap_eq_memMapTR : @memMap = @memMapTR := by
  funext α β l f
  unfold memMapTR
  suffices ∀ acc, acc.reverse ++ memMap l f = memMapTR.loop l f acc by
    have := this []
    simp at this
    exact this
  intro acc
  induction l generalizing acc with
  | nil => simp [memMap, memMapTR.loop]
  | cons x xs ih =>
    simp [memMap, memMapTR.loop, ←ih]
@[inline] def pmap {p : α → Prop} (f : (a : α) → p a → β) (l : List α) (h : ∀ a ∈ l, p a) : List β :=
  l.memMap fun a mh => f a (h a mh)
Mac Malone (Jul 04 2023 at 20:57):
I guess memMap could also be called dmap to fit with dite, but I am not sure if that is appropriate.
Mario Carneiro (Jul 04 2023 at 20:57):
pmap, memMap and attach are all inter-definable
Mario Carneiro (Jul 04 2023 at 20:59):
performance-wise, I guess we should bias toward using attach, since that one is always O(1) while pmap is more like a map equivalent
Mac Malone (Jul 04 2023 at 21:00):
@Mario Carneiro Oh, I did not see attach. Yeah, everything should probably be phrased with attach as that can be @[extern] as id.
Mario Carneiro (Jul 04 2023 at 21:00):
it can also be implemented_by as unsafeCast
Mac Malone (Jul 04 2023 at 21:02):
It would be neat if the new compiler could have @[id] attribute that works like unsafeCast/@[extern] does a type-safe check to ensure the data layouts are, in fact, identitical.
Mario Carneiro (Jul 04 2023 at 21:02):
if it gets smart enough to do that, it may as well just do the optimization without annotation
Mac Malone (Jul 04 2023 at 21:05):
@Mario Carneiro @[id] would just require checking the function type signatures (at a data level), the compiler may not have enough information to determine the if the function is actually a no-op. Furthermore, it like e.g. @{musttail] would allow for a quick user-level check that an optimization is occurring as expected.
Mac Malone (Jul 04 2023 at 21:09):
@James Gallicchio from Mario's note, it seems like the best implementation of pmap is thus:
@[inline] def pmap {p : α → Prop} (f : ∀ a, p a → β) (l : List α) (h : ∀ a ∈ l, p a) : List β :=
  l.attach.map fun ⟨a, mh⟩ => f a (h a mh)
Mac Malone (Jul 04 2023 at 21:10):
That way we get the tail-recursion for free (and the function is essentially identical to just map).
James Gallicchio (Jul 04 2023 at 21:11):
oh totally. are we cool with just having the unsafe implementedBy in Std? cuz I'll PR that instead in a heartbeat :P
Mac Malone (Jul 04 2023 at 21:12):
It is used a lot in core, so I would assume it is legal in Std?
Mac Malone (Jul 04 2023 at 21:19):
@Mario Carneiro?
Mario Carneiro (Jul 04 2023 at 21:19):
yup
Mario Carneiro (Jul 04 2023 at 21:19):
obviously any implemented_by need to be sound
James Gallicchio (Jul 04 2023 at 21:20):
will PR
Mario Carneiro (Jul 04 2023 at 21:20):
note, what mac has written is the implementation of pmap, not the definition of pmap which should remain as it is
Mario Carneiro (Jul 04 2023 at 21:21):
i.e. both pmap and attach will get implemented_by, with attach implemented as id and pmap implemented using map and attach
James Gallicchio (Jul 04 2023 at 21:22):
we could do a csimp lemma for pmap since it's provably equivalent
Mario Carneiro (Jul 04 2023 at 21:22):
true
Mac Malone (Jul 04 2023 at 21:25):
Mario Carneiro said:
note, what mac has written is the implementation of
pmap, not the definition ofpmapwhich should remain as it is
I am curious about this Mario. What makes attach + map less useful for proofs than pmap's current implementation? I would think they would be pretty similar when unfolded.
James Gallicchio (Jul 04 2023 at 21:27):
attach just unfolds less nicely iirc
James Gallicchio (Jul 04 2023 at 21:28):
like we probably still want a simp lemma that turns it into a pmap
Mario Carneiro (Jul 04 2023 at 21:29):
yeah, pmap has a better recursive structure for reasoning since p and f can stay fixed
Mario Carneiro (Jul 04 2023 at 21:29):
attach is just horrible to work with recursively, you have to cast it on every iteration
Mac Malone (Jul 04 2023 at 22:02):
Ah, make sense then. :thumbs_up: I would have thought there was some way to easily simp away differences, but apparently not.
James Gallicchio (Jul 05 2023 at 01:39):
Pushed a new version. It's a little bit of a hacky order, I define attach in terms of pmap still, but with an unsafe implemented_by, and then I prove a csimp lemma that substitutes attach+map for pmap.
James Gallicchio (Jul 06 2023 at 05:20):
If possible I'd like to get this merged soonish so it's upstreamed to mathlib soonish!
Last updated: May 02 2025 at 03:31 UTC