Zulip Chat Archive
Stream: lean4
Topic: recOn and pattern matching, computability
Mario Weitzer (Aug 06 2024 at 08:17):
I'm transitioning from Lean 3 to Lean 4 at the moment and there are a few things which irritate me. I was reading this guide, which says:
The map function is even more tedious to define by hand than the tail function. We encourage you to try it, using recOn, casesOn and noConfusion.
The map function in the text is defined using pattern matching in the following way:
namespace hide
inductive Vector (α : Type u) : Nat → Type u
  | nil : Vector α 0
  | cons : α → {n : Nat} → Vector α n → Vector α (n+1)
deriving Repr
namespace Vector
def map2 (f : α → β → γ) : {n : Nat} → Vector α n → Vector β n → Vector γ n
  | 0, nil, nil => nil
  | n+1, cons a as, cons b bs => cons (f a b) (map2 f as bs)
end Vector
end hide
which I then implemented as follows:
def mapAux
    {α : Type u} {β : Type v} {γ : Type w} (f : α → β → γ) (m : Nat) (v : Vector α m) :
    (n : Nat) → (w : Vector β n) → m = n → Vector γ n :=
  Vector.recOn
    (motive := fun m _ => (n : Nat) → Vector β n → m = n → Vector γ n)
    v
    (fun _ _ heq => heq ▸ nil)
    (fun a m1 _ ih _ w =>
      Vector.casesOn
        (motive := fun n _ => m1 + 1 = n → Vector γ n)
        w
        (fun _ => nil)
        (fun b n1 bs heq => cons (f a b) (ih n1 bs (Nat.succ.inj heq))))
def map1 (f : α → β → γ) : {n : Nat} → Vector α n → Vector β n → Vector γ n :=
  fun v w => mapAux f _ v _ w rfl
I then tried to evaluate the function and to my surprise Lean tells me that
code generator does not support recursor 'hide.Vector.recOn' yet, consider using 'match ... with' and/or structural recursion
So I have to mark both mapAux and map1 as noncomputable. Evaluating
#eval map2 (fun (a : Nat) (b : Nat) => 5 * a + 3 * b) (cons 2 (cons 1 nil)) (cons 2 (cons 1 nil))
--hide.Vector.cons 16 (hide.Vector.cons 8 (hide.Vector.nil))
yields the correct result but
#eval map1 (fun (a : Nat) (b : Nat) => 5 * a + 3 * b) (cons 2 (cons 1 nil)) (cons 2 (cons 1 nil))
doesn't work. I then implemented the same function in Lean 3 which works perfectly fine:
--Lean 3 version (cases_on and rec_on cannot be evaluated in Lean 4)
set_option auto_completion.max_results 10000
universes u v w
inductive Vector (α : Type u) : nat → Type u
| nil {} : Vector 0
| cons (a : α) {n : ℕ} (v : Vector n) : Vector (n+1)
infixr ` ::: `:67 := Vector.cons
notation `[ᵥ` l:(foldr `, ` (h t, Vector.cons h t) Vector.nil `]ᵥ`) := l
protected def Vector.repr_aux {α : Type u} [has_repr α] : bool → ∀ (n : nat), Vector α n → string
| b _ [ᵥ]ᵥ := ""
| tt _ (x:::xs) := repr x ++ Vector.repr_aux ff _ xs
| ff _ (x:::xs) := ", " ++ repr x ++ Vector.repr_aux ff _ xs
protected def Vector.repr {α : Type u} [has_repr α] : ∀ (n : nat), Vector α n → string
| _ [ᵥ]ᵥ := "[ᵥ]ᵥ"
| _ (x:::xs) := "[ᵥ" ++ Vector.repr_aux tt _ (x:::xs) ++ "]ᵥ"
instance {α : Type u} [has_repr α] (n : nat) : has_repr (Vector α n) :=
⟨Vector.repr n⟩
namespace Vector
def mapAux {α : Type u} {β: Type v} {γ : Type w} (f : α → β → γ) (m : nat) (v : Vector α m) :
∀ (n : nat), ∀ (w : Vector β n), m = n → Vector γ n :=
@Vector.rec_on
_
( λ m v, ∀ n, Vector β n → m = n → Vector γ n)
_
v
( λ n w heq, eq.rec_on heq nil)
( λ a m1 as ih n w,
@Vector.cases_on
_
( λ n w, m1 + 1 = n → Vector γ n)
_
w
( λ heq, nil)
( λ b n1 bs heq, cons (f a b) (ih n1 bs (nat.succ.inj heq))))
/-
begin
apply Vector.rec_on v,
{ intros n w heq,
exact eq.rec_on heq nil},
{ intros a m1 as ih n w,
apply Vector.cases_on w,
{ intros heq,
exact nil,},
{ intros b n1 bs heq,
exact cons (f a b) (ih n1 bs (nat.succ.inj heq)),},},
end
-/
def map1 {α : Type u} {β: Type v} {γ : Type w} (f : α → β → γ) :
∀ {n : nat}, Vector α n → Vector β n → Vector γ n :=
λ _ v w, @mapAux _ _ _ f _ v _ w rfl
#eval map1 (λ (a : nat) (b : nat), 5 * a + 3 * b) (cons 2 (cons 1 nil)) (cons 2 (cons 1 nil))
--[ᵥ16, 8]ᵥ
end Vector
I found this post from the chat archive which is somewhat above my pay grade, but I take it that evaluation of recursors isn't implemented yet and isn't a priority. Does this mean that the pattern-matching implementation isn't based on recursors but is implemented independently? In Lean 3 everything was translated down to recursors in the end, but in Lean 4 the whole relation to type theory seems completely opaque to me, which becomes especially obvious when I try (and fail) to prove the following:
example (f : α → β → γ) {n : Nat} (v : Vector α n) (w : Vector β n) : map1 f v w = map2 f v w :=
Does anybody know how this works? Also, if there are two functions which are identical, one of which is computable and the other is not (in the way it is defined), wouldn't it be nice to have a way to let Lean know that it can use the computable version to evaluate the uncomputable one? Shouldn't there even be a general theorem that for every recursor there is a pattern-matching alternative which is functionally identical and can thus be used to evaluate the recursor?
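(Lean 4 does in fact have an escape hatch along these lines: the @[implemented_by] attribute. A minimal sketch, assuming map2 is declared before map1 and both have exactly the same signature; note the attribute is trusted by the compiler, not proved correct:)
-- Sketch: the code generator runs map2's compiled code whenever map1 is
-- called at runtime. The logical (recursor-based) definition of map1 is
-- unchanged, and no noncomputable marking should be needed anymore.
@[implemented_by map2]
def map1 (f : α → β → γ) : {n : Nat} → Vector α n → Vector β n → Vector γ n :=
  fun v w => mapAux f _ v _ w rfl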
Henrik Böving (Aug 06 2024 at 08:22):
When you do pattern matching, your definition gets translated into something called a "PreDefinition"; at this point the code path in the Lean compiler splits up:
- There is the path that goes to the code generator; this path does not desugar pattern matching into recursors and instead keeps it around for generating nice and performant code right away.
- There is the path that goes into the "logic" of Lean, i.e. what you see when you interact with the proof ecosystem etc. This path does perform the desugaring into recursors etc.
I did explain in a couple of other posts here why the compiler does not do the reduction to recursors, but if you don't want to read up on that, the TL;DR is that it causes unpredictable and unexpected performance problems.
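(One way to see the second path concretely: printing a pattern-matching definition shows the desugared, recursor-based term. A small sketch; the exact output is abbreviated here and varies by Lean version:)
-- The "logic" view of map2: the match is elaborated away, and the
-- course-of-values recursor brecOn shows up underneath.
#print hide.Vector.map2
-- fun f x v w => Vector.brecOn ... (fun ... => match ..., ... with ...) ...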
Mario Weitzer (Aug 06 2024 at 08:25):
I see, thanks for the very fast reply! That leaves the question how one proves the example from my question.
Henrik Böving (Aug 06 2024 at 08:27):
theorem foo (f : α → β → γ) {n : Nat} (v : Vector α n) (w : Vector β n) : map1 f v w = map2 f v w := by
  match n, v, w with
  | 0, nil, nil => rfl
  | n + 1, cons a as, cons b bs =>
    rw [map2]
    rw [← foo]
    rfl
for example like so
Mario Weitzer (Aug 06 2024 at 08:32):
Thanks a lot! It will take me a while to get used to this way of doing things...
Henrik Böving (Aug 06 2024 at 08:34):
I mean the way to do things is to not use recursors^^
Eric Wieser (Aug 06 2024 at 08:42):
Can compile_inductive% be used to make the original example computable?
Henrik Böving (Aug 06 2024 at 08:44):
Eric Wieser said:
Can compile_inductive% be used to make the original example computable?
How does compile_inductive% even work? Does it just run into the unpredictable performance issues and ignore this?
Mario Weitzer (Aug 06 2024 at 08:48):
Henrik Böving said:
I mean the way to do things is to not use recursors^^
It seems so, but I really like recursors ;). It always seemed extremely elegant to me that this basic concept is sufficient to implement basically all of mathematics, which is why I always tried to understand things in term mode. It's just somewhat irritating to me that if I print your proof (or map2, for that matter) I see pattern matching, which I had so far associated with a tactic-mode approach.
theorem hide.Vector.foo.{u_3, u_2, u_1} : ∀ {α : Type u_1} {β : Type u_2} {γ : Type u_3} (f : α → β → γ) {n : Nat}
    (v : Vector α n) (w : Vector β n), map1 f v w = map2 f v w :=
fun {α} {β} {γ} f {n} v w ↦
  Vector.brecOn (motive := fun {n} v ↦ ∀ (w : Vector β n), map1 f v w = map2 f v w) v
    (fun {n} v f_1 w ↦
      (match (motive :=
          ∀ (n : Nat) (v : Vector α n) (w : Vector β n),
            Vector.below (motive := fun {n} v ↦ ∀ (w : Vector β n), map1 f v w = map2 f v w) v →
              map1 f v w = map2 f v w)
          n, v, w with
        | 0, nil, nil => fun x ↦ Eq.refl (map1 f nil nil)
        | Nat.succ n, cons a as, cons b bs => fun x ↦
          Eq.mpr (id (congrArg (fun _a ↦ map1 f (cons a as) (cons b bs) = _a) (hide.Vector.map2._eq_2 f n a as b bs)))
            (Eq.mpr (id (congrArg (fun _a ↦ map1 f (cons a as) (cons b bs) = cons (f a b) _a) (Eq.symm (x.fst.fst bs))))
              (Eq.refl (map1 f (cons a as) (cons b bs)))))
        f_1)
    w
I clearly need to think about this more.
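(For context: brecOn is the "below" recursor Lean auto-generates for every inductive type; both structural recursion and compiled matches elaborate to it. Its shape for this Vector type is roughly the following; a paraphrase, not exact output:)
#check @hide.Vector.brecOn
-- {α : Type u} → {motive : (n : Nat) → Vector α n → Sort v} →
--   {n : Nat} → (v : Vector α n) →
--   ((n : Nat) → (v : Vector α n) → Vector.below v → motive n v) →
--   motive n v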
Mario Weitzer (Aug 06 2024 at 08:52):
Another thing is that I need to mark something as noncomputable which I know to be perfectly computable, which just feels wrong to me. I always understood noncomputable to be a theoretical assessment rather than an assessment of a limitation of Lean.
Henrik Böving (Aug 06 2024 at 08:54):
No, noncomputable has nothing to do with a theoretical assessment and everything to do with what the Lean compiler can (and wants to) translate into a binary.
Henrik Böving (Aug 06 2024 at 08:54):
Though sometimes that coincides with things that are also theoretically noncomputable^^
Edward van de Meent (Aug 06 2024 at 08:59):
isn't it possible to make Lean generate the pattern match from the recursor?
Henrik Böving (Aug 06 2024 at 09:00):
Yes, but as I explained above this is not what you want to do; let me dig out my other posts on recursors
Henrik Böving (Aug 06 2024 at 09:00):
Edward van de Meent (Aug 06 2024 at 09:04):
so to summarize, in cases where you don't (or hardly) use the recursive call, the performance still goes down the drain?
Mario Weitzer (Aug 06 2024 at 09:07):
Henrik Böving said:
No, noncomputable has nothing to do with a theoretical assessment and everything to do with what the Lean compiler can (and wants to) translate into a binary.
Yes, but the "and wants" part seems to be a new design choice in Lean 4, because in Lean 3 noncomputable only ever seemed necessary when something was obviously noncomputable in a theoretical sense, or when Lean needed some further argument to believe that something was indeed computable. Sure, sometimes evaluation took forever or even failed for no obvious reason, but I always understood why something like this would happen (in an abstract sense: Lean just followed some inappropriate and slow path of beta reduction). But now the reason something is noncomputable comes down to design choices, which at first, superficial glance seem to call into question the whole enterprise of founding mathematics on type theory in the first place, if one of the most fundamental constructions (recursors) isn't "fully supported" (doesn't compute). But I will read your post; maybe I warm up to this approach and the reasons for it.
Edward van de Meent (Aug 06 2024 at 09:08):
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want/need to write performant code...
Mario Weitzer (Aug 06 2024 at 09:12):
Edward van de Meent said:
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want to write performant code and just code that works.
Exactly, I completely agree (possibly due to insufficient understanding of the issue). But there seems to be a bias towards establishing Lean as a programming language instead of a proof assistant. I mostly use it in the latter role, and in this situation performance doesn't matter to me. It was just nice to be able to do sanity checks by evaluating stuff.
Henrik Böving (Aug 06 2024 at 09:22):
Edward van de Meent said:
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want to write performant code and just code that works.
No, the language should be designed in such a way that performance behaves in a way that is predictable. Compiling recursors does not have predictable performance. For something like List.isEmpty the user would expect O(1) and gets O(n); this is not okay. For something like a tree structure it gets even worse: you expect O(1) but recurse through the entire tree. This can be fixed if someone puts in the work of writing the proper optimization pass for the compiler; it is not an inherent limitation of our design for Lean 4. Until then it simply remains something that is disabled to protect the user from unexpected behavior. Having language primitives that behave as you would expect them to is key to being able to reliably write performant code in the first place.
Mario Weitzer said:
Edward van de Meent said:
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want to write performant code and just code that works.
Exactly, I completely agree (possibly due to insufficient understanding of the issue). But there seems to be a bias towards establishing Lean as a programming language instead of a proof assistant.
Yes, Lean is written in Lean, and Lean is being used by, for example, AWS to write executable software. So as you can see, it is already a programming language. This does not mean that we wish to move away from being a proof assistant in the slightest. Lean 4 has tons of additional improvements over Lean 3 for proof-assistant use, both performance-wise and feature-wise.
Mario Weitzer said:
Edward van de Meent said:
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want to write performant code and just code that works.
I mostly use it in the latter role, and in this situation performance doesn't matter to me. It was just nice to be able to do sanity checks by evaluating stuff.
Not being able to #eval recursors does not mean you cannot do this; you can very much still write #reduce and rely on the kernel reduction engine to give you a reduced term if it is just for some quick experimentation. It's just that this will of course not scale if you want to experiment with large terms.
But now the reason something is noncomputable comes down to design choices, which at first, superficial glance seem to call into question the whole enterprise of founding mathematics on type theory in the first place, if one of the most fundamental constructions (recursors) isn't "fully supported" (doesn't compute).
I don't see why this should apply. People who write mathematics that is noncomputable for some reason are not the same people who wish to execute performant code, in my experience. So those people can still rely on #reduce if they really feel like evaluating something.
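(For the definitions at the top of this thread, that looks like the following; a sketch, and the printed form of the result may differ:)
-- map1 is rejected by the code generator, but the kernel happily unfolds
-- recOn, so #reduce still evaluates it:
#reduce map1 (fun (a : Nat) (b : Nat) => 5 * a + 3 * b) (cons 2 (cons 1 nil)) (cons 2 (cons 1 nil))
-- expected: cons 16 (cons 8 nil), matching the map2 output above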
Eric Wieser (Aug 06 2024 at 09:23):
Eric Wieser said:
Can compile_inductive% be used to make the original example computable?
Yes, this makes it work:
import Mathlib
namespace hide
inductive Vector (α : Type u) : Nat → Type u
  | nil : Vector α 0
  | cons : α → {n : Nat} → Vector α n → Vector α (n+1)
deriving Repr
compile_inductive% Vector -- ADDED
namespace Vector
def map2 (f : α → β → γ) : {n : Nat} → Vector α n → Vector β n → Vector γ n
  | 0, nil, nil => nil
  | n+1, cons a as, cons b bs => cons (f a b) (map2 f as bs)
def mapAux
    {α : Type u} {β : Type v} {γ : Type w} (f : α → β → γ) (m : Nat) (v : Vector α m) :
    (n : Nat) → (w : Vector β n) → m = n → Vector γ n :=
  Vector.recOn
    (motive := fun m _ => (n : Nat) → Vector β n → m = n → Vector γ n)
    v
    (fun _ _ heq => heq ▸ nil)
    (fun a m1 _ ih _ w =>
      Vector.casesOn
        (motive := fun n _ => m1 + 1 = n → Vector γ n)
        w
        (fun _ => nil)
        (fun b n1 bs heq => cons (f a b) (ih n1 bs (Nat.succ.inj heq))))
def map1 (f : α → β → γ) : {n : Nat} → Vector α n → Vector β n → Vector γ n :=
  fun v w => mapAux f _ v _ w rfl
#eval map1 (fun (a : Nat) (b : Nat) => 5 * a + 3 * b) (cons 2 (cons 1 nil)) (cons 2 (cons 1 nil))
end Vector
end hide
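(Assuming compile_inductive% does what it promises, the #eval at the end should now print the same value that map2 produced earlier in the thread:)
-- expected output, as with map2 above:
-- hide.Vector.cons 16 (hide.Vector.cons 8 (hide.Vector.nil))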
Eric Wieser (Aug 06 2024 at 09:24):
Henrik Böving said:
Until then it simply remains something that is disabled to protect the user from unexpected behavior.
I think you could get the best of both worlds with a warning emitted whenever the compiler compiles a recursor, and the inefficient code generated
Eric Wieser (Aug 06 2024 at 09:25):
This is after all just a silly exercise in #tpil, and refusing to compile it is just annoying for the reader who is trying to learn about the difference between match and recursors
Mario Weitzer (Aug 06 2024 at 09:28):
Henrik Böving said:
Not being able to #eval recursors does not mean you cannot do this; you can very much still write #reduce and rely on the kernel reduction engine to give you a reduced term if it is just for some quick experimentation. It's just that this will of course not scale if you want to experiment with large terms.
This satisfies me (almost) completely. I didn't think of #reduce but assumed that it would also be blocked by the noncomputable definition. Thanks for explaining this in so much detail!
Mario Weitzer (Aug 06 2024 at 09:31):
Eric Wieser said:
I think you could get the best of both worlds with a warning emitted whenever the compiler compiles a recursor, and the inefficient code generated
I agree, to me this would be a cleaner and more useful solution.
Mario Weitzer (Aug 06 2024 at 09:35):
Eric Wieser said:
This is after all just a silly exercise in #tpil, and refusing to compile it is just annoying for the reader who is trying to learn about the difference between match and recursors
Maybe a hint in the description of the exercise that the resulting function cannot be #eval'ed (with a short explanation) but can be #reduce'ed would be a good idea?
Mario Weitzer (Aug 06 2024 at 09:38):
Henrik Böving said:
I don't see why this should apply. People who write mathematics that is noncomputable for some reason are not the same people who wish to execute performant code, in my experience. So those people can still rely on #reduce if they really feel like evaluating something.
Maybe I'm a rare exception (wouldn't be the first time) in that I really like to do fundamental mathematics but also love to program. The revelation that in Lean I could "compile and execute my papers" was just a dream come true to me!
Edit: And the fact that I could completely ignore the cultural and subjective notion of "elegance".
Henrik Böving (Aug 06 2024 at 09:39):
So you actively use recursors in your papers and that is what is stopping you from compiling them?
Mario Weitzer (Aug 06 2024 at 09:47):
No, as I said, I'm only just exploring Lean 4. I certainly have used recursors in my previous work in Lean 3 (which I plan to translate to Lean 4) and I just wanted to understand what the new best practice is and how to handle this new restriction. After all your explanations (in particular, seeing how easy a proof of equality of functions is and that I can still use #reduce) I feel much less worried about this. I still think that having recursors evaluate with a possible warning about performance would be better, and having evaluation of recursors properly implemented would be even better than that. I still very much enjoy Lean 4 for what it is now!
Patrick Massot (Aug 07 2024 at 09:01):
Mario Weitzer said:
But now the reason something is noncomputable comes down to design choices, which at first, superficial glance seem to call into question the whole enterprise of founding mathematics on type theory in the first place, if one of the most fundamental constructions (recursors) isn't "fully supported" (doesn't compute).
I think this very strong assertion is based on an extremely specialized use of the word “mathematics”. One can argue we’ve been formalizing quite a bit of mathematics in Lean without ever explicitly using this kind of computation. The benefits of founding mathematics on type theory go way beyond that very particular style of proof by computation.
Mario Carneiro (Aug 07 2024 at 23:19):
Note, when it says not "fully supported" it means this in the most literal sense: people have not put in the time to make it work properly, but there is no fundamental reason this can't be done. Honestly, I think this has led to a lot of confusion over the years, because it's basically just a todo in the Lean source code together with a deprioritization, making it difficult to see it fixed in short order
Mario Carneiro (Aug 07 2024 at 23:22):
@Henrik Böving said:
Edward van de Meent said:
it feels to me like the user should be the one to judge if the performance is worth it, not the language... but maybe that is because i hardly ever want to write performant code and just code that works.
No, the language should be designed in such a way that performance behaves in a way that is predictable. Compiling recursors does not have predictable performance. For something like List.isEmpty the user would expect O(1) and gets O(n); this is not okay. For something like a tree structure it gets even worse: you expect O(1) but recurse through the entire tree. This can be fixed if someone puts in the work of writing the proper optimization pass for the compiler; it is not an inherent limitation of our design for Lean 4. Until then it simply remains something that is disabled to protect the user from unexpected behavior. Having language primitives that behave as you would expect them to is key to being able to reliably write performant code in the first place.
This is incorrect, I think. You can't interpret a call to a rec function as a regular function, and doing so is what leads to the surprising evaluation order; but Lean already has to do special things to compile casesOn applications, and it could just do the same thing for rec applications and get the right evaluation order. This is really only just some unimplemented code path; it's not a fundamental design issue in any sense.
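(A minimal sketch of the asymmetry Mario describes, with hypothetical names and the behavior as of this thread: casesOn applications are compiled specially, while plain rec applications are rejected:)
-- casesOn compiles: the code generator turns it into a single tag dispatch.
def headOr (xs : List Nat) (d : Nat) : Nat :=
  List.casesOn (motive := fun _ => Nat) xs d (fun a _ => a)
#eval headOr [1, 2] 0  -- 1
-- rec does not compile today, hence the noncomputable marking; the kernel
-- still reduces it, though, so #reduce works:
noncomputable def lenRec (xs : List Nat) : Nat :=
  List.rec (motive := fun _ => Nat) 0 (fun _ _ ih => ih + 1) xs
#reduce lenRec [1, 2]  -- 2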