Zulip Chat Archive

Stream: lean4

Topic: Defining functions differently for efficiency


Thomas Murrills (Jul 09 2023 at 10:17):

There are often several ways to write a function:

def f1 : A  B  C
| a, b => body

def f2 : B  A  C
| b, a => body

def f3 : A  B  C
| a => fun b => body

def f4 (a : A) : B  C
| b => match a with | a => body

def f5 (a : A) : B  C := match a with | a => fun b => body

def f6 (a : A) (b : B) : C := match a, b with | a, b => body

Imagine that there may be case splitting on a and b so that the match statements aren't trivial. Do these wind up being different internally or in terms of efficiency? Should one be preferred over another in certain circumstances?

I realize this is vague, but I was wondering if there were any vague rules of thumb, or, better, some general way to think about how these might differ underneath the hood. I also thought it might be an interesting conversation topic and might provide a small window into how lean works under the hood. :)

Henrik Böving (Jul 09 2023 at 10:21):

I would suspect that f1 and f2 behave the same, there might be some veeeeeeeery subtle effects regarding cache? That is if one argument still happens to be on a cache line and the other doesnt you could measure a difference but that seems unlikely to be important.

f1 and f3 get compiled exactly the same way.

I don't quite understand how you wish to compare f4-f6, depending on how the case splits look exactly they could be completely different.

Henrik Böving (Jul 09 2023 at 10:22):

But if you are optimizing at this level you have very different issues. The majority of performance issues that you see in Lean code are usually:

  1. Algorithmic in nature
  2. Violate linearity on linearly efficient data structures
  3. Boxing related

Thomas Murrills (Jul 09 2023 at 10:31):

I see, thanks! Also, sorry, I should have said to imagine that all matching against a and b might involve case splits, not just the match statements (so whatever case splits you'd do in e.g. f4, you'd also do in e.g. f1). But it's good to know that this is usually not the source of any performance issues!

Thomas Murrills (Jul 09 2023 at 10:32):

Though, how might boxing hurt your performance? (I only just learned what boxing is now, so apologies if this is a standard issue across programming languages)

Henrik Böving (Jul 09 2023 at 11:52):

Say that for example you have an Array UInt32, given the fact that UInt32 is unboxed in Lean you would expect that in order to access the value you can just index the array and you have the value right? But this is not the case, the values that Array manages are boxed. THat means you have to follow a pointer to get to the value. Following a pointer means access to memory -> potential cache misses -> slow.

Thomas Murrills (Jul 09 2023 at 12:38):

Ahh, gotcha. Makes sense! :) I can see how it might be hard to anticipate what’s boxed and what’s not…


Last updated: Dec 20 2023 at 11:08 UTC