Zulip Chat Archive

Stream: general

Topic: if then else performance weirdness

Floris van Doorn (Jun 12 2021 at 23:55):

Can someone explain why one of the following four lines is slow and the rest is fast? (in Lean 3)

import logic.basic
open expr tactic
meta def ones : ℕ → expr
| 0     := `(1 : ℕ)
| (n+1) := `(%%(ones n) + %%(ones n) : ℕ)

#eval let x := ones 1000000 in if true then const `nat [] else x -- slow
#eval let x := ones 1000000 in if tt then const `nat [] else x   -- fast
#eval if true then const `nat [] else ones 1000000               -- fast
#eval if tt then const `nat [] else ones 1000000                 -- fast

My actual code is more like the following (I'm working on to_additive). The two functions my_replacement_fun are the same, except for inlining the let expression. However, evaluating the former takes seconds, and evaluating the latter is basically instant. If anyone has an idea what is going on and what rules I should follow to have performant code, I would be interested.

open expr tactic
meta def ones : ℕ → expr
| 0     := `(1 : ℕ)
| (n+1) := `(%%(ones n) + %%(ones n) : ℕ)

namespace expr
protected meta def my_replacement_fun (f : name → name) : expr → expr
| e := e.replace $ λ e _,
  match e with
  | const n ls := some $ const (f n) $ ls
  | app g x :=
    let new_x := my_replacement_fun x in
    if g.is_constant ∧ x.const_name = `nat then some $ g new_x else none
  | _ := none
  end

protected meta def my_replacement_fun' (f : name → name) : expr → expr
| e := e.replace $ λ e _,
  match e with
  | const n ls := some $ const (f n) $ ls
  | app g x :=
    if g.is_constant ∧ x.const_name = `nat then some $ g $ my_replacement_fun' x else none
  | _ := none
  end

end expr

set_option profiler true

run_cmd do
  let dict : name_map name := native.rb_map.of_list [(`has_one.one, `has_zero.zero)],
  let f := λ n, (dict.find n).get_or_else n,
  let t := (ones 10).my_replacement_fun f,
  -- let t := (ones 10).my_replacement_fun' f,
  trace t.hash,
  skip

Floris van Doorn (Jun 13 2021 at 00:01):

Ah, I found the earlier thread where I was confused by this: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/sanity.20check

Floris van Doorn (Jun 13 2021 at 00:18):

Ok, after reading the previous thread, I'm still confused about the difference between if true then ... else ... and if tt then ... else .... Both these conditions should be @[inline]. I guess I'll try to write my code with no global let (only in the branches where I need them).

Mario Carneiro (Jun 13 2021 at 06:56):

Here's my theory: Using let will cause the expression x to be evaluated first, which explains why (3) is faster than (1). However, let statements that are not used at all are deleted, and I believe this explains the difference between (2) and (1). This might seem odd since x is referenced in (2), but I think the lean inliner works out that if tt then t else e can be simplified to t, while the version with true requires inlining a few other things and probably whatever inliner runs before dead code elimination is not strong enough for this.

Mario Carneiro (Jun 13 2021 at 06:59):

Floris van Doorn said:

I guess I'll try to write my code with no global let (only in the branches where I need them).

By the way, one trick you can do if you want to lift a lazy let is to use a unit function:

#eval let x := λ _:unit, ones 1000000 in if true then const `nat [] else x ()  -- fast

Mario Carneiro (Jun 13 2021 at 07:06):

You can see the difference here:

set_option trace.compiler.optimize_bytecode true
def foo1 (t e : ℕ) := if true then t else e
-- [compiler.optimize_bytecode]  foo1 2
-- 0: scnstr #1
-- 1: cases2 4
-- 2: move 0
-- 3: ret
-- 4: move 1
-- 5: ret

def foo2 (t e : ℕ) := if tt then t else e
-- [compiler.optimize_bytecode]  foo2 2
-- 0: move 1
-- 1: ret

The bytecode foo1 says basically if tt then t else e while foo2 says t. There is no call to true.decidable, which is good, but it looks like it got tired after that and didn't do the if tt simplification like in foo2. As a result, even though the move 1 on line 4 of foo1 is dead, it is still present, which keeps the let statement in your example around.

Floris van Doorn (Jun 13 2021 at 16:59):

Thanks for your analysis!

Last updated: May 02 2025 at 03:31 UTC