Zulip Chat Archive
Stream: lean4
Topic: Elaboration time of `example : (List.range 500).length = 500
Scott Morrison (Aug 18 2021 at 01:49):
After Mario showed me how to time elaboration in Lean4, I did a quick experiment with apparently worrying results.
I timed (run elabCmd
10 times, averaging the results) example : (List.range 500).length = 500 := rfl
for various values of n = 500
, and obtained:
| n | ms
------------
| 100 | 67
| 200 | 357
| 300 | 1062
| 400 | 2171
| 500 | 2686
| 600 | 4985
| 700 | 7642
| 800 | 11261
| 900 | 15855
| 1000 | 21477
Notice this is clearly super-linear (roughly cubic). (Below n=100
it is quite close to linear, however.)
In Lean3, this remains linear up to n=4000
, where it is only taking 2400ms. (Shortly after this we start blowing the stack.)
When I mentioned this earlier @Mac said:
I suspect this probably has to do with memory concerns (like locality of reference and allocations/deallocations) in iterating through the list.
I appreciate this is a very artificial example: rfl
is a terrible proof here. But given how much slower this is than Lean3, do we need to worry about what's going on in this example? We do lots of terrible-proofs-by-rfl in mathlib3...
Mario Carneiro (Aug 18 2021 at 01:59):
Have you looked into how range
and length
are defined? I'm pretty sure they do not use the same definitions as lean 3
Mario Carneiro (Aug 18 2021 at 01:59):
it would be good to see whether the numbers change if you use the lean 3 definition of range and length
Leonardo de Moura (Aug 18 2021 at 02:01):
@Scott Morrison This is worth investigating. I don't have cycles to investigate right now.
If you still get a big discrepancy when using the same definitions in Lean 3 and 4, could you please add an issue on GitHub?
Scott Morrison (Aug 18 2021 at 02:08):
Okay, it's just the definitions... Replacing length
with
def List.length' {α : Type u} : List α → Nat
| nil => 0
| cons a as => Nat.succ (as.length')
means it runs in linear time, and indeed slightly faster (~ 2/3) than Lean3.
(List.range
is the same in Lean3 and Lean4.)
Scott Morrison (Aug 18 2021 at 02:14):
Curiously, backporting the "slow" definition from Lean4 to Lean3, Lean3 is still linear.
Scott Morrison (Aug 18 2021 at 02:14):
(but slower by a factor of about ~2 relative to the fast definition)
Scott Morrison (Aug 18 2021 at 02:14):
so there is still perhaps something to worry about.
Mario Carneiro (Aug 18 2021 at 02:28):
I think this is a reasonable argument in favor of using implementedBy
for List.length
so that the compiler can make use of the tail recursive definition and the kernel can use the naive recursive definition
Mario Carneiro (Aug 18 2021 at 02:29):
Have you tried #eval
with both versions? My guess is that List.length
outperforms List.length'
in the VM and in compiled code
Daniel Selsam (Aug 18 2021 at 02:32):
It looks like our good friend the depth/weight-aware expression hashing.
Mario Carneiro (Aug 18 2021 at 02:32):
I haven't heard this story
Daniel Selsam (Aug 18 2021 at 02:34):
Lean4 doesn't do this yet: https://github.com/leanprover/lean/blob/master/src/kernel/expr.cpp#L222-L223 I don't know Leo's current thoughts, but last we spoke about it he was considering using a few bits of the Expr.Data
for it.
Mario Carneiro (Aug 18 2021 at 02:35):
working it out on paper it seems like linear time should still be possible even with the tail recursive length function, although the intermediate terms get big
Daniel Selsam (Aug 18 2021 at 02:39):
This hash issue tends to arise when there are terms that look like f (f (f (f (f (f ...)))))
(in this case Nat.succ
). With bad luck, most terms will end up in a single hash bucket unless you hash the depth at every application.
Scott Morrison (Aug 18 2021 at 03:02):
When using #eval
, the tail-recursive version length
is about twice as fast as the "naive" length'
, at least as far out as n = 10000
. After that, the naive one blows up, while the tail-recursive one continues happily (and linearly) up to about n=2^28
.
Last updated: Dec 20 2023 at 11:08 UTC