Zulip Chat Archive

Stream: lean4

Topic: Metamath verifier in Lean 4


Mario Carneiro (May 07 2021 at 15:01):

This is just a little project to see how well lean can handle high performance programs. I have implemented a Metamath verifier, which can verify set.mm in 10-12 seconds, which is not bad at all considering the reference implementation in C takes 9.2 seconds. A few things I discovered while optimizing:

  • You have to be very careful not to share the main data structures. I didn't find any option for debugging this, but I would just stick a if (!lean_is_exclusive(x_1)) abort() into the generated C file in appropriate locations and go at it with gdb. Some of the compiler passes (the csimp pass got me a few times) can swap the order of operations such that an unshared value becomes shared during a function call, so checking the result is important.
  • Tail call optimization only works when a function is self-recursive. Mutually recursive tail calls are not supported, and this includes when one of the functions is a lambda inside the main function. do notation can introduce these lambdas so they don't need to be explicitly in the source either. The C compiler is generally bad at doing tail call optimization (not really sure why), so a failed TCO generally means that the program will crash with a stack overflow after running long enough. This is fixed by making sure that the whole state machine (at least the parts relevant to control flow) goes in one function.
  • Lean 4 likes functions of the form A -> B -> A, that is, non-monadic state transformation functions, because it will reuse the space used to store A. This requires putting lots of things into the state that would normally be in the output, like error reporting via the Except monad. I got the idea for how to structure the program to work with this design from the lean 4 parser.

I think the code itself is pretty nice looking and easy to write considering what it does, so kudos to the Lean 4 devs for achieving this kind of performance in a functional programming language.

Sebastian Ullrich (May 07 2021 at 15:16):

Nice work! We knew Lean 4 was fast compared to other functional languages, but a favorable comparison to C is even more heartening to see :) .

Sebastian Ullrich (May 07 2021 at 15:25):

Some minor notes:

  • We have dbgTraceIfShared for debugging reuse
  • For clang we could soon use musttail to force general TCO without implementing it ourselves
  • Why do you need the elan commands when you have a pinned version in your leanpkg.toml?
  • Have you tried using -U LEAN_MULTI_THREAD? Though it might not make a difference because branch predictors are simply too good at their job.

Mario Carneiro (May 07 2021 at 15:26):

Why do you need the elan commands when you have a pinned version in your leanpkg.toml?

I probably don't, but I'm no expert on usage. Those commands are just what I do

Mario Carneiro (May 07 2021 at 15:27):

How do I pass leanc flags when using leanmake bin?

Mario Carneiro (May 07 2021 at 15:27):

I've been calling leanc directly which is a bit of a drag

Mario Carneiro (May 07 2021 at 15:32):

-U LEAN_MULTI_THREAD brought it from 11.5 s to 9.9 s, although the noise is also +-1 s so it's hard to tell if it is statistically significant

Sebastian Ullrich (May 07 2021 at 15:33):

Untested: leanpkg build bin LEANC_OPTS+="-U LEAN_MULTI_THREAD"

Sebastian Ullrich (May 07 2021 at 15:34):

Mario Carneiro said:

-U LEAN_MULTI_THREAD brought it from 11.5 s to 9.9 s, although the noise is also +-1 s so it's hard to tell if it is statistically significant

That is a pretty large variance for a single-threaded program. Are you using something like https://github.com/sharkdp/hyperfine?

Mario Carneiro (May 07 2021 at 15:35):

configuring mm-lean4 0.1
> sh -c "/home/mario/.elan/toolchains/leanprover-lean4-nightly-2021-05-05/bin/leanmake" LEAN_OPTS="" LEAN_PATH="././build" bin LEANC_OPTS+=-U LEAN_MULTI_THREAD >&2    # in directory .
make: Nothing to be done for 'bin'.
make: *** No rule to make target 'LEAN_MULTI_THREAD'.  Stop.
uncaught exception: external command exited with status 2

Sebastian Ullrich (May 07 2021 at 15:35):

I hate shell escaping :)

Sebastian Ullrich (May 07 2021 at 15:36):

I guess you can try leanpkg build bin 'LEANC_OPTS+="-U LEAN_MULTI_THREAD"', though this should be fixed in leanpkg

Mario Carneiro (May 07 2021 at 15:37):

hyperfine looks pretty nice. I've been using perf and eyeball grease(?)

Mario Carneiro (May 07 2021 at 15:42):

hm, I think it was statistically insignificant. Without -U LEAN_MULTI_THREAD:

Benchmark #1: build/bin/Metamath /home/mario/Documents/metamath/mm/set.mm
  Time (mean ± σ):     13.251 s ±  0.627 s    [User: 13.129 s, System: 0.108 s]
  Range (min  max):   11.547 s  13.873 s    10 runs

with -U LEAN_MULTI_THREAD:

Benchmark #1: build/bin/Metamath /home/mario/Documents/metamath/mm/set.mm
  Time (mean ± σ):     13.330 s ±  1.512 s    [User: 13.229 s, System: 0.095 s]
  Range (min  max):    9.615 s  15.488 s    10 runs

Mario Carneiro (May 07 2021 at 15:43):

I'm not sure why the variance is so high, it is single threaded and should be quite predictable

Mario Carneiro (May 07 2021 at 15:44):

leanpkg build bin 'LEANC_OPTS+="-U LEAN_MULTI_THREAD"' worked

Guilherme Silva (May 07 2021 at 22:53):

I was using mm0-rs for Language Server Protocol. Can I use this for it too?

Mario Carneiro (May 08 2021 at 02:46):

This program is a pure verifier, it doesn't support LSP in any sense. You should keep on using mm0-rs if you want language highlighting for metamath zero

Mario Carneiro (May 08 2021 at 02:47):

It may however evolve into a metamath or MM0 importer


Last updated: Dec 20 2023 at 11:08 UTC