Zulip Chat Archive

Stream: lean4

Topic: Regarding optimizing compilers for Lean


Number Eighteen (May 18 2024 at 15:40):

The current Lean compiler produces C code that is then compiled with a C compiler. There is a new compiler in the works that produces LLVM code, but I am not sure what the progress is or what kind of compiler it is.

Does the new compiler work directly with Lean's semantics, taking advantage of its features like immutability and persistence, or does it translate to imperative semantics before emitting LLVM code?

If the latter is true, is there any interest in developing an optimizing compiler that takes full advantage of Lean's purity, and tackles the performance issues that arise? Note, I have zero knowledge about LLVM, so if you think I am under misconceptions, please inform!

Henrik Böving (May 18 2024 at 15:56):

You can compile Lean with the LLVM backend yourself if you want to and it will work, pass all tests and be approximately as fast as the C backend. Note that the C backend also ends up performing just a translation to LLVM since it compiles the C code with clang. The semantics of Lean are also not that funky, basically all operations that a Lean program can do at a machine level is:

  • talk to the Lean runtime/call functions provided by the runtime
  • do a branch on a value from the Lean runtime
  • call a Lean function
  • talk to other FFI code

and this is clearly reflected in the C vs LLVM output. If you look at the LLVM that clang generates on the C code after some basic LLVM cleanup passes you will notice that the CFG is often quite literally the same.

Now zooming out a little bit. Both the LLVM and the C code generator are fed by the same part of Lean. Lean defines its own internal IR and performs a series of optimizations that do perform lots of Lean specific optimizations already, most notably of course the FBIP optimizations as those are unique to Lean and only a handful other languages, but also other things that you would expect from a functional compiler.

There is a new optimizer for the IR that's close to functional but work has been halted on it for quite a while. The FRO does plan to resume work on it eventually though.

Note that you can write very performant Lean code already due to the fact that we have the FBIP optimization that allows in place mutation of tree like structures and the docs#Array primitive that is a real sequential piece of memory which will also be mutated in place if used linearly. This allows us to build Lean programs that can match or at least get satisfiably close to the speed of impure and imperative languages in a lot of cases.


Last updated: May 02 2025 at 03:31 UTC