Zulip Chat Archive

Stream: lean4 dev

Topic: Compiling runtime to LLVM bitcode?


Mike Hicks (Oct 12 2023 at 20:38):

I want to try something that’s possibly crazy: Symbolically execute a Lean program.

My plan is to compile my Lean program to C, compile that C code to LLVM bitcode, and then run KLEE on the result. For this to work, I also need to get the lean4 run-time into LLVM bitcode, so I can link it against that of my particular program. Otherwise KLEE will not find symbols defined by the run-time:

$ lli hello.ll
JIT session error: Symbols not found: [ _lean_initialize_runtime_module, _lean_io_mark_end_initialization, _lean_mk_string_from_bytes, _lean_mark_persistent, _initialize_Init, _lean_init_task_manager, _lean_get_stdout, _lean_set_panic_messages, _lean_inc_ref_cold, _lean_io_result_show_error, _l_IO_FS_Stream_putStrLn, _lean_alloc_small, _lean_finalize_task_manager, _lean_notify_assert, _lean_dec_ref_cold ]
lli: Failed to materialize symbols: { (main, { _initialize_hello, _main, __lean_main }) }

My thought for this is to use wllvm to attempt to build lean4 and then link against the produced bitcode for the library. I managed to build lean4 with wllvm (I think), but I'm not sure what to do next. What bits of what I've built do I need to collect with own program? It's presumably whatever leanc uses when it links, but I'm not sure what that is.

Sebastian Ullrich (Oct 12 2023 at 20:41):

You can add -v to the leanc cmdline to see what it uses, does that help?

Mike Hicks (Oct 12 2023 at 21:32):

For simple "hello world" it uses -lleancpp -lInit -lLean -lleanrt -lc++. Looks like the first four are Lean libraries. I'll play around with those. Kind of shocking that the final executable is 3+ MB (the C version is 33K). Hopefully all that extra code does not confuse KLEE, if I get that far.

Siddharth Bhat (Oct 13 2023 at 05:25):

@Mike Hicks if you build lean with the LLVM backend, you can also pass the flag lean -b <bitcode-file.bc and that gives a self-contained LLVM bitcode (with the Lean runtime in it).

Mike Hicks (Oct 13 2023 at 12:37):

Wow, amazing! Where can I learn how to build the LLVM backend @Siddharth Bhat ? I poked around lean-lang.org and github.com/leanprover/lean4 but I didn't see anything obvious.

Does that work for multifile developments? I wouldn't want each file to include the Lean runtime in it. Perhaps you just pass all the files to lean at once and it will merge them into a single bitcode file, with the runtime? That's basically what I want.

Henrik Böving (Oct 13 2023 at 13:09):

Mike Hicks said:

Wow, amazing! Where can I learn how to build the LLVM backend Siddharth Bhat ? I poked around lean-lang.org and github.com/leanprover/lean4 but I didn't see anything obvious.

Does that work for multifile developments? I wouldn't want each file to include the Lean runtime in it. Perhaps you just pass all the files to lean at once and it will merge them into a single bitcode file, with the runtime? That's basically what I want.

The basic setup is as follows: You need LLVM 15 + Clang(++) 15 in your PATH. Then I run:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON  -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DLLVM=ON -DLLVM_CONFIG=llvm-config  -DCMAKE_CXX_COMPILER=clang++ ../../

and from there its just make -j$(nproc) and you're good

Henrik Böving (Oct 13 2023 at 13:11):

And yes it is important that it is LLVM/Clang 15^^ If you can't get your hands on that because your package manager has already phased it out you can compiler it from source which takes a bit more effort though. That said we do plan on eventually updating to more recent versions of course

Sebastian Ullrich (Oct 13 2023 at 13:43):

Siddharth Bhat said:

that gives a self-contained LLVM bitcode (with the Lean runtime in it).

This isn't true, is it? The file should still depend on libleanrt, and likely libleancpp as well. Afaik, the output will not be materially different right now from what one would get from compiling the .c file to bitcode so I'm not sure using the LLVM backend here will simplify anything, or am I mistaken?

Mac Malone (Oct 13 2023 at 15:22):

Mike Hicks said:

Does that work for multifile developments?

Lake support for LLVM backend is in development -- lean4#2571 -- and should be merged very soon. This will hopefully greatly simplify the process for multifile projects.

Mike Hicks (Oct 13 2023 at 15:37):

Unfortunately I couldn't get the LLVM backend to build on my system. I have MacOS Ventura, and I used homebrew to install LLVM 15, putting that at the front of my PATH. It crapped out when building .lean files:

[    ] Building Init/Data/ToString.lean
[    ] Building Init/NotationExtra.lean
make[7]: *** [../build/release/stage1/lib/lean/Init/NotationExtra.olean] Trace/BPT trap: 5
make[6]: *** [Init] Error 2
make[5]: *** [CMakeFiles/make_stdlib] Error 2
make[4]: *** [CMakeFiles/make_stdlib.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [stage1-prefix/src/stage1-stamp/stage1-build] Error 2
make[1]: *** [CMakeFiles/stage1.dir/all] Error 2
make: *** [all] Error 2

The native clang with Xcode is also version 15, but other LLVM features seem to be missing; when I use that it dies very quickly after invoking make with

-- Executing 'llvm-config --version' at 'llvm-config' to check configuration.
CMake Error at CMakeLists.txt:262 (execute_process):
  execute_process error getting child return code: No such file or directory

I will wait to get an answer to @Sebastian Ullrich 's question; if using the LLVM backend is tantamount to compiling to C and converting that to bitcode, then I don't necessarily need the backend.

Siddharth Bhat (Oct 13 2023 at 18:48):

@Mike Hicks I retract my original statement, I didn't realize we wanted all the definitions. I'd suggest using the C backend for now

Detailed build instructions for the LLVM backend live off-upstreamed (will be upstreamed soon :tm: ): https://github.com/bollu/lean4/blob/llvm/hargonix-collab/doc/llvm.md#llvm-backend

Joe Hendrix (Oct 13 2023 at 19:19):

I think wllvm will be needed for both C and the LLVM backend to get all the bitcode. If you are able to get the LLVM backend working, it may be possible to skip C generation. I think wllvm will still collect bitcode built directly in addition to bitcode generate from C.

Mike Hicks (Oct 13 2023 at 20:45):

Thanks for the help! I'll play with the C backend and wllvm and see how far I can get.


Last updated: Dec 20 2023 at 11:08 UTC