Zulip Chat Archive

Stream: lean4

Topic: Lean4 Debugger


Richard L Ford (Dec 07 2021 at 21:14):

When I'm learning new software one thing I really like to do is compile the program in debug mode and then step through the program and observe the values of variables. Has there been thought to making this kind of thing possible for lean4?

I know that this can be problematics in the presence of generics or polymorphism. Languages like OCaml and Java erase generic parameters so there is not enough information at runtime to know the exact types of variables. When I worked for Microsoft I worked on the Bartok compiler that was written in C#. One nice thing about the .Net infrastructure is that each object had a pointer to information about its type. On the other hand, this can lead to memory overhead, since I recall some programs where the size of the metadata was as large as the code.

I'm just starting to study the lean4 implementation, but wonder if there would be a way to tweak it, perhaps just in a "debug" mode, so that polymorphic functions were passed type parameters as well as non-type parameters at runtime. This implies that there would be some kind of "runtime type information". This would then give enough information that a debugger could display the types and values of variables.

It would also be nice if the emitted C files had preprocessor directives pointing to the lean source locations so that stepping through the C code could point to the original lean source. Of course, that would not be helpful if you did not have visibility to the local variables that correspond to the lean environment.

I know this would be a major project. I was just wondering if this is something others think is desirable and if anyone is willing to fund such an effort. I'd be happy to work on such a project if someone would fund it.

Richard L Ford (Dec 07 2021 at 21:40):

In thinking about how to support a native code debugger, I think a possible first step would be to provide an LLVM backend for lean4. This would provide a vehicle for supplying the debug information and would get you ports to the machines supported by LLVM. It would probably also be required to make some extensions to Dwarf for lean4. At the moment I don't see an easy path to debugger support without some thing like LLVM.

Eric Taucher (Dec 07 2021 at 21:47):

Richard L Ford said:

At the moment I don't see an easy path to debugger support without some thing like LLVM.

I don't know if Lean4 has debug message hooks but that should work. If Lean4 is missing them then they can be added. If it has hooks and more or needed then they could be added.

Richard L Ford (Dec 07 2021 at 22:32):

I just discovered Papyrus which is a work-in-progress for an LLVM interface to Lean 4. So it look like some thought has been given to relating LLVM to lean4.

Arthur Paulino (Dec 07 2021 at 22:32):

@Richard L Ford Please take my word with a grain of salt because I am very new to Lean, but I think a debugger will be more desirable when/if Lean 4 gets more traction for general purpose programming. As of now the most common use case is theorem proving (not much about variable values... it's more about their types)

Richard L Ford (Dec 07 2021 at 22:37):

@Arthur Paulino, that is probably true. The reason I would want it now is for studying Lean itself. It would be interesting to be able to step through Lean and watch it in action. That might even be helpful for the Lean developers.

Arthur Paulino (Dec 07 2021 at 22:43):

I see. I guess Leonardo and Sebastian would know better how a debugger would help on Lean development. I can only imagine how cool it would be to see Lean's reasoning process from inside

Henrik Böving (Dec 07 2021 at 23:04):

Arthur Paulino said:

I see. I guess Leonardo and Sebastian would know better how a debugger would help on Lean development. I can only imagine how cool it would be to see Lean's reasoning process from inside

If you're interested in "just" the reasoning part the tracing stuff that is provided already shows a ton of what is going on...and most of it is really not that interesting (from a user perspective) unless you are specifically searching for a bug.

Mario Carneiro (Dec 08 2021 at 12:33):

Note that lean 3 had a VM debugger, but it never got any serious use and was basically abandoned

Richard L Ford (Dec 08 2021 at 13:37):

@Mario Carneiro, you mean a debugger that works on the bytecode? Does it still exist. Can it be used?

Mario Carneiro (Dec 08 2021 at 13:39):

Yes it still exists, but don't ask me how to use it because I only tried once and I've never seen anyone else use it. I think it was mainly written as a case study for one of the lean papers

Mario Carneiro (Dec 08 2021 at 13:40):

https://github.com/leanprover-community/lean/blob/master/library/tools/debugger/cli.lean

Richard L Ford (Dec 08 2021 at 13:44):

@Mario Carneiro. Thanks. I'll take a look. It at least illustrates another approach. I seem to remember in a talk (by @Kevin Buzzard I think, though I could be mistaken) that he mentioned that lean3 had more uptake from mathematicians than computer scientists, but that since lean4 has more programming facilities that he expected it to be used more for programming. In that case I think a debugger would be more important.

Kevin Buzzard (Dec 08 2021 at 13:45):

I can believe I said some such thing.

Mario Carneiro (Dec 08 2021 at 13:45):

I think that debugging is generally harder in functional languages. Haskell and OCaml both have dubious debugging functionality

Mario Carneiro (Dec 08 2021 at 13:45):

printf debugging seems to be the norm

Mario Carneiro (Dec 08 2021 at 13:46):

But the interactivity you get from a repl / #eval makes up for it to some extent

Richard L Ford (Dec 08 2021 at 13:46):

I think that debugging is generally harder in functional languages. Haskell and OCaml both have dubious debugging functionality?

I agree with that. In OCaml, the bytecode debugger works somewhat OK if you are not in a polymorphic context but then it fails to show values of variables. And it is not supported in a multi-threading environment, though there is a push to merge multi-core OCaml into the main-line, so I would hope they will make the debugger work with multi-core.

Marc Huisinga (Dec 08 2021 at 13:53):

FWIW, according to the docs, GDB is usable with Lean 4 (I haven't tried this yet). I did however recently use perf for optimizing this small arithmetic benchmark, and it worked very well.

Richard L Ford (Dec 08 2021 at 13:55):

@Marc Huisinga, I have used gdb to step through the lean 4 C code. And I used the python gdb macros to help display the lean data structures somewhat. So it is somewhat doable, but to understand what is happening you need to have the lean code in one window and the C code in another, so you can correlate the lean identifiers with the x_1 (etc.) variables in C.

Marc Huisinga (Dec 08 2021 at 13:57):

I think those variables may also show up in the IR output, which is maybe a bit closer to the source than the C code. You can look at the generated IR of a function with

set_option trace.compiler.ir.result true in

before the function.

Eric Taucher (Dec 08 2021 at 14:07):

Marc Huisinga said:

set_option trace.compiler.ir.result true in

I see the Lean manual list set_options but not the one noted. Is there a way/place to see the entire list? Or does one have to GREP the source code?

Mario Carneiro (Dec 08 2021 at 14:08):

set_option has autocomplete, that's what I normally use

Bryan Gin-ge Chen (Dec 08 2021 at 14:19):

I don't know if it still works, but this snippet used to list all the options.

Leonardo de Moura (Dec 09 2021 at 01:13):

Yes, we want to support a debugger in the future, but it is a low priority right now. The current code generator does not preserve sufficient information. Moreover, part of the code generator has not been ported to Lean yet and is still written in C++. This is a big project, and we also want to fix other compiler-related issues. Some of these issues are recorded as part of our "Compiler Refactoring" project: https://github.com/leanprover/lean4/projects/4
Our current priority is the Mathlib port and then better automation for Lean 4. It is hard to predict, but I think we will only be able to work on the "Compiler Refactoring" project (and then the debugger) one year from now.

Richard L Ford (Dec 09 2021 at 03:43):

@Leonardo de Moura, thanks for the update.

Today I was looking at the code that emits C today and the code seems to be using only a limited set of C features. It seems like it would not be too hard to take that code and modify it to emit the text form of LLVM. I'm not sure if getting it into SSA form would be a problem, and the jump points would be for jumps between basic blocks (if I understand the code correctly). Of course that would just be a first step, but it might be able to produce something that is interoperable with the C++ part (which could also emit LLVM if using clang).

I must say I'm really impressed with Lean4. I've been kind of watching it from afar waiting for it to released. Today I also reread the paper Counting Immutable Beans and it of course was valuable in understanding the IR. I'm excited to start using Lean4

Mario Carneiro (Dec 09 2021 at 03:45):

@Richard L Ford IIUC this is what @Mac 's Papyrus project is doing

Mac (Dec 09 2021 at 03:48):

@Mario Carneiro yeah, my next big use case was going to be writing an LLVM emitter for Lean. I just unfortunately got side-tracked by Lake.

Mac (Dec 09 2021 at 03:48):

My hope is to get back to work on Papyrus in the new year.

Mario Carneiro (Dec 09 2021 at 03:49):

Also @Huỳnh Trần Khanh in response to your question about Cranelift, notice that you can also JIT using an in-lean LLVM emitter, see the Papyrus readme

Mario Carneiro (Dec 09 2021 at 03:51):

so I would be inclined to put resources into that rather than cranelift integration (not that anything is wrong with cranelift, it's a nice compiler, but it's not innately a better target than LLVM)

Richard L Ford (Sep 10 2023 at 19:54):

Hello again. It's been a year and 9 months since I first inquired about a Lean 4 debugger. Since then the mathlib port has been completed, and the first official release of Lean 4 has happened. I was just wondering if thought is being given to a Lean source-level debugger, perhaps as part of the Lean Focused Research Organization. Any progress to report?

Henrik Böving (Sep 10 2023 at 20:08):

The compiler refactoring has been mostly halted due to the mathlib4 port and in general lack of time although we do hope on continuing it eventually, it is already quite far into the process of being usable.

That said we do have a working LLVM backend by now so it is "just" a matter of pulling the necessary debug information through the compilation pipeline but I don't know how hard/easy that is.

Richard L Ford (Sep 10 2023 at 20:43):

Thanks for the update. Is there any documentation on the LLVM backend and how one would enable it?

Richard L Ford (Sep 10 2023 at 20:46):

I see there is a "--bc" option for requesting LLVM bitcode.

Henrik Böving (Sep 10 2023 at 21:13):

It's not integrated with lake through the build system yet sadly. But it is possible to compile single files with it. You can see here how: https://github.com/leanprover/lean4/blob/master/tests/common.sh#L32-L40

Richard L Ford (Sep 10 2023 at 21:55):

I also noticed that "lean --features" is not showing LLVM, so I assume a special build must be done to get LLVM support.

Siddharth Bhat (Sep 11 2023 at 07:42):

Indeed, this part of Lean is under active development, so the information is scattered. Here's a document on how to build the backend (it's not upstream yet, as it's WIP): https://github.com/bollu/lean4/blob/llvm/hargonix-collab/doc/llvm.md


Last updated: Dec 20 2023 at 11:08 UTC