Zulip Chat Archive

Stream: general

Topic: Debugging tools

J. Ryan Stinnett (Sep 05 2025 at 12:10):

As far as I can tell, debugging workflows in Lean are currently limited to manually-added print statements. It is possible to attach a stepping debugger (GDB / LLDB) to compiled Lean programs, but you aren't able to see values in terms of the original Lean source.

I have experience building debugging tools in other ecosystems, and I am considering building out a great Lean debugging experience, including stepping debugging in terms of Lean source and values, macro debugging, etc.

It would be sizeable amount of work though, so before I get too far down that road... Is the Lean community interested in these sorts of debugging workflows? Would you make use of such tools if they existed?

Patrick Massot (Sep 05 2025 at 12:28):

I'm sure lots of people would love that, but you should coordinate with the Lean FRO.

J. Ryan Stinnett (Sep 05 2025 at 12:54):

Ah sure, happy to coordinate with whoever may be interested as needed. :slight_smile:

For now, I just wanted to gauge interest in these debugging abilities, as I haven't seen much discussion of debugging here so far... It could mean people are fine with the current state, or it may mean people do wish for debug tooling, but they just don't talk about it since it doesn't exist. I wasn't sure which it might be.

Robin Arnez (Sep 05 2025 at 13:20):

I'd definitely love some better debugging tools! The current trace system is fine but it is a pretty slow workflow (going back and forth between adding traces, rebuilding, then testing...), an interactive debugger would be so much better! You should probably connect with Cameron Zwarich, he's developing Lean's compiler. Excited where this will go!

J. Ryan Stinnett (Sep 05 2025 at 14:33):

Glad to hear there's interest in this. :smile: I'll spend some time researching / prototyping different designs for this in the Lean ecosystem, as that should help with future planning and coordination.

Gavin Zhao (Sep 05 2025 at 15:48):

Even just stepping line by line (like in gdb) or stepping expression by expression in evaluation order would be very helpful to have.

Chris Bailey (Sep 05 2025 at 18:06):

To Patrick's point about coordination with the FRO, a debugger was on the roadmap in year 1 slated for year 2 or year 3 (it has disappeared from the roadmap in subsequent years). They may have some insights or research from prior work.

J. Ryan Stinnett (Sep 05 2025 at 18:18):

Ah okay, good to have that context. If @Cameron Zwarich or others at the FRO might know who may have explored this area already, I'd be happy to hear about any past efforts or investigations that may have happened. Or if it was perhaps just deprioritised, that would be good to know as well.

Cameron Zwarich (Sep 05 2025 at 22:57):

Thanks for your interest in debugging, @J. Ryan Stinnett. While no one has done any concrete work on this, I've thought about it off and on while working on the compiler.

In my mind, the optimal environment for implementing a debugger is one where it is possible to interrupt execution at any point (or at least a large subset of allowed points), convert a "machine" state to a more source-level program state, and then inspect values, run new code, or continue execution with some changes. In the past, I've worked on JavaScript JITs and dynamic binary translators where other design constraints provide most of the machinery required to do this.

In Lean, I see the following particular challenges for a debugger:

Lean code requires a lot of optimizations to perform well, and these optimizations both erase execution context (e.g. inlining) and reduce live values (e.g. "cases of constructor"). It's not at all obvious how to preserve this information while maintaining performance. This is also true of GHC and they have a debugger, so it's not completely disqualifying.
Lean heavily erases types, which would make it difficult to provide a good experience while debugging polymorphic code. If I understand correctly, the GHC Haskell debugger effectively does type inference at runtime on the GC heap (which is itself actually a valid approach to GC), but I am fairly certain this can't work for the combination of Lean's expressive type system and compilation strategy.
Unless I'm missing something, to set a breakpoint in Lean code, we would need to essentially debug the Lean process as if it was a C program. And to make this work on contemporary macOS without making the user jump through a ton of hoops, we would need to use the GDB/LLDB remote debugging protocol to do this. If we were in an alternate universe where Lean had a JIT runtime, we could do all of this in-process.
We currently use Clang as a code generator from a fairly low level IR, which makes it very difficult (impossible?) to include arbitrary metadata for the debugger to use. Replacing this with a non-LLVM code generator would be a bunch of work, but it's probably more clear what this work would actually entail than the previous items.

In practice, it seems like all existing debuggers for typed functional languages operate on an interpreter. Lean also has an interpreter, but it's different than the OCaml/GHCi interpreters, in that it's really just running the low-level IR at the end of compilation rather than being closer to the source level. If anyone has written a debugger for a typed functional language that runs on optimized compiled code, I would be very curious to see the details and learn about the tradeoffs they made.

We're definitely interested in having a debugger, but especially while the compiler is still under active development to improve its performance, robustness, etc. we would need to find a way to implement a debugger that creates a small number of new requirements (or find other profitable uses for the new compiler work that's required).

Evgeniy Kuznetsov (Sep 06 2025 at 10:27):

I am mainly interested in Lean4 for programming
And the lack of a debugger is the only thing stopping me from playing with it :pensive:
Because I don't want to sink into this tedious, time-consuming tracing

J. Ryan Stinnett (Sep 06 2025 at 16:08):

(deleted)

J. Ryan Stinnett (Sep 06 2025 at 16:55):

Thanks for sharing your insights, it's quite helpful indeed. :smile:

Cameron Zwarich said:

In my mind, the optimal environment for implementing a debugger is one where it is possible to interrupt execution at any point (or at least a large subset of allowed points), convert a "machine" state to a more source-level program state, and then inspect values, run new code, or continue execution with some changes. In the past, I've worked on JavaScript JITs and dynamic binary translators where other design constraints provide most of the machinery required to do this.

That full feature set does indeed sound nice. In my view, there's a baseline of read-only functionality that at least allows inspecting values at as many program points as possible in terms of the original source entered by the user. Read-write features that permit editing the program while debugging are nice to have, but I view them as less important, and as you've hinted, editing while debugging can be particularly challenging to achieve in a compilation-based implementation (vs. a VM which usually has some of the pieces already).

About my own background, I used to work on DevTools at Mozilla (including the JS debugger), and I am currently working on a PhD (nearing completion) focused on debugging optimised programs.

Lean code requires a lot of optimizations to perform well, and these optimizations both erase execution context (e.g. inlining) and reduce live values (e.g. "cases of constructor"). It's not at all obvious how to preserve this information while maintaining performance. This is also true of GHC and they have a debugger, so it's not completely disqualifying.

Lean heavily erases types, which would make it difficult to provide a good experience while debugging polymorphic code. If I understand correctly, the GHC Haskell debugger effectively does type inference at runtime on the GC heap (which is itself actually a valid approach to GC), but I am fairly certain this can't work for the combination of Lean's expressive type system and compilation strategy.

In my view, the foundational challenge required for reliable debugging is finding some way to (correctly) track and/or replay the various program transformations that occur during compilation so you then have a mapping from source to machine program. I consider this to be an open research problem, as it's not all obvious how best to do this, especially in a performant way. I plan to experiment and prototype various approaches to attempt find a good path here, as everything else depends on this.

Unless I'm missing something, to set a breakpoint in Lean code, we would need to essentially debug the Lean process as if it was a C program. And to make this work on contemporary macOS without making the user jump through a ton of hoops, we would need to use the GDB/LLDB remote debugging protocol to do this. If we were in an alternate universe where Lean had a JIT runtime, we could do all of this in-process.

We currently use Clang as a code generator from a fairly low level IR, which makes it very difficult (impossible?) to include arbitrary metadata for the debugger to use. Replacing this with a non-LLVM code generator would be a bunch of work, but it's probably more clear what this work would actually entail than the previous items.

I agree these are further challenges, but at least with these points, they are fairly well-understood pieces that exist in other projects that can be reused / adapted as needed.

In practice, it seems like all existing debuggers for typed functional languages operate on an interpreter. Lean also has an interpreter, but it's different than the OCaml/GHCi interpreters, in that it's really just running the low-level IR at the end of compilation rather than being closer to the source level. If anyone has written a debugger for a typed functional language that runs on optimized compiled code, I would be very curious to see the details and learn about the tradeoffs they made.

Debugging tools for functional languages are generally quite poor all around, unfortunately. I am not currently aware of any that work well, especially once optimisation is involved. My aspiration is to build a top-tier debugging experience for Lean, both for the Lean community and also to demonstrate that functional languages can have a great debugging experience too.

We're definitely interested in having a debugger, but especially while the compiler is still under active development to improve its performance, robustness, etc. we would need to find a way to implement a debugger that creates a small number of new requirements (or find other profitable uses for the new compiler work that's required).

Thanks for explaining those planning and technical constraints, that's quite helpful. As mentioned above, my initial focus is working out the best design for tracking the mapping between source and machine programs. I plan to do this by prototyping / experimenting with various approaches, which should then inform a future debugging proposal for you and other Lean maintainers to consider.

Last updated: Dec 20 2025 at 21:32 UTC