Zulip Chat Archive

Stream: lean4

Topic: Step-by-step debugger?


Tanner Swett (Jul 01 2023 at 21:09):

There isn't yet any step-by-step debugger for Lean 4 like I'm used to from C# and Python, is there?

I'm a little tempted to try to write one myself. I'll work on that next time I have a few thousand hours of spare time :big_smile:

Henrik Böving (Jul 01 2023 at 21:16):

Such a debugger does not exist yet no. Given the gigantic difference between the input code and what actually gets executed you would have to add quite a lot of changes to the compiler itself to get it even close to feasible to implmeent.

James Gallicchio (Jul 01 2023 at 21:24):

honestly even a step debugger on the LCNF of some expression seems useful

Mario Carneiro (Jul 01 2023 at 21:38):

There used to be a debugger for lean 3 IR, docs3#vm_monitor

Tanner Swett (Jul 01 2023 at 22:00):

I would imagine it'd be a lot easier to add a debugger to the reducer than to write one that can attach to compiled code. (Of course, "a lot easier" in this case would still be a pretty enormous undertaking.)

Mario Carneiro (Jul 01 2023 at 22:01):

It should be relatively easy to debug the generated C code

Mario Carneiro (Jul 01 2023 at 22:03):

anything else will need to be written from scratch

James Gallicchio (Jul 01 2023 at 22:12):

But the C is way less intelligible than LCNF, no? I generally struggle to trace what's happening at the C level.

Henrik Böving (Jul 01 2023 at 22:16):

That depends on which level of LCNF you mean. The last one (lambda RC) is basically almost C already, the C translation step from there is mostly replacing constructors of the expression type with calls to runtime functions etc. If you go a little higher up it becomes more readable but I would still doubt the vast majority of users would be able to decipher things based on it that are more useful than inserting debug prints or gdb-ing the C code. I do agree it might be a useful feature for experts though

James Gallicchio (Jul 01 2023 at 22:18):

what level is output by trace.compiler.result?

Henrik Böving (Jul 01 2023 at 22:18):

lambda RC

James Gallicchio (Jul 01 2023 at 22:18):

I find that version infinitely more readable than C. though I do tend to write rather cursed code.

Mac Malone (Jul 01 2023 at 22:23):

I agree that lambda RC is easier to read than the C code. To me this is largely due to the difficult in 1) isolating the relevant function in the C code and 2) the less than ideal formatting of said C code.

Mac Malone (Jul 01 2023 at 22:24):

A debugger that could translate from C code back to lambda RC would already be an amazing feature.

Mac Malone (Jul 01 2023 at 22:26):

Furthermore, even just being able to add function breakpoints in VSCode and then breaking on said function in the lambda RC (or even just C code) would be excellent.

Mac Malone (Jul 01 2023 at 22:28):

In short, even a single feature debugger I think would be a massive boon. It can then just be extended with more features as time progresses.

Mario Carneiro (Jul 01 2023 at 23:18):

Also preserving variable names in the C code would be a massive improvement

Mario Carneiro (Jul 01 2023 at 23:19):

I would attribute the majority of readability loss in both the C code and the old compiler IR to loss of variable names

Julian Berman (Jul 02 2023 at 05:25):

(I'll just drop a tangential link to DAP, which is the debugger protocol equivalent for LSP, so if a debugger ever exists, if it implements DAP then yay all the editors get immediate fun ways to interact with it -- https://microsoft.github.io/debug-adapter-protocol/ )

Mario Carneiro (Jul 02 2023 at 05:47):

last I checked, DAP was not nearly as well adopted as LSP though, and still quite a ways from real debuggers in functionality (VSCode's debugger is still quite lacking compared to visual studio)

Mac Malone (Jul 02 2023 at 06:20):

@Mario Carneiro True, but something is still better than nothing, and would still likely serve as a good starting point.

Henrik Böving (Jul 02 2023 at 10:53):

Mario Carneiro said:

Also preserving variable names in the C code would be a massive improvement

With the new code generator we have put in a bit of effort to preserve original variable names (although there are of course many auxiliary ones introduced still) so once we eventually end up having that working you might get that ^^

Mac Malone (Jul 02 2023 at 14:47):

Henrik Böving said:

With the new code generator we have put in a bit of effort to preserve original variable names

Yeah, the new code generator does a great job persevering more names! One major problem, though, is that too many functions do not have user-defined parameter names (e.g., EStateM.Result.ok/error and various monads). To help with this in the future, linting missing names (and recommending naming all parameters) may be helpful going forward.

Mario Carneiro (Jul 02 2023 at 17:38):

I think that issue is about overriding a named variable with an unnamed one. There is no "good" name you can give the argument of Result.ok, you need to get the name from the function binder in the bind application it came from

Mac Malone (Jul 02 2023 at 17:49):

Mario Carneiro said:

I think that issue is about overriding a named variable with an unnamed one. There is no "good" name you can give the argument of Result.ok, you need to get the name from the function binder in the bind application it came from

No, I meant in e.g., auto-generated compiled match statements where there arguments to Result.ok and Result.error are all auto-generated a.x rather than more informative letterings like e and s or even fuller names like res, err, and st (as one would choose on a user level). It makes it hard to remember which a.x is the error/result and which is the state, for example. Reading compiled monadic code would become much easier with just some more informative letters other than the ubiquitous a.x.

Mac Malone (Jul 02 2023 at 17:52):

This is also brings up the question of whether a is the best name for an auto-generated variable names, as it is frequently also used in hand-written code, maybe a some rare Unicode letter should be used instead for auto-generated names to reduce clashes (and make it clear what names are auto-generated).

Kyle Miller (Jul 02 2023 at 19:09):

In Lean 3 there was the innovation of putting an inverted hat on the letter a for that reason. It even earned a Zulip emoji: :ahat:


Last updated: Dec 20 2023 at 11:08 UTC