Zulip Chat Archive

Stream: Is there code for X?

Topic: Code Formatter


Kaiyu Yang (Aug 28 2023 at 15:23):

Hi,

I'm wondering if there is a code formatter for Lean similar to Python's black.

Bolton Bailey (Aug 29 2023 at 04:11):

No, but I would like one, and now that we are fully in lean4, I think we can start thinking about development environment things more again.

Bolton Bailey (Aug 29 2023 at 04:12):

It would be nice to have something to automatically wrap my lines and put my lemma colons at the end of lines instead of at the beginning. I think the community would hopefully agree that something like this would be useful, but I think some discussion would be warranted on how far it should go.

Bolton Bailey (Aug 29 2023 at 04:18):

Here is a nice list from @Floris van Doorn of things a formatter could potentially do.

  • "clean up" code by looking at open namespaces / (scoped) notation and using that to shorten names / use the notation.
  • Look at namespaces / scoped notations that could be opened to shorten parts of the code.
  • Use variables declared in the file and reuse those variables assuming those variables have the same type / binder type and associated type-classes, potentially by renaming the variable.

Some other things might be:

  • Make sure the right number of newlines exist between lemmas/sections
  • Make sure everything is properly indented
    • Variables to a def/theorem that don't fit on the first line indented by 4
    • First line of a proof indented by 2
  • Check for unnecessary parens
  • Remove transitively-provided imports
  • Perhaps alphabetize the imports (apparently this changes behavior though, so be careful)
  • <- to
  • Collapse sequences of calls to rw into one call, where possible

Scott Morrison (Aug 29 2023 at 04:19):

Possibly a good approach to this would be:

  1. experiment with the current behaviour just parsing a file and then pretty printing the resulting syntax again
  2. be sad that lots breaks
  3. work out how to fix some of the breakages, so we get closer to syntax round-tripping
  4. do as many cycles through 1-3 as you can bear
  5. decide that just parsing and pretty printing is not actually going to be good enough for a code formatter, but bask in the praise of everyone who has been helped by syntax round-tripping working better
  6. start writing a code formatter

Mario Carneiro (Aug 29 2023 at 04:20):

to be fair, we've already done 1-4 quite a bit for mathport, I think we're at 5 now

Scott Morrison (Aug 29 2023 at 04:21):

Good point.

Scott Morrison (Aug 29 2023 at 04:21):

Thank you, Mario, for all your great work getting nice syntax out of mathport. :-)

Scott Morrison (Aug 29 2023 at 04:21):

Now on to 6. :-)

Mario Carneiro (Aug 29 2023 at 04:22):

and it is indeed pretty close to "back to the drawing board"

Mario Carneiro (Aug 29 2023 at 04:22):

The first challenge is getting ASTs out of lean, depending on how we want the formatter to work

Mario Carneiro (Aug 29 2023 at 04:23):

something like an editor action to format the current file or current command is the closest to what code actions are already doing

Mario Carneiro (Aug 29 2023 at 04:24):

in that case you can get the AST directly handed to you by the server, and there isn't a huge overhead because the server already has them lying around

Mario Carneiro (Aug 29 2023 at 04:24):

The harder case is full project formatting, which right now would require a lot of custom code and a full build to extract the AST

Mario Carneiro (Aug 29 2023 at 04:28):

although, if the formatter is an independent lean project, it's not clear how the server would load the project so that it could be applied to arbitrary lean code (which most likely doesn't depend on the formatter)

Bhavik Mehta (Aug 29 2023 at 15:46):

I'd quite like a formatter more in the style of ormolu, in partiular points two and five from the list of goals there

Kyle Miller (Aug 29 2023 at 15:57):

Lean 4 comes with a reformatter that just loads a file and pretty prints all its declarations. My understanding is it's mainly a demonstration for how to write a frontend. I'm not sure how well tested it is or if anyone regularly runs it (to run it on mathlib, I think you'd need to add Lean.enableInitializersExecution).

Mario Carneiro (Aug 30 2023 at 03:11):

It works about as well as mathport in terms of output quality, because mathport uses exactly the same method, it just isn't getting the input from a lean 4 file

Mario Carneiro (Aug 30 2023 at 03:14):

I could package it up into a standalone program, but I'm pretty confident in saying it should not be used on any code you really care about. It's like 95% there but the remaining 5% is way too high an error margin for comfort. I think it also very rarely produces invalid or misinterpreted syntax, but those are generally bugs

Anders Larsson (Sep 03 2023 at 09:42):

I tried the reformat.lean program. And it failed to reformat itself. :-)

<   if cmds.getKind == nullKind || cmds.getKind == ``Parser.Module.header then
---
>   if cmds.getKind == nullKind || cmds.getKind == `` Parser.Module.header then

Scott Morrison (Sep 03 2023 at 23:03):

That example sure seems like a bug. Could you open at issue at the lean4 repository? Minimisations or proposed fixes welcome.

Kyle Miller (Sep 03 2023 at 23:10):

Here's a minimization:

import Lean

#eval do Lean.PrettyPrinter.ppTerm ( `(``x))
-- `` x✝

Anders Larsson (Sep 13 2023 at 15:16):

I created an issue. Hope it was clear enough, I'm still quite new to the Lean echosystem.
https://github.com/leanprover/lean4/issues/2536

Bolton Bailey (Sep 17 2023 at 07:35):

I have made a tracking issue for our own use with respect to the wishlist items above #7217. Please feel free to add things to it. Really, what I would like is that any easily lintable mistake that a maintainer would otherwise spend time on correcting can go in the issue.


Last updated: Dec 20 2023 at 11:08 UTC