Zulip Chat Archive

Stream: lean4

Topic: formatter


Jakob von Raumer (Sep 17 2024 at 12:19):

Is anyone working on building a formatter for Lean (i.e. producing "nice" Syntax from Syntax)? Maybe one that broadly follows the mathlib style guide?

Patrick Massot (Sep 17 2024 at 12:21):

There was a relevant talk during the Lean Together workshop last year.

Patrick Massot (Sep 17 2024 at 12:21):

See https://leanprover.zulipchat.com/#narrow/stream/419231-Lean-Together-2024/topic/A.20Pretty.20Expressive.20Printer.20-.20Sorawee.20Porncharoenwase/near/412181215

Patrick Massot (Sep 17 2024 at 12:22):

I know this doesn’t answer your question, but hopefully it is still relevant.

Jakob von Raumer (Sep 17 2024 at 12:27):

Thanks, that's at least a start. I think I've actually seen the talk and completely forgot about it :sweat_smile:

Jakob von Raumer (Sep 17 2024 at 12:31):

Would be cool to have such a thing written in Lean though

Jakob von Raumer (Sep 17 2024 at 12:31):

Are you still working on the topic, @Sorawee Porncharoenwase ?

Patrick Massot (Sep 17 2024 at 12:40):

It would be extremely useful, but it seems very non-trivial.

Jakob von Raumer (Sep 17 2024 at 12:53):

It doesn't have to be optimal in order to be useful though, something that works better than an elaboration + pretty printer roundtrip would already be an improvement

Mario Carneiro (Sep 17 2024 at 12:54):

yeah I don't think that paper is particularly relevant for making a formatter for lean code. All the complexity is elsewhere

Jakob von Raumer (Sep 17 2024 at 16:27):

My motivation for asking is that I have a tool that spits out less than perfect Lean code and it would be very cool if that could be reformatted by a function that already knows about precedences (to remove brackets) and preferences for breaks and indentations

Henrik Böving (Sep 17 2024 at 16:40):

Jakob von Raumer said:

Would be cool to have such a thing written in Lean though

I have an implementation of the algorithm on my github. Its "just" that someone has to write the formatter for each kind of syntax that can exist and then we are ready to go^^

Mario Carneiro (Sep 17 2024 at 17:13):

@Jakob von Raumer if that's all you need, lean's pretty printer is probably good enough. It was good enough for mathport

Mario Carneiro (Sep 17 2024 at 17:14):

The hard part of writing a formatter is making something that doesn't destroy code which has already been manually formatted (in a style approved way), like mathlib

Eric Wieser (Sep 17 2024 at 17:15):

The pretty printer also destroys -- comments, right?

Mario Carneiro (Sep 17 2024 at 17:15):

not lean's

Mario Carneiro (Sep 17 2024 at 17:16):

it's actually already been tuned quite a bit for this use case because it was needed for mathport

Eric Wieser (Sep 17 2024 at 17:18):

import Mathlib

open Lean Elab PrettyPrinter

elab "printme" cmd:command : command => do
  Lean.logInfo ( Lean.Elab.Command.liftCoreM <| Lean.PrettyPrinter.ppTerm cmd.raw )
  Lean.Elab.Command.elabCommandTopLevel cmd

printme
def ohno (n : ) -- oops
  : n = 1 := sorry

gives

def ohno
    (n : )
      -- oops :
    n = 1 :=
  sorry

which is illegal syntax

Henrik Böving (Sep 17 2024 at 17:18):

Mario Carneiro said:

The hard part of writing a formatter is making something that doesn't destroy code which has already been manually formatted (in a style approved way), like mathlib

I'm not sure if thats a relevant criterion. In languages that use a formatter by default like e.g. go everyone just accepts the formatting or at worst turns it off for a specific code piece

Eric Wieser (Sep 17 2024 at 17:18):

(but you're right, the comment _is_ kept)

Mario Carneiro (Sep 17 2024 at 17:30):

@Henrik Böving it's true, but in that case the formatting needs to be really good so that people don't complain that the buggy formatting is not correctable because it will be reverted

Mario Carneiro (Sep 17 2024 at 17:30):

I don't think lean's formatter is at that level, it has trouble even making correct syntax in some cases

Mario Carneiro (Sep 17 2024 at 17:31):

(as eric shows)

Mario Carneiro (Sep 17 2024 at 17:31):

although to be fair lean's parser is turing complete so correct formatting is basically undecidable

Edward van de Meent (Sep 17 2024 at 20:34):

Wait what

Edward van de Meent (Sep 17 2024 at 20:36):

I suppose the fact that the language is extensible (and you get to define your own syntax) does make that likely I guess... I hadn't thought about it like that before

Edward van de Meent (Sep 17 2024 at 20:38):

Perhaps it would help to make it a bit more concrete as to what a formatter should fix... Indentation? Spacing? Use certain variants of characters (thinking of mapsto)? Remove redundant brackets?

Jakob von Raumer (Sep 18 2024 at 08:53):

Using certain variants of characters can be done with delaborators, removing redundant brackets is already done by the pretty printer, so I guess indentation and line breaks are the main points?

Mario Carneiro (Sep 18 2024 at 09:44):

yes, most of the mistakes in the pretty printer relate to using too many line breaks or getting indentation wrong (sometimes resulting in syntactically incorrect or invalidly parsed output)


Last updated: May 02 2025 at 03:31 UTC