Zulip Chat Archive

Stream: lean4

Topic: PrettyPrinter


Arthur Paulino (Dec 23 2021 at 18:33):

I'm trying to wrap my head around Lean's PrettyPrinter but I am missing some context. How can I use it to process a file written in Lean 4 and output a version of it that's properly formatted?

Henrik Böving (Dec 23 2021 at 18:42):

You would load the file into your current Lean process by using runFrontend which will give you an Environment the environment contains what the Lean compiler internally calls constants, these are not the constant declarations we know but rather refer to more or less everything that has a name, definitions, theorems, inductives etc. (this includes auto generated stuff that is usually not visible to you as a user).

The PrettyPrinter is capable of pretty printing Syntax objects which you can get by using the delab function from the delaborator on an Expr, if you want parenthesizing you also have to run the parenthesizer over the Synax object that is returned by delab.

Now where you get these Expr objects from and which one's you get differs between the type of declarations you are trying to pretty print and is e.g. one of the things I haven't yet fully figured out with doc-gen4.

However in theory once doc-gen4 is fully fleshed out there should be a part of it (namely the one that is not generating the HTML) that is capable of just loading .olean files and converting them into a format that should be more or less easy to pretty print, but as I said I'm still working on that as well^^. @Arthur Paulino

Arthur Paulino (Dec 23 2021 at 18:43):

Wow that's a surgically elaborated answer. Thank you very much! :heart:

Henrik Böving (Dec 23 2021 at 18:44):

It's what I've been figuring out myself for the past few weeks xD

Gabriel Ebner (Dec 23 2021 at 18:44):

How can I use it to process a file written in Lean 4 and output a version of it that's properly formatted?

If you only want this specific task, you might want to look at this script.

Arthur Paulino (Dec 23 2021 at 19:05):

@Mac Do you think it makes sense to add a lake format functionality that would reformat all Lean files in the repo (recursively digging directories)?
(I can try to implement it if you think so, no need to worry)

Mac (Dec 23 2021 at 20:14):

@Arthur Paulino While Gabriel has recently done a lot of work to improve the pretty printer, it still not in a place where it produces completely logical formatting. Furthermore, the pretty printer has a fixed format with virtually no configuration options, so it is not great for reformatting user code where there is likely to be variance in style. It is best, in my view, for what is designed for, pretty printing Lean code in a semi-reasonable format for things like error messages and doc-gen.

Arthur Paulino (Dec 23 2021 at 20:16):

Got it, thanks!

Gabriel Ebner (Dec 23 2021 at 20:34):

I wouldn't be so negative. While you should not in any way rely on the output, I would certainly like to encourage experimentation. Particularly helpful would be feedback of the form "this no longer compiles after formatting".

We're soon going to use this pretty-printer to automatically port mathlib to Lean 4. The formatting will certainly have a big effect on the future coding style, so I'd like it be at least somewhat idiomatic. Bad formatting, particularly bad formatting that doesn't parse, will also require manual attention. (Of course everything will require manual attention, but we'd like to reduce the work.)

Arthur Paulino (Dec 23 2021 at 20:36):

Oh, I thought we'd have guaranteed parsing (as the bare minimum)

Henrik Böving (Dec 23 2021 at 20:37):

Ideally we do, ideally running the formatter on its own output should (emphasis on should) print the same stuff again as well but that's not guaranteed as in, we have a proof for that or something.

Gabriel Ebner (Dec 23 2021 at 20:38):

Most of the parsing errors (i.e., literally parsing, not elaboration) are due to wrong indentation, and we're lacking a principled way of encoding indentation requirements in the formatter.

Arthur Paulino (Dec 23 2021 at 21:15):

Gabriel Ebner said:

I would certainly like to encourage experimentation. Particularly helpful would be feedback of the form "this no longer compiles after formatting".

Are you also interested in cases of properly formatted lean 4 code for which the script didn't work and gave up with a message like this?
cannot print: parenthesize: uncaught backtrack exception

Arthur Paulino (Dec 23 2021 at 21:20):

Alright, if you feed it with a file containing solely the definition bellow:

def foo : List (String × String) := [("asdf", "zxcv")]

The script outputs:

/-
cannot print: parenthesize: uncaught backtrack exception
def foo : List ( String
-/

Sebastian Ullrich (Dec 23 2021 at 21:48):

Can't reproduce on nightly-2021-12-23

Reid Barton (Dec 23 2021 at 21:51):

maybe an encoding issue?

Sebastian Ullrich (Dec 23 2021 at 21:52):

I thought we left those in the 20th century :grimacing:

Reid Barton (Dec 23 2021 at 22:00):

or a locale misconfiguration? I wouldn't be too surprised if Lean itself assumes its input is UTF-8 while a generic Lean program decodes input according to the locale

Arthur Paulino (Dec 23 2021 at 22:25):

I tested it on nightly-2021-12-21. I'm gonna test on _-23 instead

Arthur Paulino (Dec 23 2021 at 22:29):

Still happens here. I have a Foo.lean file with the content:

def qq : Bool := true

def foo : List (String × String) := [("asdf", "zxcv")]

The reformat script outputs:

def qq : Bool :=
  true

/-
cannot print: parenthesize: uncaught backtrack exception
def foo : List ( String
-/

Arthur Paulino (Dec 23 2021 at 23:00):

This is the setup to reproduce the error above:
https://github.com/arthurpaulino/TestPrettyPrinter

Henrik Böving (Dec 23 2021 at 23:09):

Can reproduce over here.

Arthur Paulino (Dec 24 2021 at 03:10):

The exception is being thrown when calling this function.

Gabriel Ebner (Dec 24 2021 at 11:52):

I'm not sure what's going on here. There seems to be a difference between the compiled and interpreted versions. It works fine if you run lean --run lean --run Main.lean Foo.lean`. I guess it's got something to do with initializers and imports or so.

Arthur Paulino (Dec 26 2021 at 18:29):

So you think there might be some deeper problem going on? Should I open an issue on Github?

Arthur Paulino (Feb 24 2022 at 18:51):

I decided to test this again given the fact that the toolchain has evolved but it still ends up with the same error.

Basically, we have Lean 4 code that doesn't work when built but works on script mode. Is this worrisome?

Instructions to reproduce on the README.

Mauricio Collares (Feb 24 2022 at 18:58):

AFAICT LEAN_PATH gets modified when running in script mode. Does it work if you set LEAN_PATH similarly when running the built code?

Mauricio Collares (Feb 24 2022 at 18:59):

Not sure if that should make a difference, to be honest. Just a guess.

Arthur Paulino (Feb 24 2022 at 19:02):

@Mauricio Collares is this what you mean by setting LEAN_PATH?

Mauricio Collares (Feb 24 2022 at 19:40):

That's what I meant, but it doesn't really matter apparently. I don't know what could be causing the problem.

Mauricio Collares (May 12 2022 at 19:27):

I know absolutely nothing about this code, but below is a slightly minimized version whose compiled and interpreted versions still behave differently.

Main.lean:

import Lean
open Lean Elab

unsafe def main (args : List String) : IO Unit := do
  let input := "def foo : List Nat := []"
  let inputCtx := Parser.mkInputContext input ""
  let (header, parserState, messages)  Parser.parseHeader inputCtx
  let (env, messages)  processHeader header {} messages inputCtx 1024
  let s  IO.processCommands inputCtx parserState
    { Command.mkState env messages {} with infoState := { enabled := true } }
  let topLevelCmds : Array Syntax  s.commandState.infoState.trees.toArray.mapM fun
    | InfoTree.context { .. } (InfoTree.node (Info.ofCommandInfo {stx, ..}) _) => pure stx
    | _ => IO.Process.exit 1
  IO.print topLevelCmds

Output of lean Main.lean -c Main.c && leanc -o Main Main.c && ./Main (or using lake build with a minimal lakefile):

#[(Command.declaration
 (Command.declModifiers [] [] [] [] [] [])
 (Command.def
  "def"
  (Command.declId `foo [])
  (Command.optDeclSig [] [(Term.typeSpec ":" (Term.app `List [`Nat]))])
  (Command.declValSimple ":=" (Term.depArrow (Term.instBinder "[" [] <missing>)))))]

Output of lean --run Main.lean:

#[(Command.declaration
 (Command.declModifiers [] [] [] [] [] [])
 (Command.def
  "def"
  (Command.declId `foo [])
  (Command.optDeclSig [] [(Term.typeSpec ":" (Term.app `List [`Nat]))])
  (Command.declValSimple ":=" («term[_]» "[" [] "]") [])
  []
  []
  []))]

Mario Carneiro (May 12 2022 at 19:54):

That depArrow interpretation looks really weird. It seems like it is interpreting [] as the start of a dependent arrow expression like [Foo A] -> Bar where [Foo A] is an instance argument (but the Foo A part is missing), and the -> Bar is just totally omitted (which seems totally wrong from the depArrow parser)

Gabriel Ebner (May 12 2022 at 20:29):

It looks like the parser doesn't know about the list notation here. The obvious difference between list notation and the dependent arrow notation is that the dependent arrow notation is a builtin notation, and list notation is not.

Sebastian Ullrich (May 13 2022 at 09:52):

The problem is hidden in the ignored messages variable:

:1:0: error: unknown package 'Init'

Adding

  initSearchPath ( findSysroot)

should fix that

Sebastian Ullrich (May 13 2022 at 09:56):

I updated reformat.lean

Mauricio Collares (May 13 2022 at 15:09):

Ah, many thanks for looking at this! Adding the initSearchPath line to the minimized example, the messages variable now contains the following:

:1:0: error: could not find native implementation of external declaration 'UInt64.ofNatCore' (symbols 'l_UInt64_ofNatCore___boxed' or 'l_UInt64_ofNatCore')

Mauricio Collares (May 13 2022 at 15:10):

I found your answer at https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/include.20lean.2Eh and it makes Arthur's example work. Thanks again!

Sebastian Ullrich (May 13 2022 at 16:57):

Ah, I must have tried that first and left it in!


Last updated: Dec 20 2023 at 11:08 UTC