Zulip Chat Archive

Stream: general

Topic: paper: pretty expressive printer

Asei Inoue (Feb 03 2025 at 16:53):

title:A Pretty Expressive Printer (with Appendices)
link: https://arxiv.org/abs/2310.01530

I haven't read the entire paper yet; I've only looked at the abstract, but it seems very interesting. The fact that it uses Lean is also intriguing, and I thought that the proposed technique might be applicable to Lean as well. What do you think?

Henrik Böving (Feb 03 2025 at 17:28):

I already implemented it: https://github.com/hargoniX/pfmt, it's just waiting for someone :tm: to actually make use of it to format Lean code (which is a non trivial effort)

Asei Inoue (Feb 04 2025 at 01:01):

oh nice! good job!

Walter Moreira (Feb 04 2025 at 02:28):

@Henrik Böving what is the process you envision for using your tool to format Lean code? Would one build the pretty printing language (I believe Doc in your code) from Syntax values in Lean?

Walter Moreira (Feb 04 2025 at 03:11):

... and if I may add a follow up question: would Lean.Parser.runParserCategory be the right function to convert a source file to Syntax?

Henrik Böving (Feb 04 2025 at 07:11):

Walter Moreira said:

Henrik Böving what is the process you envision for using your tool to format Lean code? Would one build the pretty printing language (I believe Doc in your code) from Syntax values in Lean?

Yes

Walter Moreira said:

... and if I may add a follow up question: would Lean.Parser.runParserCategory be the right function to convert a source file to Syntax?

Its more involved than that. As lean can extend itself on the file you need to elaborate a file as you are parsing it in case someone declares a syntax in the middle of the fly. In general we most likely need an attribute to tag pretty printers of custom user declared syntax. I assume that for many syntaxes like single symbol ones or infix notation we can automatically derive a pretty printer so one would need to integrate that with these mechanisms. And then someone has to go through all of core and add a formatter for the at least dozens, potentially hundreds (I didn't count them yet) kinds of syntax that we have. And then someone has to do that for all the mathlib syntax that cannot be auto generated and then we should be good to go.

Walter Moreira (Feb 04 2025 at 21:07):

Thank you for the information. It is indeed more involved than what I was thinking. Are you aware of someone already working on this plan, using your pfmt code?

If so, would it be possible to contribute to that effort, or otherwise to start at least an incremental approach to mapping Lean's syntax into your framework?

(I'm a big fan of auto-formatters, like black in Python, and I really miss that tool in Lean. It's an itch to scratch...)

Henrik Böving (Feb 04 2025 at 21:08):

Are you aware of someone already working on this plan, using your pfmt code?

Frithjof Sletten (Feb 04 2025 at 23:24):

Hi, I am considering using Pretty Expressive to format Lean in connection to my master thesis.

My current Idea would be to get the Syntax of all top-level commands using a previous attempt at formatting (essentially call processHeader, then IO.processCommads and then format the code based on the InfoTree): https://github.com/leanprover/lean4/blob/master/script/reformat.lean
As far as I can tell this would let us see the current state of the environment and options for each top-level command. Which would be useful for temporary configuration of the formatter, such as disabling it momentarily using set_option.

Note that InfoTree.context has changed since the code was written. This would be my smallest fix

let topLevelCmds : Array CommandSyntax ← s.commandState.infoState.trees.toArray.mapM fun
  | InfoTree.context i
      (InfoTree.node (Info.ofCommandInfo {stx, ..}) _) =>
      match i with
      | .commandCtx { env, currNamespace, openDecls, .. } =>
        pure {env, currNamespace, openDecls, stx}
      | _ =>
        failWith "not a commandCtx"
  | _ =>
    failWith "unknown info tree"

I am new to Lean so I am probably missing something.
For example, I am uncertain whether it is possible to determine the value of a reference after the code has been executed at a specific point in time.
This interested me because I considered supporting changing how the syntax is formatted part way through the file.
However, this might not be important because the only use I could see is making testing slightly easier.

I am also considering connecting to the LSP so we wouldn't have to evaluate the code again before formatting it, to hopefully speed up formatting while editing.

Henrik Böving (Feb 05 2025 at 07:13):

Your general approach to this project above sounds correct to me yes.

Organization wise an important thing to keep in mind is that this project is on the Lean FRO roadmap so it will also be tackled sooner or later by a FRO employee (potentially me). I can't make any guarantees about when this is going to happen so it might be that we end up working on a similar project before or during the period you plan to write your thesis in. It might also happen that we start working on it after your thesis and find that it needs to be reworked to make it work as we would like it to.

Last updated: May 02 2025 at 03:31 UTC