Zulip Chat Archive
Stream: lean4
Topic: Is there a way to get Lean AST into JSON from CLI?
Dan Abramov (Feb 14 2025 at 01:22):
I'm not Seriously Considering this yet (and it may be a foolish idea anyway) but I'm curious to try making a Prettier plugin for Lean.
I know there's an in-progress official pretty formatter effort written purely in Lean. From what I gathered, it's far from finished. The reason I'm curious about trying to integrate with Prettier is that Prettier has some interesting know-how (like nuances about "fitting things into a line" etc) and I'm wondering whether its abstractions are sufficient to overcome these difficulties. (The core is not JS-specific.) If anything, it would be a fun exercise for me to get to know the Lean syntax better.
To even take a stab at it, I'd need two things:
- A way to get Lean AST from a Lean file into a JSON format (so I can consume it from my JS code)
- A way to print individual AST tokens later (while using Prettier's abstractions for "soft line breaks" and other layout)
I can probably take a stab at writing the latter (which would be the majority of my learning exercise) but I really need the former.
To conclude: is there a command I can shell out to that does this for me? Just the AST, in a JSON form. The key requirement is that it needs to be a standalone program I can call from JS — I'm not proficient at Lean yet so I don't want to write the glue code in Lean. I've seen mentions of --ast
flag but it doesn't seem to work anymore.
Thanks!
Dan Abramov (Feb 14 2025 at 01:29):
I guess ideally I'd want an executable that can read Lean source code via stdin and output an AST in the JSON form via stdout.
Dan Abramov (Feb 14 2025 at 01:51):
Oops I maybe should have posted this into #lean4 dev instead. Doesn't seem like there's a way to move...
Henrik Böving (Feb 14 2025 at 08:29):
There does not currently exist such a program but it could probably be written out in a reasonable amount of time. The key reason that the formatter would have to be written in Lean itself is that Lean users can extend the Lean syntax dynamically as the program is being elaborated so they also need to be able to dynamically register formatting rules, otherwise the custom syntax cannot be handled properly by a potential formatting tool.
Eric Wieser (Feb 14 2025 at 09:53):
Dan Abramov said:
The reason I'm curious about trying to integrate with Prettier is that Prettier has some interesting know-how (like nuances about "fitting things into a line" etc
I think that unfortunately Lean syntax is permitted to be sensitive to the exact placement of line breaks and indents, so it's possible that prettier's attempt to fit things into a line would violate these constraints
Bhavik Mehta (Feb 14 2025 at 12:56):
I'd also really love to see an autoformatter for Lean (I think it was also one of my first asks when I joined the community, many years ago!). Is there a tracking issue for this already and/or is it on the current FRO roadmap?
Henrik Böving (Feb 14 2025 at 13:04):
There's discussion on this matter over here
Mario Carneiro (Feb 14 2025 at 13:07):
my impression of the pretty expressive paper is that it's not about a lean formatter at all, it's just a formalization of a new pretty printing algorithm which is not language specific
Mario Carneiro (Feb 14 2025 at 13:07):
if that's all that is needed we have one in lean already, it's Lean.Format
and it's been around ~forever
Rado Kirov (Feb 16 2025 at 00:12):
Sounds like a great plan of attack. I can try to help out if there are subtasks that are parallelisable (I have a bit of experience is TS language tooling).
Having significant white-space shouldn't be a deal breaker for prettier (fun fact JS also has significant white space technically due to semicolon insertion), it just means it needs to have harder rules around indentation, but from what I gather prettier's pluggable whitespace logic is expressive enough for that. It's exactly what https://github.com/prettier/plugin-python does.
Sebastian Ullrich (Feb 21 2025 at 10:52):
From the FRO side I can reassure that a robust and extensible code formatter is still part of our 5-year roadmap and of continued interest to us though at this point I can't give a specific date for its delivery. However, we do hear the significant interest from users and are looking into getting to it sooner rather than later.
Zeyuan Hu (Mar 07 2025 at 17:35):
Just curious what's the current practice on enforcing format on the code being merged into Lean4 master branch? Is there any check on that in the merging pipeline? I would like to replicate that practice (and hopefully it is semi-automated) locally while waiting for the formatter.
Sebastian Ullrich (Mar 07 2025 at 18:24):
The formatting check is currently human-driven :) . Otherwise all linting etc is the same as in a local build.
Rado Kirov (Mar 08 2025 at 04:36):
I think this file does exactly what you are asking for https://github.com/lean-dojo/LeanDojo/blob/main/src/lean_dojo/data_extraction/ExtractData.lean . Here is the output on the first one line file in MIL - https://gist.github.com/rkirov/b85e53225cae4a59c253158b1110a950 . For unknown to me reason it crashes on other files.
It also looks like it doesn't contain comments, which I think is in theory correct for AST (you want a "concrete" syntax tree for a formatter?) Can Prettier deal with missing comments some how (additional parsing on the blocks outside what the ones in the AST ?)
Last updated: May 02 2025 at 03:31 UTC