Zulip Chat Archive

Stream: new members

Topic: Full BNF syntax?

Shanghe Chen (Jul 02 2024 at 05:39):

Hi is there now a BNF file for lean? I checked https://github.com/leanprover/lean4/blob/master/doc/lexical_structure.md#identifiers but it seems not a full specification

Ruben Van de Velde (Jul 02 2024 at 05:50):

There is not

Shanghe Chen (Jul 02 2024 at 06:10):

Thanks! I am trying to develop some basic functionality in Intellij Idea in which highlighting may require this for generating the ast. Yeah I will try if get ast from lsp/sematicsTokens work or not

Johan Commelin (Jul 02 2024 at 06:15):

Lean's syntax can be arbitrarily extended by the user. To the point that people can (and have!) embedded DSLs for html and json inside Lean.
For this reason, syntax highlighting is usually left to the Lean server, and it gives highlighting instructions to the editor via LSP.

Shanghe Chen (Jul 02 2024 at 06:20):

Yeah it’s definitely the correct way but IJ does still not support semantic highlighting from lsp yet. Maybe I will Ieave it alone first and try Lean.tmbundle for basic highlighting

Marc Huisinga (Jul 02 2024 at 08:07):

The VS Code extension uses the following TextMate grammar as a fall-back when the server has not provided any semantic tokens yet: https://github.com/leanprover/vscode-lean4/tree/master/vscode-lean4/syntaxes

Marc Huisinga (Jul 02 2024 at 08:17):

It's worth pointing out that the semantic token request is by far the strangest request in our language server and that we actively violate the LSP spec to make it work for us, so there's a chance that you will run into issues here.

The core problem is that VS Code has no support for LSP's result streaming mechanism, but we cannot wait with responding to the semantic tokens request until the full file has been processed, as that may take minutes.
To work around this, when the client requests the semantic tokens for the full file, we instead only provide it with the semantic tokens up to the current point of elaboration. Unfortunately, now the client thinks that it knows all the semantic tokens, so it will not request the semantic tokens again for a while.
To make the client re-request the semantic tokens, we send a workspace/semanticTokens/refresh server to client request every couple of seconds so that the client re-requests the set of semantic tokens for the current point of elaboration in the file.

This is a really bad hack, but there's not much we can do about it at this point. I have some ideas to improve this and I also want to see if I can perhaps hack the VS Code client to make it support some limited form of result streaming, but that's for another day.

Shanghe Chen (Jul 02 2024 at 09:38):

Thanks for pointing this out! Yeah the hard part about semantic highlighting seems to be doing it incrementally. Also the textmate file is very nice too! I am thinking if it’s possible to highlight variable names using it, currently it seems only keywords/comments/notations are recognized using the textmate rules (using the old Lean.tmlanguage file)
30E04DDD-94D1-4CE6-A32B-56366B20A3B0.jpg

Last updated: May 02 2025 at 03:31 UTC