Zulip Chat Archive

Stream: lean4

Topic: Are docstrings documented anywhere?


Josh Tilles (Jun 25 2023 at 01:33):

Is there any documentation that describes Lean4’s docstrings? There are plenty of examples in Lean itself that are helpful for me to consult, but I’m wondering if there is any official description.

For example, a few minutes ago I fiddled with some example code to ascertain whether Lean accepted a constructor’s docstring if it’s written between the pipe and the constructor name. It appears so, but now I’m not sure whether that’s intended behavior or accidental leniency.

Mario Carneiro (Jun 25 2023 at 01:35):

The places where docstrings can appear are authoritatively described by the grammar, although I realize that's just a "check the code" recommendation

Mario Carneiro (Jun 25 2023 at 01:36):

the situation with doc strings around the | in inductives is that it used to only be accepted after the bar, but that looks weird and is hard to format so we added an additional place to put doc strings before the | and transitioned the style to that

Mario Carneiro (Jun 25 2023 at 01:38):

it's a bit difficult to prevent docstrings after | from being syntactically correct because they are considered part of the declModifiers and we want other modifiers like attributes or protected to come after the bar

Kyle Miller (Jun 25 2023 at 01:38):

Here, I just checked the code :smile: https://github.com/leanprover/lean4/blob/4036be4f50824ab3144a3543f83591a25b6476c3/src/Lean/Parser/Command.lean#L123

It says a docComment can appear before the |, and after the | there's a declModifiers, and that can start with a docComment too

Kyle Miller (Jun 25 2023 at 01:41):

Mario Carneiro said:

the situation with doc strings around the | in inductives is that it used to only be accepted after the bar

I didn't realize there was a grammar justification why the docs render inductives like docs4#Lean.Expr

Mario Carneiro (Jun 25 2023 at 01:41):

and I just checked the elaborator code :smile: https://github.com/leanprover/lean4/blob/4036be4f50824ab3144a3543f83591a25b6476c3/src/Lean/Elab/Declaration.lean#L150-L152

and it says that the doc string can be accepted in either place, it gives an error if you put both

Mario Carneiro (Jun 25 2023 at 01:42):

yeah that docgen output looks pretty crazy

Kyle Miller (Jun 25 2023 at 01:52):

There was some discussion about improving it here. It might be nice to have docstrings come after constructors in the documentation to match how other definitions look, even if it's not justified by the grammar. Another thing that came up was that it might be nice to visually distinguish parameters.

Mario Carneiro (Jun 25 2023 at 01:59):

I think the parameters should just be dropped from the constructor types

Mario Carneiro (Jun 25 2023 at 01:59):

they are visible in the inductive declaration already

Max Nowak 🐉 (Jun 26 2023 at 15:27):

Mildly related: Is there plans to elaborate code inside docstrings in a similar way to how Rust does it? That way you could have ctrl-click-go-to-definition in docs, and have semantic highlighting in doc strings, and the examples inside doc strings would be checked by the compiler as well, thus making sure they don't silently break.

James Gallicchio (Jun 26 2023 at 15:36):

I wonder if that can be implemented as a linter. Isn't there a way to call the parser on arbitrary strings?

James Gallicchio (Jun 26 2023 at 15:37):

In order to have semantic highlighting and click to definition though I guess it would need to be part of the AST

James Gallicchio (Jun 26 2023 at 15:38):

One issue with actually elaborating docstrings is that I sometimes refer in a docstring to something later in the file...

James Gallicchio (Jun 26 2023 at 15:40):

It feels like you'd want to elaborate docstrings after processing the entire project (since sometimes you forward-refer to stuff in later files)

Wojciech Nawrocki (Jun 26 2023 at 16:59):

James Gallicchio said:

It feels like you'd want to elaborate docstrings after processing the entire project (since sometimes you forward-refer to stuff in later files)

Given how the language server works, my intuition would be that running code in docstrings would be quite easy (change the docstring elaborator; done), whereas running all docstrings after processing everything would be very difficult. Lean elaboration is designed to be quite sequential.

Sebastian Ullrich (Jun 26 2023 at 17:06):

Yes, the best compromise I was able to come up with in my mind (i.e. with no concrete plans to implement it) was to elaborate the docstring after the current command, which would allow it to at least access that function and any above it

James Gallicchio (Jun 26 2023 at 17:08):

Yeah. But anything short of allowing forward references between files would be a non-starter for me, at least how I use docstrings ATM. I think there is reasonable justification for elaborating docstring code with the elab state of the project's root file. But I have no idea how you would implement that in the language server without substantial design changes (?)

Max Nowak 🐉 (Jun 26 2023 at 17:33):

I may be low-key interested in implementing at least the easy version, but I haven't contributed to the Lean compiler before, and got tons of other things to work on ;-;

Jireh Loreaux (Jun 26 2023 at 18:14):

The problem I see with running forward references in docstrings is that you would have to first compile the deeper file. If you're making edits to the current file, that seems problematic, unless you only use a cached version.

James Gallicchio (Jun 26 2023 at 19:53):

yeah, this seems like something that would have to be best effort in vscode

David Thrane Christiansen (Jun 26 2023 at 20:10):

In case it's useful, here's some notes from the past. We had this in Idris 1 - not sure if it's been redone for Idris 2 yet. Some things that were turned out well:

  • Having a modifier for code blocks that said they should contain an error. This caused them to get a nice red underline with an error message tooltip, rather than having elaboration fail due to errors in the code block. This was good for demonstrating "this fancy type prevents this mistake".
  • Having the equivalent of #eval inside docs, so that the result of evaluation was rendered in the docs. Good for program examples. We labeled those blocks with the example modifier.

So e.g. one could write:

||| Add two numbers.
|||
||| ```idris example
||| add 2 3
||| ```
||| Adding strings doesn't work:
||| ```idris error
||| add "two" "three"
||| ```
||| @ n the first number
||| @ k the second number
plus : (n : Nat) -> (k : Nat) -> Nat
plus 0 k = k
plus (S n') k = S (plus n' k)

There was also an explicit convention that the first paragraph of the docs was a summary to be shown in tooltips and the like.

We used the scope that existed right after the decl was elaborated for the docs, and it worked OK. In retrospect, it would have been better to wait for the end of the mutual block.

A challenge with Markdown is that inline code (that is, not a block) has no way of indicating the language or modifying it, and sometimes people want to refer to code in other languages (e.g. JSON or SQL or C or whatever) or use code font for literal output. IIRC (and I probably don't), we tried to elab those as type-correct Idris but didn't fail if we couldn't.

In Lean, it would be cool if code blocks could be hooked up to a command elaborator via the modifier name that would allow the above features to be built as libraries.

David Thrane Christiansen (Jun 26 2023 at 20:12):

I also had a prototype of including image references, which was nice. I never merged it due to not having a good place in the package to stick files like that - I think Lake is advanced enough that this could be fixed now. Showing images in docs is occasionally really useful!

David Thrane Christiansen (Jun 26 2023 at 20:36):

The ||| syntax for Idris docs was to make them not be syntactically comments. This was done in order to make it a parse error when they were placed in an invalid location, rather than silently ignoring them as comments. This decision had some unfortunate knock-on effects related to syntax highlighting in certain editors, but :shrug:

Ruben Van de Velde (Jun 26 2023 at 20:43):

Same is true with /-- -/

David Thrane Christiansen (Jun 26 2023 at 20:53):

Ruben Van de Velde said:

Same is true with /-- -/

Good! :-)

Mario Carneiro (Jun 26 2023 at 21:33):

I assume it is a typo that the declaration named plus has a docstring about add :wink: , would this have been highlighted in the editor?

Max Nowak 🐉 (Jun 26 2023 at 21:36):

I would expect it to at least, since it can't find add in the environment, only plus. And the build should fail then as well (docs are important, and easy to fix).

David Thrane Christiansen (Jun 27 2023 at 10:05):

Mario Carneiro said:

I assume it is a typo that the declaration named plus has a docstring about add :wink: , would this have been highlighted in the editor?

Yes, assuming there was no addin scope, it would have failed to compile because the example code block contained an elaboration-time error. I made these features because I'm terrible at this kind of thing and want the computer to take care of it for me!

OTOH, if add was in scope, then this would not have failed. There was no relevance requirement for docs.

We didn't have snippet completion, but the explicit docs for arguments could be used to build a nice experience on top of that, I think.


Last updated: Dec 20 2023 at 11:08 UTC