Zulip Chat Archive

Stream: general

Topic: docstring of function arguments/return value

Asei Inoue (Apr 21 2024 at 11:41):

What is the standard way to annotate function arguments and return values in Lean with Doc comments?

It would be useful to be able to read the documentation when hovering over an argument, but is this difficult to achieve?

Max Nowak 🐉 (Apr 21 2024 at 12:51):

I don’t think this is possible at all. We have markdown in doc strings, that’s about it.

Alok Singh (Aug 13 2024 at 20:13):

i was thinking about this too looking at @Martin Dvořák‘s code. i often wish for this 99808EB0-2059-403B-86D3-924B6705692C.png

Asei Inoue (Aug 13 2024 at 23:07):

I found #guard_msgs has documented arguments.

Martin Dvořák (Aug 14 2024 at 07:21):

To be honest, my comments on extendedFarkas provide little to no value for the reader. By reading the code, you get the same or better information in approximately the same time as it takes to read the comments. I might as well delete them — just keep the docstring on the top, as it provides context.

Alok Singh (Aug 14 2024 at 07:49):

Martin Dvořák said:

To be honest, my comments on extendedFarkas provide little to no value for the reader. By reading the code, you get the same or better information in approximately the same time as it takes to read the comments. I might as well delete them — just keep the docstring on the top, as it provides context.

i totally disagree. the prose is much easier to read than math/code for most people. if you’re intro-ing using yourself as a generic reader, i think that’s off for figuring out what’s valuable to them. i certainly find the prose easier for the hypotheses. not for the matrix, for the bounding vector b it’s good but there i’d just call it boundingVector

Damiano Testa (Aug 14 2024 at 07:52):

I also think that it is really rare that a comment does not provide added value. As Alok also said: even a mere translation into natural language of what the code represents is often useful.

Yaël Dillies (Aug 14 2024 at 07:56):

But you have to remember that every comment is a maintenance burden. A docstring needs to be reviewed, kept in sync, copied over to similar lemmas...

Yaël Dillies (Aug 14 2024 at 07:57):

So, even if it's obvious that most comments add value to the reader, it's not obvious that some comments have positive value once you take into account both the reader and the library maintenance.

Yaël Dillies (Aug 14 2024 at 07:58):

With the advent of LLM-based tools, it is thinkable that trivial comment that just restate a piece of Lean code shouldn't be written and maintained in the codebase, but rather generated on demand when the user asks for it

Damiano Testa (Aug 14 2024 at 07:59):

I would have reached the opposite conclusion: if something needs to be kept in sync, then this is a feature of a meaningful text! :smile:

Michael Rothgang (Aug 14 2024 at 09:15):

There certainly are low-value comments, the standard example being

-- Increment `i` by one.
let i := i + 1

That said: if you find most of your comments on functions or definitions are low-value, I'd tend to

wonder if the comments are as useful (are they detailed enough, keeping in mind the the right target audience?)
wonder if the extra definitions are useful. (For instance, not every lemma in mathlib has a documentation string. For some lemmas, I'd argue this is fine!)

Eric Wieser (Aug 14 2024 at 09:21):

In favor of argument docstrings:

They provide a source for a tooltip when hovering over (argname := _)
They avoid us having to invent syntax inside the docstring

And arguments against:

They make declarations headers much more verbose, as the arguments might be hidden amidst a large body of text
Sometimes arguments are best documented all at once rather than separately ("foo a b is the foo of a at b")

Mario Carneiro (Aug 14 2024 at 16:24):

Yaël Dillies said:

With the advent of LLM-based tools, it is thinkable that trivial comment that just restate a piece of Lean code shouldn't be written and maintained in the codebase, but rather generated on demand when the user asks for it

Actually, I was having a conversation with some LLM folks not long ago that came to basically the opposite conclusion: one reason why LLMs aren't better at autoformalization and informalization is lack of parallel corpora, so paired english and lean text is actually hugely valuable to them and they wish the whole library was marked up like this.

Mario Carneiro (Aug 14 2024 at 16:27):

Not to mention that using an LLM to generate the text on demand means there is no chance for an expert reviewer to be able to catch mistakes before they are shown to the user. This is less of an issue if an LLM fills in some form which is edited by a human if necessary and then committed somewhere for retrieval by the user

Violeta Hernández (Aug 15 2024 at 13:45):

I'm all for good documentation, but I think the benefits of documenting every single value don't outweigh the inherent verbosity. In 99% of cases, understanding what the variables in a theorem are doing is no more difficult than understanding the theorem itself.

Violeta Hernández (Aug 15 2024 at 13:46):

Even in cases where you have much more bespoke arguments, e.g. complex induction proofs split into multiple lemmas, it's probably better to have a top-level comment explaining what these assumptions are and where they come from

Last updated: May 02 2025 at 03:31 UTC