Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: comments

Reid Barton (Sep 16 2020 at 12:18):

Stanislas Polu said:

Kevin Buzzard hi hi! To follow-up on your question. Anybody who has applied State of the Art technology to code generation knows how useful are comments in code to train models to translate from informal prompts to usable code. If I was to make one recommendation for the Lean community, that would be to comment Lean code as much as conceivably possible. Future ML experts will thank you for that :)

I hadn't heard about anything like this before and I'm curious: are there particular kinds of comments that are especially useful?

Reid Barton (Sep 16 2020 at 12:22):

There is quite a variety of sorts of comments in mathlib, for example:

technical comments about local choices made to work around a bug or limitation in Lean
comments within tactic blocks that give the reader a sense of the structure of the argument, without having to load the proof in an IDE
declaration-level "docstring" comments, mainly on definitions, or on particularly complicated or well-known lemmas
module-level docstrings which give an overview of the contents of a module, any relevant implementation decisions, local notations, etc.
"library notes" which explain the reason for some pattern (usually a technical one related to the workings of Lean in particular) which can be referenced throughout mathlib

Kevin Buzzard (Sep 16 2020 at 12:23):

I would like to use this as an excuse to fill my forthcoming mathlib PRs with comments so hopefully the answer is going to be "everything is useful" ;-)

Kevin Buzzard (Sep 16 2020 at 12:23):

I was imagining commenting long proofs

Kevin Buzzard (Sep 16 2020 at 12:23):

One line comments saying where we're going

Reid Barton (Sep 16 2020 at 12:25):

I always wondered whether there was anything to be gained from (hypothetical) docstrings like

/-- The canonical "swap" isomorphism between `α × β` and `β × α`. -/
def prod_comm (α β : Type*) : α × β ≃ β × α := [...]

that wasn't already obvious from the type, but now I can see that it might teach a machine reader something about the meaning of the English words "canonical" and "swap"!

Reid Barton (Sep 16 2020 at 12:27):

Well if the AI can do something useful with copyright and author information comments then we're all set on that front.

Reid Barton (Sep 16 2020 at 12:45):

Kevin you asked about Stacks project crossreferences--I guess any time one of your lemmas overlaps in content with a Stacks project item (but isn't contained in a more specific item) then you should link the lemma to the item.

Stanislas Polu (Sep 16 2020 at 14:07):

Everything is useful

Stanislas Polu (Sep 16 2020 at 14:12):

More precisely,

Comments inlined in proofs would help alignment between local informal hints and actual proof code.
Declaration-level comments even if trivial would provide precious alignement between informal and formal definitions or theorem statements.
Module level docstrings that provide an overview are very probably useful as well

I think intuitively, any comment that relate to some mathematical knowledge, even if trivial has a good chance to be leverageable in a useful way.

I have less to say about "technical comments" about Lean or specific work-arounds but you shouldn't feel like they would hurt anything.

Stanislas Polu (Sep 16 2020 at 14:17):

I think a very useful type of comment that does not really exists that much in Lean today would be a high-level summary of the proof strategy as part of the declaration-level docstring. But I also appreciate the fact that this may not be very natural for mathlib contributors to provide those.

Kevin Buzzard (Sep 16 2020 at 14:56):

You say this but I suspect that when mathematicians write lean they very much know the high level summary of the proof strategy. These are not like the "heavy refl" or crush tactics of program verification

Stanislas Polu (Sep 16 2020 at 14:59):

Yeah, I mostly meant it might generally feel repetitive or useless to them to state it in plain english, as it does not seem to happen a lot.

Kevin Buzzard (Sep 16 2020 at 16:36):

People don't see it done, so they don't do it. The teaching materials I write are full of comments.

Last updated: May 02 2025 at 03:31 UTC