Zulip Chat Archive

Stream: general

Topic: lean classes in documentation

Kevin Buzzard (Aug 12 2020 at 09:45):

I feel silly saying "R is a ring" in documentation, because I may as well say "R is a ring". I feel equally silly saying "R is a commutative ring" because it's much shorter to say "R is a comm_ring". I have noticed that I write inconsistent docstrings as a consequence (not that they prove false, just that I refer to classes using an inconsistent convention). I am not sure whether consistency is important here but given that my PhD student Ashvni Narayanan seems to have done a super job on the bundled-subring branch and I'm just going through it and tidying up docstrings etc before a PR (and I want to put a docstring on subsemiring.lean, despite the fact that nobody will probably ever read it -- who cares about subsemirings? ;-) ) I thought that now might be a good time to either come up with a convention or declare that we are happy that some people say "let R be a ring" and others say "let R be a ring".

Rob Lewis (Aug 12 2020 at 09:48):

The html docs will try to link identifiers in backticks. So if you write ring it should link it to docs#ring . I think we also have the custom of backticking R, but obviously this doesn't linkify and we're not fully consistent about it.

Kevin Buzzard (Aug 12 2020 at 09:50):

Just to give a specific example which is already in mathlib (submonoid.map v addsubmonoid.map):

The image of a submonoid along a monoid homomorphism is a submonoid.

The image of an `add_submonoid` along an `add_monoid` homomorphism is
an `add_submonoid`.

The difference is probably simply because the Lean class submonoid is a word, whereas add_submonoid has a character in it which isn't even a valid character in words.

Kevin Buzzard (Aug 12 2020 at 09:52):

As a mathematician who is very used to writing "let $R$ be a ring" (even in emails) I have no objections to R indicating a term.

Rob Lewis (Aug 12 2020 at 09:52):

Er, let me walk back my claim,it doesn't do this as aggressively as I thought. See https://leanprover-community.github.io/mathlib_docs/tactics.html#ring where ring doesn't get linked, which is a good thing in this context.

Kevin Buzzard (Aug 12 2020 at 09:55):

I am getting the impression that you are not worried about the structural inconsistency in the two map docstrings so I will go ahead with my convention of doing it inconsistently and concentrate instead on maximising readability (although to be honest they're both pretty maximally readable)

Rob Lewis (Aug 12 2020 at 09:56):

I meant to suggest to prefer submonoid over submonoid because links in the docs are good. But clearly there's an issue with the linkifier, which now that I think about is has been pointed out before.

Yury G. Kudryashov (Aug 12 2020 at 17:38):

It would be nice to be able to write [additive monoid](add_monoid) (possibly with some prefix before add_monoid) as well.

Rob Lewis (Aug 12 2020 at 18:28):

https://github.com/leanprover-community/doc-gen/pull/45

Rob Lewis (Aug 12 2020 at 18:29):

Yury G. Kudryashov said:

It would be nice to be able to write [additive monoid](add_monoid) (possibly with some prefix before add_monoid) as well.

I don't think there's a markdown-standard way to do this. It would be easy enough to hack our own syntax into doc-gen. But this pushes us even further into having "markdown" comments that can really only be interpreted by our own tools.

Rob Lewis (Aug 12 2020 at 18:30):

What we do now (linkify identifiers that appear in backticks) is okay because it degrades gracefully. Any markdown parsing that doesn't know how to do the linkification will just display them as tt text like normal.

Mario Carneiro (Aug 12 2020 at 19:22):

What if it is treated as a link transformer rather than a markdown interpretation? That is, anything that should be a link has to be markdown formatted like one, but the target of the link might be processed by us to something else

Mario Carneiro (Aug 12 2020 at 19:23):

That means you have to write [`ring`] and a markdown formatter will put a link there to `ring` which isn't a web page, but we know how to turn that into docs#ring

Rob Lewis (Aug 12 2020 at 19:58):

anything that should be a link has to be markdown formatted like one,

I really don't think we should get rid of the current behavior that tries to linkify everything in code blocks. It's super convenient. Even ignoring code blocks with more than one name in them, look at the top of https://leanprover-community.github.io/mathlib_docs/group_theory/submonoid/operations.html where all the names after bullet points should be links. And again, this degrades gracefully in other markdown processors.

Rob Lewis (Aug 12 2020 at 20:00):

The question is how to link text to a declaration without giving the full URL, like [additive monoid](add_monoid). We could treat this as a link transformer and any other processor would just end up with broken links. But we'd need a way to distinguish this from a normal relative link. I guess if the term in () is a known identifier.

Mario Carneiro (Aug 12 2020 at 20:02):

A regular link ends in .html?

Mario Carneiro (Aug 12 2020 at 20:03):

there is nothing definitive but lots of circumstantial evidence one way or another

Rob Lewis (Aug 12 2020 at 20:06):

Mario Carneiro said:

A regular link ends in .html?

Not at all, e.g. no Zulip links and very few GitHub links will. Defaulting to regular links and only changing to declarations if there's an exact match is much safer.

Mario Carneiro (Aug 12 2020 at 20:07):

those aren't relative links though

Mario Carneiro (Aug 12 2020 at 20:07):

they start with https://

Rob Lewis (Aug 12 2020 at 20:11):

True. But this is still ugly when the markdown gets processed anywhere other than doc-gen. Which I hope will happen more often. Maybe this is planning for an unknown future but I do dislike the idea of changing the interpretation of standard markdown.

Mario Carneiro (Aug 12 2020 at 20:12):

I don't see any way around that though, unless we actually put proper links in markdown

Mario Carneiro (Aug 12 2020 at 20:13):

by the way, that approach is also what is used in later versions of rustdoc

Mario Carneiro (Aug 12 2020 at 20:15):

you just use a regular link notation but the target of a link is a name (fully qualified if necessary) instead of a URL

Rob Lewis (Aug 12 2020 at 20:16):

Yury G. Kudryashov said:

It would be nice to be able to write [additive monoid](add_monoid) (possibly with some prefix before add_monoid) as well.

So maybe this is a question for Yury and the people who hearted this comment! How nice would it be to write [additive monoid](add_monoid), as opposed to writing additive monoid (`add_monoid`)? The first prints as additive monoid, the second as additive monoid (add_monoid)

Mario Carneiro (Aug 12 2020 at 20:18):

I think both would be good

Rob Lewis (Aug 12 2020 at 20:18):

Looking at the Rust docs, they don't linkify all identifiers inside backticks like we do.

Mario Carneiro (Aug 12 2020 at 20:19):

no, they implement the mechanism I mentioned before where you have to use a link markup to get a link

Mario Carneiro (Aug 12 2020 at 20:19):

If you have both, then documentation writers can choose how to present it

Rob Lewis (Aug 12 2020 at 20:20):

But when they're very similar and one breaks portability, I'm much more hesitant to support both.

Mario Carneiro (Aug 12 2020 at 20:21):

I don't understand. Neither one works without some docgen magic

Mario Carneiro (Aug 12 2020 at 20:22):

but they should both present just fine with a regular markdown parser, you will just maybe get a broken link in addition to the reasonable and colored text

Rob Lewis (Aug 12 2020 at 20:23):

additive monoid (`add_monoid`) works in doc-gen now with my new PR fix. In the official docs the (add_monoid) will be a link, in other processors it will just be <code> text. No broken links.

Mario Carneiro (Aug 12 2020 at 20:24):

okay, but a broken link doesn't hurt anyone

Mario Carneiro (Aug 12 2020 at 20:24):

garbled text would be bad, but this won't cause it

Mario Carneiro (Aug 12 2020 at 20:26):

I think if we start inventing our own way to style markdown it's still basically a markdown extension

Rob Lewis (Aug 12 2020 at 20:26):

... sure it does. I imagine something like, people writing tutorials like TPIL can grab bits of information from the docs like tactic syntax and options. These stay up to date because they're pulled straight from source. If those descriptions have broken links they end up broken in the tutorial which doesn't look great.

Mario Carneiro (Aug 12 2020 at 20:27):

We should make sure everything we produce does the right thing then

Mario Carneiro (Aug 12 2020 at 20:28):

either that or train people to put full URLs in all the markdown links

Mario Carneiro (Aug 12 2020 at 20:29):

If you are okay with degrading to no link you can run a postprocessor to delete all the bad links. I think that's not better

Rob Lewis (Aug 12 2020 at 20:29):

Mario Carneiro said:

We should make sure everything we produce does the right thing then

What do you mean? Ask everyone who ever wants to source markdown from mathlib to implement our linkification hack? It seems easier just to not put broken links in the library, since again, there's a similar pattern that's a valid link in the docs and perfectly good looking non-linked text otherwise.

Mario Carneiro (Aug 12 2020 at 20:30):

"The markdown in lean docs is not pure markdown, it's mostly markdown with a few extensions to make writing lean documentation easier."

Mario Carneiro (Aug 12 2020 at 20:30):

If people want to just paste that into a markdown file that's their problem

Mario Carneiro (Aug 12 2020 at 20:31):

Alternatively, we can make a markdownifier to apply the lean hacks without otherwise rendering the text

Mario Carneiro (Aug 12 2020 at 20:31):

and people can copy from there

Rob Lewis (Aug 12 2020 at 20:37):

I just think that making up a new incompatible markdown standard to allow a slightly different style of writing docs is extreme. If you really feel strongly about it we can do this: we can implement the heuristic to convert [additive monoid](add_monoid) in doc-gen and run it before we create the public json dump. That way at least anyone using the doc-gen dump gets proper links. Anyone sourcing from mathlib directly will still be out of luck unless they reimplement the hack.

Mario Carneiro (Aug 12 2020 at 20:43):

I don't feel strongly about it, but it seems more likely to produce well structured docs than the auto-linkifier alone, because it gives people the freedom to have the visible text be different from the link, for example to get the namespace right

Mario Carneiro (Aug 12 2020 at 20:43):

also it's really painful to use https links for internal links, so I would hope we have enough freedom in that regard

Mario Carneiro (Aug 12 2020 at 20:44):

does the auto-linkifier work on vscode markdown rendering of hover text?

Rob Lewis (Aug 12 2020 at 20:47):

Mario Carneiro said:

does the auto-linkifier work on vscode markdown rendering of hover text?

No, neither would the url hack unless we hacked the VSCode markdown processing.

Bryan Gin-ge Chen (Aug 12 2020 at 20:48):

We could preprocess the docstring before passing it to VS Code, but for the auto-linkifier we'd need the export JSON to find declarations, right?

Mario Carneiro (Aug 12 2020 at 20:48):

Would it be a good idea to move docgen to the lean executable?

Mario Carneiro (Aug 12 2020 at 20:48):

I know that there is a lean doc subcommand but I think it is basically a stub

Mario Carneiro (Aug 12 2020 at 20:49):

Do we have to find declarations that are not visible to lean?

Rob Lewis (Aug 12 2020 at 20:50):

Bryan Gin-ge Chen said:

We could preprocess the docstring before passing it to VS Code, but for the auto-linkifier we'd need the export JSON to find declarations, right?

Yep. The auto-linkifier is a global thing, it doesn't really make sense out of the context of doc-gen. The url hack is local (depending on what heuristic you use, we could use a better one with the global info in doc-gen).

Rob Lewis (Aug 12 2020 at 20:50):

Mario Carneiro said:

Do we have to find declarations that are not visible to lean?

Sure, docs can mention declarations that aren't defined yet.

Rob Lewis (Aug 12 2020 at 20:51):

I'm sure there are examples in mathlib of declarations that cross reference each other in their docs.

Mario Carneiro (Aug 12 2020 at 20:51):

Maybe this can be done at hover time instead of at declaration time

Mario Carneiro (Aug 12 2020 at 20:51):

so that lean can link to it as long as it is somewhere in the final environment

Rob Lewis (Aug 12 2020 at 20:51):

What if the library isn't compiled?

Mario Carneiro (Aug 12 2020 at 20:51):

then nothing works

Rob Lewis (Aug 12 2020 at 20:51):

If hovering spawns a full library build...

Mario Carneiro (Aug 12 2020 at 20:51):

it already basically does

Rob Lewis (Aug 12 2020 at 20:52):

Hmm? It doesn't build new files, does it?

Mario Carneiro (Aug 12 2020 at 20:52):

I'm suggesting not a full library build, but rather lean uses all the information at its disposal as of the hover

Mario Carneiro (Aug 12 2020 at 20:53):

which means that you can't link to things in files that are not imported

Mario Carneiro (Aug 12 2020 at 20:53):

unless you happen to have a file open that imports the relevant declaration

Rob Lewis (Aug 12 2020 at 20:53):

Right, which is a limitation that I'm pretty sure the docs don't respect now.

Mario Carneiro (Aug 12 2020 at 20:54):

I think that's the best we can do from the lean server point of view

Rob Lewis (Aug 12 2020 at 20:54):

This is all connected to a much more convincing reason to keep a compiled database of declarations around: searching for unimported declarations.

Mario Carneiro (Aug 12 2020 at 20:54):

however I think we should look into some kind of global indexing service for use by the lean server

Mario Carneiro (Aug 12 2020 at 20:57):

Incidentally, I wrote a lean parser as part of olean-rs. It doesn't understand expressions so it can't read most of the file, but it can find command keywords, and this is enough to get namespace structure and declaration names. So that might be enough to do indexing of uncompiled files

Kevin Buzzard (Aug 12 2020 at 21:01):

[additive monoid](add_monoid) is more like the way Wikipedia does it. Will you be able to hover over the link and see the name of the lean term?

Rob Lewis (Aug 12 2020 at 21:05):

So, I think where we're at is

supporting [additive monoid](add_monoid) takes a hack in doc-gen and a hack in the VSCode extension. In VSCode these links will point to the doc pages, until
someone builds a reliable way to index the library so we can point these links to lines in files in VSCode.
This indexing should be a start for proper in-editor search, so eventually it should index declaration types as well as names.
Any external tools that use mathlib as a database will either need to use the doc-gen export or handle misdirected links in doc strings on their own, however they see fit.

Rob Lewis (Aug 12 2020 at 21:05):

Worth it?

Kevin Buzzard (Aug 12 2020 at 21:07):

I'm happy with the current situation, my question really was what should I be doing in these situations? My impression is that people here are generally very unhappy here with inconsistent conventions. If nobody is bothered about the initial docstring issue then we can just do nothing

Mario Carneiro (Aug 12 2020 at 21:07):

The vscode extension doesn't need a hack btw, lean just supplies legal markdown to vscode

Reid Barton (Aug 12 2020 at 21:08):

Mario Carneiro said:

however I think we should look into some kind of global indexing service for use by the lean server

By global do you mean hosted on a server somewhere?

Mario Carneiro (Aug 12 2020 at 21:08):

no, I mean global to the project

Mario Carneiro (Aug 12 2020 at 21:09):

probably another part of the lean server, or perhaps a separate process

Mario Carneiro (Aug 12 2020 at 21:10):

which is responsible for keeping the global definition list (of everything in the project manifest) up to date with changes from vscode

Bryan Gin-ge Chen (Aug 12 2020 at 21:19):

We could start including the doc-gen export JSON file (which has a list of declarations and what files they're in) in the .xz files on Azure and then have vscode-lean use that.

Rob Lewis (Aug 13 2020 at 09:00):

Coming back to this:
Mario Carneiro said:

The vscode extension doesn't need a hack btw, lean just supplies legal markdown to vscode

If we want the links to open certain files/lines in VSCode, the extension has to do something, no? Whatever Lean produces has to support emacs too.

Rob Lewis (Aug 13 2020 at 09:01):

If they're just links that point to the docs, it can be done in Lean, but then we're hard coding a reference to the mathlib docs into the Lean system.

Rob Lewis (Aug 13 2020 at 09:03):

Bryan Gin-ge Chen said:

We could start including the doc-gen export JSON file (which has a list of declarations and what files they're in) in the .xz files on Azure and then have vscode-lean use that.

It seems better to build this as part of lean --make (with less extra info, formatting, etc than doc-gen gives).

Mario Carneiro (Aug 13 2020 at 12:41):

Rob Lewis said:

If they're just links that point to the docs, it can be done in Lean, but then we're hard coding a reference to the mathlib docs into the Lean system.

I thought we had already established that leanprover-community/lean is designed to work with mathlib. It already links to community web pages

Mario Carneiro (Aug 13 2020 at 12:42):

Like I said before, I would prefer this all be handled locally, but that's a whole bunch of code that someone has to write. Until then, the docs pages seem like an easy place to link

Last updated: May 02 2025 at 03:31 UTC