Zulip Chat Archive

Stream: new members

Topic: Online documentation stripping useful information


Philippe Duchon (Jul 18 2025 at 17:15):

The way the Mathlib documentation is available on https://leanprover-community.github.io/mathlib4_docs/index.html, together with direct links to the source code, is IMHO a huge advantage of Lean - I keep spending hours just trying to browse it, but that's just because there are lots of thing in there.

I assume the documentation pages are extracted automatically from the source code, but I am a bit puzzled when the program doing the extraction strips out some information that seems key to understanding what a given theorem is about.

Here's an example, from ## Mathlib.Topology.Algebra.InfiniteSum.Group. Theorems Summable.subtype and summable_subtype_and_compl are both about summability of a function defined over a type, and summability of its restriction to a subtype. Summable.subtype provides an implication: if the function is summable on the whole type, then it is summable on a subtype; and summable_subtype_and_compl provides an equivalence: the function is summable if and only if its restrictions to a subtype and is complement are both summable.

When you look at the source code (which is really about infinite products and, I suppose, converted to infinite sums by a little bit of black magic, but I assume more people will be familiar with infinite sums than infinite products, so I picked the additive version for my example), both theorem statements explicitly mention the subtype (by coercion from a Set):

Multipliable (f  () : s  α)

and

((Multipliable fun x : s  f x)  Multipliable fun x : s  f x)  Multipliable f

Yet, when you look at the theorem statements online, the former removes the reference to s and the latter doesn't:

Summable (f  Subtype.val)
((Summable fun (x : s) => f x)  Summable fun (x : s) => f x)  Summable f

Of course, if you know about infinite sums, it's pretty easy to guess what Summable.subtype is really about (there is, after all, a Set mentioned in the arguments, and the name kinda gives it away), but I still find this both puzzling (why the different treatments?) and annoying (I tend to look at the source code to check that I'm getting it right).

Why is it this way?

Aaron Liu (Jul 18 2025 at 17:15):

it's pretty-printed

Aaron Liu (Jul 18 2025 at 17:16):

using the pretty-printer

Aaron Liu (Jul 18 2025 at 17:17):

this absolutely destroys the readability of any category theory stuff

Philippe Duchon (Jul 18 2025 at 17:19):

OK but in one case it leaves a type annotation, and in the other it doesn't, and I still don't get why it does this.

Aaron Liu (Jul 18 2025 at 17:19):

it's the pretty printer

Aaron Liu (Jul 18 2025 at 17:20):

it's somewhat type aware

Kyle Miller (Jul 18 2025 at 17:21):

It's an important problem, but at the moment we don't have much of a solution to these pretty printer issues. (There's pp.analyze, but it's not enabled for docgen, and it also likely needs a good amount of work to get it into shape again.)

With your examples, the type ascriptions show up for fun binders (I'm not sure exactly why), which is different from term type ascriptions. It's not surprising that there could be a difference here.

Kyle Miller (Jul 18 2025 at 17:23):

People pointing out annoying things about the pretty printer has helped it make incremental progress. Not too long ago, dot notation didn't even have links, for example.

Ruben Van de Velde (Jul 18 2025 at 18:33):

It's not so much that it's stripping any data, it's that doc gen tries to print code (using the pretty printer) from a parsed representation that has already lost much of the formatting information, and this process is sadly imperfect

Philippe Duchon (Jul 18 2025 at 19:24):

Ah OK, so it's not working from the source code.

At least I know it's not just an oversight, I'm not sure I understand why but the problem doesn't seem to be easy to solve. Thanks!

Kyle Miller (Jul 18 2025 at 19:35):

Doc-gen basically takes the output of #check with a handful of extra pretty printer settings and then uses all the hover information to linkify the result. That at least can give you an idea of what you might expect from it when working from within Lean.


Last updated: Dec 20 2025 at 21:32 UTC