Zulip Chat Archive

Stream: general

Topic: DocNextGen discussion thread


Tony Beta Lambda (Oct 29 2025 at 19:59):

This is the discussion thread of DocNextGen.

Justin Asher (Oct 29 2025 at 20:08):

Great! A few questions to start:

(1) How often will you be updating the index? Will it be static? Or will it be updated periodically, say, nightly?

(2) Is it just for Mathlib? I noticed that you just have Mathlib and Init indexed right now, but other packages such as Batteries, Lean4, and Std would be good. Will this be runable for any repository?

(3) Is there any reason to use jixia over, say, parsing the doc-gen4 output? I am curious what useful data jixia can provide me.

(4) Will there be any Python package or MCP tooling support?

Thanks again for putting this together—I am looking forward to playing around with it more. The UI is fun, and I think it would be cool if at some point the main documentation website supported semantic search and informalizations similarly!

Philippe Duchon (Oct 29 2025 at 21:27):

I had a quick look, and since I'm on a laptop with a not-so-large screen I might be missing a lot.

(1) I'm a bit surprised that the (theorem/definition/whatever) statements don't contain links, only the "full source" does. On this, I prefer the way the current documentation works.

(2) I like the single click to see the source of just the current statement (instead of the full file, like the current doc does), but also having a link to the full file text would be nice

(3) (might be screen-size dependent, but this is what I get on my laptop) clicking to see the source of a statement changes the layout significantly: the first column (with the statement itself) becomes larger, to the point that the second column (with the informal statement) becomes almost useless. Maybe instead of a two-column panel one could have the source code above the informal statement? (I can see the need for longer lines when displaying source code)

(4) Again, when asking for the full source, the width of the panel changes to a different width for each statement (I assume the goal is to match the line lengths in the code). This is a bit unsettling, as having several "source code"s displayed and opening/closing one of them sometimes results in unpredictable width changes

But overall, it seems pretty nice as an option!

Tony Beta Lambda (Oct 30 2025 at 12:13):

Justin Asher said:

(1) How often will you be updating the index?
(2) Is it just for Mathlib?
(3) Is there any reason to use jixia
(4) Will there be any Python package or MCP tooling support?

(1) Every three month, since mathlib-informal and LeanSearch updates for each Lean v4.x where x == 2 (mod 3)
(2) It can be run on other packages, but the prompts used in the informalization process is specialized on math; it might or might not work as well for other packages
(3) I'm using jixia since it has a nice, structured output format and can be tweaked to my need; haven't really tried parsing doc-gen4 output.
(4) I didn't quite understand this one; do you mean MCP support for DocNextGen or LeanSearch?

Tony Beta Lambda (Oct 30 2025 at 12:19):

Philippe Duchon said:

I had a quick look, and since I'm on a laptop with a not-so-large screen I might be missing a lot.

(1) I'm a bit surprised that the (theorem/definition/whatever) statements don't contain links, only the "full source" does. On this, I prefer the way the current documentation works.

(2) I like the single click to see the source of just the current statement (instead of the full file, like the current doc does), but also having a link to the full file text would be nice

(3) (might be screen-size dependent, but this is what I get on my laptop) clicking to see the source of a statement changes the layout significantly: the first column (with the statement itself) becomes larger, to the point that the second column (with the informal statement) becomes almost useless. Maybe instead of a two-column panel one could have the source code above the informal statement? (I can see the need for longer lines when displaying source code)

(4) Again, when asking for the full source, the width of the panel changes to a different width for each statement (I assume the goal is to match the line lengths in the code). This is a bit unsettling, as having several "source code"s displayed and opening/closing one of them sometimes results in unpredictable width changes

But overall, it seems pretty nice as an option!

Thanks for the feedback!

(1) Links in statements will be added
(2) Links to github source will be added
(3) & (4) This is bothering me too, and I haven't really found a nice solution; the layout change is quite disruptive, but I want to avoid breaking lines arbitrarily. Right now I'm thinking of adding options to hide the sidebars, which is not a perfect solution but can be helpful. If you have some suggestions please let me know.

Alexander Hicks (Oct 30 2025 at 20:47):

For (2) could it be set up to allow the maintainer of a repo to plug in their own model/prompt for the informalization?

Patrick Massot (Oct 30 2025 at 23:14):

I think every page should include a big warning saying the informalized statements are not reliable because they are AI generated.

Kim Morrison (Oct 31 2025 at 02:49):

In fact, I think it is safe to say: please do not run this on Mathlib and publish the results without such a warning, because the AI generated results, if not labelled as such, could incorrectly reflect poorly on the original authors of the pages.

(The Apache license of course allows you to do this without such a warning; this is a request for courtesy to the authors.)

Alex Meiburg (Oct 31 2025 at 13:44):

Relatedly, it would be good if there was a way to report incorrect informal descriptions!

Tony Beta Lambda (Nov 20 2025 at 07:49):

Agreed, will add a warning and a button to flag inaccurate informal descriptions


Last updated: Dec 20 2025 at 21:32 UTC