Zulip Chat Archive

Stream: mathlib4

Topic: 100 theorems list and undergrad targets


Eric Wieser (Jul 21 2023 at 16:24):

Shortly these will be in mathlib master; but right now, the website expects these files to be duplicated to the mathlib docs (presumably as an attempt to keep things in sync).

@Henrik Böving, can you setup CI to host the files there? In the long run it might be better for doc-gen4 to natively consume those files and generate special pages, but that's fine to punt for another time.

Henrik Böving (Jul 21 2023 at 16:36):

Just to make sure you want me to put the yaml file at mathlib4_docs/... ?

Eric Wieser (Jul 21 2023 at 16:38):

Yes, if you agree that that's reasonable

Henrik Böving (Jul 21 2023 at 16:41):

Sure, which PR is going to add the yaml to mathlib4? The change should be pretty easy you basically want to just copy it in here: https://github.com/leanprover-community/mathlib4/blob/master/scripts/deploy_docs.sh#L42

Eric Wieser (Jul 21 2023 at 16:43):

#6026: the file is already in branch#staging

Henrik Böving (Jul 21 2023 at 16:45):

AH excellent, I'll prepare a PR

Henrik Böving (Jul 21 2023 at 16:49):

https://github.com/leanprover-community/mathlib4/pull/6043 this should do the trick

Eric Wieser (Jul 21 2023 at 16:50):

Thanks!

Eric Wieser (Jul 25 2023 at 08:06):

It looks like we also need a Lean4 version of https://leanprover-community.github.io/mathlib_docs/export_db.json.gz to get https://github.com/leanprover-community/leanprover-community.github.io/pull/338 passing

Eric Wieser (Aug 06 2023 at 12:03):

@Henrik Böving, it looks like the file in question is actually generated by python code in doc-gen; did doc-gen4 port this code at all?

Henrik Böving (Aug 06 2023 at 12:04):

No it didn't exactly seem necessary at the time since I was working under the assumption that it is only required to generate the docs with the python jinja2 stuff which was fully replaced right.

Eric Wieser (Aug 06 2023 at 12:07):

It's used by python code on the main website

Henrik Böving (Aug 06 2023 at 12:10):

I see. You can get a similar style json file here: https://leanprover-community.github.io/mathlib4_docs//declarations/declaration-data.bmp (it is BMP because somewhere in the old doc-gen it said that makes github do compression while json does not) is that enough info for your script or do we need additional fields from the old one?

Eric Wieser (Aug 06 2023 at 12:11):

The decl_header_html field is particularly important as it means the website doesn't have to render things itself

Eric Wieser (Aug 06 2023 at 12:12):

(I'm aware of the BMP hack, I introduced it!)

Henrik Böving (Aug 06 2023 at 12:15):

Okay decl_header_html is something I would definitely not want to be in this file because it get loaded and parsed by all doc-gen4 pages so I guess I'll need to build a second json file?

Henrik Böving (Aug 06 2023 at 12:38):

Hm, I'll just check first if it is doable in one file without significant regressions, I'll be back!

Henrik Böving (Aug 06 2023 at 13:13):

Okay I implemented the feature now and it blew up declaration-data.bmp from 7.5M to 28M on std4 alone /o\

Eric Wieser (Aug 06 2023 at 13:15):

I think a separate file makes a lot of sense

Eric Wieser (Aug 06 2023 at 13:15):

This file is going to be loaded once per day when the site rebuilds, not every few minutes by every user

Henrik Böving (Aug 06 2023 at 13:16):

Okay I'll just put one next to declaration-data.bmp then

Eric Wieser (Aug 06 2023 at 13:16):

Thanks!

Henrik Böving (Aug 06 2023 at 13:40):

@Eric Wieser okay I pushed a change that adds two things:

  1. You can now also get line info from the ordinary declaration-data.bmp (although the format is not fully compatible with the original python one but I have faith you will be able to deal with that :P)
  2. You get a second file next to declaration-data.bmp called header-data.bmp that contains a key value map from decl name to the header HTML

I wanted to trigger a build for mathlib4 but I'm seeing right now that something is failing there so I'll investigate that one first. If you want to experiment in the meantime, building docs for std4 is very quick and will get you same format files.

Henrik Böving (Aug 06 2023 at 14:25):

Okay regarding the doc-gen failure @Mac Malone I saw that you have replaced the human readable names in lake-packages with hash values now? There is a minimal issue with this in mathlib4's CI namely that we have this line in a script: https://github.com/leanprover-community/mathlib4/blob/master/scripts/deploy_docs.sh#L30-L32 that tries to obtain the git hash of doc-gen4 that was checked out. Is there a nice way to get the path? Or do you recommend we instead try to parse the lockfile with something like jq in this script?

Mac Malone (Aug 06 2023 at 14:33):

@Henrik Böving This is a bug I accidentally introduced while testing another change (I forgot to revert some code) that I fixed after the 08-03 nightly. I mention this bug in the bump PR which introduced the bug, but it seems as though mathlib has not bumped Lean again yet.

Henrik Böving (Aug 06 2023 at 14:33):

Ah so I can just sit it out^^ Excellent

Henrik Böving (Aug 08 2023 at 19:38):

Okay we sat this out, slight issue now

Henrik Böving (Aug 08 2023 at 19:39):

remote: warning: File docs/declarations/declaration-data.bmp is 75.10 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: error: Trace: 11d69bbd352a395b679a8a8dc9c4ddb245f9df34cc6fd56eb0e2995a7594245f
remote: error: See https://gh.io/lfs for more information.
remote: error: File docs/declarations/header-data.bmp is 379.73 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
To github.com:leanprover-community/mathlib4_docs.git
 ! [remote rejected] HEAD -> master (pre-receive hook declined)
error: failed to push some refs to 'github.com:leanprover-community/mathlib4_docs.git'

Henrik Böving (Aug 08 2023 at 19:39):

@Eric Wieser so pre-rendering all decl headers is kinda big as it turns out :D

Eric Wieser (Aug 08 2023 at 20:01):

Ah, we solved this already for some other leanprover-community repos (such as the Lean 3 web editor)

Eric Wieser (Aug 08 2023 at 20:02):

The trick is to deploy the build straight to github pages, not commit it to a repo and deploy from there

Henrik Böving (Aug 08 2023 at 20:05):

I see, if this is already a solved thing can you port the fix to the deploy_docs script then?

Eric Wieser (Aug 08 2023 at 20:07):

We didn't ever do it for mathlib_docs, so it's not quite as simple as a direct port

Eric Wieser (Aug 08 2023 at 20:07):

Here are the relevant lines of #port-dashboard

Eric Wieser (Aug 08 2023 at 20:08):

The only real difficulty is that I don't know if you're allowed to have an action in one repo push to pages in another

Eric Wieser (Aug 08 2023 at 20:09):

https://github.com/actions/deploy-pages has a bit more info

Eric Wieser (Aug 08 2023 at 20:11):

Where does the action which deploys the docs currently run?

Henrik Böving (Aug 08 2023 at 20:11):

https://github.com/leanprover-community/mathlib4_docs/actions

Eric Wieser (Aug 08 2023 at 20:12):

Oh, perfect

Eric Wieser (Aug 08 2023 at 20:12):

Then it's easy

Eric Wieser (Aug 08 2023 at 20:13):

Wait, I asked the wrong question.. Where does the action that pushes to that repo currently run?

Henrik Böving (Aug 08 2023 at 20:14):

https://github.com/leanprover-community/mathlib4/blob/master/.github/workflows/mathlib4docs.yml#L39

Eric Wieser (Aug 08 2023 at 20:17):

So the easy solution is to move the docs to leanprover-community.github.io/mathlib4 or leanprover-community.github.io/mathlib4/docs

Henrik Böving (Aug 08 2023 at 20:26):

Sure that's possible, we should keep in mind to change docs# things if we do that

Eric Wieser (Aug 08 2023 at 20:29):

The alternative would be to have mathlib4 trigger a build in mathlib4_docs, and then actually do the doc-gen build there

Henrik Böving (Aug 08 2023 at 20:30):

If you want to do that you'll have to get whoever is in charge of the chonky runners to let them run the builds on mathlib4_docs as well

Eric Wieser (Aug 08 2023 at 20:30):

Oh, is that why the build is in the mathlib4 repo?

Eric Wieser (Aug 08 2023 at 20:31):

I think I have the power to do that

Eric Wieser (Aug 09 2023 at 00:29):

https://github.com/leanprover-community/mathlib4_docs/tree/main is my attempt

Eric Wieser (Aug 09 2023 at 00:31):

What's currently missing is the logic we had before to avoid wasting CI cycles rebuilding if nothing has changed

Eric Wieser (Aug 09 2023 at 01:52):

(the docs will be down until https://github.com/leanprover-community/mathlib4_docs/actions/runs/5803909282 completes)

Eric Wieser (Aug 09 2023 at 07:16):

It worked!

Eric Wieser (Aug 09 2023 at 08:22):

#6464 disables the old build, to save 2.5hrs of now pointless CI time a day

Eric Wieser (Aug 09 2023 at 08:23):

Eric Wieser said:

What's currently missing is the logic we had before to avoid wasting CI cycles rebuilding if nothing has changed

This logic seemed to fire a few times a week, so doesn't matter at all

Eric Wieser (Aug 09 2023 at 11:30):

https://github.com/leanprover-community/leanprover-community.github.io/pull/338 should now be ready to go

Eric Wieser (Aug 09 2023 at 21:32):

@Henrik Böving, this is *almost* working! Unfortunately, the header html (and docLink and srcLink fields, though these are easier to fix) contain only relative paths.

Henrik Böving (Aug 09 2023 at 21:47):

The relative links are actually on purpose so you can just shift the page around wherever without changing how the generated HTML is prefixed all the time through e.g. an argument so they work e.g. on your local file system (also after you move them around directories) etc. This is also how tools like rustdoc implement it so I would prefer to keep it that way.

Could we maybe introduce a post processing step that takes the HTML headers and adds the proper prefix? It's just 100 theorems so that should be cheap right?

Eric Wieser (Aug 09 2023 at 21:50):

I guess another possibly more sensible option than generating a 300Mb database file to display only 100 theorems would be to have the website build invoke a small part of doc-gen for rendering those theorems with the desired base URL

Eric Wieser (Aug 09 2023 at 21:52):

But yes, a quick hack would be to run an HTML parser in the main website builder to swap out the hyperlinks.

Henrik Böving (Aug 09 2023 at 21:55):

actually

Henrik Böving (Aug 09 2023 at 21:55):

Hm no, I was thinking if there are some iframe magics possible that would do this automatically but I dont think so?


Last updated: Dec 20 2023 at 11:08 UTC