Zulip Chat Archive

Stream: general

Topic: what is doc-gen?


Kevin Buzzard (Dec 22 2021 at 11:54):

I'm really sorry, I know nothing at all about html or how to generate it. I'm running a project-based course next term and I am considering how all this should work. I thought I'd take a look at Sylow's theorems in the API docs and it occurred to me that I would be rather happy with a submission which looked like that. Am I right in thinking that if an undergraduate were to host a public Lean repository on github containing a Lean file foo.lean then I can perform some magic which creates foo.html? If this is the case then I can simply ask for submissions as pure .lean files without any corresponding .pdf or .md file and I can mark both the Lean file and the html file.

Arthur Paulino (Dec 22 2021 at 12:19):

I'm assuming the project is written in Lean 3. But I don't know to which point (if any) doc-gen is tied to mathlib's structure.
Maybe ask @Julian Berman how you can setup a Read the Docs page with his sphinx adaptation for Lean 3. Here's his outcome: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/doc-gen4/near/264465413

Alex J. Best (Dec 22 2021 at 12:25):

@Eric Wieser has managed to get doc-gen working on external projects e.g https://pygae.github.io/lean-ga-docs/geometric_algebra/translations/ida.html#ida.multivector.has_mul but I don't know if that means its easy to set up on CI yet

Eric Taucher (Dec 22 2021 at 12:35):

I can not answer in the specific case, I.e. doc-gen for Lean as I have not used it yet. But will give you a general programmers answer.

In the land of programming, programmers tend to be like sports teams and cheer on their favorite programming language. As such for each programming language they tend to redo the efforts again and again as they transition to each new language. This gives us all of these wonderful different tools that work great for one programming language but are so customized at times they only work for the one programming language. IDEs, build tools, debuggers, etc. are such tools and even documentation generators.

Looking at the comparison of documentation generators one sees that they are capable of taking in a specific programming language and depending upon the tool capable of generating documentation in various output formats including as you ask PDF, markdown or as HTML files that contain hyperlinks. Some really nice ones will even create graphs showing the relationships.

However many of these tools rely on what are called documentation comments (Lean doc comment syntax) as opposed to source code comments. These documentation comments are the basis of the generated document and must be entered into the source code for the documentation generator to produce meaningful documentation. Often plain old source code comments are ignored by the documentation generator .

So just getting a lean file may not be enough for what you seek. The author of the code would need to also add the documentation comments.

Julian Berman (Dec 22 2021 at 12:37):

https://github.com/pygae/lean-ga/blob/master/.github/workflows/lean_doc.yml it looks like is the file Eric wrote with what's needed to get doc-gen to do non-mathlib things. Skimming it it looks fairly simple as a thing you could have students' submissions run through -- I'd personally probably myself think that were the better route than my thing, especially since Sphinx doesn't really shine for single files it's nice for large things.

Julian Berman (Dec 22 2021 at 12:38):

(I'm running out myself though I'm sure Eric will show up at some point) -- @Eric Taucher I think the idea is Kevin will probably teach his students to write the docstrings (and therefore grade them on that bit), in mathlib everyone basically uses markdown docstrings as the way to write interlinked docs

Eric Wieser (Dec 22 2021 at 13:10):

The downside in using doc-gen for what is suggested here is:

  • Doc-gen only know to generate docs for _everything_ you import transitively, not just the exercises themselves.
  • You have to upload the output somewhere to actually view it. In the case of my repo, that amounts to a separate lean-ga-docs repo, which requires some slightly awkward ssh setup to push to. Either every student has to do this, or you have to run everything locally when you grade, which can be slow and doesn't work on windows all that well.
  • Having each student upload their own copy might lead to some confusing Google results (although hopefully the recent rel=canonical makes that less bad)

Gabriel Ebner (Dec 22 2021 at 13:13):

which requires some slightly awkward ssh setup to push to.

We could make this slightly less awkward by pushing to the gh-pages branch in the same repository (but we should disable the redirect files for space reasons). Then you could provide a template repository that the students only need to fork.

Eric Wieser (Dec 22 2021 at 13:14):

I can't remember why I didn't use that approach

Kevin Buzzard (Dec 22 2021 at 14:10):

So is there no cheap and cheerful program that says "I'm assuming that there are no warnings and no errors in this file and I'm going to have some kind of a guess as to what your HTML file will look like although none of the links will work, but that's OK"

Eric Wieser (Dec 22 2021 at 14:15):

Not without modifying bits of doc-gen

Eric Wieser (Dec 22 2021 at 14:16):

The reason it works for my use case is that I already PR'd changes that add the be options I needed (mainly removing stuff that only makes sense in mathlib

Eric Wieser (Dec 22 2021 at 14:17):

I have an open PR that tries to make doc-gen a standalone python package, which would likely help with building modified versions in future.

Kevin Buzzard (Dec 22 2021 at 17:40):

How about if the student takes their Lean files and put them in src/my_stuff/ in a clone of mathlib? Is this actually too hard? It would be a really nice way for students to hand stuff in.

Gabriel Ebner (Dec 22 2021 at 17:42):

I don't think it makes a difference whether it's in mathlib or a separate repo. The doc-gen setup for mathlib isn't automatic either, your personal mathlib fork doesn't have documentation generated. We'd need to do some integration work either way.

Rob Lewis (Dec 22 2021 at 17:47):

@Kevin Buzzard just to understand a bit better, what problem are you trying to solve? Is it making things easier for the students to submit, making it easier for you to read their submissions, making it easier for them to share submissions with others, ...?

If it's just a matter of markdown-formatting comments for ease of reading, remember Bryan's tool. It's not entirely "I'm assuming that there are no warnings and no errors in this file and I'm going to have some kind of a guess as to what your HTML file will look like although none of the links will work, but that's OK" but it's fairly close

Kevin Buzzard (Dec 22 2021 at 19:21):

I just wondered if I could get students making something beautiful which they could show to non lean users and say "that's my coursework"

Rob Lewis (Dec 22 2021 at 19:25):

Ah, then yeah, a kevins-course-projects Lean project on GitHub with its own docs sounds perfect

Jireh Loreaux (Dec 22 2021 at 20:47):

I guess format_lean is too involved for them, or too much manual labor?

Kevin Buzzard (Dec 22 2021 at 22:42):

These people are not computer scientists, and are typically uncomfortable with the command line

Eric Taucher (Dec 23 2021 at 08:09):

Kevin Buzzard said:

These people are not computer scientists, and are typically uncomfortable with the command line

Not that I plan to do this but noting this as food for thought for those reading this.

I know the students probably use a few different editors such as VSCode, etc. however editors typically have a menu option for Save As. Would it be possible to add a plugin for each different editor they might use to save the file as a push to a GitHub repository? That is a rhetorical question.

Kevin Buzzard (Dec 23 2021 at 08:10):

The students use VS Code. Emacs is a computer science thing

Arthur Paulino (Dec 23 2021 at 09:30):

@Eric Taucher I think it's a good idea. I have a half-baked JavaScript code that turns Lean files into markdown files. It's an extremely simple code, but I could try plugging it in the lean vs code plugin

Kevin Buzzard (Dec 23 2021 at 09:35):

I had hoped it would be easier than all this. I can just get them to write markdown files to go with their Lean files.

Eric Wieser (Dec 23 2021 at 09:41):

I can try PR'ing the yaml file above to your repo, if you have such a repo in mind

Arthur Paulino (Dec 23 2021 at 13:30):

@Kevin Buzzard but I really think it's a good start to eventually get us to the point of being able to export to pdf and html as well

Eric Wieser (Dec 23 2021 at 13:45):

I had a go at trying to make your M40001 repo use doc-gen, but your pesky numeric folder names broke things :upside_down:. Maybe #11001 will fix it.

Arthur Paulino (Dec 23 2021 at 14:20):

@Gabriel Ebner do you think we could add extra functionalities to the Lean VS Code plugin that improve accessibility but aren't necessarily about communicating with Lean server?

Gabriel Ebner (Dec 23 2021 at 14:26):

We have several features like that, for example the documentation view. But I'd rather not make a prototype HTML exporter a feature only available from vscode (if I read the thread correctly).

Arthur Paulino (Dec 23 2021 at 14:29):

How would you go about improving accessibility for people who aren't comfortable with command line and just go straight to VS Code? Specific plugins?

Eric Wieser (Dec 23 2021 at 14:37):

I think Gabriel's point is that if we want these things, we should build them as standalone tools first, and link them to vscode later

Eric Wieser (Dec 23 2021 at 14:39):

Eric Wieser said:

I had a go at trying to make your M40001 repo use doc-gen, but your pesky numeric folder names broke things :upside_down:. Maybe #11001 will fix it.

Done: https://eric-wieser.github.io/M40001_lean/2021/sets/sheet1.html (https://github.com/ImperialCollegeLondon/M40001_lean/pull/5)

Gabriel Ebner (Dec 23 2021 at 14:41):

Here's the workflow file in case anybody else is looking for it: https://github.com/eric-wieser/M40001_lean/blob/eric-wieser/doc-gen/.github/workflows/lean_doc.yml

Eric Wieser (Dec 23 2021 at 14:44):

I think there's now only one line of that yml that needs to be adjusted per-project; the lean_problem_sheets name needs to match the name in the leanpkg.toml

Kevin Buzzard (Dec 23 2021 at 14:48):

Wow maybe I can get this working then. It would be great if they could just hand in lean files and not md or pdf files

Kevin Buzzard (Dec 23 2021 at 14:49):

Thanks a lot Eric!

Eric Wieser (Dec 23 2021 at 14:49):

Here's an example of things going wrong: https://eric-wieser.github.io/M40001_lean/2020/problem_sheets/Part_II_sheet1.html

Eric Wieser (Dec 23 2021 at 14:49):

Most of the questions are missing, because you already had a Q2a in a previous year

Kevin Buzzard (Dec 23 2021 at 14:50):

Yeah but the students will be handing in sorry-free stuff

Eric Wieser (Dec 23 2021 at 14:50):

That's not the problem I'm talking about, I mean the fact that only 4 of the 9 lemmas in that file appear in the HTML output

Eric Wieser (Dec 23 2021 at 14:51):

Because the other 5 are errors as Q2a already exists as a lemma in another file and you're not allowed to have two Q2as

Kevin Buzzard (Dec 23 2021 at 14:51):

Oh I see

Kevin Buzzard (Dec 23 2021 at 14:51):

Again this probably isn't going to be a problem

Eric Wieser (Dec 23 2021 at 14:52):

The last thing to watch out for is that doc-gen ignores examples, so either someone needs to teach it not to, or you need to name everything

Alex J. Best (Dec 23 2021 at 14:59):

Eric Wieser said:

The last thing to watch out for is that doc-gen ignores examples, so either someone needs to teach it not to, or you need to name everything

I believe this is "impossible", examples don't show up in the generated oleans/the environment, so there is nothing to generate from. This is also very annoying when minimizing imports / linting things

Arthur Paulino (Dec 23 2021 at 15:07):

Eric Wieser said:

I think Gabriel's point is that if we want these things, we should build them as standalone tools first, and link them to vscode later

Do you mean, for instance, making alectryon (a tool on its own) accessible via plugin with a click?
(as opposed to coding a tool like alectryon inside the plugin)

Kevin Buzzard (Dec 23 2021 at 15:08):

Examples being invisible: Is the same true in lean 4?

Kevin Buzzard (Dec 23 2021 at 15:09):

I guess there are very few examples in mathlib but what I had not realised until this thread was that doc-gen was tied to mathlib

Alex J. Best (Dec 23 2021 at 15:15):

Kevin Buzzard said:

Examples being invisible: Is the same true in lean 4?

A quick experiment points to yes, adding an example didn't change the olean file generated at least

Alex J. Best (Dec 23 2021 at 15:16):

I would be very happy if mathlib by convention didn't use examples outside of test folders, instead just protected lemma example_blah : ... or something, these would then show up in docs too which would be nice.

Eric Wieser (Dec 23 2021 at 15:26):

Kevin Buzzard said:

Examples being invisible: Is the same true in lean 4?

I only meant this as "doc-gen doesn't record them", I hadn't realized they were entirely undetectable.

Kevin Buzzard (Dec 30 2021 at 14:29):

OK I am back to thinking about this. I've put a random file called lean_doc.yml into the .github directory of a private local project. What now? Am I not doing the compilation myself?

Eric Wieser (Dec 30 2021 at 15:10):

That yaml file only works if you push to github

Eric Wieser (Dec 30 2021 at 15:11):

It should work fine on a private github repo, but you'll have to download the output to view it as github pages doesn't work on private repoa

Kevin Buzzard (Dec 30 2021 at 16:31):

So what will happen when I push?

Alex J. Best (Dec 30 2021 at 16:42):

Github pages used to work fine on private repos, did that change or something?

Rob Lewis (Dec 30 2021 at 16:43):

I think you need Github Pro

Kevin Buzzard (Dec 30 2021 at 16:48):

Someone else already pushed the yaml file. What is supposed to happen next? What is a workflow? I am completely lost.

Julian Berman (Dec 30 2021 at 17:33):

Workflow is what github calls "sequence of steps that runs on some event, like you or someone pushing to the repo or sending a PR"

Julian Berman (Dec 30 2021 at 17:33):

If you were precise before when you said you put that in a directory called .github, it needs to go in one called .github/workflows/ instead

Julian Berman (Dec 30 2021 at 17:34):

It may be easiest if you have some repo that contains things you don't want to be public, to make a second one, set it private, and give access to some people here who can get you a working setup the way you like it, and then it can be copied to the repo you don't want to share

Julian Berman (Dec 30 2021 at 17:34):

(the point above is worth keeping in mind though, if your goal is still to get students a webpage they can be proud of and easily autogenerate on their behalf without working too hard, github pages isn't going to work for that if the repo is private)

Eric Wieser (Dec 30 2021 at 17:59):

(github pro is free for students I think?)

Kevin Buzzard (Dec 30 2021 at 18:01):

The file is in the correct place. It is pushed. What do I do next?

Eric Wieser (Dec 30 2021 at 18:01):

What's the repo it pushed to?

Kevin Buzzard (Dec 30 2021 at 18:01):

A private one.

Eric Wieser (Dec 30 2021 at 18:01):

Can you share the url anyway?

Kevin Buzzard (Dec 30 2021 at 18:01):

What I am trying to establish is whether something is supposed to have happened, and how to check that it has happened.

Julian Berman (Dec 30 2021 at 18:02):

The most basic thing to have happened will be on the "Actions" tab of your repo

Julian Berman (Dec 30 2021 at 18:02):

(Go to the equivalent of https://github.com/leanprover-community/mathlib/actions on your repo)

Eric Wieser (Dec 30 2021 at 18:02):

You might also need to turn on GitHub pages

Kevin Buzzard (Dec 30 2021 at 18:03):

failure.png

Eric Wieser (Dec 30 2021 at 18:03):

That just looks unlucky

Eric Wieser (Dec 30 2021 at 18:03):

Can you restart it and try again?

Kevin Buzzard (Dec 30 2021 at 18:04):

How do I run it again?

Eric Wieser (Dec 30 2021 at 18:05):

There's a button at the top right

Kevin Buzzard (Dec 30 2021 at 18:05):

got it -- looks like it's working :D

Julian Berman (Dec 30 2021 at 18:06):

Kevin this is the last step in becoming a computer scientist.

Kevin Buzzard (Dec 30 2021 at 18:42):

Should I worry about

WARNING: Lean version mismatch: installed version is leanprover-community/lean:3.35.1, but package requires leanprover-community/lean:3.32.1

?

Bryan Gin-ge Chen (Dec 30 2021 at 18:52):

I think you can ignore that message if everything builds successfully.

Dima Pasechnik (Dec 31 2021 at 09:42):

Julian Berman said:

Kevin this is the last step in becoming a computer scientist.

only a beginning one. The next step would be modifying these .github/ yml files which handle this, while watching in pain how your GH Actions jobs run out of resources, split them into sequences of jobs using containers. Then giving up on this and setting up your own runners to run said jobs. Etc.

Or just going to command line and do everthing there, locally :-)


Last updated: Dec 20 2023 at 11:08 UTC