Zulip Chat Archive

Stream: general

Topic: Lean project repository prototype


Rob Lewis (Jul 30 2020 at 15:53):

I'm noticing more and more things posted here where I think "that's cool, but doesn't fit in mathlib." Unfortunately, with the rate mathlib changes, these things are very susceptible to bit rot. It's time to address this!

I put together the beginnings of a project repository earlier today. There's a script that (when it's finished) will try to build all lean-3.*.* branches of checked in projects, with respect to the latest corresponding version of mathlib (and the latest versions of other dependencies, if there are any).

The idea right now is not that mathlib contributors are expected to update the collected projects. But if mathlib changes break a project, the project owners will be notified as soon as the CI script runs. (Once a day?) We could also potentially add this as a CI step on mathlib, displaying failures but not blocking merges.

This is very much WIP so comments and suggestions are appreciated. And it's yet another thing that I don't really have time for right now, so if anyone is interested in collaborating/taking charge, let's talk.

Rob Lewis (Jul 30 2020 at 15:54):

I also remembered halfway into building this that @Simon Hudon had started on something similar a while back. I may have just rediscovered a bunch of ideas he already had! Sorry Simon, maybe you have suggestions here?

Simon Hudon (Jul 30 2020 at 15:56):

Yes indeed, I have such a project which I haven't touched in a while. The idea is to curate sets of projects in such a way that there's always a known answer to "which version of which packages do I need"

Rob Lewis (Jul 30 2020 at 15:56):

The script is runnable right now, if you figure out what Python packages you need. It assumes you have lean (via elan), leanpkg, and leanproject in the path. It will check out a bunch of stuff and pollute whatever directory you run it in.

Rob Lewis (Jul 30 2020 at 15:58):

Simon Hudon said:

Yes indeed, I have such a project which I haven't touched in a while. The idea is to curate sets of projects in such a way that there's always a known answer to "which version of which packages do I need"

Hmm, that sounds slightly different. Do you have this project hosted somewhere?

Rob Lewis (Jul 30 2020 at 15:58):

That question doesn't always have an answer, does it?

Simon Hudon (Jul 30 2020 at 16:00):

The point is helping create an answer when possible. Basically, select generations of packages, build them together and report which project is broken, leaving it to the author of the package to decide to fix the breakage or not be a part of that generation

Simon Hudon (Jul 30 2020 at 16:00):

Here is where I put it: https://github.com/leanprover-community/mathlib-tools/tree/lean-depot

Rob Lewis (Jul 30 2020 at 16:02):

Cool, thanks. I'll take a look. This sounds very much like what I've done, except I've defined "generations" by Lean versions.

Rob Lewis (Jul 30 2020 at 16:04):

If you know you want to use packages x, y, and z, you should be able to ask the tool if there are any Lean versions where x, y, and z all build against the last mathlib commit for that version.

Simon Hudon (Jul 30 2020 at 16:04):

It may not turn out to be a problem but one worry I had was that the repository would become large, of uneven quality and that mathlib maintainers (or specifically me) we become responsible for fixing the breakage.

Rob Lewis (Jul 30 2020 at 16:06):

This is definitely a worry. It's a matter of setting up the right expectations and rules around entering the repo. Having outdated projects in there isn't a problem, especially if other projects don't depend on them. I think we just need to emphasize that project owners are responsible for updating their own projects.

Rob Lewis (Jul 30 2020 at 16:07):

We may decide that some subset of projects are "blessed" and get to participate in mathlib CI in some way. Obviously we'd need criteria to make this decision.

Simon Hudon (Jul 30 2020 at 16:07):

That sounds reasonable. And you may resort to submodules to decentralize the organization

Rob Lewis (Jul 30 2020 at 16:09):

Simon Hudon said:

That sounds reasonable. And you may resort to submodules to decentralize the organization

Hmm, what do you mean by that?

Rob Lewis (Jul 30 2020 at 16:10):

The sign-up process I have right now is just adding a link to a repo. There's no checking in code. The build script clones all the relevant repos every time it runs.

Simon Hudon (Jul 30 2020 at 16:11):

If I have a repo and I want to be part of that collection, git has a submodule feature that lets you reference my repo from yours. Then updating my repo doesn't automatically break your build and your submodule reference must be explicitly updated to follow my official versions.

Rob Lewis (Jul 30 2020 at 16:12):

Aha. Yeah, I think that isn't needed with the current architecture.

Gabriel Ebner (Jul 30 2020 at 16:13):

It might be time to rid of the lean-xyz branches now.

Rob Lewis (Jul 30 2020 at 16:16):

And move to releases/tags instead?

Simon Hudon (Jul 30 2020 at 16:17):

Rob Lewis said:

Aha. Yeah, I think that isn't needed with the current architecture.

I'm not sure about that. I feel like if it's necessary to have your code hosted in that repo, it denies people control of their own projects. I think it's worth considering

Rob Lewis (Jul 30 2020 at 16:21):

My first thought this morning was to fork every checked-in repo to the leanprover-contrib organization, and to automatically update revisions in leanpkg.tomls to point to these forks. But it sounded very messy. Why do we want to deny people control of their own projects?

Rob Lewis (Jul 30 2020 at 16:21):

I am looking at this as more of a CI thing, less of a package manager. Maybe this is where the difference lies.

Kevin Buzzard (Jul 30 2020 at 16:23):

I like the idea of my projects being able to track master very much. What happened with the perfectoid project would be that we'd leave it for a bit, or try to get some of it into mathlib and worry about that, and then we'd try to bump mathlib and the project would break in ways I didn't understand because I don't keep up with all mathlib changes. To hear "a change to mathlib in the last 24 hours just broke your project" is a very welcome piece of news in this situation.

Gabriel Ebner (Jul 30 2020 at 16:26):

@Rob Lewis more like only having master. The separation into lean versions is not particularly useful, its much more important to know the mathlib version. The lean-xyz branches just cause headaches.

Rob Lewis (Jul 30 2020 at 16:31):

The lean-xyz branches are good as synchronization points though. If every project only has master, and gets updated every few weeks/months to the latest mathlib, how could you ever start a new project with two separate dependencies?

Gabriel Ebner (Jul 30 2020 at 16:38):

Haha, with the current release cadence the lean version is kind of a proxy for the mathlib version.. You could always look for commits with matching date manually. But I won't push this. Its only a few more days until the clover blooms.

Rob Lewis (Jul 30 2020 at 16:41):

Yeah, a mathlib release schedule would be just as good. But that'd probably be pegged to a Lean release schedule to some degree, hah.

Simon Hudon (Jul 30 2020 at 17:02):

@Rob Lewis I don't think handling submodules will add so much to your setup and I think if could make a big difference for people who want their package in that archive

Sebastien Gouezel (Jul 30 2020 at 17:32):

IMHO, a great strength of the AFP for Isabelle is that everything is updated to be compatible with the last release. They have official releases of Isabelle every 8 months, a snapshot of the AFP at that date where everyting compiles, and then the AFP is updated together with master, where someone making a change in master is responsible for fixing what breaks in the AFP.

Kevin Buzzard (Jul 30 2020 at 17:36):

So what happens when they refactor subgroups?

Sebastien Gouezel (Jul 30 2020 at 17:40):

They fix everything. But it's stable enough that they don't need to refactor subgroups every day.

Simon Hudon (Jul 30 2020 at 17:43):

I'm not sure it's such a good idea that contributors to mathlib would be responsible for a potentially very diverse set of libraries. Especially since I think there should be almost no barrier to entry for that archive so a lot of it can be pretty bad code.

Sebastien Gouezel (Jul 30 2020 at 17:46):

For the AFP, there is a barrier entry, in the sense that the AFP maintainers check that the code is not crazy. But the bar is not too high either, just reasonable.

Gabriel Ebner (Jul 30 2020 at 17:50):

What would be something that is acceptable for the AFP but not acceptable for mathlib?

Sebastien Gouezel (Jul 30 2020 at 17:52):

If you want to formalize something that is interesting to you but not to the community (logic puzzles from a book, that you want to formalize the way you want), for instance.

Sebastien Gouezel (Jul 30 2020 at 17:53):

Or 10 different versions of graph libraries that are mutually incompatible.

Simon Hudon (Jul 30 2020 at 17:59):

I think we might be better off reserving that treatment for only parts of the repository. I think there's a big review load already.

Sebastien Gouezel (Jul 30 2020 at 18:02):

Yes, I agree that it's not necessary that everything is fixed along the way, especially when the core library is seeing a lot of refactors. It might become an option when things are much more stable. But I think it is important that some things are fixed along with mathlib (for instance books and tutorials)

Simon Hudon (Jul 30 2020 at 18:11):

:+1:

Simon Hudon (Jul 30 2020 at 18:12):

Probably projects that are used by many others too

Rob Lewis (Jul 30 2020 at 18:29):

This is what I was trying to capture by:
Rob Lewis said:

We may decide that some subset of projects are "blessed" and get to participate in mathlib CI in some way. Obviously we'd need criteria to make this decision.

The LFTCM tutorials could be blessed. Changes to mathlib that break them would raise a big flag. There would be at least a social obligation to fix the tutorial if you make a PR that raises this flag.

Rob Lewis (Jul 30 2020 at 18:30):

But I do think we should move away from putting everything important in the mathlib repo. People who want to use the tutorials shouldn't have to clone the mathlib repo. It should be a Lean project like the LFTCM repo was.

Rob Lewis (Jul 30 2020 at 18:31):

I'm familiar with the Isabelle AFP model and it's an ideal to keep in mind. But I'm pretty sure mathlib still changes too drastically too quickly for it to be practical right now. We'd need a serious review process on entries and the mathlib maintainers are already stretched thin.

Rob Lewis (Jul 30 2020 at 18:32):

My image here is much lighter weight. It's a way for smaller projects to get a bit of visibility and a bit of automatic updating.

Rob Lewis (Jul 30 2020 at 18:33):

Simon Hudon said:

Rob Lewis I don't think handling submodules will add so much to your setup and I think if could make a big difference for people who want their package in that archive

Could you expand on this? I'm really not sure what the benefit is.

Alex J. Best (Jul 30 2020 at 18:34):

This is a great idea! Is the plan that all projects in the list should pass with latest mathlib as much as possible? I was thinking it would be good to have a community maintained list of projects with interesting work in, this list could do that role also if there was a flag for included in CI or not, as there are already many projects which contain interesting work but are way behind mathlib master.

Rob Lewis (Jul 30 2020 at 18:39):

It will track the latest version of mathlib that a project compiles with, and flag the project owner if/when the project breaks. So yeah, it will be very easy to make a list of checked in projects and their status (if they compile with new mathlib/the last version they work with).

Simon Hudon (Jul 30 2020 at 18:39):

@Rob Lewis The idea is to make it a decentralized organization. That we can add a project to this archive without relinquishing control of the project or abandoning the github data (issues, pull requests, etc)

Rob Lewis (Jul 30 2020 at 18:41):

But that's already possible? All you have to do is add your project repo name to https://github.com/leanprover-contrib/leanprover-contrib/blob/master/projects/projects.yml

Simon Hudon (Jul 30 2020 at 18:41):

Ah ok, it's more similar to what I was doing. I thought you were storing the code there

Rob Lewis (Jul 30 2020 at 18:42):

Well, technically the script doesn't do that yet, it only fetches from the leanprover-contrib repo. But that's what I wrote in the readme :p

Rob Lewis (Jul 30 2020 at 18:43):

My original thought was to fork repos to this organization but I decided against that. Just haven't updated the couple lines of relevant code since the test repos were in this org anyway, heh.

Rob Lewis (Jul 30 2020 at 18:46):

The one thing you lose with this setup is auto-updating the checked in repos when there's no breakage. But this gets complicated quick and I don't think it's worth the trouble.

Simon Hudon (Jul 30 2020 at 18:46):

Rob Lewis said:

The one thing you lose with this setup is auto-updating the checked in repos when there's no breakage. But this gets complicated quick and I don't think it's worth the trouble.

You mean as opposed to hosting the code itself?

Rob Lewis (Jul 30 2020 at 18:48):

Yes, if we hosted the code ourselves we could update the dependencies automatically. Otherwise we need write permission to the checked in repo.

Rob Lewis (Jul 30 2020 at 18:49):

But if we keep our own version and update it, it diverges (slightly) from the main version, needing someone to keep them aligned.

Simon Hudon (Jul 30 2020 at 18:51):

Yeah. I think hosting the blessed projects only might be sufficient.

Gabriel Ebner (Jul 30 2020 at 18:51):

The builder could also submit a PR to the affected project (like the nolints PR).

Simon Hudon (Jul 30 2020 at 18:52):

Or maybe an issue?

Gabriel Ebner (Jul 30 2020 at 18:52):

Technically, a PR is an issue (they have the same API and numbering on github).

Rob Lewis (Jul 30 2020 at 18:53):

Gabriel Ebner said:

The builder could also submit a PR to the affected project (like the nolints PR).

This gets super awkward when you have project 1 depending on project 2 and project 2 needs to update. Then you need to update project 1 to a commit of project 2 that may not exist, depending on if your PR gets squashed, rebased, whatever.

Rob Lewis (Jul 30 2020 at 18:54):

It also just seems kind of annoying, there will be one PR every day like clockwork unless we magically have a day with no mathlib commits...

Gabriel Ebner (Jul 30 2020 at 18:54):

Fair enough, this also rules out one issue per day. Can we at least spam the owners via email?

Rob Lewis (Jul 30 2020 at 18:55):

We should definitely open an issue when things break. I don't think spamming them to update helps much of anything.

Rob Lewis (Jul 30 2020 at 18:56):

In fact, we should just publish an action script to update and build once a day and let people install that themselves.

Rob Lewis (Jul 30 2020 at 18:56):

That kind of does 90% of the job of this repo, heh.

Rob Lewis (Jul 30 2020 at 18:56):

It just doesn't handle dependencies very well.

Gabriel Ebner (Jul 30 2020 at 19:02):

Ah so it opens one issue, and doesn't open a new one until the first one is closed? That sounds good to me.
Do you get a notification if a cron job action fails?

Rob Lewis (Jul 30 2020 at 19:04):

Yeah, someone gets a notification. Whoever last modified the .yml file maybe? The cron job could also open an issue though.

Gabriel Ebner (Jul 30 2020 at 19:06):

I wonder who should be spammed for the tutorials or other projects that have multiple maintainers. Opening an issue is probably the easiest way to notify all interested parties.

Rob Lewis (Jul 30 2020 at 19:14):

I need to look into what leanpkg does with conflicting dependencies. This project repo might not be necessary at all...

Gabriel Ebner (Jul 30 2020 at 19:14):

If you depend on multiple versions of mathlib, the last one "wins" (in the order the dependencies are traversed).

Rob Lewis (Jul 30 2020 at 19:18):

Hmm. Then I think there is a place for the project repo. leanpkg upgrade isn't recursive, right? It won't updgrade the mathlib revision of a dependency?

Gabriel Ebner (Jul 30 2020 at 19:52):

I have completely forgotten what leanpkg upgrade does. But I'm pretty sure it isn't recursive.

Simon Hudon (Jul 30 2020 at 19:54):

I think we can be sure that any way it might resolve conflicts won't result in choosing one version of mathlib that will work with all dependencies. I think the need for the project archive is certain

Ruben Van de Velde (Jul 30 2020 at 20:06):

Not sure if this makes sense, but perhaps the project repository could store, for each project, the most recent mathlib commit it works with. Then you could have automation that regularly checks all the projects against current mathlib, and for each project, either updates the project repository (if it still works) or files an issue on the project (if it doesn't). Then you'd only have the daily (or whatever) commits in the project repository, but the maintainers of the projects can still be confident that they can upgrade when they want to

Rob Lewis (Jul 30 2020 at 21:10):

Ruben Van de Velde said:

Not sure if this makes sense, but perhaps the project repository could store, for each project, the most recent mathlib commit it works with. Then you could have automation that regularly checks all the projects against current mathlib, and for each project, either updates the project repository (if it still works) or files an issue on the project (if it doesn't). Then you'd only have the daily (or whatever) commits in the project repository, but the maintainers of the projects can still be confident that they can upgrade when they want to

I'm not quite sure what you mean by the "project repository" here -- do you mean a fork of the project to leanprover-contrib?

Ruben Van de Velde (Jul 30 2020 at 21:13):

Terminology is a little confusing :)
I was thinking you'd have the projects.yml file include a link to the git repository of the project plus a mathlib commit hash

Ruben Van de Velde (Jul 30 2020 at 21:14):

So the automation would just update the hash in https://github.com/leanprover-contrib/leanprover-contrib/blob/master/projects/projects.yml , not the git repository that holds the actual code

Ruben Van de Velde (Jul 30 2020 at 21:15):

Might be completely off base, but it avoids the issue of pushing changes to all the different git repositories every day

Rob Lewis (Jul 30 2020 at 21:16):

Ah, I see what you mean. Even better, I think it should store a table: every time CI runs, store the mathlib commit hash and the commit hash of every project that builds wrt that mathlib.

Rob Lewis (Jul 30 2020 at 21:17):

That way, if you know that you need packages x, y, z working together, you look for the most recent mathlib hash in the table that also has hashes for x, y, and z.

Rob Lewis (Jul 30 2020 at 21:18):

I definitely don't want to push changes from the CI repo to the projects. I think we should write a (separate) Action script that they can install on their own, to do daily updates.

Rob Lewis (Jul 31 2020 at 15:32):

This is "live," in that you can sign up your project and it will get checked periodically and open some issues if things are broken. See the readme for details.

Rob Lewis (Jul 31 2020 at 15:33):

It stores some history here.

Rob Lewis (Jul 31 2020 at 15:33):

At some point I'll move this to the leanprover-community org. Since the plan is no longer to fork projects, there's no need for a separate org.

Rob Lewis (Jul 31 2020 at 15:34):

And it will be more powerful with some local Actions scripts on the project repos, coming later.

Gabriel Ebner (Jul 31 2020 at 15:45):

Adding lean-x.y.z branches is a massive commitment, is there a chance to make it test master?
If I were to add a lean-3.17.1 branch today, will the bot at least try to upgrade it to lean-3.18.4 in a few days?

Kevin Buzzard (Jul 31 2020 at 15:52):

I only care about building my projects against mathlib master

Rob Lewis (Jul 31 2020 at 15:52):

My plan is to write an action for the project repo to install, that will mirror each commit to master to the right lean-x.y.z branch. I don't think it should be the job of this tool to guess the "last" project commit for a particular Lean version. Same with trying to upgrade to new Lean versions. If it's done by an action in the project repo, it can automatically push the new branch when a successful upgrade can be done.

Rob Lewis (Jul 31 2020 at 15:53):

Assuming these actions are in place in the project repo, the project master will always be up to date with a lean-x.y.z branch, so there's no need to test master too.

Rob Lewis (Jul 31 2020 at 15:54):

@Kevin Buzzard your projects don't have any dependencies besides mathlib, do they?

Kevin Buzzard (Jul 31 2020 at 15:54):

No

Rob Lewis (Jul 31 2020 at 15:55):

You'll be welcome to add them to this repo, but it won't do anything you can't do locally. I'm going to make a GitHub action you can install on your own project. It will try to upgrade your master to mathlib master every night (or however often you want).

Gabriel Ebner (Jul 31 2020 at 15:59):

So if I don't upgrade the lean version, then the repo won't notify if changes in mathlib have broken my project? If I upgrade the lean version, then I need to upgrade mathlib as well. So everything needs to be done manually, and the bot doesn't help me.

Rob Lewis (Jul 31 2020 at 16:00):

Rob Lewis said:

You'll be welcome to add them to this repo, but it won't do anything you can't do locally. I'm going to make a GitHub action you can install on your own project. It will try to upgrade your master to mathlib master every night (or however often you want).

Now that I think about it, this exact action is already running in the tutorial repo. https://github.com/leanprover-community/tutorials/blob/master/.github/workflows/main.yaml You could copy this directly into the same directory in your own repo and it will update it once a week if it can. Change the funny number at the top to do it more often.

Rob Lewis (Jul 31 2020 at 16:03):

Gabriel Ebner said:

So if I don't upgrade the lean version, then the repo won't notify if changes in mathlib have broken my project? If I upgrade the lean version, then I need to upgrade mathlib as well. So everything needs to be done manually, and the bot doesn't help me.

An action running in your repo will upgrade the Lean and mathlib versions and push to a new lean-x.y.z branch. It can do this regardless of whether your project builds with the upgrade. Then the bot will pick it up.

Gabriel Ebner (Jul 31 2020 at 16:15):

The tutorials action doesn't seem to update the lean-x.y.z branch though.

Rob Lewis (Jul 31 2020 at 17:25):

No, that script is ancient, almost two months old! From before I knew there was any need for the x.y.z. branches. But it will do @Kevin Buzzard 's job of checking his master against mathlib master just fine.

Scott Morrison (Jul 31 2020 at 23:45):

I just want to add a contrary note here, that I think this whole concept may be harmful. I think already we miss out on lots of good material that is developed in other repositories, which then bitrots. I'm unpersuaded so far that this proposal will improve this; in fact by blessing it as standard practice it may make it worse, and just leave us with an excellent reference list of broken projects.

Rather, I think we should be encouraging everyone to do their new work on branches of mathlib, and actively discourage even creating new repositories.

I appreciate this is being contrary; hopefully just food for thought and discussion. (Of course I'm happy to help out and contribute if this new system is the way to go.)

Rob Lewis (Aug 01 2020 at 11:18):

Uh, so my response turned into a short essay, sorry... posting this in pieces.

Rob Lewis (Aug 01 2020 at 11:19):

Rather, I think we should be encouraging everyone to do their new work on branches of mathlib, and actively discourage even creating new repositories.

Doing work on a mathlib branch doesn't accomplish anything unless that work gets PRed. And I don't think it's maintainable to say that every Lean project should be PRed to mathlib.

Rob Lewis (Aug 01 2020 at 11:19):

First, there's the PR process. We can't assume that anyone making a PR to mathlib will stick around to maintain their own code. So the community needs to be willing and able to keep additions up to date. There's a burden on the person making the PR to get their code into reasonable shape. There's a huge burden on the reviewers to guide them through this process and check that the code actually is reasonable -- you know as well as I do how much effort it can take to get a rough PR into shape.
And any lapses here make a burden for future contributors, who may not even know or care about this material, but need to change something that it depends on.

Rob Lewis (Aug 01 2020 at 11:19):

Saying that everything should be PRed to mathlib is telling everyone involved that it's always worth it to take on these burdens. I don't think this is fair: I can absolutely imagine someone doing something cool in Lean, that they'd like to keep reasonably up to date, that they don't think is worth the effort of PRing. I guarantee this has happened before.

Rob Lewis (Aug 01 2020 at 11:19):

Second, there's the question of what material "belongs" in mathlib. Obviously this is subjective. I don't want to debate the edge cases here, but I can name two projects I've looked at in the last week that I think clearly don't belong and are clearly worth archiving in some way: the Mathematica link and verified Sudoku. I don't know of any programming language where every project written in that language is distributed as part of its standard library.

Rob Lewis (Aug 01 2020 at 11:19):

Mathlib is a resesarch project in its own right and also a standard library. As a project, I think some of its strengths are that it does things at high levels of generality and is surprisingly interconnected and coherent. We don't (or we try not to) have multiple APIs for the same content, similar things are glued together as much as possible, etc.

Rob Lewis (Aug 01 2020 at 11:19):

Opening it up with an "anything goes" approach would contradict all this. I can imagine getting five different pedagogical developments of intro calc, based on different first principles, unconnected to each other, unused and unusable in the rest of the library. All have perfectly good value as Lean projects. None fit in mathlib as a research project. And why should they be part of a standard library? It's like five different markdown parsers in Python. If you want to use one, you start your own Python project and pip install that package. Same here: if you want to build material based on someone else's intro calc, you start your own Lean project and leanpkg add that package, which you find on a list on our website, generated from the repository I'm proposing.

Rob Lewis (Aug 01 2020 at 11:20):

(end of essay)

Rob Lewis (Aug 01 2020 at 11:22):

Of course none of this is to say that we shouldn't pull in material from other projects when it seems appropriate. And a centralized project repository makes it easier to see what's been done in other projects.

Scott Morrison (Aug 01 2020 at 11:34):

Maybe we should do both: set up this infrastructure, and at the same time work out of more systematic way to encourage anyone working on something (plausibly) "library-worthy" that we really hope their contribution will make it into mathlib, not just as CI-checked repository, and a good way to do this is to do work in branches of mathlib.

Mario Carneiro (Aug 01 2020 at 11:42):

I am waiting for the discussions to settle down so I can put my project that is worth keeping up to date (with me as maintainer, not mathlib) but not worth distributing to everyone

Mario Carneiro (Aug 01 2020 at 11:43):

My view on this is similar to Kevin's: I want my project to be built against current master, and if a mathlib change breaks it I would like to be notified, but after that if I don't pick it up then it should just be marked as abandoned so that it doesn't bother the global tracker

Rob Lewis (Aug 01 2020 at 11:45):

Mario Carneiro said:

I am waiting for the discussions to settle down so I can put my project that is worth keeping up to date (with me as maintainer, not mathlib) but not worth distributing to everyone

Does your project meet the criteria here / can you make one lean-3.17.1 branch mirroring master? If so, please add it as a test!

Rob Lewis (Aug 01 2020 at 11:46):

Clearly more testing is needed. For instance, I see from last night's run that I got the case wrong where mathlib changes but no project does.

Mario Carneiro (Aug 01 2020 at 11:47):

This seems to require that I pin down the version? I just said I don't want to do that (it is too high maintenance)

Rob Lewis (Aug 01 2020 at 11:48):

Mario Carneiro said:

My view on this is similar to Kevin's: I want my project to be built against current master, and if a mathlib change breaks it I would like to be notified, but after that if I don't pick it up then it should just be marked as abandoned so that it doesn't bother the global tracker

As with Kevin, the only part of this that can't be done locally is publicity -- it sounds like you just want an action that tries to update your project every day.

Mario Carneiro (Aug 01 2020 at 11:49):

every month would be fine for my abandoned projects

Rob Lewis (Aug 01 2020 at 11:49):

Mario Carneiro said:

This seems to require that I pin down the version? I just said I don't want to do that (it is too high maintenance)

This repo is 48 hours old, not all features are there. For now one version branch is fine.

Mario Carneiro (Aug 01 2020 at 11:49):

I'm thinking this is just an upkeep operation on old projects

Rob Lewis (Aug 01 2020 at 11:50):

Rob Lewis said:

Now that I think about it, this exact action is already running in the tutorial repo. https://github.com/leanprover-community/tutorials/blob/master/.github/workflows/main.yaml You could copy this directly into the same directory in your own repo and it will update it once a week if it can. Change the funny number at the top to do it more often.

Rob Lewis (Aug 01 2020 at 11:51):

Keeping a project that depends only on mathlib up to date is a local task. There's no global activity needed there.

Rob Lewis (Aug 01 2020 at 11:52):

The job is harder when you have projects with multiple or transitive dependencies.

Mario Carneiro (Aug 01 2020 at 11:55):

I don't think I want to be a first adopter for this kind of thing though. I'll wait until after the tutorials come out

Rob Lewis (Aug 01 2020 at 12:13):

Scott Morrison said:

Maybe we should do both: set up this infrastructure, and at the same time work out of more systematic way to encourage anyone working on something (plausibly) "library-worthy" that we really hope their contribution will make it into mathlib, not just as CI-checked repository, and a good way to do this is to do work in branches of mathlib.

Sure. I have no picture of what a systematic way to do this would be, but of course I'm not against the idea.

Utensil Song (Aug 01 2020 at 13:34):

As a newcomer, I really love this prototype and the ideas behind it.

Utensil Song (Aug 01 2020 at 13:34):

The "monolithic" philosophy behind Mathlib is crucial to absorb important formalized mathematics as much as possible. Every project out there should seriously consider PRing part of it (if not all of it) into mathlib, especially the foundation part of the project that most likely should be kept interconnected and coherent with other parts of Mathlib, be kept general, maintainable, efficient to the standard of mathlib, and encapsulate the foundation into high-level primitives and theorems friendly to users who are less sophisticated in math.

Utensil Song (Aug 01 2020 at 13:34):

On the other hand, the reality is, as it should be, that there're satellite projects such as tutorials, games, workshop repos to form the ecosystem, as well as formalized projects, domain-specific stand-alone projects, educational projects like in Xena projects. An ecosystem like this is the model that scales.

Utensil Song (Aug 01 2020 at 13:35):

What's lacking for these projects, are the generic tooling to keep them in good shape. There're lots of tools to help Mathlib to be maintainable but can't be easily reused for a stand-alone project, arguably this is one of the reasons for the projects that are not PRed into Mathlib to rot.

Utensil Song (Aug 01 2020 at 13:35):

Understandably, the work on generalizing these tools (or merge some tools into a toolkit, e.g. leanproject, format_lean, doc_gen, blueprint etc.) to work for the whole ecosystem is currently marked to have a low priority by core members of Mathlib. This is an inevitable resource bottleneck in the current stage. But these're mostly just hard work (i.e. no hard technical barrier) and definitely can be solved when there're more users of Lean and Mathlib and the motivations accumulate. This prototype sends a signal for these coming goodies.

Rob Lewis (Aug 02 2020 at 12:52):

Here's a first action: https://github.com/leanprover-contrib/update-versions-action

Rob Lewis (Aug 02 2020 at 12:53):

Install this and it will keep your lean-x.y.z branches mirroring master. Going forward, anyway. It won't create anything historical.

Rob Lewis (Aug 02 2020 at 12:57):

Assuming you don't use lean-x.y.z branches, this action will seem to do nothing for you, which is good. It's just keeping your repo structure compatible with the project repository.

Rob Lewis (Aug 03 2020 at 11:03):

And here's a second action: https://github.com/leanprover-contrib/lean-upgrade-action

Rob Lewis (Aug 03 2020 at 11:03):

Install this and it will try to upgrade your project once a day. It will push the result if the upgrade succeeds and open an issue if the upgrade fails.

Rob Lewis (Aug 03 2020 at 11:04):

I promise this action has bugs and needs testing.

Patrick Massot (Aug 03 2020 at 13:13):

Rob, as you can see here whatever this script is meant to do didn't report a mathlib update breakage. Should I get rid of it and try your latest action?

Rob Lewis (Aug 03 2020 at 13:17):

Hah. That's because it should say lean src --make.

Rob Lewis (Aug 03 2020 at 13:17):

But yes, please do switch to the action.

Rob Lewis (Aug 03 2020 at 13:17):

I think it's fine to keep it scheduled for once a week. The demo workflow for the action schedules it for once a day.

Rob Lewis (Aug 03 2020 at 13:18):

In fact, let's install both of the new actions and add it to the project repo.

Rob Lewis (Aug 03 2020 at 13:34):

@Patrick Massot a website question while you're here. The project repository stores the success/failure of the builds like this: https://raw.githubusercontent.com/leanprover-contrib/leanprover-contrib/store-version-history/version_history.yml

I'd like to display this on the website as a table.

             | 3.16.3 | 3.17.0 | 3.17.1 | 3.18.4
mathematica  | build  |        | build  |
test-repo-1  | build  | fail   | build  | fail
test-repo-2  |
another-test |

etc. I can put the data in whatever format is convenient. What would be the way to do this that fits in with the rest of the website generation?

Patrick Massot (Aug 03 2020 at 13:36):

Rob Lewis said:

In fact, let's install both of the new actions and add it to the project repo.

I'm not sure I understand this sentence.

Rob Lewis (Aug 03 2020 at 13:37):

tutorials#25

Patrick Massot (Aug 03 2020 at 13:38):

Thanks

Patrick Massot (Aug 03 2020 at 13:39):

About the website question: do you want a new web page in the community website displaying this table?

Rob Lewis (Aug 03 2020 at 13:39):

Yeah, that was the plan.

Patrick Massot (Aug 03 2020 at 13:40):

Do you like the tables from the mathlib stats page?

Rob Lewis (Aug 03 2020 at 13:40):

We can have a page that lists all the projects that are checked in with some information about them, and displays this table at the end, to show which projects are compatible with which Lean versions.

Rob Lewis (Aug 03 2020 at 13:41):

Patrick Massot said:

Do you like the tables from the mathlib stats page?

Sure, I'm not picky. Those are higher tech than I was imagining for this.

Rob Lewis (Aug 03 2020 at 13:42):

But if the tools are there, no reason not to reuse them!

Patrick Massot (Aug 03 2020 at 13:43):

They are very cheap to generate, they use http://tabulator.info/. See https://github.com/leanprover-community/leanprover-community.github.io/blob/newsite/templates/mathlib_stats.html#L180-L188

Patrick Massot (Aug 03 2020 at 13:44):

In that case the data comes from https://leanprover-community.github.io//mathlib_stats/gitstats.js

Patrick Massot (Aug 03 2020 at 13:45):

Search for authors = and authors_cols = in the js data file

Patrick Massot (Aug 03 2020 at 13:46):

Simply staring at those examples should be enough to build what you want very easily.

Rob Lewis (Aug 03 2020 at 13:47):

Okay, cool, thanks. I'll take a look.

Rob Lewis (Aug 25 2020 at 16:16):

Just to update people on the status here: I've written some Github Actions scripts that will help maintain your projects. See the recommendations here. These are still in beta so please report bugs/annoying behavior.

If your project depends only on mathlib (or on nothing at all), and nothing depends on your project, these scripts are really all you need for CI. But you should add it to the contrib repository anyway. This will help if other people want to use your project as a dependency. It will also help with publicity -- I have a draft of a page for the leanprover-community site that will display the checked-in projects with compatibility information.

Patrick Massot (Aug 25 2020 at 16:27):

Thanks! Should you add the tutorials project there?

Patrick Massot (Aug 25 2020 at 16:27):

Or even MIL?

Rob Lewis (Aug 25 2020 at 16:31):

Both of those should definitely use the actions scripts. We can also check them in to the repo, but there's less benefit there. I don't think either will be used as a dependency and they might be out of place on the display page. But that depends what else goes in to the repo I guess.

Patrick Massot (Aug 25 2020 at 16:32):

I was asking about the repo, the script case is clear.

Patrick Massot (Aug 25 2020 at 16:32):

Sorry this wasn't clear.

Rob Lewis (Aug 25 2020 at 16:56):

https://github.com/leanprover-community/leanprover-community.github.io/pull/121 introduces the page that displays checked in projects.

Patrick Massot (Aug 25 2020 at 16:58):

Does this picture means your script can't build the perfectoid project? Our CI says it builds (with an ancient mathlib of course).

Rob Lewis (Aug 25 2020 at 17:00):

The picture says it doesn't build with mathlib's lean-3.5.1 branch. It uses the version branches as "releases."

Rob Lewis (Aug 25 2020 at 17:00):

My guess is, if you found the last commit before you upgraded from 3.4.2 and put that on a branch called lean-3.4.2, you'd get a check mark in that column.

Patrick Massot (Aug 25 2020 at 17:01):

lean-3.5.1 means right before mathlib moved to whatever the next version of Lean was, right?

Rob Lewis (Aug 25 2020 at 17:02):

Right.

Patrick Massot (Aug 25 2020 at 17:02):

If we are lucky maybe that was not too long after we stopped bumping mathlib in the project.

Patrick Massot (Aug 25 2020 at 17:02):

Johan or Kevin, do you feel some motivation?

Rob Lewis (Aug 25 2020 at 17:02):

leanpkg upgrade should bump mathlib to its last commit on the lean-3.5.1 branch, if you want to try.

Rob Lewis (Aug 25 2020 at 17:05):

The checkmarks update once a day sometime in the middle of the night European time.

Rob Lewis (Aug 25 2020 at 17:16):

Rob Lewis said:

The picture says it doesn't build with mathlib's lean-3.5.1 branch. It uses the version branches as "releases."

To explain this a bit more: the idea with that table is that, if you want to import e.g. mathlib, topos, and lean-perfectoid-spaces in the same project, you need to pick a Lean version that they all support. But there's another compatibility issue, since topos and lean-perfectoid-spaces both depend on mathlib, you need a mathlib version that's compatible with both. So we assume that for any project A depending on B on Lean version x.y.z, the head commit on A's x.y.z branch should build using the head commit of B's x.y.z branch. That's what the check marks are testing. If you need three projects and they all have check marks on the same Lean version, the head commits on those branches are all compatible so you're in luck.

Rob Lewis (Aug 25 2020 at 17:17):

The story the graph tells now is "there's not much compatibility," hah. Best you can do is super2, mathematica, and mathlib on 3.17.1.

Rob Lewis (Aug 25 2020 at 17:17):

Or super, mathematica, and mathlib on 3.4.1, but that's cheating, because neither super nor mathematica use mathlib.

Patrick Massot (Aug 25 2020 at 17:18):

Not a big surprise...

Rob Lewis (Aug 25 2020 at 17:20):

The picture would be rosier if there were more x.y.z branches, which those actions will help with going forward.

Patrick Massot (Aug 25 2020 at 17:28):

I think the main limiting ingredients are project depending on something else than mathlib.

Patrick Massot (Aug 25 2020 at 17:29):

(or, better, on mathlib and something else).

Rob Lewis (Aug 25 2020 at 17:37):

Limiting ingredients for stress-testing the setup, for sure.

Rob Lewis (Aug 25 2020 at 17:38):

The fewer of those we have, the more check marks we should get.

Kevin Buzzard (Aug 25 2020 at 18:48):

Do we want this perfectoid thing compiling again?

Rob Lewis (Aug 28 2020 at 16:20):

I linked this in another thread, but to repeat it here: here are all the projects checked in so far. Add yours! And star the ones that are there so we look good.

Bhavik Mehta (Aug 28 2020 at 20:22):

I should probably write a better description :)

Julian Berman (Sep 03 2020 at 23:39):

Darn, I should have looked more carefully for this an hour ago instead of essentially manually rewriting it and then looking after to see if someone else had done this :). This action should basically remove me needing to do these installs manually, right? https://github.com/Julian/lean-across-the-board/blob/main/.github/workflows/ci.yml#L20-L30

Julian Berman (Sep 03 2020 at 23:40):

In case it's useful feedback, I did try and find something like this but I looked for something called setup-lean, in line with what other actions use, or for something like toolchain, in line with what the rust (cargo) action uses (https://github.com/actions-rs/toolchain)

Julian Berman (Sep 03 2020 at 23:44):

Hm. It won't though leave behind the lean binary on the host right, just in the container it's running in? So I wouldn't be able to run the lean --make test command afterwards?

Rob Lewis (Sep 04 2020 at 07:09):

I believe it doesn't leave behind the binary, no. But it runs leanpkg test which first builds src and then test if it exists.

Rob Lewis (Sep 04 2020 at 07:10):

I'm no expert at designing these things, happy to accept improvements.

Julian Berman (Sep 04 2020 at 12:23):

Rob Lewis said:

I believe it doesn't leave behind the binary, no. But it runs leanpkg test which first builds src and then test if it exists.

Ah that's awesome! Didn't know that (sounds like you may indeed be an expert then :D)

Julian Berman (Sep 04 2020 at 23:39):

Rob Lewis said:

Just to update people on the status here: I've written some Github Actions scripts that will help maintain your projects. See the recommendations here. These are still in beta so please report bugs/annoying behavior.

If your project depends only on mathlib (or on nothing at all), and nothing depends on your project, these scripts are really all you need for CI. But you should add it to the contrib repository anyway. This will help if other people want to use your project as a dependency. It will also help with publicity -- I have a draft of a page for the leanprover-community site that will display the checked-in projects with compatibility information.

OK yeah dropping in this action indeed worked perfectly :) thanks!

Rob Lewis (Sep 09 2020 at 10:33):

Another reminder to people, set up some CI for your project and add it to the list of external projects! Once the list is a little more populated we should start linking there.

Bhavik Mehta (Sep 10 2020 at 18:53):

Out of curiosity, does this give me compiled oleans I can download somewhere like mathlib does?

Bryan Gin-ge Chen (Sep 10 2020 at 19:54):

No, but we could potentially have the CI scripts create github actions artifacts and then maybe have leanproject try to get those for non-mathlib projects. (Maybe this is too much of a step backwards though, since we moved away from that sort of thing due to people hitting GitHub API limits.)

Bhavik Mehta (Sep 10 2020 at 19:57):

Aren't the projects already being compiled by CI when there's a new version of mathlib though - so the limits which could be hit are storage of oleans rather then computation usage

Bryan Gin-ge Chen (Sep 10 2020 at 19:59):

Sorry, I was unclear. I meant that leanproject users (maybe it was still called update-mathlib then?) were hitting GitHub's API limits when they tried to download oleans that were hosted on GitHub.

Bhavik Mehta (Sep 10 2020 at 20:00):

Ah I see, makes sense

Bryan Gin-ge Chen (Sep 10 2020 at 20:01):

Mathlib's olean files are hosted on Azure right now, and I think it would be difficult to allow random Lean projects to upload their own oleans while keeping the credentials secure.

Rob Lewis (Sep 11 2020 at 09:02):

Yeah, I agree with @Bryan Gin-ge Chen here. We can't allow arbitrary repos to upload to Azure without making it "free unlimited file storage for everyone." We could change the action to upload an artifact which you could download by hand. IIRC, at least when we started using Actions, there was no way to map a commit hash to a particular artifact. And this isn't even well defined, you can run the same action multiple times on the same commit. So "leanproject but downloading from Actions artifacts instead of Azure" doesn't work.


Last updated: Dec 20 2023 at 11:08 UTC