Zulip Chat Archive

Stream: general

Topic: mathlib nightlies


Patrick Massot (Feb 17 2020 at 14:08):

Is there any reason why mathlib nightlies are pre-releases and not releases? It makes it harder to fetch them.

Patrick Massot (Feb 17 2020 at 14:11):

This is done here at the very end of the line.

Patrick Massot (Feb 17 2020 at 14:13):

@Simon Hudon ?

Patrick Massot (Feb 17 2020 at 14:14):

Specifically, this choices makes it impossible to use https://developer.github.com/v3/repos/releases/#get-the-latest-release

Rob Lewis (Feb 17 2020 at 14:15):

No idea. Would changing it require updates to cache-olean/update-mathlib too? I wouldn't be inclined to spend much time changing this. Instead we should figure out the infrastructure to store builds for each commit on our own server.

Patrick Massot (Feb 17 2020 at 14:16):

No, this would have no impact on existing scripts. They loop through all releases, pre or not.

Patrick Massot (Feb 17 2020 at 14:18):

Rob, do you have enough admin power to change the latest nightly into an actual release?

Rob Lewis (Feb 17 2020 at 14:20):

Done, although I'm not sure what difference it makes...

Patrick Massot (Feb 17 2020 at 14:21):

Awesome! I explained the difference a few messages ago (the one with a GitHub API link)

Rob Lewis (Feb 17 2020 at 14:21):

Ah, sorry, missed that message.

Patrick Massot (Feb 17 2020 at 14:22):

No problem.

Sebastian Ullrich (Feb 17 2020 at 14:27):

I originally set up Lean nightlies as pre-releases because semantically it seemed more appropriate, and elan can't use the API anyway because of the rate limiting.

Patrick Massot (Feb 17 2020 at 14:28):

Thanks for the explanation. My experience is the API rate limiting is annoying when developing tools, but ok when using them.

Sebastian Ullrich (Feb 17 2020 at 14:54):

That depends on your expected number of users. It would not scale at all for elan.

Patrick Massot (Feb 17 2020 at 14:56):

I don't understand what you mean. Each user gets a limit, right?

Sebastian Ullrich (Feb 17 2020 at 14:59):

Each public IP address gets a (very low) limit, afair. I think we had some issues from users/scripts sharing an IP address.

Patrick Massot (Feb 17 2020 at 15:06):

I see that IP sharing could an issue in universities.

Patrick Massot (Feb 17 2020 at 15:07):

Currently I guess this matters only for people working at VU.

Patrick Massot (Feb 17 2020 at 15:07):

and maybe CMU.

Gabriel Ebner (Feb 17 2020 at 15:09):

At VU (and TU Wien) every computer has their own public IP so this shouldn't be an issue.

Rob Lewis (Feb 17 2020 at 15:11):

I disabled the cache-olean githook because I was getting rate limiting errors when switching around a bunch of branches. (Also it was kind of slow.) So it does come up in practice.

Rob Lewis (Feb 17 2020 at 15:12):

Again, we should probably start hosting these things ourselves pretty soon.

Johan Commelin (Feb 21 2020 at 08:43):

What do we need of a host for the nightlies?

Johan Commelin (Feb 21 2020 at 08:43):

The bandwidth won't be that much, right?

Johan Commelin (Feb 21 2020 at 08:43):

I've got a server running in my basement, I'm fine with hosting stuff.

Mario Carneiro (Feb 21 2020 at 08:44):

the main problem with basement servers is uptime

Johan Commelin (Feb 21 2020 at 08:44):

Sure

Johan Commelin (Feb 21 2020 at 08:45):

Currently I've been up for 250 days in a row...

Johan Commelin (Feb 21 2020 at 08:45):

But I agree that it is fragile

Mario Carneiro (Feb 21 2020 at 08:45):

Maybe we can get a mirror setup and crowdsource several basement servers?

Johan Commelin (Feb 21 2020 at 08:46):

Yup, I guess that would work

Patrick Massot (Feb 21 2020 at 09:05):

I think Scott also said he could host stuff at some point.

Patrick Massot (Feb 21 2020 at 09:06):

How could we have a mirror setup? Doesn't it require a server that is always responding?

Johan Commelin (Feb 21 2020 at 09:19):

But supposedly that would be very lightweight

Johan Commelin (Feb 21 2020 at 09:19):

You could host a static file with URLs on github.io

Johan Commelin (Feb 21 2020 at 09:20):

And clients can cache the URLs, if all of them fail, check on github.io what the latest URLs are

Johan Commelin (Feb 21 2020 at 09:20):

Maybe I should say hostname instead of URL

Johan Commelin (Feb 21 2020 at 09:20):

And then we manually maintain that list of server hostnames on github.io

Patrick Massot (Feb 21 2020 at 09:29):

Maybe we could even use torrents.

Patrick Massot (Feb 21 2020 at 09:29):

But we should let people who know about computers setup all this.

Rob Lewis (Feb 21 2020 at 09:38):

Buying actual space/bandwidth on Azure/S3/whatever will cost pennies if anything at all. Jasmin said he's happy to use grant money to cover it, if the time cost of getting it reimbursed is less than the monetary cost, which remains to be seen.

Patrick Massot (Feb 21 2020 at 09:39):

How would that work?

Rob Lewis (Feb 21 2020 at 09:39):

How would what work?

Patrick Massot (Feb 21 2020 at 09:39):

Using Azure/S3/whatever.

Rob Lewis (Feb 21 2020 at 09:39):

We get server space and push olean caches there instead of to a random GitHub repository.

Rob Lewis (Feb 21 2020 at 09:40):

cache-olean/update-mathlib get pointed there instead of to the nightly repos.

Patrick Massot (Feb 21 2020 at 09:40):

Sounds easy.

Rob Lewis (Feb 21 2020 at 09:40):

Indeed.

Patrick Massot (Feb 21 2020 at 09:41):

What would the url look like?

Rob Lewis (Feb 21 2020 at 09:41):

Dunno if there's a default. Worst case we buy leanprover-community.info for €10.

Rob Lewis (Feb 21 2020 at 09:42):

lean.loan?

Rob Lewis (Feb 21 2020 at 09:46):

I don't want something cheap. I want something we don't need to waste time maintaining. A network of basement servers with a URL cache on GitHub doesn't sound like that.

Patrick Massot (Feb 21 2020 at 09:46):

See Johan? We need the computer experts to handle that.

Rob Lewis (Feb 21 2020 at 11:37):

The computer expert is on the case. Everyone hold your breath for 90 minutes. https://github.com/leanprover-community/mathlib/runs/460051728

Patrick Massot (Feb 21 2020 at 13:19):

RESPONSE Status: 403 This request is not authorized to perform this operation using this permission. :sad:

Rob Lewis (Feb 21 2020 at 13:20):

Arbitrary file size limit. https://github.com/leanprover-community/mathlib/runs/460227887?check_suite_focus=true

Patrick Massot (Feb 21 2020 at 13:24):

Are you sure? This message really sounds like a permission issue.

Gabriel Ebner (Feb 21 2020 at 13:25):

I think what happens is the following: if the file is very small, it's uploaded in one go. If the file is larger, then it is uploaded in two or more parts. For security reasons, we're disallowing write operations (only create is allowed). So appending is disallowed.

Rob Lewis (Feb 21 2020 at 13:25):

I'm "sure," in that I reproduced and solved it locally with a flag to allow larger files.

Patrick Massot (Feb 21 2020 at 13:33):

I knew this should be handled by people working in a CS department.

Patrick Massot (Feb 21 2020 at 15:19):

RESPONSE Status: 403 Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.

Patrick Massot (Feb 21 2020 at 15:19):

I guess this is progress.

Rob Lewis (Feb 21 2020 at 15:19):

Yes, I'm aware.

Patrick Massot (Feb 21 2020 at 15:19):

Maybe we need a toy project to test this without waiting 90 minutes between each try.

Rob Lewis (Feb 21 2020 at 16:24):

download a 21mb mathlib build here

Rob Lewis (Feb 21 2020 at 16:24):

You were wondering about the url? Nothing could be better than oleanstorage.blob.core.windows.net/mathlib!

Rob Lewis (Feb 23 2020 at 13:55):

In the 27 hours since this landed, we've accumulated 900mb of olean caches.

Patrick Massot (Feb 23 2020 at 13:56):

We should probably decide a conservation policy pretty quickly.

Patrick Massot (Feb 23 2020 at 13:56):

And implement it.

Rob Lewis (Feb 23 2020 at 13:58):

I don't think there's a huge hurry. The policy I have in mind is to keep branch builds for a month and master builds forever. So we have a month to implement that before it will matter.

Rob Lewis (Feb 23 2020 at 13:58):

Maybe more helpful: should we try to cancel artifact uploads from non-master branches if another commit is added to that branch before the build finishes?

Patrick Massot (Feb 23 2020 at 13:58):

Won't we hit the disk quota in one week instead of one month?

Rob Lewis (Feb 23 2020 at 13:59):

The free account comes with $170 credit. It'll be a while before we burn through that.

Rob Lewis (Feb 23 2020 at 13:59):

Even if we go over the disk quota.

Patrick Massot (Feb 23 2020 at 14:00):

Does it start eating up that credit automatically as soon as we go over quota?

Rob Lewis (Feb 23 2020 at 14:00):

I think we should get the infrastructure working, see what the usage looks like, and then figure out how we're paying long term.

Rob Lewis (Feb 23 2020 at 14:00):

We'll find out!

Rob Lewis (Feb 23 2020 at 14:06):

Rob Lewis said:

Maybe more helpful: should we try to cancel artifact uploads from non-master branches if another commit is added to that branch before the build finishes?

This might actually be the most useful move to save space. A lot of very useless archives will be stored by people updating a PR incrementally, like by accepting suggestions from comments. Is there a situation where someone would need to check out those oleans?

Patrick Massot (Feb 23 2020 at 14:07):

I don't think we would ever use those oleans.

Gabriel Ebner (Feb 23 2020 at 16:59):

In my current usage I don't need these oleans either. I typically push to a branch, and then do a curl https://oleanstorage.blob.core.windows.net/mathlib/$(git rev-parse HEAD).tar.gz | tar xz once the build is finished. I don't need anything except the oleans for the latest revision. This might be different if multiple people are working on a branch.


Last updated: Dec 20 2023 at 11:08 UTC