Zulip Chat Archive

Stream: lean4

Topic: fail to download some mathlib cache files


Bulhwi Cha (Jul 03 2023 at 12:38):

I can't download part of the Mathlib cache files:

$ lake exe cache get!
Attempting to download 3512 file(s)
Downloaded: 2073 file(s) [attempted 3512/3512 = 100%] (59% success)
Decompressing 2073 file(s)

Mauricio Collares (Jul 03 2023 at 12:41):

Did you change any mathlib source file?

Bulhwi Cha (Jul 03 2023 at 12:46):

I'll check the files soon.

Bulhwi Cha (Jul 03 2023 at 12:54):

my-project-dir/lake-packages/mathlib$ git status
HEAD detached at f3e62849a
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   Mathlib/Algebra/Module/Submodule/Lattice.lean

no changes added to commit (use "git add" and/or "git commit -a")

Kevin Buzzard (Jul 03 2023 at 12:55):

You won't be able to download a cache for a version of mathlib which doesn't have a cache, and perhaps a random commit won't have a cache

Bulhwi Cha (Jul 03 2023 at 12:57):

Mauricio Collares Kevin Buzzard Thanks for your help! I restored the changed Mathlib source file. Now I can download all the cache files.

my-project-dir/lake-packages/mathlib$ git restore .
my-project-dir/lake-packages/mathlib$ git status
HEAD detached at f3e62849a
nothing to commit, working tree clean
$ lake exe cache get
Attempting to download 3512 file(s)
Downloaded: 3512 file(s) [attempted 3512/3512 = 100%] (100% success)
Decompressing 3512 file(s)

Bulhwi Cha (Jul 03 2023 at 13:03):

It'd be nice if Lake printed an error message saying that there are some modified Mathlib cache files.

Arthur Paulino (Jul 03 2023 at 15:20):

Cache is built on top of how Lake behaves, but Lake is completely unaware of Cache. Lake is a Lean application that's downloaded in the toolchain. Cache is an application that's locally implemented in Mathlib.

Julian Berman (Jul 03 2023 at 15:23):

Is there no current equivalent of --fallback download-first?

Arthur Paulino (Jul 03 2023 at 15:31):

I'm not fully aware of what that does, but Cache uses (per file) content addressing to know precisely the cache file it would need. If the file is not available on Azure, it won't try to download anything else (because it wouldn't work anyway)

Julian Berman (Jul 03 2023 at 16:01):

In Lean 3 it basically means "download the cache from the most recent ancestor git commit which has a built one" (and I think would have been the default behavior if not for hysterical raisins)

Julian Berman (Jul 03 2023 at 16:01):

I think that sounds like it should work around the same here right, basically as long as the parent commit has the same per-file contents (which are only likely to differ for a few files on average) then you still save lots of time, no?

Julian Berman (Jul 03 2023 at 16:03):

Ah wait or.are you teaching me it already works this way by nature of operating per file?

Julian Berman (Jul 03 2023 at 16:03):

Probably I should simply learn how it works eh.

Arthur Paulino (Jul 03 2023 at 16:15):

Julian Berman said:

Ah wait or.are you teaching me it already works this way by nature of operating per file?

Not that I intend to teach you or anything like that. I'm just describing how Cache works. It doesn't make use of Git commit hashes because it has its own mechanics for hashing. If you branch out from main, make a local commit changing README.md and then try to use Cache, it will work smoothly

Julian Berman (Jul 03 2023 at 16:18):

Aha! Got it, nice! And the issue above was simply because it was just working tree changes and not a commit? Or did it work that way here too?

Arthur Paulino (Jul 03 2023 at 16:25):

Apparently the issue was that Bulhwi had some local changes on some file(s), which propagated to the hashes of the files that depend on it/them. Since the cache files that would result from those changes have never been pushed to Azure, there was no (probable) way Cache would succeed at downloading those files (except for some unlucky and unlikely hash collision)

Sebastian Ullrich (Jul 03 2023 at 17:06):

Bulhwi Cha said:

It'd be nice if Lake printed an error message saying that there are some modified Mathlib cache files.

I slightly clarified the output: https://github.com/leanprover-community/mathlib4/pull/5692

Bulhwi Cha (Jul 04 2023 at 03:15):

Sebastian Ullrich Thanks!


Last updated: Dec 20 2023 at 11:08 UTC