Zulip Chat Archive

Stream: lean4

Topic: lake exe cache get! failure


Sebastien Gouezel (Sep 15 2023 at 05:37):

When there are partly downloaded files, lake exe cache get! fails, which defeats its purpose:

PS C:\Users\Sebastien\Desktop\mathlib4_2> lake exe cache get!
Attempting to download 3757 file(s)
uncaught exception: already exists (error code: 17, file exists)
  file: C:\Users\Sebastien\.cache\mathlib\ed2f5f00b2156522.ltar.part and/or C:\Users\Sebastien\.cache\mathlib\ed2f5f00b2156522.ltar

Should a run of lake exe cache get erase first all .partfiles? (Maybe bad if one is doing two runs at the same time on different branches). Or adjust some flags to not fail on creation if the file already exists?

Mario Carneiro (Sep 15 2023 at 05:50):

I can't replicate, maybe it is windows-only?

Scott Morrison (Sep 15 2023 at 05:50):

This might be a bug introduced by the .part PR. If so, my fault!

Mario Carneiro (Sep 15 2023 at 05:51):

Strictly speaking we never actually create the part file directly, we put that file name in a config and pass it to curl

Mario Carneiro (Sep 15 2023 at 05:51):

but the error message seems like some file IO in lean is failing

Mario Carneiro (Sep 15 2023 at 05:54):

aha, that's an error message coming from the IO.rename

Mario Carneiro (Sep 15 2023 at 05:54):

it is saying that the .ltar file to be overwritten already exists

Mario Carneiro (Sep 15 2023 at 05:59):

By comparison to https://doc.rust-lang.org/std/fs/fn.rename.html it seems like the best fix is to make IO.FS.rename pass the right flag so that it also automatically replaces the existing file on windows for consistency with other OS's

Scott Morrison (Sep 15 2023 at 05:59):

(and that is definitely my fault for not thinking about what would happen in that case!)

Mario Carneiro (Sep 15 2023 at 06:09):

lean4#2546

Kevin Buzzard (Sep 15 2023 at 06:27):

FWIW, If I have trouble with get! after an "incident" (eg getting off a train and losing internet connection last week) my next move is usually rm -rf lake-packages and then get! again.

Kevin Buzzard (Sep 15 2023 at 06:28):

Oh but looking at the error I see I should also be nuking .cache :-)

Mario Carneiro (Sep 15 2023 at 06:28):

I should remark that although a lot of people seem to think that clearing the lake-packages and/or build directory or getting a clean mathlib installation will fix cache problems, you are clearing the wrong place

Mario Carneiro (Sep 15 2023 at 06:28):

.cache is what you should be nuking

Kevin Buzzard (Sep 15 2023 at 06:29):

I'll add it to the list

Floris van Doorn (Sep 17 2023 at 20:09):

On Windows, on the latest version of mathlib, unpacking of lake exe cache get is blazingly fast (~200ms, which was ~100x faster than before), but the cache also doesn't work: I still have to recompile mathlib afterwards. I expect the files are not actually getting copied.

Note: I first got a bunch of errors of the following form, when having VSCode open. That is not a problem, but might have caused the other behavior.

C:\Users\Floris\.cache\mathlib\e41550aac102cd35.ltar: The process cannot access the file because it is being used by another process. (os error 32)

Floris van Doorn (Sep 17 2023 at 20:19):

I also got the error in the first message of the thread, and cleared my .cache directory to fix it.

Mario Carneiro (Sep 17 2023 at 20:36):

if you get those errors, does it help to re-run the command?

Floris van Doorn (Sep 21 2023 at 12:50):

sorry for my late response. Re-running does not solve it: some files (presumably the ones that failed to be accessed before) will then not have the correct cache, and opening a Lean file will cause Lean to rebuild some files from Mathlib.

Mario Carneiro (Sep 21 2023 at 12:52):

the error at the top of the thread will be fixed when lean4#2546 percolates down to mathlib

Floris van Doorn (Sep 21 2023 at 12:52):

I ran into this again just now. I had VSCode open at the start and got the error messages. After running lc (short for lake exe cache get) a second time - this time after closing VSCode - it would still rebuild mathlib

Floris@Dell-E MINGW64 ~/projects/mathlib ((9ebe3593a...))
$ lc
info: std: updating repository '.\lake-packages\std' to revision '80089ff4cd1808ad1506a62dac557517d9b875ff'
info: proofwidgets: updating repository '.\lake-packages\proofwidgets' to revision '44e6673a20fc0449d003983d1e1f472df40f7107'
info: [0/6] Fetching proofwidgets cloud release
info: [1/9] Building Cache.IO
info: [2/9] Building Cache.Hashing
info: [2/9] Compiling Cache.IO
info: [3/9] Building Cache.Requests
info: [3/9] Compiling Cache.Hashing
info: [4/9] Compiling Cache.Requests
info: [4/9] Building Cache.Main
info: [6/9] Compiling Cache.Main
info: [9/9] Linking cache.exe
C:\Users\Floris\.cache\mathlib\d77883098349e9eb.ltar: The process cannot access the file because it is being used by another process. (os error 32)
C:\Users\Floris\.cache\mathlib\8dc5d51a6e575ab5.ltar: The process cannot access the file because it is being used by another process. (os error 32)
C:\Users\Floris\.cache\mathlib\5379c5f98d59e3ce.ltar: The process cannot access the file because it is being used by another process. (os error 32)
[... many of these lines omitted ...]
C:\Users\Floris\.cache\mathlib\dcb8d7b8acb67db2.ltar: The process cannot access the file because it is being used by another process. (os error 32)
No files to download
Decompressing 3765 file(s)


Floris@Dell-E MINGW64 ~/projects/mathlib ((9ebe3593a...))
$ lc
No files to download
Decompressing 3765 file(s)
unpacked in 29361 ms

Floris@Dell-E MINGW64 ~/projects/mathlib ((9ebe3593a...))
$ lake exe cache get!
Attempting to download 3765 file(s)
uncaught exception: already exists (error code: 17, file exists)
  file: C:\Users\Floris\.cache\mathlib\d7e6fffb792190df.ltar.part and/or C:\Users\Floris\.cache\mathlib\d7e6fffb792190df.ltar

Floris@Dell-E MINGW64 ~/projects/mathlib ((9ebe3593a...))
$ lake build
[264/3763] Building Mathlib.Init.Data.Prod
[290/3763] Building Mathlib.Data.Rat.Init
[301/3763] Building Mathlib.Tactic.WLOG
[317/3763] Building Mathlib.Util.Qq
[319/3763] Building Mathlib.Tactic.Says
[320/3763] Building Mathlib.Data.MLList.Dedup
[322/3763] Building Mathlib.Lean.Name
[339/3763] Building Mathlib.Data.TypeMax
[...] -- this continues, it is recompiling mathlib

Shreyas Srinivas (Sep 21 2023 at 12:53):

I have this issue too.

Shreyas Srinivas (Sep 21 2023 at 12:54):

I am not running MINGW, but lake update worked for a project with a more recent toolchain. I ran lake update, copied the lean-toolchain, cleared ~/.cache, ran lake exe cache get!. It worked for a project properly yesterday. Today it fails.

Shreyas Srinivas (Sep 21 2023 at 12:55):

the cache claimed to work successfully in both instances.

Mario Carneiro (Sep 21 2023 at 12:55):

note that cache is currently broken on master, #7291 is a fix

Mario Carneiro (Sep 21 2023 at 12:57):

@Scott Morrison we need a better way to keep track of a todo list for pending changes that must be done to mathlib before the next lean bump

Shreyas Srinivas (Sep 21 2023 at 12:57):

something like a trello board?

Floris van Doorn (Sep 21 2023 at 12:58):

(I'm on mathlib from 4 days ago)

Mario Carneiro (Sep 21 2023 at 12:58):

is it using 4.1.0-rc1?

Shreyas Srinivas (Sep 21 2023 at 12:59):

Mario Carneiro said:

is it using 4.1.0-rc1?

yes in my case.

Floris van Doorn (Sep 21 2023 at 13:00):

In my case also, yes

Mario Carneiro (Sep 21 2023 at 13:04):

if you are having problems with 4.1.0-rc1, sit tight until #7291 is merged

Shreyas Srinivas (Sep 21 2023 at 13:06):

I have to xD . Powershell 7 is crawling to a standstill with nearly 1100 files left to be built.

Shreyas Srinivas (Sep 21 2023 at 13:08):

Is there are a lake command to revert an update to the previous state (including bringing back the old cache copy in the case of math)?

Mario Carneiro (Sep 21 2023 at 13:15):

lake doesn't do anything with the cache

Mario Carneiro (Sep 21 2023 at 13:16):

to undo an update you check out the old version and then lake exe cache get

Shreyas Srinivas (Sep 21 2023 at 13:17):

Hmm... it would be cool if we could snapshot a project with lake and restore it, build files and everything. OTOH, it might be a big ask, given mathlib handles cache.

Mario Carneiro (Sep 21 2023 at 13:22):

this is literally what lake exe cache does

Mario Carneiro (Sep 21 2023 at 13:22):

work to incorporate cache into lake is ongoing and tricky due to the cloud storage requirements

Shreyas Srinivas (Sep 21 2023 at 13:23):

Mario Carneiro said:

this is literally what lake exe cache does

doesn't lake just store the object files? I am thinking of a full snapshot of the project directory. Config, source, build, and everything. Like a mini backup tool.

EDIT: To elaborate a bit, here's the usage I have in mind. At a clean working state, I create a checkpoint named, for e.g. Charlie by typing lake checkpoint charlie. Then if I need to return to this state, I type lake revert charlie that restores everything as it was to charlie, calling cache as needed, redownloading/restoring old mathlib4 versions to match the old lean-toolchain, etc, etc

Floris van Doorn (Sep 21 2023 at 13:24):

I use git worktree to have multiple work trees of mathlib. It makes it very easy to switch between projects. I think that is basically doing what you want.

Shreyas Srinivas (Sep 21 2023 at 13:24):

Okay I'll check it out.

Floris van Doorn (Sep 21 2023 at 13:25):

(each mathlib project has its own cache and commit)

Floris van Doorn (Sep 21 2023 at 13:25):

they share the git history

Kevin Buzzard (Sep 22 2023 at 13:16):

@Floris van Doorn does lake clean fix things?

Shreyas Srinivas (Sep 22 2023 at 13:21):

Kevin Buzzard said:

Floris van Doorn does lake clean fix things?

The problem has something to do with the way hashes are computed Cache is broken somehow. I don't see how lake clean will fix it.

Mauricio Collares (Sep 22 2023 at 14:54):

Cache is no longer broken on master, but previously lake clean fixed stuff by deleting .hash files which weren't updated by lake exe cache get


Last updated: Dec 20 2023 at 11:08 UTC