Zulip Chat Archive
Stream: lean4
Topic: Caching doesn't seem to work
Nikolas Kuhn (May 12 2023 at 22:30):
I have trouble getting mathlib4 to build with the cached files on Windows. I have all instances of VSCode closed, and am trying to run
lake clean
lake exe cache get
lake build
but it seems like it tries to build mathlib from scratch each time. I'm trying to build after porting a mathlib3 file using the script - I think it worked once when I tried to build "pure" mathlib4.
Is there anything that I'm overlooking or that could help to diagnose the issue?
Alex J. Best (May 12 2023 at 23:11):
Are you sure it is checking the files and not just compiling the executable code (what is the output of the build command?)
Nikolas Kuhn (May 12 2023 at 23:54):
[1/2369] Building Std.Tactic.OpenPrivate
[1/2369] Building Std.Data.Option.Init.Lemmas
[1/2369] Building Std.Tactic.HaveI
[1/2369] Building Std.Tactic.Unreachable
...
Mario Carneiro (May 12 2023 at 23:55):
what does lake exe cache get
do? (What you posted is indeed mathlib building, and we want to avoid that)
Nikolas Kuhn (May 12 2023 at 23:57):
Attempting to download 4 file(s)
Decompressing 2350 file(s)
It seems like it's exiting fine. If I do lake clean
beforehand it runs through a bunch of steps before downloading.
Kevin Buzzard (May 13 2023 at 00:01):
Try lake exe cache get!
maybe?
Nikolas Kuhn (May 13 2023 at 08:05):
Same issue. I might try to reinstall in WSL and see whether that works better.
Max Nowak 🐉 (May 13 2023 at 08:49):
Have you tried lake update
?
Nikolas Kuhn (May 13 2023 at 09:20):
Just tried it. Now lake clean
followed by lake exe cache get
or lake exe cache get!
give
Attempting to download 2354 file(s)
No cache files to decompress
And build still starts at the beginning.
Nikolas Kuhn (May 13 2023 at 09:21):
Also, for reference:
PS C:\Users\kuhnn\lean4\mathlib4> lean --version
Lean (version 4.0.0-nightly-2023-05-06, commit 445fd417be4d, Release)
PS C:\Users\kuhnn\lean4\mathlib4> lake --version
Lake version 4.1.0-pre (Lean version 4.0.0-nightly-2023-05-06)
Eric Wieser (May 13 2023 at 09:58):
Do you have vscode open at the same time? lake cache get
writes the same files as vscode does, so having both open makes a mess
Nikolas Kuhn (May 13 2023 at 10:44):
No, and as far as I can tell no lean processes running (although I don't know how to properly check this in windows).
Nikolas Kuhn (May 13 2023 at 10:51):
Relatedly, what is the preferred way to run start_port.sh in windows? It didn't work in Powershell (i.e. bash _
reported errors), so I ran it in the VSCode terminal. Maybe that has broken something?
Mario Carneiro (May 13 2023 at 10:52):
WSL might work
Mario Carneiro (May 13 2023 at 10:52):
I think it needs at least sed
, so I doubt a plain cmd.exe will work
Mario Carneiro (May 13 2023 at 10:54):
hm, actually it is quite a bit sophisticated, you will probably need an actual conforming bash implementation
Mario Carneiro (May 13 2023 at 10:54):
what errors do you get?
Shreyas Srinivas (May 13 2023 at 11:36):
Did you run lake update before running lake exe cache get?
Nikolas Kuhn (May 13 2023 at 13:11):
@Shreyas Srinivas yes, I did
@Mario Carneiro This is what I get.
PS C:\Users\kuhnn\lean4\mathlib4> bash ./scripts/start_port.sh Mathlib/CategoryTheory/Sites/Pushforward.lean
./scripts/start_port.sh: line 2: $'\r': command not found
: invalid optionport.sh: line 3: set: -
set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]
: invalid option namesh: line 4: set: pipefail
./scripts/start_port.sh: line 5: $'\r': command not found
./scripts/start_port.sh: line 20: syntax error near unexpected token `$'in\r''
'/scripts/start_port.sh: line 20: `case $mathlib4_path in
Mario Carneiro (May 13 2023 at 13:14):
it looks like a lot of the problem is that the shell script has windows style newlines in it which are confusing bash
Mario Carneiro (May 13 2023 at 13:14):
try opening it and saving it with just \n
newlines
Nikolas Kuhn (May 14 2023 at 12:34):
Thanks! I've got the script to work (changing the newlines, also had to add setting github credentials to within the script). I set up a new copy of mathlib4 and re-did everything -- it's still trying to build mathlib from scratch after calling lake exe cache get
(or get!
).
Nikolas Kuhn (May 14 2023 at 12:36):
I'm giving up on trying to make it work in windows directly now, and will try again with WSL.
Nikolas Kuhn (May 15 2023 at 19:04):
Got it to work in WSL. Under Windows, I could narrow down the issue:
1) If I run lake update, it modifies lake-manifest.json
(I think it changes the line endings). This leads to lake exe cache get
not working (with or without ! ):
Attempting to download 2404 file(s)
No cache files to decompress
2) If I then revert lake-manifest.json
to its original form, it seems like lake exe cache get
goes through (it actually says it's decompressing), but then mathlib starts building anew anyway.
Sebastian Ullrich (May 15 2023 at 19:12):
Could you please confirm whether the line endings were the only change? This should have been fixed by !4#3096
Sebastian Ullrich (May 15 2023 at 19:13):
In any case, do not run lake update
in mathlib4 if you want to use the cache! I don't know where this suggestion is coming from, it's the opposite of what you want to do usually.
Nikolas Kuhn (May 15 2023 at 19:18):
Thanks for the clarification! People have been asking whether I ran it at least twice in this thread, which I assumed means they suggest to do so..
Nikolas Kuhn (May 15 2023 at 19:20):
This is what git diff shows (it changes not only the line endings):
warning: LF will be replaced by CRLF in lake-manifest.json.
The file will have its original line endings in your working directory
diff --git a/lake-manifest.json b/lake-manifest.json
index 9a29eda5..7e4395ee 100644
--- a/lake-manifest.json
+++ b/lake-manifest.json
@@ -10,12 +10,12 @@
{"git":
{"url": "https://github.com/JLimperg/aesop",
"subDir?": null,
- "rev": "cdc00b640d0179910ebaa9c931e3b733a04b881c",
+ "rev": "5cf2eb0722e8d0cb58149c744f9888de1b9f0855",
"name": "aesop",
"inputRev?": "master"}},
{"git":
{"url": "https://github.com/leanprover/std4",
"subDir?": null,
- "rev": "6006307d2ceb8743fea7e00ba0036af8654d0347",
+ "rev": "fe45c329180f4d051f9f19d1a360d9e195532afb",
"name": "std",
"inputRev?": "main"}}]}
Ruben Van de Velde (May 15 2023 at 19:20):
You'll want to revert those changes in any case
Nikolas Kuhn (May 15 2023 at 19:27):
Yup! I re-reverted in order to check.
Nikolas Kuhn (May 15 2023 at 19:30):
Starting with a fresh copy of mathlib and running lake exe cache get
and lake build
still starts a fresh build. (but cache get now has files to decompress again)
Nikolas Kuhn (May 15 2023 at 19:31):
lake exe cache get
output:
info: cloning https://github.com/leanprover/std4 to .\lake-packages\std
info: cloning https://github.com/gebner/quote4 to .\lake-packages\Qq
info: cloning https://github.com/JLimperg/aesop to .\lake-packages\aesop
info: [0/8] Building Cache.IO
info: [1/8] Compiling Cache.IO
info: [1/8] Building Cache.Hashing
info: [2/8] Compiling Cache.Hashing
info: [2/8] Building Cache.Requests
info: [4/8] Compiling Cache.Requests
info: [4/8] Building Cache.Main
info: [6/8] Compiling Cache.Main
info: [8/8] Linking cache.exe
Attempting to download 2402 file(s)
Decompressing 2402 file(s)
Arthur Paulino (May 15 2023 at 19:34):
It's just building the cache
binary, right?
Arthur Paulino (May 15 2023 at 19:35):
How long does it all take? And what happens if you open mathlib4
on some random lean
file that contains theorems and stuff?
Nikolas Kuhn (May 15 2023 at 19:35):
You mean lake build
? It looks like its building everything.
Ruben Van de Velde (May 15 2023 at 19:36):
If that's where it finished, try lake build
- this should have barely if any output, which means it worked
Ruben Van de Velde (May 15 2023 at 19:36):
:(
Arthur Paulino (May 15 2023 at 19:37):
Nikolas Kuhn said:
You mean
lake build
? It looks like its building everything.
But how long does it take? More than a minute?
Nikolas Kuhn (May 15 2023 at 19:37):
To be clear, if the lines start Building ....
, that's rebuilding from scratch, right? I had it correctly build from the cached version in WSL before and that was quick and didn't have this.
Nikolas Kuhn (May 15 2023 at 19:38):
Yes, more than a minute, more like a day.
Arthur Paulino (May 15 2023 at 19:40):
Okay, if lake build
is taking forever even after cache
successfully downloaded and decompressed all the olean
files you needed, then it's likely something wrong with Lake
Arthur Paulino (May 15 2023 at 19:40):
Maybe some combination of Lake + your dev environment
Arthur Paulino (May 15 2023 at 19:52):
Here's an idea: pick an olean
file that's built from scratch with a raw clone of mathlib4
without lake exe cache get
... just go straight to lake build
. You don't need to wait for it to finish. One olean
file is enough. Once you have it, do a md5
on it and save the hash on a text file.
Then compare with the md5
hash of the same file that's downloaded by lake exe cache get
. Maybe it's different? They should be the same
Sebastian Ullrich (May 15 2023 at 20:06):
Works on my machine, this is in a fresh clone of mathlib4
image.png
Mauricio Collares (May 15 2023 at 20:08):
It doesn't work here, and I have some potentially useful information: when I run lake build, the .olean and .ilean files don't change but the .trace files do
Mauricio Collares (May 15 2023 at 20:09):
Is there a way to see what goes into the trace hash computation?
Sebastian Ullrich (May 15 2023 at 20:10):
I don't think so, though I've wished for that before
Mauricio Collares (May 15 2023 at 20:12):
Wait, I see a few "mixTrace (<- getMTime olean) ..." calls in lake
Mauricio Collares (May 15 2023 at 20:13):
Does it really require accurate modification timestamps?
Mauricio Collares (May 15 2023 at 20:14):
Ah, that's "old mode"
Nikolas Kuhn (May 15 2023 at 20:18):
Here are the associated to Mathlib/Tactic/Set.lean
with a fresh build and cache get
respectively:
7C39C1177B1D85A22CB0DF933FC5E51E
7C39C1177B1D85A22CB0DF933FC5E51E
Should be indeed the same. (I got these from ./build/Lib/Mathlib/Tactic/Set.olean
.)
Nikolas Kuhn (May 15 2023 at 20:20):
Weirdly, lake build does seem to pick up from where it stopped when I cancel it. So caching seems to work for me in principle?
Mauricio Collares (May 15 2023 at 20:21):
Nikolas, does it work if you run git config --global core.autocrlf false
and reclone mathlib4? (also @Yaël Dillies)
Mauricio Collares (May 15 2023 at 20:21):
I think Git for Windows might have some weird defaults
Nikolas Kuhn (May 15 2023 at 20:25):
Thanks! That sounds like a useful thing to set either way. I'm giving it a try.
Sebastian Ullrich (May 15 2023 at 20:26):
Yes, it turns out I must have selected autocrlf = false when installing Git for Windows
Nikolas Kuhn (May 15 2023 at 20:27):
Yes, now it works! :D
Nikolas Kuhn (May 15 2023 at 20:32):
This also fixes the issue with the porting script.
Mauricio Collares (May 15 2023 at 21:04):
Should lake traces normalize newlines (at least on .lean files)?
Sebastian Ullrich (May 15 2023 at 21:04):
Yes, we need the same thing as !4#3096 in Lake
Sebastian Ullrich (May 21 2023 at 21:17):
https://github.com/leanprover/lake/pull/170
Last updated: Dec 20 2023 at 11:08 UTC