Zulip Chat Archive
Stream: mathlib4
Topic: problems with `lake exe cache`
Scott Morrison (Jun 16 2023 at 06:19):
We currently have two problems with lake exe cache get
in mathlib4:
- it does not cache anything from ProofWidgets (this is usually not a problem, but anytime you change lean versions or touch the lakefile these files will rebuild, but could be cached)
lake exe cache get Archive/Imo/Imo2001Q6.lean
fails rather than either retrieving a cache for the IMO file and all its dependencies, or just retrieving a cache for the mathlib4 files it depends on.
!4#5109 fixes these two problems.
Scott Morrison (Jun 16 2023 at 06:20):
@Mario Carneiro asked to move the discussion of that PR here; he had concerns about whether we should be fixing these problems before planning how to allow non-mathlib caches, or upstreaming cache
.
Scott Morrison (Jun 16 2023 at 06:21):
My preference is to not to wait on those problems (which we've neither made a plan for, nor done the work for) in order to fix these immediate problems.
That said, I totally agree that we should solve both of those problems!
Mario Carneiro (Jun 16 2023 at 06:21):
The ProofWidgets fix is unproblematic to me
Mario Carneiro (Jun 16 2023 at 06:21):
But I don't think we should be caching the archive files at all
Mario Carneiro (Jun 16 2023 at 06:22):
I'm not sure exactly what you meant by the cache not understanding archive inputs
Mario Carneiro (Jun 16 2023 at 06:23):
but I don't think we should waste azure space, and more importantly users' download time, on cache files for things that are never used as dependencies
Scott Morrison (Jun 16 2023 at 06:23):
Try running lake exe cache get Archive/Imo/Imo2001Q6.lean
:
% lake exe cache get Archive/Imo/Imo2001Q6.lean
uncaught exception: No match for Archive/Imo/Imo2001Q6.lean
Mario Carneiro (Jun 16 2023 at 06:24):
that's a weird way to say file not found
Scott Morrison (Jun 16 2023 at 06:24):
No -- the file exists just fine.
Mario Carneiro (Jun 16 2023 at 06:24):
that's a weird way to say cache file not found
Scott Morrison (Jun 16 2023 at 06:25):
It's not even that --- it hasn't gone looking on the azure server by that point.
Mario Carneiro (Jun 16 2023 at 06:25):
yeah I get it
Mario Carneiro (Jun 16 2023 at 06:25):
what do you expect it to say if you cache get the readme file?
Scott Morrison (Jun 16 2023 at 06:25):
Okay, good point. :-)
Scott Morrison (Jun 16 2023 at 06:26):
It just seems surprising to me to not do this. The only reason I made this PR was that I honestly typed lake exe cache get Archive/Imo/Imo2001Q6.lean
earlier today, when looking at Yury's IMO PRs!
Mario Carneiro (Jun 16 2023 at 06:27):
We can certainly improve the error message, like I said that's a weird way to say that the file's not there
Mario Carneiro (Jun 16 2023 at 06:27):
but the file isn't there
Mario Carneiro (Jun 16 2023 at 06:27):
and I don't think it should be
Mario Carneiro (Jun 16 2023 at 06:28):
Although, when you typed lake exe cache get Archive/Imo/Imo2001Q6.lean
there is actually a reasonable interpretation of that command which isn't just an error, which is to download all the dependencies of that file
Mario Carneiro (Jun 16 2023 at 06:29):
ideally it should be able to do this even for files which are not mathlib files
Eric Wieser (Jun 16 2023 at 06:35):
I think there should either be a flag for this, or a different exit code (or both); for the sake of scripting, I think it's good to know whether exe cache get Foo.lean
has obtained a complete or a partial cache
Mario Carneiro (Jun 16 2023 at 06:42):
right now cache get doesn't really have a plumbing version but it certainly could
Mario Carneiro (Jun 16 2023 at 06:42):
e.g. -q
quiet mode and just return the result in the exit code
Mario Carneiro (Jun 16 2023 at 06:44):
I don't think there is any need to differentiate between a file like this which is not in the cache because it is not a mathlib file, compared to a file which is not in the cache because it just hasn't been uploaded yet
Mario Carneiro (Jun 16 2023 at 06:44):
at least in terms of the usual progress bar mechanics
Mario Carneiro (Jun 16 2023 at 06:45):
so you would just get 99% success
in this case
Kevin Buzzard (Jun 16 2023 at 06:56):
I tell Xena people "first lake exe cache get, then lake build" and if you're just installing lean 4/mathlib for the first time then often the lake build does something
Scott Morrison (Jun 16 2023 at 07:01):
So it seems we have several options for lake exe cache get Archive/Foo.lean
- get everything, including
Foo.olean
- get whatever dependencies of Foo are available in the cache, warning that
Foo.olean
is not covered by the caching policy - get whatever dependencies of Foo are available in the cache, with the only clue that the olean is not available being a
99% success
message - fail with an error message that
Foo
is not covered by the caching policy - fail with an inscrutable error message
We are currently at 5. My PR results in 1. (which, as Mario points out, comes at the cost of lake exe cache get
now retrieving some extra files that users might prefer not to download.)
My personal preference for behaviour is in the order listed above. I'm happy to try to implement 2., 3., or 4. if there is consensus that one of those is preferable to 1. (Or, of course, to let someone else do this.)
Yaël Dillies (Jun 16 2023 at 07:04):
It seems you are missing
1.5 get all the dependencies of Foo.olean
, including itself, when running lake exe cache get Archive/Foo.lean
, but do not get it when running lake exe cache get
Mario Carneiro (Jun 16 2023 at 07:07):
I don't think there is any reason to download Archive/Foo.olean
, not even your initial use case
Mario Carneiro (Jun 16 2023 at 07:08):
if you want to open the file in the editor Foo.olean
is not of use to you
Yaël Dillies (Jun 16 2023 at 07:09):
Does that not apply to all leaf files? That would be a great way of reducing the size of the cache.
Mario Carneiro (Jun 16 2023 at 07:09):
well there are technically no leaf files in mathlib except Mathlib.lean
Yaël Dillies (Jun 16 2023 at 07:10):
Uh okay but maybe we don't want cache for Mathlib.lean
either
Mario Carneiro (Jun 16 2023 at 07:10):
plus external users want to be able to import Mathlib
Mario Carneiro (Jun 16 2023 at 07:10):
so we do want it
Yaël Dillies (Jun 16 2023 at 07:11):
What about a --no-leaf
flag?
Mario Carneiro (Jun 16 2023 at 07:11):
I think it would just be cache get --deps Mathlib/Foo.lean
Mario Carneiro (Jun 16 2023 at 07:12):
This is one of many reasons I would like to hurry up and get this integrated with lake, we are literally reinventing lake here and I don't like it
Scott Morrison (Jun 16 2023 at 07:27):
So we'll stick with 5. until such a time?
Mario Carneiro (Jun 16 2023 at 07:31):
well I would assume there are some easy ways to make the error message slightly less inscrutable
Scott Morrison (Jun 16 2023 at 07:31):
!4#5128 is the part we can do in the meantime, teaching cache
about ProofWidgtes
.
Scott Morrison (Jun 16 2023 at 07:34):
!4#5129 replaces uncaught exception: No match for Archive/Imo/Imo2001Q6.lean
with uncaught exception: Archive/Imo/Imo2001Q6.lean is not covered by the olean cache.
.
Mario Carneiro (Jun 16 2023 at 07:37):
(you should double check that my println suggestion actually works as intended)
Scott Morrison (Jun 16 2023 at 07:49):
Maybe something useful we could do is plan some other features we would like to add to lake
. I feel like with the current situation of "only Mac has ever contributed to lake", it might be good to have some small things we could add to get the ball rolling.
My suggestion would be lake mk-all
(perhaps a better name), which regenerates Mathlib.lean
, Archive.lean
, and so on. (But obviously doing so generically, by looking in the lakefile to decide what needs doing.)
Other ideas?
Mario Carneiro (Jun 16 2023 at 07:51):
maybe mk-all
should just be an exe
, utilizing a lake plumbing command to make it a one-liner
Mario Carneiro (Jun 16 2023 at 07:51):
I don't see how lake would know what files to generate otherwise
Scott Morrison (Jun 16 2023 at 09:22):
I was imagining:
- for each
lean_lib XYZ where
in the lakefile, generateXYZ.lean
importing every.lean
file underXYZ/
as we currently do in .github/workflows/build.yml
.
Mario Carneiro (Jun 16 2023 at 09:34):
But lean_lib XYZ where
imports XYZ.lean
Mario Carneiro (Jun 16 2023 at 09:34):
which doesn't exist (presumably) or is out of date
Scott Morrison (Jun 16 2023 at 09:38):
I don't think that's right. e.g. with current mathlib4
, try:
- delete
./Archive.lean
- verify
lake build
works fine - (and therefore
lake mk-all
could work too) - but
lake build Archive
fails
Scott Morrison (Jun 16 2023 at 09:38):
the fact that one particular XYZ.lean
is missing or out of date doesn't prevent us from running lake
Arthur Paulino (Jun 16 2023 at 11:32):
Mario Carneiro said:
This is one of many reasons I would like to hurry up and get this integrated with lake, we are literally reinventing lake here and I don't like it
Yes, please. lake#153
My initial idea for cache
was that it would be a quick and dirty workaround while Lake itself doesn't implement a proper solution. The target audience was ppl doing the porting.
Yaël Dillies (Jun 16 2023 at 11:36):
The linkifier is broken because it links to leanprover-community/lake
instead of leanprover/lake
.
Arthur Paulino (Jun 16 2023 at 11:37):
https://github.com/leanprover/lake/issues/153
Last updated: Dec 20 2023 at 11:08 UTC