Zulip Chat Archive

Stream: mathlib4

Topic: Some files not found in the cache


Ruben Van de Velde (Nov 17 2023 at 20:48):

On clean master:

Downloaded: 3919 file(s) [attempted 3928/3928 = 100%] (99% success)
Warning: some files were not found in the cache.
This usually means that your local checkout of mathlib4 has diverged from upstream.
If you push your commits to a branch of the mathlib4 repository, CI will build the oleans and they will be available later.

Anyone else noticed this?

Kyle Miller (Nov 17 2023 at 21:04):

Yes, I've noticed. It looks like some widget code isn't part of the cache?

% lake build
[2758/3926] Building Mathlib.Tactic.Widget.SelectPanelUtils
[2793/3926] Building Mathlib.Tactic.Widget.CommDiag
[3919/3926] Building Mathlib.Tactic.Widget.Calc
[3919/3926] Building Mathlib.Tactic.Widget.Conv
[3919/3926] Building Mathlib.Tactic.Widget.Congrm
[3919/3926] Building Mathlib.Tactic.Widget.Gcongr
[3924/3926] Building Mathlib.Tactic
[3925/3926] Building Mathlib

Scott Morrison (Nov 18 2023 at 01:13):

They are part of the cache, unfortunately they now need to be rebuilt every time.

Scott Morrison (Nov 18 2023 at 01:17):

Actually, I realise I am confused.

Scott Morrison (Nov 18 2023 at 01:17):

Those Mathlib.Tactic.Widget oleans are definitely being stored in the cache.

Scott Morrison (Nov 18 2023 at 01:18):

The source of this issue is going to be that we use the lake release bundle for ProofWidgets, to retrieve javascript files, but this also delivers olean files for ProofWidgets.

Scott Morrison (Nov 18 2023 at 01:19):

However at the moment those ProofWidgets oleans are being built with an older toolchain, so are invalid.

Scott Morrison (Nov 18 2023 at 01:19):

(We can fix that, but I still don't understand why the observed behaviour is happening.)

Scott Morrison (Nov 18 2023 at 01:20):

I would have thought that cache would see the ProofWidgets oleans are no good, and get them from the lake exe cache cache, after which the Mathlib.Tactic.ProofWidgets.* oleans would be fine, and no rebuilding would be necessary.

Scott Morrison (Nov 18 2023 at 01:21):

This is an unfortunate interaction between caching systems, and yet another reminder that we need to integrate them!

Scott Morrison (Nov 18 2023 at 01:22):

@Mac Malone, @Mario Carneiro, do either of you understand what is happening here?

Mario Carneiro (Nov 18 2023 at 01:24):

This sounds like an issue that came up before but it should have been fixed already

Mario Carneiro (Nov 18 2023 at 01:25):

Is it the case that the downloaded ltar files do not work in lake?

Mario Carneiro (Nov 18 2023 at 01:25):

Or are they just missing

Mario Carneiro (Nov 18 2023 at 04:29):

I think the issue is that the hack that proofwidgets was using to avoid downstream rebuilds no longer works because of the git hash check

Mario Carneiro (Nov 18 2023 at 04:33):

ProofWidgets doesn't actually build from source. If the release oleans don't work then building the lean files will not work either, because there are missing build files. I just tried getting a fresh mathlib and lake exe cache get; lake build and I get stuck on:

[3826/3836] Building ProofWidgets.Compat
error: > LEAN_PATH=./.lake/packages/std/.lake/build/lib:./.lake/packages/Qq/.lake/build/lib:./.lake/packages/aesop/.lake/build/lib:./.lake/packages/Cli/.lake/build/lib:./.lake/packages/proofwidgets/build/lib:./.lake/build/lib LD_LIBRARY_PATH=./.lake/packages/proofwidgets/build/lib /home/mario/.elan/toolchains/leanprover--lean4---v4.3.0-rc2/bin/lean ./.lake/packages/proofwidgets/././ProofWidgets/Compat.lean -R ./.lake/packages/proofwidgets/./. -o ./.lake/packages/proofwidgets/build/lib/ProofWidgets/Compat.olean -i ./.lake/packages/proofwidgets/build/lib/ProofWidgets/Compat.ilean -c ./.lake/packages/proofwidgets/build/ir/ProofWidgets/Compat.c
error: stdout:
./.lake/packages/proofwidgets/././ProofWidgets/Compat.lean:174:16: error: no such file or directory (error code: 2)
  file: ./.lake/packages/proofwidgets/././ProofWidgets/../build/js/compat.js
./.lake/packages/proofwidgets/././ProofWidgets/Compat.lean:170:2: error: cannot evaluate code because 'sorryAx' uses 'sorry' and/or contains errors
error: external command `/home/mario/.elan/toolchains/leanprover--lean4---v4.3.0-rc2/bin/lean` exited with code 1

The offending line is:

@[widget]
def metaWidget : Lean.Widget.UserWidgetDefinition where
  -- The header is sometimes briefly visible before compat.tsx loads and hides it
  name := "Loading ProofWidgets.."
  javascript := include_str ".." / "build" / "js" / "compat.js"

where the include_str refers to a file that does not exist, because the ProofWidgets lakefile is set up to not build the javascript files if you are a downstream project

Scott Morrison (Nov 18 2023 at 05:35):

I have seen that error a few times, but deleting .lake seemed to resolve it. I had assumed it was a consequence of switching between v4.3.0-rc2 and pre-v.4.3.0-rc2.

Mario Carneiro (Nov 18 2023 at 05:43):

There appears to be an issue in the CI script which checks the cache, see https://github.com/leanprover-community/mathlib4/actions/runs/6912140188/job/18807525121 which "fails successfully" and would have caught this issue

Mario Carneiro (Nov 18 2023 at 05:47):

oh you are right:

$ rm -rf .lake
$ lake exe cache get
$ lake build
builds 9 files, works
$ lake clean
$ lake exe cache get
$ lake build
fails to build ProofWidgets.Compat

Mario Carneiro (Nov 18 2023 at 05:47):

I don't think this can be explained as a versioning issue

Mario Carneiro (Nov 18 2023 at 05:50):

I diffed the two .lake folders and the only difference is that .lake/packages/proofwidgets/build is present in the first case and absent in the second

Mario Carneiro (Nov 18 2023 at 05:55):

It seems that when you run lake exe cache with a missing .lake folder, lake will clone all the repos and fetch the proofwidgets cloud release, while lake clean; lake exe cache does not fetch the cloud release

Mario Carneiro (Nov 18 2023 at 05:56):

lake env true does not fetch the release, so this must have something to do with the compilation dependencies of cache

Mario Carneiro (Nov 18 2023 at 05:57):

@Mac Malone

Mario Carneiro (Nov 18 2023 at 05:58):

The part that is still not explained here is why the upload doesn't work even though the build does

Mac Malone (Nov 18 2023 at 22:43):

@Mario Carneiro I think there are likely two problems here:

  • ProofWidgets build directory is now in the non-default location of proofwidgets/build, which cache does not handle properly
  • lake clean does not clean the build archive files and Lake does not re-unpack the files if the build archive is still present.

Mario Carneiro (Nov 18 2023 at 22:44):

the second issue seems pretty severe, right now I don't know any way to get back to a sensible state if those files are missing other than deleting .lake

Mario Carneiro (Nov 18 2023 at 22:50):

What I don't understand about this is why lake wouldn't see that the files are missing and unpack or re-download in response

Mac Malone (Nov 19 2023 at 03:54):

The answer is a bug.

Mac Malone (Nov 20 2023 at 15:46):

@Scott Morrison Is it possible to get the mathlib CI to detect this kind of error?

Mario Carneiro (Nov 20 2023 at 16:28):

Yes, I have a branch that does this, it's not merged because it fails right now (maybe things are better if the proofwidgets bump has landed)

Mario Carneiro (Nov 20 2023 at 16:29):

#8535

Mac Malone (Nov 21 2023 at 14:50):

@Mario Carneiro Oh, I see you changed the ProofWidgets output directory. I had specifically requested that it be specified and not be in .lake (since code depends on its location). Can cache not be adjusted to handle different build directories for different packages?

Mario Carneiro (Nov 21 2023 at 17:03):

It can, but it would also require baking information about where lake puts its stuff in cache. I'm not convinced that it is a good idea to make this configurable by the project (it may be configurable by the user though). The issue with having .lake/build show up 20 times in the proofwidgets source code should be solved by having lake pass the location of the build directory to lean elaboration, similar to how env!("OUT_DIR") works in rust + cargo.

Mac Malone (Nov 22 2023 at 04:39):

Mario Carneiro said:

It can, but it would also require baking information about where lake puts its stuff in cache.

Cache already does this? (The lake directories for packages, libraries, binaries, etc. are already built into cache.)

Mario Carneiro said:

lake pass the location of the build directory to lean elaboration, similar to how env!("OUT_DIR") works in rust + cargo

Yes, passing an extended environment to Lean elaboration would one way to solve this that seems reasonable. Part of me would still prefer more sophisticated interaction between Lake and Lean (and other Lean executors).

Mac Malone (Nov 22 2023 at 04:40):

Environment variables are one way only and can only hold so much information.

Mario Carneiro (Nov 22 2023 at 07:16):

Mac Malone said:

and can only hold so much information.

That just means you haven't tried hard enough :wink:

llllvvuu (Dec 11 2023 at 07:28):

currently it takes a long time to get VSCode responsive:

; rm -rf .lake
; lake update
std: cloning https://github.com/leanprover/std4 to './.lake/packages/std'
Qq: cloning https://github.com/leanprover-community/quote4 to './.lake/packages/Qq'
aesop: cloning https://github.com/leanprover-community/aesop to './.lake/packages/aesop'
proofwidgets: cloning https://github.com/leanprover-community/ProofWidgets4 to './.lake/packages/proofwidgets'
Cli: cloning https://github.com/leanprover/lean4-cli to './.lake/packages/Cli'
mathlib: running post-update hooks
; lake exe cache get
info: [0/9] Downloading proofwidgets cloud release
info: [0/9] Unpacking proofwidgets cloud release
info: [1/9] Building Cache.IO
info: [2/9] Compiling Cache.IO
info: [2/9] Building Cache.Hashing
info: [3/9] Compiling Cache.Hashing
info: [3/9] Building Cache.Requests
info: [5/9] Compiling Cache.Requests
info: [5/9] Building Cache.Main
info: [7/9] Compiling Cache.Main
info: [9/9] Linking cache
Attempting to download 5 file(s)
Downloaded: 5 file(s) [attempted 5/5 = 100%] (100% success)
Decompressing 4009 file(s)
unpacked in 4454 ms
; lake build
[310/495] Building ProofWidgets.Compat
[930/4007] Building ProofWidgets.Component.Basic
[931/4007] Building ProofWidgets.Component.MakeEditLink
[931/4007] Building ProofWidgets.Component.Panel.Basic
[931/4007] Building ProofWidgets.Data.Html
...

Is that related to this issue?

Kevin Buzzard (Dec 11 2023 at 07:29):

What happens after the ... ? Does it build 10 files in total or 3000? If 3000, are you on windows?

llllvvuu (Dec 11 2023 at 07:36):

Seems to be around 3000, on macOS sonoma 14.1

other version info:

; elan -V
elan 3.0.0 (cdb40bff5 2023-09-08)
; elan show
installed toolchains
--------------------

leanprover/lean4:nightly (default)
leanprover/lean4:nightly-2023-03-17
leanprover/lean4:stable
leanprover/lean4:v4.3.0
leanprover/lean4:v4.3.0-rc2
leanprover/lean4:v4.4.0-rc1

active toolchain
----------------

leanprover/lean4:v4.4.0-rc1 (overridden by '/Users/llwu/git/@leanprover-community/mathlib42/lean-toolchain')
Lean (version 4.4.0-rc1, commit b0fe9d6cdca8, Release)

; lake --version
Lake version 5.0.0-b0fe9d6 (Lean version 4.4.0-rc1)

Sebastien Gouezel (Dec 11 2023 at 07:37):

Does #8968 solve the issue for you?

Mario Carneiro (Dec 11 2023 at 07:38):

note that ProofWidgets.Compat needing to be rebuilt is also a bug, but it's probably best to work on it after landing #8968

llllvvuu (Dec 11 2023 at 07:42):

It does fix it! I guess the issue isn't Windows-specific?

Mario Carneiro (Dec 11 2023 at 07:46):

It seems that (on linux) lake exe cache get and lake build will battle to set their own values in .lake/packages/proofwidgets/.lake/build/lib/ProofWidgets/Compat.olean.hash. The run which produced the file that is downloaded when I lake exe cache get ProofWidgets.Compat is https://github.com/leanprover-community/mathlib4/actions/runs/7154688761/job/19482498392#step:13:29, which is odd because it doesn't actually build anything because it was canceled by an earlier build step... but of course it has already downloaded the out of date oleans from the proofwidgets release, and laundering them through a CI run like this gives them the current lean stamp of approval incorrectly.

Mario Carneiro (Dec 11 2023 at 07:55):

@Scott Morrison I think the upload cache action should run only if the build mathlib step has already run (even if it failed). Otherwise I don't see how we can avoid confusing bad oleans from good ones without a lake check-hash command

Mario Carneiro (Dec 11 2023 at 08:47):

The bad cache files have been purged, so lake exe cache get on master now works properly (on linux). What happens on windows?

Mario Carneiro (Dec 11 2023 at 08:48):

(You may need to use lake exe cache get! if you have previously downloaded the bad cache files)

Sebastien Gouezel (Dec 11 2023 at 08:52):

It works perfectly, nothing is rebuilt. Thanks!

Scott Morrison (Dec 11 2023 at 22:52):

#8987

Eric Wieser (Dec 12 2023 at 00:04):

Is there a meaningful difference in terms of cache risk between "build was skipped" and "build was cancelled part way though"?

Scott Morrison (Dec 12 2023 at 00:23):

I'm not sure there.

However there's a big upside to uploading the cache when the build was cancelled part way through!!

Eric Wieser (Dec 12 2023 at 00:53):

An aside: one problem I think we might have is that if you push a new commit to a PR with an in progress build, you can end up with:

  • new: start
  • old: cancel build step
  • new: get cache
  • old: upload partial cache
  • new: build, but without the cache that was just uploaded

Eric Wieser (Dec 12 2023 at 00:54):

I worry that the upside to the partial caches is the same as the problem we're trying to solve...

Mario Carneiro (Dec 12 2023 at 06:36):

I think the optimal solution for us in CI would be to run a pre-cache put step which deletes any oleans which have bad hashes (i.e. lake refuses to use them in lake build). Unfortunately this requires lake support

Mac Malone (Dec 12 2023 at 21:58):

Mario Carneiro said:

Scott Morrison I think the upload cache action should run only if the build mathlib step has already run (even if it failed). Otherwise I don't see how we can avoid confusing bad oleans from good ones without a lake check-hash command

If the goal is to just make lake not trust the hash files, there is already an option for that (--rehash / -H)

Mario Carneiro (Dec 13 2023 at 07:32):

@Mac Malone the goal is to make sure that after lake runs there are no (reachable) olean files in the build directory that are not correct.

Mario Carneiro (Dec 13 2023 at 07:32):

but we don't want to rebuild anything, hence we have to just remove the bad files

Mario Carneiro (Dec 13 2023 at 07:35):

The problem is not files with correct hashes and incorrect content (which is what --rehash handles), it is files with incorrect hashes and incorrect content. lake build would want to replace them, but it may not actually do so because of build errors or because it was canceled. We need something that does not take an hour to run


Last updated: Dec 20 2023 at 11:08 UTC