Zulip Chat Archive
Stream: general
Topic: Cache on GitHub Actions
SnO2WMaN (Nov 16 2025 at 14:20):
In our project FFL, we rely on GitHub Actions’ cache functionality.
See: https://github.com/FormalizedFormalLogic/Foundation/blob/master/.github/workflows/ci.yml
I note that we don't use lean-action because we build library, doc (by doc-gen), book (by verso) at once, the reusable action doesn't fit our desire.
However, mainly due to the size of Mathlib, our cache grows to about 4 GB, and since GitHub’s total cache quota is 10 GB, we can effectively store only 2 or 3 cache entries, which makes the cache nearly useless in practice.
To address this, I tried caching only .lake/build, and handling by lake exe cache get for mathlib's cache. Moreover, it seems that we don’t really need to cache .lake/build/bin either.
There may be other directories that don’t actually need to be cached, so I’d like to reduce the cache as much as possible. (below dust .lake --depth 2 in FFL/Foundation)
3.4M ┌── doc-data │█░░░░░░░░░░ │ 0%
5.9M ├── doc │█░░░░░░░░░░ │ 0%
233M ├── ir │██░░░░░░░░░ │ 3%
308M ├── lib │██░░░░░░░░░ │ 4%
1.2G ├── bin │████████░░░ │ 15%
1.8G ┌─┴ build │███████████ │ 22%
376K │ ┌── BibtexQuery │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
432K │ ├── Cli │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
1.8M │ ├── MD4Lean │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
2.1M │ ├── subverso │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
4.0M │ ├── doc-gen4 │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
5.7M │ ├── LeanSearchClient│█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
6.2M │ ├── importGraph │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
12M │ ├── plausible │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
22M │ ├── UnicodeBasic │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
26M │ ├── Qq │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
31M │ ├── verso │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
41M │ ├── proofwidgets │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 0%
154M │ ├── aesop │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 2%
449M │ ├── batteries │███░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ 5%
5.7G │ ├── mathlib │███████████████████████████████████░░░░ │ 69%
6.4G ├─┴ packages │███████████████████████████████████████ │ 78%
8.2G ┌─┴ .lake │█████████████████████████████████████████████████ │ 100%
Does this seem like a sound approach? If this is indeed a reasonable solution, then note that lean-action currently caches the entire .lake directory as well. Shouldn’t we revise that too?
SnO2WMaN (Nov 16 2025 at 14:24):
Not using a cache is not an option, currently, a full build without caching takes 50 minutes in GitHub Actions.
Ruben Van de Velde (Nov 16 2025 at 14:25):
Are you saying 50 minutes just for your own code?
SnO2WMaN (Nov 16 2025 at 14:28):
See repository's log, completing jobs in CI takes 50 minutes: in this case cache doesn't match.
https://github.com/FormalizedFormalLogic/Foundation/actions/runs/19405054054
Bryan Gin-ge Chen (Nov 16 2025 at 14:35):
I can only look briefly now but I'm a little confused why mathlib files are being built in the log here, even though you ran lake exe cache get in an earlier step: https://github.com/FormalizedFormalLogic/Foundation/actions/runs/19405054054/job/55518313582#step:10:266
Are you modifying files in mathlib or one of its dependencies somehow?
Sebastian Ullrich (Nov 16 2025 at 14:44):
Note that it's doing native compilation there, which is not currently cached
SnO2WMaN (Nov 16 2025 at 14:45):
Our project needs to produce executables for the Zoo (see https://github.com/FormalizedFormalLogic/Foundation?tab=readme-ov-file#zoo ), the corresponding source code is located here: https://github.com/FormalizedFormalLogic/Foundation/tree/master/Zoo
Files required to build these executables (I'm not sure .o files calls) are presumably not included in mathlib's cache, and thus cannot be obtained via lake exe cache get. For this reason, I suspect these files are being build in CI (this inference may be incorrect). Please refer to the "Generate Propositional Zoo" step in the following job: https://github.com/FormalizedFormalLogic/Foundation/actions/runs/19286694824/job/55148901446
Anyway, the building .o files takes about 4-5 minutes, it's not essential the issue we're currently trying to address.
SnO2WMaN (Nov 16 2025 at 14:54):
I also remembered another issue to related this. An excessively large cache may exceed the disk space of GitHub Actions runners, and I have personally had to deal with exactly this problem.
Please refer to the “Save project cache” step this job: https://github.com/FormalizedFormalLogic/Foundation/actions/runs/19384312786/job/55468840547
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/Foundation/Foundation --files-from manifest.txt --use-compress-program zstdmt
zstd: error 70 : Write error : cannot write block : No space left on device
/usr/bin/tar: cache.tzst: Wrote only 4096 of 10240 bytes
/usr/bin/tar: Child returned status 70
/usr/bin/tar: Error is not recoverable: exiting now
Warning: Failed to save: "/usr/bin/tar" failed with error: The process '/usr/bin/tar' failed with exit code 2
Warning: Cache save failed.
As an ad-hoc workaround, I used https://github.com/BRAINSia/free-disk-space to free storage on the GH Actions Ubuntu runner. However, there is no guarantee that this problem will not occur in other repositories, especially projects using lean-action. Indeed, in our case, the issue suddenly appeered just two days ago (before this, caching seems to works well).
Last updated: Dec 20 2025 at 21:32 UTC