Zulip Chat Archive
Stream: general
Topic: Systematically transferring cache builds to S3 bucket
Albert Jiang (Dec 29 2025 at 17:26):
Hi! This is more of a question to the Lean FRO but I thought others might have the same question, so posting as a zulip topic instead:
I'd like to understand how the Lean FRO manages the cache builds for mathlib and how I can transfer them to my own S3 bucket.
To give some context, I have a scenario under which I need to do lots of lake cache get for arbitrary commits of mathlib. I tried to look for more info about how the current caches are stored but only got so far as that they are hosted on Azure from the terminal message. If I could get some more info about the current cache storage, that would be super helpful for the migration to my own S3 bucket. Many thanks in advance!
Joachim Breitner (Jan 01 2026 at 19:08):
Hi Albert! Sorry for the slow response, some people are on vacation and so on.
Is this about the mathlib cache (lake exec cache) or the new lake built in cache? In either case, maybe @Mac Malone has the right pointers?
Mac Malone (Jan 01 2026 at 19:13):
I can certainly provide some pointers on both fronts (Mathlib's cache and Lale's) once I am back from vacation. Along the lines of Joachim's question, Is the goal to just switch the storage location of Mahtlib's cache or move data from it over to a custom cache setup?
Albert Jiang (Jan 05 2026 at 09:54):
Thanks Joachim and Mac! Both the mathlib cache and the new lake built in cache will be super useful for me and I'd definitely appreciate some pointers!
Albert Jiang (Jan 05 2026 at 09:56):
Mac Malone said:
I can certainly provide some pointers on both fronts (Mathlib's cache and Lale's) once I am back from vacation. Along the lines of Joachim's question, Is the goal to just switch the storage location of Mahtlib's cache or move data from it over to a custom cache setup?
Sorry for the stupidity here, but is there a distinction between switch the storage location of Mahtlib's cache and move data from it over to a custom cache setup? I'd still like to use lake cache get, but from a custom S3 bucket.
Mac Malone (Jan 05 2026 at 14:38):
Albert Jiang said:
I'd still like to use lake cache get, but from a custom S3 bucket.
Did you mean lake exe cahe get (i.e., Mathlib's cache) and not lake cache get (i.e., Lake's cache)? Assuming you meant Mathlib's cache, then just switching the storage location is what you want. A custom cache setup would be using something other than Mathlib's cache code to manage the cache.
Mac Malone (Jan 05 2026 at 14:50):
Pointer-wise, Mathlib's cache already supports S3 endpoints. This has been found to be flaky in the past, hence why it still uses Azure by default, but hopefully the flakiness should not be a problem for your use case. This support can be toggled via the USE_FRO_CACHE environment variable. The only thing that needs to be changed code-wise is the URL cache uses for the S3 bucket, which is set here. With that corrected, the process should be as simple as the following:
$ lake exe cache get
$ USE_FRO_CACHE=1 MATHLIB_CACHE_S3_TOKEN=<s3-token> lake exe cache put
Were s3-token is the string <S3_ACCESS_KEY>:<S3_SECRET_KEY> (using the respective authentication keys for your bucket).
I can also look into making the URL the cache uses configurable via a PR. That would enable addressing this use case without any future code changes.
Mac Malone (Jan 05 2026 at 16:07):
Mathlib PR is up for review at #33605.
Kim Morrison (Jan 06 2026 at 02:27):
#33647 (stacked on top) renames USE_FRO_CACHE to MATHLIB_CACHE_USE_CLOUDFLARE for consistency, and also adds a hopefully useful README.
Last updated: Feb 28 2026 at 14:05 UTC