Zulip Chat Archive

Stream: mathlib4

Topic: lake exe cache get very slow


Sébastien Gouëzel (Jun 12 2025 at 16:00):

On a PR where there are very few files built (#25816), lake exe cache get is essentially unusable for me (1 second per failed file, 5550 files). My connection should in theory be good, so I wonder if there's something I could do to mitigate the issue?

Kim Morrison (Jun 12 2025 at 17:15):

Hmm, I'm seeing about 100/s, even on my not-that-great-hooray-internet-in-Australia connection.

Kim Morrison (Jun 12 2025 at 17:16):

:fingers_crossed: it was a transient issue?

Robin Carlier (Jun 12 2025 at 17:24):

I think the point is that it's slow for failed files. I sometimes experience that as well: I get ".../6xxx" attempted while 0 downloaded, and that takes a very long while.

Sébastien Gouëzel (Jun 12 2025 at 18:22):

Nope, it's still 1 (failed) file per second now. It's really because of the failed files: on a green branch, it will happily download the 6000 files in 5 seconds or so, but when there are failures it is super slow.

Sébastien Gouëzel (Jun 12 2025 at 18:28):

I wonder how complicated it would be to download files in order, starting from files at the root, and if a file fails then not even trying to download the files that depend on it. The order must already be available somewhere (because computing the hash of a file requires the hashes of all of its parents), I guess, but I don't know how much engineering this would require (and whether it would slow things down in the typical case where there are almost no failures).

Matthew Ballard (Jun 12 2025 at 18:31):

I got some similar behavior just now.

Mathlib repository: mattrobball/mathlib4
Attempting to download 6808 file(s) from leanprover-community/mathlib4 cache
Downloaded: 6808 file(s) [attempted 6808/6808 = 100%] (100% success)
Attempting to download 6808 file(s) from mattrobball/mathlib4 cache
Downloaded: 0 file(s) [attempted 1498/6808 = 22%]

Sébastien Gouëzel (Jun 12 2025 at 18:36):

In your case, it's just silly behavior: it has already found all the files in the first cache source, so why even look in the second?

Sébastien Gouëzel (Jun 12 2025 at 18:38):

@Kim Morrison, does your speedy download happen on a good branch, or on #25816? I.e., is

 gh pr checkout 25816
 lake exe cache get

fast for you? It's still dramatically slow for me.

Jireh Loreaux (Jun 12 2025 at 18:51):

I just did that:

[jireh@boston mathlib4]$  gh pr checkout 25816
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 6 (delta 2), reused 2 (delta 2), pack-reused 1 (from 1)
Unpacking objects: 100% (6/6), 10.80 KiB | 526.00 KiB/s, done.
From https://github.com/leanprover-community/mathlib4
 * [new ref]                 refs/pull/25816/head -> SG_coeEquiv
Switched to branch 'SG_coeEquiv'
[jireh@boston mathlib4]$  lake exe cache get
Mathlib repository: j-loreaux/mathlib4
Attempting to download 5964 file(s) from leanprover-community/mathlib4 cache
Downloaded: 414 file(s) [attempted 5964/5964 = 100%] (6% success) -- about 20-30 seconds here, total, with many misses
Attempting to download 5550 file(s) from j-loreaux/mathlib4 cache
Downloaded: 0 file(s) [attempted 465/5550 = 8%] -- about 4-5 seconds per file here, much slower

I'm on a ≥ 1 Gb/s connection.

Sébastien Gouëzel (Jun 12 2025 at 18:54):

It's not too bad, because you can kill the second download, as it won't find new files. But after that, what happens if you run lake exe cache get again? Is it still reasonable in the first step?

Jireh Loreaux (Jun 12 2025 at 18:57):

Ah! It now attempts to download from leanprover-community/mathlib4, but it is now just as slow as when trying j-loreaux/mathlib4 the first time!

Sébastien Gouëzel (Jun 12 2025 at 18:58):

ok, that's exactly the same behavior as the one I have. So it's not just me. Thanks for trying!

Sébastien Gouëzel (Jun 12 2025 at 19:00):

@Mac Malone, I don't know if this slowness is to be expected?

Eric Taucher (Jun 12 2025 at 19:10):

There is currently a major internet outage in the U.S. affecting GCS (Google Cloud Storage) and some other regions.

https://status.cloud.google.com/regional/multiregions

https://www.githubstatus.com/

HTH

Robin Carlier (Jun 12 2025 at 19:13):

I've had this behavior several times when there was no outage.

Jireh Loreaux (Jun 12 2025 at 19:21):

What is most strange to me about this is that cache (IIRC) downloads cached files in parallel, but the failure mode makes it appear as if they are attempted sequentially. :thinking:

Robin Carlier (Jun 12 2025 at 19:23):

Next time it happens I'll run Wireshark and look at the requests.

Eric Wieser (Jun 13 2025 at 01:19):

It could be rate limiting on the server end?

Mac Malone (Jun 13 2025 at 18:59):

Sébastien Gouëzel said:

Mac Malone, I don't know if this slowness is to be expected?

Unfortunately, it is partially expected. Downloads of missing files in the cache (i.e., cache misses) have been observed to often be quite slow.

Sébastien Gouëzel (Jun 14 2025 at 15:45):

It seems to me that there is still something fishy going on. I just ran a lake exe cache get which in the end only found 11 files out of 372, as expected. At the beginning, each failed attempt was taking 1 second per file. When it found the first successful file, after 20 files and 20 seconds, it finished the rest in 1 second overall -- which means that all the other failures were essentially instant! Is there a different behavior in the code before the first success and after the first success?

Sébastien Gouëzel (Jun 14 2025 at 15:45):

@Mac Malone

Mac Malone (Jun 15 2025 at 01:04):

@Sébastien Gouëzel As far as I am aware, cache just downloads all the files in parallel using curl. Thus, any such behavior would be curl itself doing something weird.
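
For reference, a parallel curl invocation of that shape looks roughly like this (illustrative only: the host, file names, and config file name are made up, and these are not necessarily the exact arguments cache passes):

curl --parallel --fail --config urls.txt
# urls.txt lists one entry per cache file, e.g.
# url = "https://example-cache-host/f/0123456789abcdef.ltar"
# output = "0123456789abcdef.ltar"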

Kim Morrison (Jun 15 2025 at 05:55):

My suspicion is that the blame goes to the cache providers (Azure, S3, I'm not even sure), not cache or curl.

Sébastien Gouëzel (Jun 15 2025 at 07:08):

Could we hack around it by also downloading the cache of a file very low in the hierarchy (preferably as the first download), even if we already have it, just to get the process going?

Kim Morrison (Jun 15 2025 at 07:49):

I guess there's no reason we can't force downloading Mathlib.Init even if it is available locally, if anything else needs downloading...

Sébastien Gouëzel (Jun 15 2025 at 08:16):

My current workflow, when I'm on a branch where most of the files that remain to be downloaded are misses, is the following. First, start with

lake exe cache get!

This redownloads files I already have, but it also means I hit a successful download much sooner, after which everything is fast. Overall it is much quicker than lake exe cache get. There is a difficulty if I'm on a fork, though: I will see

Mathlib repository: sgouezel/mathlib4
Attempting to download 6814 file(s) from leanprover-community/mathlib4 cache
Downloaded: 1264 file(s) [attempted 6814/6814 = 100%] (18% success)
Attempting to download 6814 file(s) from sgouezel/mathlib4 cache
Downloaded: 0 file(s) [attempted 12/6814 = 0%]

where the second download on sgouezel/mathlib4 cache will only see failures, and will take more than one hour to do so. Then I interrupt the second download with Ctrl-C, but this means the files are not decompressed. Then I do

lake exe cache unpack

to unpack already downloaded files.

With this workflow, things become usable again.
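
In compact form, the workaround above looks like this (a sketch; the interruption step stays manual):

lake exe cache get!    # force re-download; the first hit gets the parallel downloads going
# when the second pass (against the fork's cache) starts failing slowly, hit Ctrl-C
lake exe cache unpack  # decompress the files downloaded before the interruption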

Sébastien Gouëzel (Jun 15 2025 at 08:20):

(If you're not convinced the above is necessary, I invite you to try it for yourself: on a mathlib fork, do

gh pr checkout 25816
lake exe cache get

let it complete, and then do a second

lake exe cache get

For me (with an 8Gb/s connection), the second one takes more than one hour to complete -- I'd be interested to know if someone gets the same behavior, or a different one)

Robin Carlier (Jun 15 2025 at 08:30):

Can confirm same behaviour with this PR

Kim Morrison (Jun 15 2025 at 08:35):

The first lake exe cache get spends about a minute retrieving things (well, nothing) from leanprover-community/mathlib4, and then about the same from kim-em/mathlib4 cache. (This is definitely the wrong behaviour: it should be looking in the sgouezel/mathlib4 fork!) The second invocation seems to be identical.

Kim Morrison (Jun 15 2025 at 08:36):

@Robin Carlier, @Sébastien Gouëzel, which OS are you on? (I'm on a mac.)

Robin Carlier (Jun 15 2025 at 08:37):

Linux 6.14.8 #1-NixOS SMP PREEMPT_DYNAMIC x86_64 GNU/Linux

Sébastien Gouëzel (Jun 15 2025 at 08:45):

Windows 11 for me.

It could also be a geographical issue. I'm in France.

Robin Carlier (Jun 15 2025 at 08:46):

Yes, France as well here (Lyon).

Robin Carlier (Jun 15 2025 at 08:52):

Note that, as mentioned above, this issue is rather old for me; I have been having this kind of issue for more than a month. Can’t say exactly when it started.

Kim Morrison (Jun 15 2025 at 09:07):

Kim Morrison said:

This is definitely the wrong behaviour: it should be looking in the sgouezel/mathlib4 fork!

I'm still working on this issue, but I'll try @Sébastien Gouëzel's suggestion when I can.

Kim Morrison (Jun 15 2025 at 09:21):

Sigh, the best solution I have so far is really complicated: #25895

@Mac Malone or @Johan Commelin, would you be available to review?

Kim Morrison (Jun 15 2025 at 09:43):

@Robin Carlier or @Sébastien Gouëzel, could one of you try #25898? You may have to move the script to a different branch to test properly.

Kim Morrison (Jun 15 2025 at 09:44):

(Alternatively, since we're working on the hypothesis that this is a France-specific problem, can we find anyone who knows cache and is in France? Or who can give me ssh access? :-)

Robin Carlier (Jun 15 2025 at 09:58):

Kim Morrison said:

Robin Carlier or Sébastien Gouëzel, could one of you try #25898? You may have to move the script to a different branch to test properly.

copy-pasted the diff on @Sébastien Gouëzel's branch and ran cache, but no luck :(

Robin Carlier (Jun 15 2025 at 10:02):

I’ll try on a server I have access to that I think is hosted in France, to test the geography hypothesis.

Robin Carlier (Jun 15 2025 at 10:20):

Trying on a France-hosted server, there are still a lot of cache misses, and it’s overall not fast, but this time it does seem like downloads happen in parallel, so while it takes a few minutes to fail the entire cache, it’s much faster than the 2h+ I have at home.
The system there is Debian 12.

Robin Carlier (Jun 15 2025 at 10:21):

So we can scratch the "cache doesn’t like French people" hypothesis, I guess?

Sébastien Gouëzel (Jun 15 2025 at 10:36):

Doesn't make a difference for me either. My internet provider at home is Free; what is yours, Robin?

Robin Carlier (Jun 15 2025 at 10:38):

Mine is Orange. Maybe a weird thing with curl? FWIW, at home:

$ curl --version
curl 8.13.0 (x86_64-pc-linux-gnu) libcurl/8.13.0 OpenSSL/3.4.1 zlib/1.3.1 brotli/1.1.0 zstd/1.5.7 libidn2/2.3.8 libpsl/0.21.5 libssh2/1.11.1 nghttp2/1.65.0
Release-Date: 2025-04-02
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd

while on the server on which it seems to work

$ curl --version
curl 7.88.1 (x86_64-pc-linux-gnu) libcurl/7.88.1 OpenSSL/3.0.16 zlib/1.2.13 brotli/1.0.9 zstd/1.5.4 libidn2/2.3.3 libpsl/0.21.2 (+libidn2/2.3.3) libssh2/1.10.0 nghttp2/1.52.0 librtmp/2.3 OpenLDAP/2.5.13
Release-Date: 2023-02-20, security patched: 7.88.1-10+deb12u12
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd

Sébastien Gouëzel (Jun 15 2025 at 11:25):

I don't know if cache is using the system's curl or an embedded one. curl --version doesn't even make sense on a Windows system. I also have a mingw32 version

$ curl --version
curl 7.64.0 (x86_64-w64-mingw32) libcurl/7.64.0 OpenSSL/1.1.1a (Schannel) zlib/1.2.11 libidn2/2.1.1 nghttp2/1.36.0
Release-Date: 2019-02-06
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile SSPI Kerberos SPNEGO NTLM SSL libz TLS-SRP HTTP2 HTTPS-proxy MultiSSL Metalink

but I'm not sure it's the one it uses...

Aaron Liu (Jun 15 2025 at 11:28):

Sébastien Gouëzel said:

curl --version doesn't even make sense on a Windows system.

I'm on Windows and curl --version seems to work just fine; it prints

C:\Users\aaron>curl --version
curl 8.9.1 (Windows) libcurl/8.9.1 Schannel zlib/1.3 WinIDN
Release-Date: 2024-07-31
Protocols: dict file ftp ftps http https imap imaps ipfs ipns mqtt pop3 pop3s smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM SPNEGO SSL SSPI threadsafe Unicode UnixSockets

Sébastien Gouëzel (Jun 15 2025 at 11:29):

I get this output

PS C:\Users\Sebastien\Desktop> curl --version
curl : The remote name could not be resolved: '--version'
At line:1 char:1
+ curl --version
+ ~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation : (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebEx
   ception
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

but we're getting offtopic, I think.

Aaron Liu (Jun 15 2025 at 11:29):

oh you're on powershell

Aaron Liu (Jun 15 2025 at 11:30):

indeed I get the same result on powershell

Sébastien Gouëzel (Jun 15 2025 at 11:30):

Ah, outside PowerShell I indeed get

C:\Users\Sebastien>curl --version
curl 8.13.0 (Windows) libcurl/8.13.0 Schannel zlib/1.3.1 WinIDN
Release-Date: 2025-04-02
Protocols: dict file ftp ftps http https imap imaps ipfs ipns mqtt pop3 pop3s smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM SPNEGO SSL SSPI threadsafe Unicode UnixSockets

Robin Carlier (Jun 15 2025 at 11:36):

Cache calls the system’s curl on my system (I had IO.getCurl log what command it found). I’ll see if I can hack this with a different curl to test the hypothesis.

Robin Carlier (Jun 15 2025 at 12:09):

So I downgraded curl to 7.88.1 (the version mentioned in cache’s source), and suddenly it works: the downloads still fail, but they are attempted in parallel, so it's fast.

Robin Carlier (Jun 15 2025 at 12:10):

Back to 8.13 and we’re slowpokes again.
I think we have our culprit.

Sébastien Gouëzel (Jun 15 2025 at 12:32):

For me, removing the --fail option in the curl call (in downloadFiles) makes it fast again. That's weird, because it's supposed to be an option to make it fast. Bug in curl 8.13?
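
For comparison, the two shapes of the call (illustrative only; urls.txt stands for the generated list of url/output pairs, not the file cache actually writes):

curl --parallel --fail --config urls.txt   # slow here: misses seem to be handled one by one, about 1s each
curl --parallel --config urls.txt          # fast even with many misses, but 404 response bodies get written to the output files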

Robin Carlier (Jun 15 2025 at 12:55):

The regression happens between curl 8.9.1 and 8.10.0

Sébastien Gouëzel (Jun 15 2025 at 12:55):

@Kim Morrison, does --fail really speed things up in your setup, or could we consider dropping it for everyone? I guess another option is to add it or not depending on the curl version.

Johan Commelin (Jun 16 2025 at 07:52):

Kim Morrison said:

Sigh, the best solution I have so far is really complicated: #25895

Mac Malone or Johan Commelin, would you be available to review?

I took a look, and as far as I can tell LGTM. But I would be happy if another pair of eyes took a look.

Johan Commelin (Jun 16 2025 at 07:53):

Also, Robin Carlier raises a valid point about upstream.

Robin Carlier (Jun 19 2025 at 12:49):

So, what do we do about the --fail? Remove it? Remove it conditionally on curl version?

Michael Rothgang (Jun 19 2025 at 13:06):

Jireh Loreaux said:

Ah! It now attempts to download from leanprover-community/mathlib4, but it is now just as slow as when trying j-loreaux/mathlib4 the first time!

I experienced something similar: downloading from leanprover-community/mathlib4 was fast, but did not have the cache for all files. Then it tried downloading all files from my fork grunweg/mathlib4, which mostly failed and did so slowly. Why did it even try to re-download files from my fork? That should not be necessary!

Robin Carlier (Jun 19 2025 at 13:11):

I think the "slow fail" happens when all downloads fail, so when it tries do download from a fork that has no cache it's bound to hit the bug.

Matthew Ballard (Jun 19 2025 at 13:17):

Can we expose a flag to handle this?

Filippo A. E. Nuccio (Jun 19 2025 at 13:40):

Michael Rothgang said:

Jireh Loreaux said:

Ah! It now attempts to download from leanprover-community/mathlib4, but it is now just as slow as when trying j-loreaux/mathlib4 the first time!

I experienced something similar: downloading from leanprover-community/mathlib4 was fast, but did not have the cache for all files. Then it tried downloading all files from my fork grunweg/mathlib4, which mostly failed and did so slowly. Why did it even try to re-download files from my fork? That should not be necessary!

Same here, but trying one hour later succeeded, so I attributed it to random problems or an oversaturated server.

Matthew Ballard (Jun 19 2025 at 13:58):

My current workflow on startup is

git fetch upstream master
git switch --detach upstream/master
git switch -c new_branch
lake exe cache get

which is guaranteed to have no cache for my fork. For me, lake exe cache get --upstream-only would not be a burden. I also may be doing things suboptimally and am open to suggestions.

Filippo A. E. Nuccio (Jun 19 2025 at 14:48):

Why do you need the git switch --detach upstream/master?

Robin Carlier (Jun 19 2025 at 15:44):

@Kim Morrison mentioned that we should change how lake exe cache handles CLI arguments (e.g. --repo, which has to come before the get); if we're adding flags (e.g. --upstream-only and --no-fail), maybe it's a good time to do that?

Robin Carlier (Jun 19 2025 at 15:46):

(at least I remember a comment about "using lean4cli", though I have no idea what that is)

Matthew Ballard (Jun 19 2025 at 16:16):

Filippo A. E. Nuccio said:

Why do you need the git switch --detach upstream/master?

I am on some random branch and don't want any branch tracking master on my fork.
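
For what it's worth, a sketch of the same thing without the detach step (using --no-track so the new branch doesn't track anything):

git fetch upstream master
git switch -c new_branch --no-track upstream/master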

Sébastien Gouëzel (Jun 24 2025 at 17:23):

For the record, lake exe cache get is still unusable for me on branches with mostly missing files (1 second per missing file!), so I am using lake exe cache get! in such situations. I have seen that removing the --fail flag from the curl call solves the issue for me. I have opened #26367 for this, but I can't test whether this has bad performance consequences for other curl versions.

Robin Carlier (Jun 24 2025 at 17:48):

About the curl version issue: while curl's behaviour when fetching mathlib changed somewhere between 8.9.1 and 8.10.0, testing with local servers showed me that --parallel + --fail never actually worked well together in the first place (against local servers, it seems to fail "in series"), even as far back as curl 8.0.0. I don’t know if that is by design or by accident (curl’s documentation says nothing about it).
So I think removing it is the right way forward for now, since we’ve been expecting a behaviour it never had.
Note however that this will mean a very slight increase in network traffic for whoever hosts the cache server, since every failed download will now still transfer a few bytes of data.

We should however test that this does not end up writing out files that are actually HTML 404 error pages, because I can see that leading to very weird errors.
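
A quick way to check for that (assuming the cache lives in the default ~/.cache/mathlib; adjust the path if you override it):

grep -rl --include='*.ltar' -i '<html' ~/.cache/mathlib
# any file this lists is a saved error page rather than a real archive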

Joscha Mennicken (Sep 08 2025 at 14:16):

Could the removal of --fail maybe be a cause for some of the recent issues in #mathlib4 > lake exe cache get @ 💬? If so, could switching to --fail-early instead of omitting --fail help?

Robin Carlier (Sep 08 2025 at 15:08):

That’s probably what I was describing in my last paragraph, yes.
--fail-early will exit curl and stop any pending download as soon as there is one error right? This might be even worse if the server is missing a single file, as it’ll fail quickly and then you won’t be able to download anything?

Robin Carlier (Sep 08 2025 at 15:17):

Also, from what I read in curl’s manual, it seems --fail-early won’t fail due to the HTTP code and would still write an HTML 404 page as a .ltar unless paired with --fail anyway.
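
For reference, pairing them would look like this (a sketch, not what cache currently runs):

curl --parallel --fail --fail-early --config urls.txt
# the first 404 then aborts every remaining transfer, which is exactly the failure mode described above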

Kim Morrison (Dec 03 2025 at 01:54):

I just discovered the following:

  | Threads | Wall Time | vs 24 threads |
  |---------|-----------|---------------|
  | 1       | 30.60s    | 0.7x slower   |
  | 2       | 15.71s    | 1.4x faster   |
  | 4       | 10.50s    | 2.1x faster   |
  | 5       | 9.84s     | 2.3x faster   |
  | 6       | 9.37s     | 2.4x faster   |
  | 7       | 10.28s    | 2.2x faster   |
  | 8       | 10.09s    | 2.2x faster   |
  | 12      | 12.65s    | 1.8x faster   |
  | 16      | 15.44s    | 1.4x faster   |
  | 24      | 22.16s    | baseline      |

leangz is selecting 24 by default, since that's the number of CPUs I have. We can probably do some optimization here!

Kim Morrison (Dec 03 2025 at 03:16):

Profiling tool at https://github.com/digama0/leangz/pull/6

Kim Morrison (Dec 03 2025 at 03:17):

Could someone please check this out, and run

cargo build --release
leantar-benchmark --create-manifest v4.26.0-rc2
leantar-benchmark --sweep

and report back their results? It would be good to have old/new, laptop/desktop, and Linux/macOS/Windows data if possible.

(edit: changed rc1 to rc2)

Chris Henson (Dec 03 2025 at 03:53):

On my almost brand-new laptop running Ubuntu 24.04.3 LTS with an AMD Ryzen AI 5 PRO 340 (12 threads) CPU:

chenson@chenson:~/git/leangz$ ./target/release/leantar-benchmark --sweep
=== Thread Count Sweep ===
Testing thread counts: [1, 2, 4, 6, 8, 12]
(This will take a few minutes...)

Testing 1 threads... 17.85s
Testing 2 threads... 7.34s
Testing 4 threads... 4.12s
Testing 6 threads... 3.00s
Testing 8 threads... 2.78s
Testing 12 threads... 2.58s

=== Results ===

 Threads       Time    vs Best
 -------   --------    -------
       1     17.85s      6.93x
       2      7.34s      2.85x
       4      4.12s      1.60x
       6      3.00s      1.17x
       8      2.78s      1.08x
      12      2.58s      1.00x <-- best

Optimal thread count: 12 (2.58s)

Chris Henson (Dec 03 2025 at 04:20):

I tried on my older Windows desktop too, but got a failure message for all thread counts. Maybe someone else can try?

Anne Baanen (Dec 03 2025 at 14:30):

Hmm, if I run leantar-benchmark --create-manifest v4.26.0-rc1, then I get the following error:

Downloading cache files...
info: plausible: cloning https://github.com/leanprover-community/plausible
info: plausible: checking out revision 'ec4a54b5308c1f46b4b52a9c62fb67d193aa0972'
info: LeanSearchClient: cloning https://github.com/leanprover-community/LeanSearchClient
info: LeanSearchClient: checking out revision '2ed4ba69b6127de8f5c2af83cccacd3c988b06bf'
info: importGraph: cloning https://github.com/leanprover-community/import-graph
info: importGraph: checking out revision '7ff87023a8e1b358d2d01c1bc56b040a60609577'
info: proofwidgets: cloning https://github.com/leanprover-community/ProofWidgets4
info: proofwidgets: checking out revision '894c511d39bf442636bcba085245a1cf2bbdf665'
info: aesop: cloning https://github.com/leanprover-community/aesop
info: aesop: checking out revision 'c00943d5ff28c6dc623dbc24f8d659a9d3a3d29a'
info: Qq: cloning https://github.com/leanprover-community/quote4
info: Qq: checking out revision '1be06a278c3c249edafb722bfb278622024929d8'
info: batteries: cloning https://github.com/leanprover-community/batteries
info: batteries: checking out revision 'c278d4e94fe43bc38da4966795dc0c72538e68ab'
info: Cli: cloning https://github.com/leanprover/lean4-cli
info: Cli: checking out revision '00fad25fa9bedece1f1f988ab9c180dfe3d535b3'
PANIC at String.toNat! Init.Data.String.Extra:39:4: Nat expected
backtrace:
0   cache                               0x0000000108f20074 _ZN4leanL15lean_panic_implEPKcmb + 268
1   cache                               0x0000000108f20548 lean_panic_fn + 40
2   cache                               0x0000000102054048 l_Cache_IO_parseVersion + 548
3   cache                               0x0000000102054ae4 l_Cache_IO_validateLeanTar + 1560
4   cache                               0x0000000102044ad0 l_main___lam__1 + 1416
5   cache                               0x0000000108f2df68 lean_apply_2 + 788
6   cache                               0x0000000102047f0c main + 204
7   dyld                                0x000000019eeef154 start + 2476
installing leantar 0.1.16
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  884k  100  884k    0     0  3278k      0 --:--:-- --:--:-- --:--:-- 23.3M

Fetching ProofWidgets cloud release... done!

Anne Baanen (Dec 03 2025 at 14:30):

(Also, can I ask if using -rc1 instead of -rc2 was intentional?)

Kim Morrison (Dec 04 2025 at 09:56):

Oops, that was a mistake, I had forgotten we were on rc2...

Kim Morrison (Dec 04 2025 at 09:57):

I think it actually works even after the panic.

Kim Morrison (Dec 09 2025 at 01:40):

@Anne Baanen I've updated the PR to use rc2 instead of rc1.

Is anyone available to test on macOS? It would help to know if the thread count optimization is reproducible there, or if this is Linux-specific.

Thomas Murrills (Dec 09 2025 at 02:20):

I'm on a MacBook Pro with an M4 Pro, Sequoia 15.6.1, and got failures on all counts:

thomas@trm leangz % leantar-benchmark --sweep
Thread Count Sweep
Testing: [1, 2, 4, 6, 8, 12, 14]

Testing 1 threads... FAILED
Testing 2 threads... FAILED
Testing 4 threads... FAILED
Testing 6 threads... FAILED
Testing 8 threads... FAILED
Testing 12 threads... FAILED
Testing 14 threads... FAILED
No successful benchmark runs!

Here's everything I did (after some false starts) if you'd like to double-check that how I got here makes sense:
thomas-terminal-log.txt

Thomas Murrills (Dec 09 2025 at 22:15):

I was reminded of this just now and realized I could also try it on a modern windows machine (Windows 11, AMD Ryzen 9 9950X) which I have access to. I also got the same failure on all counts there. :sweat_smile: Here's the log:
thomas-windows-terminal-log.txt

If it's any help, each test took a long time to fail on mac, but failed basically instantly on windows. (The rest of the process was comparable in speed, so I'm not sure it's a difference in power between the machines.)

Jireh Loreaux (Dec 09 2025 at 22:21):

@Kim Morrison I also got FAILED runs. Prior to that, creating the manifest failed because it exceeded the inode quota in my /tmp/ directory. So I had to temporarily change my TMPDIR environment variable to something else (which I also did for the sweep). Then it succeeded in creating the manifest, but the sweep failed.

Kim Morrison (Dec 09 2025 at 22:44):

Thanks for testing @Thomas Murrills! I found the bug - the --create-manifest command was downloading cache files to a temp directory and then deleting them immediately after writing the manifest. So when --sweep ran, the .ltar files no longer existed.

I've pushed a fix: https://github.com/digama0/leangz/pull/6

Now --create-manifest downloads cache files to a persistent directory (benchmarks/cache-<tag>/) and stores that path in the manifest. The benchmark then reads from there by default.

If you have a chance to try again:

git pull
cargo build --release
rm -rf benchmarks/  # clear old manifest
leantar-benchmark --create-manifest v4.26.0-rc2
leantar-benchmark --sweep

(The PANIC you saw is from the Lean cache tool, not leangz - it's harmless and happens during leantar version detection.)

Thomas Murrills (Dec 09 2025 at 22:57):

Success! :tada: Here's what I got on mac, M4 Pro:

thomas@trm leangz % leantar-benchmark --sweep
Thread Count Sweep
Testing: [1, 2, 4, 6, 8, 12, 14]

Testing 1 threads... 21.86s
Testing 2 threads... 12.94s
Testing 4 threads... 8.59s
Testing 6 threads... 7.19s
Testing 8 threads... 6.68s
Testing 12 threads... 7.16s
Testing 14 threads... 7.79s

Results:
  1 threads: 21.86s (3.27x)
  2 threads: 12.94s (1.94x)
  4 threads: 8.59s (1.28x)
  6 threads: 7.19s (1.08x)
  8 threads: 6.68s (1.00x) (best)
  12 threads: 7.16s (1.07x)
  14 threads: 7.79s (1.17x)

Optimal thread count: 8 (6.68s)
Speedup vs 14 threads: 1.2x

Thomas Murrills (Dec 09 2025 at 22:57):

(For the record, I left my computer alone while benching. :grinning_face_with_smiling_eyes:)

Thomas Murrills (Dec 09 2025 at 22:59):

(I'll try it on windows when that machine is next available.)

Jireh Loreaux (Dec 09 2025 at 23:03):

Arch Linux with KDE:

Thread Count Sweep
Testing: [1, 2, 4, 6, 8, 12]

Testing 1 threads... 30.13s
Testing 2 threads... 22.36s
Testing 4 threads... 33.28s
Testing 6 threads... 39.92s
Testing 8 threads... 30.96s
Testing 12 threads... 42.35s

Results:
  1 threads: 30.13s (1.35x)
  2 threads: 22.36s (1.00x) (best)
  4 threads: 33.28s (1.49x)
  6 threads: 39.92s (1.79x)
  8 threads: 30.96s (1.38x)
  12 threads: 42.35s (1.89x)

Optimal thread count: 2 (22.36s)
Speedup vs 12 threads: 1.9x

Jireh Loreaux (Dec 09 2025 at 23:04):

My normal decompression rates when working on Mathlib are more like <10s though, so I suspect there is still some sort of problem with this benchmarking.

Snir Broshi (Dec 09 2025 at 23:27):

1 year-old Macbook Air M3 w/ 24GB RAM:

$ ./target/release/leantar-benchmark --sweep
Thread Count Sweep
Testing: [1, 2, 4, 6, 8]

Testing 1 threads... 26.77s
Testing 2 threads... 16.61s
Testing 4 threads... 10.88s
Testing 6 threads... 8.97s
Testing 8 threads... 7.68s

Results:
  1 threads: 26.77s (3.48x)
  2 threads: 16.61s (2.16x)
  4 threads: 10.88s (1.42x)
  6 threads: 8.97s (1.17x)
  8 threads: 7.68s (1.00x) (best)

Optimal thread count: 8 (7.68s)

Thomas Murrills (Dec 10 2025 at 04:08):

I was really surprised by the performance on Windows (AMD Ryzen 9 9950X) compared to my M4 Pro mac; it was quite a bit worse.

Thread Count Sweep
Testing: [1, 2, 4, 6, 8, 12, 16, 24, 32]

Testing 1 threads... 42.37s
Testing 2 threads... 31.94s
Testing 4 threads... 32.22s
Testing 6 threads... 31.69s
Testing 8 threads... 29.49s
Testing 12 threads... 30.65s
Testing 16 threads... 30.66s
Testing 24 threads... 29.90s
Testing 32 threads... 30.32s

Results:
  1 threads: 42.37s (1.44x)
  2 threads: 31.94s (1.08x)
  4 threads: 32.22s (1.09x)
  6 threads: 31.69s (1.07x)
  8 threads: 29.49s (1.00x) (best)
  12 threads: 30.65s (1.04x)
  16 threads: 30.66s (1.04x)
  24 threads: 29.90s (1.01x)
  32 threads: 30.32s (1.03x)

Optimal thread count: 8 (29.49s)
Speedup vs 32 threads: 1.0x

Then I noticed in task manager that the antimalware process was whirring away, so I turned it off temporarily:

Thread Count Sweep
Testing: [1, 2, 4, 6, 8, 12, 16, 24, 32]

Testing 1 threads... 32.05s
Testing 2 threads... 19.62s
Testing 4 threads... 13.99s
Testing 6 threads... 14.92s
Testing 8 threads... 11.90s
Testing 12 threads... 11.73s
Testing 16 threads... 14.08s
Testing 24 threads... 11.53s
Testing 32 threads... 12.26s

Results:
  1 threads: 32.05s (2.78x)
  2 threads: 19.62s (1.70x)
  4 threads: 13.99s (1.21x)
  6 threads: 14.92s (1.29x)
  8 threads: 11.90s (1.03x)
  12 threads: 11.73s (1.02x)
  16 threads: 14.08s (1.22x)
  24 threads: 11.53s (1.00x) (best)
  32 threads: 12.26s (1.06x)

Optimal thread count: 24 (11.53s)
Speedup vs 32 threads: 1.1x

which is better! But still a bit behind the M4 Pro. (The M4 Pro has a bit more memory, though.)

I'm not sure if it's possible for leangz to do anything about antimalware on Windows, though. It's not the first time that dealing with the cache (or other similar things) has been annoying on this machine due to Windows Defender, and it seems like this just has to be dealt with locally. :upside_down:

(Aside for anyone else with a Windows machine: maybe it's worth setting up a dev drive, since apparently there's a Windows Defender setting which allows it to run asynchronously on dev drives, so that it doesn't interrupt operations like this...?)

Thomas Murrills (Dec 10 2025 at 04:21):

@Chris Henson I'm quite impressed by your sub-3-second times! :grinning_face_with_smiling_eyes: Do you mind if I ask how much memory that machine has? I'm wondering how big a factor that is here.

Chris Henson (Dec 10 2025 at 04:38):

It is 96GB, you're probably right about that being significant.


Last updated: Dec 20 2025 at 21:32 UTC