Zulip Chat Archive
Stream: lean4
Topic: stack overflow failure in `get cache` step
Sébastien Gouëzel (May 04 2024 at 08:23):
I don't think I already had a stack overflow in the get cache
step (see https://github.com/leanprover-community/mathlib4/actions/runs/8949168207/job/24583144189). It's probably a transient failure, but I thought I would mention it here in case someone else encounters it.
Sébastien Gouëzel (May 04 2024 at 08:34):
Crap, it did it again, so it looks like it's not purely transient. Any idea to debug this?
Sébastien Gouëzel (May 04 2024 at 08:45):
@Mac Malone , here is a deterministic way to get the issue:
- create a new branch based on mathlib master
- add a file
Mathlib/test.lean
only containingimport Mathlib
- commit and push
Then the get cache
step fails, with a stack overflow. See for instance https://github.com/leanprover-community/mathlib4/actions/runs/8949293595/job/24583426811.
Sébastien Gouëzel (May 04 2024 at 08:48):
Probably not directly related, but in another run https://github.com/leanprover-community/mathlib4/actions/runs/8949280481/job/24583395867 the get cache
step downloaded 2586 files, but then started building everything again. Is there still something wrong with the cache or the Hoskinson machines? @Mario Carneiro , do you understand what might be going on?
Sebastian Ullrich (May 04 2024 at 09:16):
This is most likely not a problem in Lake but in cache. As a workaround, try putting ulimit -s unlimited
in the preceding line
Ruben Van de Velde (May 04 2024 at 09:16):
Is this because we have a step that will add your new test.lean to Mathlib.lean?
Sebastian Ullrich (May 04 2024 at 09:18):
Oh you're right, this might be true infinite recursion between modules. Which Lake should be able to detect but cache might not.
Mario Carneiro (May 04 2024 at 15:46):
@Sébastien Gouëzel said:
Probably not directly related, but in another run https://github.com/leanprover-community/mathlib4/actions/runs/8949280481/job/24583395867 the
get cache
step downloaded 2586 files, but then started building everything again. Is there still something wrong with the cache or the Hoskinson machines? Mario Carneiro , do you understand what might be going on?
Are you sure it actually started building everything again? It looks like you misread the new lake output as it building everything, but I'm pretty sure it just printed everything because it only took 13 minutes to build
Sébastien Gouëzel (May 04 2024 at 15:50):
Does it say Building foo.lean
even when there are available oleans? Because there are 4500 such lines in the build step, starting from the first files.
Mario Carneiro (May 04 2024 at 15:50):
yes
Mario Carneiro (May 04 2024 at 15:51):
This is new behavior and not very well received, it's a side effect of the new logging architecture but I'm increasingly convinced we need a way to fix it anyway
Mario Carneiro (May 04 2024 at 15:52):
the issue is that it prints that line to say it is preparing to build the target before it has checked to see whether it is up to date
Sébastien Gouëzel (May 04 2024 at 15:53):
Then probably I've just misread the build output. But I agree it would be a good idea to clarify the output!
Mario Carneiro (May 04 2024 at 15:54):
@Mac Malone any chance we can get a fix for this in rc2?
Mac Malone (May 04 2024 at 17:37):
@Mario Carneiro Unlikely/ It would require significant rearchitecture. I ahve some ides on how this could be done, but nothing easy.
Mac Malone (May 04 2024 at 20:38):
Also, it would be nice to have issues for these things, so I can have a recrod of the various improvements people would like to the logging output and keep them in mind. :pray:
Richard Copley (May 05 2024 at 13:24):
Mac Malone said:
Also, it would be nice to have issues for these things, so I can have a recrod of the various improvements people would like to the logging output and keep them in mind. :pray:
Last updated: May 02 2025 at 03:31 UTC