Zulip Chat Archive

Stream: mathlib4

Topic: github action bandwidth


Kenny Lau (Oct 28 2025 at 16:39):

https://github.com/leanprover-community/mathlib4/actions?query=is%3Aqueued
Is our Github Action bandwidth enough?

Bryan Gin-ge Chen (Oct 28 2025 at 16:40):

We have 10 self-hosted runners and they're all busy at the moment. It looks like we've just hit a busy period.

Bryan Gin-ge Chen (Oct 28 2025 at 16:43):

One thing I will do is dedicate one of the runners to our bors jobs so that we can keep commits flowing into master.

Bryan Gin-ge Chen (Oct 28 2025 at 16:56):

Browsing the runs on that page, it looks like there's actually a bunch of dead jobs from August; I'll try and delete those from the list so that the count is more useful.

Kenny Lau (Oct 28 2025 at 17:15):

@Bryan Gin-ge Chen there are also duplicate runs on that page; if I understand correctly, if two commits are pushed, the first one would be skipped; but it seems like the "skipping" doesn't happen while the run is being queued

Bryan Gin-ge Chen (Oct 28 2025 at 17:22):

I think that's controlled by the concurrency setting for our workflows -- I don't know off the top of my head whether what you're observing is expected or if we need to tweak this somehow.

Kenny Lau (Oct 28 2025 at 18:08):

it looks like the first one was in fact not skipped... maybe this is suboptimal?

Jiang Jiedong (Oct 28 2025 at 19:04):

Report of the current situation: The CI of one of my PRs has been waiting for 1h10m ... Still pending.

Bryan Gin-ge Chen (Oct 28 2025 at 19:16):

Unfortunately all 10 of our runners are still busy (and I've changed the dedicated runner back so that it will run jobs from ordinary PRs as well). You can take a look at this link (also posted by Kenny above) to see how many jobs are currently queued, and maybe get a sense of how much longer it might take.

Michael Stoll (Oct 28 2025 at 19:28):

I've noticed that CI on a PR that is assigned to me seemed to build all of Mathlib, which I think it should not do.
If this happens across the board, all these CI runs are taking a rather long time...

Bryan Gin-ge Chen (Oct 28 2025 at 19:35):

It's possible that something has gone wrong with caching, though if that's happening I'm surprised that we're not getting straight-up errors -- if you can link to an example log where you expected a short build but got a long one, I can take a quick look.

Kenny Lau (Oct 28 2025 at 19:51):

Kenny Lau said:

@Bryan Gin-ge Chen there are also duplicate runs on that page; if I understand correctly, if two commits are pushed, the first one would be skipped; but it seems like the "skipping" doesn't happen while the run is being queued

@Bryan Gin-ge Chen who might know about this?

Bryan Gin-ge Chen (Oct 28 2025 at 20:06):

Hmm, looks like the concurrency groups aren't causing multiple runs from the same PR to be canceled for some reason. I'll try and look into this.

Michael Stoll (Oct 28 2025 at 20:06):

Bryan Gin-ge Chen said:

if you can link to an example log

Here

Bryan Gin-ge Chen (Oct 28 2025 at 20:13):

@Michael Stoll I'm not sure if it's the root cause but it looks like the PR author hadn't merged master for some time, and some big changes were made to the cache with Lean v4.25.0-rc2 which may have made it incompatible with old PRs. We probably should have made an announcement, but it slipped my mind at least. I think merging master, as the PR author did, should hopefully fix things.

Maybe we need to bring back the "please-merge-master" label idea I mentioned here #mathlib4 > CI failure @ 💬

Kenny Lau (Oct 28 2025 at 20:14):

yeah that might be it, master was changed

Michael Stoll (Oct 28 2025 at 20:14):

Just to be clear: I'm not the author of this PR :-)

Michael Stoll (Oct 28 2025 at 20:15):

But in any case, if there are many older PRs that haven't merged master in a while, it could be a similar effect.

Bryan Gin-ge Chen (Oct 28 2025 at 20:45):

Bryan Gin-ge Chen said:

Hmm, looks like the concurrency groups aren't causing multiple runs from the same PR to be canceled for some reason. I'll try and look into this.

#31032 is my attempt at fixing this

David Loeffler (Oct 28 2025 at 21:26):

@Bryan Gin-ge Chen I'm afraid that doesn't seem to have worked, there are two parallel runs from #31004 happening at the moment, both triggered in the last 10 minutes. (Or does the change to CI only take effect once existing PR authors merge master?) oops, sorry, of course the CI change is still an open PR, clearly I am not thinknig straight and need to get some sleep.

Kenny Lau (Oct 28 2025 at 21:27):

@David Loeffler what Bryan linked to is an open PR

Kim Morrison (Oct 29 2025 at 02:50):

Bryan Gin-ge Chen said:

#31032 is my attempt at fixing this

I've hit merge, please everyone keep an eye out for changes when this lands.

Bryan Gin-ge Chen (Oct 30 2025 at 06:59):

I just observed a push of a new commit at #30759 canceling this run so #31032 might be working as expected (I'm not 100% sure because #30759 is not from a fork though)


Last updated: Dec 20 2025 at 21:32 UTC