Zulip Chat Archive
Stream: general
Topic: bors build failure
Kyle Miller (Oct 06 2020 at 07:13):
Bors has failed building #4444 twice now, and it seems to be a deterministic timeout in mv_polynomial
. The PR touches control/traversable
, but only by adding documentation and reordering a definition.
/home/runner/work/mathlib/mathlib/src/data/mv_polynomial/basic.lean:414:0: error: (deterministic) timeout
Bryan Gin-ge Chen (Oct 06 2020 at 07:16):
I guess the first thing to do is to try merging master on that branch and letting the CI run
Bryan Gin-ge Chen (Oct 06 2020 at 07:20):
Here’s the line referenced in the error: https://github.com/leanprover-community/mathlib/blob/523bddb4f0cbfdc401b3b2f943ca6eba92ad90cc/src/data/mv_polynomial/basic.lean#L414
Kyle Miller (Oct 06 2020 at 07:25):
Ok, I've merged master and pushed it. Is the CI just to see if it can still build?
Johan Commelin (Oct 06 2020 at 07:26):
I'm having the same issue with #4461
Johan Commelin (Oct 06 2020 at 07:28):
The problem is coeff_mul
Johan Commelin (Oct 06 2020 at 07:29):
It's known to be slow... I guess we need to work on it
Kyle Miller (Oct 06 2020 at 07:32):
So it's not @Aaron Anderson's docstrings that broke the build? :upside_down:
Kyle Miller (Oct 06 2020 at 07:47):
Is the deterministic timeout actually nondeterministic? (at least when it comes to building all of mathlib?) It seems sort of improbable that a commit with only documentation comments would change much...
Bryan Gin-ge Chen (Oct 06 2020 at 07:59):
Yeah, I'm not sure what deterministic really means here. For what it's worth, I ran lean --make src/data/mv_polynomial/basic.lean
on the #4444 branch of mathlib and didn't get a timeout when opening the file in VS Code, even though it should be enforcing the same -T100000
limit.
According to the profiler, the tactic execution in mv_polynomial.coeff_mul
is taking 19.7 seconds, so it's definitely ripe for reworking. I don't know how to interpret the more detailed output though:
elaboration: tactic execution took 19.7s
num. allocated objects: 7835
num. allocated closures: 10779
19748ms 100.0% tactic.istep
19748ms 100.0% _interaction
19748ms 100.0% tactic.istep._lambda_1
19748ms 100.0% scope_trace
19748ms 100.0% _interaction._lambda_2
19748ms 100.0% tactic.step
19667ms 99.6% tactic.solve1
19525ms 98.9% _interaction._lambda_19
19140ms 96.9% tactic.interactive.propagate_tags
18716ms 94.8% interaction_monad_orelse
18665ms 94.5% _interaction._lambda_17
18615ms 94.3% tactic.interactive.simp_core
18554ms 94.0% tactic.interactive.simp_core_aux
18550ms 93.9% tactic.simplify
18550ms 93.9% tactic.try
18502ms 93.7% tactic.interactive.simpa._lambda_6
18501ms 93.7% tactic.interactive.simpa
17747ms 89.9% tactic.simp_target
17747ms 89.9% tactic.interactive.simp_core_aux._lambda_5
13118ms 66.4% _interaction._lambda_16
13061ms 66.1% _interaction._lambda_14
5162ms 26.1% _interaction._lambda_10
1697ms 8.6% interaction_monad.monad._lambda_9
807ms 4.1% tactic.interactive.simp_core_aux._lambda_4
807ms 4.1% tactic.interactive.simp_core_aux._lambda_2
525ms 2.7% rw_core
524ms 2.7% interactive.loc.apply
521ms 2.6% interaction_monad.orelse'
492ms 2.5% _private.3291397495.rw_goal._lambda_4
481ms 2.4% _private.3291397495.rw_goal._lambda_2
415ms 2.1% tactic.rewrite_target
415ms 2.1% tactic.rewrite
414ms 2.1% tactic.rewrite_core
368ms 1.9% tactic.mk_instance
287ms 1.5% _interaction._lambda_12
286ms 1.4% tactic.apply_instance
268ms 1.4% tactic.all_goals'
268ms 1.4% all_goals'_core
268ms 1.4% _private.3677394557.all_goals'_core._main._lambda_2
268ms 1.4% tactic.to_expr
235ms 1.2% _interaction._lambda_18
147ms 0.7% _interaction._lambda_7
142ms 0.7% _interaction._lambda_4
139ms 0.7% tactic.interactive.have._lambda_1
99ms 0.5% tactic.interactive.concat_tags._lambda_5
98ms 0.5% repeat_aux
98ms 0.5% tactic.try_core
98ms 0.5% tactic.repeat
98ms 0.5% _interaction._lambda_3
98ms 0.5% _private.1616170719.repeat_aux._main._lambda_1
94ms 0.5% _interaction._lambda_11
93ms 0.5% tactic.interactive.by_cases._lambda_2
93ms 0.5% tactic.interactive.by_cases
84ms 0.4% tactic.interactive.to_expr'
83ms 0.4% tactic.by_cases
59ms 0.3% tactic.mk_simp_set_core
59ms 0.3% tactic.mk_simp_set
57ms 0.3% tactic.interactive.convert
57ms 0.3% tactic.focus1
54ms 0.3% tactic.congr'
52ms 0.3% _interaction._lambda_13
39ms 0.2% _interaction._lambda_8
39ms 0.2% simp_lemmas.add_pexpr
39ms 0.2% simp_lemmas.append_pexprs
39ms 0.2% tactic.mk_simp_set_core._lambda_5
36ms 0.2% tactic.apply_eq_congr_core
36ms 0.2% tactic.congr_core'
35ms 0.2% tactic.congr'._main._lambda_3
32ms 0.2% tactic.mk_specialized_congr_lemma
31ms 0.2% rw_hyp
30ms 0.2% _private.859230899.rw_hyp._main._lambda_3
23ms 0.1% _private.859230899.rw_hyp._main._lambda_2
23ms 0.1% tactic.apply_core
21ms 0.1% tactic.rewrite_hyp
21ms 0.1% _interaction._lambda_5
20ms 0.1% tactic.seq'
18ms 0.1% tactic.join_user_simp_lemmas
18ms 0.1% simp_lemmas.mk_default
17ms 0.1% relation_tactic
17ms 0.1% _interaction._lambda_6
16ms 0.1% _private.1624076617.relation_tactic._lambda_1
16ms 0.1% tactic.exact
15ms 0.1% tactic.congr'._main._lambda_4
14ms 0.1% tactic.replace_target
10ms 0.1% tactic.interactive.simpa._lambda_1
10ms 0.1% tactic.replace_hyp
10ms 0.1% _private.3291397495.rw_goal._lambda_3
9ms 0.0% tactic.save_info_with_widgets
8ms 0.0% tactic.apply
8ms 0.0% tactic.mk_eq_mpr
8ms 0.0% tactic.save_widget
7ms 0.0% tactic.save_type_info
6ms 0.0% tactic.assert
6ms 0.0% tactic.interactive.apply._lambda_1
5ms 0.0% tactic.assumption._lambda_1
5ms 0.0% tactic.find_same_type
5ms 0.0% tactic.refine
4ms 0.0% tactic.apply_congr_core
4ms 0.0% tactic.mk_eq_mp
3ms 0.0% tactic.rintro
3ms 0.0% tactic.find_same_type._main._lambda_1
3ms 0.0% simp_lemmas.resolve_and_add
3ms 0.0% tactic.unify
3ms 0.0% tactic.assert_core
3ms 0.0% _private.3676539765.simp_lemmas.resolve_and_add._lambda_2
3ms 0.0% tactic.interactive.rintro
3ms 0.0% tactic.infer_type
3ms 0.0% tactic.rintro._lambda_3
3ms 0.0% simp_lemmas.add_simp
2ms 0.0% tactic.mk_app
2ms 0.0% simp_lemmas.erase
2ms 0.0% tactic.rcases_core._main._lambda_5
2ms 0.0% tactic.interactive.exact
2ms 0.0% tactic.rcases_core._main._lambda_9
2ms 0.0% tactic.rcases_core
2ms 0.0% tactic.mk_id_eq
2ms 0.0% tactic.rcases.continue
2ms 0.0% tactic.cases_core
2ms 0.0% _interaction._lambda_9
1ms 0.0% tactic.has_opt_auto_param._lambda_2
1ms 0.0% tactic.clear_goals._lambda_5
1ms 0.0% tactic.apply_congr_core._lambda_1
1ms 0.0% tactic.applyc
1ms 0.0% tactic.mk_const
1ms 0.0% tactic.has_opt_auto_param_for_apply._lambda_2
1ms 0.0% tactic.intro_core
1ms 0.0% tactic.get_eqn_lemmas_for
1ms 0.0% tactic.i_to_expr_for_apply
1ms 0.0% tactic.interactive.get_rule_eqn_lemmas
1ms 0.0% tactic.clear
1ms 0.0% format.join
1ms 0.0% environment.get_ext_eqn_lemmas_for
1ms 0.0% tactic.interactive.dsimp
1ms 0.0% tactic.save_info_thunk
1ms 0.0% tactic.dsimp_target
1ms 0.0% tactic.interactive.get_rule_eqn_lemmas._lambda_3
1ms 0.0% tactic.clear_goals
1ms 0.0% tactic.assertv
1ms 0.0% expr.get_app_fn
1ms 0.0% resolve_name'
1ms 0.0% tactic.clear'
1ms 0.0% expr.is_napp_of
1ms 0.0% list.foldl
1ms 0.0% tactic.mk_const._lambda_1
1ms 0.0% tactic.intro
1ms 0.0% expr.const
1ms 0.0% simp_lemmas.dsimplify
1ms 0.0% format.compose
1ms 0.0% expr.is_app_of
1ms 0.0% tactic.swap
1ms 0.0% tactic.instantiate_mvars
1ms 0.0% _interaction._lambda_15
1ms 0.0% tactic.assertv_core
1ms 0.0% tactic.congr'._main._lambda_2
Scott Morrison (Oct 06 2020 at 08:03):
I think that is saying simpa
is the culprit.
Bryan Gin-ge Chen (Oct 06 2020 at 08:05):
Indeed, changing the two simpa
s to simpa only
speeds it up to 2.64 seconds!
Johan Commelin (Oct 06 2020 at 08:07):
I almost have a PR ready
Johan Commelin (Oct 06 2020 at 08:07):
I'm restructuring it a bit more
Bryan Gin-ge Chen (Oct 06 2020 at 08:08):
Ah, OK. I was just about to submit one myself (with just the simpa only
change).
Johan Commelin (Oct 06 2020 at 08:10):
Johan Commelin (Oct 06 2020 at 08:10):
It's below 2s on my box now
Scott Morrison (Oct 06 2020 at 08:21):
oops, I did it as well...
Scott Morrison (Oct 06 2020 at 08:22):
I'll close.
Bryan Gin-ge Chen (Oct 06 2020 at 08:34):
Hmm, coeff_mul
just caused the most recent bors
batch to fail again: https://github.com/leanprover-community/mathlib/runs/1213623252
So maybe we should see if we can try and merge that first.
Johan Commelin (Oct 06 2020 at 08:36):
Yup, we might want to cancel everything, and squeeze this in first.
Bryan Gin-ge Chen (Oct 06 2020 at 08:43):
I just added #4469 to the queue. Even if the currently running batch fails, #4469 should be in the next batch and then everything should be good, I think.
Eric Wieser (Oct 06 2020 at 09:14):
bors r+ p=10
will give it higher priority
Bryan Gin-ge Chen (Oct 06 2020 at 09:16):
I considered doing that but since #4469 won't conflict with any of the other PRs, it should be more efficient just to have it run in a batch.
Bryan Gin-ge Chen (Oct 06 2020 at 09:18):
Also, the batch including #4469 is now running, since the batch that was running failed again. I went ahead and reconsolidated those PRs along with #4444 into a new batch.
Johan Commelin (Oct 06 2020 at 09:36):
#4472 is not critical for bors, but it brings coeff_mul
down into the 1.0s - 1.2s range, and shortens the proof.
Moritz Doll (Aug 09 2022 at 12:56):
@maintainers #15837 and #15836 should not be on the same batch, can someone jank one of them from the queue?
Johan Commelin (Aug 09 2022 at 12:57):
Done
Johan Commelin (Aug 09 2022 at 12:57):
I pulled #15837 off the queue
Moritz Doll (Aug 09 2022 at 12:58):
thanks
Oliver Nash (Aug 09 2022 at 12:58):
Bors handles this sort of this automatically though. It just tries again, dividing and conquering as it goes, right?
Moritz Doll (Aug 09 2022 at 13:05):
but since we know what the issue is (#15836 proves stuff with positive_cone
and #15837 renames positive_cone
to positive
), it is easier to cancel on of them and not waste time for builds that will fail anyways.
Eric Wieser (Aug 09 2022 at 13:06):
It's a shame there's no way to tell bors to undo a divide and conquer when we know better
Johan Commelin (Aug 09 2022 at 13:22):
I wish bors
priorities worked properly
Andrew Yang (Aug 09 2022 at 14:26):
By the way, I think anyone can bors cancel
?
At least it work for me in #15436 where I was not delegated.
María Inés de Frutos Fernández (Aug 15 2022 at 11:02):
Could someone take a look at why #14115 failed? The check that failed was bors — Merge conflict
, but I hadn't seen any merge conflicts before approving the PR.
Yaël Dillies (Aug 15 2022 at 11:04):
It's about to have a conflict with #15594.
Kevin Buzzard (Aug 15 2022 at 11:05):
That's not the answer to the question though.
Yaël Dillies (Aug 15 2022 at 11:06):
The answer is that bors detected the conflict.
Kevin Buzzard (Aug 15 2022 at 11:06):
With an open PR??
Kevin Buzzard (Aug 15 2022 at 11:07):
Oh I see! My mistake. It's also on the queue.
Eric Wieser (Aug 15 2022 at 11:16):
This question comes up over and over again; is this something we need to document, or that we need to ask bors to document better?
Kevin Buzzard (Aug 15 2022 at 11:20):
How does one fix this?
Kevin Buzzard (Aug 15 2022 at 11:20):
I mean the conflict, not the fact that it confused some people
Yaël Dillies (Aug 15 2022 at 11:21):
Wait for bors to be done with either PR, then you can merge the one that went through in the one that didn't.
Mario Carneiro (Aug 15 2022 at 11:22):
you can also head the process off by merging onto bors' staging commit
Yaël Dillies (Aug 15 2022 at 11:22):
Given that this batch already failed once, I'm not sure that's wise.
Yaël Dillies (Aug 15 2022 at 11:23):
A maintainer could check in the build logs whether #15594 was the faulty PR in that batch.
Arthur Paulino (Aug 15 2022 at 11:23):
I think this question comes up often because the way that bors operates is not widely known, especially if you're not a programmer
María Inés de Frutos Fernández (Aug 15 2022 at 11:24):
Eric Wieser said:
This question comes up over and over again; is this something we need to document, or that we need to ask bors to document better?
I think some documentation would be helpful.
Mario Carneiro (Aug 15 2022 at 11:24):
Once you have done the merge work on that staging commit, it shouldn't be too bad to merge on whatever ends up on master since that will probably also have the conflicting PR
Mario Carneiro (Aug 15 2022 at 11:24):
(when I say "merge" I mean "rebase")
Mario Carneiro (Aug 15 2022 at 11:25):
or you can just wait the 4 hours or so
Eric Wieser (Aug 15 2022 at 11:27):
Yaël Dillies said:
A maintainer could check in the build logs whether #15994 was the faulty PR in that batch.
So can a non-maintainer
Yaël Dillies (Aug 15 2022 at 11:29):
Yaël Dillies (Aug 15 2022 at 11:30):
Sometimes I can see the logs on the commit, but here https://github.com/leanprover-community/mathlib/commit/6eb146e873f32077111041472656aa3153f88b6f doesn't have them.
Eric Wieser (Aug 15 2022 at 11:34):
The logs are here: https://github.com/leanprover-community/mathlib/actions/workflows/bors.yml
Eric Wieser (Aug 15 2022 at 11:35):
Your page just links to the appropiate entry on that one
Yaël Dillies (Aug 15 2022 at 13:09):
Okay, #15594 was indeed the one failing. I just fixed the error (to_additive
was now guessing the wrong name). Can someone put it back on the queue?
Anatole Dedecker (Aug 15 2022 at 16:26):
Now it's my turn to cause build failures with #14534
Anatole Dedecker (Aug 15 2022 at 16:27):
It is a quite old PRs so there are "implicit conflicts" e.g some rewrites no longer work
Anatole Dedecker (Aug 15 2022 at 16:27):
I'm fixing it, sorry for the inconvenience
Oliver Nash (Aug 15 2022 at 16:43):
No need at all to apologise; this sort of thing is a normal part of collaborative development.
Junyan Xu (Aug 15 2022 at 16:57):
strangely #bors is currently showing no jobs running
Junyan Xu (Aug 15 2022 at 16:58):
but at the bottom of my PR #16046 it shows bors is running
Johan Commelin (Aug 15 2022 at 18:40):
It looks like bors is still feeling ill :sad:
Joseph Myers (Aug 15 2022 at 18:43):
I don't know why it hasn't restarted automatically with a smaller batch, but the failure of the last batch shows a genuine conflict between #16011 (uses topological_group_is_uniform
) and #16027 (renames topological_group_is_uniform
), so one of those needs to be removed from the batch before it's restarted.
Johan Commelin (Aug 15 2022 at 18:55):
Thanks. I asked it to take #16011 of the queue
Johan Commelin (Aug 15 2022 at 18:57):
Ha!
Johan Commelin (Aug 15 2022 at 18:57):
I just kicked a PR on the queue, and bors started working again.
Johan Commelin (Aug 15 2022 at 18:58):
So it seems like it just completely lost track of all its queues :oops:
Junyan Xu (Aug 15 2022 at 19:00):
Does bors cancel vs. bors r-
matter?
(Edit: apparently they're equivalent: https://bors.tech/documentation/ )
Thomas Browning (Aug 15 2022 at 19:01):
Johan Commelin said:
I just kicked a PR on the queue, and bors started working again.
But does #bors still remember the other PRs?
Johan Commelin (Aug 15 2022 at 19:08):
Nope. All the queues were gone.
Johan Commelin (Aug 15 2022 at 19:08):
I'm adding everything back now.
Johan Commelin (Aug 15 2022 at 19:09):
Doing an experiment with manually putting 10 PRs in a batch using priorities.
Johan Commelin (Aug 15 2022 at 19:15):
I think everything that had ready-to-merge
should now be back on a #bors queue. Let's hope it doesn't forget the queues again.
Anatole Dedecker (Aug 15 2022 at 19:50):
#14534 is ready again, should I just bors r+
it again?
Johan Commelin (Aug 15 2022 at 19:52):
Feel free to try
Bryan Gin-ge Chen (Aug 15 2022 at 20:08):
It looks like all the queues were dropped because bors crashed (only maintainers can view this link). If it happens again it could be worth reporting.
Ruben Van de Velde (Aug 15 2022 at 21:34):
Hmm, I didn't know bors would combine PRs with different priorities
Screenshot-from-2022-08-15-23-33-33.png
Thomas Browning (Aug 15 2022 at 21:44):
If you go to the PR, I think it's because the priority was set twice
Joseph Myers (Aug 16 2022 at 00:05):
The next build failure says that #15065 should be removed from the queue until it's been updated for the has_scalar
rename. (But bors seems to have carried on automatically with a different subset of the queue after that build failure.)
Bryan Gin-ge Chen (Aug 16 2022 at 00:26):
Thanks! It looks like #15065 has since been removed from the queue.
Johan Commelin (Aug 16 2022 at 05:27):
Crazy. Bors is now building a queue with prio 0, even though there are two queues with prio 4.
Johan Commelin (Aug 16 2022 at 05:28):
Thomas Browning said:
If you go to the PR, I think it's because the priority was set twice
But all queues had a unique prio ~ 8hrs ago. So there is no good reason for mixing them.
Johan Commelin (Aug 16 2022 at 05:28):
I completely don't understand how bors priorities work.
Junyan Xu (Aug 16 2022 at 05:36):
It seems that when bors bisects (due to failed build), the half that's not chosen forms a batch that goes to the end of the queue.
Last updated: Dec 20 2023 at 11:08 UTC