Zulip Chat Archive

Stream: CSLib

Topic: nightly-testing


Kim Morrison (Jul 04 2025 at 04:01):

This is getting a bit ahead of the game, but once the repo launches the Lean FRO would like to include it both in the list of repositories that we support updating to new Lean versions, and the list of repositories that we do testing against for individual Lean pull requests.

For the former, this will mean creating a nightly-testing branch, which I'll then install some CI for, on which we will accumulate adaptations to Lean nightly releases. Most likely you'll also want to set up some infrastructure so that these adaptations can be progressively reviewed during the month, rather than waiting until the end of the month (or delegating that review to the FRO release manager, usually me) --- I'll explain how all that works when the time comes! Keeping the nightly-testing branch up to date with Lean nightly releases is mostly then handled by automation, and when it's not it's helpful to know what's going wrong early!

Kevin Sullivan (Jul 04 2025 at 14:43):

<Moved to better subchannel.>

Fabrizio Montesi (Aug 19 2025 at 18:05):

@Kim Morrison, we have a repo now! I've created a nightly-testing branch. What now? :)

Swarat Chaudhuri (Aug 19 2025 at 23:07):

Thank you @Kim Morrison !

Kim Morrison (Aug 19 2025 at 23:31):

The first step is https://github.com/leanprover/cslib/pull/38, which adds a regularly scheduled job that will merge main into nightly-testing, automatically resolving conflicts in favour of nightly-testing.

(Note this is a bit dangerous: whoever is fixing nightly-testing needs to realise that sometimes there will be bad merge artifacts. The solution is usually to revert that file to the version from main and reapply any adaptations required for the latest Lean nightly.)

Before we can merge this, we will need to create a token with repo privileges, in whichever Github account will push these changes (I recommend leanprover-community-mathlib4-bot for now, and Cslib can acquire its own bots later), and install this token as a Github secret on this repository, under the name NIGHTLY_TESTING.

Now, I have an admin bit on leanprover/cslib, so I can actually do this step myself, but I'll wait for confirmation. :-)

Kim Morrison (Aug 19 2025 at 23:36):

The second step is https://github.com/leanprover/cslib/pull/39, which will regularly update the Lean toolchain used on the nightly-testing branch.

This relies on the same token the previous workflow uses.

Kim Morrison (Aug 19 2025 at 23:46):

Note that this second job checks that Mathlib has a nightly-testing-YYYY-MM-DD tag for the current nightly (these tags are only created once Mathlib's nightly-testing branch is working), and only bumps the toolchain if this exists. It then simultaneously bumps the toolchain and changes the Mathlib require statement (which on main is pinned to a fixed revision of Mathlib) to point to that tag.

This should have the effect that you'll only see errors which are actually about Cslib, once Mathlib has its shop in order!

Kim Morrison (Aug 19 2025 at 23:48):

However you will see failures here from changes in Mathlib itself between release cycles which break Cslib. I think it's best to be aware of these as these happen, rather than landing all at once at the end of each month.

Kim Morrison (Aug 20 2025 at 00:06):

The third step is https://github.com/leanprover/cslib/pull/40, which will:

  • report successes or failures on the nightly-testing branch to Zulip in #nightly-testing > Cslib status updates
  • when nightly-testing succeeds
    • tag it as nightly-testing-YYYY-MM-DD
    • attempt to automatically create a PR from nightly-testing to the current bump/v4.X.0 branch, which will need to be reviewed (i.e. changes which accumulate on bump/v4.X.0 are considered "good to go", and in particular I have permission to merge them back to main when a new release comes out). If this succeeds, a bot will post a link to the PR to zulip. If this fails, a bot will post instructions for preparing this PR manually.

Kim Morrison (Aug 20 2025 at 00:07):

This step relies on an additional secret being installed, so give access to the Zulip bot. As above, I recommend that for now you just reuse the existing bot managed by Mathlib, and replace it as needed down the track.

Kim Morrison (Aug 20 2025 at 00:07):

I can install this secret, but again I'll wait for confirmation.

Kim Morrison (Aug 20 2025 at 00:08):

I think this is everything for now.

Kim Morrison (Aug 20 2025 at 00:08):

One thing that is not done here is allowing contributors to the Lean repository to test their PRs against Cslib before merging (this is possible for mathlib and for the reference manual). I propose leaving that unimplemented for now.

Kim Morrison (Aug 20 2025 at 00:11):

(Maybe one thing I should have said earlier: we also have the option of going with a more lightweight process! This setup adds some overhead, and is all predicated on the assumption that Cslib is going to take off and see significant growth. If that happens, and we're including Cslib in the batch of repositories that the FRO assists in keeping in sync with Lean releases, then I think this infrastructure is appropriate. If Cslib grows more slowly, or doesn't want to be included in the release process, we can certainly simplify.)

Fabrizio Montesi (Aug 20 2025 at 03:27):

Thanks for the great summary, @Kim Morrison!
I agree with all your suggestions, please go ahead and use your admin powers. Better realise breakage sooner rather than later.

While you're at it, could I ask you to add lean-release-tag as well with the same bot? :⁠-⁠)

Chris Henson (Aug 20 2025 at 04:25):

Thanks for the help, @Kim Morrison! I think this could be helpful. One point to mention is that we currently have a pinned Mathlib version in our lakefile. Previously I had updated this manually when I knew there was a release. Is the recommendation to remove this pin? (Otherwise the lake update will do nothing for nightly testing). I somewhat liked the certainty of having the pinned dependency, and have seen other projects do the same. Is there a way to make this compatible with this tooling? Or is the idea that with so much automated feedback and a human in the loop on merging the accumulated bump/v4.X.0 PRs that it's unnecessary to pin?

Kim Morrison (Aug 21 2025 at 11:32):

No -- I was only suggesting removing the pin on the nightly-testing branch.

Chris Henson (Sep 15 2025 at 01:38):

I see that @Fabrizio Montesi has approved cslib#38, cslib#39, and cslib#40 a few weeks ago. Anything holding us back from merging?

Fabrizio Montesi (Sep 15 2025 at 13:30):

I think Kim is just busy with the Lean release. :)

Kim Morrison (Sep 15 2025 at 23:48):

Sorry about the delay here. I need to install some secrets for these PRs to work which I'm doing now as part of completing the v4.24.0-rc1 checklist.

Kim Morrison (Sep 16 2025 at 00:04):

I have added a NIGHTLY_TESTING token in the cslib repository, which allows CI to use the leanprover-community-mathlib4-bot github account. I've also give that bot account write access, and accepted the invitation.

I've also added the ZULIP_API_KEY secret, using the same account that mathlib4 uses to post to Zulip.

(For now I'll record this information where we keep similar information for leanprover-community repositories, but if there's somewhere cslib specific you'd like it recorded, let me know. Similarly if/when we'd like to use separate accounts for these functions from the Mathlib ones, I can help with that.)

I think these are the remaining requirements for merging these PRs, so I'll do that now.

Chris Henson (Sep 16 2025 at 00:26):

Kim Morrison said:

Sorry about the delay here. I need to install some secrets for these PRs to work which I'm doing now as part of completing the v4.24.0-rc1 checklist.

No worries, I was just making sure it wasn't action required on our part. Thanks again!

Fabrizio Montesi (Sep 17 2025 at 14:05):

Hi @Kim Morrison, the update broke our doc toolchain because I forgot to mention that upgrades to the toolchain should be reflected here as well: https://github.com/leanprover/cslib/tree/main/docs

Is it possible to automate that bit as well?

Fabrizio Montesi (Sep 17 2025 at 14:06):

The problem is in lakefile.toml, the rev of doc-gen4 needs to be bumped.

Kim Morrison (Sep 18 2025 at 01:06):

Hopefully this is addressed by

Kim Morrison (Nov 19 2025 at 01:30):

There are some grind failures in nightly-testing --- if anyone is available to take a look at these that would be much appreciated. I would like to verify that Cslib works against the "modulized" Mathlib that we're about to merge, but these failures get in the way of testing.

I can take a look later this afternoon.

Chris Henson (Nov 19 2025 at 03:19):

I pushed a fix to the one of the two errors, having more trouble with the other.

I am confused by a behavior of grind? here. I usually take a grind that fails on nightly, change it to grind? on main, and see how I can minimize the resulting grind only to help debug. But for this failure (this line) I seem to get a grind only that fails.

Kim Morrison (Nov 19 2025 at 03:24):

I think this is fallout from me removing some problematic annotations in #11206.

Kim Morrison (Nov 19 2025 at 03:26):

or .... seems not. Adding them back locally doesn't help.

Chris Henson (Nov 19 2025 at 03:30):

Can you reproduce the grind? behavior I describe above on main? Seems like a bug, yes?

Chris Henson (Nov 19 2025 at 03:34):

This theorem has been a problem previously, considering the adaptation note in a separate branch of the proof. I can look again in the morning and maybe try to make it more explicit. I think I have the original proof I wrote in a branch somewhere.

Kim Morrison (Nov 19 2025 at 03:37):

Indeed the suggestion from grind? doesn't work, however:

    · grind (instances := 10000) only [= append_assoc, = dlookup_cons_ne, = Option.or_some,
      = Option.or_eq_none_iff,
       Pairwise.of_cons, !var, = dlookup_append, =_ mem_toFinset, = NodupKeys.eq_1, weaken,
      = dlookup_cons_eq,  wf, = Option.none_or, = nodup_iff_count, = Option.or_none,
      =_ singleton_append, = append_nil, = nodup_cons, = Nodup.eq_1, = dlookup_nil,
      = nodup_iff_pairwise_ne, = Option.or_self, = nodup_append, = Option.some_or,  to_ok,
      =_ Option.or_assoc, =_ cons_append, = pairwise_append, = haswellformed_def,  NodupKeys.nodup,
       perm_nodupKeys,  lc, = keys_append, usr Perm.nodup_iff, = nil_append, = cons_append,
      = Option.or_assoc, LC.var, =_ haswellformed_def, = mem_append, = Option.or_eq_some_iff,
      = Option.mem_some,  Pairwise.isChain, =_ append_assoc, = nodupKeys_middle, = Option.mem_def,
      = pairwise_middle, usr eq_or_mem_of_mem_cons, = keys_cons, trans_tvar, =_ append_eq, cases Or]

does (and I knew to add (instances := 10000) by looking at the "thresholds" section of the grind failure output.

Kim Morrison (Nov 19 2025 at 03:39):

Ideally that wouldn't happen, but I don't think we can guarantee that there won't be edge cases where the grind only suggestion requires higher thresholds than grind?. We could of course re-run the grind only suggestion, detect hit thresholds, and raise them and try again...

Chris Henson (Nov 19 2025 at 03:45):

Thanks, that makes sense. I think I need to look more carefully at what is happening, my guess is there are some annotations corresponding to weakening that get called continually.

If still outstanding by the time I wake up, I'll know where to start :)

Kim Morrison (Nov 19 2025 at 03:45):

That can be manually minimized to

    by_cases X = X'
    · grind only [= append_assoc, = Option.or_some, = dlookup_append, weaken, = dlookup_cons_eq,
         wf, =_ singleton_append, =_ cons_append, = mem_append, = Option.or_eq_some_iff,
        = Option.mem_some, =_ append_assoc, trans_tvar, cases Or]
    · grind [Sub.weaken, sublist_append_of_sublist_right]

Chris Henson (Nov 19 2025 at 03:47):

Ah okay, that works for nightly then. I'll still look into making this less fragile.


Last updated: Dec 20 2025 at 21:32 UTC