Zulip Chat Archive

Stream: lean4 dev

Topic: Confusing build failure sometimes in stage1


Joachim Breitner (Sep 13 2023 at 15:00):

I observe some oddity that possibly stems from a misunderstanding of the bootstrapping, or something else; in any case, I am confused and clearing up the confusion will probably be enlightening.

In lean4#2529 I am swapping out the the implementation of Lean.Parser.Trie. This now leads to a parse error at a subsequent module

Lean/Parser/Extra.lean:290:82: error: expected ')'

I see this

  • with nix build .#lean
  • with make -C build/release/stage1 when building using cmake (in nix-shell to provide the clang deps, no ccache)
  • on CI on previous commits for that PR.
  • with elan override set lean4-stage0in src/, startingcode src/ and then opening Lean/Parser/Extra.lean

So far so normal, I must have made a mistake somehow. But strangely, the current commit (1c641e7) on that branch suddenly builds cleanly on CI! Did I somehow fix things? But then why would it fail locally?

I am particularly worried that the CI ran nix build .#cacheRoots fine but that command fails here… why is even nix letting me down here? Does the ccache on CI play a role?

I am also a bit confused how a change to Lean/Data/Trie.lean would affect how Lean/Parser/Extra.lean is built by stage0 (which shouldn’t have my change yet) – is it somehow using the stage1 parser here already?

Joachim Breitner (Sep 13 2023 at 15:15):

Oh, Github is setup to do this thing where CI doesn’t actually test the commit you are pushing, it tests a synthetic merge with master:

  HEAD is now at c42ed09 Merge 1c641e7a87918bb633a979020e3889d17c669e1c into e9d60e143a8b1ee06be1f201f78068250a2b8949

so whether CI is green or not doesn’t say much about the state of the branch itself, and success or failure depends on whatever happens on master right now. I never understood how they could make that the default… I wonder if it might be related to my problem, though.

Sebastian Ullrich (Sep 13 2023 at 15:30):

For a second you made me lose faith in Nix :big_smile: ... we're thinking about enabling Merge Queues for the lean4 repo, which would be a much more robust approach than this automatic merging

Joachim Breitner (Sep 13 2023 at 15:39):

Indeed, after git merge origin/master, my local nix build succeeds.
I still don’t understand why my change caused that kind of failure in a later module, but I guess that’s history now :-)

Joachim Breitner (Sep 13 2023 at 15:40):

This is the action/checkout setting to make CI test the branch as is: <https://github.com/actions/checkout#Checkout-pull-request-HEAD-commit-instead-of-merge-commit>
It goes well with the “require strict merge” setting. That will require PRs to be up to date before they can be merged (auto or not), and also adds the “Update to latest master” button on the PR, which needs to be manually pressed after pushes to master. Good enough on low activity repositories; if you need more throughput merge queues of some form (mergify or github native) are needed.

Sebastian Ullrich (Sep 13 2023 at 15:42):

If you haven't already seen https://leanprover.github.io/lean4/doc/dev/bootstrap.html#further-bootstrapping-complications, things can certainly get complicated/mysterious when changing the parser ABI :) . But I can't immediately say what could have gone wrong in this case.

Sebastian Ullrich (Sep 13 2023 at 15:45):

And now I see that I have committed prefer_native=true to master, oops...

Joachim Breitner (Sep 13 2023 at 15:45):

I skimmed that file, can’t claim I understood it all, but unless the TokenCache is pickled somehow I couldn’t quite see what could go wrong :shrug:

Sebastian Ullrich (Sep 13 2023 at 15:47):

It's not about on-disk representation but about necessarily mixing native code from the stage 0 binary and interpreted code from the stage 1 stdlib when doing metaprogramming while bootstrapping

Sebastian Ullrich (Sep 13 2023 at 15:47):

So you end up running code not necessarily compiled against the same definitions of parser types

Sebastian Ullrich (Sep 13 2023 at 15:48):

Pushed a stage0 update, so your PR should be broken again now :) ... you would have to set the flag in stage0/src/stdlib_flags.h yourself

Joachim Breitner (Sep 13 2023 at 16:11):

Should stage0/src/stdlib_flags.h be excluded from

stage0/** binary linguist-generated

if that’s a file that is manually edited in certain circumstances?

Sebastian Ullrich (Sep 13 2023 at 16:15):

Yes, that would probably make sense

Joachim Breitner (Sep 13 2023 at 16:17):

I’ll make a PR at https://github.com/leanprover/lean4/pull/2537

Mac Malone (Sep 13 2023 at 21:24):

Sebastian Ullrich said:

And now I see that I have committed prefer_native=true to master, oops...

Oh, so that was I mistake! I saw that when trying to debug lean4#245 and thought it may be part of the problem, but second guessed myself and believed I was misunderstanding something about the bootstrapping.

Joachim Breitner (Sep 14 2023 at 10:18):

At least that mistake debugged my branch, I wonder how much longer I might have stared at it if it wasn’t for that :)
In hindsight I kinda understand why my build failed without that.


Last updated: Dec 20 2023 at 11:08 UTC