Zulip Chat Archive
Stream: lean4
Topic: stage0
Julian Berman (Mar 04 2021 at 15:30):
What's the stage0/src
directory used for? From the name I guess some thing that handles early parts of the startup process? diff -r stage0/src src
shows it's nearly but not exactly a hard copy of the src/
directory?
Sebastian Ullrich (Mar 04 2021 at 15:35):
See https://leanprover.github.io/lean4/doc/make/index.html#lean-build-pipeline
Julian Berman (Mar 04 2021 at 15:37):
Thank you!
Yury G. Kudryashov (Mar 30 2024 at 18:28):
Are there any precautions against a self-replicating stage0 attack in Lean? I mean, what if someone pushes a PR that updates stage0 but also adds a self-replicating code that opens a backdoor in all future versions? I've just read about a recent attack on liblzma and I'm more worried about security than usual.
Henrik Böving (Mar 30 2024 at 18:51):
update-stage0
is currently mostly performed by a bot or trusted contributors. At least I have never seen an update-stage0
by an outsider get merged.
That said, in the xz attack one of the trusted contributors placed the code in question. And there aren't really any measures against this in place. If say an FRO member with Lean 4 push access decides to put some crypto trojan code into the repo that somehow makes it until a release that gets actively used by mathlib the mathlib people do indeed end up in big trouble.
Henrik Böving (Mar 30 2024 at 19:41):
Oh and for an xz type of attack. Of course a mathlib maintainer (or whoever else has push access to master) can push malware to master that gets run on machines of people that pull from master.
Furthermore if an account of a maintainer gets corrupted by someone it is also possible to carry out this type of attack. So really the security of projects like mathlib is only as good as the maintainer's security, in particular of course the maintainer with the worst security, whoever that might be.
In the end this is all a very trust based system^^ There are measures against this such as allowing nobody to actually put things on to master but instead enforcing a review process with as many people as you feel secure with signing off on changes (e.g. in the regulated German IT security industry it is mandated that the author + 2 colleagues must sign off on changes) but this always carries an overhead of course. In the end, as with almost all things in security, it is a question of balancing producitvity and security
Joachim Breitner (Mar 30 2024 at 20:38):
I would love to see someone replaying the history of lean4, redoing all the stage0 updates, as far back as they manage, so see how close we get to a bootstrappable build (in the sense of https://bootstrappable.org/), even if it may not be fully achievable. Bonus points if this effort is made reproducible using nix.
Joachim Breitner (Mar 30 2024 at 20:50):
Thinking about this, it shouldn't be too hard to write a script that reproduces a single stage0 update in isolation. Then add a script that runs that on all of them, and displays the result somewhere. Then look at those that fail and see if they can be fixed, or what to do with the few commits that accidentally combine stage0 with other changes (maybe grab the separated changes from the original PR). In some cases shortcuts may be possible, i.e. skipping a stage0, or maybe it will be the case thst one stage0 output cannot be reproduced (actually quite likely given that it may be constructed from a state on a branch that differs from the rebased-onto-master stage), but as long as the output is identical eventually, it should be fine.
The result of this exercise would be identifying a hopefully very early stage0 that cannot be replicated (yet), reducing the trust needed to those contributers that were already around then. In particular and crucially, not me :smiling_devil:
Joachim Breitner (Mar 31 2024 at 11:34):
Although it’s going to be quite a bit of work: Ever since the stage0
directory was introduced in
commit 91b68d8fa42153c696f51c1b57a4c67f803c54c2
Author: Sebastian Ullrich <sebasti@nullri.ch>
Date: Sun Nov 10 23:00:02 2019 +0100
feat: cleanly separate bootstrap stages
there have been 1825 stage0 updates…
Henrik Böving (Mar 31 2024 at 17:02):
Joachim Breitner said:
Although it’s going to be quite a bit of work: Ever since the
stage0
directory was introduced incommit 91b68d8fa42153c696f51c1b57a4c67f803c54c2 Author: Sebastian Ullrich <sebasti@nullri.ch> Date: Sun Nov 10 23:00:02 2019 +0100 feat: cleanly separate bootstrap stages
there have been 1825 stage0 updates…
I'm not 100% sure if I am recalling this correctly but I believe that we very rarely manually copied parts of the build system over into stage0 while reboostrapping? So that could make it harder to automate this
Joachim Breitner (Mar 31 2024 at 18:57):
I'm not optimistic to retrace the bootstrapping completely all the way back, but maybe we can achieve an “from X onwards” audit. And in the future enthusiastic sleuthers can push that X further back.
Josha Dekker (Mar 31 2024 at 19:07):
(deleted)
Joachim Breitner (Apr 02 2024 at 20:23):
preview.png
Sneak preview into a little stage0 audit tool.
Because of git rebase
, the stage0 cannot always be reproduced from the previous commit, in these cases one has to replay a couple of the stage0 updates to go justify one particular good state from another older one. I hope I can automate trying that.
In other cases we have accidential squash merges where rebase should happen. In these cases probably some manual investigation is needed, e.g. justify the resulting stage0 by going via commits from the original PR.
This is a little vacation side project, not sure how far I’ll take this.
Eric Wieser (Apr 02 2024 at 20:31):
Is it the case that the bot-based stage0 updates mean that this type of non-reproduction is still possible going forward?
Joachim Breitner (Apr 02 2024 at 21:28):
Yes: the bot can only be used if the stage0 update can happen on master, between two (squashed) PRs. Some features need one or more stage0 updates in the middle of some changes. These PRs are then rebase-merged, and if that happens without first rebasing them on master and updating the stage0 commit therein we get these discrepancies.
Eric Wieser (Apr 02 2024 at 21:33):
Would it be better to do a regular merge on those PRs in future so that the bootstrapping can be reproduced?
Joachim Breitner (Apr 03 2024 at 10:52):
Possibly, if one values that more than the linear history on master. I am not too concerned either way, the history is still in the GitHub repo, and with a bit of plumbing we can probably still retrace the bootstrapping.
The recent xz backdoor debacle kinda emphasizes the value of such audit, doesn't it? :-)
Joachim Breitner (Apr 20 2024 at 19:50):
Did I a bit more work on this, my tool can now be told “revision c8d77b83de
is a bad squash merge, pretend that instead we have b762567174
at this point in the history”. That justfies a few more stage0 bumps.
But some are odd in the sense that they simply don’t build. For example this stage0 bump: https://github.com/leanprover/lean4/commits/c8d77b83de1be912bd9aa232d666c6896b05d90e/stage0
If I check out the parent and run nix run .#update-stage0-commit
it fails with
error: builder for '/nix/store/iq2bpkkj77phd4x8qrr1b9fr3fhghwjm-Init.Tactics.drv' failed with exit code 1;
last 8 log lines:
> Init/Tactics.lean:1323:41: error: application type mismatch
> tk.raw
> argument
> tk
> has type
> Syntax : Type
> but is expected to have type
> TSyntax (Name.mkStr1 "term" :: List.nil) : Type
For full logs, run 'nix log /nix/store/iq2bpkkj77phd4x8qrr1b9fr3fhghwjm-Init.Tactics.drv'.
I fear that maybe sometimes developers (no finger pointing intended here!) ran make update-stage0-commit
on a different state than was committed before.
At this point probably manual labor is needed to re-bootstrap from this version from an earlier one.
Joachim Breitner (Apr 21 2024 at 19:20):
Ok, the weekend is coming to an end, and I should wrap up this weekend project hacking phase, before I use this for even more procrastination.
https://lean-stage0.nomeata.de/ explains the problem, my methodology and what the icons above mean. There are not few commits that may require some investigation into how to recreate their stage0. Code at https://github.com/nomeata/lean-stage0-audit.
So far, this is static, I did not yet write the automation needed to reproduce commits on github action and update this page automatically. Will do the next time round.
If someone would enjoy sleuthing around: Can you patch commit c8d77b83de
(or its parent) so that a new stage0 can be produced from it?
Mario Carneiro (Apr 21 2024 at 20:14):
I have on occasion produced stage0 commits which do not exactly match the result you would expect, so I can provide at least one explanation for why this might happen:
- The PR needs a stage0-update
- Create part 1
- run
make update-stage0
- Create part 2
- Run the tests
- oops, missed a doc comment in part 1
- create a fixup commit and
git rebase -i
to apply it to part 1 - I can't be bothered to run
make update-stage0
again because I'll need to do it again later after every rebase
I generally expect that stage0 commits in PRs don't matter though, because they are recreated from scratch on the master branch in the end anyway. They are just used to make sure the tests pass
Mario Carneiro (Apr 21 2024 at 20:16):
you can also replace "doc comment" with "trivial code typo" in the above as well, at which point the possibility of a stage0 update that fails to build becomes more likely
Mario Carneiro (Apr 21 2024 at 20:20):
Joachim Breitner said:
Yes: the bot can only be used if the stage0 update can happen on master, between two (squashed) PRs. Some features need one or more stage0 updates in the middle of some changes. These PRs are then rebase-merged, and if that happens without first rebasing them on master and updating the stage0 commit therein we get these discrepancies.
My impression was that in these cases you always rerun the update-stage0 during the rebase-merge. I will usually make use of the x
command in rebase -i
to replace every update-stage0 commit.
Mario Carneiro (Apr 21 2024 at 20:20):
it would be nice if there was a way to get the bot to do this too
Joachim Breitner (Apr 22 2024 at 06:14):
The final rebase is usuay done by GitHub (triggered by an admin), I think, and thus won't do anything special to the commit. For this to work out nicely, the PR author has to have the feature branch up to date with master and the stage0 commit fresh also has to be up to date. Maybe this isn't clearly documented? But even when it is, it's plenty of manual busiwork that can go wrong. We could have CI complain if thats not the case, so far we don't have such a check.
Joachim Breitner (Apr 24 2024 at 16:39):
I couldn’t resist playing around with this some more. Most of the “red” commits on master
could be fixed by digging through Github to see from what branch they were rebased from, and using that instead.
In two instances that failed as well, probably because of not committing all changes before updating stage0. In these cases I found a manual fix that I could apply to make it build again. I pushed these to stage0-graft/<hash of commit to be replaced>
.
This means I can now reproduce the history of stage0 until … :drum: … February!
Not so impressive yet, but its a start. Going back further is initially just busiwork (and better automation). Going forwards should hopefully be easier as well, as we hopefully pay a bit more attention to this.
Joachim Breitner (May 19 2024 at 20:28):
I’m on vacation, and for reasons best not dwelled too much on this meant I pushed this project a bit further.
There is now a github action workflow that uses that free payed-by-someone-else computer power to audit new lean4 commits, and also tries to reproduce older stage0 updates, pushing the “time of trust” further back. So every two hours, one step of progress should happen on https://lean-stage0.nomeata.de/.
Right now there aren’t any stage0 commits whose providence needs manual investigation, we’ll see how much of that we need.
Mario Carneiro (May 19 2024 at 20:38):
Another thing I'd like to try, maybe after you have finished replicating all the stage0's back to the beginning, is to see whether you can skip steps: That is, if you can use an earlier lean compiler to compile and reproduce a later stage0 update, you may be able to remove steps from the chain. My guess is that most steps can't be removed, or else we would not have an update-stage0 there, but it's possible that independent changes can be merged
Joachim Breitner (May 19 2024 at 20:51):
That would be rather easy to test with the current infrastructure – the build script takes a stage0 and a non-stage0 source and builds the latter with the former, so one could probe larger steps
But it doesn't matter for the existential question “can we reproduce current stage0”, only for (non-Prop?) task of actually doing that. So if, one day, the guix people want to ship lean4 they’ll be very happy if we can shorten the bootstrap chain from what it’s now :-). Besides, that, do you have another motivation for investigating in that direction?
Mario Carneiro (May 19 2024 at 20:55):
Shortening the chain is my primary motivation. I have a dream that we can shorten the chain to O(1) compilers, but it will not at all be following history. For your current work, it does have a side benefit that skipping steps allows you to not worry about breaks in the chain and potentially decrease the amount of commits to test, although since my prior on the probability of steps being skippable is low, it's likely to be more work and not less to optimize the chain length
Mario Carneiro (May 19 2024 at 21:05):
At one commit per 2 hours, I project it to finish at
Joachim Breitner (May 19 2024 at 21:08):
And that's only if the build script works fine with old commits, and we don't have too many bad merges or other commits that need manual intervention, and the page starts at a relative random point in the lean history (I think when the stage0
directory was created)…
It takes about 15 mins on a runner to do one step, and it's quite trivially parallizable, so we could run it faster of course. But it just means that we hit some more serious road blocks earlier…
Mario Carneiro (May 19 2024 at 21:10):
I would like to better understand the "prehistory" of stage0
Mario Carneiro (May 19 2024 at 21:11):
does there actually exist a working C++ lean compiler in the lean4 history?
Mario Carneiro (May 19 2024 at 21:13):
If it only takes 15 minutes per step, then I suggest increasing the frequency to 30 minutes instead of 2 hours. And ideally it shouldn't stop at bad merges, it should just keep going until the status of all updates is determined. You can do manual intervention stuff in parallel
Mario Carneiro (May 19 2024 at 21:23):
The produced
stage0/
trees, which may not exist as such in the main lean4 repository, are stored in this git repository underrefs/stage0/<short tree hash>
. This allows the reproduction of each individual step even in a long “alternative” chain.
@Joachim Breitner I can't find this in the git repo. Is this a tag, a branch, or a folder? I can't find any of them from github. It's also not clear from the description how short the tree hash is or whether it refers to the commit before or after a stage0 update
Joachim Breitner (May 20 2024 at 04:55):
It does keep going if the build fails, but will get stuck if my GitHub action fails.
Joachim Breitner (May 20 2024 at 04:57):
It's neither, it's a ref ”next to” tags and branches, and stores a tree object instead of a commit object there. May unnecessary advanced git foo :-)
You can fetch them using
git fetch https://github.com/nomeata/lean-stage0-audit 'refs/stage0/*:refs/stage0/*'
Joachim Breitner (May 20 2024 at 06:11):
Bumped frequency to 30mins. It's now hitting a stretch of commits where there seems to be a stray #exit
in the source making the build fails (signified by a frowny in the table), so probably needs manual intervention (possibly just deleting the exit and everything after). Not sure when and if I'll investigate myself, but happy to assist if someone else feels like sleuthing.
Joachim Breitner (May 20 2024 at 10:35):
Many build failures or reproducibility failures can be fixed by rewriting the git history and replacing the rebased commits with the original one. This is supported and helped a few times.
I wonder if this can be automated: is there a way to query GitHub for a given commit if this is commit is the result of a rebase merge on GitHub, and what the original commit was?
Maybe https://github.com/orgs/community/discussions/24679#discussioncomment-3245032 is worth investigating.
Eric Wieser (May 20 2024 at 11:34):
I would guess your best option is to hit the API for every PR and build the map in the other direction
Eric Wieser (May 20 2024 at 11:35):
(I hit the API this for all ~30k mathlib PRs when building a .bib file for my thesis; PyGithub is clever enough to do the necessary rate limit backoff)
Joachim Breitner (May 20 2024 at 21:11):
This is mildly addictive; I fixed a few more failing stage0 builds, mostly by digging out the corresponding commit from the pre-rebase feature branch, but in a few cases the builds failed due to some temporary #exit
in a source file; if I just remove that part of the file it reproduces. Not sure if the make-based stage0 update ignores such warnings and the nix-based one (which my tool uses) doesn’t, or how else that worked initially. If that comes up more often I guess the build script can use sed
to remove #exit
’s from the source so that that needs less manual intervention.
Patrick Massot (May 20 2024 at 21:12):
I know that kind of vacations.
Joachim Breitner (May 20 2024 at 21:12):
Granted, it’s :rainy: :-)
Joachim Breitner (May 22 2024 at 12:13):
https://lean-stage0.nomeata.de/ is chugging along nicely now, seems to be able to fix the occasional bad merge by itself, and has reached last November.
Kim Morrison (May 22 2024 at 12:19):
I'm sorry about the weather. :-)
Joachim Breitner (May 22 2024 at 13:55):
I did get another two flights done this morning, but now it's hailing…
Mario Carneiro (May 22 2024 at 22:37):
The new completion forecast is
Joachim Breitner (May 23 2024 at 06:40):
Note that I shortened the list of commits shown on the webpage to begin with the first one that introduces the nix based stage0 update, which my build script is using. Going further back will require more work on that front. Maybe that changed the forecast.
Joachim Breitner (May 23 2024 at 06:42):
The attack horizon is now pushed back to Jul 25, 2023, which is before I started at the FRO, so that should be a relief for many ;-)
Ruben Van de Velde (May 23 2024 at 07:12):
Ah, but if we can't trust you, can we trust your verification? :)
Joachim Breitner (May 23 2024 at 07:25):
Darn, and I thought I got through with this
Joachim Breitner (May 23 2024 at 07:27):
Jokes aside: it should be possible to clone the repo, delete the builds.csv
file, let it sit for a while while GitHub does their work, and reproduce the audit. Of course you also have to check the scripts in tbe repo (not much) and every single non-stage0 source change in the lean4 history (a bit more work…)
Joachim Breitner (May 23 2024 at 20:57):
And already we are at Dec 1, 2022. Suddenly we have much fewer problematic stage0 updates. I wager that's because back then there were fewer developers, so less parallel feature branches, and more direct pushes to master, rather than going via a PR. That's good news. If it continues like this the next hurdle will be adjusting to older build setups in the repo.
Joachim Breitner (May 24 2024 at 21:04):
It is unreasonably satisfying to reload the page and see a few more green tickmarks there. But I was tired of manually scrolling past the green part, so the top of https://lean-stage0.nomeata.de/ now has a status line of the form
The current
stage0/
code copy can be traced tostage0/
in revision ✨f9acb7fc9f
from 2022-10-15 in 191 steps.
Assuming we ever get to a non-lean bootstrap, I’m looking forward to the conversation with the nix or guix people
– So lean is self-boostrapping, is it?
– Yes, but we also have a bootstrap path from C++!
– Great! Then we can include lean in our bootstrapped distribution…
– Cool. Here are the 1456 versions of lean you have to build in sequence.
– :fear:
Mario Carneiro (May 24 2024 at 21:44):
– We tested it, it totally works. You just need to tie up your build machine / your users' computers for a month
Joachim Breitner (Jun 05 2024 at 17:14):
Now at 2021-09-12 in 612 steps. In that time frame I find a few with actual build errors such as
> Init/SimpLemmas.lean:71:51: error: type mismatch
> h₂ h
> has type
> x = u : Prop
> but is expected to have type
> ite c x y = ite c u v : Prop
that likely require manual investigation. The tool keeps churning on below, so when and if someone investigates the gap will be filled.
Jireh Loreaux (Jun 05 2024 at 18:00):
this is so cool
Joachim Breitner (Jun 06 2024 at 16:57):
Glad to hear you think it’s cool, because it’s very questionable whether it’s useful :-D
Yaël Dillies (Jun 06 2024 at 16:58):
Well clearly it makes the weather go better by the time it's done :mostly_sunny:
Joachim Breitner (Jun 10 2024 at 20:28):
Did not investigate the few failing commits yet, but the tool chugged on, so as additional motivation it now says how far it could build things if one did investigate these builds:
The current
stage0/
code copy can be traced tostage0/
in revision ✨1de3efff9d
from 2021-09-12 in 615 steps.
Investigating 6 revisions will trace it to revision
5787fc5ce4
from 2021-03-06 in 839 steps.
Joachim Breitner (Jun 17 2024 at 13:43):
Investigated a few commits that really needed some work (splitting commits, reordering). Now at
The current
stage0/
code copy can be traced tostage0/
in revision ✨14c438234a
from 2020-12-08 in 1027 steps.
I’m nearing the commit that introduced the nix run .#update-stage0-commit
feature, going earlier than that will require more plumbing work on my end.
Joachim Breitner (Jun 20 2024 at 07:28):
We have reached the end of this list of commits
The current
stage0/
code copy can be traced tostage0/
in revision ✨e2ff1c2b7e
from 2020-11-25 in 1088 steps.
This is not the beginning of development, far from it, but is the stage0 following the introduction of nix run .#update-stage0-commit
in 2794ae76
.
Going further back would require writing a small script and nix derivation that sets up the build environment using nix (for reproduciliby) but then uses the make
setup to build stage0. Maybe during the next rainy vacation.
Until then I have reduced the cron rate of the script; it will still update https://lean-stage0.nomeata.de/ as new stage0 commits are added to the top.
Last updated: May 02 2025 at 03:31 UTC