Zulip Chat Archive
Stream: Equational
Topic: CI Lean Environment Check
Shreyas Srinivas (Oct 05 2024 at 09:25):
If we start using AI, we must be careful and add leanchecker in the CI
Terence Tao (Oct 05 2024 at 17:16):
I'm not familiar with this tool. What does it do?
By the way, I was informed that there was a run by a major AI company performed on the implications in this project, which achieved a completion rate slightly better than our current state of 99.9%; but the proofs were very lengthy and not suitable for uploading directly in the project. They are hoping to make some more targeted contributions in the near future.
Shreyas Srinivas (Oct 05 2024 at 18:41):
I hope I don't botch up this explanation because it is a bit subtle. Hopefully @Joachim Breitner can correct me about it. Here is a link to the tool: https://github.com/leanprover/lean4checker
While Lean remains perfectly sound, Lean's metaprogramming system is very powerful and allows some flexibility in adding declarations to the "environment". Here, the environment is the ambient context of assumptions, hypotheses, etc. It is possible for a malicious actor to sneak in a proof of False
while doing this and evade #print_axioms
. This will seemingly allow exfalso quodlibet proofs.
Lean4checker replays the environment from scratch, this time checking each insertion of a hypothesis into the environment, thus ensuring that no False hypothesis has been inserted into the environment.
Shreyas Srinivas (Oct 05 2024 at 18:43):
The reason I didn't insist on this until now is that we have been using good old fashioned AI in the form of z3, Vampire etc.
Shreyas Srinivas (Oct 05 2024 at 18:44):
And these have produced proof certificates. We have essentially been reconstructing their proofs. The room for human error in this has been small.
Shreyas Srinivas (Oct 05 2024 at 18:45):
With modern AI being unpredictable as it is, we need to add this final layer of certainty since it might discover these metaprogramming sorceries
Terence Tao (Oct 05 2024 at 18:45):
I see. Would you expect implementing LeanChecker to noticeably increase the CI time? If not then it seems like a no-brainer to implement it, AI or no AI
Shreyas Srinivas (Oct 05 2024 at 18:46):
It is definitely going to increase the CI time a bit. We should be able to use mathlib's CI action for lean4checker more or less as it is. One thing that needs checking is how it interacts with cached mathlib build files.
Shreyas Srinivas (Oct 05 2024 at 18:47):
Because there are some underlying C libraries that lean uses that need to matches exactly with the machine that produces mathlib's binary files.
Shreyas Srinivas (Oct 05 2024 at 18:48):
I can check it out now in a PR.
Terence Tao (Oct 05 2024 at 18:51):
As long as the CI time is still on the order of minutes instead of hours, I'd say it's worth it to have the extra guarantee. (Note to future self reading this thread: when writing the final paper, make a note of the extra layers of checking we have beyond the core Lean compiler, for instance we also have a separate check that no extra axioms are being inserted (equational#242).)
It might be good to "red team" and brainstorm for any other potential ways in which this project could be compromised, either by a glitchy AI or by some mischevious human.
Shreyas Srinivas (Oct 05 2024 at 18:56):
This is already part of the standard mathlib check
Shreyas Srinivas (Oct 05 2024 at 19:07):
@Mario Carneiro : sorry for tagging you here, but I want to make sure I understand lean4checker correctly
Shreyas Srinivas (Oct 05 2024 at 19:07):
Do we need to delete the entire mathlib cache and rebuild it to run lean4checker on this repo?
Mario Carneiro (Oct 05 2024 at 19:10):
I should first caution that lean4checker is not an "external checker", it's actually just reusing the original lean checker but from a separate process. If there is a bug in the kernel then both lean and lean4checker will be compromised
Shreyas Srinivas (Oct 05 2024 at 19:10):
Another thing is how frequently would you advice we run lean4checker? We have gigantic PRs with 70-80k lines of code coming in, largely of lemmas and proofs from automatic provers. There is a chance of modern AI provers soon joining the fray
Mario Carneiro (Oct 05 2024 at 19:10):
The time taken by lean4checker is usually negligible, for mathlib it's around 5-10% of build time
Shreyas Srinivas (Oct 05 2024 at 19:11):
But mathlib's cache builds come from the same machine on which lean4checker runs correct?
Shreyas Srinivas (Oct 05 2024 at 19:11):
here we are using lake exe cache to get mathlib oleans
Mario Carneiro (Oct 05 2024 at 19:11):
I don't think it is in anyone's benefit to reduce the rate of running a trusted kernel
Shreyas Srinivas (Oct 05 2024 at 19:11):
Mathlib runs it once in 24 hours
Shreyas Srinivas (Oct 05 2024 at 19:12):
We might have to build mathlib in our CI from scratch to make sure that the machine bignum libraries match
Mario Carneiro (Oct 05 2024 at 19:12):
I'm aware, but mathlib also runs a lot of side runs and things. I don't think it was a good decision even then
Mario Carneiro (Oct 05 2024 at 19:13):
if you want high assurance the first thing you should be doing is actually running the piece of software that promises the high assurance
Mario Carneiro (Oct 05 2024 at 19:15):
lean4checker will check what you configure it to check. It can check everything including the parts of mathlib you depend on if you want it to
Shreyas Srinivas (Oct 05 2024 at 19:16):
If we configure it to only check the code we write, is that safe? and does that help us avoiding rebuilding mathlib from scratch locally?
Mario Carneiro (Oct 05 2024 at 19:16):
While we're on the subject, I'll also mention lean4lean, which is actually an external checker, and has the same interface as lean4checker
Mario Carneiro (Oct 05 2024 at 19:16):
Neither of these tools requires that you rebuild mathlib
Mario Carneiro (Oct 05 2024 at 19:16):
they work based on the produced .olean files
Shreyas Srinivas (Oct 05 2024 at 19:17):
The lean4checker explicitly advises against using cached oleans
Mario Carneiro (Oct 05 2024 at 19:18):
oleans that cause lean4checker to have problems also cause lean to have problems
Mario Carneiro (Oct 05 2024 at 19:18):
so as long as the cache is working for you there is not much to worry about
Mario Carneiro (Oct 05 2024 at 19:18):
the text there actually says that it will fail if you give it oleans for the wrong OS
Mario Carneiro (Oct 05 2024 at 19:18):
which is fine, that's failing in the right direction
Notification Bot (Oct 05 2024 at 19:19):
34 messages were moved here from #Equational > Thoughts and impressions thread by Shreyas Srinivas.
Mario Carneiro (Oct 05 2024 at 19:20):
in any case, when you use lake exe cache get
you aren't downloading .olean
s directly, you are downloading a cross platform format .lgz
which is unpacked into .olean
s suitable for your OS
Shreyas Srinivas (Oct 05 2024 at 19:20):
I am now considering setting up lean4lean
Shreyas Srinivas (Oct 05 2024 at 19:21):
If we want high assurance, we might as well go for the highest assurance
Shreyas Srinivas (Oct 05 2024 at 19:29):
Is there like a general instruction file on how to use lean4lean on other projects?
Mario Carneiro (Oct 05 2024 at 19:30):
the instructions are in the readme, LMK if they need improvement
Shreyas Srinivas (Oct 05 2024 at 22:08):
okay, I will get on with it and if I succeed, we should have lean4lean checking each PR
Shreyas Srinivas (Oct 06 2024 at 01:00):
@Mario Carneiro : I tested lean4lean locally
Shreyas Srinivas (Oct 06 2024 at 01:01):
Initially it failed on the equations project
Shreyas Srinivas (Oct 06 2024 at 01:01):
Then I tried it on a fresh batteries clone and it reported failures
Shreyas Srinivas (Oct 06 2024 at 01:01):
Turns out the toolchains need to match
Shreyas Srinivas (Oct 06 2024 at 01:01):
I got it working for batteries toolchain v4.12.0-rc1
Shreyas Srinivas (Oct 06 2024 at 01:02):
This is the latest toolchain on lean4lean
Mario Carneiro (Oct 06 2024 at 01:02):
yes, you need to rebuild it with the same toolchain as the project
Mario Carneiro (Oct 06 2024 at 01:02):
same for lean4checker btw
Shreyas Srinivas (Oct 06 2024 at 01:02):
yeah but Kim keeps lean4checker up to date
Shreyas Srinivas (Oct 06 2024 at 01:03):
Can I push an update and a regular CI script to keep the toolchain up to date?
Mario Carneiro (Oct 06 2024 at 01:03):
Shreyas Srinivas said:
yeah but Kim keeps lean4checker up to date
I don't think so? The script to run lean4checker on your project just rebuilds it, and it doesn't break often
Mario Carneiro (Oct 06 2024 at 01:04):
Shreyas Srinivas said:
Can I push an update and a regular CI script to keep the toolchain up to date?
Sure. Is lean-action able to do this?
Shreyas Srinivas (Oct 06 2024 at 01:06):
I tried building lean4lean on 4.13.0-rc3
Shreyas Srinivas (Oct 06 2024 at 01:06):
It fails on batteries dependencies
Shreyas Srinivas (Oct 06 2024 at 01:06):
I'll try deleting the manifest
Shreyas Srinivas (Oct 06 2024 at 01:07):
It still errors, this time on Verify
Shreyas Srinivas (Oct 06 2024 at 01:08):
First error:
error: ././././Lean4Lean/Verify/Axioms.lean:10:36: invalid field 'data', the environment does not contain 'Array.data'
x✝
has type
Array α
Shreyas Srinivas (Oct 06 2024 at 01:09):
I got the lean4lean cli to build, but I did not have to do this on 4.12.0-rc1
Mario Carneiro (Oct 06 2024 at 01:11):
wow, apparently lean core is still struggling with backward compatibility
Mario Carneiro (Oct 06 2024 at 01:11):
I'll deal with this, not right now
Shreyas Srinivas (Oct 06 2024 at 01:11):
Oh and lean4lean managed to crash my system
Shreyas Srinivas (Oct 06 2024 at 01:12):
To be fair equational_theories has some huge files
Mario Carneiro (Oct 06 2024 at 01:14):
you have to be careful, the multithreaded mode tends to use up all your memory if you don't limit the number of threads because it has to build a bunch of mathlibs in memory
Mario Carneiro (Oct 06 2024 at 01:15):
I can usually only run it in single threaded mode
Shreyas Srinivas (Oct 06 2024 at 01:15):
Yes. 28 GB out of 32GB ram
Shreyas Srinivas (Oct 06 2024 at 01:16):
It went upto 35 and started using swap
Shreyas Srinivas (Oct 06 2024 at 01:16):
I have never been in that territory with this machine
Shreyas Srinivas (Oct 06 2024 at 01:17):
It did find a problem in ProofWidgets.Component.Recharts
Shreyas Srinivas (Oct 06 2024 at 01:18):
README doesn't suggest an option to limit threads.
Mario Carneiro (Oct 06 2024 at 01:21):
env LEAN_NUM_THREADS=n lean4lean ...
Shreyas Srinivas (Oct 06 2024 at 01:21):
Okay. I started it with taskset
Mario Carneiro (Oct 06 2024 at 01:22):
(BTW IIRC lean4checker has exactly the same behavior in terms of memory usage)
Mario Carneiro (Oct 06 2024 at 01:23):
maybe there is something more that can be done about it
Shreyas Srinivas (Oct 06 2024 at 01:24):
The taskset solution worked for a while, but memory started filling up very quickly
Shreyas Srinivas (Oct 06 2024 at 01:25):
Now trying the env method you suggested
Shreyas Srinivas (Oct 06 2024 at 01:52):
A few observations about lean4lean run on the project.
- I ran it with 2 threads starting at roughly 3:25 am. As of 3:48 am lean4lean is still running. This means CI runs are going to be very very long.
- It takes upto 10 GB of RAM on an 9th gen i9 processor with 32 GB RAM. While not exactly brand new, this processor is rather powerful. I am guessing github doesn't run its CI allocating such resources.
- The threads seem to be unevenly used at times. I can't explain why. This is especially close to files for which lean4lean posts large numbers inside the occasional "lean4lean took xxxx" updates
- About the updates. On the one hand it serves as a nice progress counter. On the other hand if lean4lean finds a problem, it simply logs them in the same stream of outputs.
- The units of
lean4lean took xxxx
are unclear and I would guess milliseconds. I would also venture a guess that it reports whenever a file takes more than 1000 milliseconds or so - Could we have an option to get only the summary of problematic files in the end?
Mario Carneiro (Oct 06 2024 at 01:54):
there shouldn't be any problematic files, unless by problematic you mean taking a long time
Shreyas Srinivas (Oct 06 2024 at 01:54):
lean4lean found a problem in ProofWidgets.Component.Recharts
Shreyas Srinivas (Oct 06 2024 at 01:54):
The only one so far
Mario Carneiro (Oct 06 2024 at 01:55):
there should be an error after that?
Mario Carneiro (Oct 06 2024 at 01:56):
actually it should immediately exit after printing that
Shreyas Srinivas (Oct 06 2024 at 01:56):
None. lean4lean continues running and posting updates for files and lean4lean took xxxx
where xxxx > 1000
always
Shreyas Srinivas (Oct 06 2024 at 01:57):
This one has the largest xxxx
: Mathlib.Analysis.Complex.UpperHalfPlane.Metric:thmDecl UpperHalfPlane.isometry_vertical_line: lean4lean took 25603
Mario Carneiro (Oct 06 2024 at 01:57):
the times are in ms as you predicted
Shreyas Srinivas (Oct 06 2024 at 01:58):
Here's a screenshot:
Screenshot from 2024-10-06 03-58-02.png
Shreyas Srinivas (Oct 06 2024 at 01:58):
should have cropped that
Mario Carneiro (Oct 06 2024 at 01:59):
oh note that the error is printed to stderr, while the rest goes to stdout, so there could be some races between the prints
Shreyas Srinivas (Oct 06 2024 at 02:00):
It's still running so it didn't quit after the error
Mario Carneiro (Oct 06 2024 at 02:00):
try changing Main.lean:288 to IO.eprintln s!"lean4lean found a problem in {m}: {e.toString}"
and see if you get any more information
Shreyas Srinivas (Oct 06 2024 at 02:01):
I still want to find out how many minutes it takes to finish the run
Shreyas Srinivas (Oct 06 2024 at 02:01):
I'll try after this run
Mario Carneiro (Oct 06 2024 at 02:02):
I don't think I've ever tried to build mathlib in that configuration
Mario Carneiro (Oct 06 2024 at 02:02):
the paper claims it can do mathlib in 58 minutes
Shreyas Srinivas (Oct 06 2024 at 02:02):
Two more 10k+ champions:
Mathlib.NumberTheory.RamificationInertia:thmDecl Ideal.powQuotSuccInclusion_injective: lean4lean took 11703
Mathlib.NumberTheory.FactorisationProperties:thmDecl Nat.weird_seventy: lean4lean took 14217
Mario Carneiro (Oct 06 2024 at 02:05):
(in case it wasn't clear, this is not the configuration you want for CI. It's spending all its time checking mathlib instead of your project)
Mario Carneiro (Oct 06 2024 at 02:07):
I think you want lean4lean Equational
where Equational
is replaced by the name of the root module
Shreyas Srinivas (Oct 06 2024 at 02:07):
Oh I see
Mario Carneiro (Oct 06 2024 at 02:07):
also with LEAN_NUM_THREADS
set to something low
Mario Carneiro (Oct 06 2024 at 02:08):
the default behavior is to check every .olean file in .lake
Shreyas Srinivas (Oct 06 2024 at 02:11):
I terminated the older run at 4:10 am
This new run on the project was instantaneous
Shreyas Srinivas (Oct 06 2024 at 02:11):
but gave no success message
Shreyas Srinivas (Oct 06 2024 at 02:15):
Mario Carneiro said:
try changing Main.lean:288 to
IO.eprintln s!"lean4lean found a problem in {m}: {e.toString}"
and see if you get any more information
Tried this suggestion on the older run (equivalent to --fresh I guess)
Shreyas Srinivas (Oct 06 2024 at 02:15):
Got a longer error message
lean4lean found a problem in ProofWidgets.Component.Recharts: failed to read file '././.lake/packages/proofwidgets/.lake/build/lib/ProofWidgets/Component/Recharts.olean', invalid header
Shreyas Srinivas (Oct 06 2024 at 02:16):
But lean4lean hasn't quit
Shreyas Srinivas (Oct 06 2024 at 02:16):
The good news in all this is that the project is sound so far
Mario Carneiro (Oct 06 2024 at 02:18):
ah right, I remember this issue. ProofWidgets has a flag saying to download a release, even though it contains out of date olean files. lake exe cache
replaces them with correct olean files, but some files are left over and not overwritten, for lean files that have since been deleted
Mario Carneiro (Oct 06 2024 at 02:19):
these files are basically dead and unused, but lean4lean just checks everything in the folder because it doesn't do dependency analysis in the default mode because it doesn't know what the target is
Mario Carneiro (Oct 06 2024 at 02:20):
when you do lean4lean Equational
it will check only the dependencies of Equational
module, so make sure all the results you want checked are transitively reachable from it
Shreyas Srinivas (Oct 06 2024 at 02:21):
will do. I need to catch some sleep. I will put the CI up tomorrow
Mario Carneiro (Oct 06 2024 at 05:47):
Shreyas Srinivas said:
Can I push an update and a regular CI script to keep the toolchain up to date?
I pushed an update, but you are welcome to add the CI script
Shreyas Srinivas (Oct 07 2024 at 22:21):
@Mario Carneiro : Is it okay if I add tags to lean4lean? From the equational_theories side, it makes a lot of sense to clone the repository at the matching tag and if one is not found then complain about it.
Shreyas Srinivas (Oct 07 2024 at 22:25):
The alternative is to read the lean4lean file's lean-toolchain
Shreyas Srinivas (Oct 07 2024 at 22:26):
@Terence Tao : I am sorry it took me a couple of days, since I have been chasing other deadlines in my day job (PhD). I'll get this CI integrated now
Shreyas Srinivas (Oct 07 2024 at 22:26):
At the very least there will be a script to check the project with lean4lean soon.
Shreyas Srinivas (Oct 08 2024 at 10:19):
The CI PR is up already. Currently it only applies lean4lean if the toolchains match and if a freshly cloned lean4lean builds.
Shreyas Srinivas (Oct 09 2024 at 22:55):
@Mario Carneiro : in equational#421 we have CI action which invokes a python script that runs lean4lean on the project.
Shreyas Srinivas (Oct 09 2024 at 22:56):
One thing that bothers me is that lean4lean seems to run nearly instantaneously
Mario Carneiro (Oct 09 2024 at 22:56):
how did you invoke it?
Shreyas Srinivas (Oct 09 2024 at 22:56):
I want to make sure the script is doing the right thing.
Shreyas Srinivas (Oct 09 2024 at 22:59):
env LEAN_NUM_THREADS=2 lake env <lean4lean_bin_path> --verbose equational_theories
Shreyas Srinivas (Oct 09 2024 at 23:08):
Important point : The script is called as a step in the blueprint.yml CI where the project has already been built
Shreyas Srinivas (Oct 10 2024 at 16:57):
@Mario Carneiro : quick reminder about this.
Shreyas Srinivas (Oct 10 2024 at 17:32):
I was reading that script and realised it would be pointless to run lake update when trying to match toolchains
Last updated: May 02 2025 at 03:31 UTC