Zulip Chat Archive

Stream: Equational

Topic: CI Lean Environment Check

Shreyas Srinivas (Oct 05 2024 at 09:25):

If we start using AI, we must be careful and add leanchecker in the CI

Terence Tao (Oct 05 2024 at 17:16):

I'm not familiar with this tool. What does it do?

By the way, I was informed that there was a run by a major AI company performed on the implications in this project, which achieved a completion rate slightly better than our current state of 99.9%; but the proofs were very lengthy and not suitable for uploading directly in the project. They are hoping to make some more targeted contributions in the near future.

Shreyas Srinivas (Oct 05 2024 at 18:41):

I hope I don't botch up this explanation because it is a bit subtle. Hopefully @Joachim Breitner can correct me about it. Here is a link to the tool: https://github.com/leanprover/lean4checker

While Lean remains perfectly sound, Lean's metaprogramming system is very powerful and allows some flexibility in adding declarations to the "environment". Here, the environment is the ambient context of assumptions, hypotheses, etc. It is possible for a malicious actor to sneak in a proof of False while doing this and evade #print_axioms. This will seemingly allow exfalso quodlibet proofs.

Lean4checker replays the environment from scratch, this time checking each insertion of a hypothesis into the environment, thus ensuring that no False hypothesis has been inserted into the environment.

Shreyas Srinivas (Oct 05 2024 at 18:43):

The reason I didn't insist on this until now is that we have been using good old fashioned AI in the form of z3, Vampire etc.

Shreyas Srinivas (Oct 05 2024 at 18:44):

And these have produced proof certificates. We have essentially been reconstructing their proofs. The room for human error in this has been small.

Shreyas Srinivas (Oct 05 2024 at 18:45):

With modern AI being unpredictable as it is, we need to add this final layer of certainty since it might discover these metaprogramming sorceries

Terence Tao (Oct 05 2024 at 18:45):

I see. Would you expect implementing LeanChecker to noticeably increase the CI time? If not then it seems like a no-brainer to implement it, AI or no AI

Shreyas Srinivas (Oct 05 2024 at 18:46):

It is definitely going to increase the CI time a bit. We should be able to use mathlib's CI action for lean4checker more or less as it is. One thing that needs checking is how it interacts with cached mathlib build files.

Shreyas Srinivas (Oct 05 2024 at 18:47):

Because there are some underlying C libraries that lean uses that need to matches exactly with the machine that produces mathlib's binary files.

Shreyas Srinivas (Oct 05 2024 at 18:48):

I can check it out now in a PR.

Terence Tao (Oct 05 2024 at 18:51):

As long as the CI time is still on the order of minutes instead of hours, I'd say it's worth it to have the extra guarantee. (Note to future self reading this thread: when writing the final paper, make a note of the extra layers of checking we have beyond the core Lean compiler, for instance we also have a separate check that no extra axioms are being inserted (equational#242).)

It might be good to "red team" and brainstorm for any other potential ways in which this project could be compromised, either by a glitchy AI or by some mischevious human.

Shreyas Srinivas (Oct 05 2024 at 18:56):

This is already part of the standard mathlib check

Shreyas Srinivas (Oct 05 2024 at 19:07):

@Mario Carneiro : sorry for tagging you here, but I want to make sure I understand lean4checker correctly

Shreyas Srinivas (Oct 05 2024 at 19:07):

Do we need to delete the entire mathlib cache and rebuild it to run lean4checker on this repo?

Mario Carneiro (Oct 05 2024 at 19:10):

I should first caution that lean4checker is not an "external checker", it's actually just reusing the original lean checker but from a separate process. If there is a bug in the kernel then both lean and lean4checker will be compromised

Shreyas Srinivas (Oct 05 2024 at 19:10):

Another thing is how frequently would you advice we run lean4checker? We have gigantic PRs with 70-80k lines of code coming in, largely of lemmas and proofs from automatic provers. There is a chance of modern AI provers soon joining the fray

Mario Carneiro (Oct 05 2024 at 19:10):

The time taken by lean4checker is usually negligible, for mathlib it's around 5-10% of build time

Shreyas Srinivas (Oct 05 2024 at 19:11):

But mathlib's cache builds come from the same machine on which lean4checker runs correct?

Shreyas Srinivas (Oct 05 2024 at 19:11):

here we are using lake exe cache to get mathlib oleans

Mario Carneiro (Oct 05 2024 at 19:11):

I don't think it is in anyone's benefit to reduce the rate of running a trusted kernel

Shreyas Srinivas (Oct 05 2024 at 19:11):

Mathlib runs it once in 24 hours

Shreyas Srinivas (Oct 05 2024 at 19:12):

We might have to build mathlib in our CI from scratch to make sure that the machine bignum libraries match

Mario Carneiro (Oct 05 2024 at 19:12):

I'm aware, but mathlib also runs a lot of side runs and things. I don't think it was a good decision even then

Mario Carneiro (Oct 05 2024 at 19:13):

if you want high assurance the first thing you should be doing is actually running the piece of software that promises the high assurance

Mario Carneiro (Oct 05 2024 at 19:15):

lean4checker will check what you configure it to check. It can check everything including the parts of mathlib you depend on if you want it to

Shreyas Srinivas (Oct 05 2024 at 19:16):

If we configure it to only check the code we write, is that safe? and does that help us avoiding rebuilding mathlib from scratch locally?

Mario Carneiro (Oct 05 2024 at 19:16):

While we're on the subject, I'll also mention lean4lean, which is actually an external checker, and has the same interface as lean4checker

Mario Carneiro (Oct 05 2024 at 19:16):

Neither of these tools requires that you rebuild mathlib

Mario Carneiro (Oct 05 2024 at 19:16):

they work based on the produced .olean files

Shreyas Srinivas (Oct 05 2024 at 19:17):

The lean4checker explicitly advises against using cached oleans

Mario Carneiro (Oct 05 2024 at 19:18):

oleans that cause lean4checker to have problems also cause lean to have problems

Mario Carneiro (Oct 05 2024 at 19:18):

so as long as the cache is working for you there is not much to worry about

Mario Carneiro (Oct 05 2024 at 19:18):

the text there actually says that it will fail if you give it oleans for the wrong OS

Mario Carneiro (Oct 05 2024 at 19:18):

which is fine, that's failing in the right direction

Notification Bot (Oct 05 2024 at 19:19):

34 messages were moved here from #Equational > Thoughts and impressions thread by Shreyas Srinivas.

Mario Carneiro (Oct 05 2024 at 19:20):

in any case, when you use lake exe cache get you aren't downloading .oleans directly, you are downloading a cross platform format .lgz which is unpacked into .oleans suitable for your OS

Shreyas Srinivas (Oct 05 2024 at 19:20):

I am now considering setting up lean4lean

Shreyas Srinivas (Oct 05 2024 at 19:21):

If we want high assurance, we might as well go for the highest assurance

Shreyas Srinivas (Oct 05 2024 at 19:29):

Is there like a general instruction file on how to use lean4lean on other projects?

Mario Carneiro (Oct 05 2024 at 19:30):

the instructions are in the readme, LMK if they need improvement

Shreyas Srinivas (Oct 05 2024 at 22:08):

okay, I will get on with it and if I succeed, we should have lean4lean checking each PR

Shreyas Srinivas (Oct 06 2024 at 01:00):

@Mario Carneiro : I tested lean4lean locally

Shreyas Srinivas (Oct 06 2024 at 01:01):

Initially it failed on the equations project

Shreyas Srinivas (Oct 06 2024 at 01:01):

Then I tried it on a fresh batteries clone and it reported failures

Shreyas Srinivas (Oct 06 2024 at 01:01):

Turns out the toolchains need to match

Shreyas Srinivas (Oct 06 2024 at 01:01):

I got it working for batteries toolchain v4.12.0-rc1

Shreyas Srinivas (Oct 06 2024 at 01:02):

This is the latest toolchain on lean4lean

Mario Carneiro (Oct 06 2024 at 01:02):

yes, you need to rebuild it with the same toolchain as the project

Mario Carneiro (Oct 06 2024 at 01:02):

same for lean4checker btw

Shreyas Srinivas (Oct 06 2024 at 01:02):

yeah but Kim keeps lean4checker up to date

Shreyas Srinivas (Oct 06 2024 at 01:03):

Can I push an update and a regular CI script to keep the toolchain up to date?

Mario Carneiro (Oct 06 2024 at 01:03):

Shreyas Srinivas said:

yeah but Kim keeps lean4checker up to date

I don't think so? The script to run lean4checker on your project just rebuilds it, and it doesn't break often

Mario Carneiro (Oct 06 2024 at 01:04):

Shreyas Srinivas said:

Can I push an update and a regular CI script to keep the toolchain up to date?

Sure. Is lean-action able to do this?

Shreyas Srinivas (Oct 06 2024 at 01:06):

I tried building lean4lean on 4.13.0-rc3

Shreyas Srinivas (Oct 06 2024 at 01:06):

It fails on batteries dependencies

Shreyas Srinivas (Oct 06 2024 at 01:06):

I'll try deleting the manifest

Shreyas Srinivas (Oct 06 2024 at 01:07):

It still errors, this time on Verify

Shreyas Srinivas (Oct 06 2024 at 01:08):

First error:

error: ././././Lean4Lean/Verify/Axioms.lean:10:36: invalid field 'data', the environment does not contain 'Array.data'
  x✝
has type
  Array α

Shreyas Srinivas (Oct 06 2024 at 01:09):

I got the lean4lean cli to build, but I did not have to do this on 4.12.0-rc1

Mario Carneiro (Oct 06 2024 at 01:11):

wow, apparently lean core is still struggling with backward compatibility

Mario Carneiro (Oct 06 2024 at 01:11):

I'll deal with this, not right now

Shreyas Srinivas (Oct 06 2024 at 01:11):

Oh and lean4lean managed to crash my system

Shreyas Srinivas (Oct 06 2024 at 01:12):

To be fair equational_theories has some huge files

Mario Carneiro (Oct 06 2024 at 01:14):

you have to be careful, the multithreaded mode tends to use up all your memory if you don't limit the number of threads because it has to build a bunch of mathlibs in memory

Mario Carneiro (Oct 06 2024 at 01:15):

I can usually only run it in single threaded mode

Shreyas Srinivas (Oct 06 2024 at 01:15):

Yes. 28 GB out of 32GB ram

Shreyas Srinivas (Oct 06 2024 at 01:16):

It went upto 35 and started using swap

Shreyas Srinivas (Oct 06 2024 at 01:16):

I have never been in that territory with this machine

Shreyas Srinivas (Oct 06 2024 at 01:17):

It did find a problem in ProofWidgets.Component.Recharts

Shreyas Srinivas (Oct 06 2024 at 01:18):

README doesn't suggest an option to limit threads.

Mario Carneiro (Oct 06 2024 at 01:21):

env LEAN_NUM_THREADS=n lean4lean ...

Shreyas Srinivas (Oct 06 2024 at 01:21):

Okay. I started it with taskset

Mario Carneiro (Oct 06 2024 at 01:22):

(BTW IIRC lean4checker has exactly the same behavior in terms of memory usage)

Mario Carneiro (Oct 06 2024 at 01:23):

maybe there is something more that can be done about it

Shreyas Srinivas (Oct 06 2024 at 01:24):

The taskset solution worked for a while, but memory started filling up very quickly

Shreyas Srinivas (Oct 06 2024 at 01:25):

Now trying the env method you suggested

Shreyas Srinivas (Oct 06 2024 at 01:52):

A few observations about lean4lean run on the project.

I ran it with 2 threads starting at roughly 3:25 am. As of 3:48 am lean4lean is still running. This means CI runs are going to be very very long.
It takes upto 10 GB of RAM on an 9th gen i9 processor with 32 GB RAM. While not exactly brand new, this processor is rather powerful. I am guessing github doesn't run its CI allocating such resources.
The threads seem to be unevenly used at times. I can't explain why. This is especially close to files for which lean4lean posts large numbers inside the occasional "lean4lean took xxxx" updates
About the updates. On the one hand it serves as a nice progress counter. On the other hand if lean4lean finds a problem, it simply logs them in the same stream of outputs.
The units of lean4lean took xxxx are unclear and I would guess milliseconds. I would also venture a guess that it reports whenever a file takes more than 1000 milliseconds or so
Could we have an option to get only the summary of problematic files in the end?

Mario Carneiro (Oct 06 2024 at 01:54):

there shouldn't be any problematic files, unless by problematic you mean taking a long time

Shreyas Srinivas (Oct 06 2024 at 01:54):

lean4lean found a problem in ProofWidgets.Component.Recharts

Shreyas Srinivas (Oct 06 2024 at 01:54):

The only one so far

Mario Carneiro (Oct 06 2024 at 01:55):

there should be an error after that?

Mario Carneiro (Oct 06 2024 at 01:56):

actually it should immediately exit after printing that

Shreyas Srinivas (Oct 06 2024 at 01:56):

None. lean4lean continues running and posting updates for files and lean4lean took xxxx where xxxx > 1000 always

Shreyas Srinivas (Oct 06 2024 at 01:57):

This one has the largest xxxx : Mathlib.Analysis.Complex.UpperHalfPlane.Metric:thmDecl UpperHalfPlane.isometry_vertical_line: lean4lean took 25603

Mario Carneiro (Oct 06 2024 at 01:57):

the times are in ms as you predicted

Shreyas Srinivas (Oct 06 2024 at 01:58):

Here's a screenshot:
Screenshot from 2024-10-06 03-58-02.png

Shreyas Srinivas (Oct 06 2024 at 01:58):

should have cropped that

Mario Carneiro (Oct 06 2024 at 01:59):

oh note that the error is printed to stderr, while the rest goes to stdout, so there could be some races between the prints

Shreyas Srinivas (Oct 06 2024 at 02:00):

It's still running so it didn't quit after the error

Mario Carneiro (Oct 06 2024 at 02:00):

try changing Main.lean:288 to IO.eprintln s!"lean4lean found a problem in {m}: {e.toString}" and see if you get any more information

Shreyas Srinivas (Oct 06 2024 at 02:01):

I still want to find out how many minutes it takes to finish the run

Shreyas Srinivas (Oct 06 2024 at 02:01):

I'll try after this run

Mario Carneiro (Oct 06 2024 at 02:02):

I don't think I've ever tried to build mathlib in that configuration

Mario Carneiro (Oct 06 2024 at 02:02):

the paper claims it can do mathlib in 58 minutes

Shreyas Srinivas (Oct 06 2024 at 02:02):

Two more 10k+ champions:

Mathlib.NumberTheory.RamificationInertia:thmDecl Ideal.powQuotSuccInclusion_injective: lean4lean took 11703
Mathlib.NumberTheory.FactorisationProperties:thmDecl Nat.weird_seventy: lean4lean took 14217

Mario Carneiro (Oct 06 2024 at 02:05):

(in case it wasn't clear, this is not the configuration you want for CI. It's spending all its time checking mathlib instead of your project)

Mario Carneiro (Oct 06 2024 at 02:07):

I think you want lean4lean Equational where Equational is replaced by the name of the root module

Shreyas Srinivas (Oct 06 2024 at 02:07):

Oh I see

Mario Carneiro (Oct 06 2024 at 02:07):

also with LEAN_NUM_THREADS set to something low

Mario Carneiro (Oct 06 2024 at 02:08):

the default behavior is to check every .olean file in .lake

Shreyas Srinivas (Oct 06 2024 at 02:11):

I terminated the older run at 4:10 am

This new run on the project was instantaneous

Shreyas Srinivas (Oct 06 2024 at 02:11):

but gave no success message

Shreyas Srinivas (Oct 06 2024 at 02:15):

Mario Carneiro said:

try changing Main.lean:288 to IO.eprintln s!"lean4lean found a problem in {m}: {e.toString}" and see if you get any more information

Tried this suggestion on the older run (equivalent to --fresh I guess)

Shreyas Srinivas (Oct 06 2024 at 02:15):

Got a longer error message

lean4lean found a problem in ProofWidgets.Component.Recharts: failed to read file '././.lake/packages/proofwidgets/.lake/build/lib/ProofWidgets/Component/Recharts.olean', invalid header

Shreyas Srinivas (Oct 06 2024 at 02:16):

But lean4lean hasn't quit

Shreyas Srinivas (Oct 06 2024 at 02:16):

The good news in all this is that the project is sound so far

Mario Carneiro (Oct 06 2024 at 02:18):

ah right, I remember this issue. ProofWidgets has a flag saying to download a release, even though it contains out of date olean files. lake exe cache replaces them with correct olean files, but some files are left over and not overwritten, for lean files that have since been deleted

Mario Carneiro (Oct 06 2024 at 02:19):

these files are basically dead and unused, but lean4lean just checks everything in the folder because it doesn't do dependency analysis in the default mode because it doesn't know what the target is

Mario Carneiro (Oct 06 2024 at 02:20):

when you do lean4lean Equational it will check only the dependencies of Equational module, so make sure all the results you want checked are transitively reachable from it

Shreyas Srinivas (Oct 06 2024 at 02:21):

will do. I need to catch some sleep. I will put the CI up tomorrow

Mario Carneiro (Oct 06 2024 at 05:47):

Shreyas Srinivas said:

Can I push an update and a regular CI script to keep the toolchain up to date?

I pushed an update, but you are welcome to add the CI script

Shreyas Srinivas (Oct 07 2024 at 22:21):

@Mario Carneiro : Is it okay if I add tags to lean4lean? From the equational_theories side, it makes a lot of sense to clone the repository at the matching tag and if one is not found then complain about it.

Shreyas Srinivas (Oct 07 2024 at 22:25):

The alternative is to read the lean4lean file's lean-toolchain

Shreyas Srinivas (Oct 07 2024 at 22:26):

@Terence Tao : I am sorry it took me a couple of days, since I have been chasing other deadlines in my day job (PhD). I'll get this CI integrated now

Shreyas Srinivas (Oct 07 2024 at 22:26):

At the very least there will be a script to check the project with lean4lean soon.

Shreyas Srinivas (Oct 08 2024 at 10:19):

The CI PR is up already. Currently it only applies lean4lean if the toolchains match and if a freshly cloned lean4lean builds.

Shreyas Srinivas (Oct 09 2024 at 22:55):

@Mario Carneiro : in equational#421 we have CI action which invokes a python script that runs lean4lean on the project.

Shreyas Srinivas (Oct 09 2024 at 22:56):

One thing that bothers me is that lean4lean seems to run nearly instantaneously

Mario Carneiro (Oct 09 2024 at 22:56):

how did you invoke it?

Shreyas Srinivas (Oct 09 2024 at 22:56):

I want to make sure the script is doing the right thing.

Shreyas Srinivas (Oct 09 2024 at 22:59):

env LEAN_NUM_THREADS=2 lake env <lean4lean_bin_path> --verbose equational_theories

Shreyas Srinivas (Oct 09 2024 at 23:08):

Important point : The script is called as a step in the blueprint.yml CI where the project has already been built

Shreyas Srinivas (Oct 10 2024 at 16:57):

@Mario Carneiro : quick reminder about this.

Shreyas Srinivas (Oct 10 2024 at 17:32):

I was reading that script and realised it would be pointless to run lake update when trying to match toolchains

Last updated: May 02 2025 at 03:31 UTC