Zulip Chat Archive

Stream: general

Topic: Speeding up compilation time for Lean project


Donald Sebastian Leung (Feb 27 2020 at 08:03):

This is broadly related to Adding Lean Support to Codewars, but this time I'm directly testing out on my PC instead of on Codewars' servers.

So suppose I create a Lean project as follows:

$ leanpkg new my_project
$ cd my_project
$ leanpkg add leanprover-community/mathlib

I then remove _target/deps/mathlib/src/ entirely and replace it with nightly .olean files for mathlib. Then, after that, I return to the command line:

$ find _target | grep "[.]lean$" | xargs rm # Remove all .lean files from mathlib
$ find _target | grep "[.]olean$" | xargs touch # Give same timestamp to all .olean files

Now that the setup is complete, I go to src/ in my project directory and create three files Preloaded.lean, Solution.lean and SolutionTest.lean of the following format:

/- Preloaded.lean -/
import analysis.specific_limits -- Import something from mathlib

def SUBMISSION := some_mathematical_statement
notation `SUBMISSION` := SUBMISSION -- cheat prevention; prevent override of SUBMISSION

/- Solution.lean -/
import Preloaded
import data.real.basic -- maybe import some more from mathlib

-- Solution likely uses some results from mathlib
theorem solution : some_mathematical_statement := some_proof

/- SolutionTest.lean -/
import Preloaded Solution

theorem submission : SUBMISSION := solution

However, I get extremely slow performance when I run:

$ lean SolutionTest.lean -E SolutionTest.out --only-export=submission && leanchecker SolutionTest.out submission

For example, doing this on a proof that 0.999... = 1 takes 1m30s on my machine when it should take only about 8 seconds. Even just compiling SolutionTest.lean as follows takes half a minute:

$ lean SolutionTest.lean

Does anyone know what I'm doing wrong which is causing this subpar performance? I thought using pre-compiled .olean files (and deleting the original .lean files!) for mathlib should speed up the compilation process?

Kevin Buzzard (Feb 27 2020 at 09:02):

I think that the issue is that leanpkg new creates a file leanpkg.toml which states that the Lean to be used in this project is Lean 3.4.2. When you download the nightlies they are almost certainly all compiled with Lean 3.5.1 so none of the olean files are valid for the lean binary which the package is being told to use.

Johan Commelin (Feb 27 2020 at 09:02):

What are you seeing in the output?

Kevin Buzzard (Feb 27 2020 at 09:04):

The community in general only started using versions of Lean after 3.4.2 this month, after well over a year of being fixed to 3.4.2, and not all of the tooling has caught up.

Donald Sebastian Leung (Feb 27 2020 at 09:11):

Kevin Buzzard said:

I think that the issue is that leanpkg new creates a file leanpkg.toml which states that the Lean to be used in this project is Lean 3.4.2. When you download the nightlies they are almost certainly all compiled with Lean 3.5.1 so none of the olean files are valid for the lean binary which the package is being told to use.

My leanpkg.toml is as follows which says that the Lean version is 3.5.1 as far as I can tell?

[package]
name = "cw-lean-kata"
version = "0.1"
lean_version = "3.5.1"
path = "src"

[dependencies]
mathlib = {git = "https://github.com/leanprover-community/mathlib", rev = "f0bb2f826bd20e217edf199b8703b5baeba1354a"}

Donald Sebastian Leung (Feb 27 2020 at 09:12):

Johan Commelin said:

What are you seeing in the output?

The usual stuff (some axioms printed out and theorem submission : SUBMISSION), nothing out of the ordinary. I can also paste the exact output if it helps.

Patrick Massot (Feb 27 2020 at 09:17):

Kevin Buzzard said:

I think that the issue is that leanpkg new creates a file leanpkg.toml which states that the Lean to be used in this project is Lean 3.4.2. When you download the nightlies they are almost certainly all compiled with Lean 3.5.1 so none of the olean files are valid for the lean binary which the package is being told to use.

This question was discussed in https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/default.20toolchain. The issue is a big bug in leanpkg.

Kevin Buzzard (Feb 27 2020 at 09:18):

I am completely confused by your leanpkg.toml file. My leanpkg makes files with the line lean_version = "3.4.2" and in the projects which I have upgraded by hand to Lean 3.5.1, they say lean_version = "leanprover-community/lean:3.5.1". I am not an expert in this side of things by any means. I've never seen a toml like yours before but it might be fine.

Kevin Buzzard (Feb 27 2020 at 09:19):

...but it might also not be fine.

Patrick Massot (Feb 27 2020 at 09:19):

I went to the lean repository to find the link above and I'm really stunned: it seems lean3.6 was released yesterday without fixing this critical bug which prevents using the community Lean with leanpkg.

Kevin Buzzard (Feb 27 2020 at 09:22):

OK so basically the story is that there has been a community fork of Lean for a long time, but none of the tools used it. We decided earlier this month that it was time to now get users using the community fork, but some of the tools don't work correctly yet. I am using Lean 3.5.1 no problem with this toml:

[package]
name = "M4P33"
version = "0.1"
lean_version = "leanprover-community/lean:3.5.1"
path = "src"

[dependencies]
mathlib = {git = "https://github.com/leanprover-community/mathlib", rev = "1b1b6
26ae32b9a23d4eeab59dd78e0f9b3ac594e"}

so the first question is whether your problems are fixed by changing the toml so that lean_version is what it says above, and mathlib is set so that the revision is precisely the revision which you're checking out with the gzipped tarfile.

Sebastian Ullrich (Feb 27 2020 at 09:29):

Presumably @Donald Sebastian Leung is not using elan, otherwise he would have gotten an error using his leanpkg.toml.
@Donald Sebastian Leung Does checking an empty Lean file take any significant time? If yes, it could be that Lean is recompiling its own core library. If no, then perhaps the 30s vs 8s checking time simply is a matter of different CPUs, esp. different number of cores.

Johan Commelin (Feb 27 2020 at 09:30):

But a typical 100 line file should be checkable in <1s on any cpu that is <5 yrs old, right?

Johan Commelin (Feb 27 2020 at 09:30):

The 8s mark is already quite high.

Donald Sebastian Leung (Feb 27 2020 at 09:30):

Kevin Buzzard said:

OK so basically the story is that there has been a community fork of Lean for a long time, but none of the tools used it. We decided earlier this month that it was time to now get users using the community fork, but some of the tools don't work correctly yet. I am using Lean 3.5.1 no problem with this toml:

[package]
name = "M4P33"
version = "0.1"
lean_version = "leanprover-community/lean:3.5.1"
path = "src"

[dependencies]
mathlib = {git = "https://github.com/leanprover-community/mathlib", rev = "1b1b6
26ae32b9a23d4eeab59dd78e0f9b3ac594e"}

so the first question is whether your problems are fixed by changing the toml so that lean_version is what it says above, and mathlib is set so that the revision is precisely the revision which you're checking out with the gzipped tarfile.

I changed the .toml to the following:

[package]
name = "M4P33"
version = "0.1"
lean_version = "leanprover-community/lean:3.5.1"
path = "src"

[dependencies]
mathlib = {git = "https://github.com/leanprover-community/mathlib-nightly/", rev = "1b1b626ae32b9a23d4eeab59dd78e0f9b3ac594e"}

but compilation time is still 1m30s

Donald Sebastian Leung (Feb 27 2020 at 09:31):

Johan Commelin said:

But a typical 100 line file should be checkable in <1s on any cpu that is <5 yrs old, right?

Not sure if it's relevant but my Mac is already 6.5 years old :sweat_smile:

Sebastian Ullrich (Feb 27 2020 at 09:32):

Johan Commelin said:

The 8s mark is already quite high.

IIRC from the previous topic, Kevin had an alternative proof that takes 1s, so I suppose the proof just isn't very optimized.

Johan Commelin (Feb 27 2020 at 09:32):

Well, I just said some numbers.

Johan Commelin (Feb 27 2020 at 09:32):

I've done demos of Euclids infinite primes theorem using a 12 year old laptop

Donald Sebastian Leung (Feb 27 2020 at 09:33):

Sebastian Ullrich said:

Presumably Donald Sebastian Leung is not using elan, otherwise he would have gotten an error using his leanpkg.toml.
Donald Sebastian Leung Does checking an empty Lean file take any significant time? If yes, it could be that Lean is recompiling its own core library. If no, then perhaps the 30s vs 8s checking time simply is a matter of different CPUs, esp. different number of cores.

Nope, running lean SolutionTest.lean on an empty SolutionTest.lean takes less than 3 seconds

Johan Commelin (Feb 27 2020 at 09:33):

That is significant, I think.

Johan Commelin (Feb 27 2020 at 09:33):

It should be <0.1s

Sebastian Ullrich (Feb 27 2020 at 09:34):

Then I think your hardware is simply ~4x as slow as Kevin's (who reported the 8s number, afair?)

Donald Sebastian Leung (Feb 27 2020 at 09:35):

Thanks for all the help, I guess I just need to get a faster computer then (and Codewars needs some faster servers, though their servers are about 2x as fast as my Mac)

Donald Sebastian Leung (Feb 27 2020 at 13:42):

Final update: I went with @Sebastian Ullrich 's suggestion and went with elan + compiling the mathlib myself (which I swear took at least 2 hours), and my 0.999... = 1 example took 1m45s to compile. At this point I can be certain that my Mac (and Codewars' servers as well) is simply too slow for Lean. Thank you for all your help and support, I think this thread (and the thread on adding Lean support to Codewars) can be closed now.

Kevin Buzzard (Feb 27 2020 at 16:28):

There is something wrong. No file this short takes 1m45s to compile. Your set-up is broken. Earlier on I pointed out that you were probably using incompatible Lean and mathlib somehow. All this talk of > 30 seconds is completely absurd. Your set-up is broken right now. Nobody else reports this kind of time on any system, even Johan's old laptop.

Kevin Buzzard (Feb 27 2020 at 16:46):

@Donald Sebastian Leung I compiled your proof in less than 8 seconds on old hardware. I strongly suspect that you are also compiling a large chunk of Lean's maths library in your tests. If you use VS Code on your mac and look at your proof compiling, where are the orange bars? If they stay at the top of the file for 1 minute then you are compiling mathlib. If they quickly move from the top of the file and then spend 1 minute on your code then you are right, your hardware is not quick enough. The big question is what is going on during that 1 minute. You want to be compiling your code. I strongly suspect you are compiling a large chunk of mathlib due to a bad set-up. I'm sorry the infrastructure is not there. Ironically if you had tried to do all this a month or so ago before Lean 3.5.1 appeared and added to the confusion then you might have found things easier. Right now our tools are bad and everything has to be done by hand. But your belief that your file plus a correctly compiled mathlib takes 1m45 to compile on a machine which is e.g. capable of running a web browser is absurd.

Sebastian Ullrich (Feb 27 2020 at 17:13):

If he deleted the .lean files, it's definitely not recompiling mathlib at least

Mario Carneiro (Feb 27 2020 at 18:12):

could it be recompiling the core library? That happens to me sometimes and the 1m45s number seems about right for that

Patrick Massot (Feb 27 2020 at 20:22):

Sebastian Ullrich said:

If he deleted the .lean files, it's definitely not recompiling mathlib at least

I think he deleted them but restored them. We keep talkinh about olean archive hosted on GitHub or Azure, but they actually contain the lean files as well.

Patrick Massot (Feb 27 2020 at 20:23):

It would be really nice in Lean 3.7 (and Lean 4...) to have an option asking Lean to log the list of files it compiles and why.

Jason Rute (Feb 27 2020 at 20:28):

I wonder with the new mathlib tools (leanproject or whatever it will be called) if there could be a tool which lets one test that one's setup is complete, i.e. that it is using the .olean files, and that it compiles a file in a reasonable timeframe. Then I think for cases like this, telling the user to run that command can be part of the debugging process (and maybe a final step in the setup instructions).

Patrick Massot (Feb 27 2020 at 20:29):

Could you please copy that message to https://github.com/leanprover-community/mathlib-tools/pull/16?

Donald Sebastian Leung (Mar 03 2020 at 15:04):

Mario Carneiro said:

could it be recompiling the core library? That happens to me sometimes and the 1m45s number seems about right for that

So it just occurred to me to ask: how can I find out if I'm recompiling the core library? If I do lean --profile someFile.lean and the compilation time in the cumulative profiling times is in the order of fractions of a millisecond then I suppose that that is not the case?


Last updated: Dec 20 2023 at 11:08 UTC