Zulip Chat Archive

Stream: general

Topic: Speccing a mathlib Mac


Stuart Presnell (Jun 09 2022 at 15:18):

I'm thinking about buying a new Mac laptop to replace my struggling 2013 MacBook Pro. If you were speccing out a machine that's primarily intended for mathlib work, what would you be looking for? For example (and sorry for the very basic questions!):

  • Should I be looking to get more memory or more CPU/GPU cores?
  • Is there a threshold on memory or cores beyond which I won't see any noticeable benefit, or can VSCode use whatever I can throw at it?
  • I saw that some people were having issues with getting things set up on M1 machines. Are these issues generally resolved now?
  • If you're using a Mac, what are you using?
  • If you've upgraded from an old Mac to a new one have you seen appreciable improvements in performance when working on mathlib, or should I lower my expectations?

Johan Commelin (Jun 09 2022 at 15:20):

If you only care about running one lean project at a time, 16~20 GB RAM should be fine. Maxing out on CPU will help a lot, there's a lot of stuff that can be parallelized when building Lean projects. GPU isn't helpful at the moment.

Anne Baanen (Jun 09 2022 at 15:21):

I can confirm that 16GB is not enough anymore for compiling mathlib on my machine, so I am considering upgrading to 24 or 32 GB

Adam Topaz (Jun 09 2022 at 15:21):

I think @Matthew Ballard has experience running lean on the newer MacBook

Eric Wieser (Jun 09 2022 at 15:22):

I've been doing my mathlib work exclusively on gitpod for months now since my 8GB is nowhere near enough

Stuart Presnell (Jun 09 2022 at 15:23):

I'm assuming that even with a powerful new computer I would still be relying on pushing stuff up to GitHub and letting the remote machine do the heavy work, rather than recompiling mathlib locally?

Julian Berman (Jun 09 2022 at 15:27):

For completeness, I'm on an M1 MacBook Air with no upgrades == 8GB RAM, and have no real complaints honestly, but I do do a lot less work than others so it's possible I'd be more frustrated if I did. If I have a need to compile mathlib I do it on a beefier machine, or really I let GH do it and just pull the oleans, but I compile lean, lean4 and friends regularly.

Matthew Ballard (Jun 09 2022 at 15:29):

My data point. I had a M1 Macbook Air with 8 gb and building all mathlib was not possible. I now have a MacBook Pro with an M1 max with 64gb and it takes a little under an hour to build mathlib.

My belief, from experience and looking at other benchmarking, is that the processor bump from M1 to M1 max would not be seriously noticeable for these tasks.

Matthew Ballard (Jun 09 2022 at 15:30):

FYI. The new M2 lifts the memory upper limit to 24gb.

Anne Baanen (Jun 09 2022 at 15:31):

I'm using a late 2013 MacBook Pro (version 11,3) and a custom built desktop (Intel Core i5 8500 3.0 GHz; 16G RAM) running Arch Linux. I have the impression that the 6 cores/6 threads of the desktop are faster than the 4 cores/8 threads of the MacBook, so upgrading a CPU definitely makes sense to me.

Eric Rodriguez (Jun 09 2022 at 15:43):

My m1 pro 16GB macbook with lean compiled to ARM runs something like 2x faster than my desktop with a Ryzen 3600+32GB, and as of a month or two ago could compile mathlib.

Matthew Ballard (Jun 09 2022 at 16:06):

Eric Rodriguez said:

My m1 pro 16GB macbook with lean compiled to ARM runs something like 2x faster than my desktop with a Ryzen 3600+32GB, and as of a month or two ago could compile mathlib.

Did you notice a performance bump when moving to native from emulated lean?

Stuart Presnell (Jun 09 2022 at 16:11):

Can I use reported benchmarks to extrapolate anything about the day-to-day experience of doing mathlib work? For example, I'm looking at a computer whose single-core Geekbench scores are 2-3 times better than my current machine, and whose multi-core scores are 6-8 times better. So, for example, when I edit a file and I'm waiting for the orange bar to chug through, should I expect it to go about twice as fast, or 7 times faster — or is there nothing informative that can be said about this?

Wojciech Nawrocki (Jun 09 2022 at 16:29):

@Stuart Presnell Lean 3 has features to process multiple parts of a single file in parallel, so it might go above 2-3x, but I am guessing it depends on how independent parts of the file are, and it would not be by much. That is, is it a sequence of lemmas where each depends on the previous - not very parallelisable - or a bunch of more independent definitions - maybe it can go above 3x. Where multicore really helps is building whole projects such as mathlib, since then many files can be processed in parallel. Lean 4 processes single files sequentially, so as far as a single elaboration pass is concerned it could not go above 3x, although having more cores still helps the editing experience - we handle changes to the file and various requests such as printing the goal using all the available cores.

Stuart Presnell (Jun 09 2022 at 16:32):

That's a point I hadn't considered: I guess the majority of the lifespan of this computer will be spent on Lean 4 rather than Lean 3, and I have no idea about how that changes things.

Jannis Limperg (Jun 09 2022 at 16:35):

I did some benchmarking when I set up the benchmarking bot and my experience was that the mathlib build does not scale super well beyond 4 cores. 6-8 still got faster, but not by as much as one would expect. Things might have changed since then, of course, but I would prioritise single-core performance if you have a choice.

Patrick Massot (Jun 09 2022 at 16:42):

I think Lean 3 checks proofs of lemmas in parallel, after elaborating their statements (and definitions). A proof cannot depend in another proof in Lean (unfortunately it can in informal maths).

Ruben Van de Velde (Jun 09 2022 at 16:46):

Will lean 4 not be able to do that? That sounds unfortunate

Eric Rodriguez (Jun 09 2022 at 17:08):

Matthew Ballard said:

Eric Rodriguez said:

My m1 pro 16GB macbook with lean compiled to ARM runs something like 2x faster than my desktop with a Ryzen 3600+32GB, and as of a month or two ago could compile mathlib.

Did you notice a performance bump when moving to native from emulated lean?

I never really tried it, it was 50% about the performance and 50% about the fun for me. I'll try next time there's a community version bump.

Wojciech Nawrocki (Jun 09 2022 at 19:25):

Ruben Van de Velde said:

Will lean 4 not be able to do that? That sounds unfortunate

Parallel proof compilation could be added as a future optimization, but I believe it was decided not to add it for now because it introduces lots of engineering complexity and the gain is not as great as it might seem. In practice one cannot just elaborate the statement first and then move on with the proof, as the full statement might depend on the proof. Here is a silly contrived example of where we must descend into the proof term to find out which assumptions to pick up.

variable (a : Nat) (h3 : 3 = a) (h4 : 4 = a)
theorem le_a_3 : 3  a := Nat.le_of_eq h3
theorem le_a_3' : 3  a := Nat.le_of_succ_le (Nat.le_of_eq h4)
/- le_a_3 : ∀ (a : Nat), 3 = a → 3 ≤ a -/
#check @le_a_3
/- le_a_3' : ∀ (a : Nat), 4 = a → 3 ≤ a -/
#check @le_a_3'

Patrick Massot (Jun 09 2022 at 19:34):

To be honest I would prefer Lean doesn't try to guess which variables to include by inspecting the proof term. And non trivial proofs use tactics anyway.

Eric Rodriguez (Jun 09 2022 at 19:44):

yeah, it doesn't work with tactics and such which means it's just a mess 99% of the time

Eric Rodriguez (Jun 09 2022 at 19:44):

and often it doesn't work even without that, I had an example with is_cyclotomic_extension a couple days ago that makes sense but is really annoying

Sebastian Ullrich (Jun 10 2022 at 08:42):

Patrick Massot said:

To be honest I would prefer Lean doesn't try to guess which variables to include by inspecting the proof term. And non trivial proofs use tactics anyway.

No, inspecting the (final) proof term is what makes it work with arbitrary tactics, and is what Lean 4 does in contrast to Lean 3's lexical analysis of the surface syntax. Thus there is no need for an include statement in Lean 4.

Patrick Massot (Jun 10 2022 at 08:43):

Does it mean you chose to avoid needing include rather than allowing parallel proof checking?

Patrick Massot (Jun 10 2022 at 08:44):

Because I would clearly choose the other option (favoring parallel proof checking and using include from times to times).

Sebastian Ullrich (Jun 10 2022 at 08:47):

No, that was a distant secondary reason for removing parallel checking. The main reason was that it's complex to implement, and we wanted to do better than Lean 3's hacky implementation. But as Wojciech, I am somewhat optimistic that it could return in the future, preferably after we measure its potential impact on e.g. mathlib4.

Patrick Massot (Jun 10 2022 at 08:49):

Ok, thanks. That seemed a very weird reason to drop parallel checking. And indeed I may be imagining only that parallel checking is crucial. We'll see when measurement will be available.


Last updated: Dec 20 2023 at 11:08 UTC