Zulip Chat Archive

Stream: lean4

Topic: lean 4 hackers


James King (Nov 15 2021 at 02:43):

I pushed the new guide up tonight: https://agentultra.github.io/lean-4-hackers/

It's still a work in progress! I'm a push early, push often kind of person. I'll be adding more content following the style of the Lean 3 for Hackers guide with some new programs and different takes.

Thank you to the Lean community and the developers :pray: Your enthusiasm and expertise are inspiring!

Wojciech Nawrocki (Nov 15 2021 at 02:53):

Nice! Two comments:

  • One should install leanprover/lean4:nightly, stable is very outdated at the moment.
  • We are moving away from leanpkg, so the instructions would ideally point people towards lake. It has the same sort of command structure, but it's a different build system.

James King (Nov 15 2021 at 02:53):

Oh sweet, thanks for the update!

James King (Nov 16 2021 at 02:41):

Updated the guide and added a new section. :+1:

Paul Chisholm (Nov 16 2021 at 08:58):

A couple of comments:

  • You say "Note that the .gitignore file may need to be modified to contain: ...". Lake always adds '/build' to the '.gitignore' file so I don't think that modification is ever necessary.
  • Might also mention the 'lakefile.lean' and 'lean-toolchain' files created by Lake.
  • I am using a MacOS so I don't know if this is system dependent, but all three build folders have '.trace' files in them. Possibly worth mentioning them if not system specific. I have no idea what they are for.

Anders Christiansen Sørby (Nov 17 2021 at 20:43):

@James King You should also include stuff about the nix setup. I'm all in for that and use nix for building all software. Nix offers a plug and play experience for any software project which is really nice. It builds all packages deterministically and content addressed which means caching and distribution of builds is super easy.
Take a lot at the experimental template I set up here https://github.com/yatima-inc/nix-utils/tree/main/templates/lean-package
More work is needed, but this is a working deterministic build system

Scott Morrison (Nov 17 2021 at 21:01):

To be honest, I'd prefer to keep discussion of nix out of any getting started and new user guides.

Scott Morrison (Nov 17 2021 at 21:01):

In my understanding, nix is great until you have aspirations to work on all platforms. The Lean ecosystem needs to work painlessly on windows and macos, and hopefully all getting started guides can be written as platform agnostically as possible

Anders Christiansen Sørby (Nov 17 2021 at 21:10):

Yeah. There is nothing about nix that is very platform dependent really, but the windows support is not good (WSL is no problem though). MacOS support is okay for many things I think. I use NixOS myself.

Anders Christiansen Sørby (Nov 17 2021 at 21:11):

Nix should in theory be the best platform agnostic build tool, but not enough effort has been put into this yet

Mario Carneiro (Nov 17 2021 at 21:15):

I have never used nix, but it is not clear to me what advantage it brings for people who don't want to move their entire development process into its framework. Installing stuff works just fine for me, I don't have configuration bugs on a regular basis so repeatable builds is only a problem for software distributors and CI maintainers, not me (unless I hit on a configuration bug of course)

Mario Carneiro (Nov 17 2021 at 21:18):

If nix is a "build tool" then why should users need to care about it? If they just get the resulting executable then no build process is needed

Mario Carneiro (Nov 17 2021 at 21:18):

if you mean the build process for lean projects, that's managed by lake not nix

Anders Christiansen Sørby (Nov 17 2021 at 21:25):

There is a nix build as well which works very fine. I use this for all my lean projects.
You need to try it to experience the benefits. It is not perfect, but since I started using it I've never wanted to go back.
Nix is just as nice for one package as it is for the entire system.
It is not about installing stuff as much as providing a very configurable content addressed environment. Usually you express all dependencies in a file.
It has been especially useful for FFI since it can load and build any package.

Anders Christiansen Sørby (Nov 17 2021 at 21:29):

The thing nix does that for example lake doesn't is provide the content addresses lean binary and Lean packages for the build which can be set to any value. You don't need elan because nix provides lean already.

James King (Nov 18 2021 at 00:07):

I'd consider adding a separate page off the main one for nix at least. I'm not a nix user myself but I know quite a few hackers who are. A PR would be more than welcome as I'm not sure I should be writing about nix. :)

Mac (Nov 18 2021 at 03:48):

Anders Christiansen Sørby said:

The thing nix does that for example lake doesn't is provide the content addresses lean binary and Lean packages for the build which can be set to any value. You don't need elan because nix provides lean already.

I do not get this statement. Currently, everything lake builds and uses is content addressed -- that is what the .trace file is -- the content hash for fetching the target in question.

Mac (Nov 18 2021 at 03:49):

Also, even without Lake, every Lean package is content addressed because they are Git repositories and thus the commit hash serves as a content address.

Gabriel Ebner (Nov 18 2021 at 09:47):

To be honest, I'd prefer to keep discussion of nix out of any getting started and new user guides.

I can only agree with this statement. The nix build is a wonderful tech demo. I would have never thought that you could use nix on this level of granularity and that it would be as fast as Sebastian made it.

That said, it splits the ecosystem. The nix build configuration is completely separate from the lake one. You effectively need to maintain (and test) two build configurations. Otherwise we get into a wonderful version of hell: "yes we have a formalization of reals and a formalization of linear algebra. But one uses nix and the other uses lake, so no you can't use them together."

I think that the nix build should only be used in production once we have a reliable lake2nix tool. And then the lakefile should always come first.

Gabriel Ebner (Nov 18 2021 at 09:57):

which means caching and distribution of builds is super easy.

While this sounds like a neat solution on first glance, I don't think this will work for formalized mathematics (where we have multi-hour build times). For the current programming projects this is much less of a necessity. Who needs caching if a fresh build only takes a minute?

The reason I don't think it will work for math is because nix (by design) only reuses the cache if it was produced by the same Lean binary, with the same dependencies, etc. (Fun fact: changing a comment in the build script for the glibc package will rebuild all of nixpkgs.) In particular, this means you can't reuse the cache from Linux/amd64 on Darwin/aarch64 (even though both are supported by nixpkgs). Even ignoring windows, this means we need to build mathlib a couple of times, once for each architecture.
This is simply unacceptable, we are already at the limit of CI capacity. Compiling mathlib multiple times is out of the question.

Johan Commelin (Nov 18 2021 at 10:00):

At least not nightly (or even PRly) builds. I guess we could build a monthly nix cache, or so.

Sebastian Ullrich (Nov 18 2021 at 10:43):

Gabriel Ebner said:

In particular, this means you can't reuse the cache from Linux/amd64 on Darwin/aarch64 (even though both are supported by nixpkgs).

I'm not volunteering :sweat_smile: , but I have long wondered what a general solution to this issue could look like. Perhaps with content addressing activated, one could simply tell Nix "yeah, this derivation for macOS will definitely be realised into this store path produced by the Linux derivation, just trust me on that". I don't think Cachix supports storing and querying the realisation map yet though, it's still early days regarding CA.

Sebastian Ullrich (Nov 18 2021 at 10:46):

And then every month or so you might want to --check that invariant on the actual platforms

Anders Christiansen Sørby (Nov 18 2021 at 11:17):

I see your points. Didn't mean to hijack this as a nix discussion. Let's continue in the "Nix builds" topic.

Gabriel Ebner (Nov 18 2021 at 11:41):

one could simply tell Nix "yeah, this derivation for macOS will definitely be realised into this store path produced by the Linux derivation, just trust me on that"

I don't think such an option would ever make it past Eelco. He's already super unhappy about the cargo-vendor stuff (i.e., that we use a fixed-output derivation for the automatically generated vendor directory, instead of checksums for the individual upstream tarballs). Intentional impurity seems even less likely to be accepted.

James King (Nov 21 2021 at 19:52):

Alright, updated the guide again and added our first real-world program: an implementation of the wc program in Lean 4. :tada:

Varun Gandhi (Nov 23 2021 at 03:06):

Hey folks. I was wondering if a contribution for supporting Ninja as the generator would be accepted. Currently, only the Unix Makefiles generator is supported. (Related question: I noticed files under stage0/ being marked as auto-generated, and they largely seem to be the same as the files under src/ -- how do I trigger that generation?)

Wojciech Nawrocki (Nov 23 2021 at 03:28):

@Varun Gandhi The Lean devs are very pragmatic, so an additional generator is only likely to be accepted if the benefit of adding it outweighs the burden of code review and build breakage and maintenance. What would the benefit of Ninja be? Note that Lean already builds on all supported platforms (with MinGW on Windows).

For updating stage0/, you can use scripts/update-stage0. Lean 4 is a self-compiling compiler, and the stage0/ files serve to bootstrap it.

Varun Gandhi (Nov 23 2021 at 05:50):

The benefits of Ninja are (1) faster incremental builds, especially when working on leaf-y files, (2) ninjatracing (https://github.com/nico/ninjatracing) to profile builds and (3) built-in functionality to query targets via different subtools.

As an example, I did some preliminary measurements with Clang (files recompiled/total files), since that is a project I'm familiar with:

  • When changing a leafy file (e.g. 20/4300): 3.7s (Ninja) vs 7.0s (Make) -- To me, this is the most compelling point.
  • When changing a file much deeper (600/4300), or doing a full build (4300/4300): They're about the same with Ninja being 1~2% faster. This is not a big deal IMO: the key bit is Ninja is not slower.
    When I worked on Swiftc, we did not support make at all, because ninja is ~better along nearly all dimensions, with the incremental builds being the biggest win.

That said, I haven't actually implemented the change for Lean (I wanted to ask first), so I don't know how much the incremental build benefits will carry over.

Thanks for the pointer in the code.

Mac (Nov 23 2021 at 06:52):

@Varun Gandhi the problem with this idea is that primary bottleneck for the Lean dev build is not the CMake build but the rather the bootstrap Lean build which would be unaffected by the change in the CMake generator.

Varun Gandhi (Nov 24 2021 at 02:02):

@Mac : Just to make sure I'm correctly understanding what you're saying, the slow part of incremental builds is due to src/lean.mk.in, is that right?

Mac (Nov 24 2021 at 05:52):

@Varun Gandhi yes, though it is more due to the inefficiency of Lean builds (due to transitive import dependencies) than the Makefile / make itself

James King (Nov 29 2021 at 00:51):

Updated the WordCount example program and benchmarks, thanks for all of the feedback from here and on Github and Reddit. :tada:


Last updated: Dec 20 2023 at 11:08 UTC