Zulip Chat Archive

Stream: general

Topic: Rethinking elan toolchain management


Sebastian Ullrich (Mar 21 2023 at 12:27):

Motivated by recent discussions and reports of elan footguns, I have been thinking about drastically simplifying elan's workings and the necessary mental model for users by removing any kind of state that a) is not directly recorded in the affected repo/project and b) is not reproducible from that record. In detail, this would mean

  • removing overrides (elan override) in favor of temporary changes to the lean-toolchain file (or leanpkg.toml in Lean 3, this part should not need to be touched)
  • removing the notion of a default toolchain (as introduced by the installer and elan default); new projects would have to be created with an explicit version reference such as leanpkg +stable new or lake +leanprover/lean4:nightly new (at some point we will switch the default prefix from Lean 3 to Lean 4, I assume). Stand-alone files could be compiled in the same way; if desired, editors could provide a version selector on opening a stand-alone file (vscode-lean4 had code for something like this at one point, but it was removed precisely because it did not adhere to condition a) from above).
  • removing "channel" toolchains such as "stable" and "nightly" as installable toolchains; instead, on the commandline they are interpreted as mere references to a proper version while in lean-toolchain they are not accepted at all. The reference is not cached locally but looked up on every use, making sure that you actually get the latest nightly when invoking lake +nightly new. This would also solve the annoyance of lake ... new && lake build downloading two separate toolchains.

My main questions at this point are:

  • Are there any use cases not covered by the above, especially for special applications of elan such as in teaching?
  • Is there anyone interested in implementing these changes (NB, elan is written in Rust)? Because I'm not necessarily.

Arthur Paulino (Mar 21 2023 at 13:01):

Being explicit about the toolchain on lake new/lake init/leanpkg new is pretty annoying, and I think it's particularly hard on mathematicians who are new to CLIs.

Alternatively, the "default" toolchain could be picked up on the fly, as the one that was installed most recently. Or the one that was released most recently.

(of course, the user should be able to be explicit about it, but having a decent fallback policy would make things easier)

Ruben Van de Velde (Mar 21 2023 at 13:03):

Maybe the new command could show typical options to choose from, or even hardcode to something sensible

Jireh Loreaux (Mar 21 2023 at 13:19):

I agree with Arthur. While I personally wouldn't mind being forced to specify the toolchain, this would be a burden to mathematicians new to Lean who want things to Just Work out of the box.

Jannis Limperg (Mar 21 2023 at 13:34):

If lake new gives a friendly error message with "here are your options; when in doubt pick stable", that shouldn't be a dealbreaker imo. For mathlib-dependent projects, lake new math should anyway copy the mathlib lean-toolchain.

In general, I like this proposal. For our teaching, we've always used an explicit lean-toolchain and nothing else (precisely to minimise the potential for confusion).

Jireh Loreaux (Mar 21 2023 at 13:48):

Yeah, a helpful error should be sufficient.

Kevin Buzzard (Mar 21 2023 at 13:54):

Let me add to the mathematician's chorus: leanproject new my_project was a game-changer (speaking as someone who remembers when that didn't exist). It just means "make me a new project with today's mathlib as a dependency" which is exactly what a mathematician wants half the time; the other half of the time they want leanproject get <URI>. In both cases all relevant mathlib oleans are fetched too and this is crucial. I think most undergraduate mathematicians would want lake versions of those two use cases and pretty much nothing else, really. Remember that they just spent 5 minutes trying to remember how to open a command line (and typically have no idea which directory they're in either).

Perhaps another alternative is to have some simple buttons on the VS Code extension which give those two simple options, and you lake enthusiasts can have whatever you like.

Jeremy Avigad (Mar 21 2023 at 14:07):

I agree that it helps to have a sensible default for leanproject new. Soon we won't have to worry about distinguishing Lean 3 and Lean 4 (because we'll all be using Lean 4).

I have heard some people worry about all the toolchains that accumulate over time and take up space. We can see them with elan show. Would an elan purge or elan clean help people delete old ones every once in a while, or help them remember to do that? Space is cheap, so maybe it's not worth the bother.

Sebastian Ullrich (Mar 21 2023 at 14:22):

Yes, setting a default channel for lake new is something we can still do without contradicting the goals of the proposal. I hope that in the near future, we will all simply be using sufficiently frequently-released Lean 4 stable versions. Perhaps, and especially in the mean time, we can even keep the current defaulting mechanism but with the reference semantics outlined above.

Sebastian Ullrich (Mar 21 2023 at 14:24):

Jeremy Avigad said:

I have heard some people worry about all the toolchains that accumulate over time and take up space. We can see them with elan show. Would an elan purge or elan clean help people delete old ones every once in a while, or help them remember to do that? Space is cheap, so maybe it's not worth the bother.

There now is https://github.com/JLimperg/elan-cleanup, it would be good to have that integrated into elan at some point

Arthur Paulino (Mar 21 2023 at 14:26):

Idea: elan default-policy with options:

  • elan default-policy recent-install
  • elan default-policy recent-release
  • ...

Sebastian Ullrich (Mar 21 2023 at 14:28):

Please no, then we're back to lake new using random old nightlies

Arthur Paulino (Mar 21 2023 at 14:30):

Ah, then you'd want to stick with the most recently released, always? I like that option

Sebastian Ullrich (Mar 21 2023 at 14:31):

No, most recently released in the hard-coded or configured default channel. Most recently installed is completely non-deterministic.

Sebastian Ullrich (Mar 21 2023 at 14:32):

When two people talk about running lake new on the same day, they should get the same result, or they will talk past each other like we've seen it happen before

Sebastian Ullrich (Mar 21 2023 at 14:32):

I almost suggested having an interactive version selector in cmdline elan (also when invoked as lake new) in the original message, but I agree with Kevin in that that time would be better spent on improving that workflow in the extension.

Sebastian Ullrich (Mar 21 2023 at 14:39):

Jannis Limperg said:

For mathlib-dependent projects, lake new math should anyway copy the mathlib lean-toolchain.

Note that there is an interesting bootstrapping issue here: unless we teach elan about new math, we still have to download some toolchain first in order to obtain a Lake binary that then copies the lean-toolchain, which likely refers to a different toolchain

Jannis Limperg (Mar 21 2023 at 15:58):

Sebastian Ullrich said:

Note that there is an interesting bootstrapping issue here: unless we teach elan about new math, we still have to download some toolchain first in order to obtain a Lake binary that then copies the lean-toolchain, which likely refers to a different toolchain

Oh yeah, I didn't consider this. I would special-case this in elan. It's a major hack but the alternatives seem worse.

Arthur Paulino (Mar 22 2023 at 12:33):

Oh, btw, since we're here, there's an issue in which elan gets to a point I've never been able to make it recover from. It happened to me like 2 or 3 times in the past. From what I remember, it has something to do with manually deleting some toolchains and then elan doesn't know what to do if the default toolchain had been deleted. This may be the wrong hypothesis, but maybe someone else can explain it better.

The way that I got out of those situations was simply nuking the elan folder and reinstalling it from scratch. Can this redesign avoid that bad state?

Kevin Buzzard (Mar 22 2023 at 13:21):

One gotcha is that if you manually delete a toolchain from .elan/toolchains/ then you must also delete the corresponding file from .elan/update-hashes or else you end up with a borked system.

Mac Malone (Mar 23 2023 at 16:53):

Sebastian Ullrich said:

Yes, setting a default channel for lake new is something we can still do without contradicting the goals of the proposal. I hope that in the near future, we will all simply be using sufficiently frequently-released Lean 4 stable versions.

On my roadmap for Lake is having a lake install to install global packages into the toolchain (much like npm install -g). This would allow for easy installation of Lake-based CLI tools which, using JS as a reference point, become quite common as the package ecosystem grows. While some such CLI tools are package-specific and thus do not need a global toolchain, others do desire one. Thus, in order for this to be feasible, there needs to be some system-wide toolchain accessible for out-of-project CLI use. Therefore, I would greatly hope to retain the notion of a default toolchain.

Gabriel Ebner (Mar 23 2023 at 16:55):

Those globally installed tools should probably be built using the toolchain that they specify, not with whatever the user happens to have set up as the default.

Sebastian Ullrich (Sep 11 2023 at 08:29):

Sebastian Ullrich said:

  • removing "channel" toolchains such as "stable" and "nightly" as installable toolchains; instead, on the commandline they are interpreted as mere references to a proper version while in lean-toolchain they are not accepted at all. The reference is not cached locally but looked up on every use, making sure that you actually get the latest nightly when invoking lake +nightly new. This would also solve the annoyance of lake ... new && lake build downloading two separate toolchains.

This is now up at https://github.com/leanprover/elan/pull/106. The following toolchain references will (at the time of writing) resolve to the same toolchain, and not install it multiple times:

stable
4.0.0
v4.0.0
leanprover/lean4:<any of the above>

Sebastian Ullrich (Sep 11 2023 at 08:31):

I don't have any plans anymore to get rid of the notion of default toolchain, so with the default settings of elan default stable, lake new by itself will do the sensible thing of giving you the most recent stable.

Mario Carneiro (Sep 11 2023 at 13:17):

Does this mean that I can't run lake new without an internet connection?

Sebastian Ullrich (Sep 11 2023 at 13:53):

In that case you will have to think about exactly which of your local toolchains you want to use anyway, so do that and then use e.g. lake +4.0.0 new, which will work

Sebastian Ullrich (Sep 20 2023 at 16:34):

@Mario Carneiro Does this sound acceptable to you (also for your question on GH)? We could think about possibly caching the toolchain lookup for a while but I would really like to avoid that complexity if there are no sufficient use cases

Mario Carneiro (Sep 20 2023 at 19:28):

I think the command should suggest that if network connection fails

Mario Carneiro (Sep 20 2023 at 19:30):

ideally it should be able to determine exactly one local toolchain which is preferred too, not just give a list of all your toolchains. For example, suggest the most recent toolchain on the selected release track

Sebastian Ullrich (Sep 21 2023 at 07:39):

Yes, I wanted to improve the error message before merging. Storing a preferred version would be an interesting weak version of the aforementioned cache but I think I'd wait on user feedback before implementing that


Last updated: Dec 20 2023 at 11:08 UTC