Zulip Chat Archive

Stream: general

Topic: Lean standard library vision discussion


Markus Himmel (Jan 27 2025 at 09:39):

This is a discussion thread for #announce > The Lean standard library: vision and call for contributions

Yaël Dillies (Jan 27 2025 at 10:07):

Nitpick, but shouldn't

we are mostly interested in contributions that expand upon existing material rather than introducing novel concepts

rather read

we are mostly interested in contributions that expand upon existing material rather than introduce novel concepts

Eric Wieser (Jan 27 2025 at 10:12):

In terms of the practicality of actually contributing to Std:

With Batteries, I can check out the repo, run lake build, and things are ready to go in a minute or so. With Mathlib, the same is true thanks to cache get. When I've made contributions to Lean 4 itself, I've found it often takes more than 15 minutes to get to a point where I have a lean server running, as everything has to be built from scratch.

Are there tricks to shortcut such a cold build? Is this a place where nix can help?

Markus Himmel (Jan 27 2025 at 10:14):

Yaël Dillies said:

Nitpick, but shouldn't

we are mostly interested in contributions that expand upon existing material rather than introducing novel concepts

rather read

we are mostly interested in contributions that expand upon existing material rather than introduce novel concepts

I'm not a native speaker, but a quick internet search seems to suggest that both are correct.

Eric Wieser (Jan 27 2025 at 10:18):

I think it's a very minor stylistic point; as a native speaker "contributing/ introducing" or "contribute / introduce" would both be fine, but the mixture is very slightly imbalanced. I think there are more interesting things to discuss about Std though!

Markus Himmel (Jan 27 2025 at 10:23):

Eric Wieser said:

Are there tricks to shortcut such a cold build? Is this a place where nix can help?

I'm not aware of anyone working on something like stage1 olean caching. stage0 benefits from C being generally fast to compile and high ccache hit rates, but as far as I am aware at present the main way to get fast stage1 builds is a beefy machine.

Eric Wieser (Jan 27 2025 at 10:25):

Perhaps I didn't configure ccache correctly in the gitpod config; or maybe the cache only ends up hot if you're working on Lean full time!

Somo S. (Jan 27 2025 at 10:58):

I'm curious, what's the difference between "software verification and verified software development" (referenced in the visiou? what's an example of one vs. the other?

Markus Himmel (Jan 27 2025 at 11:06):

By "verified software development" I mean writing a piece of software in Lean and then proving (in Lean) that it is correct. An example of this is the Cedar project. By software verification I mean using a tool like Aeneas to prove in Lean that a piece of software written in a different programming language is correct.

François G. Dorais (Jan 27 2025 at 14:56):

Eric Wieser said:

In terms of the practicality of actually contributing to Std:

With Batteries, I can check out the repo, run lake build, and things are ready to go in a minute or so. With Mathlib, the same is true thanks to cache get. When I've made contributions to Lean 4 itself, I've found it often takes more than 15 minutes to get to a point where I have a lean server running, as everything has to be built from scratch.

For small contributions, I think it's a good idea to stage into Batteries first. In addition to fast builds:

  • The review process is quicker and not as fussy (no RFC approvals).
  • The contribution will be available right away rather than waiting for the appropriate Lean release.
  • Mathlib adaptations can be made right away, which simplifies work for Lean maintainers.
  • The upstreaming to Lean is often a trivial copy paste and can even be handled by Lean maintainers or more experienced users with beefy machines.
  • It provides a platform for live testing before submitting to Lean.

Markus Himmel (Jan 27 2025 at 15:11):

Yes, making it easy to extend existing standard library concepts with new operations and theorems in downstream projects (independently of whether there are plans to have the material upstreamed eventually) is one of the goals of the standard library.

Of course, different repositories will have different contribution processes, review styles, coding practices, etc., so I expect that there will be varying preferences about whether to contribute to Lean directly or go through a project like Batteries first, just like some material for mathlib is developed downstream of mathlib first while other material is contributed to mathlib directly.

Edward van de Meent (Jan 27 2025 at 16:08):

in that case, it sounds like it might be useful to make it clear which kind of suggestion should use which process. Particularly, i imagine it is not beneficial to have a RFC/PR to Std and a PR to Batteries open for the same thing... or at least not when they don't mention eachother?

Markus Himmel (Jan 27 2025 at 16:24):

I think it's more about what your goals are. If your goal is to see your code merged into the standard library, then you should follow the processes laid out in the call for contributions, i.e., discuss your possible contribution with someone from the standard library team and if we agree that your contribution would make a good addition to the standard library then we will invite you to open a PR. For this, it does not matter where the code was before that, if you write it directly inside of your fork of the lean4 repository, if you had it in your own downstream project or if you contributed it to Batteries before to get some initial feedback.

Jireh Loreaux (Jan 27 2025 at 16:46):

I'm confused by this sentence:
Markus Himmel said:

we will be adding a style guide and naming scheme (for the standard library only)

Specifically about the naming scheme: the standard library is going to have a different naming scheme from everything else? Maybe I misunderstand what is meant by this sentence.

Markus Himmel (Jan 27 2025 at 16:49):

At the moment the Lean codebase does not have a codified naming convention at all! Soon, at least the public parts of standard library will (and of course it will aim to be compatible with mathlib's naming convention).

Jireh Loreaux (Jan 27 2025 at 16:54):

Perhaps we mean something different by naming convention. Maybe what I'm referring to would be better called a naming scheme. The scheme I am referring to is documented here. I realize this is a leanprover community webpage, but I was under the impression that the naming scheme documented there was a consequence of decisions made in core Lean.

My main question is: will that scheme be changing for the standard library?

Markus Himmel (Jan 27 2025 at 16:55):

Nothing will change. The only thing that will happen is that the lean4 repository will contain a file with basically the same content as the file you linked to.

Kim Morrison (Jan 27 2025 at 21:54):

(Or, if anything will change, it's that both documents will improve as we discover missing coverage, and discuss with the Mathlib naming experts how to add consistent conventions for these.)

Violeta Hernández (Jan 28 2025 at 08:05):

Didn't we already have a standard library which got turned into Batteries? What's the difference here?

Markus Himmel (Jan 28 2025 at 08:12):

"Standard library" is the term we use for a certain part of the Lean distribution, as defined in the document. Batteries is a community project which builds on and extends the standard library and other parts of the Lean distribution to provide additional material which is not present in Lean core.

Eric Wieser (Jan 28 2025 at 10:36):

Just to be super clear, this is referring to Std, right?

Markus Himmel (Jan 28 2025 at 10:45):

As described in the vision document, the standard library does not correspond to a certain directory in the lean4 repository or a certain namespace. Rather, the standard library is defined by what is listed in the standard library outline (which is also part of the vision document).

Eric Wieser (Jan 28 2025 at 10:49):

I think explicitly mentioning that the scope is larger than / orthogonal toStd in the document would aid clarity

Markus Himmel (Jan 28 2025 at 10:50):

Is "Not all public APIs in the Lean distribution are part of the standard library, and the standard library does not correspond to a certain directory within the Lean source repository." not that?

Eric Wieser (Jan 28 2025 at 10:51):

"a certain directory (e.g. Std)" would be compatible with Ctrl+F :)

Ruben Van de Velde (Jan 28 2025 at 10:51):

That is, batteries could no longer be called "standard" to avoid confusion with this library

Yaël Dillies (Jan 28 2025 at 10:51):

Is Std a subset of the standard library?

Yaël Dillies (Jan 28 2025 at 10:52):

It's confusing that "the standard library" doesn't mean Std given that Std is an abbreviation of "standard". Maybe you should rename Std?

Markus Himmel (Jan 28 2025 at 10:54):

Ruben Van de Velde said:

That is, batteries could no longer be called "standard" to avoid confusion with this library

If I'm not mistaken the batteries project has been using the term "extended library" for a while now.

Markus Himmel (Jan 28 2025 at 10:57):

Yaël Dillies said:

It's confusing that "the standard library" doesn't mean Std given that Std is an abbreviation of "standard". Maybe you should rename Std?

I will make an amendment to the document that tries to address this confusion. I think it is reasonable for parts of the prelude like Nat to be considered part of the standard library while still providing additional material (like containers or the date and time library) that we would expect people to explicitly import as part of a Std namespace.

Niels Voss (Jan 28 2025 at 17:48):

Maybe you could provide some examples as to what constitutes part of the standard library and what doesn't?
The way I had previously thought about it was that everything that didn't require an input was part of the "Init library" and everything that shipped with the lean distribution but required an input was part of the standard library. But it seems this isn't quite true?

Basically, what I'm asking is if the standard library doesn't correspond to the files in Std, then what does it correspond to?

Niels Voss (Jan 28 2025 at 17:57):

Oh I see the outline now, which clears some things up. I still think that "Basic Types" is a bit ambiguous.

My concern is that a Lean user might want to make an API that depends only on the standard library, (maybe because it's less likely to make backward incompatible changes?) but doesn't know what's part of the standard library or not. For instance, is unsafe stuff like lcProof part of the standard library? What about ST or SizeOf? Or does the standard library not necessarily correspond to the parts of Lean which can be considered stable, and users wouldn't need to know what's part of the standard library?

Markus Himmel (Jan 28 2025 at 18:29):

There are stable public APIs beyond the standard library. We are aware that Lean currently does a poor job of telling users which declarations (in all namespaces) are considered public or even stable and which are not. We will improve this.

Yury G. Kudryashov (Jan 28 2025 at 22:47):

Currently, the document says "std library is a subset of lean distribution; there is no simple way to tell what files/definitions/theorems belong to it". WDYT about adding some way to figure this out without asking the developers? E.g., add a sentence "This file is a part of the standard library" to each relevant module doc string?

Markus Himmel (Jan 29 2025 at 06:14):

Yes, this is a good idea, thanks! The detailed outline and the reference manual will also eventually contain this information down to the level of individual types. So there will be plenty of ways.


Last updated: May 02 2025 at 03:31 UTC