Zulip Chat Archive

Stream: lean4

Topic: Lake's package vs lean_lib vs lean_exe


Yuri de Wit (Jun 21 2022 at 01:32):

@Mac , first of all, I am impressed with limited experience with Lake so far. Well done!

Now, I just updated to the latest Lake release and I am a bit confused by the meaning of package, lean_lib and lean_exe. I can´t claim I knew Lake before this change so this may not have anything to do with the latest update per-se.

Here is a little stroll ...

A lake build with an empty lakefile.lean (I explicitly removed its contents after running lake new t) gives me: error: configuration file is missing a package declaration, which is expected.

So I added a package declaration (from the initial lakefile.lean created by lake):

package t {
  -- add package configuration options here
}

Now lake build gives me the following warning:

warning: Package targets are deprecated. Add a `lean_exe` and/or `lean_lib` default target to the package instead.`

So I added a lean_exe (again from the original lakefile.lean content):

@[defaultTarget]
lean_exe t {
  root := `Main
}

And it builds fine, but the package warning is no longer there: is it deprecated or not? And if I removed the package and kept the lean_exe, I get the following error: error: configuration file is missing a package declaration. So it may not be deprecated after all?

Now, if I add lean_libback and have the original lakefile.lean created by the new command, it also builds fine. However, I see no difference in the generated files. Should lean_lib produce a .so or something like that?

I also see that changing the package name in package t breaks the build, so it seems that t maps to a folder named T containing nested .lean files or a T.lean file (or both?). So I guess that gives me the meaning of package!

Next, how do package's are mapped to lean_exe/lean_lib. Do all package's in a lakefile.lean contribute to generated exes/libs? Or is there a way to mix and match?

For now, sorry for the length of this message.

Yuri de Wit (Jun 21 2022 at 01:38):

Btw, after renaming package to an inexistent (say t2 instead of t), I get an error that T is an unknown package, which makes sense. But shouldnt there be at least a warning that the t2 package does not exist?

Yuri de Wit (Jun 21 2022 at 01:39):

Finally, should I instead open an issue in the repo?

Mac (Jun 21 2022 at 01:54):

Yuri de Wit said:

Mac , first of all, I am impressed with limited experience with Lake so far. Well done!

Thanks! :big_smile:

Yuri de Wit said:

I can´t claim I knew Lake before this change so this may not have anything to do with the latest update per-se.

Ah, yeah, this may be the pain point here. All the deprecation warnings were designed to make migration easy, but I can definitely see that they may be confusing if you aren't already familiar with the feature being deprecated.

Basically, before v3.1, each package could only have the equivalent of exactly 1 lean_lib and 1 lean_exe . The settings for these two targets were (and still technically are) embedded into the package configuration. Lake intelligently (maybe a bit too intelligently) deduced the initial configuration of these two targets from the package name (e.g., the lean_lib target is the upper camel case version of the name and the lean_exe target was package name's exactly with a root of Main).

Now, in v3.1, these embedded targets have been deprecated (hence the "Package targets" warning) and the user is expected to separately specify the lean_lib and lean_exe targets they wish to include. However, to make migration less painful (and too try and not break every lakefile previously in existence immediately), Lake will still fall back onto the embedded package targets if no other targets have been specified. This results in all the weird behavior you are observing with files with just a package and without a @[defaultTarget].

Now, to your question:

Yuri de Wit said:

And it builds fine, but the package warning is no longer there: is it deprecated or not? And if I removed the package and kept the lean_exe, I get the following error: error: configuration file is missing a package declaration. So it may not be deprecated after all?

The package declaration itself has not been deprecated only the previous feature of an embedded lean_lib / lean_exe within the declaration has been. A better warning would probably have been "warning: Configuration has a package declaration but no targets. Defining targets within the package declaration has been deprecated. You now need to specify separate lean_exe and/or lean_lib targets to build."

Mac (Jun 21 2022 at 02:01):

Yuri de Wit said:

Now, if I add lean_libback and have the original lakefile.lean created by the new command, it also builds fine. However, I see no difference in the generated files. Should lean_lib produce a .so or something like that?

How are you building the target? Are you using lake build? If so, it will only build by default if you have marked it as a @[defaultTarget]. Furthemore, lean_lib, by default, just builds Lean binary files for its modules (i.e., their .olean and *.ilean). To build a static or shared library for the target you can do lake build <target-name>:static or lake build <target-name>:shared. (See lake help build for more information.)

Mario Carneiro (Jun 21 2022 at 03:19):

one thing that I found difficult was how to do the equivalent of packageFacet := .oleans or whatever it was called

Mario Carneiro (Jun 21 2022 at 03:20):

I see there is a syntax for selecting facets in lake build <stuff> but the syntax is rather impenetrable

Mario Carneiro (Jun 21 2022 at 03:21):

for example, how can I build mathlib *.c files? It isn't normally set up to do this, it's a olean library only, but I used to modify the default facet to make this work

Mario Carneiro (Jun 21 2022 at 03:25):

lake build c   # error: unknown target `c`
lake build +c   # error: unknown module `c`
lake build :c   # error: unknown package facet `c`
lake build Mathlib:c   # error: unknown library facet `c`
lake build Mathlib+:c   # error: unknown target `Mathlib+`

Mario Carneiro (Jun 21 2022 at 03:28):

it doesn't help that I don't really understand what module / library / package facets are

Mac (Jun 21 2022 at 03:46):

@Mario Carneiro there is a direct way to build the .c files of a library target. You can use lake build mathlib:static to build a static library which will, to do so, produce the .c files of the library.

Mario Carneiro (Jun 21 2022 at 03:47):

I see in the documentation a something facet called c. How do I use it?

Mario Carneiro (Jun 21 2022 at 03:48):

MODULE FACETS:
  [default]             build the module's *.olean and *.ilean files
  c                     build the module's *.c file           <--
  o                     build the module's *.o file

Mac (Jun 21 2022 at 03:48):

Mario Carneiro said:

one thing that I found difficult was how to do the equivalent of packageFacet := .oleans or whatever it was called

A library target (i.e., lean_lib), by default, only builds the Lean binary (i.e., .olean / .ilean) files. So no extra setting is needed.

Mac (Jun 21 2022 at 03:49):

Mario Carneiro said:

I see in the documentation a something facet called c. How do I use it?

lake build My.Module:c or, unambiguously, lake build +My.Module:c

Mario Carneiro (Jun 21 2022 at 03:49):

so what if I want to temporarily modify it so that it also produces c files?

Mario Carneiro (Jun 21 2022 at 03:49):

That didn't work above, Mathlib:c says unknown library facet

Mac (Jun 21 2022 at 03:49):

Mario Carneiro said:

so what if I want to temporarily modify it so that it also produces c files?

build a target that requires them (i.e., the static or shared library or an exe using the library).

Mario Carneiro (Jun 21 2022 at 03:50):

+Mathlib:c works

Mario Carneiro (Jun 21 2022 at 03:50):

what is it ambiguous with?

Mac (Jun 21 2022 at 03:51):

Mario Carneiro said:

what is it ambiguous with?

The whole library. That is, Mathlib refers to the lean_lib target whereas +Mathlib refers to the module Mathlib (of the Mathlib library).

Mario Carneiro (Jun 21 2022 at 03:52):

My suggestion would be to have c and o also be library facets, which act like the corresponding module facet for the root module

Mac (Jun 21 2022 at 03:53):

Yeah, seems like a fine idea. I'll add it to the TODO list.

Mario Carneiro (Jun 21 2022 at 03:55):

in the other direction, lean could be a module facet (a synonym for the default), and possibly also static and shared (build whatever it would normally build but only for the selected module)

Mario Carneiro (Jun 21 2022 at 03:56):

if all of the different kind of facets could be unified I think it would make things a lot clearer

Yuri de Wit (Jun 21 2022 at 11:28):

How are you building the target? Are you using lake build? If so, it will only build by default if you have marked it as a @[defaultTarget]. Furthemore, lean_lib, by default, just builds Lean binary files for its modules (i.e., their .olean and *.ilean). To build a static or shared library for the target you can do lake build <target-name>:static or lake build <target-name>:shared. (See lake help build for more information.)

I see: RTFM! :-)

Yuri de Wit (Jun 21 2022 at 12:38):

Mario Carneiro said:

it doesn't help that I don't really understand what module / library / package facets are

I think this is part of my difficulties too, compounded by my own baggage.

From Lake's README:

To create a new package, either run lake init <package-name> to setup the package in the current directory or lake new <package-name> to create it in a new directory.

I initially thought that the <package-name> folder and the contained lakefile.lean were one package containing multiple .lean modules (some of the README content seems to corroborate). The 3.1 changes made it clear that neiter the <package-name> folder nor the lakefile.lean contained there represents one package, but potentially more than one. The lakefile.lean contains the description of one or more packages. I guess this raises other questions related to importing/sourcing specific packages.

A Lean library target defines a set of Lean modules available to import and how to build them.

So is lean_lib what is really a collection of .lean modules instead of the package? If so, how are packages related to libraries?

require mathlib from git
  "https://github.com/leanprover-community/mathlib4.git"@"master"

I am not sure what require here means. Is this sourcing the lakefile.lean from mathlib4.git including all package's, lean_lib's and lean_exe's included there? Or I misundertood and https://github.com/leanprover-community/mathlib4.git"@"master here always means the same set of .lean modules?

Note that I am assuming that a module here means one .lean file that is independent of its Lean namespace.

Mac (Jun 21 2022 at 15:35):

Yuri de Wit said:

Mario Carneiro said:

it doesn't help that I don't really understand what module / library / package facets are

I think this is part of my difficulties too, compounded by my own baggage.

It seems like it may be useful to have a glossary of some terms in the README (e.g., workspace, package, facet, library, module, etc.).

Mac (Jun 21 2022 at 15:44):

Yuri de Wit said:

Mario Carneiro said:

it doesn't help that I don't really understand what module / library / package facets are

I initially thought that the <package-name> folder and the contained lakefile.lean were one package containing multiple .lean modules (some of the README content seems to corroborate). The lakefile.lean contains the description of one or more packages.

At lakefile is meant to represent the configuration of a single package. However, the package can contain dependencies (i.e., through require). Collectively, the root package and its transitive dependencies make up the workspace.

A package itself is a set of build targets (i.e., things one cand lake build <target>) and other configuration options. In addition to Lean modules, the package can contain information on how to configure the server and build external libraries, executables, and more (to come).

To answer your question:

Yuri de Wit said:

I am not sure what require here means. Is this sourcing the lakefile.lean from mathlib4.git including all package's, lean_lib's and lean_exe's included there?

Yes.

Yuri de Wit (Jun 21 2022 at 15:50):

At lakefile is meant to represent the configuration of a single package

I am assuming that there can be only one package directive in a lakefile.lean then, right?

Mac (Jun 21 2022 at 15:53):

Mario Carneiro said:

it doesn't help that I don't really understand what module / library / package facets are

To give a quick explanation, a facet is a part of an otherwise configured target (e.g., module / library / executable) that can be independently built (i.e., through lake build <target>:<facet>).

There is thus some potential confusion between a package facet and a target. After all, targets could simply be considered a facet of some package. This is largely true, and the notion of a package facet is simply a left over of pre-v3.1 where it was not possible to separately configure multiple targets for a package. From v3.1 onwards, the notion of a package facet has been deprecated (and replaced with configurable targets of said package).

Mac (Jun 21 2022 at 15:53):

Yuri de Wit said:

At lakefile is meant to represent the configuration of a single package

I am assuming that there can be only one package directive in a lakefile.lean then, right?

Yep. Doing otherwise will give you and error. :)

Yuri de Wit (Jun 21 2022 at 15:55):

I should have tried before asking ... in any case, why do I need to specify a name in the package directive?

In any case, I think Lake model is much closer than what I thought initially. Thanks for the explanations.

Chris Lovett (Jun 21 2022 at 20:25):

I also found this very educational, would be great to capture some of this info in some helpful VSCode commands to help users manage Lake their packages... if you have ideas on how to do that please add vscode-lean4 issue suggestions.

Yuri de Wit (Jun 21 2022 at 20:44):

I guess, @Chris Lovett , some ideas may follow naturally from this and other discussions. I, at least, am in the understand phase trying to keep my "opinionated" hat far away.

Yuri de Wit (Jun 22 2022 at 17:56):

Found a https://blog.racket-lang.org/2015/08/modules-packages-and-collections.html of how Racket organizes code and I hope it can give some references on how other languages describe these concepts. At least, this is a point of comparison.

A module is the basic unit of functionality for code. Once the code grows we split them over multiple modules. This allows us to organize the source code better, enables separate compilation, etc. Note that a module in Racket dont necessarily mean, afaik, a single .rkt file because you could have more than one file contributing to the same module (using the (module+ ...) form) or even multiple modules within the same .rkt file. Racket does support submodules, afaiu.

A package is a group of modules that you can install together, and that usually provide one piece of functionality. Here is an example of a package in Racket with its collection of modules. I.e packages are units of code distribution.

A collection is a group of modules whose functionality is related to the same topic, for example data structures (the data collection). That is why modules in racket are referenced as <collection>/<module>: e.g. racket/class - the class module from the racket collection. Note that modules from a collection may come from any package (e.g. the hamt package provide a data/hamt containing data structures to the data collection). So, to sum up, collections are units of code classification.

And finally, the term library does not have a technical meaning in Racket. This usually use refers to a package, or to a set of packages that are developed together. E.g. the Rackunit library is split across multiple packages: rackunit, rackunit-lib, rackunit-gui, etc, and you can only depend on individual packages from the Rackunit library.

Yuri de Wit (Jun 22 2022 at 18:12):

Here are my own observations (please, take it with a grain of salt).

The notion of collection is novel to me, at least. It puts front and center the module classification (metadata) as part of the module reference. I see something similar taking place in Haskell, for instance, when same modules use the Data. prefix in namespaces, etc.

One notable difference between Racket and Lean is that all code and forms are always contained in a module: nothing happens without a module name (and corresponding language for the module). In Lean, it seems that code may or may not reside in namespaces.

We tend to use the more low level notion of an exe vs lib (necessary since we do want to produce executables and system libraries), but it seems that for Racket this is just an secondary fact: whether the module(s) have a main function (granted Racket is a dynamic language).

Mario Carneiro (Jun 22 2022 at 19:02):

One notable difference between Racket and Lean is that all code and forms are always contained in a module: nothing happens without a module name (and corresponding language for the module). In Lean, it seems that code may or may not reside in namespaces. This seems to be a problem for Racket because of their language or languages approach: bootstrapping the language to be used starts with a named module.

In lean, all code resides in a module. A lean module is just a file; it has a name that derives primarily from its location on the filesystem (although I think lake can customize that to some extent).

Mario Carneiro (Jun 22 2022 at 19:02):

"namespaces" are not a unit of code organization at all, they are a unit of API organization

Reid Barton (Jun 22 2022 at 20:34):

AFAIK C++ namespaces work more or less the same way as Lean namespaces (although they may not be used the same way in practice); that might be a more fruitful comparison

Mac (Jun 22 2022 at 21:36):

Yuri de Wit said:

Found a https://blog.racket-lang.org/2015/08/modules-packages-and-collections.html of how Racket organizes code and I hope it can give some references on how other languages describe these concepts. At least, this is a point of comparison.

Here is how I would map the concepts from Racket to Lean based on what you've said:

  • A Racket module is cross between a Lean namespace and a Lean module. It can be split across multiple files (like a Lean namespace) but has a single global hierarchy which all code must be a part of (like a Lean module).
  • A Racket package is essentially the same as a Lean package -- both are units of code distribution.
  • A Racket collection is cross between the notion of a Lake library and a Lean module root. The difference is that a Lake library is package specific, so in your Racket example of Rackunit, the rackunit-lib, rackunit-gui, etc. would be separate libraries. However, they could all contribute to the same root (i.e., Rackunit) to form a single collection of modules (e.g., rackunit-gui could contribute the modules under Rackunit.Gui).

Yuri de Wit (Jun 23 2022 at 21:28):

My understanding of Racket's module system was a bit off:

  • there is no concept of a multi-file module: each file is a single module and it can be declared using a short form #lang <language>, which expands to the (module ...) form using the file name as the module name (no paths, just the filename). So a .lean file corresponds to a .rkt module with an implicit #lang lean).

  • Importing other modules can be done using a file path (relative or absolute): e.g. (require "../a.rkt"). In this case, there is no search involved, just a direct file reference. Using relative paths seem to allow for invoking individual modules, but best practices is to avoid relative paths. Lean relies on the full module path, which is nice: a single way of importing whether intra- or extra-package at the small expense of having to fix a root.

  • Or a module can be imported using the package name e.g. (require data), which means that the main.rkt module will be imported, or `(require data/queue) which will import queue.rkt from the data collection. This involves searching the installed packages.

  • The only out-of-the-box way to contribute bindings to another package, is to use collections and only by adding additional modules to the collection not extending/changing existing modules.

Yuri de Wit (Jun 23 2022 at 21:32):

Mario Carneiro said:

"namespaces" are not a unit of code organization at all, they are a unit of API organization

You helped me understand how namespaces are important in Lean. It is a key difference between Racket and Lean since Racket does not support namespaces, only modules, packages and collections, afaiu. Anything like Lean's namespace extensibility would have to be custom coded into an extension protocol.

Yuri de Wit (Jun 23 2022 at 21:33):

Reid Barton said:

AFAIK C++ namespaces work more or less the same way as Lean namespaces (although they may not be used the same way in practice); that might be a more fruitful comparison

Thanks. I will take a look at it. But I think I will explore rust's module system and package management first.

Yuri de Wit (Jun 23 2022 at 21:40):

Mac said:

* A **Racket module** is cross between a **Lean namespace** and a **Lean module**. It can be split across multiple files (like a Lean namespace) but has a single global hierarchy which all code must  be a part of (like a Lean module).

Right, the only thing is that Racket does not support extending namespaces like Lean does.

* A **Racket package** is essentially the same as a **Lean package** -- both are units of code distribution.

Right.

* A **Racket collection** is cross between the notion of a **Lake library** and a **Lean module root**.  The difference is that a Lake library is package specific, so in your Racket example of Rackunit, the `rackunit-lib`, `rackunit-gui`, etc. would be separate libraries. However, they could all contribute to the same root (i.e., `Rackunit`) to form a single collection of modules (e.g., `rackunit-gui` could contribute the modules under `Rackunit.Gui`).

Do you have an example of a single Lean package that produces more than one library or exe? I am likely missing something here.

Yuri de Wit (Jun 23 2022 at 21:53):

And here is an example of a Racket package:

.
├── info.rkt
├── main.rkt
├── scribblings
   ├── base.rkt
   └── bestfit.scrbl

Info.rkt:

#lang setup/infotab

(define version "0.2")
(define collection "bestfit")
(define name "bestfit")
(define deps '("base" "typed-racket-lib" "plot-lib" "plot-gui-lib" "math-lib"))
(define build-deps '("racket-doc" "typed-racket-doc" "scribble-lib" "math-doc" "plot-doc"))

(define scribblings '(("scribblings/bestfit.scrbl" ())))

(Note how #lang here allows for specifying the specific custom language/DSL being used, which is orthogonal to modules, but a selling point in Racket that could also be explored by Lean considering how it can be used to create custom DSLs with custom syntax/notation).

Notice that (define collection "bestfit") defines the collection for the package (fyi: this is a single-collection package). A module can (require bestfit) and import the main.rkt module from the bestfit package or it can (require bestfit/scribblings/base) import the base.rkt module directly.

From Package concepts:

The package name is not mentioned with require, because packages are a way of managing library collections, not a way of referencing them. It is common, however, for a package to implement a collection whose name is the same as the package namein which case a require might appear to be referencing a package, but it is actually referencing a collection provided by the package

Yuri de Wit (Jun 23 2022 at 22:29):

Multi-collection vs single-collection packages:

A single-collection package’s directory doubles as a collection directory. By default, the package name also doubles as the collection name, but if the package has an "info.rkt" file that defines collection to a string, then the string is used as the name of the package’s collection.

A multi-collection package’s directory contains subdirectories, each of which is a collection that is provided by the package (where the directory name is used as the collection name). A multi-collection package must have an "info.rkt" file that defines collection as 'multi.

Mac (Jun 24 2022 at 22:33):

Yuri de Wit said:

Do you have an example of a single Lean package that produces more than one library or exe? I am likely missing something here.

The targets example in the Lake repository is a good toy demonstration. I don't have any practical examples on hand, though.


Last updated: Dec 20 2023 at 11:08 UTC