Zulip Chat Archive

Stream: mathlib4

Topic: partial cache for downstream projects


Jon Eugster (Jan 28 2025 at 23:23):

@Eric Wieser @Michael Rothgang I'll continue here from #21195 (and #20568) since this feels more natural to discuss.

(for context, I want lake exe cache get MyProject.lean to download only the relevant part of the Mathlib cache and both PRs do something in that direction)

@Eric Wieser re the suggested cleanups from #8767, the main hack comes from what's still in this PR:

    -- TODO this should be generated automatically from the information in `lakefile.lean`.
    let mut entries := #[
      ("Mathlib", root),
      ("Archive", root),
      ("Counterexamples", root)
    ]

because I do not know any way to get the currently available libs ("Mathlib", "MyProject", "Archive", etc.) or default-targets from lake... you can get stuff written in the manifest from

match ( Lake.Manifest.load? "lake-manifest.json") with

but that only tells you the names like leanprover-community/mathlib. Do you know of a way to access the lake workspace or reparse the lakefile?

Eric Wieser (Jan 28 2025 at 23:34):

I wonder if we could use the Lean search path instead of trying to parse the lake metadata

Eric Wieser (Jan 28 2025 at 23:35):

I no longer remember why we need this data in the first place; isn't knowing the module name and the file it came from enough? Why do we care which folder it is in?

Jon Eugster (Jan 28 2025 at 23:38):

where do I start looking at that, I've never looked at the search path?

I think the cache does a simple filter where it filters module names Mathlib.X.Y by its prefix Mathlib to decide which library it belongs to. maybe that can be done otherwise

Eric Wieser (Jan 28 2025 at 23:41):

one place you can get the search path is through the LEAN_PATH environment variable, which I believe lake sets

Jon Eugster (Jan 28 2025 at 23:43):

#eval IO.getEnv "LEAN_PATH"

does give something :+1: :thinking:

Eric Wieser (Jan 28 2025 at 23:44):

Right, the question is whether it is set in exe cache

Eric Wieser (Jan 28 2025 at 23:45):

What you actually want is docs#Lean.addSearchPathFromEnv which does the parsing for you

Eric Wieser (Jan 28 2025 at 23:50):

I tested it and it seems to work

Jon Eugster (Jan 29 2025 at 00:03):

Ah and I think docs#Lean.findLean is very informative on how to do this properly, this gives a mapping "module name" --> filePath

Jon Eugster (Jan 29 2025 at 00:05):

(Lean.findLean)

Damiano Testa (Jan 29 2025 at 00:09):

I do not know if this will help, but scripts/mk_all.lean contains the function getLeanLibs that returns all the libraries on which the current project depends.

Damiano Testa (Jan 29 2025 at 00:10):

mk_all uses this to generate Archive.lean and Counterexamples.lean (or rather, to "know" that Archive and Counterexamples exist).

Jon Eugster (Jan 29 2025 at 00:14):

That is indeed huge :D May or may not be needed for the cache, but regardless I've certainly wished for a tutorial to get a Lake.Workspace in a script, thanks! I'll try these things out tomorrow

Damiano Testa (Jan 29 2025 at 00:15):

I had asked about this on Zulip when working on mk_all and then a combination of Mac, Mario and Kim wrote that function and bumped it when it stopped working.

Damiano Testa (Jan 29 2025 at 00:15):

The conclusion should be: it works, but I do not know why... :smile:

Damiano Testa (Jan 29 2025 at 00:18):

If you search "getLeanLibs" on Zulip you get a few hits, but all pertinent.

Jon Eugster (Jan 29 2025 at 09:16):

While I'm on it, would it be a bad idea to change the syntax for cache to lake exe cache get Mathlib.Algebra.Field?

(instead of the current lake exe cache get Mathlib/Algebra/Field/*.lean)

Damiano Testa (Jan 29 2025 at 09:17):

I would rather it accepted both.

Damiano Testa (Jan 29 2025 at 09:17):

From the command-line it is very convenient to use auto completion and you only get that with paths.

Kevin Buzzard (Jan 29 2025 at 09:33):

Right now these fail:

lake env perf stat lean Mathlib.AlgebraicGeometry.ProjectiveSpectrum.Proper
lake build Mathlib/AlgebraicGeometry/ProjectiveSpectrum/Proper.lean

and these succeed:

lake env perf stat lean Mathlib/AlgebraicGeometry/ProjectiveSpectrum/Proper.lean
lake build Mathlib.AlgebraicGeometry.ProjectiveSpectrum.Proper

Damiano Testa (Jan 29 2025 at 09:39):

I would consider it a great improvement if you could always use the path, even in import ... statements!

Damiano Testa (Jan 29 2025 at 09:40):

I have my own function so that lb xxx does lake build and then converts xxx to whatever the right format is.

Johan Commelin (Jan 29 2025 at 09:47):

#!/usr/bin/env bash

set -euo pipefail

# Replace every '/' with '.'
modified_args="${@//\//.}"

# Remove every '.lean' suffix
modified_args="${modified_args//.lean/}"

# Call 'lake' with the modified arguments
lake $modified_args

Kim Morrison (Jan 29 2025 at 10:31):

This is #2756. It's on @Mac Malone's radar still, I hope.

Yaël Dillies (Jan 29 2025 at 13:08):

lean4#2756

Jon Eugster (Jan 30 2025 at 16:28):

A prototype is here now #21238 and while it works fine locally and downloads cache as expected, in CI it looks like it does not find the cache files. If there are CI wizards who know how to investigate here, I would appreciate the help. It might just be something with relative paths


Last updated: May 02 2025 at 03:31 UTC