Zulip Chat Archive

Stream: general

Topic: Saving disk space?


finegeometer (Jul 31 2024 at 17:43):

Hi,

Every time I start a Lean project, mathlib's build artifacts consume another 4.3 GB of my disk space. This adds up really fast, so I'm reluctant to start new Lean projects.

I get it, mathlib is large. But is there at least a way to only spend 4.3 GB total, rather than per project?

If an answer is documented somewhere, I couldn't find it.

Thanks!

Sebastian Ullrich (Jul 31 2024 at 17:56):

I haven't tried this so far, but one relatively simple solution could be to replace the duplicate files with hardlinks using a tool like https://github.com/pauldreik/rdfind. Just make sure it is indeed the same Mathlib version.

Shreyas Srinivas (Jul 31 2024 at 18:07):

I believe the symlink solution was proposed and rejected because something breaks in windows

Sebastian Ullrich (Jul 31 2024 at 20:11):

Don't use it on Windows then. We're talking about a quick workaround here, not something that would be put in Lake as such.

Jon Eugster (Jul 31 2024 at 21:21):

As a more officially (by lake) supported workaround you can also download mathlib once and then have your personal projects with require mathlib from "local" / "path" / "to" / "mathlib" in their lakefiles, that should avoid duplication at the cost of some additional maintainance like having to keep all projects on the same mathlib version and dealing with relative paths between projects.

Shreyas Srinivas (Jul 31 2024 at 21:43):

I do wish lake automatically managed mathlib like that on Linux and macOS, moreso on Linux. I have a smaller Linux partition than windows for other reasons, and this gets filled up very fast by lean

Eric Wieser (Jul 31 2024 at 22:12):

Jon Eugster said:

that should avoid duplication at the cost of some additional maintainance like having to keep all projects on the same mathlib version and dealing with relative paths between projects.

Just to be clear, this cost is not the cost of Jon's workaround; this is the cost of wanting to use one copy of mathlib only, and would be paid by any solution to your problem

Shreyas Srinivas (Jul 31 2024 at 22:31):

Sure. But if someone has, say, 10 projects, and 5 of them are kept on the latest toolchain, then that's 17GB space freed up. So an automated solution is worth it.

Mr Proof (Aug 01 2024 at 01:47):

I have a suggestion to the Mathlib team:

Is it possible to create a version of Mathlib which has all the statements but all set to "sorry" or unproven. Such that the Mathlib library could be used without having the 4GB of compiled proof files.

It seems a bit pointless for day-to-day work having 4GB of proof files. When in reality all we really need is the propositional statements and some verification that the propositional statements have been proved at some point - but the actual proofs themselves are unnecessary for day-to-day work.

Another suggestion is that instead of storing the proof files on the user's computer, the Mathlib propositions could be stamped with some verification code and/or url to show that the statements have been proved. (much like a blockchain?)

i.e. it would be great to separate out the Mathlib propositions from the related proofs, so as to create a lighter library. What's your opinion?

Yury G. Kudryashov (Aug 01 2024 at 04:31):

See https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/Running.20Mathlib.20under.20.60set_option.20debug.2EbyAsSorry.60

Kim Morrison (Aug 01 2024 at 11:07):

debug.byAsSorry is solving a slightly different problem, but yes, we have considered (and may be able to implement in the future) the possibility of splitting oleans into the core part and the "irrelevant proofs" part.

Mr Proof (Aug 02 2024 at 01:21):

That would be great if you get this to work. I did begin, a couple of months ago, an experiment of embedding Lean into a symbolic algebra GUI thing for Windows, but I put that into hibernation for the time being as the mathlib library was just too big to be practical. With some combination of debug.byAsSorry and perhaps sharing the library between projects etc. I might revive it. I'm sure other people would find that useful too. :+1:

Jon Eugster (Aug 02 2024 at 08:28):

Would it be desirable if I wrote some instructions somewhere on the community wiki about how to setup a shared mathlib for your projects with the existing tools? Where would be a good place?

I'd be surprise if lake decided to support any setup with shared packages as there are various small issues that might need user-specific answers, but if there was a "standard way" to achieve that, maybe eventually that gets refined to a point where support by lake is possible.

Jon Eugster (Aug 02 2024 at 11:40):

For what it's worth: leanprover-community.github.io#510 adds the tips and tricks I know about setting up custom lakefiles. See here for the rendered version.


Last updated: May 02 2025 at 03:31 UTC