Zulip Chat Archive

Stream: general

Topic: Importing mathlib into Lean 4


view this post on Zulip Gabriel Ebner (Feb 06 2020 at 16:05):

I've just come back from the workshop in Bonn where @Sebastian Ullrich patiently answered lots of questions about the Lean 4 API, and I managed to write a small importer from mathlib to Lean 4. This just creates a Lean 4 olean file with all the mathlib declarations (taken from the textual export format), so don't get your hopes up for converting the mathlib source code. There's no notation and no tactics. My main motivation for this importer is to try writing automation in Lean 4. https://github.com/gebner/lean4-mathlib-import

import Import.Mathlib
#check (deriv : (real  real)  (real  real))

It takes about 6 minutes and 13.5 gigabytes of RAM to import the 213 megabyte export file. The resulting Mathlib.olean file is 3.2 gigabytes large (and it doesn't even contain proofs). However importing this 3.2 gigabyte file is surprisingly snappy: it only takes 1.8 seconds on my machine.

view this post on Zulip Sebastian Ullrich (Feb 06 2020 at 16:16):

Any ideas on where the massive blowup is from? Does it compress well?

view this post on Zulip Mario Carneiro (Feb 06 2020 at 16:17):

the large size of the olean file is surprising to me, since tree duplication doesn't explain that

view this post on Zulip Mario Carneiro (Feb 06 2020 at 16:17):

unless it was reduplicated and not merged again

view this post on Zulip Mario Carneiro (Feb 06 2020 at 16:19):

If terms are getting split out into different identical subtrees with different term IDs, it might not compress well because all the addresses are different

view this post on Zulip Gabriel Ebner (Feb 06 2020 at 17:32):

Any ideas on where the massive blowup is from? Does it compress well?

zstd compresses it to 362M, the Lean 3 olean files are 70M uncompressed (including proofs). The blowup is probably in part because I'm not deduplicating the terms (and I guess the writeModule function doesn't either). Every definition duplicates the term, since I replace the names.

view this post on Zulip Sebastian Ullrich (Feb 07 2020 at 08:57):

@Gabriel Ebner It does now! https://github.com/leanprover/lean4/commit/469562d524adf9fefbbfa6904d8a7f24dff7b009 https://github.com/leanprover/lean4/commit/3c5b3cd91f1e059e73dfbc82e042a6eec1bc8fdc

view this post on Zulip Gabriel Ebner (Feb 07 2020 at 09:01):

Oh wow, this moves quickly. I've already added support for Expr.replace, I'll run the import again.

view this post on Zulip Gabriel Ebner (Feb 07 2020 at 09:09):

Now Mathlib.olean is just 89 megabytes, and the importer only requires 2 gigabytes of RAM!

view this post on Zulip Regivan Hugo Nunes Santiago (Nov 18 2020 at 20:34):

Dear Friends, I need to import a theory, but lean returns the following:

file 'Kenny_comm_alg/ideal_operations' not found in the LEAN_PATH

Can you help me?


Last updated: May 10 2021 at 19:16 UTC