Zulip Chat Archive
Stream: Is there code for X?
Topic: Restore information from olean files
Yicheng Tao (Nov 14 2024 at 06:26):
I wonder if there is a way to restore information from the olean files. For example, a map from module to constants defined in that module. I found PersistentEnvExtension
that may be relevant, but I don't know how to work with it. Is there any tutorials for such kind of problems?
Kim Morrison (Nov 14 2024 at 08:35):
Essentially everything is stored in the Environment
. You can use withImportModules
to start a session with a specified set of oleans loaded.
Kim Morrison (Nov 14 2024 at 08:36):
Searching for withImportModules
in either Mathlib, Batteries, or lean-training-data
should give you some examples to work off.
Yicheng Tao (Nov 14 2024 at 08:59):
Yes, for now I was struggling with Environment
. The first thing I want to do is to extract the theorems declared in a module. Taking piece of code from LeanDojo's ExtractData.lean
, I have the following function:
import Lean
import Lake
open Lean Elab System
set_option maxHeartbeats 2000000 -- 10x the default maxHeartbeats.
unsafe def processFile (path : FilePath) : IO Unit := do
println! path
let input ← IO.FS.readFile path
enableInitializersExecution
let inputCtx := Parser.mkInputContext input path.toString
let (header, parserState, messages) ← Parser.parseHeader inputCtx
let (env, messages) ← processHeader header {} messages inputCtx
if messages.hasErrors then
for msg in messages.toList do
if msg.severity == .error then
println! "ERROR: {← msg.toString}"
throw $ IO.userError "Errors during import; aborting"
let env := env.setMainModule (← moduleNameOfFileName path none)
let commandState := { Command.mkState env messages {} with infoState.enabled := true }
let s ← IO.processCommands inputCtx parserState commandState
let env' := s.commandState.env
let constmap := env'.constants.map₂.toList
println! s!"{constmap.length}"
for (name, cinfo) in constmap do
-- println! s!"{name}"
match cinfo with
| .thmInfo val =>
println! s!"{val.name}"
| _ => pure ()
unsafe def main (args : List String) : IO Unit := do
match args with
| [file]=> processFile file
| _ => throw $ IO.userError "Invalid arguments"
It works, but it works too much. I randomly choose a file from mathlib .lake/packages/mathlib/Mathlib/Algebra/CubicDiscriminant.lean
and run the script. The outputs contain some unexpected items like «.lake».packages.mathlib.Mathlib.Algebra.CubicDiscriminant._auxLemma.3
. I only want theorems that explicitly declared using the keyword theorem
or lemma
. How to filter these items out?
I noticed both LeanDojo and lean-training-data
analyzed the InfoTree
to get these data. Is that necessary? I'm also reading code from doc-gen4
. If everything can be done by Environment
, that would be best for me.
Last updated: May 02 2025 at 03:31 UTC