Zulip Chat Archive

Stream: lean4

Topic: list of all non-mathlib theorems used by another theorem


Metin Ersin Arıcan (Aug 06 2024 at 06:43):

Hi everyone,

I'm working on a formalization project using a blueprint setup, but updating the dependency graph by manually editing the content.tex file after every new definition and refactoring is cumbersome. It would be nice to have a tool that automatically analyzes the code and creates or updates the .tex files. Is there such a tool available?

If not, is there an easy way to extract a list of all the non-mathlib theorems, definitions, instances, etc., used by other theorems, definitions, or instances? This would help me implement such a tool myself.

Damiano Testa (Aug 06 2024 at 10:17):

This is a small adaptation of a command that counts all declarations in mathlib: I tweaked it to exclude the declarations defined in files whose path starts with one of Init, Lean, Qq, ImportGraph, ProofWidgets, Std, Aesop, Batteries, Mathlib.

import Mathlib

open Lean Elab Command

abbrev excludedRootNames : NameSet := NameSet.empty
  |>.insert `Init
  |>.insert `Lean
  |>.insert `Qq
  |>.insert `ImportGraph
  |>.insert `ProofWidgets
  |>.insert `Std
  |>.insert `Aesop
  |>.insert `Batteries
  |>.insert `Mathlib

/-- `AllowedModIdxs env` returns the `ModuleIdx`s corresponding to imports in the environment `env`
whose root component does not belong to `excludedRootNames`. -/
def AllowedModIdxs (env : Environment) : HashSet Nat × NameSet:= Id.run do
  let mut modIdx : HashSet Nat := .empty
  let mut modsSeen : NameSet := .empty
  let allowedImports := env.allImportedModuleNames.filter (!excludedRootNames.contains ·.getRoot)
  for imp in allowedImports do
    if let some nm := env.getModuleIdx? imp then
      modIdx := modIdx.insert nm
      modsSeen := modsSeen.insert imp
  return (modIdx, modsSeen)

elab "#not_in_mathlib" : command => do
  let env  getEnv
  let consts := env.constants
  let (allowed, mods) := AllowedModIdxs env
  let newDeclNames := consts.fold (init := ({} : NameSet)) fun a b _c =>
    match env.getModuleIdxFor? b with
      | none => a
      | some idx => if allowed.contains idx then a.insert b else a
  match newDeclNames.size with
    | 0 =>  logInfo m!"No declarations outside of\n{(excludedRootNames.toArray.qsort Name.lt)}"
    | n => logInfo m!"{n} declaration{if n == 1 then "" else "s"} found in the modules\n\
                      {(mods.toArray.qsort Name.lt)}\n\n\
                      {newDeclNames.toArray.qsort Name.lt}"

#not_in_mathlib  -- of course it reports 0!

Metin Ersin Arıcan (Aug 06 2024 at 15:46):

Thank you so much! It looks great. I’ll try it and let you know if I run into any issues.

Metin Ersin Arıcan (Aug 07 2024 at 05:24):

Damiano Testa said:

This is a small adaptation of a command that counts all declarations in mathlib: I tweaked it to exclude the declarations defined in files whose path starts with one of Init, Lean, Qq, ImportGraph, ProofWidgets, Std, Aesop, Batteries, Mathlib.

import Mathlib

open Lean Elab Command

abbrev excludedRootNames : NameSet := NameSet.empty
  |>.insert `Init
  |>.insert `Lean
  |>.insert `Qq
  |>.insert `ImportGraph
  |>.insert `ProofWidgets
  |>.insert `Std
  |>.insert `Aesop
  |>.insert `Batteries
  |>.insert `Mathlib

/-- `AllowedModIdxs env` returns the `ModuleIdx`s corresponding to imports in the environment `env`
whose root component does not belong to `excludedRootNames`. -/
def AllowedModIdxs (env : Environment) : HashSet Nat × NameSet:= Id.run do
  let mut modIdx : HashSet Nat := .empty
  let mut modsSeen : NameSet := .empty
  let allowedImports := env.allImportedModuleNames.filter (!excludedRootNames.contains ·.getRoot)
  for imp in allowedImports do
    if let some nm := env.getModuleIdx? imp then
      modIdx := modIdx.insert nm
      modsSeen := modsSeen.insert imp
  return (modIdx, modsSeen)

elab "#not_in_mathlib" : command => do
  let env  getEnv
  let consts := env.constants
  let (allowed, mods) := AllowedModIdxs env
  let newDeclNames := consts.fold (init := ({} : NameSet)) fun a b _c =>
    match env.getModuleIdxFor? b with
      | none => a
      | some idx => if allowed.contains idx then a.insert b else a
  match newDeclNames.size with
    | 0 =>  logInfo m!"No declarations outside of\n{(excludedRootNames.toArray.qsort Name.lt)}"
    | n => logInfo m!"{n} declaration{if n == 1 then "" else "s"} found in the modules\n\
                      {(mods.toArray.qsort Name.lt)}\n\n\
                      {newDeclNames.toArray.qsort Name.lt}"

#not_in_mathlib  -- of course it reports 0!

This just lists all the non-mathlib declarations, right? However, I also need the dependency relations between them. How can I modify this code to take the name of a declaration and output all the declarations it depends on?

Additionally, where can I learn more about these things? Is there any documentation or tutorial available?

Damiano Testa (Aug 07 2024 at 06:08):

I'm not at a computer now, but you can try adapting the code at https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Unused.20theorems: the first function should return (some of) the dependencies of a declaration.

Damiano Testa (Aug 07 2024 at 06:09):

In general, there is a difference between the dependencies for the term produced by the declaration and the dependencies that are needed to produce that term.

Damiano Testa (Aug 07 2024 at 06:10):

For instance, tactics and syntax are mostly invisible to the term.

Metin Ersin Arıcan (Aug 10 2024 at 18:42):

Damiano Testa said:

I'm not at a computer now, but you can try adapting the code at https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Unused.20theorems: the first function should return (some of) the dependencies of a declaration.

Thank you so much! I’ve been experimenting with the codes you provided, and they’ve been very useful. Do you know how I can get the docstring of a definition?

Damiano Testa (Aug 10 2024 at 20:25):

There is docs#Lean.TSyntax.getDocString: I think that you extract from the declModifiers syntax node the docComment one and apply this one. I have not tried it, though.

Damiano Testa (Aug 10 2024 at 20:25):

Let me know if you need more support and I can look at this! (I'm switching off my computer now, though!)

Damiano Testa (Aug 11 2024 at 09:53):

I just realized that maybe you wanted the docstring starting from a Name: this is one way of getting it

import Lean.DocString
import Lean.Elab.Command

open Lean

run_cmd
  let nms := [``Nat.succ, `WhoAmI, ``Nat.zero_le]
  for nm in nms do
    logInfo <| ( findDocString? ( getEnv) nm).getD s!"sorry, no doc-string found for '{nm}'"

Metin Ersin Arıcan (Aug 13 2024 at 19:32):

Damiano Testa said:

I just realized that maybe you wanted the docstring starting from a Name: this is one way of getting it

import Lean.DocString
import Lean.Elab.Command

open Lean

run_cmd
  let nms := [``Nat.succ, `WhoAmI, ``Nat.zero_le]
  for nm in nms do
    logInfo <| ( findDocString? ( getEnv) nm).getD s!"sorry, no doc-string found for '{nm}'"

Sorry for the late response. Somehow Zulip didn't notify me about your reply.

Thank you so much! Right now, I can get all the user-defined declarations, their docstrings, and their dependencies. But for a declaration, I need to differentiate between two kinds of dependencies:

  1. Those that are used in the type of the declaration (e.g. the statement of the theorem)
  2. Those that are used in the construction of the declaration (e.g. the proof of the theorem)

I don't really know how to approach this. Do you have any idea?

Damiano Testa (Aug 13 2024 at 21:44):

From the environment, you can extract the docs#Lean.ConstantInfo of each declaration. The ConstantInfos encode what kind of declaration you have: an axiom, a theorem, a definition and so on. Each one of these kinds ConstantInfo has some more information associated to them. Some of them have a type (e.g. probably what you mean when you say "the statement of the theorem") and a value (e.g. probably what you mean when you say "the proof of the theorem").

Damiano Testa (Aug 13 2024 at 21:44):

Carefully extracting this kind of information and feeding it to the visited machinery should yield what you want.

Damiano Testa (Aug 13 2024 at 21:45):

Btw, the kind of filtering that you are doing seems like it could be useful for mathlib as well: do share some of the code, if you feel like it! :smile:

Metin Ersin Arıcan (Aug 17 2024 at 10:33):

Damiano Testa said:

From the environment, you can extract the docs#Lean.ConstantInfo of each declaration. The ConstantInfos encode what kind of declaration you have: an axiom, a theorem, a definition and so on. Each one of these kinds ConstantInfo has some more information associated to them. Some of them have a type (e.g. probably what you mean when you say "the statement of the theorem") and a value (e.g. probably what you mean when you say "the proof of the theorem").

Hi again! In your examples, you always used logInfo, but since I need to write the results to a file, I want to work with a function Main : IO String. The function that reads the declarations has type CommandElabM String but I couldn't managed to convert it to IO String. Do you have any idea?

Damiano Testa (Aug 17 2024 at 12:05):

If you are in CommandElabM you should already have access to IO. The other way round, running code in CommandElabM from IO, should also be possible, but you would have to use some monad lifting/state creation. There is a Monad map on the wikipage for Mathlib that has a lot of information about what lifts/runs are available.

Damiano Testa (Aug 17 2024 at 12:06):

However, unless you want to avoid CommandElabM, you should be able to work within it and not have to worry about lifting from IO to CommandElabM.

Metin Ersin Arıcan (Aug 17 2024 at 12:53):

Damiano Testa said:

However, unless you want to avoid CommandElabM, you should be able to work within it and not have to worry about lifting from IO to CommandElabM.

Yeah, I misphrased my intention. The thing is, I want my script to be executable but lean requires the main function to use the IO monad.

Damiano Testa (Aug 17 2024 at 13:25):

This page is very useful for all monadic lifts:

https://github.com/leanprover-community/mathlib4/wiki/Monad-map

Damiano Testa (Aug 17 2024 at 13:27):

There does not seem to be a direct path to run CommandElabM code in IO, but maybe you can lift it to CoreM or TermElabM and then run it from there?

Damiano Testa (Aug 17 2024 at 13:27):

Someone else may have a better suggestion, though.


Last updated: May 02 2025 at 03:31 UTC