Zulip Chat Archive
Stream: mathlib4
Topic: Counting prerequisites of a theorem
David Loeffler (Mar 06 2024 at 09:04):
Is there some way of finding out "how much of mathlib is needed" to prove a given theorem?
This could mean either
- (a) number of files in a minimal set of mathlib imports required to get the proof to work;
- (b) total lines of code in (a);
- (c) (probably much harder) total number of definitions / theorems needed, divided by the number in the stats page.
(Number theorists love to boast that their area uses just about every other area of pure mathematics as input at some stage, and I'm wondering if the statistics really bear this out.)
Johan Commelin (Mar 06 2024 at 09:09):
I think (c) is very doable these days. Tools like leaff and shake do much more complicated calculations.
I remember that we computed (c) for and for perfectoid_space
, and we discovered that perfectoid spaces are only 4x more complicated than real numbers.
Kevin Buzzard (Mar 06 2024 at 09:09):
I did notice that my PhD thesis used results from 11 of the 12 courses I went to as a final year UG (logic was the only missing one) and one reason I was pushing in 2017 for mathlib to become what it's becoming was because I knew that FLT would need "everything". But I wonder whether we're high enough up to see this phenomenon yet. Given all this import pruning it's probably not too hard to figure out if all this L-function stuff we have now depends on topology by "eyeballing", but no doubt it's depending on a bunch of analysis and some algebra already.
Ruben Van de Velde (Mar 06 2024 at 09:13):
I know there's logic and logic, but I still was taken aback a bit by your claim that you didn't need logic for your PhD thesis :)
Kim Morrison (Mar 06 2024 at 10:48):
A creative work of pure aesthetic genius.
Damiano Testa (Mar 07 2024 at 02:06):
I had some code that did exactly this, since I had been curious about this question already. Unfortunately, the code is a little too slow to be really usable for "complicated" declarations, but it does provide some data:
import Mathlib
open Lean
/-- `getExprsName e` takes as input an `Expr`ession `e`.
It returns a pair consisting of
* the list of all the maximal proper sub-expressions of `e`,
* the "head" `Name` associated to `e`, if it exists.
-/
def getExprsName : Expr → List Expr × Option Name
| .const declName _ => ([], some declName)
| .app fn arg => ([fn, arg], none)
| .lam _ binderType body _ => ([binderType, body], none)
| .forallE _ binderType body _ => ([binderType, body], none)
| .letE _ type value body _ => ([type, value, body], none)
| .mdata _ expr => ([expr], none)
| .proj typeName _ struct => ([struct], some typeName)
| _ => ([], none)
/-- `getForallEs e` takes as input an expression `e`.
If `e` is of the form `∀ a₁, a₂,..., aₙ, <something>`, then it returns the names
of all the variables `a₁, a₂,..., aₙ`.
This is useful to extract (some of) the constructors of an inductive type. -/
def getForallEs : Expr → List Name
| .forallE name _ body _ => name :: getForallEs body
| _ => []
/-- `getConstructorsAux nm` takes as input a `Name` `nm`.
It returns the pair of lists of
* `Name`s of the constructors of `nm` obtained via `Lean.InductiveVal.ctors`,
if `nm` is the `Name` of an inductive type -- typically `nm.mk`;
* `Name`s of the constructors that appear in the `nm.mk` function for `nm`, extracted
via `getExprsName`.
The two projections are called `getConstructors'` and `getConstructors`. -/
def getConstructorsAux {m : Type → Type} [Monad m] [MonadResolveName m] [MonadEnv m] (nm : Name) :
m (List Name × List Name) := do
let env ← getEnv
if ! env.contains nm then return ([], []) else
let decl := (env.find? nm).get!
if ! decl.isInductive then return ([], []) else
let ctrs := decl.inductiveVal!.ctors
let decls := (ctrs.map (env.find? · )).reduceOption
let types := decls.map ConstantInfo.type
let ret := (types.map getForallEs).join
let withNamespace ← ret.mapM (resolveGlobalName <| nm.append ·)
return (ctrs, withNamespace.map fun x => (x.getD 0 default).1)
run_cmd do
let n := `Mul
if (← getConstructorsAux n) != ([`Mul.mk], [`Mul, `Mul.mul]) then
throwError "oh no! {← getConstructorsAux n}"
/-- `getConstructors' nm` returns the list of constructors of `nm`, assuming that
it is the `Name` of an inductive type.
Often, this is `nm.mk`. -/
def getConstructors' {m : Type → Type} [Monad m] [MonadResolveName m] [MonadEnv m] (nm : Name) :
m (List Name) := do
let (ctrs, _) ← getConstructorsAux nm
return ctrs
/-- `getConstructors nm` returns the list of `Name`s of the constructors of `nm`,
that appear in the `nm.mk` function for `nm`. -/
def getConstructors {m : Type → Type} [Monad m] [MonadResolveName m] [MonadEnv m] (nm : Name) :
m (List Name) := do
let (ctrs, ns) ← getConstructorsAux nm
return ctrs ++ ns
/-- `getConstsIn e` takes as input an `Expr`ession `e` and it returns the `Name`s of all the
constants recursively appearing in `e`.
The optional argument `hs` allows to increment `hs` by such constants, instead of starting
from scratch every time. -/
partial
def getConstsIn (e : Expr) (hs : HashSet Name := .empty) : HashSet Name :=
Id.run do
let mut hs := hs
let (se, na) := getExprsName e
if let some na := na then hs := hs.insert na
for s in se do
let nas := getConstsIn s
for n in nas do
if ! hs.contains n then hs := hs.insert n
return hs
/-- `getAll nm` takes as input a name `nm` and returns the `HashSet Name` of all the
non-blacklisted `Name`s that appear in `nm` and in all the constants that are implied
by `nm`. -/
def getAll {m : Type → Type} [Monad m] [MonadResolveName m] [MonadEnv m] (nm : Name) :
m (HashSet Name) := do
let env ← getEnv
let mut boundary := HashSet.empty.insert nm
let mut tot := HashSet.empty.insert nm
let mut seenDecls : HashSet Name := .empty
while ! boundary.isEmpty do
for new in boundary do
boundary := boundary.erase new
let decl := (env.find? new).getD default
if ! seenDecls.contains decl.name then
seenDecls := seenDecls.insert decl.name
let first := getConstsIn decl.type
for d in first do
if (! (← d.isBlackListed)) && (! tot.contains d) then
tot := tot.insert d
boundary := boundary.insert d
if decl.value?.isSome then
for d in getConstsIn decl.value! do
if (! (← d.isBlackListed)) && (! tot.contains d) then
tot := tot.insert d
boundary := boundary.insert d
for d in (← getConstructors new) ++ (← getConstructors' new) do
if (! (← d.isBlackListed)) && (! tot.contains d) then
tot := tot.insert d
boundary := boundary.insert d
return tot
/-- `moduleIdxPrefix env str` takes as input an environment `env` and a string `str`.
It returns the array of pairs `(name, moduleIdx)` consisting of a `Name` of a module
that begins with `str` and the `ModuleIdx` corresponding `name`.
This is essentially a dictionary between module names beginning with `str` and
their module indices. -/
def moduleIdxPrefix (env : Environment) (str : String) : Array (Name × ModuleIdx) :=
let filt := env.allImportedModuleNames.filter (str.isPrefixOf ·.toString)
filt.map fun d => (d, (env.getModuleIdx? d).getD default)
/-- `countDeclCmd id pathPrefix` is the main "command": it takes as input an identifier `id` and
a path string `pathPrefix` and returns the `HashSet Name` of all names of constants recursively
implied by the declaration determined by `id` and whose -/
def countDeclCmd {m : Type → Type} [Monad m] [MonadEnv m][MonadResolveName m] [MonadError m]
(id : TSyntax `ident) (pathPrefix : String) :
m (HashSet Name) := do
let nm := id.getId
let env ← getEnv
let idxPrefix := moduleIdxPrefix env pathPrefix
let (_, idx) := idxPrefix.unzip
if (env.find? nm).isNone then throwError "unknown constant '{nm}'"
let mut rest ← getAll nm
for d in rest do
let c := env.getModuleIdxFor? d
if ! (c.getD default) ∈ idx then rest := rest.erase d
return rest
/-- `#tally id` takes as input an identifier `id`. It returns a tally of the number of
declarations recursively implied by `id`, where the tally is split by the first two folders
in the path to the file containing each declaration.
For example,
```lean
#tally Nat
/-
3 Init.Prelude
3 total
-/
#tally Int
/-
3 Init.Data
3 Init.Prelude
1 [anonymous]
7 total
-/
```
-/
elab "#tally " tk:(noWs "!")? id:ident : command => do
let env ← getEnv
let filt := env.allImportedModuleNames
let dict := filt.map fun m => ((env.getModuleIdx? m).getD default, m)
let dict := dict.map fun (idx, nm) =>
(idx, (nm.components.take 2).foldl (· ++ ·) default)
let dict := dict.toList
let tot ← countDeclCmd id ""
let mut tal : HashMap Name Nat := .empty
for d in tot do
let didx := env.getModuleIdxFor? d
let pair := (dict.lookup didx).getD default
match tal.find? pair with
| none => tal := tal.insert pair 1
| some x => tal := tal.insert pair (x + 1)
let ct := tal.fold (fun n _ t => (n + t)) 0
let lth := (Nat.toDigits 10 ct).length
for (m, t) in tal.toArray.qsort (fun n m => n.1.toString < m.1.toString) do
let pad := List.replicate (lth - (Nat.toDigits 10 t).length) ' '
dbg_trace (⟨pad⟩ ++ s!"{t} {m}")
dbg_trace "\n{ct} total"
if tk.isSome then
dbg_trace "\nDeclarations in '{id.getId}':\n"
for t in tot.toArray.qsort (fun n m => n.toString < m.toString) do dbg_trace t
Damiano Testa (Mar 07 2024 at 02:06):
For instance, you get this, if you wait for a few seconds:
#tally Semiring
/-
1 Init.Core
4 Init.Data
1 Init.Meta
104 Init.Prelude
41 Mathlib.Algebra
8 Mathlib.Init
1 Std.Data
1 [anonymous]
161 total
-/
#tally Real.sin
/-
9 Init.Classical
25 Init.Control
132 Init.Core
489 Init.Data
1 Init.Meta
317 Init.Prelude
16 Init.PropLemmas
51 Init.SimpLemmas
13 Init.WF
1689 Mathlib.Algebra
471 Mathlib.Data
28 Mathlib.GroupTheory
147 Mathlib.Init
52 Mathlib.Logic
374 Mathlib.Order
9 Mathlib.Tactic
1 Mathlib.Topology
5 Std.Classes
179 Std.Data
2 Std.Logic
1 [anonymous]
4011 total
-/
David Loeffler (Mar 07 2024 at 06:37):
Hi Damiano! That looks cool, but given the time it takes for Real.sin
, I suspect #tally hasSum_zeta_two
would probably be totally infeasible.
David Loeffler (Mar 07 2024 at 06:39):
So maybe I was right to be pessimistic about (c), at least for the time being.
Johan Commelin (Mar 07 2024 at 06:42):
I think it is worth a shot. I would be surprised if it takes more than a few minutes
David Loeffler (Mar 07 2024 at 07:04):
It got there in the end (10 minutes or so). The final tally for hasSum_zeta_two
is 19410. That's actually only about 10% of mathlib, I'm disappointed :smile:
I'd be curious to compare this with other heavyweight theorems in the library – can anyone find a single declaration with more than this many prerequisites?
Kevin Buzzard (Mar 07 2024 at 07:11):
I remember that we were initially quite disappointed to discover that perfectoid spaces were only four times as complex as real numbers! I would be tempted to try a theorem about group cohomology for this game. It wouldn't surprise me if it's pulling in a bunch of the algebra hierarchy and a bunch of category theory but right now I have no feeling for numbers.
Actually perhaps the thing to do, now Mario has shaken mathlib, is to redo Scott's calculation of the longest pole and look in the leaf file. I found it very interesting that during the port the longest pole was changing between group cohomology and measure theory, it seemed like evidence that algebra and analysis were growing in complexity at similar rates.
Damiano Testa (Mar 07 2024 at 07:24):
With respect to speed: it is very likely that small tweaks could improve dramatically the running time. I don't think that the code that I wrote is particularly efficient.
David Loeffler (Mar 07 2024 at 07:27):
I'll try running it on groupCohomology.H1ofAutOnUnitsUnique
(Hilbert 90) and see how that compares.
David Loeffler (Mar 07 2024 at 08:01):
David Loeffler said:
I'll try running it on
groupCohomology.H1ofAutOnUnitsUnique
(Hilbert 90) and see how that compares.
Only 1611, so Hilbert 90 is "easier" than Real.sin
. Just goes to show how little correlation there is between the order things are built from the axioms, and the order we teach them to students!
Damiano Testa (Mar 07 2024 at 08:10):
Ok, besides speed improvements, this answer on Hilbert 90 makes me wonder whether my code is missing anything.
Damiano Testa (Mar 07 2024 at 08:16):
These are the declarations that it finds for H90.
H90 -- first 400 lines
Damiano Testa (Mar 07 2024 at 08:18):
second batch
Damiano Testa (Mar 07 2024 at 08:19):
third batch
Damiano Testa (Mar 07 2024 at 08:20):
final batch
David Loeffler (Mar 07 2024 at 08:56):
Kevin: you know this part of the code, and I don't – is there anything here that's missing? It definitely seems strange that Hilbert 90 only seems to touch five other declarations under groupCohomology
.
Damiano Testa (Mar 07 2024 at 09:05):
If it helps with the debugging, I added a !
flag to #tally
(I edited the code above):
#tally! Int
/-
3 Init.Data
3 Init.Prelude
1 [anonymous]
7 total
Declarations in 'Int':
Int
Int.negSucc
Int.ofNat
Nat
Nat.succ
Nat.zero
[anonymous]
-/
This prints the (sorted) list of declarations that it found.
Xavier Roblot (Mar 07 2024 at 09:25):
I tried it on docs#RingOfIntegers.isPrincipalIdealRing_of_abs_discr_lt that uses most of the algebraic number theory in Mathlib (except units) and got a score of 29220.
1 Aesop.BuiltinRules
7 Init.ByCases
18 Init.Classical
3 Init.Coe
31 Init.Control
244 Init.Core
906 Init.Data
2 Init.Ext
1 Init.Meta
4 Init.MetaTypes
121 Init.Omega
370 Init.Prelude
65 Init.PropLemmas
71 Init.SimpLemmas
4 Init.SizeOf
30 Init.WF
4665 Mathlib.Algebra
6 Mathlib.AlgebraicGeometry
2467 Mathlib.Analysis
5171 Mathlib.Data
33 Mathlib.Dynamics
251 Mathlib.FieldTheory
838 Mathlib.GroupTheory
281 Mathlib.Init
1443 Mathlib.LinearAlgebra
644 Mathlib.Logic
3105 Mathlib.MeasureTheory
153 Mathlib.NumberTheory
2727 Mathlib.Order
1182 Mathlib.RingTheory
377 Mathlib.SetTheory
180 Mathlib.Tactic
3355 Mathlib.Topology
5 Std.Classes
447 Std.Data
11 Std.Logic
1 [anonymous]
29220 total
And docs#IsCyclotomicExtension.Rat.three_pid should even get more but I don't have time to test it now...
Damiano Testa (Mar 07 2024 at 09:26):
By manual inspection, I think that there are definitely more declarations in the groupCohomology
namespace that it should have found. I'm investigating why it is not recursing properly.
Damiano Testa (Mar 07 2024 at 09:29):
To go a bit into the internals, the code omits declarations that are blackListed
: these should essentially be the declarations that lean uses internally to fill in some gaps, but I do not think that pruning them they should be responsible for the inconsistencies that seem to be present.
David Loeffler (Mar 07 2024 at 09:55):
Xavier Roblot said:
I tried it on docs#RingOfIntegers.isPrincipalIdealRing_of_abs_discr_lt that uses most of the algebraic number theory in Mathlib (except units) and got a score of 29220.
Ah, the Minkowski bound! I wondered why that needed quite so much analysis / topology, but it looks like the dependence goes via MeasureTheory.Measure.Lebesgue.VolumeOfBalls
. This may well be the "commanding height" of mathlib – I wonder if anyone can beat that score?
Xavier Roblot (Mar 07 2024 at 10:01):
docs#IsCyclotomicExtension.Rat.three_pid should beat it since it actually uses that function. Note that it tooks more than 30mn to compute this tally on my M1 laptop. Anyway, I am not in my office right now so somebody else will have to try :smile:
Riccardo Brasca (Mar 07 2024 at 10:23):
I guess this is because the definition of the cyclotomic polynomials uses complex roots of unity (this is my fault, I am to lazy to do an algebraic refactor)
Xavier Roblot (Mar 07 2024 at 10:34):
There is also a bunch of topology / analysis involved in the proof that two complex embeddings define the same infinite place iff they are complex conjugate.
Kevin Buzzard (Mar 07 2024 at 10:58):
David Loeffler said:
Kevin: you know this part of the code, and I don't – is there anything here that's missing? It definitely seems strange that Hilbert 90 only seems to touch five other declarations under
groupCohomology
.
I was expecting a lot more category theory and topology -- but looking at the code I see I'd misguessed what was going on. H1
is defined concretely as 1-cocycles over 1-coboundaries, rather than H n
for n = 1
, so all the general group cohomology machinery (which uses a bunch of resolutions, which uses a bunch of topology...) is not used. In short docs#groupCohomology.isoH1 is not rfl
, so it was my guess which was lousy, not Damiano's code.
Damiano Testa (Mar 07 2024 at 11:02):
Kevin Buzzard said:
In short docs#groupCohomology.isoH1 is not
rfl
, so it was my guess which was lousy, not Damiano's code.
Oh, I assure you, Damiano's code has a very high chance of being lousy!
Xavier Roblot (Mar 07 2024 at 17:59):
I finally had the time to test docs#IsCyclotomicExtension.Rat.three_pid, the tally is 31083
1 Aesop.BuiltinRules
8 Init.ByCases
18 Init.Classical
3 Init.Coe
31 Init.Control
247 Init.Core
953 Init.Data
2 Init.Ext
1 Init.Meta
4 Init.MetaTypes
147 Init.Omega
371 Init.Prelude
69 Init.PropLemmas
72 Init.SimpLemmas
4 Init.SizeOf
30 Init.WF
5094 Mathlib.Algebra
6 Mathlib.AlgebraicGeometry
2483 Mathlib.Analysis
5774 Mathlib.Data
39 Mathlib.Dynamics
368 Mathlib.FieldTheory
916 Mathlib.GroupTheory
283 Mathlib.Init
1483 Mathlib.LinearAlgebra
649 Mathlib.Logic
3105 Mathlib.MeasureTheory
239 Mathlib.NumberTheory
2735 Mathlib.Order
1517 Mathlib.RingTheory
384 Mathlib.SetTheory
193 Mathlib.Tactic
3355 Mathlib.Topology
5 Std.Classes
482 Std.Data
11 Std.Logic
1 [anonymous]
31083 total
Mario Carneiro (Mar 07 2024 at 18:36):
@Damiano Testa As you suspected, the code can be written more simply and a lot faster by reusing the code of #print axioms
, which is already doing basically all of this work. (You can also copy and paste it, it's not that complicated, but it so happens that we can actually use the core function unmodified in this case.)
import Mathlib
open Lean
elab "#tally " tk:(noWs "!")? id:ident : command => do
let env ← getEnv
let decl ← Elab.resolveGlobalConstNoOverloadWithInfo id
let (_, { visited, .. }) := Elab.Command.CollectAxioms.collect decl |>.run (← getEnv) |>.run {}
let mut truncate := if tk.isSome then
env.header.moduleNames
else
env.header.moduleNames.map fun n => Name.fromComponents <| n.components.take 2
let mut out := mkRBMap Name Nat Name.cmp
for d in visited do
if let some idx := env.getModuleIdxFor? d then
let n := truncate[idx.toNat]!
out := out.insert n (out.findD n 0 + 1)
let mut msg := m!""
let mut total := 0
for (name, n) in out do
total := total + n
msg := msg ++ m!"{(toString n).leftpad 8 ' '} {name}\n"
logInfo m!"{msg}\n{(toString total).leftpad 8 ' '} total"
Mario Carneiro (Mar 07 2024 at 18:36):
#tally IsCyclotomicExtension.Rat.three_pid
1 Aesop.BuiltinRules
16 Init.ByCases
32 Init.Classical
3 Init.Coe
20 Init.Control
331 Init.Core
1511 Init.Data
3 Init.Ext
1 Init.Meta
172 Init.Omega
497 Init.Prelude
156 Init.PropLemmas
99 Init.SimpLemmas
9 Init.SizeOf
45 Init.WF
8646 Mathlib.Algebra
9 Mathlib.AlgebraicGeometry
4591 Mathlib.Analysis
5 Mathlib.Combinatorics
9588 Mathlib.Data
64 Mathlib.Dynamics
637 Mathlib.FieldTheory
2084 Mathlib.GroupTheory
408 Mathlib.Init
3142 Mathlib.LinearAlgebra
1209 Mathlib.Logic
4749 Mathlib.MeasureTheory
444 Mathlib.NumberTheory
4425 Mathlib.Order
2980 Mathlib.RingTheory
666 Mathlib.SetTheory
254 Mathlib.Tactic
5234 Mathlib.Topology
5 Std.Classes
925 Std.Data
17 Std.Logic
52978 total
David Loeffler (Mar 07 2024 at 18:58):
As you suspected, the code can be written more simply and a lot faster by reusing the code of #print axioms, which is already doing basically all of this work
It gets – simply and a lot faster – to a very different answer! For Rat.three_PID
, Mario's code gives an answer of 52978 while Damiano's gives 31083. I wonder how the difference arises?
Damiano Testa (Mar 07 2024 at 19:04):
Mario, that is fantastic! I did not really know what print axioms did, so it hadn't even occurred to me to recycle it!
Damiano Testa (Mar 08 2024 at 09:50):
I've done a little bit of investigating and I think that the discrepancy between the numbers may be due to the fact that my code was removing blackListed
declarations (which I was doing in the hope of speeding up the computation).
Besides that, in terms of programming abilities, there are probably very few wider chasms than the one separating Mario and myself, so that should explain why his code works, while mine is, at best, shaky.
Finally, I see that Mario reinterpreted the value of the !
flag to mean "tally by whole path instead of 2-step truncated paths". What I had intended the !
flag to be was to actually print all the declarations found. If you want that behaviour, below I adapt minimally Mario's code from above.
So, now you can do
#tally CovariantClass
/-
3 Mathlib.Algebra
3 total
-/
#tally! CovariantClass
/-
3 Mathlib.Algebra.CovariantAndContravariant
3 total
-/
#tally CovariantClass all
/-
3 Mathlib.Algebra
3 total
[Covariant, CovariantClass, CovariantClass.mk]
-/
#tally! CovariantClass all
/-
3 Mathlib.Algebra.CovariantAndContravariant
3 total
[Covariant, CovariantClass, CovariantClass.mk]
-/
import Mathlib
open Lean
elab "#tally " tk:(noWs "!")? id:ident all?:(" all")? : command => do
let env ← getEnv
let decl ← Elab.resolveGlobalConstNoOverloadWithInfo id
let (_, { visited, .. }) := Elab.Command.CollectAxioms.collect decl |>.run (← getEnv) |>.run {}
let mut truncate := if tk.isSome then
env.header.moduleNames
else
env.header.moduleNames.map fun n => Name.fromComponents <| n.components.take 2
let mut out := mkRBMap Name Nat Name.cmp
for d in visited do
if let some idx := env.getModuleIdxFor? d then
let n := truncate[idx.toNat]!
out := out.insert n (out.findD n 0 + 1)
let mut msg := m!""
let mut total := 0
for (name, n) in out do
total := total + n
msg := msg ++ m!"{(toString n).leftpad 8 ' '} {name}\n"
let pmsg := m!"{msg}\n{(toString total).leftpad 8 ' '} total"
logInfo <| pmsg ++ match all? with
| none => ""
| some _ => m!"\n\n{visited.toArray.qsort (·.toString < ·.toString)}"
Riccardo Brasca (May 03 2024 at 19:19):
Can part of this code be used to find unused declarations to get to a specific theorem? For example, in flt-regular
we have a clear target (FLT for regular primes!), but I am sure we have a lot of useless code. It would be nice to get a list of useless things, to check it they just deserve to be removed.
Mario Carneiro (May 03 2024 at 19:26):
Std/Batteries has #show_unused
, although it only works within a single file
Mario Carneiro (May 03 2024 at 19:26):
it could work across files but it would need a different interface
David Loeffler (May 04 2024 at 06:04):
I recall a discussion not long back about a script intended to automatically produce MWE's for some failing declaration. That's the same problem as here, in some sense. I can't remember who was working on the auto-MWE script though – @Johan Commelin was it you?
Johan Commelin (May 04 2024 at 06:53):
No, it wasn't me.
Yaël Dillies (May 04 2024 at 06:55):
It was @Anne Baanen
Yaël Dillies (May 04 2024 at 06:56):
See https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Bug.20causing.20stack.20overflow
Kim Morrison (May 04 2024 at 07:15):
I think the bottom line was that the automation wasn't yet good enough. I would love progress on this front.
There are a lot of tricks to effectively minimizing from deep inside Mathlib.
To begin, you need a good set of #guard_msgs
statements (or sometimes hand-rolled checks of the errors messages) that ensure that you have red iff you've over-minimized.
After that, you iteratively:
- move the contents of an import into the current file (I find it helpful to wrap it in a section named after the file)
- delete all unneeded declarations (
#show_unused in X
is super helpful) - further minimizing the remaining declarations (replace all proofs with sorry, remove structure fields -- hard to automate as you often need downstream fixes, try changing definitions to sorry (sometimes this doesn't work but changing the definition to be something "simpler" does), weakening extends statements or typeclass arguments...)
- minimize imports (try removing imports, and other replacing each by its own imports:
shake
can help here)
If you go through the @[to_additive]
parts of the library removing that dependency can be painfully manual.
Anne Baanen (May 04 2024 at 08:58):
Here's the main thread for the script: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Minimizer.20script
Anne Baanen (May 04 2024 at 09:00):
It essentially tries doing all the steps Kim suggests but by brute force, simply doing a search cutting out whole lines and paragraphs without any guidance from Lean. So it's not particularly effective at actually finding what can be dropped, nor does it do any downstream fixing.
Anne Baanen (May 04 2024 at 09:00):
In practice it means it is way too slow and way too ineffective for any use on Mathlib. :(
Damiano Testa (Jun 04 2024 at 20:13):
Out of curiosity, I tallied all lemmas: these are the top 10 using the most declarations
Declaration | Depends on |
---|---|
NumberField.Units.unitLattice_rank | 51024 |
NumberField.hermiteTheorem.finite_of_discr_bdd_of_isReal | 51088 |
NumberField.hermiteTheorem.finite_of_discr_bdd_of_isComplex | 51103 |
NumberField.finite_of_discr_bdd | 51121 |
NumberField.Units.rank_modTorsion | 51201 |
NumberField.Units.basisModTorsion | 51211 |
NumberField.Units.fun_eq_repr | 51214 |
NumberField.Units.exist_unique_eq_mul_prod | 51219 |
IsCyclotomicExtension.Rat.three_pid | 53083 |
IsCyclotomicExtension.Rat.five_pid | 53084 |
Kim Morrison (Jun 05 2024 at 01:30):
I feel like you should only list declarations that are not themselves used in other declarations!
Damiano Testa (Jun 05 2024 at 05:13):
I agree, but I need to think about how to do that with finite resources! :smile:
Damiano Testa (Sep 10 2024 at 15:34):
Warning: take what is below more as plausibility test, rather than a bug-free report!
I went back to the question of finding which declarations are maximal, in the sense that they are not used by any other declaration in mathlib. For those declarations, I computed the number of transitively involved constants, as a measure of "complexity". Here is a summary of the findings.
There are 72619 maximal declarations in Mathlib. The 10 longest ones are
- 47227, docs#NumberField.mixedEmbedding.exists_ne_zero_mem_ringOfIntegers_of_norm_le
- 47374, docs#riemannZeta_neg_nat_eq_bernoulli
- 47578, docs#ZMod.LFunction_one_sub
- 50729, docs#NumberField.abs_discr_gt_two
- 51817, docs#NumberField.Units.regulator_ne_zero
- 51819, docs#NumberField.Units.regulator_pos
- 51856, docs#NumberField.finite_of_discr_bdd
- 52068, docs#NumberField.Units.regulator_eq_det
- 53751, docs#IsCyclotomicExtension.Rat.five_pid
- 55694, docs#fermatLastTheoremThree
I find it a little ironic that docs#WeierstrassCurve.toProjective involves 4 constants and is at position 42.
Source: the Find tips
tab.
David Ang (Sep 10 2024 at 15:56):
To be fair docs#WeierstrassCurve.toProjective doesn't actually do anything...
Damiano Testa (Sep 10 2024 at 15:59):
I know: as soon as I saw it, I went and checked, since I thought "there is no way that this can be right", but it is!
In case you are curious, these are the 4 involved constants:
#[WeierstrassCurve,
WeierstrassCurve.Projective,
WeierstrassCurve.toProjective,
WeierstrassCurve.mk]
Michael Stoll (Sep 10 2024 at 16:27):
Damiano Testa said:
Warning: take what is below more as plausibility test, rather than a bug-free report!
I went back to the question of finding which declarations are maximal, in the sense that they are not used by any other declaration in mathlib. For those declarations, I computed the number of transitively involved constants, as a measure of "complexity". Here is a summary of the findings.
There are 72619 maximal declarations in Mathlib. The 10 longest ones are
- 47227, docs#NumberField.mixedEmbedding.exists_ne_zero_mem_ringOfIntegers_of_norm_le
- 47374, docs#riemannZeta_neg_nat_eq_bernoulli
- 47578, docs#ZMod.LFunction_one_sub
- 50729, docs#NumberField.abs_discr_gt_two
- 51817, docs#NumberField.Units.regulator_ne_zero
- 51819, docs#NumberField.Units.regulator_pos
- 51856, docs#NumberField.finite_of_discr_bdd
- 52068, docs#NumberField.Units.regulator_eq_det
- 53751, docs#IsCyclotomicExtension.Rat.five_pid
- 55694, docs#fermatLastTheoremThree
I find it a little ironic that docs#WeierstrassCurve.toProjective involves 4 constants and is at position 42.
Source: the
Find tips
tab.
Number Theory rules!
David Loeffler (Sep 10 2024 at 17:45):
Michael Stoll said:
Number Theory rules!
In mountaineering, people talk about the "prominence" of a mountain peak, defined as the height of that peak above the saddle connecting it to the next highest peak. By analogy with this, I wonder what the "most prominent" mathlib theorems are? I.e. what are the first few elements of the list characterised by
L[N + 1] := the theorem "x" which maximizes # { prerequisites of x which are not prerequisites of any of L[1], ..., L[N] }.
Presumably L[2] would be something outside of number theory – I'm guessing it would be one of the results about C-star algebras that are the highest-ranked non-number-theory items on the current list.
Damiano Testa (Sep 10 2024 at 21:10):
I agree that this information is still not ideal, but at least these results are not "in a chain", like the previous attempt that I made!
Damiano Testa (Sep 10 2024 at 21:11):
However, this code took almost 4 hours to run, so any more sophisticated analysis will require a different idea for how to implement it.
Daniel Weber (Sep 11 2024 at 06:50):
Would it be possible to do this count ignoring proofs, i.e. the most complex definitions?
Damiano Testa (Sep 11 2024 at 06:53):
Probably in byAsSorry
with roughly the same code? I never used byAsSorry
, though.
Damiano Testa (Sep 11 2024 at 07:00):
If that works, then also I expect that the whole process would be much faster and a lot of extra analysis could be done without rethinking about optimizing the current script.
Kim Morrison (Sep 11 2024 at 07:03):
No, byAsSorry is irrelevant here. Just pull the constants from the type, not from the value.
Daniel Weber (Sep 11 2024 at 07:21):
How many constants are there? I don't know much about meta-programming, but if you're able to export a graph I could do some analysis
Damiano Testa (Sep 11 2024 at 07:22):
230827 is the number of declarations.
Damiano Testa (Sep 11 2024 at 07:23):
(Possibly after filtering out the ones that are "implementation details" for some meaning of "implementation details".)
Damiano Testa (Sep 11 2024 at 07:23):
In any case, those are the declarations that the linter uses, although it counts the implementation details as dependencies as well. I am not sure how much of a difference that would make.
Daniel Weber (Sep 11 2024 at 07:23):
How many edges are there?
Damiano Testa (Sep 11 2024 at 07:24):
I do not know, but a lot vertices have valence way over 10k.
Damiano Testa (Sep 11 2024 at 07:25):
The maximum valence is 55k (as per above).
Daniel Weber (Sep 11 2024 at 07:27):
I mean direct edges, not in the transitive closure
Damiano Testa (Sep 11 2024 at 07:27):
That information I have not looked at...
Damiano Testa (Sep 11 2024 at 07:27):
However, maybe the length of the proof is a good indication?
Damiano Testa (Sep 11 2024 at 07:45):
In any case, even though I have not looked explicitly at the number of direct dependencies, I do think that it is much smaller than the number of transitive dependencies and it might even be quite small (maybe even ~100 on average? Pure speculation).
Damiano Testa (Sep 11 2024 at 08:39):
This build tallies what I think are the direct imports of each declaration. The maximum appears to be 579.
Damiano Testa (Sep 11 2024 at 08:43):
Notice how this step took 3mins, as opposed to almost 4 hours for the "full" check: definitely there is room for improvement!
Daniel Weber (Sep 11 2024 at 11:01):
How would I get it to output all of tips
to a file?
Damiano Testa (Sep 11 2024 at 11:02):
You would like the "lighter" one, right? Just the direct dependencies and then you take care of taking the transitive closure?
Damiano Testa (Sep 11 2024 at 11:04):
(I am saying this, since tips
is what I have been using for the expr-maximal declarations, whereas I imagine that you want a list of pairs (declaration, array of declarations that are directly used in the first entry).)
Damiano Testa (Sep 11 2024 at 11:04):
(And this is not directly available from the script, but could probably be generated.)
Daniel Weber (Sep 11 2024 at 11:06):
Damiano Testa said:
You would like the "lighter" one, right? Just the direct dependencies and then you take care of taking the transitive closure?
Yes
Damiano Testa (Sep 11 2024 at 11:11):
This list would contain, roughly, 20k entries, each with a corresponding array of roughly 300 entries. So, roughly, 10^8 characters (I'm trying to overestimate, using that each declaration name is 100 characters, since there no longer are insane instance names :smile: ).
Damiano Testa (Sep 11 2024 at 11:12):
It does not sound insane, but I am not sure that I should make GitHub produce that in CI...
Daniel Weber (Sep 11 2024 at 11:13):
Would it be possible to number them? i.e. Say that 2 uses 4, 6, and 9, and have the map to the constant name separate
Damiano Testa (Sep 11 2024 at 11:14):
It is certainly possible and since everything goes via stuff that is called HashSomething
I imagine that Lean is already doing that internally... But maybe someone who is more knowledgeable of actual coding should help here...
Daniel Weber (Sep 11 2024 at 12:06):
I managed to write
elab "#direct_deps" : command => do
let env ← getEnv
let mut name_ids : Std.HashMap Name Nat := ← liftCoreM do
env.constants.map₁.foldM (init := {}) fun map n ci => do
if (← n.isBlackListed) || n.getRoot == `Lean then return map
else return map.insert n map.size
let mut tips : Std.HashMap Name NameSet := ← liftCoreM do
env.constants.map₁.foldM (init := {}) fun map n ci => do
if (← n.isBlackListed) || n.getRoot == `Lean then return map
else return map.insert n (ci.getUsedConstantsAsSet)
IO.FS.withFile ((← IO.currentDir) / "test.txt") (.write) fun hfile ↦ do
hfile.putStr s!"{name_ids.size}\n"
for (name, id) in name_ids do
hfile.putStr s!"{name} {id}\n"
for (name, maps_to) in tips do
hfile.putStr (s!"{name_ids.get! name} {maps_to.size} " ++
" ".intercalate (maps_to.toList.map (toString ∘ name_ids.get!)) ++ "\n")
it seems to be working. Do you know if there's a way to get a nice progress bar, like Python's tqdm?
Damiano Testa (Sep 11 2024 at 12:15):
I don't think that there is a way to get a progress bar. I also don't think that it is possible to print something "mid-computation". I think it is all-or-nothing, unfortunately.
Damiano Testa (Sep 11 2024 at 12:16):
What I thought might have been an option was to try to split the computation in two parts and get a "halfway report", but did not think hard about it.
Daniel Weber (Sep 11 2024 at 12:28):
Does VS code compile and run this, or is this getting interpreted? It still hadn't finished, and it's only with what Tips.lean
imports
Damiano Testa (Sep 11 2024 at 12:30):
I do not really know the difference between interpret and compile, but I think that it is the slower of the two.
Damiano Testa (Sep 11 2024 at 12:33):
Also, in CI, the commented import Mathlib
gets uncommented so that the code runs on all of mathlib, but I was using the lighter imports for testing.
Daniel Weber (Sep 11 2024 at 13:25):
I did this, I'm not sure how well it's working
/-
Copyright (c) 2024 Damiano Testa. All rights reserved.
Released under Apache 2.0 license as described in the file LICENSE.
Authors: Damiano Testa
-/
import ImportGraph
import Cli
--import Mathlib
/-!
# Find declarations that are not used by other declarations
-/
open Lean Meta Elab Command
def isMatcher' (env : Environment) (declName : Name) : Bool :=
isMatcherCore env declName
def isRec' (env : Environment) (declName : Name) : Bool :=
isRecCore env declName
def Lean.Name.isBlackListed' (env : Environment) (declName : Name) : Id Bool := do
if declName == ``sorryAx then return true
if declName matches .str _ "inj" then return true
if declName matches .str _ "noConfusionType" then return true
pure <| declName.isInternalDetail
|| isAuxRecursor env declName
|| isNoConfusion env declName
|| isRec' env declName || isMatcher' env declName
/--
`lake exe deps`
-/
def mainCli (args : Cli.Parsed) : IO UInt32 := do
let to ← match args.flag? "to" with
| some to => pure <| to.as! Cli.ModuleName
| none => ImportGraph.getCurrentModule -- autodetect the main module from the `lakefile.lean`
IO.eprintln "test"
searchPathRef.set compile_time_search_path%
IO.FS.withFile ((← IO.currentDir) / "test.txt") .write fun hfile ↦ do
unsafe withImportModules #[{module := to}] {} (trustLevel := 1024) fun env => do
let mut name_ids : NameMap Int := {}
for (n, _) in env.constants do
if (← n.isBlackListed' env) || n.getRoot == `Lean then pure () else
if name_ids.size % 10 == 0 then IO.eprintln s!"finished {name_ids.size}"
name_ids := name_ids.insert n name_ids.size
hfile.putStrLn s!"{name_ids.size}"
for (n, ci) in env.constants do
if (← n.isBlackListed' env) || n.getRoot == `Lean then pure () else
let names := (ci.getUsedConstantsAsSet.toList.filterMap name_ids.find?).map toString
hfile.putStrLn (s!"{n} {name_ids.find! n} {names.length} " ++ " ".intercalate names)
return 0
/-- Setting up command line options and help text for `lake exe pole`. -/
def deps : Cli.Cmd := `[Cli|
deps VIA mainCli; ["0.0.1"]
"Calculat a thing."
FLAGS:
to : Cli.ModuleName; "Calculate the longest thing with the thing."
]
/-- `lake exe deps` -/
def main (args : List String) : IO UInt32 :=
deps.validate args
Daniel Weber (Sep 11 2024 at 15:02):
It finished running and the file is about 50MB. I'll do some analysis, is there anything in particular which might be interesting to look at?
Daniel Weber (Sep 11 2024 at 15:14):
I wanted to check that it's a DAG as a sanity check, but surprisingly it's not - there are SCCs of size 28, for instance:
- CommRingCat.Colimits.Relation
- CommRingCat.Colimits.Relation.mul
- CommRingCat.Colimits.Relation.neg_add_cancel
- CommRingCat.Colimits.Relation.zero_mul
- CommRingCat.Colimits.Relation.one_mul
- CommRingCat.Colimits.Relation.zero
- CommRingCat.Colimits.Relation.neg
- CommRingCat.Colimits.Relation.left_distrib
- CommRingCat.Colimits.Relation.mul_comm
- CommRingCat.Colimits.Relation.add_1
- CommRingCat.Colimits.Relation.add_zero
- CommRingCat.Colimits.Relation.map
- CommRingCat.Colimits.Relation.refl
- CommRingCat.Colimits.Relation.add_2
- CommRingCat.Colimits.Relation.mul_zero
- CommRingCat.Colimits.Relation.zero_add
- CommRingCat.Colimits.Relation.one
- CommRingCat.Colimits.Relation.mul_1
- CommRingCat.Colimits.Relation.mul_assoc
- CommRingCat.Colimits.Relation.add_comm
- CommRingCat.Colimits.Relation.symm
- CommRingCat.Colimits.Relation.right_distrib
- CommRingCat.Colimits.Relation.trans
- CommRingCat.Colimits.Relation.add_assoc
- CommRingCat.Colimits.Relation.add
- CommRingCat.Colimits.Relation.mul_2
- CommRingCat.Colimits.Relation.mul_one
- CommRingCat.Colimits.Relation.neg_1
Daniel Weber (Sep 11 2024 at 15:15):
What is CommRingCat.Colimits.Relation.below
? I don't see it on the docs - I think I might not be filtering implementation details correctly
Markus Himmel (Sep 11 2024 at 15:17):
below
is explained in Theorem Proving in Lean: https://lean-lang.org/theorem_proving_in_lean4/induction_and_recursion.html?highlight=below#structural-recursion-and-induction
Daniel Weber (Sep 11 2024 at 15:31):
The height (longest chain) of Mathlib is 261 (note - edited):
- docs#fermatLastTheoremThree
- docs#FermatLastTheoremForThreeGen.Solution.exists_Solution_multiplicity_lt
- docs#FermatLastTheoremForThreeGen.Solution.Solution'_descent_multiplicity_lt
- docs#FermatLastTheoremForThreeGen.Solution.Solution'_descent_multiplicity
- docs#FermatLastTheoremForThreeGen.Solution.Solution'_descent
- docs#IsCyclotomicExtension.Rat.Three.eq_one_or_neg_one_of_unit_of_congruent
- docs#IsCyclotomicExtension.Rat.Three.Units.mem
- docs#NumberField.Units.exist_unique_eq_mul_prod
- docs#NumberField.Units.fun_eq_repr
- docs#NumberField.Units.fundSystem
- docs#NumberField.Units.basisModTorsion
- docs#NumberField.Units.rank_modTorsion
- docs#NumberField.Units.unitLattice_rank
- docs#NumberField.Units.instZLattice_unitLattice
- docs#NumberField.Units.dirichletUnitTheorem.unitLattice_span_eq_top
- docs#NumberField.Units.dirichletUnitTheorem.exists_unit
- docs#NumberField.Units.dirichletUnitTheorem.seq_norm_ne_zero
- docs#NumberField.Units.dirichletUnitTheorem.seq_ne_zero
- docs#NumberField.Units.dirichletUnitTheorem.seq
- docs#NumberField.Units.dirichletUnitTheorem.seq_next
- docs#NumberField.mixedEmbedding.exists_ne_zero_mem_ringOfIntegers_lt
- docs#NumberField.mixedEmbedding.exists_ne_zero_mem_ideal_lt
- docs#NumberField.mixedEmbedding.mem_span_fractionalIdealLatticeBasis
- docs#NumberField.mixedEmbedding.fractionalIdealLatticeBasis_apply
- docs#NumberField.mixedEmbedding.fractionalIdealLatticeBasis
- docs#NumberField.mixedEmbedding.det_basisOfFractionalIdeal_eq_norm
- docs#NumberField.mixedEmbedding.latticeBasis_repr_apply
- docs#NumberField.mixedEmbedding.mem_rat_span_latticeBasis
- docs#NumberField.mixedEmbedding.latticeBasis_apply
- docs#NumberField.mixedEmbedding.latticeBasis
- docs#NumberField.mixedEmbedding.disjoint_span_commMap_ker
- docs#NumberField.canonicalEmbedding.latticeBasis_apply
- docs#NumberField.canonicalEmbedding.latticeBasis
- docs#Complex.isAlgClosed
- docs#Complex.exists_root
- docs#Differentiable.apply_eq_of_tendsto_cocompact
- docs#Differentiable.eq_const_of_tendsto_cocompact
- docs#Differentiable.exists_eq_const_of_bounded
- docs#Differentiable.exists_const_forall_eq_of_bounded
- docs#Differentiable.apply_eq_apply_of_bounded
- docs#Complex.liouville_theorem_aux
- docs#Complex.norm_deriv_le_of_forall_mem_sphere_norm_le
- docs#Complex.norm_deriv_le_aux
- docs#Complex.deriv_eq_smul_circleIntegral
- docs#DiffContOnCl.hasFPowerSeriesOnBall
- docs#Complex.hasFPowerSeriesOnBall_of_differentiable_off_countable
- docs#Complex.two_pi_I_inv_smul_circleIntegral_sub_inv_smul_of_differentiable_on_off_countable
- docs#Complex.circleIntegral_sub_inv_smul_of_differentiable_on_off_countable_aux
- docs#circleIntegral.integral_sub_inv_of_mem_ball
- docs#circleIntegral.integral_sub_zpow_of_ne
- docs#circleIntegral.integral_sub_zpow_of_undef
- docs#circleIntegrable_sub_zpow_iff
- docs#not_intervalIntegrable_of_sub_inv_isBigO_punctured
- docs#not_intervalIntegrable_of_tendsto_norm_atTop_of_deriv_isBigO_punctured
- docs#not_intervalIntegrable_of_tendsto_norm_atTop_of_deriv_isBigO_within_diff_singleton
- docs#not_intervalIntegrable_of_tendsto_norm_atTop_of_deriv_isBigO_filter
- docs#not_integrableOn_of_tendsto_norm_atTop_of_deriv_isBigO_filter
- docs#not_integrableOn_of_tendsto_norm_atTop_of_deriv_isBigO_filter_aux
- docs#intervalIntegral.integral_deriv_eq_sub
- docs#intervalIntegral.integral_eq_sub_of_hasDerivAt
- docs#intervalIntegral.integral_eq_sub_of_hasDeriv_right
- docs#intervalIntegral.integral_eq_sub_of_hasDeriv_right_of_le
- docs#intervalIntegral.integral_eq_sub_of_hasDeriv_right_of_le_real
- docs#intervalIntegral.integral_le_sub_of_hasDeriv_right_of_le
- docs#intervalIntegral.sub_le_integral_of_hasDeriv_right_of_le
- docs#intervalIntegral.continuousOn_primitive_interval_left
- docs#intervalIntegral.continuousOn_primitive_interval
- docs#intervalIntegral.continuousOn_primitive_interval'
- docs#intervalIntegral.continuousWithinAt_primitive
- docs#intervalIntegral.integral_indicator
- docs#MeasureTheory.integral_indicator
- docs#MeasureTheory.integral_add_compl
- docs#MeasureTheory.integral_add_compl₀
- docs#MeasureTheory.integral_union_ae
- docs#MeasureTheory.integral_add_measure
- docs#MeasureTheory.setToFun_congr_measure_of_add_left
- docs#MeasureTheory.setToFun_congr_measure_of_integrable
- docs#MeasureTheory.continuous_setToFun
- docs#MeasureTheory.L1.setToFun_eq_setToL1
- docs#MeasureTheory.setToFun_eq
- docs#MeasureTheory.setToFun
- docs#MeasureTheory.L1.setToL1
- docs#MeasureTheory.Lp.simpleFunc.denseRange
- docs#MeasureTheory.Lp.simpleFunc.denseInducing
- docs#MeasureTheory.Lp.simpleFunc.denseEmbedding
- docs#MeasureTheory.SimpleFunc.tendsto_approxOn_range_Lp
- docs#MeasureTheory.Lp.tendsto_Lp_iff_tendsto_ℒp''
- docs#MeasureTheory.Lp.tendsto_Lp_iff_tendsto_ℒp'
- docs#MeasureTheory.Lp.instNormedAddCommGroup
- docs#MeasureTheory.Lp.norm_eq_zero_iff
- docs#MeasureTheory.Lp.nnnorm_eq_zero_iff
- docs#MeasureTheory.Lp.nnnorm_zero
- docs#MeasureTheory.Lp.nnnorm_def
- docs#MeasureTheory.Lp.instNNNorm
- docs#MeasureTheory.Lp.instNorm
- docs#MeasureTheory.Lp
- docs#MeasureTheory.eLpNorm_add_lt_top
- docs#MeasureTheory.eLpNorm_add_le'
- docs#MeasureTheory.eLpNorm_add_le
- docs#MeasureTheory.eLpNorm'_add_le
- docs#ENNReal.lintegral_Lp_add_le
- docs#ENNReal.lintegral_rpow_add_le_add_eLpNorm_mul_lintegral_rpow_add
- docs#ENNReal.lintegral_mul_rpow_le_lintegral_rpow_mul_lintegral_rpow
- docs#ENNReal.lintegral_mul_le_Lp_mul_Lq
- docs#ENNReal.lintegral_mul_le_Lp_mul_Lq_of_ne_zero_of_ne_top
- docs#ENNReal.lintegral_rpow_funMulInvSnorm_eq_one
- docs#ENNReal.funMulInvSnorm_rpow
- docs#ENNReal.inv_rpow
- docs#ENNReal.mul_rpow_of_ne_zero
- docs#ENNReal.mul_rpow_eq_ite
- docs#ENNReal.rpow_eq_zero_iff
- docs#ENNReal.zero_rpow_of_pos
- docs#ENNReal.instPowReal
- docs#ENNReal.rpow
- docs#NNReal.instPowReal
- docs#NNReal.rpow
- docs#Real.rpow_nonneg
- docs#Real.rpow_def_of_nonneg
- docs#Real.instPow
- docs#Real.rpow
- docs#Complex.instPow
- docs#Complex.cpow
- docs#Complex.log
- docs#Complex.arg
- docs#Real.arcsin
- docs#Real.sinOrderIso
- docs#Real.bijOn_sin
- docs#Real.injOn_sin
- docs#Real.strictMonoOn_sin
- docs#Real.sin_lt_sin_of_lt_of_le_pi_div_two
- docs#Real.cos_pos_of_mem_Ioo
- docs#Real.sin_add_pi_div_two
- docs#Real.sin_pi_div_two
- docs#Real.sin_pos_of_pos_of_lt_pi
- docs#Real.sin_pi_sub
- docs#Real.sin_antiperiodic
- docs#Real.sin_pi
- docs#Real.cos_pi_div_two
- docs#Real.pi
- docs#Real.exists_cos_eq_zero
- docs#Real.continuousOn_cos
- docs#Real.continuous_cos
- docs#Complex.continuous_re
- docs#Complex.reCLM
- docs#NormedSpace.complexToReal
- docs#NormedSpace.restrictScalars
- docs#RestrictScalars.normedSpace
- docs#norm_algebraMap'
- docs#norm_algebraMap
- docs#norm_smul
- docs#norm_inv
- docs#normHom
- docs#NormedDivisionRing.to_normOneClass
- docs#norm_eq_zero
- docs#norm_eq_zero'
- docs#norm_le_zero_iff'
- docs#MetricSpace.ofT0PseudoMetricSpace
- docs#Metric.inseparable_iff
- docs#EMetric.inseparable_iff
- docs#EMetric.mem_closure_iff
- docs#EMetric.nhds_basis_eball
- docs#uniformity_basis_edist
- docs#uniformSpace_edist
- docs#uniformSpaceOfEDist
- docs#ENNReal.half_pos
- docs#ENNReal.div_pos_iff
- docs#ENNReal.div_eq_zero_iff
- docs#ENNReal.inv_eq_zero
- docs#ENNReal.instInvolutiveInv
- docs#ENNReal.coe_inv
- docs#ENNReal.coe_inv_le
- docs#ENNReal.coe_le_coe
- docs#ENNReal.instCompleteLinearOrder
- docs#NNReal.instConditionallyCompleteLinearOrderBot
- docs#Real.instConditionallyCompleteLinearOrder
- docs#Real.is_glb_sInf
- docs#Real.sInf_def
- docs#Real.instInfSet
- docs#Real.instSupSet
- docs#Real.exists_isLUB
- docs#Real.instFloorRing
- docs#Real.instArchimedean
- docs#Real.mk_le_of_forall_le
- docs#Real.instDivisionRing
- docs#Real.field
- docs#Real.instLinearOrderedField
- docs#Real.linearOrderedCommRing
- docs#Real.nontrivial
- docs#Real.strictOrderedSemiring
- docs#Real.strictOrderedCommSemiring
- docs#Real.instStrictOrderedCommRing
- docs#Real.instPreorder
- docs#Real.partialOrder
- docs#Real.mk_le
- docs#Real.mk_lt
- docs#Real.lt_cauchy
- docs#Real.instLT
- docs#Real docs#Real.ofCauchy
- docs#Rat.instLinearOrderedRing
- docs#Rat.instLinearOrderedCommRing
- docs#Rat.instPreorder
- docs#Rat.instPartialOrder
- docs#Rat.instSemilatticeInf
- docs#Rat.instLattice
- docs#Rat.instDistribLattice
- docs#Rat.linearOrder
- docs#Rat.le_total
- docs#Rat.le_iff_sub_nonneg
- docs#Rat.addGroup
- docs#Rat.addCommGroup
- docs#Rat.add_assoc
- docs#Rat.numDenCasesOn'
- docs#Rat.numDenCasesOn
- docs#Rat.mk'_eq_divInt
- docs#Rat.num_divInt_den
- docs#Rat.divInt_self
- docs#Rat.mkRat_self
- docs#Rat.normalize_self
- docs#Rat.mk_eq_normalize
- docs#Rat.normalize_eq
- docs#Rat.normalize.reduced'
- docs#Int.div_eq_ediv_of_dvd
- docs#Int.ediv_eq_iff_eq_mul_left
- docs#Int.ediv_eq_iff_eq_mul_right
- docs#Int.eq_mul_of_ediv_eq_right
- docs#Int.mul_ediv_cancel'
- docs#Int.ediv_mul_cancel
- docs#Int.emod_eq_zero_of_dvd
- docs#Int.mul_emod_right
- docs#Int.mul_emod_left
- docs#Int.add_mul_emod_self
- docs#Int.add_mul_ediv_right
- docs#Int.lt_trichotomy
- docs#Int.not_le
- docs#Int.lt_of_not_ge
- docs#Int.lt_iff_le_not_le
- docs#Int.lt_iff_le_and_ne
- docs#Int.ne_of_lt
- docs#Int.lt_irrefl
- docs#Int.lt.dest
- docs#Int.add_left_comm
- docs#Int.add_assoc
- docs#Int.add_assoc.aux2
- docs#Int.subNatNat_add_negSucc
- docs#Int.subNatNat_sub
- docs#Int.subNatNat_add_add
- docs#Int.subNatNat_elim
- docs#Nat.lt_of_sub_eq_succ
- docs#Nat.lt_of_sub_pos
- docs#Nat.lt_of_sub_ne_zero
- docs#Nat.not_le
- docs#Nat.not_le_of_gt
- docs#Nat.le_antisymm
- docs#Nat.lt_irrefl
- docs#Nat.not_succ_le_self
- docs#Nat.le_of_succ_le_succ
- docs#Nat.pred_le_pred
- docs#Nat.le_succ
- docs#instLENat
too long for Zulip
Daniel Weber (Sep 11 2024 at 15:37):
And in Init
it's 43:
- docs#BitVec.sshiftRight_eq_sshiftRightRec
- docs#BitVec.sshiftRightRec_eq
- docs#BitVec.sshiftRight'_or_of_and_eq_zero
- docs#BitVec.sshiftRight_add
- docs#BitVec.sshiftRight_msb_eq_msb
- docs#BitVec.getLsbD_sshiftRight
- docs#BitVec.sshiftRight_eq_of_msb_false
- docs#BitVec.sshiftRight_eq
- docs#BitVec.toInt_ofInt
- docs#Int.emod_bmod_congr
- docs#Int.emod_emod
- docs#Int.add_mul_emod_self_left
- docs#Int.add_mul_emod_self
- docs#Int.add_mul_ediv_right
- docs#Int.lt_trichotomy
- docs#Int.not_le
- docs#Int.lt_of_not_ge
- docs#Int.lt_iff_le_not_le
- docs#Int.lt_iff_le_and_ne
- docs#Int.ne_of_lt
- docs#Int.lt_irrefl
- docs#Int.lt.dest
- docs#Int.add_left_comm
- docs#Int.add_assoc
- docs#Int.add_assoc.aux2
- docs#Int.subNatNat_add_negSucc
- docs#Int.subNatNat_sub
- docs#Int.subNatNat_add_add
- docs#Int.subNatNat_elim
- docs#Nat.lt_of_sub_eq_succ
- docs#Nat.lt_of_sub_pos
- docs#Nat.lt_of_sub_ne_zero
- docs#Nat.not_le
- docs#Nat.not_le_of_gt
- docs#Nat.le_antisymm
- docs#Nat.lt_irrefl
- docs#Nat.not_succ_le_self
- docs#Nat.le_of_succ_le_succ
- docs#Nat.pred_le_pred
- docs#Nat.le_succ
- docs#instLENat
- docs#Nat.le docs#Nat.le.step docs#Nat.le.refl
- docs#Nat docs#Nat.zero docs#Nat.succ
Daniel Weber (Sep 12 2024 at 02:03):
Here are the most commonly used constants (directly):
- docs#Eq: 136377
- docs#congrArg: 68639
- docs#OfNat.ofNat: 66298
- docs#id: 52950
- docs#Nat: 51203
- docs#Eq.refl: 47945
- docs#Membership.mem: 47637
- docs#DFunLike.coe: 47548
- docs#Eq.mpr: 47412
- docs#Set: 45776
- docs#Eq.trans: 37794
- docs#PartialOrder.toPreorder: 36641
- docs#Zero.toOfNat0: 34565
- docs#Semiring.toNonAssocSemiring: 33889
- docs#Eq.symm: 33720
- docs#LE.le: 32398
- docs#of_eq_true: 30779
- docs#True: 30187
- docs#instOfNatNat: 29598
- docs#HAdd.hAdd: 28914
Daniel Weber (Sep 12 2024 at 03:28):
And transitively:
- docs#Eq docs#Eq.refl: 216505
- docs#outParam: 194929
- docs#rfl: 190300
- docs#autoParam: 179122
- docs#Iff docs#Iff.intro: 178574
- docs#Eq.symm: 173433
- docs#congrArg: 170376
- docs#id: 167147
- docs#Nat docs#Nat.succ docs#Nat.zero: 166751
- docs#True docs#True.intro: 164211
- docs#OfNat.mk docs#OfNat: 162772
- docs#OfNat.ofNat: 162764
- docs#False: 160825
- docs#Not: 159198
- docs#propext: 158829
- docs#trivial: 158664
- docs#And docs#And.intro: 157667
- docs#Eq.mpr: 157456
- docs#Eq.trans: 157240
- docs#letFun: 155946
Damiano Testa (Sep 12 2024 at 09:04):
If you are taking requests, I think that this might be in spirit similar to David's prominence.
The top "tips" probably all share a large number of common ancestors and therefore should probably all be considered "about the same". What is the next "group" of declarations? Maybe considering two "tips" close if they share more than 99% of ancestors, can you find a tip that is not close to the top one? Maybe varying 99% to some other more appropriate value.
Daniel Weber (Sep 12 2024 at 09:35):
I currently have a bug in my data, because of you only look at direct dependencies you can't just ignore private decelerations
Daniel Weber (Sep 12 2024 at 11:33):
David Loeffler said:
Michael Stoll said:
Number Theory rules!
In mountaineering, people talk about the "prominence" of a mountain peak, defined as the height of that peak above the saddle connecting it to the next highest peak. By analogy with this, I wonder what the "most prominent" mathlib theorems are? I.e. what are the first few elements of the list characterised by
L[N + 1] := the theorem "x" which maximizes # { prerequisites of x which are not prerequisites of any of L[1], ..., L[N] }.
Presumably L[2] would be something outside of number theory – I'm guessing it would be one of the results about C-star algebras that are the highest-ranked non-number-theory items on the current list.
Here are the first 30 values of L, and the number of new prerequisites they add:
- docs#fermatLastTheoremThree: 55694
- docs#elabSuppressCompilationDecl: 7164
- docs#CategoryTheory.Sheaf.cohomologyPresheaf: 7011
- docs#ZMod.LFunction_one_sub: 5268
- docs#AlgebraicGeometry.locallyOfFinitePresentation_stableUnderBaseChange: 4774
- docs#LieAlgebra.IsKilling.isReduced_rootSystem: 3819
- docs#Std.Tactic.BVDecide.Reflect.unsat_of_verifyBVExpr_eq_true: 3256
- docs#LightCondMod.epi_iff_locallySurjective_on_lightProfinite: 3049
- docs#CStarRing.rpow_neg_one_le_one: 2826
- docs#Aesop.expandNextGoal: 2121
- docs#LipschitzWith.ae_differentiableAt: 2038
- docs#ProbabilityTheory.set_integral_condKernel_univ_left: 1721
- docs#groupCohomologyIsoExt: 1616
- docs#RatFunc.valuation_eq_LaurentSeries_valuation: 1524
- docs#rothNumberNat_isLittleO_id: 1388
- docs#FirstOrder.Field.ACF_zero_realize_iff_finite_ACF_prime_not_realize: 1227
- docs#SchwartzMap.fourierTransformCLE_symm_apply: 1172
- docs#WittVector.equiv: 1109
- docs#frattini_nilpotent: 1077
- docs#CategoryTheory.Abelian.DoldKan.equivalence_inverse: 1066
- docs#EuclideanGeometry.concyclic_or_collinear_of_two_zsmul_oangle_eq: 1054
- docs#spinGroup.conjAct_smul_range_ι: 953
- docs#Mathlib.Tactic.RewriteSearch.SearchNode.search: 802
- docs#WeierstrassCurve.Jacobian.Point.instAddCommGroup: 799
- docs#AdicCompletion.flat_of_isNoetherian: 752
- docs#Aesop.evalAesop: 718
- docs#Surreal.dyadicMap_apply_pow: 706
- docs#Mathlib.Tactic.Widget.StringDiagram.rpc: 707
- docs#SimpleGraph.card_ConnectedComponent_eq_rank_ker_lapMatrix: 656
- docs#CategoryTheory.PreGaloisCategory.FiberFunctor.isPretransitive_of_isConnected: 650
Johan Commelin (Sep 12 2024 at 11:55):
Pretty cool! I'm proud that WittVector.equiv
is in the list :smiley:
Violeta Hernández (Sep 16 2024 at 13:12):
No idea what Surreal.dyadicMap_apply_pow
is doing there... Is it just that it's unused and uses some heavy number theory stuff?
Violeta Hernández (Sep 16 2024 at 13:13):
I guess it does help how isolated game theory is from everything else
Felix Weilacher (Nov 04 2024 at 23:29):
Question from a noob: How does the #tally!
from this thread differ from Lean.Name.transitivelyUsedConstants
?
Damiano Testa (Nov 04 2024 at 23:35):
Probably just that it was written before the function.
Last updated: May 02 2025 at 03:31 UTC