Zulip Chat Archive
Stream: mathlib4
Topic: Task 26: Replace Type u by Type* wherever possible
Martin Dvořák (Apr 22 2025 at 08:48):
Claiming task 26.
Jovan Gerbscheid (Apr 22 2025 at 12:10):
I think the difference is that Type*
introduces a fresh universe, while in Type _
, the level can be inferred by unification (or be fresh if it isn't unified to be anything else).
But there might still be instances in Mathlib where this replacement is possible.
Notification Bot (Apr 22 2025 at 12:13):
A message was moved here from #mathlib4 > Call for help: technical/ organisational debt by Yaël Dillies.
Matthew Ballard (Apr 22 2025 at 12:48):
Re 26: I would be very very careful with this. We have been encountering significant still-unidentified slowdowns that depend on the choice of universe name.
Re 27: I would replace Type _
- first by
Type*
if possible - second by
Type whatever
wherewhatever
is the universe level it actually unifies to.
Ruben Van de Velde (Apr 22 2025 at 12:50):
I would not replace Type _
at all in cases where Type*
doesn't work
Edward van de Meent (Apr 22 2025 at 12:51):
actually, i think that Type _
happens precisely because arguments have Type*
, since you don't get to refer to the universe variable created?
for example:
def Foo {A : Type*} (s:Set A) : Type _ := s
Matthew Ballard (Apr 22 2025 at 13:01):
There has been at least one case where Type _
really meant Type 0
(oops). I don't like the obfuscation if it can reasonably be avoided but it is not high up my priority list.
Martin Dvořák (Apr 22 2025 at 13:02):
Matthew Ballard said:
Re 26: I would be very very careful with this. We have been encountering significant still-unidentified slowdowns that depend on the choice of universe name.
I believe readability is more important than compilation speed (having one more variable in the context increases the cognitive load for readers); however, we will benchmark the PR to check it isn't too bad.
Matthew Ballard (Apr 22 2025 at 13:04):
Please be aware of #12737 and request my review
Jovan Gerbscheid (Apr 22 2025 at 14:07):
My understanding is that #12737 tracks slowdowns caused by universe metavariables. But Type*
(unlike Type _
) introduces a fresh universe parameter, not a metavariable. So I think going from Type u
to Type*
isn't bad for performance
Martin Dvořák (Apr 22 2025 at 14:33):
WIP part 1 (Algebra without category theory): #24285
Junyan Xu (Apr 22 2025 at 14:54):
Is there any automation to replace (R : Type*) (S : Type*) (M : Type*)
by (R S M : Type*)
and {R : Type*} {A : Type*}
by {R A : Type*}
, etc.?
Martin Dvořák (Apr 22 2025 at 14:57):
I guess you could write a regex for the basic cases.
Yaël Dillies (Apr 22 2025 at 14:57):
Not that I know of, but I have many times wished there was!
Martin Dvořák (Apr 22 2025 at 15:00):
I suggest not to do it in this PR (because of the way diffs are visualized).
Yury G. Kudryashov (Apr 22 2025 at 15:15):
Also note that
- category theory relies on a specific order of universe variables, so that
Category.{u} C
means "category with morphisms inType u
"; - definitions and theorems about docs#ULift should use the same order of universe arguments as
ULift
does (this is not always the case at the moment), for similar reasons.
Matthew Ballard (Apr 22 2025 at 15:20):
Jovan Gerbscheid said:
My understanding is that #12737 tracks slowdowns caused by universe metavariables. But
Type*
(unlikeType _
) introduces a fresh universe parameter, not a metavariable. So I think going fromType u
toType*
isn't bad for performance
Yes that is a reasonable belief but see #PR reviews > #21099 use unification hints for generators normalization of levels includes sorting of names which, last I checked, behaves unexpectedly for names with more than one part or more generally outside of ASCII. If anyone has more time to track down the offending code in core and locate the proper changes, please do.
Eric Wieser (Apr 22 2025 at 15:33):
Generally if you're working in an area where the universes are an important part of the theory (like ULift
and category theory), you should write
universe u v
variable {X : Type u} {Y : Type v}
rather than Type*
Jovan Gerbscheid (Apr 22 2025 at 15:34):
The changes that @Martin Dvořák is proposing also include removing explicit universe annotations (e.g. Category.{v} C
-> Category C
). This does introduce universe metavariables, so this may give a slowdown.
Martin Dvořák (Apr 22 2025 at 15:35):
Given the info I have just learnt, I will revert the changes related to categories.
Jovan Gerbscheid (Apr 22 2025 at 15:42):
I looked a bit further into the slowdowns from universe metavariables, and I think part of the problem is that isDefEq
has two kinds of caches: transient
and permanent
(the permanent
one is also temporary, but resets less often). Having (universe) metavariables means that the permanent
cache cannot be used.
Matthew Ballard (Apr 22 2025 at 15:44):
I think I tried using docs#Lean.Expr.hasExprMVar instead at some point for the cache. It did help a bit
Jovan Gerbscheid (Apr 22 2025 at 15:47):
That seems reasonable but I'd also worry if that can introduce bugs in isDefEq
.
Matthew Ballard (Apr 22 2025 at 15:47):
It was very much a hack which is why I didn’t pursue it at all
Martin Dvořák (Apr 22 2025 at 16:16):
PR #24285 is ready for review.
Martin Dvořák (Apr 22 2025 at 16:55):
Benchmark +25.08% in Mathlib.Algebra.Module.Presentation.Tensor
...
Is it too much? Should I revert the changes in this file?
PS: I have just noticed that the file is not imported by any other Mathlib file except the sink. Are the changes I made there valid, first of all?
Jovan Gerbscheid (Apr 22 2025 at 17:01):
Interesting. This should be caused by removing the explicit universe annotation in
variable (relations₁ : Relations.{w₁₀, w₁₁} A)
I wonder if we should introduce a notation
variable (relations₁ : Relations.{*, *} A)
In the same spirit as Type*
Jovan Gerbscheid (Apr 22 2025 at 17:03):
I think all your changes are valid. I think the only one where you actually changed the definition is Mathlib/Algebra/Quotient.lean
, but I agree with that change.
Martin Dvořák (Apr 22 2025 at 17:05):
Ah yeah, this is the one I was worried about the most.
Matthew Ballard (Apr 22 2025 at 17:05):
@Adam Topaz had thought about a Category*
version in the past.
I haven’t looked yet but this sounds like a smaller scale incident of what I described before.
Matthew Ballard (Apr 22 2025 at 17:07):
branch#mrb/merge_type_stars merges some Type*
’s Please feel free to PR but not that there is currently a Python script to understand the chances. It should probably be removed in the end.
I will be unavailable for 24-48 hours.
Junyan Xu (Apr 22 2025 at 17:13):
Thanks! Another automation target is to remove declared universe variables that are no longer used.
Martin Dvořák (Apr 22 2025 at 17:15):
Absolutely! There are plenty.
Kyle Miller (Apr 22 2025 at 17:22):
Jovan Gerbscheid said:
So I think going from
Type u
toType*
isn't bad for performance
Matthew mentioned an issue (though I'm not sure about the ASCII part, since Type*
only creates names of the form u_1
, u_2
, ...), but another is that Type*
adds the new level parameter to the end of the level parameter list, which may result in a different order. Yury mentioned this in the context of API, but I'm mentioning this as something that could cause performance differences (though I don't know if this can cause significant problems).
(I like the idea of *
as a fresh universe level parameter. If it makes it to core, I think it would need to be rethought — it's odd how Type*
creates non-hygienic level variables — you can refer to u_1
for example.)
Jovan Gerbscheid (Apr 22 2025 at 17:26):
Yes, I think the implementation of introducing a fresh universe level on the fly should be redone. Although the issue of universe hygiene is more fundamental; that also exists when using Type _
.
Kyle Miller (Apr 22 2025 at 17:29):
A couple options:
*
could create an inaccessible name, and these names can be cleaned in a pass before the declaration is added to the environment.*
could create a level metavariable along with a record in the TermElab state that it is an error if this metavariable unifies with anything (there would be a pass at the end checking for this, flagging the place where the level was introduced along with what it unified to). This is nice since it works just likeType _
except for the fact it can throw an error when the uniqueness assumption isn't met.
Kyle Miller (Apr 22 2025 at 17:29):
Jovan Gerbscheid said:
that also exists when using
Type _
What hygiene issue are you talking about?
Jovan Gerbscheid (Apr 22 2025 at 17:33):
example (a : Type _) : a = a := Eq.refl (a : Type u_1)
I think this is what you were referring to: the universe level u_1
is valid here.
Kyle Miller (Apr 22 2025 at 17:37):
Yeah, that's related, and it's using the same system as Type*
too. (Type*
works by creating a metavariable and immediately using levelMVarToParam
on it, rather than waiting for it to be applied at a later elaboration checkpoint. For theorems, that's before elaborating the body.)
Jovan Gerbscheid (Apr 22 2025 at 17:39):
I think *
should create a level parameter, not a metavariable, because we need the function Expr.hasMVar
to return false for the better isDefEq
cache to be used
Kyle Miller (Apr 22 2025 at 17:41):
With my second bulleted option (metavariables), this would also give a better error messages than this one without having to add checks everywhere in the language:
-- error is on the whole `theorem` line
theorem weird : True := by
have := Type*
trivial
/-
error: AddConstAsyncResult.commitConst: constant has level params [u_1] but expected []
-/
Compare to
theorem not_so_weird : True := by
have := Type _
--~~ error position
trivial
/-
failed to infer universe levels in 'have' declaration type
Type (?u.635 + 1)
-/
Jovan Gerbscheid (Apr 22 2025 at 17:46):
That doesn't seem right. If we make it a def
, not_so_weird
should compile correctly (in the hypothetical implementation of *
), but using a metavariable will trigger this error.
Jovan Gerbscheid (Apr 22 2025 at 17:49):
I think it should have the same effect as the current Type*
, appart from issues around naming and hygiene.
Jovan Gerbscheid (Apr 22 2025 at 17:52):
I noticed some interesting behaviour regarding when universe metavariables get abstracted and when not:
variable (a : Type _)
-- a : Type ?u.5
-- b : Type u_1
def test (b : Type _) : True := by sorry
-- a : Type u_2
-- b : Type u_1
def test' (b : Type _) : a = a := by sorry
-- b : Type u_1
theorem test'' (b : Type _) : True := by sorry
It seems that universe metavariables are not abstracted inside the body if they are introduced via variable
and do not (at first) appear in the type, and if we are in a def
.
The file that slowed down by 25% contains isPresentationCoreTensor
, which satisfies all these conditions. You can see the offending metavariables with set_option pp.universes true
Jovan Gerbscheid (Apr 22 2025 at 17:55):
Also, the universe names are in a weird order when combining variable
and normal arguments:
variable (a₁ a₂ : Type _)
-- a₁ : Type u_3
-- a₂ : Type ?u.11
-- b₁ : Type u_1
-- b₂ : Type u_2
def test (b₁ b₂ : Type _) : a₁ = a₁ := by sorry
Kevin Buzzard (Apr 22 2025 at 18:16):
Kyle Miller said:
Matthew mentioned an issue (though I'm not sure about the ASCII part, since
Type*
only creates names of the formu_1
,u_2
, ...), but another is thatType*
adds the new level parameter to the end of the level parameter list, which may result in a different order. Yury mentioned this in the context of API, but I'm mentioning this as something that could cause performance differences (though I don't know if this can cause significant problems).
You know about the 29% speedup in the (unmerged!) #21129 caused by changing the alphabetical order of the universes, right?
We have not had the benefit of that speed-up literally thousands of times since then.
Kyle Miller (Apr 22 2025 at 18:34):
I didn't @Kevin Buzzard; and I guess that proves that changing universe variable orders could have caused a 29% slowdown (assuming we had a good order from the beginning).
Kevin Buzzard (Apr 22 2025 at 18:37):
Just to be clear, this wasn't "physical" universe order changing, this was literally renaming of the universes so that alphabetically they were in a different order. But if that can cause a 29% effect then I'm sure that reordering them can!
Kyle Miller (Apr 22 2025 at 18:42):
Jovan Gerbscheid said:
If we make it a
def
,not_so_weird
should compile correctly
Declarations with universe levels that can't be inferred from the type are a very special case, and such definitions are harder to use (since you have to provide universe levels). I don't think I'd mind if you had to opt in with an explicit universe level.
Plus: Is there any case of a universe level that's used only in the body of a declaration that couldn't just be replaced by 0
?
Kyle Miller (Apr 22 2025 at 18:52):
Jovan Gerbscheid said:
I noticed some interesting behaviour regarding when universe metavariables get abstracted and when not:
Additionally, for def
it also depends on whether the type is a proposition or not. If it's not, everything appears to remain as level metavariables.
Jovan Gerbscheid (Apr 22 2025 at 19:43):
@Kyle Miller, using explicit universes also gives the error you said is weird:
universe u
-- error is on the whole `theorem` line
theorem weird : True := by
have := Type u
trivial
So I think the problem here is just that Lean's universe error messages have room for improvement. (Although this isn't a common scenario in practice anyways)
Kyle Miller (Apr 22 2025 at 19:50):
It looks like theorem
is not removing the universe
levels that don't appear in the header. I'd expect "unknown universe level 'u'", just like
variable (x : Nat)
theorem notWeird : True := by
have := x -- unknown identifier 'x'
trivial
(Ideally it would say a bit more about how these were variables unused in the header.)
Aaron Liu (Apr 22 2025 at 19:53):
Like this?
-- unused universe parameter 'u'
theorem goodError.{u} : True := by
trivial
Kyle Miller (Apr 22 2025 at 19:56):
No, in the context of Jovan's example, one direct fix would cause it to say just "unknown universe level 'u'" but it could say something more informative like "universe level 'u' was not used in the theorem header, so it was not included" with more work.
Jovan Gerbscheid (Apr 23 2025 at 12:12):
What do we think of a set_option
that makes lean throw an error if there are universe metavariables left after elaboration? It would be false by default, but turned on in Mathlib, similar to autoImplicit. This would force the replacement of Type _
by Type*
when possible. It would be beneficial for performance, but also for readability, because it forces universes to be introduced explicitly (either with a named level, or with the new *
syntax for simple cases).
Kyle Miller (Apr 24 2025 at 13:16):
Kevin Buzzard said:
The culprits are declarations such as docs#Algebra.Extension and docs#Algebra.Generators .
Oops, I hope I didn't cause a diversion in the discussion here. I meant to say "is there any case of a universe level parameter that's used only in the body of a definition or theorem that couldn't just be replaced by 0
?" Structures definitely make use of this sometimes for a seemingly legitimate reason, as you point out. (And that certainly doesn't change the fact that structure
s doing this are harder to use!)
I was thinking about this at some point because one rule Lean could take with the "failed to infer universe levels" error is to set them to zero.
def f :=
let xs : List PUnit := []
-- ~~~~~~~~~~
true
/-
failed to infer universe levels in 'let' declaration type
List.{?u.2513} PUnit.{?u.2513 + 1}
-/
Indeed, when this error happens, Lean sets the level metavariables to 0 so that it can make progress with the rest of the file.
set_option pp.universes true
#print f
/-
def f : Bool :=
let xs := List.nil.{0};
true
-/
I don't think this could solve any of the universe level slowness you bring up in any way, unfortunately.
Notification Bot (Apr 24 2025 at 13:38):
17 messages were moved from this topic to #mathlib4 > JacobiZariski is slow. by Kevin Buzzard.
Kevin Buzzard (Apr 24 2025 at 13:40):
Oops, I hope I didn't cause a diversion in the discussion here.
You did, but that's not a bad thing! Clogging up this discussion is though (and this is my fault not yours), so I moved the relevant posts to the "JacobiZariski is slow" thread.
Last updated: May 02 2025 at 03:31 UTC