Zulip Chat Archive

Stream: general

Topic: universe issues in category theory


Johan Commelin (Mar 09 2020 at 05:58):

How hard would it be to fix the Lean fork so that it becomes better at figuring out universes? I got the impression that we have to remind Lean about universes in category theory, a lot more than strictly necessary, just because Lean was being a bit lazy...

Mario Carneiro (Mar 09 2020 at 06:05):

do you have a concrete example?

Johan Commelin (Mar 09 2020 at 06:43):

Just a random example:

instance full_subcategory : category.{v} {X : C // Z X} :=

If Lean figures out the category instance for C, then it knows which 2 universe variables are in play. And so it should be able to figure out the v by itself.

Johan Commelin (Mar 09 2020 at 06:43):

There might be other, better examples.

Simon Hudon (Mar 09 2020 at 06:46):

One remedy to this situation would be to make category fully bundled. Then, the type of your category would dictate the universe in which the arrows are.

Simon Hudon (Mar 09 2020 at 06:50):

I think it would be a design worth exploring if someone has the time to try it out

Mario Carneiro (Mar 09 2020 at 09:45):

That already exists, it's called small_category

Mario Carneiro (Mar 09 2020 at 09:46):

but the problem is that while small_category and large_category both fix the homset universe level, they fix it to different things, and category.{v} is the generalization

Mario Carneiro (Mar 09 2020 at 09:46):

I think this amounts to saying that v is an out_param on the category typeclass instance

Mario Carneiro (Mar 09 2020 at 09:46):

which is not a thing we can annotate

Simon Hudon (Mar 09 2020 at 09:51):

That is not what I'm talking about.

universes u v
structure category :=
(C : Type u)
(hom : C -> C -> Type v)
-- etc

Then, we fix the universes at the time of definition of the category or the declaration of the category variable

Mario Carneiro (Mar 09 2020 at 09:52):

that will make the problem worse

Mario Carneiro (Mar 09 2020 at 09:53):

you will have category.{u v} now

Mario Carneiro (Mar 09 2020 at 09:53):

the only reason u can be inferred is because C is a parameter

Simon Hudon (Mar 09 2020 at 09:55):

Already, we define variable (C : Type u) [category.{v} C] so both universes are already specified explicitly. What I'm proposing is variable (C : category.{u v}) and then never have to specify universes again as far as C is concerned

Mario Carneiro (Mar 09 2020 at 09:55):

it will certainly not eliminate the universe annotations in Johan's example

Simon Hudon (Mar 09 2020 at 09:56):

If C is a category, it certainly will

Mario Carneiro (Mar 09 2020 at 09:56):

I think you don't actually need to specify either currently. You can write variable (C : Type*) [category C] and it will mean the same thing, just with generated names. The actual reason universe names appear there is because they need to be referred to later

Mario Carneiro (Mar 09 2020 at 09:58):

(This is also the standard practice in the rest of mathlib, to avoid naming universes unless they need to be referred to by name later)

Simon Hudon (Mar 09 2020 at 10:00):

Mario Carneiro said:

I think you don't actually need to specify either currently. You can write variable (C : Type*) [category C] and it will mean the same thing, just with generated names. The actual reason universe names appear there is because they need to be referred to later

You should have a look at category_theory

Simon Hudon (Mar 09 2020 at 10:01):

Here is the standard practice:

variables (C : Type uā‚) [š’ž : category.{vā‚} C]

Every category has two names and universe names are a handful

Mario Carneiro (Mar 09 2020 at 10:01):

I'm aware. Those names are there because they need to be, but the reasons for that need are not on the variables line

Mario Carneiro (Mar 09 2020 at 10:02):

The name on the instance itself is also peculiar, that isn't done anywhere else and you can be sure it's being done for a reason

Simon Hudon (Mar 09 2020 at 10:03):

Yes and having a bundled model alleviates that need

Mario Carneiro (Mar 09 2020 at 10:03):

I think it would be best to stick to Johan's original question

Mario Carneiro (Mar 09 2020 at 10:04):

I don't think the bundled model would solve any universe problems, and they would be problematic for completely different reasons

Mario Carneiro (Mar 09 2020 at 10:05):

but taking for granted that categories are represented the way they are, is there anything we can do to eliminate the need to refer to universe names?

Simon Hudon (Mar 09 2020 at 10:07):

I disagree. The bundling does have an effect on universe inference and I think changing the design of the library is a smaller project than playing with the way the elaborator handles universes

Mario Carneiro (Mar 09 2020 at 10:08):

perhaps, but the present design already came about as a result of considering what worked best with lean 3 as it existed at the time. Yes, we considered fully bundled, half bundled, and unbundled options, and this was a point of discussion when category theory first entered mathlib, migrated from Scott's repository

Mario Carneiro (Mar 09 2020 at 10:09):

This question is challenging the premise of that earlier decision - what if we could change lean, could we find something better?

Simon Hudon (Mar 09 2020 at 10:20):

What's your suggestion?

Mario Carneiro (Mar 09 2020 at 10:21):

FIgure out what the technical issues are that are currently preventing us from banishing universe annotations in the category theory library, and see if there is another way to solve them using hypothetical elaboration hints

Mario Carneiro (Mar 09 2020 at 10:23):

I identified one above - category.{u v} C apparently treats v as an output parameter, and while this is a perfectly meaningful statement lean's method of annotating out_params involves sticking identity functions on the parameter, which doesn't work for universes

Mario Carneiro (Mar 09 2020 at 10:24):

One way I can imagine conveying this info by alternate means is by an annotation, i.e. @[out_param v]

Simon Hudon (Mar 09 2020 at 10:45):

I'm not sold. It looks like a weird feature to me

Kevin Buzzard (Mar 09 2020 at 10:50):

Mario Carneiro said:

The name on the instance itself is also peculiar, that isn't done anywhere else and you can be sure it's being done for a reason

I think it's because the universe issue stops the category C term being inferred by type class inference sometimes, so you have to explicitly include it, which means having a sensible name for it.

Mario Carneiro (Mar 09 2020 at 10:51):

So it sounds like this is also caused by the fact that lean doesn't get that v is an out param

Mario Carneiro (Mar 09 2020 at 10:52):

so if this hypothesis is correct, the fix is to put @[out_param v] on class category, patch lean, and then all of that stuff goes away and you can just write (C : Type*) [category C]

Sebastian Ullrich (Mar 09 2020 at 11:24):

What exactly does a universe out_param mean? Unlike expr mvars, universe mvars do not block typeclass search and can be instantiated by it by default.

Johan Commelin (Mar 09 2020 at 11:24):

Another issue is that we have to write

variables (C : Type u) [xyzzy : category.{v} C]
include xyzzy

Why that include statement? Lean shouldn't need it.

Mario Carneiro (Mar 09 2020 at 11:26):

the include statement is because unnamed parameters are included when the types they refer to are needed (by default), while named parameters are only included when referred to, like other variables

Johan Commelin (Mar 09 2020 at 11:28):

@Mario Carneiro I meant "Why do I need to name the parameter and include it?"

Johan Commelin (Mar 09 2020 at 11:29):

Because if I make it unnamed, then the library breaks.

Johan Commelin (Mar 09 2020 at 11:29):

And the reason has to do with Leans limited handling of universe parameters.

Mario Carneiro (Mar 09 2020 at 11:30):

I'm saying that if you name the parameter, you are also likely to have to include it. So it comes back to why you have to name it

Johan Commelin (Mar 09 2020 at 11:30):

Lean sees a statement using C, and decides not to include the instance [category.{v} C] because it depends on the universe v, and v is not determined by C.

Johan Commelin (Mar 09 2020 at 11:30):

Mario Carneiro said:

I'm saying that if you name the parameter, you are also likely to have to include it. So it comes back to why you have to name it

Ok, my answer that would have been: "You have to name it so that you can include it...."

Mario Carneiro (Mar 09 2020 at 11:34):

There are three levels of include here:

  • Unnamed parameter, e.g. variable [foo A]. This is included if A is included (plus some universe stuff apparently)
  • Included parameter, e.g. variable (x : A) include x. This is always included
  • Not included variable, e.g. variable (x : A) or variable [inst: foo A]. This is only included if the name x resp. inst is used in the theorem statement

Because instances are almost never referred to by name, the third option is usually not good enough, so while the first is preferred, the second is the fallback.

Sebastian Ullrich (Mar 09 2020 at 11:35):

Johan Commelin said:

Lean sees a statement using C, and decides not to include the instance [category.{v} C] because it depends on the universe v, and v is not determined by C.

It shouldn't be that. Lean only looks at expr variables/parameters when deciding what to include.

Johan Commelin (Mar 09 2020 at 11:39):

Crazy, I thought that's what was determined a long time ago. But now I can't reproduce.

Mario Carneiro (Mar 09 2020 at 11:40):

Here's a problem:

universes u v

class foo (C : Type u) : Type (max u (v+1)) := (A : Type v)

variables (C : Type u) [foo C]

def bar : Type u := C
kernel failed to type check declaration 'bar' this is usually due to a buggy tactic or a bug in the builtin elaborator
elaborated type:
  Ī  (C : Type u) [_inst_1 : foo C], Type u
elaborated value:
  Ī» (C : Type u) [_inst_1 : foo C], C
nested exception message:
invalid reference to undefined universe level parameter 'u_1'

Mario Carneiro (Mar 09 2020 at 11:41):

I think it pulled in _inst_1 : foo.{u u_1} C but forgot to also pull in u_1

Johan Commelin (Mar 09 2020 at 11:42):

import category_theory.category

universe variables v u

open category_theory

variables (C : Type u) [category.{v} C]

def foo (x : C) : x āŸ¶ x := šŸ™ x

Johan Commelin (Mar 09 2020 at 11:42):

Right, I can reproduce, it's just that the error doesn't show if you replace šŸ™ x with an underscore

Johan Commelin (Mar 09 2020 at 11:43):

So you really have to finish the code to see the problem.

Johan Commelin (Mar 09 2020 at 11:43):

And @Sebastian Ullrich is right. It does pull in the category instance. So maybe Mario's hypothesis is better.

Sebastian Ullrich (Mar 09 2020 at 11:46):

Yeah, collect_univ_params_ignoring_tactics is called on everything - except for anonymous instances... https://github.com/leanprover/lean/blob/72a965986fa5aeae54062e98efb3140b2c4e79fd/src/frontends/lean/decl_util.cpp#L256

Johan Commelin (Mar 09 2020 at 11:47):

Ok, I would like to have that as a feature request for Lean 3.7.1 (-;

Mario Carneiro (Mar 09 2020 at 11:50):

that's a bugfix, not a feature request

Mario Carneiro (Mar 09 2020 at 11:51):

and I don't see any reason not to slip it into the next release if we know what the fix is

Sebastian Ullrich (Mar 09 2020 at 11:52):

@Mario Carneiro So what about the out_param? What does it do?

Mario Carneiro (Mar 09 2020 at 11:52):

if your description is correct, then everything is an out param already and my hypothesis was wrong

Sebastian Ullrich (Mar 09 2020 at 11:53):

Ok

Mario Carneiro (Mar 09 2020 at 11:53):

I think this bug is the more likely culprit now

Reid Barton (Mar 09 2020 at 13:28):

  • the include xyzzy business is certainly due to a bug, which we could now fix in Lean 3
  • re out_param: In Lean 3 universe metavariables don't get assigned by an otherwise matching instance. That is if I search for category.{?l_1 u} C and there is an instance CI : category.{v u} C available, it won't be used (and set ?l_1 to v). In Lean 4 it apparently will. Philosophically I tend to think the Lean 3 behavior is the "correct" one, but certainly the Lean 4 behavior is more convenient in this case. (Actually the Lean 3 behavior is more complicated--there are two kinds of (universe?) metavariables and they behave differently here. If the other kind wasn't assigned by instances matching, then category theory would require many more explicit universes. I don't remember the details though.)
  • I agree with Simon that bundled categories are underexplored, and might ameliorate these issues, at least in Lean 3. I think they are unbundled mainly because other algebraic structures are unbundled. Maybe @Scott Morrison knows the history better. Naturally there might be other issues with bundled categories--it would be very tempting to coerce a bundled category to its type of objects for example, and perhaps it wouldn't play nicely with Lean 3's coercion issues.

Mario Carneiro (Mar 09 2020 at 13:33):

I think that both behaviors are "correct", but reflect the difference between input and output parameters. They both have their place, but I'm inclined to think that universe variables that are not already determined by typechecking later arguments should usually be treated as out parameters. It would be very strange to have instances of both foo.{u 1} X and foo.{u 2} X and rely on the input universe to figure out which one to use. I rather think that the uniqueness rule should apply even if you strip all universe checks

Reid Barton (Mar 09 2020 at 13:35):

Yeah, I agree that ideally you'd have out_param for universes too. I'm used to a setting where everything is "in_param" by default; let me try to understand your suggestion

Reid Barton (Mar 09 2020 at 13:37):

We're still marking out_params at the level of classes, not the level of instances, right?

Mario Carneiro (Mar 09 2020 at 13:37):

yes

Reid Barton (Mar 09 2020 at 13:39):

so generally I have class {v u} category : Type u -> Type (max u (v+1)) or whatever--is the idea to treat universe parameters which don't appear outside the final result type as out_param by default?
Does it matter syntactically whether I put parameters to the left of the :?

Reid Barton (Mar 09 2020 at 13:40):

I could mark pretty much any top-level declaration as a [class]; does that complicate things?

Mario Carneiro (Mar 09 2020 at 13:42):

You can actually just treat all universe parameters as out params I think. The ones that are determined already won't matter

Mario Carneiro (Mar 09 2020 at 13:42):

and we never use universe values for instance selection

Mario Carneiro (Mar 09 2020 at 13:43):

although an instance can be rejected if it is not well typed

Reid Barton (Mar 09 2020 at 13:44):

I think maybe this is what Lean 4 does

Mario Carneiro (Mar 09 2020 at 13:44):

I think so too

Mario Carneiro (Mar 09 2020 at 13:46):

Basically, because universe levels can be normalized or modified whenever during unification, they can't be relied on for dispatch anyway, so I would rather have a "universe-naive" typeclass inference mechanism that just ignores universes until unification/typechecking

Bryan Gin-ge Chen (Mar 09 2020 at 16:48):

How difficult is this bugfix? We should probably make an issue if nobody is planning to work on it right away, otherwise it'll get lost.

Edit: https://github.com/leanprover-community/lean/issues/146

Reid Barton (Mar 09 2020 at 16:49):

I think the "collect universe variables" bugfix should be fairly trivial


Last updated: Dec 20 2023 at 11:08 UTC