Zulip Chat Archive

Stream: general

Topic: I don't understand type class search.


Gabriel Ebner (Mar 03 2020 at 17:59):

I have to admit that I don't understand the type class search in Lean 3. (If there is a description somewhere, please point me to it.) In particular, given an instance like topological_ring.to_topological_add_group, will Lean enumerate all possible instances of [ring α]?

@[instance, priority 100]
protected def topological_ring.to_topological_add_group :
 (α : Type u) [_inst_1 : topological_space α] [_inst_2 : ring α] [t : topological_ring α],
  topological_add_group α :=

Gabriel Ebner (Mar 03 2020 at 17:59):

Context: I'm just looking at a type-class trace where Lean produces such wonderful instances of ring ℝ:

@domain.to_ring 
     (@integral_domain.to_domain 
        (@linear_ordered_comm_ring.to_integral_domain 
           (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring 
              (@discrete_linear_ordered_field.to_decidable_linear_ordered_comm_ring 
                 real.discrete_linear_ordered_field))))

Reid Barton (Mar 03 2020 at 18:00):

Yes this is really how it works

Gabriel Ebner (Mar 03 2020 at 18:02):

Is there some standard workaround? Should it be {ring ℝ} instead of [ring ℝ]?

Reid Barton (Mar 03 2020 at 18:05):

On the basis that it will be determined as the parameter of the topological_ring instance? Maybe?

Kevin Buzzard (Mar 03 2020 at 18:05):

attribute [instance, priority 1000] real.ring ?

Gabriel Ebner (Mar 03 2020 at 18:08):

real.ring is already an instance with the default priority 1000.

Kevin Buzzard (Mar 03 2020 at 18:08):

2000?

Gabriel Ebner (Mar 03 2020 at 18:09):

The problem is not that Lean doesn't find the real.ring instances. In fact, that's the one it tries first. The problem is that Lean then tries all other ring instances, of which there are unfortunately very many.

Kevin Buzzard (Mar 03 2020 at 18:10):

:four_leaf_clover: ?

Sebastian Ullrich (Mar 03 2020 at 18:10):

It shouldn't do that unless a later failure makes it backtrack

Gabriel Ebner (Mar 03 2020 at 18:12):

Yes, of course it does backtrack. This problem occurs when showing that the reals are a topological additive group. We already know that the reals are a uniform additive group, but we haven't yet shown that they are a uniform ring. Now there is an instance that every topological ring is a topological additive group, which Lean tries first, and then it gets stuck enumerating all ring structures on the reals (none of which is known to be a topological ring yet...).

Reid Barton (Mar 03 2020 at 18:13):

These "wrong way" instances are still insane to me

Kevin Buzzard (Mar 03 2020 at 18:14):

Gabriel Ebner said:

The problem is not that Lean doesn't find the real.ring instances. In fact, that's the one it tries first. The problem is that Lean then tries all other ring instances, of which there are unfortunately very many.

The other example you gave is defeq to real.ring of course -- but the system isn't smart enough to spot this? I thought the rule was supposed to be that you were only supposed to be able to find one instance anyway.

Gabriel Ebner (Mar 03 2020 at 18:20):

On the basis that it will be determined as the parameter of the topological_ring instance? Maybe?

Exactly, these should be filled in via unification and not via type class search. Looking back through the PRs, @Chris Hughes has already fixed a lot of such instances: https://github.com/leanprover-community/mathlib/pull/1525 We should probably write a linter for this.

Johan Commelin (Mar 03 2020 at 18:22):

Can the linter also output a sed script that will fix the source code for us?

Gabriel Ebner (Mar 03 2020 at 18:24):

I don't think it's easy to create a sed script. Maybe I'm bad at lean, but this abomination is the only thing I could get working: https://github.com/leanprover-community/mathlib/pull/2064/commits/88a084fbdc6deeff9cb498ea454ee5735ccc89c2

Kevin Buzzard (Mar 03 2020 at 18:36):

Is the point that this compiles much faster?

Gabriel Ebner (Mar 03 2020 at 18:43):

In this case the type-class search did not terminate within a few minutes. I guess it should terminate eventually, since we only have a finite number of ring instances for the reals.

Floris van Doorn (Mar 03 2020 at 19:17):

This occurs a lot in the library, see also issue #1561. I think we should indeed try to make all of these use curly braces. There might potentially be downsides to this, but probably very few.
This problem also occurs with automatically generated instances, like module.to_semimodule. Ideally the elaboration procedure would solve the type-class problems in the reverse order in cases like these.

Mario Carneiro (Mar 03 2020 at 20:38):

I think this would be helped a bit if topological_ring extended ring and topological_space

Mario Carneiro (Mar 03 2020 at 20:39):

That way it would not have to search every ring structure for every topological ring structure, it could do that in one go

Johan Commelin (Mar 03 2020 at 20:39):

But that comes with its own set of complications, right?

Mario Carneiro (Mar 03 2020 at 20:39):

That's the standard approach we've been using for the typeclass hierarchy everywhere else

Johan Commelin (Mar 03 2020 at 20:39):

See e.g., the recent commit that mv_polynomial over an integral domain is an integral domain.

Mario Carneiro (Mar 03 2020 at 20:40):

that sounds fine to me...?

Johan Commelin (Mar 03 2020 at 20:40):

It is already a comm_ring, and now you need to inductively build an instance of integral_domain, but the data should be defeq to the existing comm_ring instance.

Johan Commelin (Mar 03 2020 at 20:40):

So you have to jump through a bunch of hoops, instead of focusing on the core of the mathematical argument.

Mario Carneiro (Mar 03 2020 at 20:41):

Isn't this just the constructor?

Mario Carneiro (Mar 03 2020 at 20:41):

of integral_domain

Mario Carneiro (Mar 03 2020 at 20:41):

you use ..foo.comm_ring and you get all the defeqs

Johan Commelin (Mar 03 2020 at 20:42):

The code became really slow in certain parts.

Johan Commelin (Mar 03 2020 at 20:42):

If you can golf it and speed it up, I would love to learn how.

Johan Commelin (Mar 03 2020 at 20:43):

My gut feeling says that lots of mixing becomes easier if we have a strict separation between data and properties in our type classes.

Johan Commelin (Mar 03 2020 at 20:43):

Otoh, it makes tc search more painful, which is why you suggested bundling upstairs. Also, my suggestion boosts the verbosity of code. (Which I don't like myself either...)

Mario Carneiro (Mar 03 2020 at 20:54):

Johan Commelin said:

My gut feeling says that lots of mixing becomes easier if we have a strict separation between data and properties in our type classes.

This is not like topological_space vs t2_space we're talking about here; a topological_ring is data (a topology, a ring, and some coherence, just like all our other algebraic classes)

Mario Carneiro (Mar 03 2020 at 20:55):

I still want to stick to the rule "one type = one typeclass" to the maximum extent possible

Mario Carneiro (Mar 03 2020 at 20:57):

Your argument would also extend to the case of ring = add_group + monoid + properties

Sebastien Gouezel (Mar 03 2020 at 21:36):

I think the idea up to now has been: one big chain of typeclasses for topology (with topological_space, uniform_space, metric_space, normed_space), one big chain of typeclasses for algebra (with monoid, group, ring, integral_domain, field). If you start mixing the two, you will need something like 4 x 5 typeclasses (and in fact much more since I have omitted a big part of the hierarchy), so to keep things reasonable you will not extend things but rather add the properties you need as new typeclasses. In this way, if you have defined a topological ring and you want to talk about a normed integral domain, you don't need to define a new typeclass, only to say [normed_space alpha] [integral_domain alpha] [topological_ring alpha].

Yury G. Kudryashov (Mar 03 2020 at 22:06):

I think @Johan Commelin was talking about something like "make all algebraic classes take has_add, has_zero and other data as arguments. Am I right?

Yury G. Kudryashov (Mar 03 2020 at 22:08):

Then every type will have only one instance of has_add, and monoid etc live in Prop

Yury G. Kudryashov (Mar 03 2020 at 22:08):

I don't know if this will be easier or harder for lean

Sebastien Gouezel (Mar 03 2020 at 22:23):

Note that we have things like [nondiscrete_normed_field alpha], so we are not very coherent. I agree that for things that we use often it is good to have a dedicated typeclass.

Mario Carneiro (Mar 04 2020 at 00:49):

But you need 4 x 5 typeclasses anyway, because for each pair in those examples the coherence conditions are different. A topological group has continuous subtraction, a uniform ring has locally uniformly continuous multiplication, and so on. We can still fill the matrix by need so that it doesn't get out of hand, but for the structures that matter I think there should actually be one single class topological_group bundling all operations and coherence (and similarly for the other elements of that matrix)

Mario Carneiro (Mar 04 2020 at 00:52):

If you just need the conjunction of properties, i.e. a topological space that is a ring but without coherence (why?), you don't need any added typeclasses, you can just use [topological_space A] [ring A]. (This usually doesn't work because of diamonds but here they don't share any operations so it's fine.)

Johan Commelin (Mar 04 2020 at 05:54):

@Mario Carneiro No, the example that @Sebastien Gouezel gave is a good one: a topological integral domain is a topological ring that is also an integral domain. In my hypothetical setup that would mean:

[ring_data R] -- this gives you `has_zero`, `has_one`, `has_add`, `has_neg` and `has_mul`
[topological_space R]
[topological_ring R] -- this gives you the compatibility properties: mul, add, and neg are continuous
[integral_domain R] -- this tells you that mul is commutative, and there are no zero divisors

Johan Commelin (Mar 04 2020 at 05:55):

There would only be a few "mixing" classes, like topological_group, topological_ring etc... but only very few.

Mario Carneiro (Mar 04 2020 at 05:55):

Well let's at least have those be classes then

Mario Carneiro (Mar 04 2020 at 05:56):

It still seems preferable from a readability POV to have topological_integral_domain

Johan Commelin (Mar 04 2020 at 06:36):

@Mario Carneiro One downside is that we can not write instances like

instance {R : Type} [topological_ring R] [integral_domain R] : topological_integral_domain R := sorry

because we have no way of expressing that the ring data from the two instances in the assumptions are the same.

Johan Commelin (Mar 04 2020 at 06:36):

Which means that we can use the type class system to "prove" obvious things for us.

Johan Commelin (Mar 04 2020 at 06:37):

On the other hand, if we separate data and props, then we can do this.

Mario Carneiro (Mar 04 2020 at 06:37):

We already have that problem elsewhere in the hierarchy. The solution is to just create instances of topological_integral_domain whenever you have the first two

Johan Commelin (Mar 04 2020 at 06:38):

So... just because we have the problem elsewhere, you advocate to also have it here?

Johan Commelin (Mar 04 2020 at 06:41):

I would rather have some way to tell lean: hey, if I write [comm_ring R] you should understand that I really am abbreviating [ring_data R] [is_comm_ring R].

Johan Commelin (Mar 04 2020 at 06:41):

That would still give you the readable code, but also the flexibility of separating data and props

Sebastian Ullrich (Mar 04 2020 at 08:32):

Floris van Doorn said:

Ideally the elaboration procedure would solve the type-class problems in the reverse order in cases like these.

It does so by default in Lean 4.

Gabriel Ebner (Mar 04 2020 at 09:24):

Floris van Doorn said:

Ideally the elaboration procedure would solve the type-class problems in the reverse order in cases like these.

I don't understand how processing the arguments in reverse orders solves the underlying problem. In the original example, won't the type class instance search then not just enumerate all instances of topological_ring (instead of ring).

instance topological_ring.to_topological_add_group :
 (α : Type u) [_inst_1 : topological_space α] [_inst_2 : ring α] [t : topological_ring α],
  topological_add_group α

Oh, I see, it will still enumerate all of them, but in this special case it can then always find instances for topological_space and ring as well. Hence it won't enumerate all ring instances. However if there was a [separable_space α] at the beginning (and the current space happens not to be separable), then switching the order doesn't solve anything and we still end up enumerating all topological_ring and ring and topological_space instances. That is, the following instance would still cause cubic runtime in type class instance search:

instance topological_ring.to_topological_add_group' :
 (α : Type u) [_inst_1 : topological_space α] [separable_space α]  [_inst_2 : ring α] [t : topological_ring α],
  topological_add_group α

Sebastian Ullrich (Mar 04 2020 at 09:26):

Sure, it doesn't solve everything. It's just the better default whenever there are dependencies.

Gabriel Ebner (Mar 04 2020 at 09:29):

Well, fair enough. At least it's an improvement.

Kevin Buzzard (Mar 04 2020 at 11:11):

This algebra typeclass system might go on "forever" -- we have Huber rings which are topological rings + ..., and then perfectoid rings which are Huber rings + ... . This is what actual mathematics looks like, and it would be good if we could do actual mathematics in Lean 4. I'm optimistic that we can but I do find all this stuff scary.

Gabriel Ebner (Mar 04 2020 at 14:14):

@Rob Lewis has already told me he is not too happy about this approach but I don't see any better way in Lean 3:
https://github.com/leanprover-community/lean/pull/135


Last updated: Dec 20 2023 at 11:08 UTC