Zulip Chat Archive

Stream: new members

Topic: Class/instance syntax clarification


Tom (Sep 21 2024 at 22:10):

Could someone please clarify the differences between the following? (I'm trying to generalize the example code in FPiL in section 5.2).

I can sort of fix it but I'm not able to find a precise definition outside of seeing some examples in e.g. Prelude, and not able to find a description in the "Type Classes" section of TPiL:

1)

class Fail [Monad m] (m: Type -> Type) where
  fail : String -> m α

That seems to type check but then e.g.

instance : Fail Option where
  fail _ := none

is stuck due to metavariables.

2)

class Fail (m: Type -> Type) [Monad m] where
  fail : String -> m α

This now works

instance : Fail Option where
  fail _ := none

but I have to specify e.g.

def applyPrim [Monad f] [Fail f] : Arith  Int  Int  f Int -- note the "duplicate" specification of Monad

3) I think the "correct" way to specify this (based on e.g. Monad) is to use inheritance

class Fail  (m: Type -> Type) extends Monad m where
  fail : String -> m α

But I don't quite get the difference between this and 2) above (e.g. why the constraints on the class don't get automatically "propagated").

Thank you.

Etienne Marion (Sep 22 2024 at 07:07):

Tom said:

1)

class Fail [Monad m] (m: Type -> Type) where
  fail : String -> m α

That seems to type check but then e.g.

instance : Fail Option where
  fail _ := none

is stuck due to metavariables.

This does not really make sense because you introduce a hypothesis on m before introducing m itself. In fact what happens (you can see it by hovering over the class to see its type) is that when Lean encounters [Monad m] it does not know what m is and therefore will automatically create a variable m which makes sense, because of the option autoImplicit (which is disabled by default in Mathlib btw). So in the end your class takes two different Type → Type parameters which I think was not intended.

Tom said:

2)

class Fail (m: Type -> Type) [Monad m] where
  fail : String -> m α

This now works

instance : Fail Option where
  fail _ := none

but I have to specify e.g.

def applyPrim [Monad f] [Fail f] : Arith → Int → Int → f Int -- note the "duplicate" specification of Monad

The meaning of this definition is that given m which already has a Monad instance you can define a Fail instance. In particular writing [Fail f] makes no sense if you don't already know [Monad f]. It's like writing [Module R M] when you don't know that R is a semiring for instance.

Tom said:

3) I think the "correct" way to specify this (based on e.g. Monad) is to use inheritance

class Fail  (m: Type -> Type) extends Monad m where
  fail : String -> m α

But I don't quite get the difference between this and 2) above (e.g. why the constraints on the class don't get automatically "propagated").

Here the definition states that a Fail instance is given by a Monad instance plus the field fail. You could interpret it as saying that this is "equivalent" (it is not really but that's the idea) to writing

class Fail  (m: Type -> Type) where
  /--
    Fields of the `Monad` class
  -/
  fail : String -> m α

So when you write [Fail f] as an assumption you in particular assume [Monad f]. This is not really equivalent because when you write extends Monad m you explicitly states that m is a Monad which allows to use related theorems so extension is indeed the way to go.

Tom (Sep 22 2024 at 18:28):

Hi @Etienne Marion

This does not really make sense because you introduce a hypothesis on m before introducing m itself.

It made sense to me :laughing: . No seriously - I assumed since the syntax allows it, it had a meaning. I appreciate the pointer about autoImplicits, that seems like a good default to keep in mind!

In all fairness though, putting the type class first is the "same syntax" as when defining an instance, e.g.

instance [Add a]: Something Option a

Note that here, the hypothesis also gets introduced before introducing a, and so it's not unreasonable to assume that it would work similarly for class.

As for 2), I see with point, but still don't fully understand one thing, so let's go with your "math example". Suppose I give the definition of a Vector space over a Field F. When discussing a vector space, I don't have have to keep saying "and F is a field". I think I would try encode e.g. like this:

class VectorSpace (a : Type) [Field a] where ...

It would seem redundant to me to keep having to say e.g.

def add [Field a] [VectorSpace a] ...

What's the advantage of re-specifying that each time? The "hypothesis" already seems like a part of the type. Is there a reason why Lean can't "just" introduce it automatically?

Note that I'm only asking this about classes; for instances it makes perfect sense.

Thanks for taking the time to help me understand!

Edward van de Meent (Sep 22 2024 at 18:40):

Tom said:

What's the advantage of re-specifying that each time? The "hypothesis" already seems like a part of the type. Is there a reason why Lean can't "just" introduce it automatically?

the first example i can think of where this is a non-trivial problem is the following:

class A (T : Type)
class B (T : Type) [A T]
class C (T : Type) [A T]
class D (T : Type) extends A T
class E (T : Type) [D T]

-- how many `A T` parameters?
def test1 [B T] [C T] : sorry := sorry

-- should this have just [D T] or also [A T] as parameter?
def test2 [B T] [E T] : sorry := sorry

Etienne Marion (Sep 22 2024 at 18:48):

Tom said:

In all fairness though, putting the type class first is the "same syntax" as when defining an instance, e.g.

instance [Add a]: Something Option a

Note that here, the hypothesis also gets introduced before introducing a, and so it's not unreasonable to assume that it would work similarly for class.

I'm not sure what you mean here, a is not introduced at all. Or rather it is with autoImplicit, but in your first example you explicitly introduced m after a hypothesis, so that the hypothesis in fact does not refer to this m but to another one which was implicitly introduced.

When it comes to vector spaces, first a vector space is a structure with two different types: a field K and a group E. So it's rather VectorSpace K E. I guess you might expect something like

class VectorSpace (K E : Type*) extends Field K, AddCommGroup E

But this is not supported. However I'm unfortunately not able to tell you why it is as it is.

Kyle Miller (Sep 22 2024 at 18:57):

I think it's worth pointing out that the parameters list isn't a set, where you can add parameters in any order, but instead the exact order matters. In Lean, parameters can only ever refer to what's been defined in a previous parameter — this is due to the way it's using dependent type theory. (One could imagine a version of Lean where it tries to do a topological sort of the parameters in some way to get all the dependencies to be in order, but Lean doesn't do that.)

What's the advantage of re-specifying that each time? The "hypothesis" already seems like a part of the type. Is there a reason why Lean can't "just" introduce it automatically?

With autoImplicit it indeed "just" introduces an implicit parameter automatically. If you want an explicit parameter you have to introduce it yourself.

Tom (Sep 22 2024 at 19:42):

@Edward van de Meent

Thank you for constructing an example! I am trying to understand it, could you please clarify the following line:

-- how many `Foo T` parameters?
def test1 [Bar T] [Baz T] : sorry := sorry

I assume it's a typo but just want to check.

@Etienne Marion
Thanks! I am starting to appreciate more and more why you said you don't use autoImplicit in your code. Regarding the vector space, yes, I was being sloppy for the sake of the brevity/example, thanks for the correction.

@Kyle Miller
Thanks for the clarification. I didn't really think of the parameters as a set but I'm realizing that I wasn't quite cognizant of all the places where the auto implicits were being used, sort of thinking of something closer the the topo-sort idea you mention. I think the fact Etienne mentioned that

class Fail [Monad m] (m: Type -> Type)

actually introduces two variables despite the fact that they are both called m really threw me for a loop.

I appreciate you all helping with my silly/novice questions.

Edward van de Meent (Sep 22 2024 at 19:47):

Tom said:

Edward van de Meent

Thank you for constructing an example! I am trying to understand it, could you please clarify the following line:

-- how many `Foo T` parameters?
def test1 [Bar T] [Baz T] : sorry := sorry

I assume it's a typo but just want to check

Fixed it

Tom (Sep 24 2024 at 19:17):

Hi @Edward van de Meent

I thought about your counterexample and have to admit I am still not sure why it would pose a problem. From my meager understanding of Lean's type classes, I see the type class specification, in part, as a "predicate" which needs to be satisfied in order the definition to apply. While having the "predicate" multiple times is redundant, it's also idempotent.

Indeed, the following works for me:

def test {a: Type} (v : a) [Add a] [Add a]: a := v + v

It also seems that if I name the instance, e.g.

def test {a: Type} (v : a) [inst: Add a] [Add a]: a := v + v

#reduce test

Lean already de-duplicates the instances!!, because the output is fun v [inst : Add ?m.1452] [inst : Add ?m.1452] => inst.1 v v

It would also appear that this is not an uncommon thing. Looking, for example, at the definition of Module in Mathlib, it seems it's

class Module (R : Type u) (M : Type v) [Semiring R] [AddCommMonoid M] extends

In order to avoid repeating this incantation everywhere, a variable is used right after the definition. This is similar to my observation in another thread about trying to specify a "numeric" type, where I had to keep re-specifying a long string of type classes.

Kyle Miller (Sep 24 2024 at 19:28):

Typeclasses are not simple constraints, and they're not idempotent. There are lots of things that can go wrong when you have duplicate instances, for example [Field a] [Add a] gives an incredible amount of trouble (it's a frequent source of confusion as evidenced by Zulip questions).

If you want, each instance argument is like a vtable. There's no reason the different vtables passed to the two Add instances in your example have the same pointers.

Kyle Miller (Sep 24 2024 at 19:29):

Adding in missing instances without overlap can be slow. It depends on trying to synthesize instances and seeing whether they fail or not.

I implemented this in the mathlib variable? command, which you might be interested in looking at.

Tom (Sep 24 2024 at 19:31):

Thanks! That vtable analogy is good one, it's sort of how I understand things atm. As I mentioned, in the simple example using two [Add a] instances to my eyes it looked they actually go de-duplicated!

Kyle Miller (Sep 24 2024 at 19:32):

Here's @Edward van de Meent's example using variable?. It deduplicates the A instances:

import Mathlib

class A (T : Type)
class B (T : Type) [A T]
class C (T : Type) [A T]
class D (T : Type) extends A T
class E (T : Type) [D T]

set_option autoImplicit true

section
variable? [B T] [C T]
  => [A T] [B T] [C T]
end

section
variable? [B T] [E T]
  => [A T] [B T] [D T] [E T]
end

Kyle Miller (Sep 24 2024 at 19:33):

As I mentioned, in the simple example using two [Add a] instances to my eyes it looked they actually go de-duplicated!

There's no mechanism in Lean that deduplicates instances in binders though.

Tom (Sep 24 2024 at 19:33):

Is the slowness partially because of the trial-and-error of a (prolog-like/backtracking) recursive instance search? Could perhaps a solution be something Datalog-ish like which more deterministic?

Kyle Miller (Sep 24 2024 at 19:34):

All you're seeing is that the last local instance is the one that "wins":

def test {a: Type} (v : a) [Add a] [Add a]: a := v + v

set_option pp.explicit true
#print test
/-
def test : {a : Type} → a → [inst : Add a] → [inst : Add a] → a :=
fun {a} v [inst : Add a] [inst : Add a] ↦ @HAdd.hAdd a a a (@instHAdd a inst) v v
-/

(The first inst argument is unused.)

This means in the [Field a] [Add a] example, the second one "wins" for interpreting +, which could be completely different from the one that would have been supplied by Field.

Tom (Sep 24 2024 at 19:35):

Thanks for the explanations.

I guess there is a mechanism for not having to repeat myself too much when writing the code by using the variable commands in the current file, even though that doesn't apply/is brittle to the potential users of my code.

Kyle Miller (Sep 24 2024 at 19:38):

Tom said:

Is the slowness partially because of [...]

My understanding is that it's slow to refute the existence of an instance, because you have to consider all the many instances that will try to derive an instance. You can investigate this with set_option trace.Meta.synthInstance true on a failing instance problem to see — there might be some interesting ones in the variable? test file.

Tom (Sep 24 2024 at 19:39):

Kyle Miller said:

(The first inst argument is unused.)

This means in the [Field a] [Add a] example, the second one "wins" for interpreting +, which could be completely different from the one that would have been supplied by Field.

Ok, I see. :thinking: I guess I don't like it much because it seems brittle from a programming perspective but at least I feel I understand it now. Thanks for taking the time to explain!

Kyle Miller (Sep 24 2024 at 19:41):

At least it's flexible and fits in the theory.

What did you have in mind with the Datalog suggestion by the way? Doing bottom-up calculation of all instances? The actual full set of instances is infinite. The Lean instance synthesis algorithm does do caching, so it's sort of like a top-down Datalog query already.

Kyle Miller (Sep 24 2024 at 19:42):

There are probably gains to be had if instance synthesis could "pre-compile" query plans, where it would try synthesizing certain instances to see if it can throw out entire subtrees.

Kyle Miller (Sep 24 2024 at 19:48):

You may be thinking "What's the deal? Don't classes extend other classes? Can't we deduplicate instance implicit binders with reference to this hierarchy?" I think there are a number of issues that make this impossible at the moment (for example, in the algebraic hierarchy, there are instances that are morally from extension but there's no extends recording that fact), but it would be nice if it could work one day, maybe using additional annotations users can supply to record what structure implies what structure.

Sometimes people have suggested a [[Cls x y z]] notation to insert all the missing instance arguments. This variable? command is at least something in that direction.

Tom (Sep 24 2024 at 19:55):

Kyle Miller said:

At least it's flexible and fits in the theory.

What did you have in mind with the Datalog suggestion by the way? Doing bottom-up calculation of all instances? The actual full set of instances is infinite. The Lean instance synthesis algorithm does do caching, so it's sort of like a top-down Datalog query already.

Sort of. When you say "infinite", do you mean infinite because of type variables? Initially, I thought the instance dependency graph may be "infinite" (in the sense of e.g. having loops) but ultimately, is there not just a finite list of actual types defined in the program? So a type class Add a could correspond to a "datalog rule" Add(x) :- and specific instances would be e.g. Add(Int).
I realize there additional issues here with e.g. something like Foo(x, y) :- Add(x), Bar(y) in the case where I want to specialize Foo(Int, x), and also e.g. with Monad because datalog cannot natively represent higher-order predicates but it still sounds plausible.

I have some other thoughts but this is probably already too speculative and taking us too far off course.

Tom (Sep 24 2024 at 19:58):

Kyle Miller said:

You may be thinking "What's the deal? Don't classes extend other classes? Can't we deduplicate instance implicit binders with reference to this hierarchy?" I think there are a number of issues that make this impossible at the moment (for example, in the algebraic hierarchy, there are instances that are morally from extension but there's no extends recording that fact), but it would be nice if it could work one day, maybe using additional annotations users can supply to record what structure implies what structure.

Sometimes people have suggested a [[Cls x y z]] notation to insert all the missing instance arguments. This variable? command is at least something in that direction.

Awesome, thanks! That makes sense now.

Maybe at some point I can take a crack at something like that - that sounds like exactly the sort of project which would be fun.

Kyle Miller (Sep 24 2024 at 20:02):

When you say "infinite", do you mean infinite because of type variables

Rules like docs#instFintypeProd make the set of instances infinite.

Yuri (Sep 26 2024 at 12:59):

Kyle Miller said:

In Lean, parameters can only ever refer to what's been defined in a previous parameter — this is due to the way it's using dependent type theory. (One could imagine a version of Lean where it tries to do a topological sort of the parameters in some way to get all the dependencies to be in order, but Lean doesn't do that.)

Thanks, I was about to ask the question you answered here. It is good to know there is no deeper theoretic reason for this since for a newcomer it does feel like a paper cut.

Could we say the same about ordering of commands in a module assuming there are no dependency cycles?


Last updated: May 02 2025 at 03:31 UTC