Zulip Chat Archive
Stream: general
Topic: typeclass paper
Daniel Selsam (Jan 10 2020 at 14:25):
Hello Mathlib! As many of you know, @Sebastian Ullrich @Leonardo de Moura and I are writing a paper about our new typeclass resolution procedure for Lean4. We have a first draft and it would be great to get feedback, particularly from people who know more about Mathlib than they do about the internals of typeclass resolution. Here is a link: https://www.dropbox.com/s/5nxklkxvdh7xna9/typeclass.pdf?dl=0 Thanks in advance.
Rob Lewis (Jan 10 2020 at 14:34):
Do you have a submission/feedback deadline? I'd love to take a look but might not find time until next week.
Daniel Selsam (Jan 10 2020 at 14:35):
We are submitting to IJCAR 2020, January 23rd deadline
Johan Commelin (Jan 10 2020 at 15:15):
@Daniel Selsam See https://www.math.sciences.univ-nantes.fr/~gouezel/ for how to spell @Sebastien Gouezel's name outside of Zulip :wink:
Johan Commelin (Jan 10 2020 at 15:34):
Thus even though e.g. lists and multisets may usefully coerce into each other, one direction must be chosen arbitrarily for the Coe instance and the other must be sacrificed.
How do you coerce a multiset into a list? You've lost the ordering... Do you want "multisets" and "finsets"? (Note that the generic mathematical reader won't recognize "finset", but then, they aren't your target audience anyway.)
Johan Commelin (Jan 10 2020 at 15:35):
Also... you talk about coercions between "bijective" types. But for example multiset
and finset
are not bijective. It's just that there are natural maps in both directions, but those maps are not bijections.
Johan Commelin (Jan 10 2020 at 15:36):
Minor comment:
allows reducing the scalars in a module
It's restricting scalars, not "reducing"
Johan Commelin (Jan 10 2020 at 15:45):
@Daniel Selsam Is this new typeclass resolution procedure implemented in Lean4 or in C++?
Daniel Selsam (Jan 10 2020 at 15:49):
Daniel Selsam Is this new typeclass resolution procedure implemented in Lean4 or in C++?
@Johan Commelin in Lean4: https://github.com/leanprover/lean4/blob/master/src/Init/Lean/Meta/SynthInstance.lean
Johan Commelin (Jan 10 2020 at 15:49):
Ok, cool!
Johan Commelin (Jan 10 2020 at 15:51):
Is there a way to deal with diamonds that prop-eq but not def-eq?
Johan Commelin (Jan 10 2020 at 15:53):
For example, Chris mentioned the other day that we'll get two algebra ℚ ℂ
instances. But we know that algebra ℚ K
is a subsingleton, so they must be prop-eq.
Daniel Selsam (Jan 10 2020 at 15:54):
What do you mean by "deal with"? What are you afraid will happen, and what do you want to happen?
Johan Commelin (Jan 10 2020 at 15:54):
But maybe this is not something that the typeclass resolution should deal with, but another part of the system.
Johan Commelin (Jan 10 2020 at 15:55):
I might claim that algebra_map x = algebra_map x
, but both sides might find different instances (especially if the expressions are a bit more complicated and not syntactically equal).
Johan Commelin (Jan 10 2020 at 15:56):
And so Lean will refuse, because the algebra_map
s come from different instances.
Johan Commelin (Jan 10 2020 at 15:56):
But since those instances are prop-eq, we are a silly rw away from moving forward.
Johan Commelin (Jan 10 2020 at 15:57):
But I realize now that this is not a problem that you hit during tc resolution, but during type checking
Johan Commelin (Jan 10 2020 at 15:58):
So I guess it's a bit off topic in this thread.
Johan Commelin (Jan 10 2020 at 15:59):
@Daniel Selsam Section 5.3 ends with a parenthetical remark (that is not properly
Daniel Selsam (Jan 10 2020 at 16:00):
I have been pushing frequent fixes, including the 5.3 issue. Please reload the page.
Daniel Selsam (Jan 10 2020 at 16:01):
If 5.3 still trails off for you after reloading, you may need to click some kind of "jump to most recent version of paper" button.
Daniel Selsam (Jan 10 2020 at 16:11):
But I realize now that this is not a problem that you hit during tc resolution, but during type checking
This example is outside the scope of tc resolution, since presumably the two calls to tc are independent. There is a related issue we have discussed about TC though: whether to consider a second solution to the same (sub)goal to be a repeat (and so discard it) if it has the same type as the first solution, even if it is not definitionally equal to the first solution.
Daniel Selsam (Jan 10 2020 at 16:12):
Our current plan is to say that typeclasses are "morally canonical", and consider a solution (to a subgoal) to be a repeat if it has the same type as a previous solution.
Daniel Selsam (Jan 10 2020 at 16:14):
The "morally canonical" assumption would be violated by Algebra
, and some downstream goals may succeed or fail depending on which specific instance to Algebra
is found first.
Johan Commelin (Jan 10 2020 at 16:16):
Aah, it might be that it still pops up here...
Johan Commelin (Jan 10 2020 at 16:16):
Note that in general Algebra R A
is not a subsingleton... but if R
is \Z
or \Q
, then it is...
Daniel Selsam (Jan 10 2020 at 16:17):
Edited my comment -- the assumption is violated whether or not Algebra is a subsingleton.
Johan Commelin (Jan 10 2020 at 16:20):
By transitivity algebra R A
and f
will give me another instance of algebra R B
, and by some property field of alg_hom
we know that f
ensures it is prop-eq to the existing instance of algebra R B
.
Mathematicians treat these instances as def-eq all the time.
Johan Commelin (Jan 10 2020 at 16:20):
Or if R
is any field
Johan Commelin (Jan 10 2020 at 16:21):
Suppose we have algebra R A
, algebra R B
and f : alg_hom[R] A B
in the context.
Johan Commelin (Jan 10 2020 at 16:21):
Oops... messages are mangled up by German telecom
Daniel Selsam (Jan 10 2020 at 16:29):
Can you come up with an example where typeclass resolution will succeed or fail depending on which of two different instances with the same type are discovered first for some subgoal?
Daniel Selsam (Jan 10 2020 at 16:31):
How about I try to come up with a toy example first to make the issue clear, and then you can try to come up with a real example that would exhibit the same problem.
Chris Hughes (Jan 10 2020 at 16:51):
I don't think typeclass inference fails in these situations, the problem tends to be that it infers two different instances in different situations, so terms that look the same, and are in fact equal aren't definitionally equal, so exact
doesn't work when it looks like it should.
Daniel Selsam (Jan 10 2020 at 16:54):
Very artificial example but just to give intuition:
class Foo : Type := (b : Bool) class Bar (b : Bool) : Type := (u : Unit := ()) class Rig : Type := (u : Unit := ()) instance FooToBar [f:Foo] : Bar f.b := {} instance BarToRig [Bar true] : Rig := {} instance FooTrue : Foo := {b := true} instance FooFalse : Foo := {b := false} -- There are two instances to Foo with different values. -- Since they have the same type, the second one found is discarded. -- The overall query succeeds only if the first one found happens to be `FooTrue`. def synthRig [Rig] : Unit := () example : Unit := synthRig
Daniel Selsam (Jan 10 2020 at 16:56):
I don't think typeclass inference fails in these situations, the problem tends to be that it infers two different instances in different situations, so terms that look the same, and are in fact equal aren't definitionally equal, so
exact
doesn't work when it looks like it should.
I agree with @Johan Commelin that the issue you are describing is probably better addressed after the fact, outside of typeclass resolution. Note that Lean3 had a subsingleton-canonicalizer which would address this problem.
Chris Hughes (Jan 10 2020 at 17:04):
Here's an example that comes up in practice
import data.set.basic constant fintype (α : Type) : Type attribute [class] fintype def card (α : Type) [fintype α] : ℕ := sorry constant fintype_range {α : Type} [fintype α] {p : set α} : fintype ↥p attribute [instance] fintype_range instance {α β : Type} [fintype β] (f : β → α) : fintype (set.range f) := sorry lemma subset_lemma {α : Type} [fintype α] {p : set α} : card p = card p := sorry example {α β : Type} [fintype α] [fintype β] (f : β → α) : card (set.range f) = card (set.range f):= begin rw [subset_lemma], --fails end
Daniel Selsam (Jan 10 2020 at 19:57):
@Chris Hughes Thanks. I don't think this is an issue with typeclass resolution though. Higher-level tactics can be made to hide this kind of complication, similar to the way simp
does.
Tim Daly (Jan 10 2020 at 23:51):
@Daniel Selsam Note that the Diamond problem has been around a while. Symbolics Common Lisp solved this using the "Mixin" idea (https://en.wikipedia.org/wiki/Mixin) in its Flavors implementation. Common Lisp Object System (CLOS) has rules (http://www.lambdafaq.org/how-are-conflicting-method-declarations-resolved).
Tim Daly (Jan 11 2020 at 02:42):
Also note that CLOS checks class precedence to check for the cycle problem.
Johan Commelin (Jan 11 2020 at 05:23):
@Tim Daly Is there any reason to think that diamonds in those systems can be dealt with in the same way as diamonds in the algebraic hierarchy of an interactive theorem prover? Intuitively it feels to me like they might need different solutions.
Tim Daly (Jan 11 2020 at 07:06):
Axiom struggled with similar problems. Here is an abbreviated form of Axiom's typeclasses: https://github.com/daly/PDFS/blob/master/endpaper.pdf and the full form (with typeclass abbreviated names): http://axiom-developer.org/axiom-website/bookvol10.2abb.html
Tim Daly (Jan 11 2020 at 07:10):
@Johan Commelin If you have a function with the same name and same signature but different semantics available on two different paths there are various schemes to resolve them. Axiom knows that a function FOO on the path from typeclass X can be different from FOO on the path from typeclass Y. The compiler will complain. But you can be explicit FOO$X or FOO$Y where the $ operator explicitly says "use this typeclass"
Johan Commelin (Jan 11 2020 at 07:22):
How is that remark relevant to my reply to your remark about how lisp flavours solved the diamond problem?
Tim Daly (Jan 11 2020 at 08:37):
Sorry, I thought the referenced Wikipedia article covered that question. Flavors was merged into the Common Lisp standard as the Common Lisp Object System (CLOS)
"CLOS allows multiple inheritance. When the default order in which methods are executed in multiple inheritance is not correct, the programmer may resolve the diamond inheritance problems by specifying the order of method combinations".
"CLOS attempts to provide both reasonable default behavior and the ability to override it... by giving a specific method resolution order or stating a rule for combining methods. This is called 'method combination', which may be fully controlled. The MOP (metaobject protocol) also provides means to modify the inheritance, dynamic dispatch, class instantiation, and other internal mechanisms without affecting the stability of the system."
Essentially the CLOS programmer can specify their own method of solving the diamond problem by writing 'method combinators'. So you can override the system-provided version in a specific case if you need to. This was derived from the Symbolics Flavors system.
Tim Daly (Jan 11 2020 at 08:40):
See http://www.bracha.org/oopsla90.pdf ... Mixin-based Inheritance
Tim Daly (Jan 11 2020 at 08:47):
Axiom's approach is "Well, you've got two functions from different paths with the same signature. That's fine but you have to tell me which one you mean when you decide to use it."
Tim Daly (Jan 11 2020 at 08:49):
If Lean were implemented in Common Lisp it would be much easier to resolve certain problems. (Disclosure: I'm a lisper, in case that isn't obvious)
Tim Daly (Jan 11 2020 at 08:51):
Lean could adopt the approach of specifying not only the name of the function but, if there are duplicates, also specifying the particular typeclass (otherwise complain).
Johan Commelin (Jan 11 2020 at 09:15):
If Lean were implemented in Common Lisp it would be much easier to resolve certain problems. (Disclosure: I'm a lisper, in case that isn't obvious)
Too bad (for you) that most of Lean4 is implemented in Lean4...
Johan Commelin (Jan 11 2020 at 09:16):
In Lean it is always possible to explicitly provide the type class. So I think all of the things you've described are already available. The point is (mostly) that we want things to be fast.
Johan Commelin (Jan 11 2020 at 09:16):
My remark was that mathlib's diamonds are probably the same kind as those in Lisp, but considerably more complex.
Tim Daly (Jan 11 2020 at 09:38):
Actually, that's a huge win. I'm looking forward to seeing the implementation.
Daniel Selsam (Jan 11 2020 at 13:22):
@Tim Daly Thanks for sharing the links. We seem to be using the phrase diamond problem to mean slightly different things. In Lean, in the common case, all the different paths through the diamond towers lead to the same implementation. So, deciding among alternative implementations is not the main challenge. The problem is that towers of diamonds in the instance graph induce an exponential number of paths, and without care, certain queries will spend exponential time traversing these paths. The diamond problem for us is purely a performance problem.
Tim Daly (Jan 11 2020 at 13:52):
The term 'diamond problem' has a long history. You might want to be very clear up front to distinguish your problem from the other uses of the term.
Tim Daly (Jan 11 2020 at 13:57):
If your problem is only that you might encounter the same object on different paths, it might help to memoize the path information so you never traverse it more than once. I will reread the paper with your new definition. I assumed more about the problem than you wrote I guess. My apology for the mistake.
Jason Rute (Jan 11 2020 at 14:11):
Also Rust is another language using type classes or something similar. Is any of this a concern there?
Jason Rute (Jan 11 2020 at 14:27):
In Scala, implicits are similar to Lean's type classes. They are used to support ad-hoc polymorphism, coercion, and true-type classes. They also have a similar syntax to Lean. I know they are one fo the things which makes compiling Scala so slow, particularly because of implicit macros, which are the Scala analogue of instances depending on other instances. Do you know off-hand if your method can be used to speed up implicit resolution in Scala? (Or if Scala uses a method similar to yours already?) I know the implicit resolution in Scala has some additional concerns such as trying to resolve implicits in closer levels of scope first. Also, I haven't checked for sure that diamonds or loops are allowed in implicit macros. Last, I don't know for sure that the slow down in implicit search in practice is due to an exponential search tree or if it is due to the execution time of the code in the implicit macros. Just curious if you have any thoughts on this.
Jason Rute (Jan 11 2020 at 14:28):
(Due to the Zulip mobile app issues, my messages got reversed.)
Jason Rute (Jan 11 2020 at 14:49):
Actually, I just checked with some Scala code. Diamonds are allowed in Scala in the sense that you can write the code. However, implicit search will fall if it finds two ways to resolve the same implicit at the same level of scope, so if the top of the diamond is inhabited, it will complain that there are two ways to match the expected type (even if they evaluate the same). This must mean it is checking all the paths (at least until it finds two satisfied paths). In that case, I hope it is doing something similar to your method or it could be checking an exponential number of paths.
Daniel Selsam (Jan 11 2020 at 14:53):
@Jason Rute Thanks. I have never used Scala and know very little about it. Could you please share your diamond example?
Jason Rute (Jan 11 2020 at 15:17):
You can paste this into a scala REPL.
// The classes case class A(n: Int) // this like a structure type case class B(n: Int) case class C(n: Int) case class D(n: Int) // the implicit macros implicit def dToC(implicit d: D): C = C(d.n) implicit def dToB(implicit d: D): B = B(d.n) implicit def bToA(implicit b: B): A = A(b.n) implicit def cToA(implicit c: C): A = A(c.n) // case 1: only one valid path. This block will return A(42) { // restrict scope implicit val b: B = B(42) // can use implicitly to check implicit implicit resolution "def implicitly[T](implicit t : T): T = T" implicitly[A] } // case 2: two valid paths which meet together. This will fail. { // restrict scope implicit val d: D = D(42) implicitly[A] // compiler will complain about "ambiguous implicit values" } // case 3: two valid paths (one which extends the other). This will succeed and return A(43) { // restrict scope implicit val b: B = B(42) implicit val d: D = D(43) implicitly[A] } // case 3: two valid paths not extending each other. This will fail. { // restrict scope implicit val b: B = B(42) implicit val c: C = C(43) implicitly[A] // compiler will complain about "ambiguous implicit values" }
Daniel Selsam (Jan 11 2020 at 15:59):
@Jason Rute Thank you very much. I just confirmed that it also scales exponentially on the tower of (failing) diamonds.
Daniel Selsam (Jan 11 2020 at 16:00):
I am not sure if it merits putting in the paper though. For all I know, their instance logic may be simple enough that relatively naive caching would address this. And I doubt it is a problem in practice for them.
Daniel Selsam (Jan 11 2020 at 16:03):
Haskell is the same boat. We discuss and compare against Haskell mainly due its historical significance in pioneering typeclasses. I suspect Haskell could implement a simple caching scheme, and I highly doubt it would make sense for them to implement tabled resolution.
Rob Lewis (Jan 16 2020 at 18:13):
@Daniel Selsam Thanks for sharing the paper, it's very nice. Very excited to use the new procedure in Lean 4. I have some minor line comments on the paper I'll send you directly, but a few more general points:
-
One of the frustrating parts of Lean 3 type class resolution is error reporting. To some extent I think this is inherent in the algorithm, it can be hard to localize what goes wrong. I wonder if this gets better or worse with tabled resolution. Is this something you've thought about at all and is there anything worth saying in the paper? I'd be very interested to read a paragraph on this in section 5.5 if there is.
-
Both appear in the literature, but "type class" seems to be more common than "typeclass" (except maybe surrounding Coq). Any reason to prefer the latter?
-
There might not be space, but I'd be curious to see one or two more examples at the end, eg the frustrating int coe issue we had in mathlib. I don't think it fits the same pattern as the tower of diamonds. (edit - thinking harder, maybe it is essentially the same idea?)
-
Mathlib is dropped in toward the beginning without introduction. One of the reasons we wrote the mathlib paper was just for this reason, so we can cite it without describing it!
Patrick Massot (Jan 16 2020 at 18:15):
Another question that Coq users keep asking whenever we mention type class resolution. Will we have cuts?
Daniel Selsam (Jan 16 2020 at 18:56):
@Patrick Massot To the best of my knowledge, nobody has yet brought to our attention any issue that cuts might help address, so we haven't considered it.
Daniel Selsam (Jan 16 2020 at 19:02):
Mathlib is dropped in toward the beginning without introduction. One of the reasons we wrote the mathlib paper was just for this reason, so we can cite it without describing it!
Funny, I thought we cited the Mathlib paper in both the abstract and the introduction, but we don't seem to cite it all... :thinking: ... must have gotten lost during refactoring. Great catch!
Daniel Selsam (Jan 16 2020 at 19:11):
One of the frustrating parts of Lean 3 type class resolution is error reporting. To some extent I think this is inherent in the algorithm, it can be hard to localize what goes wrong. I wonder if this gets better or worse with tabled resolution. Is this something you've thought about at all and is there anything worth saying in the paper? I'd be very interested to read a paragraph on this in section 5.5 if there is.
This is an interesting question. Tracing and error reporting for typeclass resolution will be way better in Lean4, not because of the tabled typeclass resolution procedure but because of a much better system-wide tracing framework. I think tabling will help as well, but not for any deep reason -- mainly just because the traces will be that much shorter (in some cases, exponentially shorter). I don't think this is a paper-worthy insight though.
Daniel Selsam (Jan 16 2020 at 19:14):
The discrimination trees (S5.3, indexing the instances) will shrink the traces a lot too. Lean3 tries to resolve every goal with every instance whose result type has the same head symbol, and the vast majority of these attempts fail. Lean4 will only try a much smaller superset of the instances that will successfully resolve.
Daniel Selsam (Jan 16 2020 at 19:32):
the frustrating int coe issue we had in mathlib
AFAICT this issue is just the "succeeding diamond following by failing downstream goal" variant of the diamond problem, which will be solved by tabling (and is also solved by Coq's short-circuit trick). However, even without the exponential blowup, typeclass resolution still does need to do search, so the "most-constrained variable first" advice you gave in Zulip awhile back is still sound.
Daniel Selsam (Jan 16 2020 at 19:38):
Both appear in the literature, but "type class" seems to be more common than "typeclass" (except maybe surrounding Coq). Any reason to prefer the latter?
I prefer "typeclass", since typeclass is a distinct concept from the compositional meaning of "type class". I also always write "datastructure" as opposed to "data structure" for the same reason.
Patrick Massot (Jan 16 2020 at 20:51):
To the best of my knowledge, nobody has yet brought to our attention any issue that cuts might help address, so we haven't considered it.
I know nothing about this, I'm only repeating questions I heard @Cyril Cohen and @Assia Mahboubi asking.
Rob Lewis (Jan 16 2020 at 20:54):
I see, thanks for clarifying! With the error reporting, my hope was that it might be possible to extract some kind of list of possible missing instances from the info in the table. I don't have any real proposal on how to do this; it just seems more plausible than with the old method. Maybe it's really just a matter of size. Either way, it does sound like the new approach will be way easier to work with. So thanks again.
Sebastian Ullrich (Jan 16 2020 at 21:13):
There could be some heuristics like reporting the "deepest" failing goal as the "most likely point of failure". This is relatively independent of the actual algorithm, could be worth an experiment at some point.
Daniel Selsam (Jan 16 2020 at 22:11):
There could be some heuristics like reporting the "deepest" failing goal as the "most likely point of failure". This is relatively independent of the actual algorithm, could be worth an experiment at some point.
Depth is not as straightforward a concept in tabled resolution as it is in SLD.
Cyril Cohen (Jan 17 2020 at 09:28):
Hi, not only cuts are a good tool to control the behavior of backtracking during proof search, but they are also a good tool for debugging, in order to find in which branch the problems are actually happening, and to get a cleaner trace.
Daniel Selsam (Jan 17 2020 at 19:54):
Hi, not only cuts are a good tool to control the behavior of backtracking during proof search, but they are also a good tool for debugging, in order to find in which branch the problems are actually happening, and to get a cleaner trace.
My hope is that typeclass resolution will be so fast in mathlib4 that there will be no need try to improve performance further using cut
. As for debugging, in Lean4, the trace messages are collected in a datastructure, and users will be able to write their owns scripts for traversing and filtering the traces as they desire.
Bas Spitters (Jan 20 2020 at 17:08):
I'm new here. Thanks for the nice paper!
I understand lean does not use canonical structures. Can anyone clarify, as they are popular in Coq (math-comp)
This should probably go somewhere else, but I'd like to understand more about congruence, and more generally automation in lean.
Is the congruence paper still the best reference?
Reid Barton (Jan 20 2020 at 20:18):
The short explanation is that type classes perform the corresponding function
Daniel Selsam (Jan 21 2020 at 02:49):
I'm new here. Thanks for the nice paper!
I understand lean does not use canonical structures. Can anyone clarify, as they are popular in Coq (math-comp)
@Bas Spitters Welcome! Typeclasses are remarkably simple and versatile, and (at least with tabling) seem to work extremely well for zillions of diverse uses. Do you see any major advantages offered by canonical structures?
Daniel Selsam (Jan 21 2020 at 03:08):
This should probably go somewhere else, but I'd like to understand more about congruence, and more generally automation in lean.
Is the congruence paper still the best reference?
We haven't started building automation for Lean4 yet. Note that the congruence closure procedure from IJCAR2016 doesn't seem to be that popular in mathlib --- the main workhorse has been the simplifier. We are not currently planning many changes to the simplifier, though our preliminary experiments suggest that we may be able to improve its performance by 10-100x. Mathlib also includes many diverse, interesting tactics written by users. You may want to ask a mathlib power-user to give you a tour.
Mario Carneiro (Jan 21 2020 at 03:17):
@Daniel Selsam Can you give any clues as to where the 10x performance improvement in simp
is likely to come from?
Simon Hudon (Jan 21 2020 at 03:56):
Already indexing simp
lemmas with discrimation trees should make a big difference. Right now, if you have a lemma of the same @coe (list a) (multiset a) xs = ...
, it is simply indexed by the function name coe
. With a discrimination tree, you can have the index distinguish between @coe (list a) (multiset a) xs
and @coe nat int n
. Where as the Lean 3 solution gives us hundreds of collisions for coe
, the Lean 4 solution should narrow down the search more aggressively before trying to apply rewrite rules.
Patrick Massot (Jan 21 2020 at 08:46):
@Daniel Selsam do you know what happened to https://github.com/leanprover/lean/wiki/Simplifier-Features? It seemed to have many ideas to improve the simplifier. Note that most of it is covered by the golden rule of automation: compare with Isabelle.
Daniel Selsam (Jan 21 2020 at 11:01):
Daniel Selsam Can you give any clues as to where the 10x performance improvement in
simp
is likely to come from?
@Mario Carneiro By far the single biggest speedup will come from better caching of the simp sets. Last I checked, building mathlib involves ~70 minutes of building simp sets but only ~25 minutes inside the actual simplifier. A second big speedup will come from using discrimination trees instead of head maps as @Simon Hudon said. A third will come from being more careful not to accidentally traverse huge terms (e.g. by eagerly reducing projections and abstracting all proof terms).
Daniel Selsam (Jan 21 2020 at 11:05):
Daniel Selsam do you know what happened to https://github.com/leanprover/lean/wiki/Simplifier-Features? It seemed to have many ideas to improve the simplifier. Note that most of it is covered by the golden rule of automation: compare with Isabelle.
I was not aware of this document. Thank you for sharing.
Bas Spitters (Jan 21 2020 at 17:11):
Here's a problem with the fully unbundled approach. The term size can blow up, so we need a mechanism to go between bundled and unbundled structures. My understanding is that you are using the (mostly) unbundled approach in lean.
I believe this is also a main argument by the math-comp group to use canonical structures.
It has been argued that we'd need general unification hints to get the best of both worlds (type classes and canonical structures)
https://coq.discourse.group/t/blog-post-exponential-blowup-when-using-unbundled-typeclasses-to-model-algebraic-hierarchies/289/2
Floris van Doorn (Jan 21 2020 at 18:22):
@Bas Spitters Note that Lean 3 does not use the fully unbundled approach. We use a partially bundled approach (for most classes in the algebraic hierarchy only the type is an argument to the class). This doesn't have the exponential blowup problem. This was also discussed here:
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/what.20is.20a.20diamond.3F
There are problems with canonical structures as well. One thing I dislike about canonical structures is the amount of boilerplate code you have to write to add a new structure to the hierarchy (for example, the number of unification hints you have to explicitly declare is quadratic in the number of structures, and forgetting one can cause hard-to-debug errors).
Combining both type classes and canonical structures in a library sounds very dangerous: it sounds a lot more manageable to stick with a single approach, than continually translating between two different approaches.
Daniel Selsam (Jan 21 2020 at 19:50):
Here's a problem with the fully unbundled approach. The term size can blow up
AFAICT only the term tree size blows up, and I don't see any fundamental reason why all operations in the system couldn't in principle be made to scale in the dag sizes instead of the tree sizes. Note that Lean currently does scale exponentially on https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/huge.20term.20trees/near/185818383 , though this is a conscious decision: we don't hash-cons and we use faster, imprecise caches in many places. Some recent discussion at https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/huge.20term.20trees
Daniel Selsam (Jan 22 2020 at 21:51):
FYI the paper is on arXiv now: https://arxiv.org/abs/2001.04301
Patrick Massot (Jan 22 2020 at 21:52):
Are there differences with the version you posted here?
Patrick Massot (Jan 22 2020 at 21:53):
I can see in the reference section "35. mathlib Community, T.: The Lean mathematical library."
Daniel Selsam (Jan 22 2020 at 21:53):
Thanks again everyone for the helpful feedback.
Daniel Selsam (Jan 22 2020 at 21:56):
Are there differences with the version you posted here?
Almost no changes to the description of the procedure. The main improvements are highlighting mathlib as a primary motivator, and explaining the subtle but surprisingly big gap between what typeclasses mean in most systems vs. what they mean in Coq and Lean.
Bas Spitters (Jan 22 2020 at 23:00):
According to Georges, canonical structures are the real type classes in Coq. They are key in the math-comp library.
One advantage is that they are predictable. It's interesting that they are not used in lean.
On the other hand, they have also been an obscure feature for a long time in Coq, before they became popular in math-comp.
Daniel Selsam (Jan 22 2020 at 23:16):
No one will drive us from the paradise that Wadler and Blott created for us.
Kevin Buzzard (Jan 22 2020 at 23:18):
They do exist in Lean 3 but I think someone (Cyril Cohen?) once told me that they were used precisely once in mathlib, and so there was not much of an argument for keeping them in Lean 4...
Yury G. Kudryashov (Jan 22 2020 at 23:22):
Semi-offtopic: it would be nice to have a document that explain current mathlib
policies about class A extends B :=
vs class A [B] :=
.
François G. Dorais (Jan 22 2020 at 23:22):
Are there planned changes to unification hints in Lean 4?
Mario Carneiro (Jan 22 2020 at 23:23):
removal?
Mario Carneiro (Jan 22 2020 at 23:23):
that's more of an "effected change" than a "planned change" though
François G. Dorais (Jan 22 2020 at 23:24):
Why?
Mario Carneiro (Jan 22 2020 at 23:25):
They do exist in Lean 3 but I think someone (Cyril Cohen?) once told me that they were used precisely once in mathlib, and so there was not much of an argument for keeping them in Lean 4...
Mario Carneiro (Jan 22 2020 at 23:25):
Lean 4 also appears to be doubling down on typeclass support with this new algorithm, while effective implementation of unification hints for comparable benefit is a big wild card
Mario Carneiro (Jan 22 2020 at 23:26):
Typeclasses and unification hints are somewhat mutually exclusive mechanisms, so we have to pick one and stick with it
Yury G. Kudryashov (Jan 22 2020 at 23:27):
I mean, for a mathematician both .. extends ..
and ... [B] ...
mean the same, and it's hard to guess technical consequences of choosing one over another. As far as I understand, with extends
we have to think about diamonds, and with ... [B]
we may create really huge goals. Am I correct?
Mario Carneiro (Jan 22 2020 at 23:28):
I've written about the policy before, but the short version is: if you use the same types as the parent, inherit, otherwise add it as a parameter
Daniel Selsam (Jan 22 2020 at 23:29):
Before cementing a policy, I suggest figuring out where the term tree blowup is coming from that was discussed in https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/huge.20term.20trees
Mario Carneiro (Jan 22 2020 at 23:29):
The exponential blowup is in the number of mutually referring parameters in a highly parameterized definition. With this policy, this situation only arises if the typeclass relates many different types, with lots of structure between proper subsets of those types
François G. Dorais (Jan 22 2020 at 23:30):
I've been using them, but they have so little support it's hard to use them. I've been using type classes and unification hints together since they can be made to work together in orthogonal directions pretty well.
Mario Carneiro (Jan 22 2020 at 23:30):
@Daniel Selsam I'm talking about the lean 3 mathlib policy. This may change in lean 4 depending on how lean performs on different loads
Sebastien Gouezel (Jan 22 2020 at 23:30):
if you use the same types as the parent, inherit
except if you are only addings Props and there is a big zoo of them, and an exponentially large number of possible combinations (just as in topological spaces, which might be second countable, order topology, T2, and so on). Then the Props should be mixins.
Mario Carneiro (Jan 22 2020 at 23:32):
I agree with that too. You shouldn't have the definition of one such mixin depend on another as a parameter, so there is no parameter depth problem here either
Mario Carneiro (Jan 22 2020 at 23:33):
A data typeclass should never have a prop mixin as a parameter, since there is no diamond problem here
Yury G. Kudryashov (Jan 22 2020 at 23:36):
Another possibly stupid question. Would the following approach to algebraic classes create an exponential blow-up?
- Define classes
has_mul
,has_add
(one per an operation and a constant). - Declare a policy "only one instance of these classes per a type".
- Make prop classes, one per each property (e.g.,
class is_mul_comm
). These classes takehas_mul
etc as parameters. - Create common classes (
group
,monoid
, ...) byextend
ing various combinations of one-property classes.
Chris Hughes (Jan 22 2020 at 23:38):
There was talk at some point about making group
a Prop that took a binary function as an argument.
Yury G. Kudryashov (Jan 22 2020 at 23:39):
As a mathematician, I like a possibility to write a lemma taking specific is_*
as parameters if a specific combination is not declared as a class
. Think about all those ordered_comm_monoid
etc.
Mario Carneiro (Jan 23 2020 at 00:13):
That would not cause a blowup (the only thing that matters for that is the signatures of the classes), but it would make the hierarchy a bit larger and (IMO) a bit harder to understand, because the axioms are now separated from the useful classes. I think a policy like this is where mul_zero_class
and distrib
came from
Mario Carneiro (Jan 23 2020 at 00:15):
As a mathematician, I like a possibility to write a lemma taking specific
is_*
as parameters if a specific combination is not declared as aclass
. Think about all thoseordered_comm_monoid
etc.
These usually can't be just combinations of classes, because there are also compatibility claims between the order and the comm_monoid part
Chris Hughes (Jan 23 2020 at 00:24):
But there are tons of variations of it for partial and total orders, commutativity, cancellation. They're also slightly confusing, I remember something like not every ordered_comm_monoid
is an ordered_monoid
or something.
Mario Carneiro (Jan 23 2020 at 00:32):
Some of that is legacy from the core/mathlib split
Patrick Massot (Jan 23 2020 at 09:30):
We still have one example when canonical structures do something we have never been able to do with type classes: math-comp's bigop. They have a big operators library which is completely generic in the operator. This requires having lemmas whose input are predicates like is_associative
and is_commutative
. Such things were tried with type classes, there are remnants in the core library, but it never took off. My understanding is we always need some constant head symbol, like has_add.add
instead of a mysterious operator op
and a type class [is_associative op]
.
Patrick Massot (Jan 23 2020 at 09:31):
In this bigop library they can prove one lemma covering big sums, big products, big intersection, big direct sum of modules etc where we have separate lemmas for each case (with giant holes in the API until someone needs each specific case).
Patrick Massot (Jan 23 2020 at 09:33):
But there is a huge barrier to trying canonical structures, we would need to rewrite the whole library. At the end of summer there was almost enough incentive with all the type class resolution failure or crazy timing. But then Daniel came and promised all this will be solved in Lean 4 so we wait.
Gabriel Ebner (Jan 23 2020 at 09:48):
My understanding is we always need some constant head symbol, like has_add.add instead of a mysterious operator op
This is only true for the simplifier, which looks up the simp lemmas by the name of the head symbol. I don't think that's a problem to develop generic bigops. BTW, we have a type class is_associative
and also instances like is_associative (*)
for semigroups, etc.
Johan Commelin (Jan 23 2020 at 09:51):
@Gabriel Ebner Could you please clarify. Should simp
-lemmas have these constant head symbols, or only the terms that we are matching?
Johan Commelin (Jan 23 2020 at 09:52):
Because I think @Patrick Massot would like a world where you state a lemma for a generic op
and then use it in the simplifier when applying it to some has_add.add
occurence.
Gabriel Ebner (Jan 23 2020 at 09:53):
Errm, both:
@[simp] lemma (r : ℕ → ℕ → Prop) ... : r x y ↔ r y x := ... -- bad since relation `r` is not a constant @[simp] lemma (f : ℕ → ℕ → ℕ) [is_associative f] ... : f (f x y) z = f x (f y z) := ... -- bad since head symbol `f` of lhs is not a constant
Johan Commelin (Jan 23 2020 at 09:55):
Ok. so that's bad. Because it means we have to state each lemma 5 times, instead of once.
Johan Commelin (Jan 23 2020 at 09:55):
Which is exactly the situation that Patrick is describing.
Anne Baanen (Jan 23 2020 at 09:56):
If we replaced finset.prod
with finset.bigop (*)
, then finset.bigop
would be the head symbol, no?
Johan Commelin (Jan 23 2020 at 09:56):
Of course we could try to generalise the proofs, but it's clear that in this regard our setup with typeclasses is suboptimal.
Johan Commelin (Jan 23 2020 at 09:58):
If we replaced
finset.prod
withfinset.bigop (*)
, thenfinset.bigop
would be the head symbol, no?
Aha, I guess that might work, but it would mean a step down in readability.
Johan Commelin (Jan 23 2020 at 09:59):
If finset.prod
is than made an abbreviation
of finset.bigop (*)
and simp-lemmas can see through such an abbreviation, then it might work. But I'm not sure if that actually works.
Johan Commelin (Jan 23 2020 at 10:01):
I tried something like this in a commit on folding min and max: https://github.com/leanprover-community/mathlib/commit/b031290b228838d5e779908b129420331bb131c5
Patrick Massot (Jan 23 2020 at 10:02):
Sure, we can have finset.bigop (*)
and a notation for that. But then the library would still need lemmas about the generic op
and the simplifier wouldn't trigger
Johan Commelin (Jan 23 2020 at 10:02):
More than half of that commit is stating specialised versions of the two general lemmas, and even the two general lemmas ought to be unified.
Patrick Massot (Jan 23 2020 at 10:03):
See also https://github.com/PatrickMassot/bigop/blob/master/src/bigop.lean for my old attempt to port the bigop library
Johan Commelin (Jan 23 2020 at 10:04):
Could we have attributes that state the specialised simp-lemmas after the generic op
version?
Patrick Massot (Jan 23 2020 at 10:04):
It sorts of work (although it faces elaboration issues). But then you need to actually use it in the middle of other proofs where you expect the simplifier to work.
Sebastian Ullrich (Jan 23 2020 at 10:05):
Yes, that was one idea for making the simplifier work with more unbundled typeclasses. Throw a metaprogram at it.
Sebastian Ullrich (Jan 23 2020 at 10:05):
The specialized lemmas would be generated eagerly
Sebastien Gouezel (Jan 23 2020 at 10:12):
Or craft an attribute on the generic lemmas that would generate the versions for sums, products, unions, and so on, which would be generic enough that one can add new instances to the attributes (for instance max), and then all the lemmas would become available right away for max.
Sebastien Gouezel (Jan 23 2020 at 10:12):
Oups, Johan just said this two messages ago!
Daniel Selsam (Jan 23 2020 at 11:43):
... the simplifier, which looks up the simp lemmas by the name of the head symbol.
Lean4 will use a discrimination tree instead of a head map.
Johan Commelin (Jan 23 2020 at 11:50):
@Daniel Selsam In other words: in Lean 4 we can have typeclasses and an elegant bigop library?
Daniel Selsam (Jan 23 2020 at 12:18):
I didn't mean to imply that. To be honest, I don't remember all the desiderata and challenges involved in this issue.
Daniel Selsam (Jan 23 2020 at 12:25):
How much of the concern is automation (or the lack thereof) when proving the generic big-op lemmas themselves? For using the big-op theorems for concrete ops in downstream automation, what are the downsides of the meta-programming approach?
Sebastian Ullrich (Jan 23 2020 at 12:30):
There is a minor diamond issue: if you define a bigop simp lemma in one file and a new bigop in an independent file, you might not get a specialized version of this lemma for that bigop when you import both, depending on the implementation
Sebastian Ullrich (Jan 23 2020 at 12:31):
There could of course be an explicit command "generate all missing combinations at this point"
Patrick Massot (Jan 23 2020 at 16:49):
I'm sorry I have very very little time for Lean right now (and probably for the coming two weeks). I only try to remember what happened one year ago. But I see from https://github.com/leanprover/lean4/commit/c9e9208ea29a34fe715310f47fa5ce786a4b3b8e that Leo is on it.
Sebastian Ullrich (Jan 23 2020 at 17:10):
Good eye. This is strictly about notations for now though.
Patrick Massot (Jan 23 2020 at 20:12):
I noticed it was notation only, but it still suggest this bigop question is somewhere in Leo's mind.
Daniel Selsam (Jan 23 2020 at 20:28):
We all love big-ops. The main issue is that we don't want automation (e.g. the simplifier) to stupidly try applying every big-op lemma to every subterm for lack of decent indexing. It might be possible to extend our discrimination trees to index these lemmas better, but so far the macro-specialization approach seems acceptable to me.
Bas Spitters (Jan 30 2020 at 14:17):
The Coq devs tried to convert mathematical components to type classes, but got stuck
https://github.com/coq/stdlib2/wiki/EqClass
Would this be solved by the lean type class mechanism?
Bas Spitters (Feb 09 2020 at 12:53):
Robbert Krebbers asks whether tabled type classes solve term blow up:
https://www.ralfj.de/blog/2019/05/15/typeclasses-exponential-blowup.html
This thread seems to suggest that it doesn't??
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/huge.20term.20trees
Of course, we were aware of the issue when writing the Coq math-classes library, and proposed to bundle parts when needed.
Floris van Doorn (Feb 10 2020 at 01:49):
That post was discussed here:
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/what.20is.20a.20diamond.3F
If my memory is correct, since Lean's typeclasses are partially bundled and not unbundled, the term blow up is quadratic, not exponential, even with a tree-encoding of a term (maybe it's smaller with a dag-encoding?)
Mario Carneiro (Feb 10 2020 at 02:02):
everything is linear with a dag-encoding
Mario Carneiro (Feb 10 2020 at 02:03):
it is exponential in the number of parameters to a definition, but in mathlib n = 1 in almost all cases (for modules and such n = 2)
Mario Carneiro (Feb 10 2020 at 02:04):
You can also use the complexity theory term "fixed-parameter tractable" to give it a positive spin
Daniel Selsam (Feb 10 2020 at 15:44):
@Bas Spitters The size of the tree-expansion of terms is orthogonal to the problems that tabling solves. BTW I would have expected Coq to be polynomial on https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/huge.20term.20trees/near/185865827 since it hash-conses -- do you know why Coq scales exponentially on it? Perhaps they are just forgetting a cache somewhere and are traversing a term as a tree unnecessarily.
Bas Spitters (Feb 10 2020 at 16:16):
Enrico Tassi said:
Coq is quite hysterical: it does opportunistic sharing via evar assignment, but disregards that sharing while doing conversion/reduction...
They are looking into fixing this.
Daniel Selsam (Feb 10 2020 at 21:22):
@Bas Spitters FYI we briefly discussed this sharing issue in the Lean4 dev meeting this morning. Tentative plan is to not try to scale in the DAG size, nor to preserve sharing everywhere, but rather to aggressively let-abstract in key places and to make sure never to zeta-reduce unnecessarily.
Mario Carneiro (Feb 10 2020 at 23:50):
@Daniel Selsam Does the lean kernel know how to deal with let binders in the context now? I recall the algorithm in lean 3 (copied for my paper) is to eagerly expand let binders during typechecking, because the definition of "context" does not have let-binders (unlike in the tactic state, which represents let binders in the context explicitly)
Daniel Selsam (Feb 11 2020 at 17:36):
@Mario Carneiro [CORRECTED] Yes, it has support for let
binders in the context now.
Last updated: Dec 20 2023 at 11:08 UTC