Zulip Chat Archive

Stream: general

Topic: what is a diamond?

Kevin Buzzard (Apr 21 2019 at 04:00):

What is a typeclass diamond (note : this has nothing to do with Scholze's diamonds, it's an unfortunate name-clash)? I think it's something like the above link. Two distinct instance routes from A to D, with typeclass inference giving us two terms of type D which are provably distinct. I am assuming that the system should be regarded as having failed at this point. How much can I get away with? I think there are other lower category diamonds as well though, right? What other things can go wrong? Instances which are equal but not computably equal? Instances which are equal but not definitionally equal? Is there some sort of complete catalogue of diamonds?

Keeley Hoek (Apr 21 2019 at 04:03):

With regard to auto-naming, what do you think a good strategy for naming those would be? It would ~~probably~~ be really easy to implement

Kevin Buzzard (Apr 21 2019 at 04:09):

I was going from xena.B to xena.D in my example, in the xena namespace, and getting a name which depended only on D. I guess I want to have B tagging along somehow. Is there some clever thing you can do with dots?

Floris van Doorn (Apr 26 2019 at 14:55):

I think the term type class diamond is used whenever there are two ways to go from class A to class D, regardless of whether the two resulting instances are definitionally equal or not. I think it will be quite a big problem if you have a diamond and the resulting instances are not definitionally equal (whether they are provably equal is less important. I think we should restrict instances enough so that we don't get these kinds of bad diamonds.

Floris van Doorn (Apr 26 2019 at 14:55):

golf:

theorem you_are_in_typeclass_trouble_now : d1 ≠ d2 := by rintro ⟨⟩

Johan Commelin (Apr 26 2019 at 14:58):

If the class is a subsingleton, I want to be able to use non-defeq instances.

Johan Commelin (Apr 26 2019 at 14:59):

That ought to be possible, right?

Kevin Buzzard (Apr 26 2019 at 15:14):

That currently is annoying. @Sebastian Ullrich I know the type class system will not be changing, but this issue about typeclass inference failing to do subsingleton elimination is something we mathematicians are beginning to run into. Am I asking for rfl++ or something?

Patrick Massot (Apr 26 2019 at 15:15):

I don't think Sebastian said the type class system won't change. He wrote they didn't know yet what would be a better system. It's clear it has to improve, the current system simply doesn't scale.

Patrick Massot (Apr 26 2019 at 15:18):

Part of the problem is indeed type-theoretic but, as far as I understand, most of the issues are completely generic (in which order do we search, what is cached, etc.)

Wojciech Nawrocki (May 15 2019 at 21:31):

Ralf Jung, who's a Coq expert, just wrote a blog post (link) about the problems diamonds cause and possible solutions. It seems that it's still a mostly unsolved problem, and possibly some issues you encounter here are due to the exponential blowup described there.

Reid Barton (May 15 2019 at 23:13):

Interesting post. How did you come across it, by the way?

Reid Barton (May 15 2019 at 23:15):

The diamonds mentioned in that post are of a different type, related to classes extending other classes

Reid Barton (May 15 2019 at 23:24):

For example, B and C both extend A and D extends both B and C. If you have a D then you probably want the As to be defeq, but there's no way to even express that.

Simon Hudon (May 15 2019 at 23:26):

In both Coq and Lean?

Reid Barton (May 16 2019 at 03:05):

I think the issues are mostly the same in both. I guess I didn't pick the best example--in this case Lean will deduplicate the A fields in D if you use the old structure command, I think?

Reid Barton (May 16 2019 at 03:06):

A better example is if you don't want to make the class D at all

Reid Barton (May 16 2019 at 03:07):

because you have lots of things that extend A, and you don't want to make all the combinations.
Then you have no way to say "A which is both B and C"

Johan Commelin (May 16 2019 at 06:16):

@Wojciech Nawrocki Thanks for sharing this! Interesting read.

RalfJ (May 16 2019 at 09:33):

Hi! Author of that post here. Lean was also just mentioned in our Coq-side discussion of this post. I am very curious how Lean compares here.

RalfJ (May 16 2019 at 09:35):

The diamonds mentioned in that post are of a different type, related to classes extending other classes

Yeah, the diamonds I saw mentioned in the backlog here are diamonds in the inference search tree, not in the algebraic hierarchy (which are the kinds of diamonds that bundling approaches have problems with). I mention that issue in my post as well though I dont call it a dimaond -- this is the exponential blowup coming from the Monoid A -> Semigroup A instance together with the prod instances, where there are now two ways to derive Monoid (A*A) -> Semigroup (A*A).

Johan Commelin (May 16 2019 at 09:51):

Cool, welcome!

Mario Carneiro (May 16 2019 at 10:22):

@RalfJ Lean has the same foundational problem, but it isn't yet a big issue. The blog post doesn't discuss the style used in mathlib, which is to have "partially unbundled" structures, where the type is a parameter but none of the operators. This minimizes the nesting level, although there are some examples where we have deeper depth like modules (that use two variables) and there was an example using three variables that floated on the chat recently

Mario Carneiro (May 16 2019 at 10:23):

Also, not mentioned in that asymptotic analysis is the amount of subterm sharing in the term. I think the size of those instances is actually linear, but if you write it as a tree instead of a dag then it becomes quadratic or cubic etc

Johan Commelin (May 16 2019 at 10:23):

But it is very likely to become a problem in the (near?) future?

Mario Carneiro (May 16 2019 at 10:24):

I don't think so... you have to be dealing with some pretty complicated objects for it to come up, so if it was an issue it would have stopped us already

Mario Carneiro (May 16 2019 at 10:25):

We have issues with typeclass search order, but that's a different matter

Scott Morrison (May 16 2019 at 10:27):

This is a really great thing about the lean-perfectoid-spaces project: we've made a definition which is about as complicated as they come (at least in algebraic geometry / number theory?), and Lean could cope.

Kevin Buzzard (May 16 2019 at 10:36):

One of my main motivations for doing perfectoids was to see if Lean could handle them -- it was clear from the start that it was theoretically possible but Patrick pointed out very early on that this didn't mean that it was possible, and at times I was (incorrectly) worried that it might not be. I am still trying hard to understand what Mario wrote about timeouts. I've taken to trying to compile the perfectoid project with smaller values for the timeout parameter and am trying to understand what makes part of it slow to compile -- everything so far is pointing at naivity in the code -- poor proof engineering, as Mario calls it.

Kevin Buzzard (May 16 2019 at 10:36):

In short, I have no formal training in programming and this is now beginning to show

Kevin Buzzard (May 16 2019 at 10:37):

But at the end of the day the project does compile

RalfJ (May 16 2019 at 11:08):

"partially unbundled" structures, where the type is a parameter but none of the operators

I mentioned having the operators unbundled but the superclasses bundled, that gives you O(n^3) If you just have the type unbundled you get O(n^2), but you have even more sharing problems and you cannot even use the typeclasses for the operators and their notation before having shown the axioms. We had some cases where that was annoying, but the real trouble here is sharing once your inheritance hierarchy has a diamond.

RalfJ (May 16 2019 at 11:09):

@Mario Carneiro

not mentioned in that asymptotic analysis is the amount of subterm sharing in the term. I think the size of those instances is actually linear, but if you write it as a tree instead of a dag then it becomes quadratic or cubic etc

I explicitly mention that in the conclusion. :) And yes, if you fully exploit sharing I think it remains linear. But then you have to ensure that every pass traversing your AST (unification, reduction, elaboration, type checking, whatever...) exploits sharing. A single lapse and your get the bad complexities back. At least in the Coq world people tell me that is not feasible.

RalfJ (May 16 2019 at 11:12):

I don't think so... you have to be dealing with some pretty complicated objects for it to come up, so if it was an issue it would have stopped us already

in our case, it was really when we started to put the hierarchy to use that the performance became an issue. as long as you are just defining your algebraic classes and show all sorts of properties about them it's good, but then we ended up defining our logic with 5 or more nested combinators (i.e., the equivalent of Zn 5), and then we nest that again in several more layers, so we end up with n=10 and more -- and then we have quite a few lemmas about that, and everything got slow. I have no idea what people have been doing in Lean so I cannot compare.

Mario Carneiro (May 16 2019 at 11:13):

what do you mean about Zn 5?

RalfJ (May 16 2019 at 11:14):

I was referring to my post; I mean that we had something like iProp := T*U*V -> A -> B, where * and -> have TC instances like prod in my post. This is like Zn 5 = Z*Z*Z*Z*Z.

Mario Carneiro (May 16 2019 at 11:15):

so you are trying to infer that e.g. Z^5 has a ring structure?

RalfJ (May 16 2019 at 11:15):

and then we built things like ((X -> iProp) -> T -> iProp) -> ((X -> iProp) -> T -> iProp). So if you count the number of combinators here, it's... 20 or more.

Mario Carneiro (May 16 2019 at 11:16):

I'm confused about the role of T,U,V etc in those examples

RalfJ (May 16 2019 at 11:16):

just random types that are in our hierarchy

Mario Carneiro (May 16 2019 at 11:16):

You have a theorem that has 20 type arguments, plus all their typeclasses?

RalfJ (May 16 2019 at 11:17):

like, random rings, if we stick to that example

RalfJ (May 16 2019 at 11:17):

no, these are all concrete types

RalfJ (May 16 2019 at 11:17):

well, most of them

Mario Carneiro (May 16 2019 at 11:18):

Inferring that Z^5 has a ring structure should be quadratic, since you have to talk about Z^n itself and the ring structure for each of the subterms

Mario Carneiro (May 16 2019 at 11:18):

it's not quintic

RalfJ (May 16 2019 at 11:18):

with the fully unbundled approach and if ring is a subclass of group, the power is 6 at least

RalfJ (May 16 2019 at 11:19):

that's what I explained in my post

Mario Carneiro (May 16 2019 at 11:20):

So in mathlib if ring is a subclass of group that only increases things by a constant factor, I think, since the typeclass args will be twice as big

Mario Carneiro (May 16 2019 at 11:20):

I think fully unbundled is the killer here

RalfJ (May 16 2019 at 11:22):

I think fully unbundled is the killer here

it definitely is

RalfJ (May 16 2019 at 11:23):

but we've also had loads of problems with bundling^^

Mario Carneiro (May 16 2019 at 11:23):

with partially unbundled style you won't get that kind of thing unless you have n different type arguments, which is quite rare for mathematical structures

RalfJ (May 16 2019 at 11:23):

the sheer number of arguments doesnt even matter

RalfJ (May 16 2019 at 11:23):

only their "level of nesting" -- like arguments that have arguments

Mario Carneiro (May 16 2019 at 11:24):

So I guess this is another variant on "don't use dependent types in DTT"

RalfJ (May 16 2019 at 11:24):

;)

Johan Commelin (May 17 2019 at 03:31):

@RalfJ I don't know if you have Lean installed. I ran this code to illustrate how mathlib/Lean deals with this:

section
set_option pp.all true

lemma inst01 : ring ℤ := by apply_instance
lemma inst02 : ring (ℤ × ℤ) := by apply_instance
lemma inst03 : ring (ℤ × ℤ × ℤ) := by apply_instance
lemma inst04 : ring (ℤ × ℤ × ℤ × ℤ) := by apply_instance
lemma inst05 : ring (ℤ × ℤ × ℤ × ℤ × ℤ) := by apply_instance

#print inst01
#print inst02
#print inst03
#print inst04
#print inst05
end

The output of the last two prints is:

theorem inst04 : ring.{0} (prod.{0 0} int (prod.{0 0} int (prod.{0 0} int int))) :=
@prod.ring.{0 0} int (prod.{0 0} int (prod.{0 0} int int))
  (@domain.to_ring.{0} int
     (@to_domain.{0} int
        (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
           (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
              int.decidable_linear_ordered_comm_ring))))
  (@prod.ring.{0 0} int (prod.{0 0} int int)
     (@domain.to_ring.{0} int
        (@to_domain.{0} int
           (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
              (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                 int.decidable_linear_ordered_comm_ring))))
     (@prod.ring.{0 0} int int
        (@domain.to_ring.{0} int
           (@to_domain.{0} int
              (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
                 (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                    int.decidable_linear_ordered_comm_ring))))
        (@domain.to_ring.{0} int
           (@to_domain.{0} int
              (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
                 (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                    int.decidable_linear_ordered_comm_ring))))))

and

theorem inst05 : ring.{0} (prod.{0 0} int (prod.{0 0} int (prod.{0 0} int (prod.{0 0} int int)))) :=
@prod.ring.{0 0} int (prod.{0 0} int (prod.{0 0} int (prod.{0 0} int int)))
  (@domain.to_ring.{0} int
     (@to_domain.{0} int
        (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
           (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
              int.decidable_linear_ordered_comm_ring))))
  (@prod.ring.{0 0} int (prod.{0 0} int (prod.{0 0} int int))
     (@domain.to_ring.{0} int
        (@to_domain.{0} int
           (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
              (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                 int.decidable_linear_ordered_comm_ring))))
     (@prod.ring.{0 0} int (prod.{0 0} int int)
        (@domain.to_ring.{0} int
           (@to_domain.{0} int
              (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
                 (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                    int.decidable_linear_ordered_comm_ring))))
        (@prod.ring.{0 0} int int
           (@domain.to_ring.{0} int
              (@to_domain.{0} int
                 (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
                    (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                       int.decidable_linear_ordered_comm_ring))))
           (@domain.to_ring.{0} int
              (@to_domain.{0} int
                 (@linear_ordered_comm_ring.to_linear_ordered_ring.{0} int
                    (@decidable_linear_ordered_comm_ring.to_linear_ordered_comm_ring.{0} int
                       int.decidable_linear_ordered_comm_ring)))))))

Johan Commelin (May 17 2019 at 03:32):

It is slightly repetetive, but it's clearly linear (-;

Mario Carneiro (May 17 2019 at 03:59):

import algebra.pi_instances
set_option pp.all true

lemma inst01 : ring ℤ := by apply_instance
lemma inst02 : ring (ℤ × ℤ) := by apply_instance
lemma inst03 : ring (ℤ × ℤ × ℤ) := by apply_instance
lemma inst04 : ring (ℤ × ℤ × ℤ × ℤ) := by apply_instance
lemma inst05 : ring (ℤ × ℤ × ℤ × ℤ × ℤ) := by apply_instance

meta def count_int : list char → ℕ
| [] := 0
| l@(_::l') :=
  if l.take 3 = "int".to_list then count_int (l.drop 3) + 1 else
  count_int l'

open tactic
meta def count_int_in_decl (n : name) : tactic ℕ :=
do d ← get_decl n, e ← pp d.value,
   return $ count_int e.to_string.to_list
#eval do
  l ← mmap count_int_in_decl
    [``inst01, ``inst02, ``inst03, ``inst04, ``inst05],
  trace l, -- [5, 12, 20, 29, 39]
  guard $ l = (λ n, (n^2 + 13*n) / 2 + 5) <$> list.range 5

Mario Carneiro (May 17 2019 at 03:59):

it's quadratic

Mario Carneiro (May 17 2019 at 04:01):

but the linear term is pretty large, so the quadratic behavior doesn't assert itself until n = 13 or so

Mario Carneiro (May 17 2019 at 04:03):

to see why, notice that (prod.{0 0} int (prod.{0 0} int (prod.{0 0} int int))) appears on line 1, (prod.{0 0} int (prod.{0 0} int int)) appears on line 5, (prod.{0 0} int int) appears on line 10 and so on in inst05

Kevin Buzzard (May 17 2019 at 06:35):

I experimented with this stuff a few weeks ago. IIRC when you go up to 6 factors it doesn't work!

Kevin Buzzard (May 17 2019 at 06:36):

Some search gives up before it finds it. Is my memory correct?

Mario Carneiro (May 17 2019 at 06:45):

yes, you have to raise the instance search depth to get past 5 factors

RalfJ (May 17 2019 at 18:01):

@Johan Commelin @Mario Carneiro given that this uses just the type as an index, quadratic is exactly what my blog post predicts :)

RalfJ (May 17 2019 at 18:02):

what I am curious about now is if lean can keep the inference and/or checking times linear by exploiting sharing

Mario Carneiro (May 21 2019 at 10:34):

Heh, this just appeared on the isabelle mailing list: https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2019-May/msg00089.html . Looks familiar...

Last updated: May 02 2025 at 03:31 UTC