Zulip Chat Archive

Stream: new members

Topic: Weird error message


Katherine Cordwell (Apr 30 2019 at 20:38):

I'm trying to use nat.prime.coprime_iff_not_dvd, which has signature nat.prime.coprime_iff_not_dvd : nat.prime ?M_1 → (nat.coprime ?M_1 ?M_2 ↔ ¬?M_1 ∣ ?M_2). But when I try to call it, I get an "maximum class-instance resolution depth has been reached" error, and I have no idea why.

Patrick Massot (Apr 30 2019 at 20:39):

We need more context

Patrick Massot (Apr 30 2019 at 20:40):

Could you try to get a minimized version of your code, still showing the error, but not much more?

Katherine Cordwell (Apr 30 2019 at 20:40):

I have p and q that are nats and I write
"have iffconc: nat.prime p → (nat.coprime p q ↔ ¬ p | q), from apply nat.prime.coprime_iff_not_dvd,"
which is the line that gives the error message.

Kenny Lau (Apr 30 2019 at 20:41):

maybe your p and q are integers

Katherine Cordwell (Apr 30 2019 at 20:42):

The context says "p q : ℕ", so I don't think so?

Patrick Massot (Apr 30 2019 at 20:42):

You may have the wrong vertical bar

Kenny Lau (Apr 30 2019 at 20:43):

right, make sure that it is \|

Kenny Lau (Apr 30 2019 at 20:43):

which results in a different bar than |

Katherine Cordwell (Apr 30 2019 at 20:44):

Oh great, that fixed it! Thanks so much

Patrick Massot (Apr 30 2019 at 20:44):

also, from apply doesn't mean anything

Patrick Massot (Apr 30 2019 at 20:44):

No problem. Welcome to Lean Zulip!

Rob Lewis (Apr 30 2019 at 20:49):

The vertical bar thing drives me nuts. I've been using this notation for years and I still mix the two up periodically. It never sticks in my brain.

Travis Hance (Apr 30 2019 at 21:12):

to tack onto this question, what exactly does the "maximum class-instance resolution depth has been reached" error message mean, in general? Every time I see it I know it just means I have a type error somewhere but the error never helps me find it and part of that is because I don't know what it means.

Patrick Massot (Apr 30 2019 at 21:13):

Do you know the basics of the type class system?

Kevin Buzzard (Apr 30 2019 at 21:13):

What it actually means is that Lean was trying to solve a typeclass problem, and it failed because it gave up

Kevin Buzzard (Apr 30 2019 at 21:13):

because things weren't looking good

Kevin Buzzard (Apr 30 2019 at 21:14):

What is misleading about the message is that sometimes Lean is trying to solve a completely random problem which you had no intention of making it solve; that's what was happening above.

Kevin Buzzard (Apr 30 2019 at 21:15):

The real error was some subtle syntax error (| instead of \|) and Lean got completely confused; the error message was ultimately irrelevant.

Kenny Lau (Apr 30 2019 at 21:18):

yeah Lean can work on its error messages

Kevin Buzzard (Apr 30 2019 at 21:19):

The | indicates the end of something, so Lean was faced with ¬ p with p : nat

Kenny Lau (Apr 30 2019 at 21:19):

but again, [insert saying about how it's easy to complain]

Simon Hudon (Apr 30 2019 at 21:19):

If you had to choose three error messages that you want to be more informative, which ones would it be?

Patrick Massot (Apr 30 2019 at 21:20):

Error message craft is highly non-trivial. In this case I would say that maximal depth error message would be much easier if it included an indication of what type class problem it failed to solve

Patrick Massot (Apr 30 2019 at 21:20):

That one would certainly be in my top 3

Kevin Buzzard (Apr 30 2019 at 21:21):

import data.nat.prime

example (p : ) : ¬ p := sorry
/-
maximum class-instance resolution depth has been reached yada yada
-/

Kevin Buzzard (Apr 30 2019 at 21:21):

-- import data.nat.prime

example (p : ) : ¬ p := sorry
/-
type mismatch at application
  ¬p
term
  p
has type

but is expected to have type
  Prop
-/

Simon Hudon (Apr 30 2019 at 21:21):

Excellent. How would you feel about a sort of introductory paragraph (or link to one) about type class search, instance problem and the main candidates tried?

Kevin Buzzard (Apr 30 2019 at 21:22):

So it's the import which causes it, on this occasion. Of course, the input is garbage

Kevin Buzzard (Apr 30 2019 at 21:22):

Whenever I see it I assume it's user error (i.e. my fault) rather than Lean actually failing to do type class inference

Simon Hudon (Apr 30 2019 at 21:22):

@Patrick Massot It might be worth opening an issue to start pitching specific improvements

Patrick Massot (Apr 30 2019 at 21:24):

Simon, I don't understand. Do you mean you want a link in the error message?

Patrick Massot (Apr 30 2019 at 21:24):

opening an issue where?

Patrick Massot (Apr 30 2019 at 21:24):

In the community fork?

Kevin Buzzard (Apr 30 2019 at 21:25):

@Travis Hance the answer to your question is that there's a thing called typeclass inference, which is invoked whenever you leave out an input to a function but the input is in [] square brackets. Lean invokes an algorithm called the type class inference algorithm, which searches through a database trying to find the term you didn't supply explicitly. If the algorithm is called accidentally then of course it can fail -- the unhelpful error message just means "you called typeclass inference and it didn't work"

Simon Hudon (Apr 30 2019 at 21:27):

@Patrick Massot yes community fork. And the link I'm talking about is like Rust errors: https://doc.rust-lang.org/edition-guide/rust-2018/the-compiler/improved-error-messages.html

Simon Hudon (Apr 30 2019 at 21:28):

When you have an error in Rust, it gives you a rule number and if you query the compiler about that rule it explains why it's there.

Rob Lewis (Apr 30 2019 at 21:28):

Printing the failing problem would be nice. But a lot of time time, the question isn't really which problem it failed to solve, it's why the search went deep before failing.

Kevin Buzzard (Apr 30 2019 at 21:28):

In this case Lean saw ¬ p and thought "hmm, p is a natural and it's supposed to be a proposition, that must mean that there is a coercion from naturals to propositions, I'll pass that over to the type class inference system because that handles coercions, and I'll leave it looking for a natural map from natural numbers to true/false statements". Lean fails to find that natural map, because nobody told Lean such a map, but unfortuately along the way it gets very keen on the idea of using canonically_ordered_comm_semiring in some way, and ends up in an infinite loop.

Rob Lewis (Apr 30 2019 at 21:28):

In Kevin's example, the culprit is with_top.canonically_ordered_comm_semiring looping for some reason.

Kevin Buzzard (Apr 30 2019 at 21:29):

The garbage out is because of the garbage input -- trying to coerce a natural to a proposition.

Kevin Buzzard (Apr 30 2019 at 21:30):

instance [canonically_ordered_comm_semiring α] [decidable_eq α] :
  canonically_ordered_comm_semiring (with_top α) := ...

in ordered_ring, which for some reason is being imported by data.nat.prime

Patrick Massot (Apr 30 2019 at 21:31):

Oh my God. When I read Kevin's message I thought it was a joke, because we suffered of lot with with_zero and ordered commutative monoids in the perfectoid project. But it seems with_top.canonically_ordered_comm_semiring is indeed involved here

Patrick Massot (Apr 30 2019 at 21:32):

Printing the failing problem would be nice. But a lot of time time, the question isn't really which problem it failed to solve, it's why the search went deep before failing.

I think what you ask is complicated. Displaying the failing problem would already be immensely useful

Patrick Massot (Apr 30 2019 at 21:33):

for instance, you could see immediately that Lean went looking for a coercion from nat to Prop in that case

Simon Hudon (Apr 30 2019 at 21:34):

I agree, that would be a good start

Kevin Buzzard (Apr 30 2019 at 21:34):

data.nat.basic imports algebra.ordered_ring which contains the offending instance.

Simon Hudon (Apr 30 2019 at 21:35):

Is that in mathlib? Is it universally problematic?

Kevin Buzzard (Apr 30 2019 at 21:35):

Yes this is all mathlib

Patrick Massot (Apr 30 2019 at 21:35):

It doesn't sound quite right that basics of natural numbers involves canonically ordered commutative semi-ring. That must be my mathematician bad habits...

Kevin Buzzard (Apr 30 2019 at 21:35):

it's OK, I don't think data.nat.basic is used very much

Kevin Buzzard (Apr 30 2019 at 21:36):

sounds pretty obscure to me

Kevin Buzzard (Apr 30 2019 at 21:36):

to be honest I am extremely surprised I didn't see this loop earlier

Kevin Buzzard (Apr 30 2019 at 21:39):

If you had to choose three error messages that you want to be more informative, which ones would it be?

deterministic timeout was my #1 choice last week when we were in perfectoid hell

Kevin Buzzard (Apr 30 2019 at 21:39):

It should have said "just change that Type* to Type u and I'll stop complaining"

Rob Lewis (Apr 30 2019 at 21:39):

I think what you ask is complicated. Displaying the failing problem would already be immensely useful

Oh, of course. "Why the search went deep" doesn't really have a well-defined answer. Printing the failing target is helpful when you accidentally invoke type class problems, and it's simple enough to do. But in that situation, you often don't want type class inference to time out, you want it to fail fast. It would be better to see a standard type checking error there than any kind of type class message.

Kevin Buzzard (Apr 30 2019 at 21:40):

Without the import you get the standard typechecking error, as you know

Rob Lewis (Apr 30 2019 at 21:42):

Right. That instance is worrying. It seems like there are too many metavariables for it to be applied here, so I'm not sure exactly why it's looping.

Rob Lewis (Apr 30 2019 at 21:43):

But it's late, I'm exhausted, not gonna investigate now.

Patrick Massot (Apr 30 2019 at 21:43):

Have a good night! (I should go to bed too)

Mario Carneiro (Apr 30 2019 at 23:09):

There is nothing wrong with that instance as written. I assume it is trying to solve a canonically_ordered_comm_semiring ? kind of problem?

Mario Carneiro (Apr 30 2019 at 23:53):

Here's an abbreviated trace:

[class_instances]  class-instance resolution trace
[class_instances] (0) ?x_3 : has_coe_to_sort ℕ := @coe_sort_trans ?x_5 ?x_6 ?x_7 ?x_8
[class_instances] (1) ?x_7 : has_coe_t_aux ℕ ?x_6 := @coe_base_aux ?x_9 ?x_10 ?x_11
[class_instances] (2) ?x_11 : has_coe ℕ ?x_10 := @nat.cast_coe ?x_14 ?x_15 ?x_16 ?x_17
[class_instances] (3) ?x_16 : has_one (with_zero ?x_18) := @with_zero.has_one ?x_24 ?x_25
[class_instances] (4) ?x_25 : has_one ?x_24 := @with_top.has_one ?x_26 ?x_27 ?x_28
[class_instances] (5) ?x_27 : canonically_ordered_comm_semiring ?x_26 := @with_top.canonically_ordered_comm_semiring ?x_29 ?x_30 ?x_31 ?x_32 ?x_33
[class_instances] (6) ?x_30 : canonically_ordered_comm_semiring ?x_29 := @with_top.canonically_ordered_comm_semiring ?x_34 ?x_35 ?x_36 ?x_37 ?x_38
...

The problem is a confusion of whether typeclass parameters are supposed to be input or output parameters. The canonically_ordered_comm_semiring instance assumes it will only ever be called on a concrete type, so the metavariable indicates the problem lies elsewhere. Similarly, with_zero.has_one assumes it is being called on a concrete type. So we must conclude that for nat.cast_coe to work correctly, it must only be called on concrete types. But that's not how the coe instances work, in particular coe_sort_trans:

instance coe_sort_trans {a : Sort u₁} {b : Sort u₂} [has_coe_t_aux a b] [has_coe_to_sort.{u₂ u₃} b] : has_coe_to_sort.{u₁ u₃} a

This is an instance which invents b from nothing, as a metavariable. It solves has_coe_to_sort a by trying to solve has_coe_t_aux a ?b (which quickly boils down to solving has_coe a ?b) and then if that is successful it looks for has_coe_to_sort b. This kind of forward searching doesn't work with the mathlib instances, since you can't enumerate all the types which have coercions from nat (there are infinitely many). In fact due to the DFS search order it never gets around to checking any of them and instead builds infinitely large instances and overflows.

Floris van Doorn (May 01 2019 at 00:40):

It would be nice if type class inference stops when its argument is a metavariable (note: I'm not saying it should stop when its argument contains a metavariable), so that nat.cast_coe would just fail, because the type class inference gives up once it has to find an instance of has_one ?x. If we ever rely on type class inference continuing in these cases, we should just provide more information.

Mario Carneiro (May 01 2019 at 00:54):

We could have a marker in_param that says "this argument must be a concrete type for this to trigger", or at least it shouldn't instantiate a metavariable in this slot. Better yet, it seems like this should be the default behavior for anything other than out_param

Mario Carneiro (May 01 2019 at 00:55):

In this case, that means that canonically_ordered_comm_semiring ?x_26 := @with_top.canonically_ordered_comm_semiring would not unify, because the metavar ?x_26 is treated as a constant

Mario Carneiro (May 01 2019 at 00:57):

Of course this sort of defeats the purpose of using a metavariable to begin with, but like I said, has_coe seems confused about whether its inputs are outputs or not

Mario Carneiro (May 01 2019 at 00:58):

coe_sort_trans seems like a very suspicious instance

Katherine Cordwell (May 01 2019 at 01:50):

Still on the thread of weird error messages, why might Lean be giving me an "unexpected end of comment block" error on a blank line?

Katherine Cordwell (May 01 2019 at 01:51):

Hmm never mind I solved it--an earlier comment was causing problems and the error message just manifested on an unrelated line.

Mario Carneiro (May 01 2019 at 01:54):

the location of this error could be improved (logically it should go on the comment open token, or the entire unclosed comment region)

Katherine Cordwell (May 01 2019 at 01:59):

Mostly what threw me off was that there were several (correct) theorems between the problematic comment and the error message.

Kevin Buzzard (May 01 2019 at 06:47):

Error messages are not the best in Lean 3. I remember realising how much better I had become at fixing up my own code when I had started getting the hang of working out what they were actually telling me. Another tip: don't try and debug the third error message -- the problem might be the unintended chaos caused by the first one.

Mario Carneiro (May 01 2019 at 06:58):

That's generally true in any programming language, I think. Compilers try their best to recover from an error but it's rarely successful


Last updated: Dec 20 2023 at 11:08 UTC