Zulip Chat Archive

Stream: general

Topic: odd repeated type class search


Rob Lewis (Aug 11 2021 at 21:09):

This showed up in a branch following some work by @Frédéric Dupuis . I'm not sure how to minimize it, since it doesn't happen on master and the changes don't seem relevant, and I have no guess about what's going on. See branch#semilinear_new2_inv_pair_tc_issue, algebra/category/Module/basic.lean.

In line 84, the context is ring R. The lemma tensor_product.mk_compr₂_inj assumes comm_semiring R. Applying this lemma takes 6 seconds to fail. The instance search trace seems to show that it does the same search for comm_semiring R many thousands of times.

Any guesses what's happening?

Frédéric Dupuis (Aug 11 2021 at 21:14):

To add some context (because the changes must be relevant somehow!), the changes in this branch involve redefining linear_map to be semilinear maps, along the lines discussed in this thread.

Rob Lewis (Aug 11 2021 at 21:18):

It's relevant in that the lemma in question says something about linear maps. But I'm not sure how any change like that causes apply to shoot off the same type class search 5000+ times...

Eric Wieser (Aug 11 2021 at 21:33):

For reference: https://github.com/leanprover-community/mathlib/blob/00cc062acb1491efb5a12cb65fcbd69a86751c40/src/algebra/category/Module/basic.lean#L85

Rob Lewis (Aug 12 2021 at 15:03):

Here it is somewhat minimized. Try this on that same branch.

import linear_algebra.basic

lemma bad2 {S} [comm_semiring S] {M} [add_comm_monoid M] [module S M]
  (f : M →ₗ[S] M) : f = f := sorry

example {R} [semiring R] {M} [add_comm_monoid M] [module R M]
  (f : M →ₗ[R] M) : f = f :=
by try_for 10000 {apply bad2 }

Same behavior in term mode, with exact, with refine.

Rob Lewis (Aug 12 2021 at 15:34):

Here we go: reproducible on master, so independent of @Frédéric Dupuis 's changes.

import data.equiv.ring

lemma bad5 (S) [comm_semiring S] : ring_equiv.refl S = ring_equiv.refl S := sorry

set_option trace.class_instances true
example (R) [semiring R] : ring_equiv.refl R = ring_equiv.refl R := by apply bad5

This performs 4306 searches for comm_semiring R. If you use by apply bad5 R instead, it fails instantly.

Rob Lewis (Aug 12 2021 at 15:40):

A little less extreme (only 154 repeated searches):

import algebra.ring
set_option trace.class_instances true

def foo (S) [has_add S] := 2

lemma bad6 (S) [comm_semiring S] : foo S = foo S := sorry
example (R) [semiring R] : foo R = foo R := by apply bad6

Frédéric Dupuis (Aug 12 2021 at 17:31):

Rob Lewis said:

If you use by apply bad5 R instead, it fails instantly.

It seems like it this only works with apply; exact, refine and term mode are all still very slow even when specifying R explicitly.

Rob Lewis (Aug 12 2021 at 18:30):

I'm not really sure where to go debugging this. My guess is it's buried somewhere in the elaborator. Does anyone (@Gabriel Ebner @Mario Carneiro ?) have an idea where to look next?

Mario Carneiro (Aug 12 2021 at 18:32):

I don't know much about typeclass caching, this probably requires looking at it in the debugger

Rob Lewis (Aug 12 2021 at 18:33):

It doesn't necessarily scream caching issue. Even without caching I think it shouldn't start this many searches.

Gabriel Ebner (Aug 12 2021 at 18:42):

I believe this is the culprit:

// HACK(gabriel): do not reuse type-class cache for nested resolution problems
// For one example that easily breaks, see the default field values in `init/control/lawful.lean`
context_cache tmp_cache(*m_cache, true);
type_context_old tmp_ctx(env(), m_mctx, mdecl->get_context(), tmp_cache, m_transparency_mode);
inst = tmp_ctx.mk_class_instance(m_type);
if (inst) m_mctx = tmp_ctx.mctx();

Gabriel Ebner (Aug 12 2021 at 18:45):

If you enable set_option pp.all true set_option trace.type_context.is_def_eq_detail true, then you see which unification problems cause these type class searches.

Rob Lewis (Aug 12 2021 at 18:52):

Gabriel Ebner said:

I believe this is the culprit:

// HACK(gabriel): do not reuse type-class cache for nested resolution problems
// For one example that easily breaks, see the default field values in `init/control/lawful.lean`
context_cache tmp_cache(*m_cache, true);
type_context_old tmp_ctx(env(), m_mctx, mdecl->get_context(), tmp_cache, m_transparency_mode);
inst = tmp_ctx.mk_class_instance(m_type);
if (inst) m_mctx = tmp_ctx.mctx();

Aha. I'm not entirely sure what's going on here but I understand the comment

Rob Lewis (Aug 13 2021 at 15:50):

@Gabriel Ebner just to check, I'm guessing there's no easy change to use the cache in this nested case, right? I'm still trying to sort out why the changes in Frederic's branch trigger this. But I'm worried there's not a good workaround

Gabriel Ebner (Aug 13 2021 at 16:04):

If it was easy, I would have done it already. But it breaks some code in nontrivial ways (see comment).

Eric Wieser (Aug 13 2021 at 16:24):

Just for context, the quoted lines are here and these are presumably the default field values causing issues

Eric Wieser (Aug 13 2021 at 16:25):

If the problem only appears in default field values, can we make the hack hackier by only having it enabled during structure construction?

Gabriel Ebner (Aug 13 2021 at 16:29):

I don't really remember what was going on precisely. If it's just the default values in that file, then it could also be an option to turn them into separate lemmas.

Rob Lewis (Aug 13 2021 at 16:47):

Eric Wieser said:

If the problem only appears in default field values, can we make the hack hackier by only having it enabled during structure construction?

The problem on Frederic's branch appears in a structure construction (buried a few layers deep: in an instance declaration, tidy called ext which tried to apply a lemma which triggered one of these nested resolution problems).

Frédéric Dupuis (Aug 13 2021 at 17:05):

Rob Lewis said:

The problem on Frederic's branch appears in a structure construction (buried a few layers deep: in an instance declaration, tidy called ext which tried to apply a lemma which triggered one of these nested resolution problems).

In fact the problem seems to appear whenever ext is invoked on a linear map. For example in representation_theory/maschke.lean, line 171.

Eric Wieser (Aug 31 2021 at 13:58):

#8868 pins the blame on the docs#tensor_product.mk_compr₂_inj ext lemma - do we know why this lemma makes things so slow?

Eric Wieser (Aug 31 2021 at 13:58):

Can we avoid typeclass searches by replacing [] arguments with {}, and will that have unwanted consequences?


Last updated: Dec 20 2023 at 11:08 UTC