Zulip Chat Archive

Stream: lean4

Topic: Transparency in CanonM


Heather Macbeth (Nov 14 2024 at 03:51):

I have been experimenting with the "canonicalizer" recently, and I am confused by its transparency configuration. What exactly does this configuration do? I had initially assumed that if e and e' were defeq at transparency red then they would canonicalize to the same thing when CanonM is set up with transparency red. But this seems not to be the case -- here are two expressions which are defeq at default transparency but do not canonicalize to the same thing at default transparency.

Heather Macbeth (Nov 14 2024 at 03:51):

import Lean.Elab.Tactic.Basic
import Lean.Meta.Canonicalizer

open Lean Meta Elab

def canonEqSides (g : MVarId) : CanonM Unit := do
  let some (_, lhs, rhs) := (← g.getType).eq? | unreachable!
  let lhs' ← canon lhs
  trace[debug] "canonicalized {lhs} to {lhs'}"
  let rhs' ← canon rhs
  trace[debug] "canonicalized {rhs} to {rhs'}"

elab "foo" : tactic => Tactic.liftMetaMAtMain <| fun g ↦ (canonEqSides g).run' .default

set_option trace.debug true

example (x : ℕ) : x = id x := by
  foo

/-
[debug] canonicalized x to x

[debug] canonicalized id x to id x
-/

Heather Macbeth (Nov 14 2024 at 03:54):

Relatedly, omega apparently doesn't let the user specify a transparency level (the way that Mathlib's ring, linarith etc. do). Is that because cataloguing atoms using the canonicalizer (rather than isDefEq) makes this impossible? (cc @Kim Morrison)

Daniel Weber (Nov 14 2024 at 04:12):

docs#id

Daniel Weber (Nov 14 2024 at 04:16):

Looking at the code of canon, the transparency setting is only used when checking whether two things with the same hash are actually defeq - it isn't part of the hashing itself

Daniel Weber (Nov 14 2024 at 04:19):

You could reduce before canonicalizing, I guess. A more sophisticated algorithm could lazily compute the hash, repeatedly whnf-ing, but that doesn't seem to be implemented

Heather Macbeth (Nov 14 2024 at 04:29):

I don't follow, sorry -- what does reduce have to do with transparency? (It's my first time looking at that function.)

Daniel Weber (Nov 14 2024 at 04:31):

If you call docs#Lean.Meta.reduce before running canon that would solve the problem, although it'll be very inefficient
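
For concreteness, a rough (untested) sketch of that, reusing the canonEqSides example from above (canonEqSidesReduced is just a made-up name):

def canonEqSidesReduced (g : MVarId) : CanonM Unit := do
  let some (_, lhs, rhs) := (← g.getType).eq? | unreachable!
  -- reduce each side first, so that e.g. `id x` should become `x`
  -- before the canonicalizer hashes it
  let lhs' ← canon (← reduce lhs)
  trace[debug] "canonicalized {lhs} to {lhs'}"
  let rhs' ← canon (← reduce rhs)
  trace[debug] "canonicalized {rhs} to {rhs'}"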

Daniel Weber (Nov 14 2024 at 04:33):

But generally this seems out-of-scope for CanonM — the docstring states

This module aims to efficiently identify terms that are structurally different, but definitionally equal, and structurally equal when we disregard implicit arguments, like @id (Id Nat) x and @id Nat x.

Heather Macbeth (Nov 14 2024 at 04:40):

I'm a bit surprised to hear this would be out of scope, because I believe CanonM is now recommended for atom-identification in algebraic tactics, and traditionally (at least in mathlib) such tactics have supported a configurable reducibility level.

Daniel Weber (Nov 14 2024 at 04:42):

I could try to implement my idea with whnf if that would be helpful; I think it should be able to support this

Heather Macbeth (Nov 14 2024 at 04:54):

Well, who knows, maybe the current advice is that algebraic tactics should not support multiple reducibility levels, and that instead users should unfold/zeta-reduce/etc in a separate step before the algebraic automation ...

Kim Morrison (Nov 14 2024 at 05:42):

I think that's the case, Heather. There's just a trade-off here between "I want tactics that just work, doing extra work as required" and "I want tactics with reliable performance". You get to pick a failure mode... :-) That said, I think it's pretty reasonable if Lean and Mathlib make differing choices on this question.

Heather Macbeth (Nov 14 2024 at 05:44):

OK, thanks! But maybe it would not be unreasonable to adopt this paradigm in Mathlib, too.

Heather Macbeth (Nov 14 2024 at 05:47):

I was curious whether the no-transparency-configurability enforced by CanonM was actively recommended now for algebraic automation ... or whether it was just that you hadn't thought about it yet. Sounds like it's the former?

Heather Macbeth (Nov 14 2024 at 05:51):

(I've been experimenting with switching over AtomM to use CanonM, which is where I noticed this.)

Kim Morrison (Nov 14 2024 at 05:51):

Yeah, I think so. I think we should avoid hiding arbitrarily difficult defeq problems inside the running of other tactics.

Kim Morrison (Nov 14 2024 at 05:51):

When we did this switch for omega there were one or two regressions. Fewer than I expected, however.

Heather Macbeth (Nov 14 2024 at 05:55):

My mostly-fixed branch#HM-canonM touches 48 files so far, which is honestly pretty small in proportion to the number of files which use ring, abel, linarith, module, ... somewhere.

Jannis Limperg (Nov 14 2024 at 11:01):

Like Heather, I'm concerned about the user experience here. Lean tactics already use a range of notions of equality (usually not properly documented), including at least

  • defeq at default transparency
  • defeq at reducible transparency
  • defeq at default transparency but same DiscrTree keys (which are computed at reducible transparency but omit some parts of an expression)

This approach adds one (or two) more:

  • defeq at transparency T and structurally equal when ignoring implicit (but not instance) arguments

It seems inevitable to me that this will degrade the user experience for new and intermediate-level users. So unless there's an actual performance issue, I would prefer that tactics continue to standardise on reducible transparency.

The most performant way to achieve this would now likely be:

  • reduce (explicitOnly := true) (skipTypes := false) (skipProofs := true)
  • canon

The skipProofs := true is based on the observation that the canonicaliser seems to ignore proofs as well, but I'm not sure about that.

Kim Morrison (Nov 14 2024 at 22:57):

There was very much a performance issue in omega.

Jannis Limperg (Nov 14 2024 at 23:19):

Yeah, but have you tried the middle ground I sketched? If this turns out to have reasonable performance, I think it would be worth sacrificing a bit of performance for sensible semantics.

Kim Morrison (Nov 14 2024 at 23:30):

Sorry, could you say more explicitly what your suggestion is? I don't understand yet.

Jannis Limperg (Nov 14 2024 at 23:40):

Jannis Limperg said:

  • reduce (explicitOnly := true) (skipTypes := false) (skipProofs := true)
  • canon

This one. So reduce in explicit arguments (and types?) with reducible transparency before canonicalising. Afaict, this should ensure that expressions are considered equal iff they are reducibly defeq.
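
Spelled out, an untested sketch of what I have in mind might look roughly like this (reduceThenCanon is just a made-up name):

import Lean.Meta.Reduce
import Lean.Meta.Canonicalizer

open Lean Meta

def reduceThenCanon (e : Expr) : CanonM Expr := do
  -- reduce explicit arguments (and also types, per the options above)
  -- at reducible transparency, then canonicalise the reduced expression
  let e' ← withReducible <|
    reduce e (explicitOnly := true) (skipTypes := false) (skipProofs := true)
  canon e'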

Kim Morrison (Nov 14 2024 at 23:59):

Oh! I didn't understand that those were intended sequentially.

Kim Morrison (Nov 15 2024 at 00:01):

Unfortunately I don't have a great benchmark for the problem cases where we introduced CanonM for omega.

Kim Morrison (Nov 15 2024 at 00:01):

Maybe if we can produce such a benchmark again we could see how expensive adding a reduce step would be.

Jannis Limperg (Nov 15 2024 at 00:04):

Great, thanks! I want to start using this strategy more in Aesop as well, so I'm very curious about the performance implications.

Heather Macbeth (Nov 15 2024 at 00:04):

Does reduce have variants for different transparencies? If so (going back to my initial question in this thread) that approach would also allow omega to support a transparency config.

Jannis Limperg (Nov 15 2024 at 00:16):

reduce respects transparency afaik, so withTransparency md <| reduce e.
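
So a transparency-configurable canonicalisation step might look roughly like this (an untested sketch, canonAt is a made-up name, reusing the imports and opens from the earlier examples):

def canonAt (md : TransparencyMode) (e : Expr) : CanonM Expr := do
  -- reduce at the requested transparency, then canonicalise the result
  canon (← withTransparency md <| reduce e)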

Heather Macbeth (Nov 15 2024 at 22:52):

@Jannis Limperg, I was thinking about this a bit more today. Should we really expect that reducing a and b (and then doing a quick comparison) is faster than running isDefEq a b? Won't it often happen that a and b are complicated and take a while to reduce, but nonetheless they are very similar structurally (or even the same) and isDefEq can succeed fast?

Heather Macbeth (Nov 15 2024 at 22:56):

I don't know what algorithm isDefEq implements, but I would hope that it can identify id a and a without reducing them both to some normal form ... and if a is complicated, that should be a lot faster, right?

Eric Wieser (Nov 15 2024 at 23:09):

One thing to watch out for when using AtomM / isDefEq is whether you actually want it to assign metavariables, which can probably make it much more expensive

Heather Macbeth (Nov 15 2024 at 23:11):

I have been noticing lately that in a few places people are relying on the fact that AtomM-based tactics can assign metavariables (e.g. @David Ang does sometimes in the elliptic curves files) ... I think it would be better not to do this!

Kim Morrison (Nov 18 2024 at 00:19):

Heather Macbeth said:

I have been noticing lately that in a few places people are relying on the fact that AtomM-based tactics can assign metavariables (e.g. David Ang does sometimes in the elliptic curves files) ... I think it would be better not to do this!

Should we add a withNewMCtxDepth for now?

Heather Macbeth (Nov 18 2024 at 04:54):

I'm not familiar with that function -- this is a fix to prevent tactics from creating goals whose type contains metavariables? How does it work?

Heather Macbeth (Nov 18 2024 at 04:55):

Here's an example, anyway, appearing in the library as docs#WeierstrassCurve.addY_sub_negY_addY:

import Mathlib.AlgebraicGeometry.EllipticCurve.Affine

open WeierstrassCurve Affine

variable {F : Type*} [Field F] {W : Affine F}
variable {x₁ x₂ : F} (y₁ y₂ : F)

example (hx : x₁ ≠ x₂) :
    letI x₃ := W.addX x₁ x₂ (W.slope x₁ x₂ y₁ y₂)
    letI y₃ := W.addY x₁ x₂ y₁ (W.slope x₁ x₂ y₁ y₂)
    y₃ - W.negY x₃ y₃ =
      ((y₂ - W.negY x₂ y₂) * (x₁ - x₃) - (y₁ - W.negY x₁ y₁) * (x₂ - x₃)) / (x₂ - x₁) := by
  simp_rw [addY, negY, eq_div_iff (sub_ne_zero.mpr hx.symm)]
  linear_combination 2 * cyclic_sum_Y_mul_X_sub_X y₁ y₂ hx

Heather Macbeth (Nov 18 2024 at 04:56):

If you separate the linear_combination into two steps,

linear_combination (norm := skip) 2 * cyclic_sum_Y_mul_X_sub_X y₁ y₂ hx
ring

you'll see that the ring is solving a goal which contains metavariables.

Daniel Weber (Nov 18 2024 at 04:57):

Heather Macbeth said:

I'm not familiar with that function -- this is a fix to prevent tactics from creating goals whose type contains metavariables? How does it work?

It increases the context depth, and isDefEq and similar don't assign metavariables with a smaller depth than the context's. There are a few sentences about that in https://leanprover-community.github.io/lean4-metaprogramming-book/main/04_metam.html#metavariable-depth
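
A rough, untested illustration of the mechanism (assuming import Lean, and set_option trace.debug true as in the example above):

open Lean Meta in
example : MetaM Unit := do
  -- `m` is created at the current (outer) metavariable-context depth
  let m ← mkFreshExprMVar (some (mkConst ``Nat))
  -- inside withNewMCtxDepth, isDefEq may not assign `m`, so this is expected to be false
  let inner ← withNewMCtxDepth <| isDefEq m (mkNatLit 0)
  -- at the outer depth the assignment `m := 0` is allowed, so this is expected to be true
  let outer ← isDefEq m (mkNatLit 0)
  trace[debug] "inner: {toString inner}, outer: {toString outer}"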

Heather Macbeth (Nov 18 2024 at 04:57):

It is correctly (for this application) identifying atoms which could be unified, and unifying them.

Heather Macbeth (Nov 18 2024 at 04:58):

But I think it's dangerous to be relying on this -- and in fact this is a sign that docs#WeierstrassCurve.cyclic_sum_Y_mul_X_sub_X should change its W argument from implicit to explicit.

Kim Morrison (Nov 18 2024 at 06:11):

Yes, I think adding a withNewMCtxDepth would prevent it from doing something which is correct, but which we do not want it to do!

Jannis Limperg (Nov 18 2024 at 17:45):

Heather Macbeth said:

Jannis Limperg, I was thinking about this a bit more today. Should we really expect that reducing a and b (and then doing a quick comparison) is faster than running isDefEq a b? Won't it often happen that a and b are complicated and take a while to reduce, but nonetheless they are very similar structurally (or even the same) and isDefEq can succeed fast?

For one-off comparisons, isDefEq is indeed much faster than full reduction + equality check. However, I think it's possible that when an expression participates in many comparisons, the eager reduction (at reducible transparency, which is hopefully not too costly) pays off. Eager reduction might also allow us to use more fancy techniques from automated reasoning, e.g. faster indexing schemes. I don't have any data on this yet, but I think it's an interesting research direction.

Heather Macbeth (Nov 18 2024 at 17:52):

Interesting, thanks!

Kim Morrison (Nov 19 2024 at 04:55):

(but noting this is a research direction: I don't think we want to be running reduce before canon in any Mathlib tactics at this point)


Last updated: May 02 2025 at 03:31 UTC