Zulip Chat Archive

Stream: lean4

Topic: isDefEq assigns type incorrect expression to metavariable

Yicheng Qian (Dec 20 2024 at 00:49):

This is a minimized example where isDefEq assigns type incorrect expression to metavariables:

structure foo where
  bar : Nat

def test (lhs rhs : Expr) : MetaM Unit := do
  let lhs := lhs
  let rhs := rhs
  let m ← Meta.mkFreshExprMVar lhs
  let defeq? ← Meta.isDefEq lhs (.app rhs m)
  let m ← instantiateMVars m
  IO.println m
  IO.println s!"Unifiable? {defeq?}"
  IO.println s!"Type correct? {(← Meta.isTypeCorrect m)}"

-- Unifying `foo` and `foo.bar ?m`
#eval test (.const ``foo []) (.const ``foo.bar [])

Note that the above behaviour does not show up in the following slightly modified example:

#eval test (.const ``Nat []) (.const ``foo.bar [])

Eric Wieser (Dec 20 2024 at 01:01):

This is user error; .app rhs m has type Nat, but lhs has type Type

Eric Wieser (Dec 20 2024 at 01:04):

You might be interested in Qq, which protects you from accidentally unifying things of the wrong types;

import Qq

open Lean Meta Elab

structure foo where
  bar : Nat

open Qq
def test {u v : Level} {α : Q(Type u)} {β : Q(Type v)} (lhs : Q($β)) (rhs : Q($α → $β)) : MetaM Unit := do
  let m ← mkFreshExprMVarQ α
  let defeq? ← isDefEqQ (u := u.succ) lhs q($rhs $m)
  let m ← instantiateMVars m
  IO.println m
  IO.println s!"Unifiable? {defeq? matches .defEq _}"
  IO.println s!"Type correct? {(← Meta.isTypeCorrect m)}"

-- Unifying `foo.bar ⟨1⟩` and `foo.bar ?m`
#eval test q(foo.bar ⟨1⟩) q(foo.bar)

Eric Wieser (Dec 20 2024 at 01:09):

(though actually there's a Qq bug here, the u := u.succ should be v := v.succ, but that leads to a crash)

Yicheng Qian (Dec 20 2024 at 01:11):

What is the expected behaviour of isDefEq? Is it guaranteed to be correct only when the types of the two expressions are syntactically equal, or when the types of the two expressions are unifiable?

Yicheng Qian (Dec 20 2024 at 01:16):

I would like to know more because I need to unify expressions whose types are only known at runtime (and potentially with metavariables in their types) in my code.

Kyle Miller (Dec 20 2024 at 01:49):

This looks like a bug to me. The isDefEq function is supposed to return true only if the expressions are definitionally equal, and the types are supposed to be definitionally equal in that case too.

Kyle Miller (Dec 20 2024 at 01:50):

It's probably gone unnoticed because most processes in Lean seem to do isDefEq of types of terms at some point as well.

Kyle Miller (Dec 20 2024 at 01:56):

Here's some of the trace:

[Meta.isDefEq] ✅️ foo =?= foo.bar ?m.3080 ▼
  [] ✅️ foo =?= ?m.3080.1 ▼
  [assign] ✅️ ?m.3080 := { bar := foo } ▼
  [beforeMkLambda] ?m.3080 [] := { bar := foo }
  [checkTypes] ✅️ (?m.3080 : foo) := ({ bar := foo } : foo) ▼
  [] ✅️ foo =?= foo

It looks like what's happening is that it's something to do with a rule where if you have a one-field structure, it will reduce x =?= y.1 to y =?= { bar := x }, but it seems to miss that x needs to have the correct type for this to apply.

If you add another field to your structure, you can observe that they are no longer unifiable.

Kyle Miller (Dec 20 2024 at 02:33):

Thanks, I've reported this issue at lean4#6420

Eric Wieser (Dec 20 2024 at 10:53):

Huh, I thought I'd read in the past that isDefeq was only allowed to be called on terms whose types themselves could unify

Kyle Miller (Dec 20 2024 at 19:34):

I had never heard that, but it's possible it's true. I know that isDefEq does check types when unifying metavariables though, so if there were such an assumption, it's not being taken advantage of to make the algorithm simpler.

Something that confuses me about this potential restriction is that this gives a lot of overhead to properly check that I never see done. Let's say we want to check that x and y are defeq. We infer their types and get X and Y. But we don't know that the types of X and Y are defeq yet, right? So we would need to infer their types to get Sort u and Sort v (after whnf), and then we can check that u and v are defeq, then X and Y are defeq, and then x and y are defeq. In the cases I can think of, at most meta code would check that X and Y are defeq, and it's to come up with better error messages, or maybe as a cheaper first pass at checking defeq.

Eric Wieser (Dec 20 2024 at 23:51):

What you describe is at least what Qq forces you to do...

Kyle Miller (Dec 21 2024 at 00:16):

Maybe for constructing terms, but there are plenty of metaprogramming tasks where you're not constructing terms but you want to test if terms are defeq.

For example, docs#Mathlib.Tactic.AtomM.addAtom is checking whether a term is defeq to a term that's in the AtomM state. It's not checking that types are defeq first, and even the Qq version docs#Mathlib.Tactic.AtomM.addAtomQ has a type signature that indicates it's not just for terms of a single type. That's just the first thing that came to mind, and searching for isDefEq across Lean and Mathlib, defeq testing without ensuring types are defeq first is frequent.

Given that you both reacted with :this: to @Eric Wieser's comment @Marcus Rossel and @Arthur Adjedj, I take it you've seen this documented somewhere? Would you mind digging it up?

Eric Wieser (Dec 21 2024 at 00:22):

For example, docs#Mathlib.Tactic.AtomM.addAtom is checking whether a term is defeq to a term that's in the AtomM state. It's not checking that types are defeq first,

For a linarith optimization that I was working on, I actually fix this in #19821 (draft)

Kyle Miller (Dec 21 2024 at 02:24):

You're saying it's a fix, but is it broken? It feels sort of like begging the question to call it a fix.

(What's the performance impact of doing an isLevelDefEq and two isDefEqQs rather than a single isDefEq? I could see it being net positive — doing isDefEq on types could exclude things quickly — but it's hard to say.)

Eric Wieser (Dec 21 2024 at 02:33):

Sorry, "fix" is the wrong word here. The context of the linarith change is one that wants to produces every term of the form x * y, but there is no need to do this unless x and y have the same type. So in this case it saves time by doing the isDefeq on the types as part of the atomization, rather than in the quadratic loop.

Marcus Rossel (Dec 21 2024 at 09:44):

@Kyle Miller I think I was just conflating the matter with this answer: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Unify.20level.20mvars/near/416858547.

Arthur Adjedj (Dec 21 2024 at 12:08):

Given that you [...] reacted with :this: to Eric Wieser's comment [...], I take it you've seen this documented somewhere? Would you mind digging it up?

I don't believe I have seen this be documented in Lean in particular, it is however a folklore rule of type theory, and why I reacted that way.
To give a more detailed answer, algorithmic checking for conversion comes in two flavors in most implementations: conversion is usually either typed or untyped. In Agda, the conversion algorithm takes another argument, that being the type of the terms being checked, whereas this is not the case in Lean and Coq (Lean is a bit of a weird mix though, since it does infer types when checking for eta-for-unit, more on that in a second).
The justification behind not carrying this extra information is twofolds, 1. for performance reasons, and 2. because the fact that terms being checked for conversion should always be well-typed and have the same type appears as a natural invariant when writing a type-checker.
The main upside from having typed conversion is that eta-expansion for types and structures becomes easier and more wide-spread. In agda for example, whenever agda checks for equality between two terms of a record type with eta, Agda will always eta-expand the two terms, making the algorithm in effect much simpler (in particular, it handles eta-for-unit for free).
I was going to point out that the aforementioned invariant (that two terms being compared should always have the same type) is visible in the kernel by the fact that is_def_eq_unit_like does not check that the two types inferred are defeq, but is seems like it does check that in practice... In the kernel in particular, this additional check could very certainly be removed, and provide additional performance in this corner case.

In any case, at the elaborator level, I would not be surprised if this invariant was not held in isDefEq, even more so when checking terms containing meta-variables, so this fix does not seem unreasonable to me.

Arthur Adjedj (Dec 21 2024 at 12:13):

Looking more into this question at the kernel level, I am quite surprised that the kernel would be this conservative when it comes to verifying that two terms always have the same type. If the invariant that isDefEq is always called on well-typed terms that have the same type up to defeq is held, then a fair few optimisations could be done there. In particular:

As mentioned before, is_def_eq_unit_like would not need to do this additional check
The algorithm would also need not verify that the domains of lambda-terms need to have the same type (this would hold from the fact that they both have "the same pi-type", and that pi-types are injective)
I feel confident that digging more into this, one could find additional checks that could be skipped, if the invariant was to hold in the rest of the kernel.

Arthur Adjedj (Dec 21 2024 at 12:54):

A quick scroll through all references of is_def_eq in the kernel indicates that these invariants almost hold. Indeed, is_def_eq is always only called on well-typed terms, and always on terms that have the same type up to level equality. In particular, when checking i.e a definition def foo : A := a, the kernel will:

verify that both Aand a have no fvar and are well-typed
that A is a type,
that the inferred type A' of a is defeq to A.

However, even though it is certain that A : Sort l and A' : Sort l' for two levels l l', there are at this point no guarantees that l and l' are defeq. If this additional check was added in add_definition, add_theorem, add_opaque and add_mutual, then the invariant would hold in the entire kernel (as well as all cpp libraries), and the two aforementioned checks could be removed completely.

Kyle Miller (Dec 21 2024 at 18:27):

Thanks a lot for the detailed explanation @Arthur Adjedj.

I managed to cook up an example that causes the elaborator to create a type-incorrect term:

structure S.{u} where
  t : Type u

/-
error on 'example': application type mismatch
  { t := Type }
argument has type
  Type 1
but function has type
  Type v → S
-/
example (α : Type v) : Type := by
  have m : (?m : S.{v}).t := ?n
  exact m
  exact Nat

(I still haven't found one that doesn't make use of a tactic script.)

This is taking advantage of the lean4#6420 issue, but it could be seen as taking advantage of any typechecker that fails to account for universe levels when unifying types.

Last updated: May 02 2025 at 03:31 UTC