Zulip Chat Archive

Stream: lean4

Topic: Reduction of terms with free variables is expensive

Yann Herklotz (Sep 10 2024 at 12:35):

There seems to be a significant difference between the reduction strategies when using rfl on expressions with and without free variables, but I am a bit confused why that is. I also can't come up with a completely standalone example of this (so I'm using a reduced example that uses Batteries.RBMap, because examples that are too small don't seem to exhibit the same issue.

Below the proof of equality, which reduces the term, takes 500ms, and becomes exponentially longer the more elements are added to the first tree. Here T is a fvar in the local context.

import Lean
import Batteries

set_option profiler true in -- takes 500ms (uncomment and it takes 7s)
example (T : Nat): (((Batteries.RBMap.ofList [(1, T), (2, T), (3, T) /-, (4, T), (5, T)-/] Ord.compare).mergeWith (λ _ a _ => a)
          (Batteries.RBMap.ofList [(1, T), (2, T)] Ord.compare)).find? 1) = some T := by rfl

What I find strange is that just making T a declaration, so that it becomes a constant in the expression, makes the proof of equality nearly instant.

import Lean
import Batteries

axiom T : Nat

set_option profiler.threshold 1 -- otherwise no time is shown
set_option profiler true in -- takes 8ms
example : (((Batteries.RBMap.ofList [(1, T), (2, T), (3, T), (4, T), (5, T)] Ord.compare).mergeWith (λ _ a _ => a)
          (Batteries.RBMap.ofList [(1, T), (2, T)] Ord.compare)).find? 1) = some T := by rfl

In our proofs we seem to be hitting this inefficiency quite often, because we are mainly referring to fvars in expressions. I would also not expect the reduction to be this slow, and I am not sure which exact step is taking so long. Does anyone have any insights on why this might be? Or if this is an expected behaviour of the reduction?

Kyle Miller (Sep 10 2024 at 15:29):

Maybe you're seeing a feature in the eq_refl tactic (which is what rfl for equalities), where if there are no free variables or metavariables, it uses kernel defeq instead of the usual elaborator defeq when validating that Eq.refl can prove the goal. Kernel defeq is faster.

Kyle Miller (Sep 10 2024 at 15:33):

I'm not sure why docs#Lean.MVarId.refl doesn't allow fvars at all, since docs#Lean.Kernel.isDefEq does accept a local context. Though, it would need to instantiate all mvars in the local context and check that no mvars are remaining.

Yann Herklotz (Sep 10 2024 at 15:35):

Ah yes, that probably explains the difference between the free variable vs constant version. But I still wasn't expecting there to be a difference of 7s vs 8ms!

And thank you for the pointers, that's interesting that it could maybe be passed to the kernel defeq check. Instantiating all mvars in the local context should be fine as well.

Kyle Miller (Sep 10 2024 at 15:42):

If you can do some metaprogramming, it wouldn't be too difficult to make a ker_rfl tactic.

One approach to make a local context would be to use mkForallFVars on the target with usedOnly := true to make a closed term, instantiate mvars at that point (and check that there are no mvars), then with an empty local context use forallTelescope. This is all a way to remove false dependencies on mvars while instantiating them.

Yann Herklotz (Sep 10 2024 at 15:49):

Oh great thanks, yeah that shouldn't be too difficult, I'll try that out. And thanks for the pointer on usedOnly, that should be useful!

Is there any point in trying to reduce the example further to see what takes 7s in the non kernel reduction? Or should one be avoiding the normal reduction in elaboration as much as possible?

Yann Herklotz (Oct 02 2024 at 13:36):

Just wanted to note that I've filed an issue about the eq_refl performance on this term with free variables, because I feel like the the exponential number of functions that are unfolded as the size of the list increases is unexpected lean#5596.

Eric Wieser (Nov 22 2024 at 18:20):

Kyle Miller said:

it uses kernel defeq instead of the usual elaborator defeq when validating that Eq.refl can prove the goal.

Is this still the case, or does it now use elaborator defeq?

Kyle Miller (Nov 22 2024 at 18:40):

Just checked, no it does not use the kernel anymore.

Kyle Miller (Nov 22 2024 at 18:41):

lean4#3772

Eric Wieser (Nov 22 2024 at 18:50):

Is there a version of rfl that does use the kernel?

Eric Wieser (Nov 22 2024 at 18:51):

I came up with

  run_tac do
    let g ← Lean.Elab.Tactic.popMainGoal
    let_expr Eq α a b := ← g.getType' | failure
    g.assign (← Lean.Meta.mkEqRefl a)

but it wouldn't surprise me if something is wrong here

Kyle Miller (Nov 22 2024 at 18:53):

If it's applicable, you could use decide's kernel mode (on nightly it's decide +kernel), which is a bit less efficient than just using Eq.refl

Kyle Miller (Nov 22 2024 at 18:53):

I think the only thing that's wrong with what you have is that it doesn't do any validation at tactic time that the tactic succeeds. The decide +kernel tactic will call out to the kernel at tactic time. It caches the result as a theorem too, so it doesn't need to be checked again.

Eric Wieser (Nov 22 2024 at 18:54):

That's decide! in non-nightly versions?

Kyle Miller (Nov 22 2024 at 18:54):

Yeah

Eric Wieser (Nov 22 2024 at 18:56):

Hmm, that doesn't exist in the stable version of Lean it seems

Eric Wieser (Nov 22 2024 at 18:57):

Only in the release candidates

Eric Wieser (Nov 22 2024 at 18:58):

Kyle Miller said:

I think the only thing that's wrong with what you have is that it doesn't do any validation at tactic time that the tactic succeeds

What would that validation look like?

Kyle Miller (Nov 22 2024 at 18:59):

You'd want to ask the kernel to check that a and b are defeq before assigning that metavariable, and if that fails then throw an error

Kyle Miller (Nov 22 2024 at 18:59):

(using the code that 3772 removed)

Kyle Miller (Nov 22 2024 at 19:02):

Or, you could look at how decide +kernel works. It uses the auxdecl interface to add a theorem to the environment.

Though, docs#Lean.Meta.mkAuxTheorem seems very convenient. It will auto-abstract the any fvars for you and create a theorem checked by the kernel, which you should be able to apply (though you'd have to be careful with transparency settings to make sure it doesn't trigger any expensive defeqs!)

Eric Wieser (Nov 22 2024 at 19:10):

I don't need anything sophisticated here, the purpose is to have a test that forces the kernel to crash

Eric Wieser (Nov 22 2024 at 19:11):

The test in question is

import Lean

/-- Check if the kernel would accept `rfl` -/
elab "rfl!?" : tactic => do
  let g ← Lean.Elab.Tactic.getMainGoal
  let_expr Eq α a b := ← g.getType' | failure
  guard <| ← Lean.ofExceptKernelException <| Lean.Kernel.isDefEq (← Lean.getEnv) {} a b

def npowRec : Nat → Nat → Nat
  | 0, _ => 1
  | n + 1, a => npowRec n a * a

theorem foo : npowRec 2000 (npowRec 2000 2001) = npowRec 2000 (npowRec 20001 2000) := by
  rfl!?

which indeed crashes / consumes all memory

Eric Wieser (Nov 22 2024 at 19:12):

(though arguably that's my fault for circumventing the elaborator)

Last updated: May 02 2025 at 03:31 UTC