Zulip Chat Archive

Stream: lean4

Topic: Decidable fight

Patrick Massot (Sep 29 2022 at 20:40):

I have a hard time understanding how decidable equality works in Lean 4. I need an instance for DecidableEq FVarId. I managed to get it (and it works) but this was quite a fight. What is the one-liner I'm missing?

Sebastian Ullrich (Sep 29 2022 at 20:41):

What is the context you need it in?

Patrick Massot (Sep 29 2022 at 20:44):

I need the list of free variables appearing in an expression. I have

import Lean

open Lean

/-- Get the list of free variables occuring in the expression (with repetitions). -/
def Lean.Expr.getFVars_aux : Expr → List FVarId
| .app fn arg => (getFVars_aux fn) ++ (getFVars_aux arg)
| .fvar Id => [Id]
| .lam _ bt body .. => getFVars_aux bt ++ getFVars_aux body
| .forallE _ bt body .. => getFVars_aux bt ++ getFVars_aux body
| .letE _ t v b .. => getFVars_aux t ++ getFVars_aux v ++ getFVars_aux b
| .mdata _ e => getFVars_aux e
| .proj _ _ e => getFVars_aux e
| _ => []

def List.dedup {α : Type _} [DecidableEq α] : List α → List α
| h::t => if h ∈ t then dedup t else h::(dedup t)
| [] => []

and I want to combine those two functions.

Kyle Miller (Sep 29 2022 at 20:47):

If you're not needing to prove anything, I've found it can be easier to derive docs4#BEq and use that typeclass instead of DecidableEq. (Edit: easier in that it had been more reliable than deriving DecidableEq.)

Sebastian Ullrich (Sep 29 2022 at 20:47):

We usually prefer BEq for programming, which already has an instance for FVarId. You would need to replace ∈ with List.contains in that case.

Sebastian Ullrich (Sep 29 2022 at 20:48):

As for the one-liner though:

deriving instance DecidableEq for Lean.FVarId

Patrick Massot (Sep 29 2022 at 20:50):

So you mean I should have

def List.dedup' {α : Type _} [BEq α] : List α → List α
| h::t => if t.contains h then dedup' t else h::(dedup' t)
| [] => []

Patrick Massot (Sep 29 2022 at 20:51):

Is this what we want in the standard library?

Patrick Massot (Sep 29 2022 at 20:51):

I mean having both List.dedup and List.dedup'?

Patrick Massot (Sep 29 2022 at 20:52):

Btw I noticed that BEq FVarId exist, I used it in my DecidableEq instance.

Mario Carneiro (Sep 29 2022 at 20:52):

I think that function already exists in Std

Patrick Massot (Sep 29 2022 at 20:53):

Which one?

Mario Carneiro (Sep 29 2022 at 20:53):

oh no it doesn't, List.eraseDup uses DecidableEq

Mario Carneiro (Sep 29 2022 at 20:54):

Well I guess List.pwFilter (. != .) would work

Patrick Massot (Sep 29 2022 at 20:56):

This is interesting but doesn't really answer the general question "should we duplicate every function needing some decidability in a DecidableEq version and a BEq version?"

Mario Carneiro (Sep 29 2022 at 20:57):

I think we should prefer BEq versions in Std with a LawfulBEq instance to bridge the gap to DecidableEq stuff

Mario Carneiro (Sep 29 2022 at 20:57):

that position may need refinement depending on our experience

Mario Carneiro (Sep 29 2022 at 20:58):

But also, in many cases we actually want more general than BEq or DecidableEq and just have a version that takes a predicate (in this case pwFilter is that, although the name is terrible)

Mario Carneiro (Sep 29 2022 at 20:59):

dedupBy makes more sense

Kyle Miller (Sep 29 2022 at 21:00):

@Patrick Massot We don't need duplicates of everything since there's docs4#instBEq

Mario Carneiro (Sep 29 2022 at 21:00):

Another thing: I don't expect these list functions to get nearly as much use as they did in lean 3. There is almost always a better data structure for the job

Mario Carneiro (Sep 29 2022 at 21:01):

in this case, can't you use a FVarIdSet?

Patrick Massot (Sep 29 2022 at 21:01):

I have no idea what is a FVarIdSet, but I can investigate.

Mario Carneiro (Sep 29 2022 at 21:02):

It's an RBSet of FVarIds

Mario Carneiro (Sep 29 2022 at 21:02):

i.e. it sorts them

Patrick Massot (Sep 29 2022 at 21:02):

Great, I only need to investigate what is a RBSet

Mario Carneiro (Sep 29 2022 at 21:02):

it's a binary search tree

Patrick Massot (Sep 29 2022 at 21:03):

It's crucial in my use case that my list is ordered. I don't want a fancy data structure to reorder them.

Mario Carneiro (Sep 29 2022 at 21:03):

you could push them to an array as you find them

Patrick Massot (Sep 29 2022 at 21:04):

But then you still to check they aren't in the array before pushing, right?

Mario Carneiro (Sep 29 2022 at 21:05):

i.e. use a rbset or hashset (FVarIdHashSet also exists) to answer membership queries and then push new things into both the array and the set

Kyle Miller (Sep 29 2022 at 21:05):

How big are these lists? Would the overhead of a hashset for deduplication pay for itself?

Mario Carneiro (Sep 29 2022 at 21:06):

If you anticipate that the set won't be too large in most cases then it's probably better to just use the array and linear search

Patrick Massot (Sep 29 2022 at 21:09):

I anticipate the array will have at most four or five elements in the end.

Mario Carneiro (Sep 29 2022 at 21:11):

yeah, an array would be best for that

Mario Carneiro (Sep 29 2022 at 21:12):

Oh, this function already exists

Mario Carneiro (Sep 29 2022 at 21:13):

def Lean.Expr.getFVars (e : Expr) : Array FVarId :=
  (Lean.collectFVars {} e).fvarIds

Patrick Massot (Sep 29 2022 at 21:16):

Does this version preserve the order in which the free variables occur in the expression?

Mario Carneiro (Sep 29 2022 at 21:17):

it does

Patrick Massot (Sep 29 2022 at 21:17):

Great!

Mario Carneiro (Sep 29 2022 at 21:18):

although if you use fancy elaborators your fvars may not end up in the expected order inside the Expr in the first place

Mario Carneiro (Sep 29 2022 at 21:18):

it's a bit suspect to be relying on this

Mario Carneiro (Sep 29 2022 at 21:18):

(e.g. have x := e; e' elaborates to (fun x => e') e)

Patrick Massot (Sep 29 2022 at 21:19):

I think I'm fine. But don't worry, I hope that very soon you'll be able to see the whole thing and rewrite it entirely.

Tomas Skrivan (Sep 30 2022 at 12:18):

Why in programming is BEq preferable to DecidableEq? I personally use DecidableEq almost everywhere.

Kyle Miller (Sep 30 2022 at 14:00):

@Tomas Skrivan My understanding is that, since DecidableEq implies BEq, you may as well use BEq in functions if they don't require equality proofs to operate. It seems related to a principle we use in mathlib that definitions are allowed to be nonsense on invalid inputs or when certain typeclasses cannot exist, but the theorems have these additional constraints as hypotheses.

This also gives Lean a more approachable learning curve since programmers from other languages are familiar with implementing equality functions, but writing DecidableEq instances tends to take more experience. It might even be easier writing DecidableEq instances in many cases by first writing a BEq instance and then a LawfulBEq instance, since it separates the concerns of writing an efficient BEq and then proving it's correct.

Mario Carneiro (Sep 30 2022 at 16:23):

Not just easier, I believe there is also reason to believe the compiler will generate better code using boolean logical operators rather than lots of if and match statements

Last updated: May 02 2025 at 03:31 UTC