Zulip Chat Archive
Stream: lean4
Topic: Decidable fight
Patrick Massot (Sep 29 2022 at 20:40):
I have a hard time understanding how decidable equality works in Lean 4. I need an instance for DecidableEq FVarId
. I managed to get it (and it works) but this was quite a fight. What is the one-liner I'm missing?
Sebastian Ullrich (Sep 29 2022 at 20:41):
What is the context you need it in?
Patrick Massot (Sep 29 2022 at 20:44):
I need the list of free variables appearing in an expression. I have
import Lean
open Lean
/-- Get the list of free variables occuring in the expression (with repetitions). -/
def Lean.Expr.getFVars_aux : Expr → List FVarId
| .app fn arg => (getFVars_aux fn) ++ (getFVars_aux arg)
| .fvar Id => [Id]
| .lam _ bt body .. => getFVars_aux bt ++ getFVars_aux body
| .forallE _ bt body .. => getFVars_aux bt ++ getFVars_aux body
| .letE _ t v b .. => getFVars_aux t ++ getFVars_aux v ++ getFVars_aux b
| .mdata _ e => getFVars_aux e
| .proj _ _ e => getFVars_aux e
| _ => []
def List.dedup {α : Type _} [DecidableEq α] : List α → List α
| h::t => if h ∈ t then dedup t else h::(dedup t)
| [] => []
and I want to combine those two functions.
Kyle Miller (Sep 29 2022 at 20:47):
If you're not needing to prove anything, I've found it can be easier to derive docs4#BEq and use that typeclass instead of DecidableEq
. (Edit: easier in that it had been more reliable than deriving DecidableEq
.)
Sebastian Ullrich (Sep 29 2022 at 20:47):
We usually prefer BEq
for programming, which already has an instance for FVarId
. You would need to replace ∈
with List.contains
in that case.
Sebastian Ullrich (Sep 29 2022 at 20:48):
As for the one-liner though:
deriving instance DecidableEq for Lean.FVarId
Patrick Massot (Sep 29 2022 at 20:50):
So you mean I should have
def List.dedup' {α : Type _} [BEq α] : List α → List α
| h::t => if t.contains h then dedup' t else h::(dedup' t)
| [] => []
Patrick Massot (Sep 29 2022 at 20:51):
Is this what we want in the standard library?
Patrick Massot (Sep 29 2022 at 20:51):
I mean having both List.dedup
and List.dedup'
?
Patrick Massot (Sep 29 2022 at 20:52):
Btw I noticed that BEq FVarId
exist, I used it in my DecidableEq
instance.
Mario Carneiro (Sep 29 2022 at 20:52):
I think that function already exists in Std
Patrick Massot (Sep 29 2022 at 20:53):
Which one?
Mario Carneiro (Sep 29 2022 at 20:53):
oh no it doesn't, List.eraseDup
uses DecidableEq
Mario Carneiro (Sep 29 2022 at 20:54):
Well I guess List.pwFilter (. != .)
would work
Patrick Massot (Sep 29 2022 at 20:56):
This is interesting but doesn't really answer the general question "should we duplicate every function needing some decidability in a DecidableEq version and a BEq version?"
Mario Carneiro (Sep 29 2022 at 20:57):
I think we should prefer BEq versions in Std with a LawfulBEq
instance to bridge the gap to DecidableEq stuff
Mario Carneiro (Sep 29 2022 at 20:57):
that position may need refinement depending on our experience
Mario Carneiro (Sep 29 2022 at 20:58):
But also, in many cases we actually want more general than BEq
or DecidableEq
and just have a version that takes a predicate (in this case pwFilter
is that, although the name is terrible)
Mario Carneiro (Sep 29 2022 at 20:59):
dedupBy
makes more sense
Kyle Miller (Sep 29 2022 at 21:00):
@Patrick Massot We don't need duplicates of everything since there's docs4#instBEq
Mario Carneiro (Sep 29 2022 at 21:00):
Another thing: I don't expect these list functions to get nearly as much use as they did in lean 3. There is almost always a better data structure for the job
Mario Carneiro (Sep 29 2022 at 21:01):
in this case, can't you use a FVarIdSet
?
Patrick Massot (Sep 29 2022 at 21:01):
I have no idea what is a FVarIdSet
, but I can investigate.
Mario Carneiro (Sep 29 2022 at 21:02):
It's an RBSet of FVarIds
Mario Carneiro (Sep 29 2022 at 21:02):
i.e. it sorts them
Patrick Massot (Sep 29 2022 at 21:02):
Great, I only need to investigate what is a RBSet
Mario Carneiro (Sep 29 2022 at 21:02):
it's a binary search tree
Patrick Massot (Sep 29 2022 at 21:03):
It's crucial in my use case that my list is ordered. I don't want a fancy data structure to reorder them.
Mario Carneiro (Sep 29 2022 at 21:03):
you could push them to an array as you find them
Patrick Massot (Sep 29 2022 at 21:04):
But then you still to check they aren't in the array before pushing, right?
Mario Carneiro (Sep 29 2022 at 21:05):
i.e. use a rbset or hashset (FVarIdHashSet
also exists) to answer membership queries and then push new things into both the array and the set
Kyle Miller (Sep 29 2022 at 21:05):
How big are these lists? Would the overhead of a hashset for deduplication pay for itself?
Mario Carneiro (Sep 29 2022 at 21:06):
If you anticipate that the set won't be too large in most cases then it's probably better to just use the array and linear search
Patrick Massot (Sep 29 2022 at 21:09):
I anticipate the array will have at most four or five elements in the end.
Mario Carneiro (Sep 29 2022 at 21:11):
yeah, an array would be best for that
Mario Carneiro (Sep 29 2022 at 21:12):
Oh, this function already exists
Mario Carneiro (Sep 29 2022 at 21:13):
def Lean.Expr.getFVars (e : Expr) : Array FVarId :=
(Lean.collectFVars {} e).fvarIds
Patrick Massot (Sep 29 2022 at 21:16):
Does this version preserve the order in which the free variables occur in the expression?
Mario Carneiro (Sep 29 2022 at 21:17):
it does
Patrick Massot (Sep 29 2022 at 21:17):
Great!
Mario Carneiro (Sep 29 2022 at 21:18):
although if you use fancy elaborators your fvars may not end up in the expected order inside the Expr in the first place
Mario Carneiro (Sep 29 2022 at 21:18):
it's a bit suspect to be relying on this
Mario Carneiro (Sep 29 2022 at 21:18):
(e.g. have x := e; e'
elaborates to (fun x => e') e
)
Patrick Massot (Sep 29 2022 at 21:19):
I think I'm fine. But don't worry, I hope that very soon you'll be able to see the whole thing and rewrite it entirely.
Tomas Skrivan (Sep 30 2022 at 12:18):
Why in programming is BEq
preferable to DecidableEq
? I personally use DecidableEq
almost everywhere.
Kyle Miller (Sep 30 2022 at 14:00):
@Tomas Skrivan My understanding is that, since DecidableEq
implies BEq
, you may as well use BEq
in functions if they don't require equality proofs to operate. It seems related to a principle we use in mathlib that definitions are allowed to be nonsense on invalid inputs or when certain typeclasses cannot exist, but the theorems have these additional constraints as hypotheses.
This also gives Lean a more approachable learning curve since programmers from other languages are familiar with implementing equality functions, but writing DecidableEq
instances tends to take more experience. It might even be easier writing DecidableEq
instances in many cases by first writing a BEq
instance and then a LawfulBEq
instance, since it separates the concerns of writing an efficient BEq
and then proving it's correct.
Mario Carneiro (Sep 30 2022 at 16:23):
Not just easier, I believe there is also reason to believe the compiler will generate better code using boolean logical operators rather than lots of if and match statements
Last updated: Dec 20 2023 at 11:08 UTC