Zulip Chat Archive
Stream: general
Topic: Best way to provide Union-Find in Lean?
Phil Nguyen (Dec 28 2023 at 02:12):
A while ago when preparing some lib for Advent of Code, I provided the Union-Find data structure as a state monad(-transformer), where I needed the state to allow path-compression.
What I found unsatisfactory was that there was no way to hide the path-compression "effect" that the client shouldn't have to worry about. The paper A Persistent Union-Find Data Structure suggests the following functional interface (translated to Lean by me):
class UnionFind (uf : Type → Type) where
create : [BEq α] → α → uf α
find : [BEq α] → uf α → α → α
union : [BEq α] → uf α → α → α → uf α
The problem with this interface is that it only has an efficient implementation in an impure language, where we can sneak path-compression imperatively behind the scene.
Do we have any guideline for the cleanest way to provide the Union-Find data structure in a pure FP language, specifically Lean?
Thanks!!
Mario Carneiro (Dec 28 2023 at 02:32):
Note that mathlib has a union-find implementation at docs#UnionFind, and lean4lean has an improved/simplified version at https://github.com/digama0/lean4lean/blob/master/Lean4Lean/UnionFind.lean
Mario Carneiro (Dec 28 2023 at 02:33):
This implementation does not attempt to address the problem you are indicating however, the functions that mutate return a mutated version of the original as is normal for other such functions in lean (and it has efficient in-place modification if you use the value linearly)
Kyle Miller (Dec 28 2023 at 02:38):
It looks like both of those do the obvious FP solution, which is to make find
return uf α × α
(using the analogous types in this thread). That's the explicit version of hiding the union-find state in a state monad.
I wonder -- what if union-find type were a quotient type of the UnionFind type by the relation where two terms are equal if they have the same find : uf α → α → α
. Could you then cheat by having find
be replaced by an implementation that mutates the term under the quotient to do path compression?
Mario Carneiro (Dec 28 2023 at 02:44):
That's mutable quotients, we investigated that a while ago and IIRC there were issues in getting the compiler to do the right thing. Currently the only hidden mutation we have is the Thunk A
type
Yuyang Zhao (Dec 29 2023 at 14:09):
I've seen in some papers about persistent data structures that replace a node that may be referenced many times with an equivalent node to keep the amortization analysis valid. It would be great if this could be implemented in Lean.
François G. Dorais (Dec 29 2023 at 19:58):
@Mario Carneiro Any chance for the L4L implementation to make it into Std?
Mario Carneiro (Dec 29 2023 at 19:59):
Probably not (it would go to core if anything), but union-find might
Mario Carneiro (Dec 29 2023 at 19:59):
or maybe that's what you mean
François G. Dorais (Dec 29 2023 at 19:59):
I was asking about union-find.
Mario Carneiro (Dec 29 2023 at 20:00):
TBH it's looking a bit out of place in mathlib
Mario Carneiro (Dec 29 2023 at 20:00):
I'd be in support of moving it to std, I have some correctness proofs for it too
François G. Dorais (Dec 29 2023 at 20:04):
Cool! I can do the initial move if that's okay with you.
François G. Dorais (Dec 29 2023 at 20:04):
Should I use the L4L version or the Mathlib version?
Mario Carneiro (Dec 29 2023 at 20:05):
IMO the L4L version, that one is more recent and had various modifications to support the proofs
François G. Dorais (Dec 29 2023 at 20:05):
Sounds good. I'll get on it.
Mario Carneiro (Dec 29 2023 at 20:06):
the proofs are in Lean4Lean.Verify.UnionFind
François G. Dorais (Dec 29 2023 at 21:12):
@Mario Carneiro I did a raw import that compiles (but doesn't pass the linter yet) at std4#489. I'll get back to it tomorrow and I hope to have a PR ready very soon. If you want to add some edits in the mean time that's totally fine by me.
François G. Dorais (Dec 30 2023 at 11:59):
All done! std4#489 is ready for review.
Yuyang Zhao (May 21 2024 at 16:59):
Phil Nguyen said:
What I found unsatisfactory was that there was no way to hide the path-compression "effect" that the client shouldn't have to worry about.
I tried to implement it. See https://fr-vdash-bot.github.io/Algorithm/Algorithm/Data/UnionFind.html#UnionFind
Mario Carneiro (May 21 2024 at 18:52):
UnionFind has been merged into Batteries btw, see docs#Batteries.UnionFind
Yuyang Zhao (May 23 2024 at 00:35):
It would be great if we could have a hidden mutation version in Batteries too...
Mario Carneiro (May 23 2024 at 00:38):
IIRC mutable quotients were proposed (by me) and an implementation attempted by Simon Hudon while visiting Leo a while ago, and they ran into some issues getting sound behavior from the compiler, although I forget the exact issues with it
Last updated: May 02 2025 at 03:31 UTC