Verification of the ordnode α
datatype #
THIS FILE IS SYNCHRONIZED WITH MATHLIB4. Any changes to this file require a corresponding PR to mathlib4.
This file proves the correctness of the operations in data.ordmap.ordnode
.
The public facing version is the type ordset α
, which is a wrapper around
ordnode α
which includes the correctness invariant of the type, and it exposes
parallel operations like insert
as functions on ordset
that do the same
thing but bundle the correctness proofs. The advantage is that it is possible
to, for example, prove that the result of find
on insert
will actually find
the element, while ordnode
cannot guarantee this if the input tree did not
satisfy the type invariants.
Main definitions #
ordset α
: A well formed set of values of typeα
Implementation notes #
The majority of this file is actually in the ordnode
namespace, because we first
have to prove the correctness of all the operations (and defining what correctness
means here is actually somewhat subtle). So all the actual ordset
operations are
at the very end, once we have all the theorems.
An ordnode α
is an inductive type which describes a tree which stores the size
at
internal nodes. The correctness invariant of an ordnode α
is:
ordnode.sized t
: All internalsize
fields must match the actual measured size of the tree. (This is not hard to satisfy.)ordnode.balanced t
: Unless the tree has the form()
or((a) b)
or(a (b))
(that is, nil or a single singleton subtree), the two subtrees must satisfysize l ≤ δ * size r
andsize r ≤ δ * size l
, whereδ := 3
is a global parameter of the data structure (and this property must hold recursively at subtrees). This is why we say this is a "size balanced tree" data structure.ordnode.bounded lo hi t
: The members of the tree must be in strictly increasing order, meaning that ifa
is in the left subtree andb
is the root, thena ≤ b
and¬ (b ≤ a)
. We enforce this usingordnode.bounded
which includes also a global upper and lower bound.
Because the ordnode
file was ported from Haskell, the correctness invariants of some
of the functions have not been spelled out, and some theorems like
ordnode.valid'.balance_l_aux
show very intricate assumptions on the sizes,
which may need to be revised if it turns out some operations violate these assumptions,
because there is a decent amount of slop in the actual data structure invariants, so the
theorem will go through with multiple choices of assumption.
Note: This file is incomplete, in the sense that the intent is to have verified
versions and lemmas about all the definitions in ordnode.lean
, but at the moment only
a few operations are verified (the hard part should be out of the way, but still).
Contributors are encouraged to pick this up and finish the job, if it appeals to you.
Tags #
ordered map, ordered set, data structure, verified programming
delta and ratio #
singleton
#
sized
#
dual
The balanced_sz l r
asserts that a hypothetical tree with children of sizes l
and r
is
balanced: either l ≤ δ * r
and r ≤ δ * r
, or the tree is trivial with a singleton on one side
and nothing on the other.
Equations
- ordnode.balanced_sz l r = (l + r ≤ 1 ∨ l ≤ ordnode.delta * r ∧ r ≤ ordnode.delta * l)
Instances for ordnode.balanced_sz
Equations
- ordnode.balanced_sz.dec = λ (l r : ℕ), or.decidable
The balanced t
asserts that the tree t
satisfies the balance invariants
(at every level).
Equations
- (ordnode.node _x l _x_1 r).balanced = (ordnode.balanced_sz l.size r.size ∧ l.balanced ∧ r.balanced)
- ordnode.nil.balanced = true
Instances for ordnode.balanced
Equations
- ordnode.balanced.dec t = ordnode.rec (ordnode.balanced.dec._main._proof_1.mpr decidable.true) (λ (t_size : ℕ) (t_l : ordnode α) (t_x : α) (t_r : ordnode α) (t_ih_l : decidable t_l.balanced) (t_ih_r : decidable t_r.balanced), _.mpr and.decidable) t
rotate
and balance
#
Build a tree from three nodes, with a () b -> (a ()) b
and a (b c) d -> ((a b) (c d))
.
Equations
- l.node4_l x (ordnode.node _x ml y mr) z r = (l.node' x ml).node' y (mr.node' z r)
- l.node4_l x ordnode.nil z r = l.node3_l x ordnode.nil z r
Build a tree from three nodes, with a () b -> a (() b)
and a (b c) d -> ((a b) (c d))
.
Equations
- l.node4_r x (ordnode.node _x ml y mr) z r = (l.node' x ml).node' y (mr.node' z r)
- l.node4_r x ordnode.nil z r = l.node3_r x ordnode.nil z r
Concatenate two nodes, performing a left rotation x (y z) -> ((x y) z)
if balance is upset.
Equations
- l.rotate_l x (ordnode.node _x m y r) = ite (m.size < ordnode.ratio * r.size) (l.node3_l x m y r) (l.node4_l x m y r)
- l.rotate_l x ordnode.nil = l.node' x ordnode.nil
Concatenate two nodes, performing a right rotation (x y) z -> (x (y z))
if balance is upset.
Equations
- (ordnode.node _x l x m).rotate_r y r = ite (m.size < ordnode.ratio * l.size) (l.node3_r x m y r) (l.node4_r x m y r)
- ordnode.nil.rotate_r y r = ordnode.nil.node' y r
A left balance operation. This will rebalance a concatenation, assuming the original nodes are not too far from balanced.
A right balance operation. This will rebalance a concatenation, assuming the original nodes are not too far from balanced.
The full balance operation. This is the same as balance
, but with less manual inlining.
It is somewhat easier to work with this version in proofs.
all
, any
, emem
, amem
#
to_list
#
mem
#
(find/erase/split)_(min/max)
#
glue
#
merge
#
insert
#
balance
properties #
raised n m
means m
is either equal or one up from n
.
bounded
#
bounded t lo hi
says that every element x ∈ t
is in the range lo < x < hi
, and also this
property holds recursively in subtrees, making the full tree a BST. The bounds can be set to
lo = ⊥
and hi = ⊤
if we care only about the internal ordering constraints.
Equations
- (ordnode.node _x l x r).bounded o₁ o₂ = (l.bounded o₁ ↑x ∧ r.bounded ↑x o₂)
- ordnode.nil.bounded (option.some a) (option.some b) = (a < b)
- ordnode.nil.bounded (option.some val) option.none = true
- ordnode.nil.bounded option.none _x = true
valid
#
The validity predicate for an ordnode
subtree. This asserts that the size
fields are
correct, the tree is balanced, and the elements of the tree are organized according to the
ordering. This version of valid
also puts all elements in the tree in the interval (lo, hi)
.
An ordset α
is a finite set of values, represented as a tree. The operations on this type
maintain that the tree is balanced and correctly stores subtree sizes at each level. The
correctness property of the tree is baked into the type, so all operations on this type are correct
by construction.
Instances for ordset
O(1). The empty set.
Equations
- ordset.nil = ⟨ordnode.nil α, _⟩
O(1). Construct a singleton set containing value a
.
Equations
- ordset.singleton a = ⟨{a}, _⟩
Equations
- ordset.has_emptyc = {emptyc := ordset.nil _inst_1}
Equations
- ordset.inhabited = {default := ordset.nil _inst_1}
Equations
- ordset.has_singleton = {singleton := ordset.singleton _inst_1}
O(1). Is the set empty?
Instances for ordset.empty
Equations
- ordset.empty.decidable_pred = λ (s : ordset α), decidable_of_iff' ↥(s.val.empty) ordset.empty_iff
O(log n). Insert an element into the set, preserving balance and the BST property. If an equivalent element is already in the set, this replaces it.
Equations
- ordset.insert x s = ⟨ordnode.insert x s.val, _⟩
Equations
- ordset.has_insert = {insert := ordset.insert (λ (a b : α), _inst_3 a b)}
O(log n). Insert an element into the set, preserving balance and the BST property. If an equivalent element is already in the set, the set is returned as is.
Equations
- ordset.insert' x s = ⟨ordnode.insert' x s.val, _⟩
O(log n). Does the set contain the element x
? That is,
is there an element that is equivalent to x
in the order?
Equations
- ordset.mem x s = decidable.to_bool (x ∈ s.val)
O(log n). Retrieve an element in the set that is equivalent to x
in the order,
if it exists.
Equations
- ordset.find x s = ordnode.find x s.val
Equations
- ordset.has_mem = {mem := λ (x : α) (s : ordset α), ↥(ordset.mem x s)}
Equations
- ordset.mem.decidable x s = bool.decidable_eq (ordset.mem x s) bool.tt
O(log n). Remove an element from the set equivalent to x
. Does nothing if there
is no such element.
Equations
- ordset.erase x s = ⟨ordnode.erase x s.val, _⟩
O(n). Map a function across a tree, without changing the structure.
Equations
- ordset.map f f_strict_mono s = ⟨ordnode.map f s.val, _⟩