Zulip Chat Archive

Stream: new members

Topic: Termination checking for dummies


Philipp (Aug 05 2023 at 21:51):

Where do I find a good explanation for termination checking?
I also tried to search for code on github that I could shamelessly adapt/copy but was not successful.

So far I know a bit of functional programming, played the Natural Numbers game, and a long time ago proved some things in Isabelle (so what I want to say is: I have no experience).

What I generally would like to do is prove the termination of my function:

def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match (x', y') with
  | ([],[]) => 0
  | ([],y) => y.length
  | (x,[]) => x.length
  | ((x::xs),(y::ys)) => if x == y then levenshtein xs ys
  else 1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c

What I tried so far:

termination_by levenshtein x y _ => x.length + y.length

but it complains about the x==y case.
I guess the next step is to add decreasing_by but I haven't figured out the syntax or found a good tutorial for what I need to do next.

PS: Is there a better way for the case matching instead of constructing the pair x,y ?

Scott Morrison (Aug 05 2023 at 23:15):

You can directly match x', y' with.

Scott Morrison (Aug 05 2023 at 23:17):

You might like to look at the PR #6117, which starts the development of Levenshtein distances. (You could review it! :-)

Scott Morrison (Aug 05 2023 at 23:19):

termination_by levenshtein x y _ => x.length + y.length is the right thing to do.

Scott Morrison (Aug 05 2023 at 23:19):

The error message:

failed to prove termination, possible solutions:
  - Use `have`-expressions to prove the remaining goals
  - Use `termination_by` to specify a different well-founded relation
  - Use `decreasing_by` to specify your own tactic for discharging this kind of goal
α: Type u_1
x'y': List α
inst: BEq α
x: α
xs: List α
y: α
ys: List α
h: (x == y) = true
 List.length xs + List.length ys < List.length x' + List.length y'

tells you what is missing.

Scott Morrison (Aug 05 2023 at 23:20):

You need to use have to show Lean the proof of decreasing-ness.

Scott Morrison (Aug 05 2023 at 23:21):

Notice however that the way you have structured the proof there is no connection in the context between xs and x' anymore (similarly for ys), and so this goal is not provable. You will need to restructure the proof to allow this.

Scott Morrison (Aug 05 2023 at 23:21):

Hopefully that is helpful --- and please ask again if you'd like further hints on that restructuring!

Arthur Adjedj (Aug 06 2023 at 08:03):

Philipp said:

Where do I find a good explanation for termination checking?
I also tried to search for code on github that I could shamelessly adapt/copy but was not successful.

So far I know a bit of functional programming, played the Natural Numbers game, and a long time ago proved some things in Isabelle (so what I want to say is: I have no experience).

What I generally would like to do is prove the termination of my function:

def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match (x', y') with
  | ([],[]) => 0
  | ([],y) => y.length
  | (x,[]) => x.length
  | ((x::xs),(y::ys)) => if x == y then levenshtein xs ys
  else 1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c

What I tried so far:

termination_by levenshtein x y _ => x.length + y.length

but it complains about the x==y case.
I guess the next step is to add decreasing_by but I haven't figured out the syntax or found a good tutorial for what I need to do next.

PS: Is there a better way for the case matching instead of constructing the pair x,y ?

Your termination hint makes sense, the only issue is the way you matched on your lists: By writing match (x',y') with ...`, you match on the tuple constructed by the two lists, before deconstructing it by pattern-matching, this gives lean trouble as to how to prove termination. What you should do instead is match on them separately, all you need to do for that is to remove the parenthesis:

def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match x', y' with
  | [],[] => 0
  | [],y => y.length
  | x,[] => x.length
  | (x::xs),(y::ys) =>
    if x == y then
      levenshtein xs ys
    else
      1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where
  min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c
termination_by levenshtein xs ys _ => xs.length+ ys.length

This code should work

Arthur Adjedj (Aug 06 2023 at 08:12):

You could also keep the match on tuples, but then you'd need to

  1. add the equality hypothesis to your match by writing match h : (x',y') with ...
  2. prove all proofs of well-foundedness that the termination-checker wouldn't manage to prove by itself
    Here is a version of your code that matches on tuples and proves termination:
def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match h : (x', y') with
  | ([],[]) => 0
  | ([],y) => y.length
  | (x,[]) => x.length
  | ((x::xs),(y::ys)) =>
    have : List.length xs + List.length ys < List.length x' + List.length y' := by
      cases h
      apply Nat.add_lt_add <;> exact Nat.lt_succ_self _
    have : List.length xs + List.length (y::ys) < List.length x' + List.length y' := by
      cases h
      apply Nat.add_lt_add_right
      exact Nat.lt_succ_self _
    have : List.length (x::xs) + List.length ys < List.length x' + List.length y' := by
      cases h
      apply Nat.add_lt_add_left
      exact Nat.lt_succ_self _
    if x == y then
      levenshtein xs ys
    else
      1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where
  min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c
termination_by levenshtein xs ys _ => xs.length+ ys.length

Arthur Adjedj (Aug 06 2023 at 08:13):

This also works, and is cleaner IMO:

def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match h : (x', y') with
  | ([],[]) => 0
  | ([],y) => y.length
  | (x,[]) => x.length
  | ((x::xs),(y::ys)) =>
    if x == y then
      levenshtein xs ys
    else
      1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where
  min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c
termination_by levenshtein xs ys _ => xs.length+ ys.length
decreasing_by {
  simp_wf
  cases h <;> decreasing_trivial
}

Philipp (Aug 06 2023 at 16:18):

Ah thanks! I didn't know how to match on multiple variables without a tuple - or rather that (x, y) is different from x, y.

Could we go over another example? This one is a bit bigger unfortunately...

I used levenshtein to have a distance metric for strings:

class metric (α : Type _) :=
(distance : α  α  Nat)

instance levenshtein_metric {α: Type} [BEq α] : metric (List α) where
  distance x y := levenshtein x y

and defined BkTrees:

inductive BkTree (β : Type _) (d : metric β) where
| leaf
| node (value : β) (children : (Lean.AssocList Nat (BkTree β d)))
deriving Inhabited

For the size function I "cheated" a bit by writing my own foldl function - then the termination checker was fine:

def BkTree.size (t : BkTree β d) : Nat :=
  match t with
  | leaf => 0
  | node _ ts => foldl 1 ts
where foldl : Nat  Lean.AssocList Nat (BkTree β d)  Nat
      | n, Lean.AssocList.nil => n
      | n, Lean.AssocList.cons _ v t => foldl (n + v.size) t

Now I'd like to prove termination for contains:

def BkTree.contains (t : BkTree β d) (v : β) : Bool :=
  match h: t with
  | leaf => false
  | node val cs =>
    let dist := d.distance v val;
    if dist == 0 then true
    else match h2: cs.find? dist with
    | none => false
    | some c => c.contains v
termination_by BkTree.contains t _ => t.size
decreasing_by {
  simp_wf
  rw [ h]
}

Thanks to your tips, I have inserted h and h2, to keep the association between t and ts.
But I don't know if the rewrite makes sense because the goal is now to show size c < size t.

Do I need a lemma that the size of BkTrees is decreasing on recursion? And how do I even show that c is a sub-tree of t?

Arthur Adjedj (Aug 06 2023 at 17:27):

Now what you're asking for here is a bigger task :sweat_smile:
You shouldn't need to define size yourself, for every inductive type T you define, Lean defines a function sizeOf : T -> Nat which helps you prove wellfounded recursion, so feel free to use sizeOf t instead of t.sizeas your termination clue.
However, this doesn't solve your problem, you still need to prove something equivalent. AssocList is used in core to do some things, but it's not necessarily meant to be used by users. As such, it doesn't have many lemmas surrounding it. You'd have to prove yourself that the size of an element found in the AssocList is strictly smaller to the size of the list here.

Philipp (Aug 06 2023 at 20:51):

Ok so if I show that sizeOf t where t := BkTree _ cs is the sum of sizes of the bktrees in cs that would suffice?

Philipp (Aug 06 2023 at 20:53):

ofc I could also just use a custom list of tuples if that's easier - I just thought that AssocList is doing what I want (except that performance wise I should probably take RBMap)

Scott Morrison (Aug 07 2023 at 01:57):

@Philipp, I should try to tempt to you prove the triangle inequality for Levenshtein distance, on top of the definition that is making its way towards mathlib in #6117. :-)

Arthur Adjedj (Aug 07 2023 at 11:30):

Here's something that works. I am not used to proving much in lean, so I'm sure someone somewhere will have a much proper proof of all this, but at least it type-checks:

import Lean
import Mathlib.Tactic

open Lean (AssocList)

def levenshtein (x' : List α) (y' : List α) [BEq α] : Nat :=
  match x', y' with
  | [],[] => 0
  | [],y => y.length
  | x,[] => x.length
  | (x::xs),(y::ys) =>
    if x == y then
      levenshtein xs ys
    else
      1 + min3 (levenshtein xs (y::ys)) (levenshtein (x::xs) ys) (levenshtein xs ys)
where
  min3 (a : Nat) (b : Nat) (c : Nat) : Nat := min (min a b) c
termination_by levenshtein xs ys _ => xs.length+ ys.length

class metric (α : Type _) :=
(distance : α  α  Nat)

instance levenshtein_metric {α: Type} [BEq α] : metric (List α) where
  distance x y := levenshtein x y

inductive BkTree (β : Type _) (d : metric β) where
| leaf
| node (value : β) (children : (AssocList Nat (BkTree β d)))
deriving Inhabited

def BkTree.contains (t : BkTree β d) (v : β) : Bool :=
  match h: t with
  | leaf => false
  | node val cs =>
    let dist := d.distance v val;
    if dist == 0 then true
    else match h2: cs.find? dist with
    | none => false
    | some c =>
      have : sizeOf c < 1 + sizeOf cs := by
        cases h
        induction cs <;> simp at * <;>
        rw [AssocList.find?] at *
        · contradiction
        · rename_i key value tail tail_ih
          split at h2
          · cases h2
            linarith
          · have : sizeOf c < 1 + sizeOf tail := tail_ih h2
            linarith
      c.contains v
termination_by BkTree.contains t _ => sizeOf t

Mario Carneiro (Aug 07 2023 at 11:34):

I think the preferred style for that would be:

def BkTree.contains (t : BkTree β d) (v : β) : Bool :=
  match h: t with
  | leaf => false
  | node val cs =>
    let dist := d.distance v val;
    if dist == 0 then true
    else match h2: cs.find? dist with
    | none => false
    | some c =>
      have : sizeOf c < 1 + sizeOf cs := by
        cases h
        induction cs <;> simp at * <;> rw [AssocList.find?] at *
        · contradiction
        · next key value tail tail_ih =>
          split at h2
          · cases h2
            linarith
          · have : sizeOf c < 1 + sizeOf tail := tail_ih h2
            linarith
      c.contains v
termination_by _ => sizeOf t

since it's not clear what the decreasing_by is supposed to prove

Arthur Adjedj (Aug 07 2023 at 11:35):

Philipp said:

ofc I could also just use a custom list of tuples if that's easier - I just thought that AssocList is doing what I want (except that performance wise I should probably take RBMap)

There are indeed some more efficient structures for what you're doing such as RBMap. However, these are not considered to be strictly positive functors, can't be used in a recursive manner in inductive types. As such, your inductive definition of BkTree would sadly not be accepted by lean if you used RBMap instead

Arthur Adjedj (Aug 07 2023 at 11:37):

Mario Carneiro said:

I think the preferred style for that would be:

def BkTree.contains (t : BkTree β d) (v : β) : Bool :=
  match h: t with
  | leaf => false
  | node val cs =>
    let dist := d.distance v val;
    if dist == 0 then true
    else match h2: cs.find? dist with
    | none => false
    | some c =>
      have : sizeOf c < 1 + sizeOf cs := by
        cases h
        induction cs <;> simp at * <;> rw [AssocList.find?] at *
        · contradiction
        · next key value tail tail_ih =>
          split at h2
          · cases h2
            linarith
          · have : sizeOf c < 1 + sizeOf tail := tail_ih h2
            linarith
      c.contains v
termination_by _ => sizeOf t

since it's not clear what the decreasing_by is supposed to prove

woops, fixed, thanks

Mario Carneiro (Aug 07 2023 at 11:37):

it would be better to prove a lemma about AssocList.find? first though rather than inlining the whole thing in a termination proof

Mario Carneiro (Aug 07 2023 at 11:38):

While you can't use RBMap in a nested inductive, you can use RBNode, the inductive it is defined from

Mario Carneiro (Aug 07 2023 at 11:39):

I believe Json makes use of this trick for objects

Arthur Adjedj (Aug 07 2023 at 11:42):

Here is a version with an inlined theorem:

theorem foo [SizeOf α] [BEq α] [SizeOf β] {cs : AssocList α β } :  AssocList.find? d cs = some c  sizeOf c < 1 + sizeOf cs := by
  intro h
  induction cs <;> simp at * <;>
  rw [AssocList.find?] at *
  · contradiction
  · rename_i key value tail tail_ih
    split at h
    · cases h
      linarith
    · have : sizeOf c < 1 + sizeOf tail := tail_ih h
      linarith

def BkTree.contains (t : BkTree β d) (v : β) : Bool :=
  match t with
  | leaf => false
  | node val cs =>
    let dist := d.distance v val;
    if dist == 0 then true
    else match h2: cs.find? dist with
    | none => false
    | some c =>
      have : sizeOf c < 1 + sizeOf cs := foo h2
      c.contains v
termination_by BkTree.contains t _ => sizeOf t

Philipp (Aug 07 2023 at 17:55):

Arthur Adjedj schrieb:

As such, your inductive definition of BkTree would sadly not be accepted by lean if you used RBMap instead

Oh wow that's sad :(

Philipp (Aug 07 2023 at 17:57):

Arthur Adjedj schrieb:

Here is a version with an inlined theorem:

theorem foo [SizeOf α] [BEq α] [SizeOf β] {cs : AssocList α β } :  AssocList.find? d cs = some c  sizeOf c < 1 + sizeOf cs := by
...

Should such a theorem be added to the AssocList then? Or does that not make sense because AssocList should not be used outside of the Lean core?

Arthur Adjedj (Aug 07 2023 at 19:24):

Philipp said:

Arthur Adjedj schrieb:

Here is a version with an inlined theorem:

theorem foo [SizeOf α] [BEq α] [SizeOf β] {cs : AssocList α β } :  AssocList.find? d cs = some c  sizeOf c < 1 + sizeOf cs := by
...

Should such a theorem be added to the AssocList then? Or does that not make sense because AssocList should not be used outside of the Lean core?

I don't know what the policy is in regards to such PRs, sorry.

Philipp (Aug 07 2023 at 19:46):

Scott Morrison schrieb:

Philipp, I should try to tempt to you prove the triangle inequality for Levenshtein distance, on top of the definition that is making its way towards mathlib in #6117. :-)

I can try that. Unfortunately however, I can't use mathlib due to

error: toolchain 'leanprover/lean4:nightly-2023-08-05' does not have the binary `C:\Users\Philipp\.elan\toolchains\leanprover--lean4---nightly-2023-08-05\bin\lean.exe`

Last updated: Dec 20 2023 at 11:08 UTC