Zulip Chat Archive

Stream: lean4

Topic: using higher-order functions on inductive types: termination


Andrés Goens (Mar 13 2022 at 21:04):

I'm trying to prove termination on a function on an inductively-defined type. The problem is that my function uses higher-order functions, e.g. map, for which it seems Lean (understandably) doesn't know that they are well-behaved w.r.t induction. Here's an MWE:

inductive TreeNode :=
 | mkLeaf (name : String) : TreeNode
 | mkNode (name : String) (children : List TreeNode) : TreeNode

open TreeNode in def treeToList (t : TreeNode) : List String :=
 match t with
 | mkLeaf name => [name]
 | mkNode name children => name :: List.join (children.map treeToList)

Here, Lean won't be able to prove termination. Obviously, I can just add partial and be done with it, but I don't want to, for obvious reasons. For this simple example I can also just rewrite it to do it "manually", e.g.

open TreeNode in def treeToList (t : TreeNode) : List String :=
 match t with
 | mkLeaf name => [name]
 | mkNode name (c::rest) =>  treeToList (mkNode name rest) ++ treeToList c
 | mkNode name [] => [name]

But I guess it kind of breaks the point of using higher-order-functions if I can't use them for programming. This seems to be a pattern that should be common, so I'm hoping someone else has encountered it before. Is there a nice way to deal with it?

I also tried defining the depth of my inductive type to prove termination using well-founded recursion, but I can't figure out what I'm doing wrong there :/ If I write something like:
termination_by _ => t.depth, I get a message that says unexpected occurrence of recursive application. For this MWE it would look something like:

open TreeNode in def TreeNode.depth (t : TreeNode) : Nat :=
match t with
 | mkLeaf _ => 1
 | mkNode name (c::rest) => 1 + max (depth c) (depth (mkNode name rest))
 | mkNode _ [] => 1

open TreeNode in def treeToList (t : TreeNode) : List String :=
 match t with
 | mkLeaf name => [name]
 | mkNode name children =>  name :: List.join (children.map treeToList)
termination_by _ => t.depth

and that tells me

unexpected occurrence of recursive application
  treeToList

(and won't let me prove it's actually decreasing with decreasing_by either, it seems to break already with the line above).

Can someone help me understand what I'm doing wrong and/or even point me to how to deal with this pattern more generally (using higher-order functions on inductively-defined types)? Thanks!

James Gallicchio (Mar 13 2022 at 21:07):

I think using well-founded recursion is the right move in general, but a limitation is that you must actually apply the function recursively (not pass it to a HOF)

James Gallicchio (Mar 13 2022 at 21:07):

So here if you do children.map (fun T => treeToList T) it should give a better error message

James Gallicchio (Mar 13 2022 at 21:08):

In particular that you need to prove the T there has depth < the depth of t

James Gallicchio (Mar 13 2022 at 21:10):

(And doing that is also hard in general, but my first instinct would be to somehow map children to a list of children with proof that their depths are strictly smaller?)

Andrés Goens (Mar 13 2022 at 21:11):

interesting, yes, that extra lambda there actually helps, thanks! it's kind of strange that it would need that :thinking:
James Gallicchio said:

So here if you do children.map (fun T => treeToList T) it should give a better error message

James Gallicchio (Mar 13 2022 at 21:15):

Yeah... Lean won't do the eta expansion automatically, and also requires that recursive calls are fully applied.

The error message is pretty unhelpful there, it might be worth improving? @Leonardo de Moura

Andrés Goens (Mar 13 2022 at 21:44):

The error message after that eta expansion is much more helpful! I'm pretty puzzled by the proof obligation term though :/ In my actual example it has some sorryAx. In the MWE I was using here it doesn't, but it's still pretty confusing: it doesn't seem to have the equality of the pattern match:

failed to prove termination, possible solutions:
  - Use `have`-expressions to prove the remaining goals
  - Use `termination_by` to specify a different well-founded relation
  - Use `decreasing_by` to specity your own tactic for discharging this kind of goal
name : String
children : List TreeNode
t : TreeNode
 depth t < depth (mkNode name children)

I think I might not quite understand the input to decreasing_by, I thought it was a function of the inputs to some type that has a well-founded relation? But why won't it understand that the input t : TreeNode in this simple example here is the same as the pattern-matched mkNode name children?

Andrés Goens (Mar 13 2022 at 21:48):

by the way, when copy-pasting that Zulip noticed there's a typo there (specity -> specify), seems to come from https://github.com/leanprover/lean4/blob/fa0964c07e70bd3ad546a18d9c450c9c27d29920/src/Init/WFTactics.lean#L24 @Leonardo de Moura

Chris B (Mar 13 2022 at 21:49):

t is the element of children. decreasing_by is where you provide a proof that depth t < depth (mkNode name children), termination_by is where you specify what relation you want to use.

Andrés Goens (Mar 13 2022 at 21:52):

@Chris B how do you know that t is an element of children? where does it say that?

Chris B (Mar 13 2022 at 21:55):

I thought you were using the eta expanded(fun t => treeToList t), my bad. What is the definition you're working with now?

Chris B (Mar 13 2022 at 21:55):

You can preserve a hypothesis about the matched element with match ht:t with.

Andrés Goens (Mar 13 2022 at 22:05):

Yep, you're right about that! I am using that eta expanded (fun t => treeToList t). The problem is then that it's not in my hypotheses then in decreasing_by. That's a great suggestion, preserving it with the match ht:t with, thanks! That actually shows that indeed it wants me to prove two entirely unrelated terms to be decreasing. It then shows as:

x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
t : TreeNode
 depth t < depth (mkNode name children)

Andrés Goens (Mar 13 2022 at 22:10):

There seems to be something strange going on there :/ if I try to do an induction over t, it will then again relate the two to each other, but in a way that doesn't make sense. This seems really confusing, is that maybe a hygiene bug? Concretely, the code:

open TreeNode in def treeToList (t : TreeNode) : List String :=
 match ht:t with
 | mkLeaf name => [name]
 | mkNode name children => name :: List.join (children.map (λ t => treeToList t))
termination_by treeToList t => TreeNode.depth t
decreasing_by {
simp_wf
induction t
}

will give this state:

unsolved goals
case mkLeaf
x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
name : String
 depth (mkLeaf name) < depth (mkNode name children)

case mkNode
x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
t : TreeNode
name : String
children : List TreeNode
children_ih : ?m.40538 children
 depth (mkNode name children) < depth (mkNode name children)

case nil
x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
t : TreeNode
 ?m.40538 []

case cons
x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
t head : TreeNode
tail : List TreeNode
head_ih : depth head < depth (mkNode name children)
tail_ih : ?m.40538 tail
 ?m.40538 (head :: tail)

And why does it also have cases for t being a List TreeNode?

James Gallicchio (Mar 14 2022 at 00:32):

Hrm, this isn't a bug, but there's a lot of complexity here. If you write decreasing_by sorry you can see the proof obligation there, and it should be something like

depth t < depth t

James Gallicchio (Mar 14 2022 at 00:35):

We don't have any useful information in context because, a priori, the t in the lambda could be arbitrary. You have to find a way to communicate to within the lambda that the depth of t in the lambda is strictly smaller, i.e. that map only calls the function on elements of the list

James Gallicchio (Mar 14 2022 at 00:57):

This definition (not in Std) might be helpful:

  def List.map_sub {P : τ  Prop} : (L : List τ)  ( x, x  L  P x)  List { x : τ // P x }
  | []      => λ _ => []
  | (t::ts) => λ h =>
    t, h t (Mem.head _ _)⟩ ::
    map_sub ts (λ x mem => h x (Mem.tail _ mem))

which lets you subtype the members of a list by some property (e.g. having depth less than the parent node). Then you would have access to a proof of that property within the lambda that you make the recursive call at.

This might be a terrible way to go about it though, I'm not sure.

James Gallicchio (Mar 14 2022 at 01:01):

The easier way to do this would be to index your NodeTree type on its depth, but I don't think that's a good solution either.

James Gallicchio (Mar 14 2022 at 01:02):

James Gallicchio said:

Hrm, this isn't a bug, but there's a lot of complexity here. If you write decreasing_by sorry you can see the proof obligation there, and it should be something like

depth t < depth t

Oh I just saw you had sent this already :joy: sorry

James Gallicchio (Mar 14 2022 at 01:05):

Oh, another way to get this to prove termination is via a mutually recursive definition, where you have a mutually recursive definition that essentially computes children.map treeToList but by explicitly recursing on children

James Gallicchio (Mar 14 2022 at 01:05):

But I really don't like that solution, because rewriting map is annoying.

Andrés Goens (Mar 14 2022 at 01:05):

I mean, the TreeNode example is just the simplest example that shows the problem. I can't imagine this is a very far-fetched pattern here, but I'm starting to think there might be some problem in Lean underlying this. Perhaps because List is itself an inductive type? If I try to compute the simplest example it already breaks:

#reduce TreeNode.depth (TreeNode.mkNode "foo" [TreeNode.mkLeaf "bar"])

will yield:

maximum recursion depth has been reached (use `set_option maxRecDepth <num>` to increase limit)

Even though the individual patterns all work well:

#reduce TreeNode.depth (TreeNode.mkLeaf "bar")
#reduce TreeNode.depth (TreeNode.mkNode "foo" [])
#reduce 1 + max (TreeNode.depth (TreeNode.mkLeaf "bar")) (TreeNode.depth (TreeNode.mkNode "foo" []))

Chris B (Mar 14 2022 at 01:17):

Andrés Goens said:

I can't imagine this is a very far-fetched pattern here, but I'm starting to think there might be some problem in Lean underlying this. Perhaps because List is itself an inductive type?

This construction is called a nested inductive and it is indeed more complex than a regular inductive. Lean 4 has much better fundamental support for nested and mutual inductives than lean 3, and the developers have said that making these usable is one of their goals. While the kernel stuff is in place, I think they've mentioned elsewhere that polishing this part of the developer experience is further back in the queue (your example also uses well-founded recursion, which is in the same boat). Keep in mind you're using software that has not seen its first official stable release.

Chris B (Mar 14 2022 at 01:19):

The #reduce example works for me on m3 by the way.

Andrés Goens (Mar 14 2022 at 01:23):

Sure, I'm aware it's WIP! I'm not complaining at all, just trying to understand things :) thanks for the pointer to nested inductives.

Andrés Goens (Mar 14 2022 at 01:24):

Chris B said:

The #reduce example works for me on m3 by the way.

interesting! sorry, what's m3 :sweat_smile: ?

Chris B (Mar 14 2022 at 01:26):

I didn't take it as a complaint, just saying "it will probably work better in the near future". m3 is milestone3, the version of lean; v4.0.0-m3.

Andrés Goens (Mar 14 2022 at 01:32):

huh, interesting, it doesn't work with the latest nightly for me

Andrés Goens (Mar 14 2022 at 01:35):

(which I guess it does hint at a bug)

James Gallicchio (Mar 14 2022 at 01:41):

Also #reduce uses the kernel (which is very slow), if you use #eval it'll likely execute fine

Mario Carneiro (Mar 14 2022 at 02:39):

Andrés Goens said:

Yep, you're right about that! I am using that eta expanded (fun t => treeToList t). The problem is then that it's not in my hypotheses then in decreasing_by. That's a great suggestion, preserving it with the match ht:t with, thanks! That actually shows that indeed it wants me to prove two entirely unrelated terms to be decreasing. It then shows as:

x : TreeNode
name : String
children : List TreeNode
ht : x = mkNode name children
t : TreeNode
 depth t < depth (mkNode name children)

This goal is not provable, and this is basically why lean has trouble with using List.map in this way. List.map takes a function on all trees, so you can't do a recursive call because you don't know that the tree it is being called on is related to the list you are mapping. You need to use something like docs#list.pmap instead (not sure whether this exists in lean4 / mathlib4 yet)

Andrés Goens (Mar 14 2022 at 09:01):

that's insightful, thanks! is that a fundamental problem (i.e. it wouldn't work because of the underlying type theory and nested inductives or something), or is it more an engineering one (it's just hard to create the right goal in this context and they developers have not come to that yet)?

Jannis Limperg (Mar 14 2022 at 09:35):

Coq and Agda both support nested inductives, in the sense that the example in your initial post would compile as-is. So it's not a problem with the type theory but 'just' requires lots of engineering.

Leonardo de Moura (Mar 14 2022 at 15:36):

Thanks for pointing out the typos and poor error messages. We are going to improve them.

Here is the approach we are going to use in Lean to make this king of example easy to write:
https://github.com/leanprover/lean4/blob/master/tests/lean/run/combinatorsAndWF.lean
It includes a very similar example.

@Jannis Limperg Recall that both Coq and Agda include a termination checker in the kernel, and a lot of engineering is only really needed in Lean.

Jannis Limperg (Mar 14 2022 at 15:51):

Leonardo de Moura said:

Jannis Limperg Recall that both Coq and Agda include a termination checker in the kernel, and a lot of engineering is only really needed in Lean.

Yes, absolutely. I appreciate that you're keeping a lot of complexity out of the kernel with this approach.

Leonardo de Moura (Mar 14 2022 at 16:05):

BTW, the for in do notation already has support for for h : c in cs do ... where h : c ∈ cs. It is not very convenient for this example because it is pure code, but we can write it as follows

inductive TreeNode :=
 | mkLeaf (name : String) : TreeNode
 | mkNode (name : String) (children : List TreeNode) : TreeNode

open TreeNode in
def treeToList (t : TreeNode) : List String :=
 match t with
 | mkLeaf name => [name]
 | mkNode name children => Id.run do
   let mut r := [name]
   for h : child in children do
     -- We will not this the following `have` in the future
     have : sizeOf child < 1 + sizeOf name + sizeOf children := Nat.lt_trans (List.sizeOf_lt_of_mem h) (by simp_arith)
     r := r ++ treeToList child
   return r

Last updated: Dec 20 2023 at 11:08 UTC