Zulip Chat Archive

Stream: new members

Topic: Idiomatically working with mutual inductive types

Cameron M (Jan 18 2026 at 07:47):

Hi all. I'm getting started with Lean, and am looking to implement a data
structure I use at work and prove some statements about it. For reference, the
structure is a "sum tree", which is essentially a [B+ tree][b tree] that stores
the "sum" of all children in each node ([this][sum tree] is the Rust
implementation I'm copying). I'm quite new to Lean, and formal reasoning more
generally, but have a reasonable functional programming background.

Initially, my code looked like this:

class Summary (σ : Type u) where
  zero      : σ
  add       : σ -> σ -> σ
  add_zero  : ∀ s     : σ, add s zero = s
  add_comm  : ∀ s t   : σ, add s t = add t s
  add_assoc : ∀ s t v : σ, add (add s t) v  = add s (add t v)

class Item (α : Type u) where
  S : Type u
  [instSummary : Summary S]
  summarize : α -> S

structure Summarized (α : Type u) [Item α] [Summary (Item.S α)] where
  val : α
  summary : Item.S α := Item.summarize val

inductive SumTree (α : Type u) [Item α] [Summary (Item.S α)] : (depth : Nat) -> Type u where
  | leaf
    (items : List (Summarized α))
    (summary : Item.S α)
    : SumTree α 0

  | parent
    (children : List (SumTree α depth))
    (summary : Item.S α)
    : SumTree α (depth + 1)

This gave me an error though: invalid nested inductive datatype 'List', nested inductive datatypes parameters cannot contain local variables.. Some Googling
and LLM-ing suggested that, instead of using List, I could define my own
SumTree-specific list type, so I did this:

mutual
  inductive SumTree (α : Type u) [Item α] [Summary (Item.S α)] : (depth : Nat) -> Type u where
    | leaf
      (items : List (Summarized α))
      (summary : Item.S α)
      : SumTree α 0

    | parent
      (children : TreeList α depth)
      (summary : Item.S α)
      : SumTree α (depth + 1)

  inductive TreeList (α : Type u) [Item α] [Summary (Item.S α)] : (depth : Nat) -> Type u where
    | nil : TreeList α depth
    | cons (head : SumTree α depth) (tail : TreeList α depth) : TreeList α depth
end

This works, and also allows me to add some propositions to the cases. For example:

inductive SumTree ...
  | leaf
    -- ...
    (items_len : items.length < 12)
    -- the `summary` field should be the sum of the summaries in `items`
    (summary_eq : summary = (items.map Summarized.summary).foldl Summary.add Summary.zero)

However, when I try to add similar propositions to the parent case, I quickly
run into the issue that, since TreeList is a type I have just invented, there
is no foldl or length function defined.

I could implement these, but if I add def TreeList.length to the mutual
block, I get an error because I am mixing def and inductive in a mutual
block, which is not allowed. But I can't put it outside the mutual block because:

If it's before, then the def won't have access to the TreeList type
If it's after, then I can't access it in the definition of SumTree

I can't help but feel like I must be doing something wrong. I could of course
skip adding these propositions, but then I'm not sure how I'd prove any of the
more interesting theorems. But this seems like something that should be possible.

I also don't really understand the reason why referring to List in my first
attempt at defining SumTree isn't allowed. I think I understand what is not
allowed, but I don't know if this is a fundamental limitation, or just
something that hasn't been implemented yet, so I don't know how hard I should
try to work around it. I'd definitely much prefer a solution that allows me to
reuse all the functions/theorems defined on List in the standard library, but
appreciate this may be more complex than it seems.

Any advice is much appreciated.

Thanks :)

[sum tree]: https://github.com/zed-industries/zed/blob/main/crates/sum_tree/src/sum_tree.rs
[b tree]: https://en.wikipedia.org/wiki/B%2B_tree

Mirek Olšák (Jan 18 2026 at 11:46):

You could also separate the depth measurement from the low-level type

inductive FreeSumTree (α : Type u) [Item α] [Summary (Item.S α)] : Type u where
  | leaf
    (items : List (Summarized α))
    (summary : Item.S α)
  | parent
    (children : List (FreeSumTree α))
    (summary : Item.S α)

def FreeSumTree.hasDepth {α : Type u} [Item α] [Summary (Item.S α)] (tree : FreeSumTree α) (depth : Nat) : Prop :=
  match tree, depth with
  | leaf _ _, 0 => True
  | leaf _ _, _ => False
  | parent _ _, 0 => False
  | parent children _, d+1 =>
    ∀ child ∈ children, child.hasDepth d

def SumTree (α : Type u) [Item α] [Summary (Item.S α)] (depth : Nat):= {tree : FreeSumTree α // tree.hasDepth depth}

Mirek Olšák (Jan 18 2026 at 11:47):

Doing the mutual type approach, you can define functions operating on them outside of the mutual scope.

Mirek Olšák (Jan 18 2026 at 11:56):

Let me explain why nested inductives are tricky.

Inductive types are close to kernel
Lean must be sure that everything is mathematically well-defined (it can be modeled with something close to set theory)

So for example this is not allowed:

/-- A binary tree whose leafs are either natural numbers,
or a function that processes another function tree, and returns a boolean.
(!) This is not valid in Lean's logic.
-/
inductive FunctionTree
| fork (a b : FunctionTree)
| leafNat (n : Nat)
| leafFun (f : FunctionTree → Bool)

Why? Short answer: Storing a full function leafFun inside is not mathematically sound. There is a cardinality argument, or a non-termination argument. I explain it in my notes.

Now, imagine you hide the function into your own datatype

structure Foo (α : Type) where
  f : α → Bool

inductive FunctionTree
| fork (a b : FunctionTree)
| leafNat (n : Nat)
| leafFun (f : Foo FunctionTree)

this still must error out, on the other hand, using List is fine:

inductive FunctionTree
| fork (a b : FunctionTree)
| leafNat (n : Nat)
| leafFun (f : List FunctionTree)

So Lean must investigate the types that you use inside inductive types if it is well-behaved or not. It is true that having depth in SumTree is mathematically valid, but it is a bit too much for Lean's basic check that the definition of the inductive type is sound.

Jakub Nowak (Jan 18 2026 at 18:17):

Mirek Olšák said:

You could also separate the depth measurement from the low-level type

inductive FreeSumTree (α : Type u) [Item α] [Summary (Item.S α)] : Type u where
  | leaf
    (items : List (Summarized α))
    (summary : Item.S α)
  | parent
    (children : List (FreeSumTree α))
    (summary : Item.S α)

def FreeSumTree.hasDepth {α : Type u} [Item α] [Summary (Item.S α)] (tree : FreeSumTree α) (depth : Nat) : Prop :=
  match tree, depth with
  | leaf _ _, 0 => True
  | leaf _ _, _ => False
  | parent _ _, 0 => False
  | parent children _, d+1 =>
    ∀ child ∈ children, child.hasDepth d

def SumTree (α : Type u) [Item α] [Summary (Item.S α)] (depth : Nat):= {tree : FreeSumTree α // tree.hasDepth depth}

Unfortunately, the downside is that afaik you can't get match to work on this SumTree definition. You will have to constantly convert between SumTree and FreeSumTree.

Mirek Olšák (Jan 18 2026 at 19:38):

Also this approach can work for trees of a fixed depth

inductive Tree : Nat → Type where
| leaf : Tree 0
| node {n : Nat} {a : Nat} (children : Fin a → (Tree n)) : Tree (n+1)

Mirek Olšák (Jan 18 2026 at 20:09):

although it is a bit weird to use for programing...

Chris Bailey (Jan 18 2026 at 20:16):

I believe the following is a common approach to fancy invariant-preserving types taken in core/std as well, where (1) the core "data" part of the inductive is kept as simple as possible, (2) you define a separate well-formedness predicate (here the part about depth), and then (3) you define the new object of interest to be "the elements of T that are well formed":

inductive SumTree (α : Type u) [Item α] [Summary (Item.S α)] : Type u
  | leaf
    (items : List (Summarized α))
    (summary : Item.S α)
    : SumTree α

  | parent (depth : Nat)
    (children : List (SumTree α))
    (summary : Item.S α)
    : SumTree α

inductive SumTree.WellFormed {α : Type u} [Item α] [Summary (Item.S α)] : SumTree α → Nat → Prop
  | ofLeaf (items : List (Summarized α)) (summary : Item.S α) : WellFormed (SumTree.leaf items summary) 0
  | ofParent
    (depth : Nat)
    (children : List (SumTree α))
    (summary : Item.S α) :
    (∀ x ∈ children, WellFormed x depth) →
    WellFormed (SumTree.parent depth children summary) (depth+1)

def WellFormedSumTree (α : Type u) [Item α] [Summary (Item.S α)] (depth : Nat) :=
  { t : SumTree α // t.WellFormed depth }

-- Example, you can work with either `SumTree` or `WellFormedSumTree`, but...
--inductive SumTree.FoldPred {α : Type u} [Item α] [Summary (Item.S α)] : SumTree α → Prop
--  | ofLeaf
--    (items : List (Summarized α))
--    (summary : Item.S α)
--    (items_len : items.length < 12)
--    (summary_eq : summary = (items.map Summarized.summary).foldl Summary.add Summary.zero)
--    : FoldPred (.leaf items summary)

This is also nice for types that need to exit/re-enter the well formed world, like balanced trees that need to move through otherwise invalid states while being reshuffled for balancing.

Cameron M (Jan 19 2026 at 03:33):

Thanks everyone for the detailed replies :)

It sounds like this is a pretty core limitation. In fact I did try at first to
"hide" things by moving them into other structures, but I wasn't particularly
surprised to discover that this didn't work. The cardinality argument makes
sense to me (I think?) - the intuition I have is that, because Lean relies on
structural recursion, each "field" in an inductive must be "smaller" (i.e. it
would be invalid to have an inductive contain a "pointer to the original
inductive" as one of the fields, since even structural recursion could not
terminate). And functions similarly violate this constraint. Admittedly this is
all still quite fuzzy in my head.

As for actually making progress, the idea of a SumTree.WellFormed inductive
seems like it preserves the rough "API design" I was looking for. My assumption
then is that I can either choose between:

Have functions like .insert take a WellFormedSumTree and return a
WellFormedSumTree
Have it take a SumTree and return a SumTree, then provide a theorem like
∀ tree : SumTree, ∀ a : α, tree.WellFormed → (tree.insert a).WellFormed

My gut feeling is that the first approach is more correct/idiomatic? In a
sense, it's slightly meaningless to attempt to insert into a non-well-formed
data structure. Though I guess this is only true for the public API of the
tree, I'd probably have private helper methods that don't necessarily require
well-formedness all the time.

Thanks again :)

Jakub Nowak (Jan 19 2026 at 03:37):

Actually, you don't have to choose between. You can have both. You can write definition/theorem in the second form, and then make a first form as a simple wrapper around the second.

Last updated: Feb 28 2026 at 14:05 UTC