Zulip Chat Archive

Stream: new members

Topic: Typeclass for bounded natural numbers?


Simeon Duwel (Jan 11 2025 at 23:23):

Hi everyone,
very new to Lean4 (and dependently typed languages in general). I want to explore Lean's capability as a functional programming language and wanted to pose myself a non-trivial challenge; I came up with implementing a B-tree. This is a data structure in which there's a constraint on the amount of child nodes a node can have depending on whether it's a root node or an internal one. This seemed like a good place to try and make use of a dependent type system.

I came up with the following:

class Bounded [LE α] (min: α) (val: α) (max: α) where
  isBounded : min  val  val  max

inductive BTree (α: Type) (b: Nat) where
  | Root
      [Bounded 1 ln b]
      (keys: Vector α ln)
      (children: Vector Internal (ln + 1))
  | Internal
      [Bounded (Nat.div b 2) ln (b - 1)]
      (keys: Vector α ln)
      (children: Vector Internal (ln + 1))

but I have no idea if this is idiomatic (or even in the slightest whether or not this is the right way to go about defining such a type). The main problem I'm facing is this: for any triple of natural numbers (n, m, p) with n ≤ m ≤ p, I can trivially synthesise an instance of this typeclass:

instance : Bounded 1 2 3 where
  isBounded := by simp

however, I don't know how to do this "automatically", i.e. how to tell Lean that the typeclass should be synthesised for natural numbers that satisfy my constraints. As a result, I can't use this datatype in general cases right now, which is kind of a bummer.

Does anyone have any input on how to approach this? I come from a Haskell background, if that helps.
Many thanks in advance from an enthousiastic maths student :)

Simeon Duwel (Jan 11 2025 at 23:39):

Oh, by the way, only just read the channel guidelines: I'm Simeon, currently in the second year of my bachelor in mathematics at KU Leuven. I'm very interested in category, type and homotopy theory and everything that surrounds it, but all of that stuff is currently a bit too advanced for me at the moment :sweat_smile:

Eric Wieser (Jan 11 2025 at 23:46):

You could consider replacing [Bounded _ _ _] with (h : Bounded _ _ _ := by decide) (using autoParams instead of typeclasses), or alternatively using ln : Fin b

Eric Wieser (Jan 11 2025 at 23:47):

(you can disable the feature that gives you ln : Nat implicitly with set_option autoImplicit false)

Eric Wieser (Jan 11 2025 at 23:51):

Actually, I think you're being tripped up by autoImplicit defaulting to true, and the code you wrote doesn't mean what you think it does; the Vector Internal is not referring to the | Internal line, but to an unrelated and invisible Internal : Type _ variable

Simeon Duwel (Jan 11 2025 at 23:54):

Thanks for the advice! I'm home in like 15min, I'll try it out then

Simeon Duwel (Jan 11 2025 at 23:56):

So if I've got this right, the square brackets sort of, "pull" an instance of a typeclass from the context, and the parentheses (h : ...) notation specifically provides a value for it?

Eric Wieser (Jan 12 2025 at 00:10):

The (h : suggestion is "don't use a typeclass at all, just use a regular argument". I suppose I should have said to replace class with structure at the same time

Eric Wieser (Jan 12 2025 at 00:20):

But I think you should address the autoImplicit issue first, you have bigger concerns here

Simeon Duwel (Jan 12 2025 at 00:20):

Eric Wieser said:

(you can disable the feature that gives you ln : Nat implicitly with set_option autoImplicit false)

But I do _want_ to use ln as a value to use in the type for the Vectors, right? how can I get rid of the "unknown identifier ln" error? (Sorry for the amateurishly posed question, I feel like a littl'un writing their first Stack Overflow question again haha)

Simeon Duwel (Jan 12 2025 at 00:21):

Eric Wieser said:

But I think you should address the autoImplicit issue first, you have bigger concerns here

Okay, I'll try working on that (i.e. reading some more). Thanks in any case :))

Matt Diamond (Jan 12 2025 at 00:43):

how can I get rid of the "unknown identifier ln" error?

The problem is pretty simple: you're not passing in a variable named ln. It's like writing a function add (a : Nat) : Nat := a + b and getting an "unknown identifier b" error. Where is it coming from?

This is why it's important to add set_option autoImplicit false at the top, at least when you're a beginner... Lean is making choices for you behind the scenes and you might not be aware of them.

Matt Diamond (Jan 12 2025 at 00:45):

Also don't worry at all about "amateurish" questions... that's why it's the New Members channel! Feel free to ask anything

Simeon Duwel (Jan 12 2025 at 21:33):

Okay, after thinking about this for a bit (and studying for my Algebra final) I've furthered my progress a bit:

inductive BTreeNode (α: Type) (bf: Nat) where
  | Leaf (val : α)
  | Node (ln: Nat) :
         (h: Nat.div ln 2  ln := by decide)
       (children: Vector (BTreeNode α bf) ln)
       BTreeNode α bf

structure BTree (α: Type) (bf: Nat) (ln: Nat) where
  keys : (h: 0 < ln  ln < bf := by decide)  Vector α ln
  children : Vector (BTreeNode α bf) (ln + 1)

although, this gives an error I can't quite wrap my head around: (kernel) invalid nested inductive datatype 'Vector', nested inductive datatypes parameters cannot contain local variables., in the definition of BTreeNode. I suppose the error itself makes sense, because, well, the Vector does depend on the local variable ln. But I have no clue on how to fix this, I came across this thread but I can't quite grasp how to apply the solution presented there to the situation at hand.

Simeon Duwel (Jan 12 2025 at 21:34):

At least I learned something new about the parameter/index distinction hahah

Matt Diamond (Jan 13 2025 at 00:07):

You may have more luck representing the child nodes as a function from Fin n to the nodes... for example, this avoids the error (though idk if it does what you want):

import Mathlib

inductive BTreeNode (α: Type) (bf: Nat) where
  | Leaf (val : α)
  | Node (ln: Nat) :
         (h: Nat.div ln 2  ln := by decide)
       (Fin ln  BTreeNode α bf)
       BTreeNode α bf

structure BTree (α: Type) (bf: Nat) (ln: Nat) where
  keys : (h: 0 < ln  ln < bf := by decide)  Vector α ln
  children : Vector (BTreeNode α bf) (ln + 1)

Matt Diamond (Jan 13 2025 at 00:10):

Lean has a decent amount of API for working with tuples represented as functions from Fin n:

https://leanprover-community.github.io/mathlib4_docs/Mathlib/Data/Fin/Tuple/Basic.html

Simeon Duwel (Jan 14 2025 at 15:33):

Okay, here I am again. I've got it down to something that typechecks at least:

inductive BTreeNode (α: Type) (bf: PosNum) where
  | Nil
  | Leaf (val : α)
  | Node (ln: Nat) :
         (h: Nat.div ln 2  ln := by decide)
       (keys: Vector α ln)
       (children: Fin (ln + 1)  BTreeNode α bf)
       BTreeNode α bf

structure BTree (α: Type) (bf: PosNum) (ln: Nat) where
  keys (h: ln < bf := by decide) : Vector α ln
  children : Vector (BTreeNode α bf) (ln + 1)

def BTree.empty (α: Type) (bf: PosNum) : BTree α bf 0 :=
  { children := Vector.mkVector 1 BTreeNode.Nil, keys := λ (_: 0 < bf) => Vector.mkEmpty 0 }

One thing I don't understand yet, is that I can simply supply the λ (_: 0 < ↑bf) => Vector.mkEmpty 0 as a value for keys. My intention is to need to somehow provide a proof for the proposition that the ln parameter is actually less than the bf parameter (for example using PosNum.cast_pos, I suppose?) but I haven't had to provide a concrete proof here.
Where does my mental model not correspond to what Lean is actually doing?


Last updated: May 02 2025 at 03:31 UTC