Zulip Chat Archive
Stream: lean4
Topic: Help with my array language design.
Jan Keller (Mar 28 2025 at 23:54):
Hi everyone,
I'm relatively new to Lean4, so I apologize if the example is a bit lengthy. I tried to reduce it, but smaller examples didn't have the same problem. The core issue seems to revolve around how contexts and shapes interact in the term construction.
I'm working on implementing an array language in Lean4 where every term inherently has a shape, effectively making all terms arrays. I'm encountering some challenges with Lean4's type unification when using the Tm.bld
constructor, and I'd appreciate any insights or suggestions the community might have.
Problem Overview:
In my design, the Tm.bld
constructor is supposed to build terms by appending dimensions to the context. However, whenever using variables (Var
), Lean4 isn't able to unifying the necessary types, forcing manual specification/proofs.
Example Code:
source.lean
Any guidance, examples, or tips on language design would be greatoyl appreciated. Thanks in advance for any help!
Best regards,
Jan Keller
Kevin Buzzard (Mar 28 2025 at 23:55):
(You can get nicely-formatted code within a spoiler block by putting ````lean
before it and ````
after it)
Tomas Skrivan (Mar 29 2025 at 02:38):
I'm having a bit hard time to understand what is the code supposed to do. Longer names and maybe a few words on what each types is supposed to do would be helpful.
So far this is my understanding:
Shape r
- elementn1 :: n2 :: ... :: [n_r]
of type Shape r
is supposed to to represent shape of a array/tensor of rank r
with dimensions n1
, n2
, ..., n_r
.
This type is effectively Vector Nat r
with the only difference that Shape 0
is an empty type. You might want to switch to using Vector
as it has functions like append
already defined.
Ty
- is a some sort of type? It can be float but I have no idea what Ty.Idx
is supposed to mean.
Ctx n
- represents context with n
variables and Γ : Ctx n
stores type information of all those variables?
Here again Ctx n
is just Vector Ty n
and you might be better of using Vector
here.
I do not really understand what Var
and Tm
is supposed to mean and what t6₂
and t6₃
are supposed to define.
Anyway I'm happy help as I share interest in array languages!
Tomas Skrivan (Mar 29 2025 at 02:41):
Also my high level impression from your code is that you are overusing dependent types. Having everything indexed by its size sounds great when starting using dependently typed language but one learns soon enough that it can cause more trouble than benefit and should be used in moderation.
Tomas Skrivan (Mar 29 2025 at 02:50):
It is unclear what is your aim. Do you want to implement an array language in Lean? Or do you want to write a specification of an array language and prove some properties of it?
If the former then I would recommend to have a look how Lean itself defines expressions(the core lambda calculus terms) and for example how it defines a local context (the connection between variable names/ids and their types and values)
Tomas Skrivan (Mar 29 2025 at 03:11):
You can of course embed an array language in Lean and work only with types Tensor X shape
where the definition of Tensor
could look something like this:
code
Jannis Limperg (Mar 29 2025 at 18:18):
Here's a fix for the original example (with some miscellaneous style improvements):
inductive Shape: Nat → Type where
| S : Nat → Shape 1
| C : Nat → Shape n → Shape (n + 1)
deriving Repr
@[simp]
def Shape.append : Shape n → Shape m → Shape (m + n)
| .S x, ys => .C x ys
| .C x xs, ys => .C x (.append xs ys)
notation x "::" xs => Shape.C x xs
notation "["x"]" => Shape.S x
notation x "++" y => Shape.append x y
inductive Ty : Type where
| float : Ty
| Idx : Nat → Ty
deriving Repr
inductive Ctx : Nat → Type where
| nil : Ctx 0
| cons : Ty → Ctx n → Ctx (n + 1)
deriving Repr
@[simp]
def Ctx.append : Ctx n → Ctx m → Ctx (m + n)
| .nil, ys => ys
| .cons x xs, ys => .cons x (append xs ys)
@[simp]
def Ctx.fromDims : Shape n → Ctx n
| .S x => .cons (.Idx x) .nil
| .C x xs => .cons (.Idx x) (.fromDims xs)
@[simp]
def Ctx.appDims (c : Ctx n) (s : Shape m) : Ctx (m + n) :=
.append c (.fromDims s)
inductive Var : Ctx n → Type where
| zero : {ty : Ty} → {ctx : Ctx (n)} → Var (Ctx.cons ty ctx)
| succ : {ty : Ty} → {ctx : Ctx (n)} → Var ctx → Var (Ctx.cons ty ctx)
deriving Repr
notation "H" => Var.zero
notation "T" v => Var.succ v
@[simp]
def Ctx.get (Γ : Ctx n) (v : Var Γ) : Ty :=
match v with
| .zero (ty := ty) => ty
| .succ (ctx := ctx) v' => Ctx.get ctx v'
inductive Tm : Ctx n → (s : Shape m) → Ty → Type where
| bld : (dims : Shape n) → Tm (Ctx.appDims Γ dims) s τ → Tm Γ (dims ++ s) τ
| var : (p : Var Γ) → Tm Γ [1] (Ctx.get Γ p)
deriving Repr
def t6₂: Tm Ctx.nil (50::10::[1]) (.Idx 10) :=
Tm.bld (50::[10]) (.var (T H))
def t6₃ : Tm Ctx.nil (50::10::[1]) (.Idx 10) :=
let e1: Tm (Ctx.appDims Ctx.nil (50::[10])) [1] (.Idx 10) := (.var (T H))
Tm.bld (50::[10]) e1
I mostly switched from 1 + n
to n + 1
. This helps because addition recurses on the right in Lean, so you get nicer defeqs. The rest then falls out more or less automatically.
A suggestion: maybe Var
should be indexed not by the context but by the length of the context, i.e. Var n
with n : Nat
.
In general, like Tomas, I'm not convinced that indexing everything by length is worth it, but if your type system is simple enough (no dependencies, no type variables), it may work out without too much pain.
Jan Keller (Mar 30 2025 at 17:03):
Hey, thanks a bunch for your insights! You’re spot on indexing the contexts and shapes just isn’t needed and only makes things more complicated. That induction on the right side of add
is pretty cool too; I always thought it was just like Idr or Coq. And yeah, the shape term is basically just a vector.
Indexing the Var
type over the size is a smart simplification since I don’t need to carry the type around in the variable itself.
Really appreciate your help with this!
With best regards,
Jan Keller
Last updated: May 02 2025 at 03:31 UTC