Zulip Chat Archive
Stream: lean4
Topic: How you annotate the AST in lean4 compiler?
xiao (Mar 04 2025 at 22:31):
I am going to annotate the AST of some DSL. What is the best practice? And how did you do that in lean compiler? The basic example is from Haskell.
inductive Exp where
| Const: Nat -> Exp
| Add: Exp -> Exp -> Exp
inductive Inst where
| Ann: String -> Inst
| Push: Nat -> Inst
| Add | Mult
deriving Repr
abbrev Prog := List Inst
def Exp.compile: Exp -> Prog
| .Const n => [.Push n]
| .Add a b => a.compile ++ b.compile ++ [.Add]
def Exp.eval: Exp -> Nat
| .Const n => n
| .Add a b => a.eval + b.eval
Now, I want to make two variants: one with some line number annotation and the other with a new constructor Mult
.
inductive ExpA where
| Const: String -> Nat -> ExpA
| Add: String -> ExpA -> ExpA -> ExpA
def ExpA.compile: ExpA -> Prog
| .Const s n => [.Ann s, .Push n]
| .Add s a b => [.Ann s] ++ a.compile ++ b.compile ++ [.Add]
def ExpA.eval: ExpA -> Nat
| .Const _ n => n
| .Add _ a b => a.eval + b.eval
inductive ExpM where
| Const: Nat -> ExpM
| Add: ExpM -> ExpM -> ExpM
| Mult: ExpM -> ExpM -> ExpM
def ExpM.compile: ExpM -> Prog
| .Const n => [.Push n]
| .Add a b => a.compile ++ b.compile ++ [.Add]
| .Mult a b => a.compile ++ b.compile ++ [.Mult]
def ExpM.eval: ExpM -> Nat
| .Const n => n
| .Add a b => a.eval + b.eval
| .Mult a b => a.eval * b.eval
Apparently, the same logic for Const
and Add
is repeated many times. I understand the W-type is itself the lfp operator in lean. So I tried the following
inductive ExpF (T: Type) where
| Const: Nat -> ExpF T
| Add: T -> T -> ExpF T
inductive Exp where
| In: ExpF Exp -> Exp
def Exp.const n := Exp.In (ExpF.Const n)
def Exp.add a b := Exp.In (ExpF.Add a b)
def Exp.cata {B} (f: ExpF B -> B): Exp -> B
| .In (.Const n) => f (.Const n)
| .In (.Add a b) => f (.Add (a.cata f) (b.cata f))
def ExpF.compile: ExpF Prog -> Prog
| .Const n => [.Push n]
| .Add a b => a ++ b ++ [.Add]
def ExpF.eval: ExpF Nat -> Nat
| .Const n => n
| .Add a b => a + b
def Exp.eval := Exp.cata ExpF.eval
def Exp.compile := Exp.cata ExpF.compile
def e23: Exp := .add (.const 2) (.const 3)
example: e23.eval = 5 := by eq_refl
example: e23.compile = [.Push 2, .Push 3, .Add] := by eq_refl
And the other two variants are
inductive ExpA where
| Ann: String -> ExpF ExpA -> ExpA
def ExpA.cata
{B: Type}
(f: String -> ExpF B -> B)
: ExpA -> B
| .Ann s (.Const n) => f s (.Const n)
| .Ann s (.Add a b) => f s (.Add (a.cata f) (b.cata f))
def ExpA.eval := ExpA.cata (fun _ b => ExpF.eval b)
def ExpA.compile := ExpA.cata (fun s b => .Ann s :: ExpF.compile b)
def a2: ExpA := (.Ann "pos1" (.Const 2))
def a3: ExpA := (.Ann "pos2" (.Const 3))
def a23: ExpA := (.Ann "pos3" (.Add a2 a3))
example: a23.eval = 5 := by eq_refl
example: a23.compile = [
Inst.Ann "pos3",
Inst.Ann "pos1",
Inst.Push 2,
Inst.Ann "pos2",
Inst.Push 3,
Inst.Add
] := by eq_refl
inductive ExpM where
| In: ExpF ExpM -> ExpM
| Mult: ExpM -> ExpM -> ExpM
def ExpM.mult := ExpM.Mult
def ExpM.const n := ExpM.In (.Const n)
def ExpM.add a b := ExpM.In (.Add a b)
def ExpM.cata {B} (f: ExpF B -> B) (mult: B -> B -> B): ExpM -> B
| .In (.Const n) => f (.Const n)
| .In (.Add n1 n2) => f (.Add (ExpM.cata f mult n1) (ExpM.cata f mult n2))
| .Mult a1 a2 => mult (a1.cata f mult) (a2.cata f mult)
def ExpM.eval := ExpM.cata ExpF.eval Nat.mul
def ExpM.compile := ExpM.cata ExpF.compile (fun a b => a ++ b ++ [.Mult])
def m523 := ExpM.mult (.const 5) (.add (.const 2) (.const 3))
example: m523.eval = 5 * (2 + 3) := by eq_refl
example: m523.compile = [
Inst.Push 5, Inst.Push 2, Inst.Push 3, Inst.Add, Inst.Mult
] := by eq_refl
This time I can reuse the repeated logic, but the cons are that I have to define the catamorphism every time and I have to manually prove the induction rules (recursors). (I think there could be some macro or elaboration helping with that.)
def Exp.ind.{u}
{P: Exp -> Sort u}
(const: forall n, P (.const n))
(add: forall a b, P a -> P b -> P (.add a b))
:
forall (e: Exp), P e
:= fun e => match e with
| .In (.Const a) => const a
| .In (.Add a b) => add a b
(Exp.ind const add a) (Exp.ind const add b)
def ExpM.ind.{u}
{P: ExpM -> Sort u}
(const: forall n, P (.const n))
(add: forall a b, P a -> P b -> P (.add a b))
(mult: forall a b, P a -> P b -> P (.mult a b))
: forall e, P e
| .const n => const n
| .add a b => add a b (a.ind const add mult) (b.ind const add mult)
| .mult a b => mult a b (a.ind const add mult) (b.ind const add mult)
Which one should I choose? I know Haskell is even doing something like
inductive ExpF (C A O: Type) where
| Const: C -> Nat -> ExpF C A O
| Add: A -> ExpF C A O -> ExpF C A O -> ExpF C A O
| Out: O -> ExpF C A O
abbrev Exp := ExpF Unit Unit Empty
def Exp.const (n: Nat): Exp := .Const Unit.unit n
def Exp.add (a b: Exp): Exp := .Add Unit.unit a b
def ExpF.compile {C A: Type}: ExpF C A Prog -> Prog
| .Const _ n => [.Push n]
| .Add _ a b => a.compile ++ b.compile ++ [.Add]
| .Out p => p
inductive ExpM where
| In: ExpF Unit Unit ExpM -> ExpM
| Mult: ExpM -> ExpM -> ExpM
Is this how you annotate the AST of lean? And is there any general suggestion on tree-decoration?
(I aslo asked this in the general channel, but no one has responded.)
Chris Bailey (Mar 05 2025 at 00:19):
how did you do that in lean compiler?
I'm not sure if the compiler specifically has a different representation, but the Lean.Syntax type just has the annotation data included in the constructors. The approach you take in the first example is sort of fundamentally different in that the annotation purports to be its own constructor.
Chris Bailey (Mar 05 2025 at 00:31):
Just eyeballing this, you could probably tackle some of it with a derive handler. If you have your heart set on this kind of representation, I think the non-metaprogramming approaches are things like recursion schemes (which do not work the way you would expect in Lean because of universes) or "tagless finally".
xiao (Mar 05 2025 at 00:58):
I'm not sure if the compiler specifically has a different representation, but the Lean.Syntax type just has the annotation data included in the constructors. The approach you take in the first example is sort of fundamentally different in that the annotation purports to be its own constructor.
I used the product functor to add the annotation to all nodes of the AST. If you think there could be some leaves without the annotation, you can use a coproduct functor to add the annotations optionally. Though I named it as Ann
, it is actually the in
morphism of the initial algebra.
Last updated: May 02 2025 at 03:31 UTC