Zulip Chat Archive

Stream: general

Topic: Invoking a macro on a custom syntactic category


Andrej Bauer (Dec 13 2024 at 15:13):

Consider the following attempt at defining custom syntax of the form C ⊢ T where T will be a term and C comes from a custom syntactic category:

declare_syntax_cat cow
syntax "moo" : cow
syntax "eat" : cow

macro_rules
| `(cow| moo) => `(42)
| `(cow| eat) => `(13)

declare_syntax_cat barn
syntax cow " ⊢ " term : term

macro_rules
| `($c:cow  $t:term) => `(($c, $t))

In the last line, it complains that $c on the rhs has the wrong syntactic category (namely cow instead of term). How do i tell it to elaborate $c using the first macro?

Andrej Bauer (Dec 13 2024 at 15:25):

I came up with this, but is it an idiomatic solution?

declare_syntax_cat cow
syntax "moo" : cow
syntax "eat" : cow

macro_rules
| `(cow| moo) => `(42)
| `(cow| eat) => `(13)

declare_syntax_cat barn
syntax cow " ⊢ " term : term

syntax "mkCow" cow : term

macro_rules
| `($c:cow  $t:term) =>
  `(((mkCow $c), $t))

Shreyas Srinivas (Dec 13 2024 at 15:29):

the canonical way is to bring the syntax trees of category cow to term.

Andrej Bauer (Dec 13 2024 at 15:30):

How?

Shreyas Srinivas (Dec 13 2024 at 15:30):

you are already doing it right

Shreyas Srinivas (Dec 13 2024 at 15:30):

the MPIL book would suggest something slightly different like:

declare_syntax_cat cow
syntax "moo" : cow
syntax "eat" : cow

macro_rules
| `(cow| moo) => `(42)
| `(cow| eat) => `(13)

declare_syntax_cat barn
syntax cow " ⊢ " term : term

syntax "[cow| " cow "]" : term

macro_rules
| `($c:cow  $t:term) =>
  `(([cow| $c], $t))

Shreyas Srinivas (Dec 13 2024 at 15:30):

But this is just to spare oneself parsing agonies I think

Andrej Bauer (Dec 13 2024 at 15:33):

Yes, that makes sense. I am now hitting the following:

declare_syntax_cat cow
syntax "moo" : cow
syntax "eat" : cow

macro_rules
| `(cow| moo) => `(42)
| `(cow| eat) => `(13)

declare_syntax_cat barn
syntax cow " ⊢ " term : term

syntax "mkCow" cow : term

macro_rules
| `($c:cow  ($a:term + $b:term)) => `($c  $a)

It complains about the last line: "unexpected token '⊢'; expected ')' or '|'". Help!

Sebastian Ullrich (Dec 13 2024 at 15:34):

$c by itself is too greedy, it assumes it stands for the entire term and so there is nothing left to parse. Try $c:cow there too.

Sebastian Ullrich (Dec 13 2024 at 15:36):

But coming back to your original issue, macros should stay in the same category (we haven't integrated TSyntax far enough yet to make that a compile-time check). Depending on your specific use case, what you likely want to say is that the denotation of moo is 42, i.e. `(mkCow moo) => `(42) , which stays in term

Shreyas Srinivas (Dec 13 2024 at 15:40):

I checked and the $c:cow solution is already an error in the original example because it is trying to construct a tuple, and tuple syntax expects terms. It works in the third example though.

Andrej Bauer (Dec 13 2024 at 15:42):

I can tell you what I'd really like. I am trying to implement an internal language, and in particular sequents of the form x : X, y : Y ⊢ e, where e comes from some inductive type of expressions and x : X, y : Y is notation for an AssocList Lean.Name C (where C is some random type). So I defined a syntactic entity for contexts, and then for terms, like this:

  declare_syntax_cat fpentry
  syntax ident ":" term : fpentry

  declare_syntax_cat fpcontext
  syntax fpentry,* : fpcontext

  syntax "fpcontext|" fpcontext "]" : term

  macro_rules
  | `([fpcontext| $[$key:ident : $value:term],* ]) =>
    let key := key.map (fun x => Lean.quote x.getId)
    `([$[($key, $value)],*].toAssocList')

  declare_syntax_cat fpterm
  syntax ident : fpterm
  syntax "tt" : fpterm
  syntax "⟨" fpterm "," fpterm "⟩" : fpterm
  syntax "fst" fpterm : fpterm
  syntax "snd" fpterm : fpterm
  syntax term fpterm : fpterm

  syntax fpcontext "⊢ₑ" fpterm : term

  macro_rules
  | `($Γ:fpcontext ⊢ₑ $x:ident) => `(proj [fpcontext|$Γ:fpcontext] $(Lean.quote x.getId))
  | `($Γ:fpcontext ⊢ₑ  $a:fpterm, $b:fpterm ) => `(fp.lift ($Γ:fpcontext ⊢ₑ $a) ($Γ:fpcontext ⊢ₑ $b))
  | `($Γ:fpcontext ⊢ₑ fst $a:fpterm) => `(($Γ:fpcontext ⊢ₑ $a)  fp.fst _ _)
  | `($Γ:fpcontext ⊢ₑ snd $a:fpterm) => `(($Γ:fpcontext ⊢ₑ $a)  fp.snd _ _)
  | `($Γ:fpcontext ⊢ₑ $f:term $a:fpterm) => `(($Γ:fpcontext ⊢ₑ $a)  $f)

(Sorry, this is not a MWE). However, things are not working as expected. What's the right way?

Shreyas Srinivas (Dec 13 2024 at 15:44):

Give me a moment I will fix this in the live server

Andrej Bauer (Dec 13 2024 at 15:44):

When I try to use the syntax like this:

example {X Y : 𝒞} : X  Y  Y  X :=
      x : X, y : Y ⊢ₑ  y, x 

It says "elaboration function for '«term_⊢ₑ_»' has not been implemented
x:X, y:Y⊢ₑ⟨y,x⟩"

Andrej Bauer (Dec 13 2024 at 15:45):

Ah, I had a silly error, it's working now.

Shreyas Srinivas (Dec 13 2024 at 15:45):

Usually this means that the syntax category has not been lifted correctly to term

Andrej Bauer (Dec 13 2024 at 16:13):

Thanks for the help. I can now define morphisms in a category with chosen finite products like this:

 universe u v
 variable {𝒞 : Type u} [Category.{v, u} 𝒞] [fp : ChosenFiniteProducts 𝒞]

  /-- the twist morphism -/
  example {X Y : 𝒞} : X  Y  Y  X :=
    x : X, y : Y ⊢ₑ  y, x 

  /-- the diagonal -/
  example {X : 𝒞} : X  X  X :=
  x : X ⊢ₑ  x, x 

  /-- identity on the terminal -/
  example : 𝟙_ 𝒞  𝟙_ 𝒞 := ⊢ₑ tt

  /-- composition of morphisms -/
  example {X Y Z: 𝒞} (g : Y  Z) (f : X  Y): X  Z :=
    x : X ⊢ₑ [g] [f] x

  /-- right associator -/
  def assocRight (X Y Z : 𝒞) : (X  Y)  Z  X  (Y  Z) :=
  p : (X  Y)  Z ⊢ₑ fst (fst p), snd (fst p), snd p⟩⟩

  /-- left associator -/
  def assocLeft (X Y Z : 𝒞) : X  (Y  Z)  (X  Y)  Z :=
  p : X  (Y  Z) ⊢ₑ ⟨⟨fst p, fst (snd p), snd (snd p)

  /-- the associators are inverses -/
  example {X Y Z : 𝒞} : assocLeft X Y Z  assocRight X Y Z = 𝟙 _ := by
   simp [assocLeft, assocRight, proj, List.toAssocList']
   aesop_cat

Andrej Bauer (Dec 13 2024 at 16:13):

Coming next: λ-calculus and cartesian-closed-categories

Notification Bot (Dec 13 2024 at 16:14):

Andrej Bauer has marked this topic as resolved.

Shreyas Srinivas (Dec 13 2024 at 16:16):

I already did lambda calc a year ago. You might find the boilerplate useful for the syntax parts

Andrej Bauer (Dec 13 2024 at 16:16):

Aha, where do I see what you did?

Shreyas Srinivas (Dec 13 2024 at 16:19):

This should work hopefully

Shreyas Srinivas (Dec 13 2024 at 16:19):

It is a year too old. Lean has changed a lot since then. But it also shows how to write unexpanders

Shreyas Srinivas (Dec 13 2024 at 16:23):

One thing to note is that it might be easier to do this with notation

Shreyas Srinivas (Dec 13 2024 at 16:23):

Let's see if I can translate the above into notation

Andrej Bauer (Dec 13 2024 at 16:24):

What do unexpanders do?

Andrej Bauer (Dec 13 2024 at 16:26):

I am hoping to be able to handle bound variables properly one day. All Lean implementations I've seen so far use strings or Lean.Name for variables. And another thing that puzzles me is whether we could use u v as the syntax for application (rather than u · v).

Shreyas Srinivas (Dec 13 2024 at 16:35):

Unexpanders in this case are just a simpler form of delaborators

Shreyas Srinivas (Dec 13 2024 at 16:35):

https://leanprover-community.github.io/lean4-metaprogramming-book/extra/03_pretty-printing.html?highlight=Unexpanders#unexpanders

Shreyas Srinivas (Dec 13 2024 at 16:36):

Andrej Bauer said:

I am hoping to be able to handle bound variables properly one day. All Lean implementations I've seen so far use strings or Lean.Name for variables. And another thing that puzzles me is whether we could use u v as the syntax for application (rather than u · v).

There is also an ident syntax category to handle this naturally. I have used it here. I didn't do nameless binders because this formalisation was for a course (which used Coq).

Shreyas Srinivas (Dec 13 2024 at 16:37):

Andrej Bauer said:

I am hoping to be able to handle bound variables properly one day. All Lean implementations I've seen so far use strings or Lean.Name for variables. And another thing that puzzles me is whether we could use u v as the syntax for application (rather than u · v).

I don't see an obstacle to any syntax outside the term syntax category other than parsing issues.

Shreyas Srinivas (Dec 13 2024 at 16:39):

Naively replacing " \. " with " " throws invalid atom in syntax declarations.

Shreyas Srinivas (Dec 13 2024 at 16:40):

But lean also lets you write your own parser functions

Shreyas Srinivas (Dec 13 2024 at 16:41):

When I wrote the above I used this repo to learn the basics by example: https://github.com/arthurpaulino/FxyLang/blob/master/FxyLang/Implementation/Syntax.lean

Andrej Bauer (Dec 13 2024 at 16:52):

Thanks. I know about ident and I am using it, just like you. What needs to be done is proper error handling of undeclared variables. For example, in x : X, y : Y | 〈x, z〉 should say "unknown variable z". And we should use PHOAS to build expressions. In case you're interested, here is the current version of the code we're working on.

Shreyas Srinivas (Dec 13 2024 at 17:57):

There is a phoas example in lean's manual

Shreyas Srinivas (Dec 13 2024 at 17:58):

It is borrowed from cpdt if I remember correctly

Shreyas Srinivas (Dec 13 2024 at 19:25):

Andrej Bauer said:

Thanks. I know about ident and I am using it, just like you. What needs to be done is proper error handling of undeclared variables. For example, in x : X, y : Y | 〈x, z〉 should say "unknown variable z". And we should use PHOAS to build expressions. In case you're interested, here is the current version of the code we're working on.

Looks very interesting. I am currently commuting. I'll take a closer look when I am back

Notification Bot (Dec 13 2024 at 23:16):

Andrej Bauer has marked this topic as unresolved.

Andrej Bauer (Dec 13 2024 at 23:21):

If I can have your attention a bit more, I would greatly appreciate it. Consider the following syntax, which is meant to capture typing contexts like x : X , y : Y, z : Z:

  declare_syntax_cat fpentry
  syntax ident ":" term : fpentry

  declare_syntax_cat fpcontext
  syntax fpentry,* : fpcontext

I would like to implement a function that converts the above syntax to an AssocList Lean.Name Lean.Term. Here's my try:

  def mapify : Lean.TSyntax `fpcontext  Lean.AssocList Lean.Name Lean.Term
  | `(fpcontext| ) => .nil
  | `(fpcontext| $x:ident : $A:term, $Γ:fpentry,*) =>
      .cons x.getId A (mapify `(fpcontext| $Γ,* ))
  | _ => .nil

It doesn't like (fpcontext| $Γ:,* ) in the recursive call to mapify, it says "unexpected token ',*'; expected ')'" I tried a bunch of things and I always end up with some sort of an error like that. What's the Right Way to do this sort of thing?

Andrej Bauer (Dec 13 2024 at 23:34):

This works, if we ignore the silly .toList.toAssocList' thing:

  def mapify : Lean.TSyntax `fpcontext  Lean.AssocList Lean.Name Lean.Term
  | `(fpcontext| $[$xs:ident : $ts:term],*) =>
    (Array.zipWith xs ts (fun x t => (x.getId, t))).toList.toAssocList'
  | _ => .nil

And having a default case is also a bit annoying. I'll have to figure out what monad I want to be in.

Kyle Miller (Dec 14 2024 at 00:02):

There's a combination of two small errors, the first is it wants $Γ:fpentry,*, and the second is that syntax quotations are monadic. Since there's nothing that needs to be hygienic, it's fine to run it unhygienically:

  open Lean
  partial def mapify : Lean.TSyntax `fpcontext  Lean.AssocList Lean.Name Lean.Term
  | `(fpcontext| ) => .nil
  | `(fpcontext| $x:ident : $A:term, $Γ:fpentry,*) =>
      .cons x.getId A (mapify (Unhygienic.run `(fpcontext| $Γ:fpentry,* )))
  | _ => .nil

Kyle Miller (Dec 14 2024 at 00:03):

It also has to be partial unfortunately (without who knows how much more work) because it can't see that the arguments are getting smaller.

Kyle Miller (Dec 14 2024 at 00:04):

There's also a bug here in that it won't be able to handle the single-element case due to the fact that the second rule is matching a , after $A:term.

Given that, your zipWith solution seems cleanest to me.

Shreyas Srinivas (Dec 14 2024 at 00:09):

I have run into this monadicness of syntax quotations before and I think it was you who suggested unhygienic.run @Kyle.

Shreyas Srinivas (Dec 14 2024 at 00:13):

If I may say this, lean's quotation system is incredibly confusing to use because it doesn't lend itself to natural recursive definitions. I have made this same mistake many times before, and back then I thought it was because I was a beginner. I wish there was a cleaner way to handle quotations.

Andrej Bauer (Dec 14 2024 at 00:16):

We are all beginners in @Leonardo de Moura's eyes.

Kyle Miller (Dec 14 2024 at 00:20):

Maybe it's worth remembering that t:term,* yields an Array of terms, so you shouldn't expect to write recursive macro definitions as if it were a List of cons cells. It's more efficient doing a fold yourself.

If you're used to Scheme-style syntax-rules it can be a little surprising. (I was tripped up by this comma issue when making that Python-style list comprehension demo.)

Sebastian Ullrich (Dec 14 2024 at 00:26):

I wouldn't even argue from a performance PoV. There simply isn't a nice recursive representation that solves the separator issue so it's more natural to think in terms of repetition.

Kyle Miller (Dec 14 2024 at 00:29):

I almost edited to clarify that I meant "efficient in the number of cases you have to write" (though I also was thinking about the small performance impact of the splices needing to copy Arrays)

Andrej Bauer (Dec 14 2024 at 01:01):

I think I am getting the hang of it. I moved a lot of the computation to the elaboration phase, which results in much nicer terms, see current version


Last updated: May 02 2025 at 03:31 UTC