Zulip Chat Archive

Stream: lean4

Topic: Customizing the elaborator?


Joey Eremondi (Sep 14 2023 at 21:54):

I'm wondering, is there a way to customize the lean elaborator?

Henrik Böving (Sep 14 2023 at 21:58):

There is dozens of ways. What exactly do you want to do? @Joey Eremondi

David Renshaw (Sep 14 2023 at 21:59):

A good starting point might be: https://github.com/leanprover-community/lean4-metaprogramming-book/blob/master/md/main/07_elaboration.md

Joey Eremondi (Sep 14 2023 at 22:01):

I'm wondering, is there a way to customize the elaborator in lean?

I've seen some references in the docs to elaborators for type-aware macros. I'm wondering, is there a way to override existing forms without actually modifying the compiler? To have, for example, lambda or function application elaborate to something different?

I'm looking to do something like the CoqRETT plugin, where each datatype is modified to have additional constructors. Or like Cur, where core language forms can be redefined but you can still refer to the old version.

Mario Carneiro (Sep 14 2023 at 22:02):

yes

Patrick Massot (Sep 14 2023 at 22:02):

I'm afraid you will have to be more specific if you want a helpful answer.

Mario Carneiro (Sep 14 2023 at 22:04):

You seem to have asked the question expecting to get a "no" or "yes under these specific conditions", but the answer is more like "yes you can do almost every conceivable thing in this space" so you have to be more specific

Joey Eremondi (Sep 14 2023 at 22:32):

@Mario Carneiro @Patrick Massot @Henrik Böving Sorry, I submitted only the first line of the message by accident, I'm struggling with the mobile interface. Full message is https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Customizing.20the.20elaborator.3F/near/391024490

As an example, is there a way to re-define how lambda, application, and function types are elaborated in a specific file? For example, in the CoqRETT example I gave, they add exceptions to Coq by extending each datatype with an extra "error" constructor. Then all terms are elaborated into an Error monad. So an application f x is normally just elaborated into an application in the core language, but here instead it would be elaborated into do f' <- f ; x <- x; (f' x').

Or suppose I want to add a typecase feature to Lean, so I use a custom elaborator that turns all types into Codes of a Universe ala Tarski, and implicitly inserts calls to the interpretation function for those codes at each point needed.

So the idea is I can write code using all the normal language core constructs, but I have a custom elaborator that it goes through to translate it into something different in the core. Basically I want to change the language without modifying the compiler, especially without modifying the core language, so I know whatever I get out the other end is sound, even if it's different from what the normal elaborator would produce.

I guess I'm expecting a "no" because the only language I know of that can do this is Cur, which is built on Racket macros and is ridiculously extensible. Coq can do it, but requires you to write a custom plugin. But I've heard that Lean's macros are powerful and take a lot from e.g. Scheme and Racket macros, so I thought there was a chance it would be possible.

Joey Eremondi (Sep 14 2023 at 22:33):

@David Renshaw Did not know that book existed, looks like the entire thing is probably useful to me. Thanks!

Mario Carneiro (Sep 14 2023 at 22:35):

again, the answer is yes

Joey Eremondi (Sep 14 2023 at 22:37):

@Mario Carneiro Cool. How? Are there docs?

Mario Carneiro (Sep 14 2023 at 22:37):

David Renshaw said:

A good starting point might be: https://github.com/leanprover-community/lean4-metaprogramming-book/blob/master/md/main/07_elaboration.md

Joey Eremondi (Sep 14 2023 at 22:42):

@Mario Carneiro Looking through that now. It seems to tell how to add new forms to the language, but not how to override existing forms, especially ones that don't have names (that I know of), like lambda and application.

Mario Carneiro (Sep 14 2023 at 22:42):

They all have names

Mario Carneiro (Sep 14 2023 at 22:42):

if you ctrl-click on lambda it should take you to the definition

Mario Carneiro (Sep 14 2023 at 22:43):

src4#Lean.Parser.Term.fun

Mario Carneiro (Sep 14 2023 at 22:43):

to override an existing form you use elab_rules instead of elab

Joey Eremondi (Sep 14 2023 at 22:49):

Ahh, okay. Thanks!

Yaël Dillies (Mar 21 2024 at 20:12):

@Mario Carneiro said:

to override an existing form you use elab_rules instead of elab

Could you expand? My naïve attempts at overriding an existing elab failed.

Mario Carneiro (Mar 21 2024 at 20:15):

failed how?

Yaël Dillies (Mar 21 2024 at 20:16):

The old elab is still used with the notation

Mario Carneiro (Mar 21 2024 at 20:16):

there are additional issues that can come up when you override an elab, just like overloaded notation

Mario Carneiro (Mar 21 2024 at 20:16):

the first elab needs to return throwUnsupportedSyntax I think

Mario Carneiro (Mar 21 2024 at 20:16):

otherwise it will be used

Yaël Dillies (Mar 21 2024 at 20:17):

The first elab is

syntax (name := _root_.setBuilder) "{" extBinder " | " term "}" : term

macro_rules (kind := _root_.setBuilder)
  | `({ $x:ident | $p }) => `(setOf fun $x:ident  $p)
  | `({ $x:ident : $t | $p }) => `(setOf fun $x:ident : $t  $p)
  | `({ $x:ident $b:binderPred | $p }) =>
    `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)

Yaël Dillies (Mar 21 2024 at 20:17):

Wait wait, but it's not going to return throwUnsupportedSyntax in the cases it parses

Mario Carneiro (Mar 21 2024 at 20:17):

it's possible that the second elab would also be tried if the first one fails in another way, but elabs often don't "fail", they produce errors and continue with sorry

Mario Carneiro (Mar 21 2024 at 20:18):

also, this isn't an elab at all, it's a macro

Mario Carneiro (Mar 21 2024 at 20:18):

I'm not sure if macro/elab overloading works

Yaël Dillies (Mar 21 2024 at 20:18):

Sorry, I have no idea how most of this works

Yaël Dillies (Mar 21 2024 at 20:20):

Let me explain what's going on. I want {x | p x} to be parsed as a set or a finset depending on the expected type. The trouble is that the above macro_rules is defined very early on (in Mathlib.Init.Set) where Finset is not yet available.

Yaël Dillies (Mar 21 2024 at 20:21):

So I need one elab/macro/whatever (is there a generic name here?) to parse the {x | p x} notation as a Set until Finset is defined, then another one to parse the {x | p x} notation polymorphically.

Timo Carlin-Burns (Mar 21 2024 at 20:23):

Alternatively, you could introduce SetOf and Filter typeclasses and have an elab in Mathlib.Init.Set which works for any target type with those instances

Yaël Dillies (Mar 21 2024 at 20:24):

That sounds like useless overhead in an easy case that Lean should be able to handle

Mario Carneiro (Mar 21 2024 at 20:30):

You will probably have to have the earlier elaborator set up a hook which can be used by the later one

Yaël Dillies (Mar 21 2024 at 20:30):

How would that work?

Mario Carneiro (Mar 21 2024 at 20:32):

actually no, I'm sure this should work

Mario Carneiro (Mar 21 2024 at 20:32):

What if setOf is a macro notation and you overload that?

Mario Carneiro (Mar 21 2024 at 20:33):

do you have a MWE version of this code for testing?

Yaël Dillies (Mar 21 2024 at 20:34):

Let me try to put something together

Mario Carneiro (Mar 21 2024 at 20:37):

I guess there is some mismatch anyway because Finset doesn't use setOf

Yaël Dillies (Mar 21 2024 at 20:40):

Is this minimal enough to you? I don't trust my minimisation skills around elaboration any further.

import Mathlib.Data.Fintype.Basic

open Std.ExtendedBinder Finset Lean Meta Elab Term Level Qq

syntax "[" extBinder " | " term "]" : term

macro_rules
  | `([ $x:ident | $p ]) => `(setOf fun $x:ident  $p)
  | `([ $x:ident : $t | $p ]) => `(setOf fun $x:ident : $t  $p)
  | `([ $x:ident $b:binderPred | $p ]) =>
    `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)

def mkSetOf (pred : TSyntax `term) (expectedType : Expr) : TermElabM Expr := do
  match_expr expectedType with
  | Finset _ => elabTerm ( `(filter $pred univ)) expectedType
  | _ => elabTerm ( `(setOf $pred)) expectedType

elab_rules : term <= expectedType
  | `([ $x:ident | $p ]) => do mkSetOf ( `(fun $x:ident  $p)) expectedType
  | `([ $x:ident : $t | $p ]) => do mkSetOf ( `(fun $x:ident : $t  $p)) expectedType
  | `([ $x:ident  $s | $p ]) => do
    let desugared 
      match_expr expectedType with
      | Finset _ => `(filter (fun $x:ident  $p) $s)
      | Set _ => `(setOf fun $x:ident => $x  $s  $p)
      | _ =>
        match_expr  whnfR ( inferType ( elabTerm s none)) with
        | Finset _ => `(filter (fun $x:ident  $p) $s)
        | _ => `(setOf fun $x:ident => $x  $s  $p)
    elabTerm desugared expectedType
  | `([ $x:ident $b:binderPred | $p ]) => do
    elabTerm ( `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)) expectedType

variable {α : Type*} [Fintype α] {s : Finset α} {p : α  Prop} [DecidablePred p]

#check [x | p x]                  -- works
#check ([x | p x] : Set _)        -- works
#check ([x | p x] : Finset _)     -- fails
#check [x  s | p x]              -- works
#check ([x  s | p x] : Finset _) -- fails
#check ([x  s | p x] : Set _)    -- works
#check [x : α | p x]              -- works
#check ([x : α | p x] : Set _)    -- works
#check ([x : α | p x] : Finset _) -- fails

Mario Carneiro (Mar 21 2024 at 20:43):

I made this based on your code for the finset stuff

import Lean.Parser.Term
import Std.Util.ExtendedBinder

open Lean Elab Term Meta Std.ExtendedBinder

set_option autoImplicit true

def Set (α : Type u) := α  Prop
def setOf {α : Type u} (p : α  Prop) : Set α := p
instance : Membership α (Set α) := fun a s => s a

def Finset (α : Type u) := List α
instance : Membership α (Finset α) := List.Mem
def Finset.filter {α : Type u} (s : Finset α) (p : α  Prop) [DecidablePred p] : Finset α :=
  List.filter p s

syntax "{" extBinder " | " term "}" : term

macro_rules
  | `({ $x:ident | $p }) => `(setOf fun $x:ident  $p)
  | `({ $x:ident : $t | $p }) => `(setOf fun $x:ident : $t  $p)
  | `({ $x:ident $b:binderPred | $p }) =>
    `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)

syntax "[" extBinder " | " term "]" : term

macro_rules
  | `([ $x:ident | $p ]) => `(setOf fun $x:ident  $p)
  | `([ $x:ident : $t | $p ]) => `(setOf fun $x:ident : $t  $p)
  | `([ $x:ident $b:binderPred | $p ]) =>
    `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)

def mkSetOf (pred : TSyntax `term) (expectedType : Expr) : TermElabM Expr := do
  match_expr expectedType with
  | Finset _ => elabTerm ( `(filter $pred univ)) expectedType
  | _ => elabTerm ( `(setOf $pred)) expectedType

elab_rules : term <= expectedType
  | `([ $x:ident | $p ]) => do mkSetOf ( `(fun $x:ident  $p)) expectedType
  | `([ $x:ident : $t | $p ]) => do mkSetOf ( `(fun $x:ident : $t  $p)) expectedType
  | `([ $x:ident  $s | $p ]) => do
    let desugared 
      match_expr expectedType with
      | Finset _ => `(filter (fun $x:ident  $p) $s)
      | Set _ => `(setOf fun $x:ident => $x  $s  $p)
      | _ =>
        match_expr  whnfR ( inferType ( elabTerm s none)) with
        | Finset _ => `(filter (fun $x:ident  $p) $s)
        | _ => `(setOf fun $x:ident => $x  $s  $p)
    elabTerm desugared expectedType
  | `([ $x:ident $b:binderPred | $p ]) => do
    elabTerm ( `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)) expectedType

#check fun (s : Set Nat) => {x  s | True} -- works
#check fun (s : Finset Nat) => [x  s | True] -- works (?)
#check fun (s : Finset Nat) => show Finset Nat from [x  s | True] -- fails

Mario Carneiro (Mar 21 2024 at 20:44):

The behavior in the second case seems a bit odd though, should it produce a Finset?

Yaël Dillies (Mar 21 2024 at 20:45):

It should produce a finset, yes

Mario Carneiro (Mar 21 2024 at 21:03):

Here's a working version of the MWE:

import Lean.Parser.Term
import Std.Util.ExtendedBinder

open Lean Elab Term Meta Std.ExtendedBinder

set_option autoImplicit true

def Set (α : Type u) := α  Prop
def setOf {α : Type u} (p : α  Prop) : Set α := p
instance : Membership α (Set α) := fun a s => s a

syntax "{" extBinder " | " term "}" : term

syntax "maybe_filter%(" ident " ∈ " term " | " term ")" : term
elab_rules : term
  | `(maybe_filter%($x  $s | $p)) => do
    elabTerm ( `(setOf fun $x:ident  $x  $s  $p)) none

macro_rules
  | `({ $x:ident | $p }) => `(setOf fun $x:ident  $p)
  | `({ $x:ident : $t | $p }) => `(setOf fun $x:ident : $t  $p)
  | `({ $x:ident  $s:term | $p }) => `(maybe_filter%($x  $s | $p))
  | `({ $x:ident $b:binderPred | $p }) =>
    `(setOf fun $x:ident  satisfies_binder_pred% $x $b  $p)

def Finset (α : Type u) := List α
instance : Membership α (Finset α) := List.Mem
def Finset.filter {α : Type u} (p : α  Prop) [DecidablePred p] (s : Finset α) : Finset α :=
  List.filter p s

elab_rules : term <= expectedType
  | `(maybe_filter%($x  $s | $p)) => do
    match_expr expectedType with
    | Finset _ => pure ()
    | Set _ => throwUnsupportedSyntax
    | _ =>
      let ty  try whnfR ( inferType ( elabTerm s none)) catch _ => throwUnsupportedSyntax
      match_expr ty with
      | Finset _ => pure ()
      | _ => throwUnsupportedSyntax
    elabTerm ( `(Finset.filter (fun $x:ident  $p) $s)) expectedType

#check fun (s : Set Nat) => {x  s | True}
#check fun (s : Finset Nat) => {x  s | True}
#check fun (s : Finset Nat) => show Finset Nat from {x  s | True}

Yaël Dillies (Mar 21 2024 at 21:05):

Why does your overloading work but not mine? :thinking:

Mario Carneiro (Mar 21 2024 at 21:08):

I confirmed that (and you can too, by editing this code) that mixed elab/macro overloading doesn't work

Mario Carneiro (Mar 21 2024 at 21:08):

I think macro/macro also doesn't work (except for what I said about macros specifically only partially implementing the underlying syntax)

Yaël Dillies (Mar 21 2024 at 21:08):

But if I changed the macro_rules into an elab_rules then overloading should work?

Yaël Dillies (Mar 21 2024 at 21:09):

Do you agree this would be better style than having each macro calling a specifically-defined elab just so that it's overridable?

Mario Carneiro (Mar 21 2024 at 21:10):

even here, you will note that although there are two elab_rules one of them is careful to throwUnsupportedSyntax so it's not using the generic notation overloading system

Yaël Dillies (Mar 21 2024 at 21:12):

Okay but I would have expected that to matter only in the first declared elab_rules?

Mario Carneiro (Mar 21 2024 at 21:12):

if it didn't do that, the second one would always take priority

Yaël Dillies (Mar 21 2024 at 21:12):

Yeah that's fine, right?

Mario Carneiro (Mar 21 2024 at 21:12):

I think that's what your version was doing

Yaël Dillies (Mar 21 2024 at 21:12):

In my version the problem was that the first one was always taking priority

Mario Carneiro (Mar 21 2024 at 21:13):

that's because you used macro/elab overloading

Yaël Dillies (Mar 21 2024 at 21:14):

Okay, so that message was correct? :point_down:
Yaël Dillies said:

But if I changed the macro_rules into an elab_rules then overloading should work?

Mario Carneiro (Mar 21 2024 at 21:14):

"work" meaning that the second one would take precedence

Yaël Dillies (Mar 21 2024 at 21:14):

Yep, great, that's what I wanted to hear

Yaël Dillies (Mar 21 2024 at 21:14):

Sorry I swear I am not always trying to squeeze knowledge out of you. It's just that docs fail me more often than they should.

Mario Carneiro (Mar 21 2024 at 21:15):

docs? what docs

Mario Carneiro (Mar 21 2024 at 21:15):

I learned all this the hard way

Yaël Dillies (Mar 21 2024 at 21:15):

I learned all this the "Ask Mario" way

Mario Carneiro (Mar 21 2024 at 21:16):

I tried that, it doesn't seem to work so well for me

Yaël Dillies (Mar 21 2024 at 21:17):

I have literally seen no mention of overloading anywhere whatsoever, and even less mentions of the macro/macro vs macro/elab vs elab/macro overloading differences.

Mario Carneiro (Mar 21 2024 at 21:17):

I had no idea and just tried all of them and observed that it doesn't seem to work

Yaël Dillies (Mar 21 2024 at 21:18):

Let me add a subchapter to the metaprogramming book real quick

Mario Carneiro (Mar 21 2024 at 21:18):

There is some support in the code for "choice nodes" but this is when the syntax itself is overloaded, e.g. if mathlib defines have and one of the cases is also a case that could be handled by the upstream have

Mario Carneiro (Mar 21 2024 at 21:19):

I'm not sure if there is any other way to trigger the type-based overloading behavior (that we usually try to avoid)

Yaël Dillies (Mar 22 2024 at 10:37):

Thanks to all your help, I'm having a lot of fun right now :smile:

Joachim Breitner (Mar 22 2024 at 14:46):

I'm looking forward to that chapter, this is all currently mysterious to me too :-)

Siddhartha Gadgil (Mar 23 2024 at 16:27):

What happens if we set a higher priority to the same syntax but with a new name and have the elaborator we want? I seem to remember trying this (and it worked) but am not sure.

Mario Carneiro (Mar 24 2024 at 06:00):

that would completely override the original syntax, there will be no fallback to the original behavior

Mario Carneiro (Mar 24 2024 at 06:01):

a drawback is that the syntax kind will be different so things that are expecting the original syntax kind won't trigger once you have overridden it (e.g. linters)

Mac Malone (Apr 07 2024 at 20:51):

I have literally seen no mention of overloading anywhere whatsoever, and even less mentions of the macro/macro vs macro/elab vs elab/macro overloading differences.

To chime in a bit here, overloading a macro with an elab (elab_rules) does not work because macros are expanded prior to running the elaborators.

Robert Maxton (Apr 08 2024 at 08:27):

Yaël Dillies said:

I have literally seen no mention of overloading anywhere whatsoever, and even less mentions of the macro/macro vs macro/elab vs elab/macro overloading differences.

There's a bit in the Lean Manual: https://lean-lang.org/lean4/doc/macro_overview.html#syntax-expansions-with-macro_rules-and-how-it-desugars

But yeah, there's nothing at all about elab; I guess you're supposed to infer that macro vs macro_rules work basically the same way as elab and elab_rules? Though I still had to dig into the Lean source to find the elab_rules : _ <= ty? syntax for an expected type...


Last updated: May 02 2025 at 03:31 UTC