Zulip Chat Archive

Stream: lean4

Topic: Overloading for non-number syntax?

David Thrane Christiansen (May 09 2022 at 19:13):

While Lean features pervasive overloading for numbers, there doesn't seem to be anything similar for list literals. Notation.lean contains a macro_rules that expands list literal syntax directly into invocations of List.cons and List.nil.

Similarly, string literals are not overloaded so far as I can see.

What is the thought process behind determining which syntactic literals should admit overloading? I've had good luck with overloaded list in Idris for various kinds of indexed list types (e.g. environments for an interpreter that are indexed by the types of the values found in them). Overloaded strings are useful in contexts where there are embedded external strings, but this seems less compelling, because these contexts are likely to be under the control of a macro anyway.

In any case, it would be nice to know why the system is as it is.

Thanks!

Henrik Böving (May 09 2022 at 19:59):

Technically speaking there is absolutely nothing stopping you from writing your own elaborator that define the semantics of list literals and build that semantic in a way that allows overloading, all syntax is overridable.

Henrik Böving (May 09 2022 at 20:04):

macro_rules
  | `([ $elems,* ]) => `(@List.nil Nat)

#eval [] == [1,2,3,4]

yields true.

Leonardo de Moura (May 09 2022 at 20:06):

Lean 4 supports both ad-hoc overloading and typeclass-based overloading. TC-based polymorphism has many advantages, but some limitations too. For example: it forces the overloads to have a "similar shape". We also use expected type-based overloading where we use the expected type to provide the name of a namespace where we expect to have operations (e.g., the expected type is Foo ..., we could check whether we have Foo.cons and Foo.nil and build the list literal using them).
Here is the [ ... ] notation for Vector

inductive Vector (α : Type u) : Nat → Type u
  | nil : Vector α 0
  | cons : α → Vector α n → Vector α (n+1)

syntax (name := vector) "[" term,* "]"  : term

macro_rules (kind := vector)
  | `([ $elems,* ]) => do elems.getElems.foldrM (init := (← `(Vector.nil))) fun x xs => `(Vector.cons $x $xs)

def v : Vector Nat 3 := [1, 2, 3]

def xs : List Nat := [1, 2, 3]

One can also use scoped parsers and priorities, and make sure, for example, [ ... ] is always a Vector.

Henrik Böving (May 09 2022 at 20:09):

:O how does it figure out when to use the vector vs the regular list macro?

Leonardo de Moura (May 09 2022 at 20:12):

The parser creates a Choice node for the overloaded notation. The elaborator tries to elaborate both of them. If only one succeeds, it uses it. If more than one, then it generates an "ambiguous" error. In the example above, if we add

def ys := [1, 2, 3]

we get the error

 error: ambiguous, possible interpretations
  Vector.cons 1 (Vector.cons 2 (Vector.cons 3 Vector.nil)) : Vector ?m.1531 (0 + 1 + 1 + 1)

  [1, 2, 3] : List ?m.1575

Leonardo de Moura (May 09 2022 at 20:23):

Here is the scoped + priority approach for temporarily overriding notation

inductive Vector (α : Type u) : Nat → Type u
  | nil : Vector α 0
  | cons : α → Vector α n → Vector α (n+1)

namespace VectorNotation

scoped syntax (name := vector) (priority := high) "[" term,* "]"  : term

macro_rules (kind := vector)
  | `([ $elems,* ]) => do elems.getElems.foldrM (init := (← `(Vector.nil))) fun x xs => `(Vector.cons $x $xs)

end VectorNotation

open VectorNotation in
def v : Vector Nat 3 := [1, 2, 3]

def xs : List Nat := [1, 2, 3]

def ys := [1, 2, 3] -- ys is a List

open VectorNotation in
def zs := [1, 2, 3] -- zs is a Vector

#check zs
-- Vector Nat (0+1+1+1)

David Thrane Christiansen (May 11 2022 at 10:28):

Thanks, this is really helpful! This might be a good example for the manual - I think it very clearly illustrates something interesting and useful.

Have you encountered problems with nested elaborators with Choice nodes leading to quadratic blow-ups of elaboration time?

David Thrane Christiansen (May 11 2022 at 10:36):

Here's an example of something that stays orange for a long time on my machine:

syntax (name := a) "S" term : term
syntax (name := b) "S" term : term

macro_rules (kind := a)
 | `(S $e) => `(Nat.succ $e)

macro_rules (kind := b)
 | `(S $e) => `(Nat.succ $e)

def foo := S S S S S S S S S S S S S S S S Nat.zero

Not only is elaboration very slow, but the quadratically-growing error message causes challenges for at least the Emacs mode.

I suppose the answer is akin to the old joke: "Doctor, it hurts when I raise my arm!" "Then don't do it!"

David Thrane Christiansen (May 11 2022 at 10:39):

(I could see something like this coming up if writing down a typing context using ordinary notation, as a snoc list that grows to the right, in a module that defines two variations of contexts (e.g. one for variables, one for the kinds of type variables))

Leonardo de Moura (May 11 2022 at 13:31):

Have you encountered problems with nested elaborators with Choice nodes leading to quadratic blow-ups of elaboration time?

Not really, but it is the least used form of overloading. The most used is typeclass-based overloading by far. In the "second" place, we have type-based overloading. Here is an example of type-based overloading: the a[i] notation. Lean infers the type of a, suppose it is C ..., then it looks for a function named C.getOp, and then expands the term to C.getOp (self := a) (idx := i).
We do use ad-hoc polymorphism in examples such as the typed interpreter example: https://github.com/leanprover/lean4/blob/master/doc/examples/interp.lean
where :: is notation for List.cons, Vector.cons, and Env.con. Everything is fast there because we do not build big "lists" using them.
The Choice node also works well if all but one alternative fail quickly. That is, there is enough contextual information for quickly discarding alternatives. This is not the case in your example since the two different Sexpand to the same thing.
There is also room for improving the performance of Choice node elaboration by implementing more aggressive alternative pruning.
We can also address the big error messages and suppress irrelevant information. However, I see it as a very low priority right now, since users are not currently hitting this problem in real examples.

Last updated: May 02 2025 at 03:31 UTC