Zulip Chat Archive
Stream: lean4
Topic: Overloading for non-number syntax?
David Thrane Christiansen (May 09 2022 at 19:13):
While Lean features pervasive overloading for numbers, there doesn't seem to be anything similar for list literals. Notation.lean
contains a macro_rules
that expands list literal syntax directly into invocations of List.cons
and List.nil
.
Similarly, string literals are not overloaded so far as I can see.
What is the thought process behind determining which syntactic literals should admit overloading? I've had good luck with overloaded list in Idris for various kinds of indexed list types (e.g. environments for an interpreter that are indexed by the types of the values found in them). Overloaded strings are useful in contexts where there are embedded external strings, but this seems less compelling, because these contexts are likely to be under the control of a macro anyway.
In any case, it would be nice to know why the system is as it is.
Thanks!
Henrik Böving (May 09 2022 at 19:59):
Technically speaking there is absolutely nothing stopping you from writing your own elaborator that define the semantics of list literals and build that semantic in a way that allows overloading, all syntax is overridable.
Henrik Böving (May 09 2022 at 20:04):
macro_rules
| `([ $elems,* ]) => `(@List.nil Nat)
#eval [] == [1,2,3,4]
yields true.
Leonardo de Moura (May 09 2022 at 20:06):
Lean 4 supports both ad-hoc overloading and typeclass-based overloading. TC-based polymorphism has many advantages, but some limitations too. For example: it forces the overloads to have a "similar shape". We also use expected type-based overloading where we use the expected type to provide the name of a namespace where we expect to have operations (e.g., the expected type is Foo ...
, we could check whether we have Foo.cons
and Foo.nil
and build the list literal using them).
Here is the [ ... ]
notation for Vector
inductive Vector (α : Type u) : Nat → Type u
| nil : Vector α 0
| cons : α → Vector α n → Vector α (n+1)
syntax (name := vector) "[" term,* "]" : term
macro_rules (kind := vector)
| `([ $elems,* ]) => do elems.getElems.foldrM (init := (← `(Vector.nil))) fun x xs => `(Vector.cons $x $xs)
def v : Vector Nat 3 := [1, 2, 3]
def xs : List Nat := [1, 2, 3]
One can also use scoped parsers and priorities, and make sure, for example, [ ... ]
is always a Vector
.
Henrik Böving (May 09 2022 at 20:09):
:O how does it figure out when to use the vector vs the regular list macro?
Leonardo de Moura (May 09 2022 at 20:12):
The parser creates a Choice
node for the overloaded notation. The elaborator tries to elaborate both of them. If only one succeeds, it uses it. If more than one, then it generates an "ambiguous" error. In the example above, if we add
def ys := [1, 2, 3]
we get the error
error: ambiguous, possible interpretations
Vector.cons 1 (Vector.cons 2 (Vector.cons 3 Vector.nil)) : Vector ?m.1531 (0 + 1 + 1 + 1)
[1, 2, 3] : List ?m.1575
Leonardo de Moura (May 09 2022 at 20:23):
Here is the scoped + priority approach for temporarily overriding notation
inductive Vector (α : Type u) : Nat → Type u
| nil : Vector α 0
| cons : α → Vector α n → Vector α (n+1)
namespace VectorNotation
scoped syntax (name := vector) (priority := high) "[" term,* "]" : term
macro_rules (kind := vector)
| `([ $elems,* ]) => do elems.getElems.foldrM (init := (← `(Vector.nil))) fun x xs => `(Vector.cons $x $xs)
end VectorNotation
open VectorNotation in
def v : Vector Nat 3 := [1, 2, 3]
def xs : List Nat := [1, 2, 3]
def ys := [1, 2, 3] -- ys is a List
open VectorNotation in
def zs := [1, 2, 3] -- zs is a Vector
#check zs
-- Vector Nat (0+1+1+1)
David Thrane Christiansen (May 11 2022 at 10:28):
Thanks, this is really helpful! This might be a good example for the manual - I think it very clearly illustrates something interesting and useful.
Have you encountered problems with nested elaborators with Choice
nodes leading to quadratic blow-ups of elaboration time?
David Thrane Christiansen (May 11 2022 at 10:36):
Here's an example of something that stays orange for a long time on my machine:
syntax (name := a) "S" term : term
syntax (name := b) "S" term : term
macro_rules (kind := a)
| `(S $e) => `(Nat.succ $e)
macro_rules (kind := b)
| `(S $e) => `(Nat.succ $e)
def foo := S S S S S S S S S S S S S S S S Nat.zero
Not only is elaboration very slow, but the quadratically-growing error message causes challenges for at least the Emacs mode.
I suppose the answer is akin to the old joke: "Doctor, it hurts when I raise my arm!" "Then don't do it!"
David Thrane Christiansen (May 11 2022 at 10:39):
(I could see something like this coming up if writing down a typing context using ordinary notation, as a snoc list that grows to the right, in a module that defines two variations of contexts (e.g. one for variables, one for the kinds of type variables))
Leonardo de Moura (May 11 2022 at 13:31):
Have you encountered problems with nested elaborators with Choice nodes leading to quadratic blow-ups of elaboration time?
Not really, but it is the least used form of overloading. The most used is typeclass-based overloading by far. In the "second" place, we have type-based overloading. Here is an example of type-based overloading: the a[i]
notation. Lean infers the type of a
, suppose it is C ...
, then it looks for a function named C.getOp
, and then expands the term to C.getOp (self := a) (idx := i)
.
We do use ad-hoc polymorphism in examples such as the typed interpreter example: https://github.com/leanprover/lean4/blob/master/doc/examples/interp.lean
where ::
is notation for List.cons
, Vector.cons
, and Env.con
. Everything is fast there because we do not build big "lists" using them.
The Choice
node also works well if all but one alternative fail quickly. That is, there is enough contextual information for quickly discarding alternatives. This is not the case in your example since the two different S
expand to the same thing.
There is also room for improving the performance of Choice
node elaboration by implementing more aggressive alternative pruning.
We can also address the big error messages and suppress irrelevant information. However, I see it as a very low priority right now, since users are not currently hitting this problem in real examples.
Last updated: Dec 20 2023 at 11:08 UTC