Zulip Chat Archive

Stream: lean4

Topic: Lean 4 under the hood


view this post on Zulip Mario Carneiro (Apr 12 2021 at 05:03):

This will be a list of questions that came up while trying to write lean 4 tactics, macros and other backend things. The intent is to record that I had the question and what the answer was in hopes of improving documentation in the future.

To parse 1D array literals, you need:

syntax "#[" sepBy(term, ", ") "]" : term
macro_rules  | `(#[ $elems,* ]) => ...

Q: What is the analogue for 2D array literals (nested sepBy)?

syntax "#[" sepBy(sepBy(term, ", "), "; ") "]" : term
macro_rules | `(#[ ??? ]) => ...

The rust syntax would be $($elems,*);* but that doesn't work.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 05:07):

Q: How do you evaluate a syntax quotation?

#eval `(#[1, 2; 3, 4])

doesn't work because it doesn't know what monad it's working in.

import Lean
open Lean Elab Command
#print CommandElabM
#eval show CommandElabM Syntax from do `(#[1, 2; 3, 4])

doesn't work because CommandElabM doesn't implement MetaEval, whatever that means. What monads should I be using in #eval?

view this post on Zulip Mario Carneiro (Apr 12 2021 at 05:11):

A: UseTermElabM:

import Lean
open Lean Elab
#eval show TermElabM Syntax from do `(#[1, 2; 3, 4])
-- (term#[_,;] "#[" [[(numLit "1") "," (numLit "2")] ";" [(numLit "3") "," (numLit "4")]] "]")

The only monads that can be used in #eval are those implementing MetaEval, namely TermElabM and MetaM, and those implementing Eval, namely IO and pure expressions.

view this post on Zulip Lim, Thing-han (Apr 12 2021 at 05:43):

I just found out that using quote also works for evaluate a syntax quotation~

view this post on Zulip Mario Carneiro (Apr 12 2021 at 05:56):

quote works but it does something else, I think. `(1+1) captures the Syntax corresponding to 1+1, i.e. the AST of that expression, while quote (1+1) will return a syntax that represents the numeral 2

view this post on Zulip Mario Carneiro (Apr 12 2021 at 05:57):

(it's the equivalent of lean 3 reflect, while `(..) is the equivalent of lean 3 ``(..))

view this post on Zulip Sebastian Ullrich (Apr 12 2021 at 07:46):

Mario Carneiro said:

The rust syntax would be $($elems,*);* but that doesn't work.

We use $[...] for that purpose because we already use $(e) for antiquotations with non-atomic terms.

macro_rules
  | `(#[ $[$elemss,*];* ]) => _

I've just written a journal chapter on our antiquotations, which I should copy to the docs when I find time...
For many (undocumented) examples, see also https://github.com/leanprover/lean4/blob/master/tests/lean/StxQuot.lean, which shows another way to run syntax quotations: Unhygienic.run, which as the name says can break hygiene because it does not depend on a name generator state.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 10:46):

How do you do ambiguity resolution in a macro?

syntax "#[" sepBy(sepBy(term, ", "), "; ") "]" : term

macro_rules
  | `(#[ $[$elems,*];* ]) => `(())

#eval #[1, 2]
-- ambiguous, possible interpretations
--   Unit.unit
--
--   #[1, 2]

Is there a priority system, or a way to throwError in the macro_rules so that it doesn't get picked?

view this post on Zulip Sebastian Ullrich (Apr 12 2021 at 11:49):

Yes to both: https://github.com/leanprover/lean4/blob/master/tests/lean/macroPrio.lean and Macro.throwUnsupported

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:24):

Nice to see you thinking about this @Mario Carneiro . Julia and Numpy and Matlab have written tons of arguments on how to make A * B polymorphic and mathematically consistent. It turns out that Householder linear algebra notation can be quite tricky to get right because of historical and inconsistent definitions of what a vector means.
In Julia, we found that making #[1, 2, 3] a RowVector and #[ 1 2 3] a column vector a neat solution to preserve the appropriate mathematical mappings when doing linear algebra operations.
This 20 min talk by Dr. Jiahao Chen is a good summary of why it was hard to talk about and how it's been solved in Julia:
https://www.youtube.com/watch?v=C2RO34b_oPM

Concretely: distinguishing between a row vector and a column vector allowed us to have A * B when are different row or column vectors to have the appropriate dimentions or become a scalar when the dot product is applied. Lean should consider using that as well.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:25):

I think that the #[1 2 3] syntax would be a bit tough in lean since that means application

view this post on Zulip Yakov Pechersky (Apr 12 2021 at 15:25):

You can use type synonyms instead in Lean

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:26):

but lean 4 parsing is pretty much magic so I wouldn't consider anything impossible, if you really want to make it work

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:27):

In Lean 3 mathlib we have vector which is "1-dimensional" and you can apply row or col to make it "2-dimensional" in the sense that you get a matrix of size m x 1 or 1 x n.

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:27):

Yeah, it's your call on syntax, but its the distinction that matters.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:28):

Oh yeah that wouldn't be too bad, you just write #[1, 2, 3].row and #[1, 2, 3].col

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:28):

Oh that's interesting @Johan Commelin .
Do you get a scalar from multiplying a 1 x n vector by an appropriate n x 1?

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:29):

You would get a 1x1 matrix

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:29):

Like Matlab!

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:29):

I don't think you should want to get around that, if you use strict types

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:30):

If row and col vectors are considered separate types distinct from matrices then you could get that to evaluate to a scalar

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:30):

We have dotproduct : vector -> vector -> scalar

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:31):

This is the neat comparison chart at the proper time-stamp
https://youtu.be/C2RO34b_oPM?t=913

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:31):

That too. Is it necessary to distinguish row and col vectors?

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:32):

I don't know that having a seriously overloaded * operator is worth it

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:32):

I've always found the distinction artificial. But you certainly want functions to turn them into matrices.

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:32):

I'll need to think about how Lean does it.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:33):

well lean doesn't do it right now so you can do what you like

view this post on Zulip Yakov Pechersky (Apr 12 2021 at 15:34):

I proposed a mathlib coercion of (1 x 1) matrices into scalars, but that was shut down

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:34):

In general, in mathlib we use a lot more projection notation like A.mul B which gives some more flexibility to pick the right operation

view this post on Zulip Yakov Pechersky (Apr 12 2021 at 15:34):

really a x b where unique a and unique b

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:35):

@Mario Carneiro the super overloaded * in Julia is about half the superpower of the language:

julia> methods(*)
# 331 methods for generic function "*":
[1] *(x::T, y::T) where T<:Union{Int128, UInt128} in Base at int.jl:908
[2] *(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} in Base at int.jl:88
[3] *(s1::Union{AbstractChar, AbstractString}, ss::Union{AbstractChar, AbstractString}...) in Base at strings/basic.jl:260
[4] *(c::Union{UInt16, UInt32, UInt64, UInt8}, x::BigInt) in Base.GMP at gmp.jl:539
[5] *(c::Union{Int16, Int32, Int64, Int8}, x::BigInt) in Base.GMP at gmp.jl:541
...

[327] *(x::Float16, y::Float16) in Base at float.jl:330
[328] *(x::Number) in Base at operators.jl:516
[329] *(x::T, y::T) where T<:Number in Base at promotion.jl:397
[330] *(x::Number, y::Number) in Base at promotion.jl:322
[331] *(a, b, c, xs...) in Base at operators.jl:560

view this post on Zulip Yakov Pechersky (Apr 12 2021 at 15:35):

What's very powerful about Lean matrices is that indexing types need not be numeric types.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:36):

How the hell do you know what anything means?

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:36):

that looks like a nightmare

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:36):

This getting a polymorphic * with type class resolution dispatching to the appropriate structured matrix multiplication operations is an algorithmic must for nueric linear algebra.

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:36):

What do you mean know what anything means?

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:36):

which of the 331 functions gets called when you write x * y

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:37):

julia is dynamically typed (or something) so it isn't even that easy to find out the types of x and y

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:37):

Same as in primary school:
Multiply 2 ints, apply the integer multiplication:

x = 3
y = 4
x * y == 12

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:38):

okay, so which function is getting called there

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:38):

even with the type signatures it's not that obvious to me

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:39):

maybe it's 330? 331?

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:39):

julia> @which 3 * 4
*(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} in Base at int.jl:88

The multiplication that applies to all concrete Integer types.

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:39):

It's like 5 characters to find out which specific method is being called.

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:39):

How many instance of has_mul does mathlib have? :rofl:

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:40):

well I wouldn't say mathlib is a shining beacon of hope here either

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:41):

And user defined functions are easy to instrospect as well:

julia> foo(x::Int, y::Int) = x * y
foo (generic function with 1 method)

julia> foo(2, 3)
6

julia> @which foo(2,3)
foo(x::Int64, y::Int64) in Main at REPL[7]:1

view this post on Zulip Johan Commelin (Apr 12 2021 at 15:41):

At least we have widgets :thinking:

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:41):

remember when application turned out to be a binary operator on nat?

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:42):

Here, foo(x::Int, y::Int) means: When you call foo and the first argument x is an Int, and the second argument y is an Int, do x * y

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:43):

and it's trivial to add new methods for foo that dispatch on different types:

julia> foo(x::String, y::String) = "$x and $y"
foo (generic function with 2 methods)

julia> foo("a", "b")
"a and b"

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:44):

And now foo has 2 specific methods:

julia> methods(foo)
# 2 methods for generic function "foo":
[1] foo(x::Int64, y::Int64) in Main at REPL[7]:1
[2] foo(x::String, y::String) in Main at REPL[10]:1

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:44):

I'm not a big fan of operator overloading, but lean has the ability to do these things

view this post on Zulip Anne Baanen (Apr 12 2021 at 15:44):

Mario Carneiro said:

julia is dynamically typed (or something) so it isn't even that easy to find out the types of x and y

Not to derail the discussion too much, but I find this a weird argument. If we define a type with a couple of constructors and a function pattern-matching on those:

inductive some_enum : Type
| a : some_enum
| b : some_enum
| c : some_enum
| d : some_enum
| e : some_enum

open some_enum

def foo : some_enum  some_enum  
| a b := 1
| c d := 37
| ...

would you complain it's hard to figure out which alternative of foo gets called?

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:44):

We call it multiple dispatch in Julia-land, I think Lean people have it as type-class based resolution.

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:45):

Oh hey @Anne Baanen , long time no see! :wave:

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:46):

@Anne Baanen My issue is more around "whole program type inference" and things of that nature. I'm talking about the types, not the values. In that example foo isn't getting called so I'm not sure exactly how the parallel goes

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:47):

FWIW, we don't do "whole program type inference" - Julia uses type inference for optimizations, but only at the function level. If it's not inside a function, we bail the analysis and run the program dynamically.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:48):

Right, in julia it's more like dynamic typing + function overloading

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:48):

which means that by looking at the code you don't know what the types of values are (sometimes you do, but you don't have to) and hence what functions are getting called

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:49):

maybe I'm just jaded from C++

view this post on Zulip Anne Baanen (Apr 12 2021 at 15:49):

Mario Carneiro said:

Anne Baanen My issue is more around "whole program type inference" and things of that nature. I'm talking about the types, not the values. In that example foo isn't getting called so I'm not sure exactly how the parallel goes

Not sure I understand your distinction. Is it a type-level issue now?

def bar (x y : some_enum) : fin (foo x y) := sorry

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:50):

In that example, there is only one function foo getting called, so I can look it up and see the pattern match to find out how the values x and y get treated

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:51):

Yeah PTSD from C++ is understandable :sweat_smile: , no worries.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:51):

If foo was a generic function parameterized by a typeclass, I would have to look to the caller to find out what function is getting run

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:51):

this can get quite involved

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:52):

I've been tracing through lean 4 functions and the higher order arguments make your head spin, it's hard to find where the work gets done

view this post on Zulip Miguel Raz Guzmán Macedo (Apr 12 2021 at 15:53):

fwiw in Julia we can always do @edit foo(3,4) and it will open up your editor to the specific line that implements that method.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:54):

You could kinda sorta do that with lean typeclasses if you #check the term and then look up the relevant typeclass

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:54):

I don't think there is a unique place to look though, there could be lots of typeclasses involved in general

view this post on Zulip Anne Baanen (Apr 12 2021 at 15:55):

So this is really an issue with higher-order functions, not generic functions? Because as soon as you have first-class closures, you have arbitrary gotos.

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:55):

well, there are higher order functions but mostly as continuations, they aren't terrible

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:55):

The generics are tough though

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:56):

because go to definition just isn't sufficient anymore

view this post on Zulip Mario Carneiro (Apr 12 2021 at 15:57):

The other thing, which is lean specific, is the connection between syntax -> macro -> elab, these are all scattered about and it's not obvious how the links are even connected to each other, let alone how to follow the link

view this post on Zulip Sebastian Ullrich (May 02 2021 at 10:32):

Sebastian Ullrich said:

I've just written a journal chapter on our antiquotations, which I should copy to the docs when I find time...

Here is the new section (4.1) in the meantime: https://arxiv.org/pdf/2001.10490.pdf


Last updated: May 07 2021 at 13:21 UTC