Zulip Chat Archive

Stream: new members

Topic: References on the (non-kernel!) type theory of Lean?

Eduardo Ochs (Jun 24 2024 at 04:22):

Hi all...

Lean expressions. This data structure is used in the kernel and
elaborator. However, expressions sent to the kernel should not
contain metavariables.

Eduardo Ochs (Jun 24 2024 at 04:24):

and "The Type Theory of Lean" - Mario Carneiro's thesis - only describes the type theory of the kernel...

The "non-kernel" type theory of Lean has lots of other features - like metavariables, optional parameters, typeclasses, etc etc etc.

Where can I read more about how they are formalized?

Eduardo Ochs (Jun 24 2024 at 04:24):

I am trying to read the four main books about Lean - FPiL, TPiL, Metaprogramming in Lean4, and the Lean Manual - and there are many places in which the code only makes sense to me when I do "type inference by hand", by copying an expression on a sheet of paper, drawing underbraces under its subexpressions, and then writing the type of each subexpression...

The problem is that my algorithm and my notation for this "type inference by hand" are improvised, and very messy - and I think that if I try to read papers on, say, optional parameters, they will give me lots of useful ideas.

Thanks in advance!

Damiano Testa (Jun 24 2024 at 06:06):

Maybe I am misunderstanding what you are asking, but don't you get "current expected type information" by moving around the syntax with the cursors and looking at the infoview?

Henrik Böving (Jun 24 2024 at 06:37):

These algorithms are not fully formalized. All of them contain lots of heuristics that were added over time due to people having issues. Note that these algorithms are also continuously evolving, a paper that explained the implementation of definitional equality a year back would be outdated by now.

For the basic ideas you'll definitely find some papers floating around still. However if you really want to know what's going on, reading the Lean compiler itself is pretty much the only resource for the precise behavior of the Lean compiler.

All that being said I don't think there is a reason to understand these algorithms to get a feeling for how Lean does type inference. You just write more and more Lean code, look how it works and eventually get a feeling for how things work. For example if you were to read the tabled type class resolution paper that describes the underlying algorithm for type class synthesis, I don't think you'd get a better feeling for how to work with type classes, that comes from experience.

Eduardo Ochs (Jun 25 2024 at 13:46):

Hi @Damiano Testa,

I do that all the time, and sometimes I write several similar "#check"s one below the other to compare their results, like here:

set_option linter.unusedVariables false

def myfst : {A:Type} → {B:Type} → A×B → A := fun (a,b) => a

#eval myfst                          ("a",2)
#eval myfst (A := String) (B := Nat) ("a",2)
#eval myfst (A := String)            ("a",2)
#eval myfst               (B := Nat) ("a",2)


#check myfst
#check myfst (A := String)
#check myfst               (B := Nat)
#check myfst (A := String) (B := Nat)

#eval myfst                                                   ("a"     , 2    )
--    -------------------------------  -----------             --------  -----
--    : {A:Type} → {B:Type} → A×B → A  (A:=String)             : String  : Nat
--    --------------------------------------------  --------  -----------------
--    : {B:Type} → String×B → String                (B:=Nat)  : String×Nat
--    ------------------------------------------------------
--    : String×Nat → String
--    -------------------------------------------------------------------------
--    : String

Eduardo Ochs (Jun 25 2024 at 13:47):

Let me try to explain my question in a better way.

In the middle of the pandemic I tried to learn Haskell for the nth time - that time using Hutton's book, because a friend was TAing a course that used it. But I got stuck at the chapter with monads - I had the impression that the author expected that the readers would be able to do all the type checkings and type inferences by sight, and I couldn't do that, my mental buffer was too small...

Eduardo Ochs (Jun 25 2024 at 13:50):

I also tried to infer the types by drawing underbrace diagrams by hand. My diagrams were like the one below - they used several conventions that were quite bad, and I was not happy with them.

m1     >>= \ x1   -> m2     >>= \ x2   -> f                 x1   x2
------       ----    ------       ----    ----------------  ---- ----
:: m a       :: a    :: m b       :: b    :: a -> b -> m c  :: a :: b
                                          ----------------------
                                          :: b -> m c
                                          ---------------------------
                                          :: m c
                                -------------------------------------
                                :: b -> m c
                     ------------------------------------------------
                     :: m c
           ----------------------------------------------------------
           :: a -> m c
---------------------------------------------------------------------
:: m c

Henrik Böving (Jun 25 2024 at 13:57):

You can hover over Lean terms to figure out their type or put a sorry anywhere (it works for terms also, not only proofs) and if the type checker already has enough information to figure out the type of the term it expects at that point hovering the sorry will give you all of the type information as well.

Note that in particular with monads you do not necessarily need to understand the desugaring to >>= to work with them. You can purely work in do notation and just keep in mind that <- does some sort of extra thing depending on the moand you are in:

For IO it indicates that an interaction with the outside world might be happening at that point. In particular there might be an error being thrown that exits you early from the computation
For Option and Except you similarly have the none/error check with an early exit
For ReaderM you pass in an additional argument implicitly that the function can use as a context to derive information from, often used for implementing things like global program configurations
StateM both passes in some state and fetches the updated state from the computation to use it when it passes in a state for the next time.

The desugaring into >>= and pure is really more of an interesting thing than one that you need to know to use a language that does monadic programming. For example the Rust language has the ? operator which mimicks the Option and Except monad but the majority of Rustaceans has no clue what a monad is at all.

If you want to learn about the details of monadic programming in Lean I suggest reading https://lean-lang.org/functional_programming_in_lean/monads.html chapter 5-7.

Number Eighteen (Jun 25 2024 at 14:04):

In regards to meta-variables, elaboration etc. the Meta-programming book explains the mechanisms. However, you do need to understand how monads work, so if you don't, Henrik's suggestion is the best place to start.

Eduardo Ochs (Jun 25 2024 at 14:05):

Thanks for the pointer to sorry! I was using only the underbrace, I'll try the tricks with sorry soon...

Eduardo Ochs (Jun 25 2024 at 14:09):

More on the long story: a few days ago I announced - in "Animating type inferences" - that I had found a nice way to draw animated underbrace diagrams for type inferences by starting with a a diagram in ascii art like the ones above. There's some information about that in this section of my page on Lean - but I need to update that to show how to make the tool use \textsf{...} by default.

Henrik Böving (Jun 25 2024 at 14:15):

Number Eighteen said:

In regards to meta-variables, elaboration etc. the Meta-programming book explains the mechanisms. However, you do need to understand how monads work, so if you don't, Henrik's suggestion is the best place to start.

The meta programming book doesn't really explain how the underlying algorihtms work in detail, it gives a slightly verbose API reference with examples and for a few things also some general explanations for the strategy of some algorithms.

Eduardo Ochs (Jun 25 2024 at 14:17):

I found the chapters on monads of FPiL and the material on the metaprogramming book very hard, and that may be because that they have "missing diagrams" in this sense (ArXiV)...

Henrik Böving (Jun 25 2024 at 14:20):

There is also no reason to read or know any category theory to understand monads as they are used in functional programming. I would not expect a beginner explanation of FP monads to ever have a category theory diagram unless it is explicitly targeting people that know category theory already.

Henrik Böving (Jun 25 2024 at 14:22):

And I would expect even less category theory related material in the meta programming book. I would say that many if not the majority of Lean meta programmers have no or very little clue about category theory.

Eduardo Ochs (Jun 25 2024 at 14:23):

...so "drawing the missing diagrams" - even if they look weird at first - looks like an activity that will bring good karma. And I thought that reading the right source code comments on type inference could be useful, because they would mention how some things are done internally in several steps, and would show better notations for things in which I'm still improvising.

Eduardo Ochs (Jun 25 2024 at 14:25):

@Henrik Böving, that paper started from problems in Category Theory, but most of its conventions for drawing "missing diagrams" can be adapted to other contexts...

Eduardo Ochs (Jun 25 2024 at 14:30):

Btw, the Monad map in the wiki looks like a treasure trove of good ideas, but it doesn't have examples... I am trying to write some of the examples that it omits - and leaves as exercises...

Mark Fischer (Jun 25 2024 at 16:01):

Eduardo Ochs said:

In the middle of the pandemic I tried to learn Haskell for the nth time - that time using Hutton's book, because a friend was TAing a course that used it.

The pedagogy around this interesting.

Given the current place I am in my learning, my feeling is somewhat that because they're reliably automated, Type Checking and Type Inference are systems for which one needs to understand what's happening but which you're rarely going to need to do it yourself.

Perhaps it's kind of like sorting a list. There's a lot of algorithms that do this, but it's only the edge cases where the one in use really matters. Most of the time, the important bit is understanding the result. In fact, before learning your first sorting algorithm you probably already need an understanding of which permutations of a list are ordered or not.

Though we'll all forever implement our own sorting algorithms because they're good learning tools for other purposes.

Is Type Inference like this? I'm not really sure. When I write expressions — especially ones where type classes/elaboration/macros are at work, I feel like I'm guided by heuristics and then rely on the compiler to catch the inevitable shortcomings of that approach.

Hard to say though!

Last updated: May 02 2025 at 03:31 UTC