Zulip Chat Archive
Stream: new members
Topic: Terms vs expressions
Malcolm Langfield (May 26 2022 at 10:21):
I'm slightly confused by terminology while reading Theorem Proving in Lean 4. The authors refer to some things as terms and some things as expressions, and I'm not sure if these are synonyms or if there is an informal or formal distinction. Can anyone shed some light on this?
In the lean4
source, there is a namespace Term
, but unless I missed it, there doesn't appear to be a type called Term
. There is, however, an inductive type called Expr
.
There is a related question here: https://cs.stackexchange.com/questions/64699/are-the-words-expression-and-term-interchangeable-in-programming-language-th
But I'm looking for a lean-specific answer if possible.
Thanks!
Arthur Paulino (May 26 2022 at 11:28):
I think of a "term" in the context of a given "type". For instance, 0
is a term of type Nat
Expr
is an inductive type used to implement more user-defined terms and types in Lean 4. Maybe this link can help you :+1:
Malcolm Langfield (May 26 2022 at 11:37):
Thanks for the link! Does it make sense to ask which is more general? In the metaprogramming book, it says
Expressions (terms of type Expr)
which makes it sound like all expressions are terms. But there is also
In Lean, terms and types are represented by expressions.
which makes it sound like all terms are expressions.
Henrik Böving (May 26 2022 at 11:43):
You have to differentiate between the meta level and the language itself.
In Lean we talk about something being a term as soon as it has a type, so 0
is a term of type Nat
as explained above.
In order to perform meta programming we require a representation of all terms as some Lean data type, we call this data type Expr
. So what we can say is that 0
is a term of type Nat
that is in the meta world represented by a term of type Expr
(more specifically by an Expr.lit
). This means that on the meta level all lean terms have a representation as a term of type Expr
that meta programs can operate on.
Arthur Paulino (May 26 2022 at 11:43):
Good questions. I think there's a confusion regarding the abstract ideas from type theory and their implementations in Lean
Arthur Paulino (May 26 2022 at 11:56):
Maybe it's better to actually stay away from metaprogramming concepts in the beginning because it may create more confusion.
This example may clarify a few things:
Nat
is a type0
is a term of typeNat
0 + 0
is an expression that, when reduced, yields the term0
of typeNat
We sometimes take the mental shortcut of talking about an expression as if it were already reduced because some expressions are just too trivial for our brains
Malcolm Langfield (May 26 2022 at 13:11):
Thanks @Henrik Böving @Arthur Paulino , this distinction between the meta level and the "ordinary" level is helpful to know about. Is it correct to say that a term is a single identifier or literal? That they're atomic, in syntax terms?
Henrik Böving (May 26 2022 at 13:13):
No, everything that has a type counts as terms, that can be function applications, even types themselves (since in Lean types have types and are thus just terms of other types as well) so a function type Nat -> Nat -> Nat
is a term as well etc.
Maybe as a general rule of thumb, if you can #check
it it is a term.
Patrick Johnson (May 26 2022 at 13:25):
Henrik Böving said:
This means that on the meta level all lean terms have a representation as a term of type
Expr
that meta programs can operate on.
I don't think that is true. There are countably many meta-level representations of Lean terms, but there are object-level types that have uncountably many terms (such as Nat -> Bool
), so not all of them can be represented as Expr
terms.
Henrik Böving (May 26 2022 at 13:33):
But in the end all terms of typeNat -> Bool
will be represented as an identifier, a lambda abstraction, a function application a let abstraction or a combination of them wont they? And that's what Expr
(mostly) is.
You can also see that the compiler thinks that all terms can be represented by Expr
because docs4#Lean.Elab.Term.elabTerm returns an Expr
in the term elaboration monad.
Patrick Johnson (May 26 2022 at 13:49):
Henrik Böving said:
But in the end all terms of type
Nat -> Bool
will be represented as an identifier, a lambda abstraction, a function application a let abstraction or a combination of them wont they?
Yes, but that's not enough to represent all terms of type Nat -> Bool
, we can only represent a countable subset of them.
Arthur Paulino (May 26 2022 at 13:51):
I'm not a mathematician so this is not obvious to me:
There are countably many meta-level representations of Lean terms
But it's already out of scope of the thread. Can someone split this thread in two, please? It got me curious :thinking:
Malcolm Langfield (May 26 2022 at 14:29):
Is e.g. #print
itself an expression? What about keywords like def
, or theorem
?
Henrik Böving (May 26 2022 at 14:41):
#print
def
and theorem
is what we call a command not an expression, they dont carry any value themselves they are used to communicate to the Lean compiler that it is supposed to perform a certain action (in the case of #print
that is printing something, in the case of def
and theorem
that is registering a new valid identifier to the Environment
)
Martin C. Martin (May 26 2022 at 16:55):
Let me see if I understand. "term" is similar to "instance" in object oriented programing: SS0
is a term (instance) of type Nat
. Also, the type of all real numbers (i.e. the set of all reals) has uncountably infinitely many terms (elements), even though only countably many are expressible in Lean. Is that right? So "term" is closer to a mathematical object (e.g. a real number) than what people in logic or parsers call a term.
Arthur Paulino (May 26 2022 at 17:05):
It can get a bit uncomfortable because Nat
is a type, yes, but it's also a term of type Type
. That's why I said in the beginning that I think of "term" in the context of a given "type"
Arthur Paulino (May 26 2022 at 17:23):
I said earlier that 0
is a term of type Nat
because I got used to Lean's default way to infer it. But 0
can also be inferred as Int
. And 0 : Nat
and 0 : Int
are different things in type theory, although you can coerce a term of a type to a term of another type.
So in short, "0
is a term of type Nat
" is not an absolute fact. There is a 0
of type Nat
Mario Carneiro (May 26 2022 at 17:41):
A better way to put it is that Nat.zero
is a Nat
, and when you write the literal 0
it elaborates to (some wrappers around) Nat.zero
if you say you want 0
to be treated as an element of Nat
. If you say 0 : Int
then you get Int.zero
instead and this is a different term with a different type. The underlying type theory is all talking about expressions post-elaboration.
Mario Carneiro (May 26 2022 at 17:47):
Martin C. Martin said:
Let me see if I understand. "term" is similar to "instance" in object oriented programing:
SS0
is a term (instance) of typeNat
. Also, the type of all real numbers (i.e. the set of all reals) has uncountably infinitely many terms (elements), even though only countably many are expressible in Lean. Is that right? So "term" is closer to a mathematical object (e.g. a real number) than what people in logic or parsers call a term.
When you are proving theorems or constructing functions, you generally want to have the "semantic" view, so if you have def foo (x: Real) := ...
then x
is imagined to range over all real numbers. This is the easiest way to make sense of the theorem that asserts that Real
is uncountable. This is only an interpretation though, there isn't anything in the system itself that demands this perspective, and you can also consistently imagine x
to range over all definable real numbers instead (of which there are only countably many, although the system can't prove this on pain of contradiction).
The term "term" is generally used to connote the more syntactic view of things. There x
is just a variable with name "x"
that the typechecker knows has type Real
in this context. This is the level that tactics generally work at, because they can look at the expression of the goal and reassociate things or do other syntactic manipulations that would not be possible with the semantic view.
Malcolm Langfield (May 27 2022 at 12:54):
From a more pedagogical point of view, how do we want new users (like me) to talk about things? I have seen, in various parts of TPIL, the Lean 4 manual, and people's project READMEs, all of the words term
, expression
, identifier
, name
, object
, variable
. It also doesn't help that in typical usage, a ton of things you're talking about are also types, so that word gets thrown around a lot as well. If you're writing a README or docs for a project? What's best? For example, when you're talking about something that can be #check
ed, we could call that an expression or a term, if I understood the discussion above correctly. But expression
is going to confuse programmers coming from a more run-of-the-mill language a lot less than term
. That's not necessarily a definitive reason to use expression
unilaterally. Just something to keep in mind.
Henrik Böving (May 27 2022 at 13:13):
My proposal to this issue would be:
- a term is anything that has a type, they can be recognized because they can be
#check
ed - a type is in fact a term on its own since types have a type themselves, this is one of the fundamental distinctions between Lean and more popular programming languages, the term and the type level are not separate, all types are also terms that can be operated on, you can even compute a type with a function dynamically. What exactly a type is beyond being a term is hard to pin down precisely, an explanation I've seen working well for beginners is to handwave it and say "its like a set" although this is of course a lie (that tends to work out well for a while), otherwise we could be doing just set theory
- expression should be disconnected from term and be used a Lean meta programming term (pun intended) we probably don't want to tell new comers right away
- an identifier is a name that has a type attached, identifiers are one of the many kinds of terms.
- object can be seen synonymous with term, we could get rid off it but mathematicians like to talk about mathematical objects and for OOP programmers it is nice to have some familiar name
- variables are placeholders, like in regular mathematics
Malcolm Langfield (May 27 2022 at 13:50):
@Henrik Böving Would you make a distinction between a variable and an identifier?
Henrik Böving (May 27 2022 at 14:32):
I'd say that all variables are identifiers, but not all identifiers are variables, e.g.
def natId (x : Nat) : Nat := x
x
is an identifier (and variable) with type Nat
and Nat
is an identifier (and type) with type Type
, you would however agree that Nat
is clearly not a variable here.
Yaël Dillies (May 27 2022 at 14:34):
Can't we say that a variable is a bound identifier?
Henrik Böving (May 27 2022 at 14:36):
Not necessarily, Lean has a concept of free (not bound) variables as well which are visible to the user in the tactic state (and of course also meta variables but those are a different thing newcomers shouldn't be concerned with)
Tomas Skrivan (May 27 2022 at 16:55):
If you work with macros/parser there you have ident
(identifiers). I think for example custom attributes are parsed with ident
and they do not have a type. I think an identifier is just a string of characters made out of specific subset of unicode characters, like excluding white space or '+', '-' ...
Max Glick (May 28 2022 at 01:01):
As a new person who craves a certain amount of "okay but what is really going on here" information, I am enjoying looking through the manual. It has a definition of expression in Section 3.2.
Malcolm Langfield (May 28 2022 at 03:49):
Thanks, this is very helpful!
Last updated: Dec 20 2023 at 11:08 UTC