Zulip Chat Archive

Stream: new members

Topic: Terms vs expressions


Malcolm Langfield (May 26 2022 at 10:21):

I'm slightly confused by terminology while reading Theorem Proving in Lean 4. The authors refer to some things as terms and some things as expressions, and I'm not sure if these are synonyms or if there is an informal or formal distinction. Can anyone shed some light on this?

In the lean4 source, there is a namespace Term, but unless I missed it, there doesn't appear to be a type called Term. There is, however, an inductive type called Expr.

There is a related question here: https://cs.stackexchange.com/questions/64699/are-the-words-expression-and-term-interchangeable-in-programming-language-th

But I'm looking for a lean-specific answer if possible.

Thanks!

Arthur Paulino (May 26 2022 at 11:28):

I think of a "term" in the context of a given "type". For instance, 0 is a term of type Nat

Expr is an inductive type used to implement more user-defined terms and types in Lean 4. Maybe this link can help you :+1:

Malcolm Langfield (May 26 2022 at 11:37):

Thanks for the link! Does it make sense to ask which is more general? In the metaprogramming book, it says

Expressions (terms of type Expr)

which makes it sound like all expressions are terms. But there is also

In Lean, terms and types are represented by expressions.

which makes it sound like all terms are expressions.

Henrik Böving (May 26 2022 at 11:43):

You have to differentiate between the meta level and the language itself.

In Lean we talk about something being a term as soon as it has a type, so 0 is a term of type Nat as explained above.

In order to perform meta programming we require a representation of all terms as some Lean data type, we call this data type Expr. So what we can say is that 0 is a term of type Nat that is in the meta world represented by a term of type Expr (more specifically by an Expr.lit). This means that on the meta level all lean terms have a representation as a term of type Expr that meta programs can operate on.

Arthur Paulino (May 26 2022 at 11:43):

Good questions. I think there's a confusion regarding the abstract ideas from type theory and their implementations in Lean

Arthur Paulino (May 26 2022 at 11:56):

Maybe it's better to actually stay away from metaprogramming concepts in the beginning because it may create more confusion.

This example may clarify a few things:

  • Nat is a type
  • 0 is a term of type Nat
  • 0 + 0 is an expression that, when reduced, yields the term 0 of type Nat

We sometimes take the mental shortcut of talking about an expression as if it were already reduced because some expressions are just too trivial for our brains

Malcolm Langfield (May 26 2022 at 13:11):

Thanks @Henrik Böving @Arthur Paulino , this distinction between the meta level and the "ordinary" level is helpful to know about. Is it correct to say that a term is a single identifier or literal? That they're atomic, in syntax terms?

Henrik Böving (May 26 2022 at 13:13):

No, everything that has a type counts as terms, that can be function applications, even types themselves (since in Lean types have types and are thus just terms of other types as well) so a function type Nat -> Nat -> Nat is a term as well etc.

Maybe as a general rule of thumb, if you can #check it it is a term.

Patrick Johnson (May 26 2022 at 13:25):

Henrik Böving said:

This means that on the meta level all lean terms have a representation as a term of type Expr that meta programs can operate on.

I don't think that is true. There are countably many meta-level representations of Lean terms, but there are object-level types that have uncountably many terms (such as Nat -> Bool), so not all of them can be represented as Expr terms.

Henrik Böving (May 26 2022 at 13:33):

But in the end all terms of typeNat -> Bool will be represented as an identifier, a lambda abstraction, a function application a let abstraction or a combination of them wont they? And that's what Expr (mostly) is.

You can also see that the compiler thinks that all terms can be represented by Expr because docs4#Lean.Elab.Term.elabTerm returns an Expr in the term elaboration monad.

Patrick Johnson (May 26 2022 at 13:49):

Henrik Böving said:

But in the end all terms of type Nat -> Bool will be represented as an identifier, a lambda abstraction, a function application a let abstraction or a combination of them wont they?

Yes, but that's not enough to represent all terms of type Nat -> Bool, we can only represent a countable subset of them.

Arthur Paulino (May 26 2022 at 13:51):

I'm not a mathematician so this is not obvious to me:

There are countably many meta-level representations of Lean terms

But it's already out of scope of the thread. Can someone split this thread in two, please? It got me curious :thinking:

Malcolm Langfield (May 26 2022 at 14:29):

Is e.g. #print itself an expression? What about keywords like def, or theorem?

Henrik Böving (May 26 2022 at 14:41):

#print def and theorem is what we call a command not an expression, they dont carry any value themselves they are used to communicate to the Lean compiler that it is supposed to perform a certain action (in the case of #print that is printing something, in the case of def and theorem that is registering a new valid identifier to the Environment)

Martin C. Martin (May 26 2022 at 16:55):

Let me see if I understand. "term" is similar to "instance" in object oriented programing: SS0 is a term (instance) of type Nat. Also, the type of all real numbers (i.e. the set of all reals) has uncountably infinitely many terms (elements), even though only countably many are expressible in Lean. Is that right? So "term" is closer to a mathematical object (e.g. a real number) than what people in logic or parsers call a term.

Arthur Paulino (May 26 2022 at 17:05):

It can get a bit uncomfortable because Nat is a type, yes, but it's also a term of type Type. That's why I said in the beginning that I think of "term" in the context of a given "type"

Arthur Paulino (May 26 2022 at 17:23):

I said earlier that 0 is a term of type Nat because I got used to Lean's default way to infer it. But 0 can also be inferred as Int. And 0 : Nat and 0 : Int are different things in type theory, although you can coerce a term of a type to a term of another type.

So in short, "0 is a term of type Nat" is not an absolute fact. There is a 0 of type Nat

Mario Carneiro (May 26 2022 at 17:41):

A better way to put it is that Nat.zero is a Nat, and when you write the literal 0 it elaborates to (some wrappers around) Nat.zero if you say you want 0 to be treated as an element of Nat. If you say 0 : Int then you get Int.zero instead and this is a different term with a different type. The underlying type theory is all talking about expressions post-elaboration.

Mario Carneiro (May 26 2022 at 17:47):

Martin C. Martin said:

Let me see if I understand. "term" is similar to "instance" in object oriented programing: SS0 is a term (instance) of type Nat. Also, the type of all real numbers (i.e. the set of all reals) has uncountably infinitely many terms (elements), even though only countably many are expressible in Lean. Is that right? So "term" is closer to a mathematical object (e.g. a real number) than what people in logic or parsers call a term.

When you are proving theorems or constructing functions, you generally want to have the "semantic" view, so if you have def foo (x: Real) := ... then x is imagined to range over all real numbers. This is the easiest way to make sense of the theorem that asserts that Real is uncountable. This is only an interpretation though, there isn't anything in the system itself that demands this perspective, and you can also consistently imagine x to range over all definable real numbers instead (of which there are only countably many, although the system can't prove this on pain of contradiction).

The term "term" is generally used to connote the more syntactic view of things. There x is just a variable with name "x" that the typechecker knows has type Real in this context. This is the level that tactics generally work at, because they can look at the expression of the goal and reassociate things or do other syntactic manipulations that would not be possible with the semantic view.

Malcolm Langfield (May 27 2022 at 12:54):

From a more pedagogical point of view, how do we want new users (like me) to talk about things? I have seen, in various parts of TPIL, the Lean 4 manual, and people's project READMEs, all of the words term, expression, identifier, name, object, variable. It also doesn't help that in typical usage, a ton of things you're talking about are also types, so that word gets thrown around a lot as well. If you're writing a README or docs for a project? What's best? For example, when you're talking about something that can be #checked, we could call that an expression or a term, if I understood the discussion above correctly. But expression is going to confuse programmers coming from a more run-of-the-mill language a lot less than term. That's not necessarily a definitive reason to use expression unilaterally. Just something to keep in mind.

Henrik Böving (May 27 2022 at 13:13):

My proposal to this issue would be:

  • a term is anything that has a type, they can be recognized because they can be #checked
  • a type is in fact a term on its own since types have a type themselves, this is one of the fundamental distinctions between Lean and more popular programming languages, the term and the type level are not separate, all types are also terms that can be operated on, you can even compute a type with a function dynamically. What exactly a type is beyond being a term is hard to pin down precisely, an explanation I've seen working well for beginners is to handwave it and say "its like a set" although this is of course a lie (that tends to work out well for a while), otherwise we could be doing just set theory
  • expression should be disconnected from term and be used a Lean meta programming term (pun intended) we probably don't want to tell new comers right away
  • an identifier is a name that has a type attached, identifiers are one of the many kinds of terms.
  • object can be seen synonymous with term, we could get rid off it but mathematicians like to talk about mathematical objects and for OOP programmers it is nice to have some familiar name
  • variables are placeholders, like in regular mathematics

Malcolm Langfield (May 27 2022 at 13:50):

@Henrik Böving Would you make a distinction between a variable and an identifier?

Henrik Böving (May 27 2022 at 14:32):

I'd say that all variables are identifiers, but not all identifiers are variables, e.g.

def natId (x : Nat) : Nat := x

x is an identifier (and variable) with type Natand Nat is an identifier (and type) with type Type, you would however agree that Nat is clearly not a variable here.

Yaël Dillies (May 27 2022 at 14:34):

Can't we say that a variable is a bound identifier?

Henrik Böving (May 27 2022 at 14:36):

Not necessarily, Lean has a concept of free (not bound) variables as well which are visible to the user in the tactic state (and of course also meta variables but those are a different thing newcomers shouldn't be concerned with)

Tomas Skrivan (May 27 2022 at 16:55):

If you work with macros/parser there you have ident(identifiers). I think for example custom attributes are parsed with ident and they do not have a type. I think an identifier is just a string of characters made out of specific subset of unicode characters, like excluding white space or '+', '-' ...

Max Glick (May 28 2022 at 01:01):

As a new person who craves a certain amount of "okay but what is really going on here" information, I am enjoying looking through the manual. It has a definition of expression in Section 3.2.

Malcolm Langfield (May 28 2022 at 03:49):

Thanks, this is very helpful!


Last updated: Dec 20 2023 at 11:08 UTC