Zulip Chat Archive

Stream: new members

Topic: proofs as rewrites but how do numbers appear?


Mr Proof (Jun 26 2024 at 02:41):

Another newbie question.

I'm trying to understand how terms are constructed to prove something. My understanding is that it is actually working in reverse. e.g. to prove x=x+y-y. You might start with (pseudocode):

rfl: x=x
nthrewrite(rfl, lemma1:(x=x+0), 2) : x=x+0
nthrewrite(nthrewrite(rfl, lemma1:(x=x+0), 2), lemma2:(0=y-y), 2) : x=y-y

Therefor my "term" which I constructed with the type x=y-y is made of a nested set of rewrite rules.

Presumably the nthrewrite is one of the fundamental Type theory rules. But something bothering me is that it involves counting the nth match. I guess that's fine but we are using this theory to define numbers in the first place so the notation is a bit odd.

Am I correct in that the "terms" we construct for our proof consist of a lot of these rewrite rules plus function operations?
https://image.slidesharecdn.com/bekkimineshimaesslli2016-dependenttypes-200410041311/75/ESSLLI2016-DTS-Lecture-Day-1-From-natural-deduction-to-dependent-type-theory-5-2048.jpg

well i suppose nth-rewrite is just a shorthand for:

P:A->Prop ,   C:P(a)  , B:a=b
------------------------------------
          rw{P,b} : P(b)

but in a more compact form

Henrik Böving (Jun 26 2024 at 09:27):

You can do both forward and backward proof with Lean. Nthrewrite does not exist as a primitive. What kind of proof terms you construct depends entirely on the proof and tactics involved, there are vastly different techniques. And the last rule that you wrote down is very similar to the J rule which is indeed part of Leans type theory through the eliminator for the Eq type.

Mark Fischer (Jun 26 2024 at 10:53):

"it is a remarkable fact that it is possible to construct a substantial edifice of mathematics based on nothing more than the type universes, dependent arrow types, and inductive types; everything else follows from those." - TPiL4

Here's how I understand it at the moment:
From the constructive, computational point of view - terms must encode how these three things are constructed and deconstructed. All the more complicated terms are just built on these.

Henrik Böving (Jun 26 2024 at 11:31):

This is not really something that's inherently connected to constructivism, if you assume the existence of a constant that gives you the axiom of choice the above claim still holds but you are not constructive

Mark Fischer (Jun 26 2024 at 14:28):

It feels like considerations that take the form 'my "term" [...] is made of' might break down a bit once you use the axiom of choice, but I suppose no more than it does when making a term within a function abstraction where you've assumed the existence of a term of some Type.

My mental model of "what a term is made of," is a bunch of things representing — actual and assumed — nested inductives and dependent arrows. That's probably not quite right either because it seems like from a type-checking point of view, terms aren't made of anything they just participate in well-formed expressions or something like that.

Mr Proof (Jun 26 2024 at 17:41):

I'm looking at one of the proofs from "The Natural Number Game" (yes you could just use rfl on this)

def ONE:=Nat.succ Nat.zero
def TWO:=Nat.succ ONE

def proofA: Nat.succ TWO = Nat.zero.succ.succ.succ := by
rw [TWO]
rw [ONE]

#print proofA

And the output of the proof is:

def proofA : TWO.succ = Nat.zero.succ.succ.succ :=
Eq.mpr (id (congrArg (fun _a => _a.succ = Nat.zero.succ.succ.succ) TWO.eq_1))
  (Eq.mpr (id (congrArg (fun _a => _a.succ.succ = Nat.zero.succ.succ.succ) ONE.eq_1)) (Eq.refl Nat.zero.succ.succ.succ))

So it looks like the rewrite rules are written like for replacing P(b) with P(c) given b=c:

Eq.mpr (id (congrArg (fun _a => P(_a) ) c))

which unfortunately would be a bit of a long way to write it if you needed to just substitute one variable in a really big expression because first you'd have to write your expression in the form P(a)

I would like to find somewhere a list of the keywords in Lean, which are not made by induction.

Henrik Böving (Jun 26 2024 at 17:51):

I would like to find somewhere a list of the keywords in Lean, which are not made by induction.

What do you mean with made by induction?

Mr Proof (Jun 26 2024 at 18:01):

Henrik Böving said:

I would like to find somewhere a list of the keywords in Lean, which are not made by induction.

What do you mean with made by induction?

I mean made in Lean from other keywords. e.g. Exists is made by induction but the Pi type is a keyword AFAIK. Just looking for the minimal set of Lean keywords. Every programming language has a basic set of keywords. I can't seem to find the list for Lean 4.

Henrik Böving (Jun 26 2024 at 18:04):

Lean 4 is an arbitrarily extensible programming language, there is no defined set of keywords. The built-in set of keywords is also very rich as quite literally every tactic is a keyword

Mr Proof (Jun 26 2024 at 18:07):

If there is no set of keywords then Lean 4 is not a programming language.

Henrik Böving (Jun 26 2024 at 18:08):

The set of keywords is arbitrarily extensible at elaboration time, there is no fix set of keywords. Yes Lean 4 is still a programming language.

Mr Proof (Jun 26 2024 at 18:13):

I don't know what you mean by "elaboration time". I'm just looking for the glossary of keywords that define the Lean 4 language. Otherwise Lean is not a programming language it is an arbitrary set of tools for creating proofs that can be checked by the Lean kernel. Which is fine but doesn't make it a language.

Henrik Böving (Jun 26 2024 at 18:17):

The process of translating the Lean language to its core lambda calculus is referred to as elaboration. Thanks to commands like macro, syntax, etc. a Lean file can define new kinds of keywords, syntax, etc. that can then be used in subsequent Lean files. This is also very much a thing in other programming languages like Racket and in mored limited forms also seen in languages like Agda, Coq, Isabelle or Haskell.

There is a set of built-in syntax that is provided by the compiler which can be extended arbitrarily by a Lean programmer (and the libraries they are using) through this mechanism. The list of syntax and keywords provided by the compiler is already very vast and I would estimate that, given the fact that all tactics are themselves effectively a keyword, there is quite literally hundreds of keywords in the base Lean language already.

Mr Proof (Jun 26 2024 at 18:18):

True, but all tactics create proofs made of keywords that can be checked by the Lean kernel. I am looking for those set of keywords not the tactics keywords.

Henrik Böving (Jun 26 2024 at 18:18):

And yes Lean 4 is still a language, it merely has an extensible parser that allows for this type of stuff, again this is a thing that is also seen both in other proof assistants and also other programming languages to varying degrees of freedom.

Henrik Böving (Jun 26 2024 at 18:18):

If you are looking for Lean's core lambda calculus that is docs#Expr. This is the stuff that tactics create under the hood

Henrik Böving (Jun 26 2024 at 18:19):

This is still far from a description of the basic mechanisms that Lean provdes though docs#Lean.Expr doesn't contain inductive type specifications and a ton of other stuff. It is also not something that anyone in Lean that is not doing meta programming really cares about.

Mark Fischer (Jun 26 2024 at 18:20):

Mr Proof said:

If there is no set of keywords then Lean 4 is not a programming language.

Depends on how you want to define things I suppose!

As I understand it, lean has a set mechanisms and APIs whereby syntax is parsed and processed and these API are open to users. So that the notion of a "built-in" syntax is perhaps hard to pin down. I don't really see this as precluding it from being a programming language though!

Mr Proof (Jun 26 2024 at 18:27):

From my point of view there's three languages. The first is what is output by #print, which is the language of the proofs themselves which I guess is fairly fixed by now(??). (And would that be called a language or a data-structure?) The second is the tactics. Which I guess is a language in it's own right, but one which is constantly changing and updating.
The third is the language to actually build the tactics (but I guess some of that is done in another language like C++?)

Sorry for being vague about what I was looking for. I'm more interested in Lean from a theoretical aspect, so I really am interested in the language that is output by #print. Which I guess must have keywords otherwise the kernel wouldn't be able to check it.

Mark Fischer (Jun 26 2024 at 18:31):

Init.Prelude

This is the first file in the Lean import hierarchy. It is responsible for setting
up basic definitions, most of which Lean already has "built in knowledge" about,
so it is important that they be set up in exactly this way
- link

This isn't syntax, per-say, but there's probably some way to dig into the things Lean has "built in knowledge" about.

Henrik Böving (Jun 26 2024 at 18:34):

Mr Proof said:

From my point of view there's three languages. The first is what is output by #print, which is the language of the proofs themselves which I guess is fairly fixed by now(??). (And would that be called a language or a data-structure?) The second is the tactics. Which I guess is a language in it's own right, but one which is constantly changing and updating.
The third is the language to actually build the tactics (but I guess some of that is done in another language like C++?)

Sorry for being vague about what I was looking for. I'm more interested in Lean from a theoretical aspect, so I really am interested in the language that is output by #print. Which I guess must have keywords otherwise the kernel wouldn't be able to check it.

#print shows a prettified version of docs#Expr, the other syntax that you are talking about is always arbitrarily extensible

Yakov Pechersky (Jun 26 2024 at 19:20):

It's quite important to note that #print prints a term. These are not keywords. rw the tactic helped construct a term that began with Eq.mpr. docs#Eq.mpr is further another term that is defined by something else.

Yakov Pechersky (Jun 26 2024 at 19:22):

So, no, the language of proof terms isn't fixed, because a proof terms could contain any term, and we declare new terms all the time.

Yakov Pechersky (Jun 26 2024 at 19:23):

Are you asking if the structure of terms, or the terms that appear inside terms generated by the rw tactic, whether that is more fixed?

Mr Proof (Jun 26 2024 at 19:24):

Yeah, basically I'm trying to drill down, so that Eq.Mpr is defined by terms A which are defined by terms B etc. and trying to find what lies at the bottom :smile:... if anything

Yakov Pechersky (Jun 26 2024 at 19:25):

Yes, Eq.mpr is defined at src#Eq.mpr.

Mark Fischer (Jun 26 2024 at 19:29):

What does Lean's Kernal see? Eventually all the keywords and such are expanded, elaborated, etc what is the result of all that? I've generally imagined that it's not something that's super fun to look at, but could that be thought of as lean's "at the bottom" representation?

Yakov Pechersky (Jun 26 2024 at 19:29):

The declaration of which takes two arguments, and constructs a new term of some equality statement, using the two arguments and the rewrite triangle. Declarations as defined by parsing and an elaboration rule. The rewrite triangle also has its own elaboration. Elaboration basically means that after syntax is parsed into expressions, the underlying expressions are combined by some rules, and then passed to the kernel as a term and the expected type. Separately, in kernel proof terms, the proof here works because of the recursor on the Eq type. And the expected type expression has been passed along by the elaborator.

Yakov Pechersky (Jun 26 2024 at 19:37):

How much of this is the language that is being asked about, or implementation details of how the language is parsed? For a question that seems similar to me, is Python the language -- the AST or the parser that constructs an AST from some text file, or the CPython implementation that does evaluation?

Mr Proof (Jun 26 2024 at 19:40):

I think I'm getting the idea. That the keywords are just "def", "=", "fun", "->" etc. and everything else can be defined from these? And you could prove the whole of FLT with just these? Or maybe you don't even need "="

Henrik Böving (Jun 26 2024 at 19:43):

There are again quite literally hundreds of keywords, many of them are defined on the fly, for example =is explicitly defined by extending the grammar of Lean at this point: https://github.com/leanprover/lean4/blob/141856d6e6d808a85b9147a530294fee8e48e15f/src/Init/Notation.lean#L325 the underlying type Eq is defined at src#Eq which is done using the inductive keyword that can indeed be considered one of the core primitives. But again, you will not be able to find a notion of "keywords" in Lean. You can find a notion of what the kernel thinks terms and definitions are but those are abstracted over a lot in the surface level syntax and users can add new extensions to those abstractions as they wish arbitrarily.

Mr Proof (Jun 26 2024 at 19:44):

I think we just have a different definition of keyword.

Henrik Böving (Jun 26 2024 at 19:44):

Then what is a keyword for you

Mr Proof (Jun 26 2024 at 19:44):

Something which isn't defined on the fly

Henrik Böving (Jun 26 2024 at 19:47):

I mean the Lean compiler uses quite literally the same mechanisms to register the inductive keyword as a primitive. There is no difference between how inductive is added to the grammar and how = is added to it. You can build your own inductive command if you wish. Lean as a language is a sandbox with infinite possibilities, some of which have been used by the compiler itself already and some of which are waiting to be used by its users.

Henrik Böving (Jun 26 2024 at 19:48):

Again if you are looking for the language that the kernel speaks, you can take a look at the kernel APIs and you will find that it only accepts certain data structures but you will have a hard time convincing the majority here that those are the keywords of Lean, the majority of math users has never and will never see or care about this API.

Mark Fischer (Jun 26 2024 at 19:48):

For Python, there's a Full Grammar specification. The notation is a mixture of EBNF and PEG and you can look through it and find the terminal symbols like 'pass', 'break', 'continue', 'return', 'raise', 'from', 'global'. If I read you correctly, for Python you'd be satisfied with this list.

I don't think it's helpful to think of Lean 4 this way.

Mark Fischer (Jun 26 2024 at 20:39):

I think the intro (link) for the Metaprogramming in Lean 4 book actually contains some insight into why this might feel confusing on first blush.

"One cool thing about Lean, though, is that it allows us to define custom syntax nodes and implement meta-level routines to elaborate them in the very same development environment that we use to perform object-level activities. [...] We can say that, in Lean, the meta-level is reflected to the object-level."

Because there's no separate meta-language, this means you can perform meta-activities like parse code, build an AST and elaborate its syntax nodes into terms that can be processed by the language kernel using syntax you've only moments ago defined in the same file.

I don't know what the most boot-strapped version of this looks like, but it's easy to imagine this doesn't much resemble what Lean 4 looks like.

Henrik Böving (Jun 26 2024 at 20:40):

The most bootstrapped version of Lean 4 is quite literally Lean 3 ^^

Eduardo Ochs (Jun 27 2024 at 02:35):

@Mr Proof, have you tried to read Init/Notation.lean?

Eduardo Ochs (Jun 27 2024 at 02:39):

I think that a good way to get a grasp of what is the core of Lean is to read how the basic commands are defined - and they depend on lots of other things. Grep the sources for "leading_parser"...

Kevin Buzzard (Jun 27 2024 at 07:32):

Mr Proof said:

I think I'm getting the idea. That the keywords are just "def", "=", "fun", "->" etc. and everything else can be defined from these? And you could prove the whole of FLT with just these? Or maybe you don't even need "="

= is notation for Eq which is defined in code in core Lean

inductive Eq : α  α  Prop where
  /-- `Eq.refl a : a = a` is reflexivity, the unique constructor of the
  equality type. See also `rfl`, which is usually used instead. -/
  | refl (a : α) : Eq a a

, so it's no more and no less a keyword than vector spaces, which are also defined in code (this time in mathlib). Note that I can assert this with confidence even though you've supplied no definition of "keyword", because equality and vector spaces are defined using the same technique (as are natural numbers etc etc).

def is a command. Is this the concept you're looking for?


Last updated: May 02 2025 at 03:31 UTC