Zulip Chat Archive

Stream: maths

Topic: Proof by algorithm


Tim Daly (Oct 23 2019 at 06:01):

Rumor has it that an algorithm is a valid proof. Is there an example of an algorithmic proof in Lean?

Kenny Lau (Oct 23 2019 at 06:11):

we have algorithms, and proofs that they work

Kenny Lau (Oct 23 2019 at 06:12):

for example gcd is an algorithm

Mario Carneiro (Oct 23 2019 at 22:13):

Tactics are algorithmic proofs. I would guess that norm_num, ring, linarith and omega are all good examples of proof-by-algorithm
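Each of the tactics named above closes a goal by running a decision procedure rather than by a hand-written proof term. A minimal Lean 3 sketch, assuming mathlib is available:

```lean
import tactic

-- Each goal below is discharged by an algorithm, not a hand-written term.
example : (2 : ℕ) + 2 = 4 := by norm_num
example (x y : ℤ) : (x + y)^2 = x^2 + 2*x*y + y^2 := by ring
example (x : ℤ) (h : 0 < x) : 0 < 2 * x := by linarith
example (m n : ℕ) : m + n = n + m := by omega
```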

Tim Daly (Oct 24 2019 at 02:29):

I was thinking more along the lines of... Here is a theorem and here, an algorithm (in python? since proof-irrelevance)

Mario Carneiro (Oct 24 2019 at 02:33):

I don't know what that means

Tim Daly (Oct 24 2019 at 07:46):

I don't either, actually. But I have heard many times that "programs are proofs" and "constructive proofs are programs", via Curry–Howard. I can "sort-of, maybe, sometimes" (SMS) agree with constructive proofs being programs, under a lot of assumptions. But "programs are proofs" is a direction that I'm sort-of, maybe, sometimes wrestling with. I have many working GCD algorithms (e.g. over NATs, POLYNOMIALS, etc.) and I'm trying to prove them "correct". That is, I'm trying to prove that they fulfill a specification. The specification is, sort-of, maybe, sometimes, a set of logical definitions and axioms (e.g. gcd(x,y) == gcd(y,x)).

The language of the program (in this case, lisp) is irrelevant since it should be possible to translate a proof into a lambda form expressible in lisp. Given that the lambda form came from a proof with its associated definitions and axioms, it should be possible to "prove" the lambda form, that is, to prove the program.

So the fundamental tasks are to (1) turn a gcd proof into a program and (2) to turn the program into a gcd proof. I am pondering a way to take the lowest level Lean output format into lisp.

Given that this is research (aka creative reuse :-) ) I was hoping for working examples to support my SMS reasoning.

The other working assumption is that I'm just confused but that's more of an axiom than an assumption.

Kenny Lau (Oct 24 2019 at 07:47):

"programs are proofs" (i.e. Curry--Howard correspondence) has nothing to do with algorithmic proofs

Tim Daly (Oct 24 2019 at 07:47):

I'm listening. Can you give me a reference I can study?

Mario Carneiro (Oct 24 2019 at 07:48):

I'm sorry to tell you that by the Curry Howard isomorphism, the only thing you learn proof-wise from the existence of an algorithm in lisp is that the type of lisp s-expressions is inhabited

Mario Carneiro (Oct 24 2019 at 07:49):

In order to get anything nontrivial out of curry howard, you need a really strong type system
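In a sufficiently strong (dependently typed) system, a proposition is a type and any program of that type is a proof of it. A small Lean sketch of that reading:

```lean
-- Under Curry–Howard, this program *is* a proof: its type is the
-- proposition it establishes.
def and_swap (p q : Prop) : p ∧ q → q ∧ p :=
λ h, ⟨h.2, h.1⟩
```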

Tim Daly (Oct 24 2019 at 07:50):

So a constructive proof is only evidence that the type is inhabited? Then what does it mean to "generate a program" from an ITP?

Mario Carneiro (Oct 24 2019 at 07:50):

Terms in lean have a "computational interpretation", aka an operational semantics, so you can run them

Tim Daly (Oct 24 2019 at 07:50):

Axiom has "First Class Dynamic Types", so I have a really strong type system.

Mario Carneiro (Oct 24 2019 at 07:51):

Dynamic types don't cut it

Mario Carneiro (Oct 24 2019 at 07:52):

well, they sort of do, but you have to run the program to obtain a proof, and you only get a proof for that particular input

Tim Daly (Oct 24 2019 at 07:52):

So if I generate a lisp program using the computational interpretation then it has a corresponding proof.

Mario Carneiro (Oct 24 2019 at 07:52):

so you won't be able to prove gcd(x,y) = gcd(y,x), only gcd(3,6) = gcd(6,3) and similar
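That is the gap between the instance and the general statement. A Lean 3 sketch (`nat.gcd_comm` is mathlib's proof of the general fact):

```lean
import data.nat.gcd

-- The general statement needs an actual proof by induction on the algorithm:
example (m n : ℕ) : nat.gcd m n = nat.gcd n m := nat.gcd_comm m n

-- A particular instance just specializes it:
example : nat.gcd 3 6 = nat.gcd 6 3 := nat.gcd_comm 3 6
```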

Tim Daly (Oct 24 2019 at 07:52):

"You only get a proof for that particular input"... But Axiom is symbolic so "that particular input" is actually a class of inputs.

Mario Carneiro (Oct 24 2019 at 07:53):

you presumably can't symbolically execute all code paths though

Mario Carneiro (Oct 24 2019 at 07:54):

you might operate on symbolic expressions, but to prove facts about the program that is doing the manipulation you need to work a meta level above that

Tim Daly (Oct 24 2019 at 07:54):

GCD(x:NAT,y:NAT) == GCD(y:NAT,x:NAT)

Mario Carneiro (Oct 24 2019 at 07:55):

Have you looked at ACL2? That sounds like it bears much more relation to what you are doing than lean

Tim Daly (Oct 24 2019 at 07:55):

NAT has a lot of axioms and definitions which, in the latest Axiom I'm building, are inherited from the Categories (aka typeclasses)

Tim Daly (Oct 24 2019 at 07:56):

Yes, I've used ACL2. It isn't strong enough.

Mario Carneiro (Oct 24 2019 at 07:57):

Have you looked at the gcd function in lean, and the proof that gcd x y = gcd y x?

Mario Carneiro (Oct 24 2019 at 07:57):

lean's gcd function is computable, it has code and you can run it on given inputs

Mario Carneiro (Oct 24 2019 at 07:58):

I'm not really sure what you are trying to do that is not covered by this approach

Tim Daly (Oct 24 2019 at 07:59):

@Kenny Lau mentioned there is a gcd proof but not which file it lives in. I haven't searched for it yet. Do you know where it lives?

Tim Daly (Oct 24 2019 at 08:01):

Does lean's GCD work for polynomials over a field generated by its coefficients?

Mario Carneiro (Oct 24 2019 at 08:01):

https://github.com/leanprover-community/mathlib/blob/master/src/data/nat/gcd.lean#L31-L34

Tim Daly (Oct 24 2019 at 08:02):

Thanks.

Mario Carneiro (Oct 24 2019 at 08:02):

That gcd is about nat, but this one is generic

Tim Daly (Oct 24 2019 at 08:03):

It seems to assume a commutative ring.

Mario Carneiro (Oct 24 2019 at 08:03):

We could generalize, but it's a matter of whether people care about it

Mario Carneiro (Oct 24 2019 at 08:04):

I'm sure this is Kevin's fault for not believing that noncommutative rings exist

Tim Daly (Oct 24 2019 at 08:04):

Is there a way to generate programs from Lean?

Mario Carneiro (Oct 24 2019 at 08:04):

lean terms are programs

Mario Carneiro (Oct 24 2019 at 08:05):

if it doesn't say noncomputable on the definition, it has code and you can execute it
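Concretely (a sketch; `double` and `choice_nat` are made-up names for illustration):

```lean
-- Computable: Lean generates bytecode for it, so #eval can run it.
def double (n : ℕ) : ℕ := 2 * n

#eval double 21  -- 42

-- Noncomputable: defined via choice, so there is no code to execute,
-- and #eval would be rejected for it.
noncomputable def choice_nat (h : ∃ n : ℕ, 0 < n) : ℕ := classical.some h
```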

Tim Daly (Oct 24 2019 at 08:05):

Let me be more specific. Can I generate code in a not-lean, general purpose programming language from lean? Or do I need to write the lean->lisp function?

Mario Carneiro (Oct 24 2019 at 08:06):

actually, wikipedia seems to agree that GCD domains are commutative, so it's maybe not a crazy assumption

Tim Daly (Oct 24 2019 at 08:06):

(actually, lean->Spad, which is Axiom's language)

Mario Carneiro (Oct 24 2019 at 08:07):

what's wrong with writing a lean->lisp function?

Mario Carneiro (Oct 24 2019 at 08:07):

No, lean doesn't support transpilers for every language

Mario Carneiro (Oct 24 2019 at 08:07):

lean 4 supports a transpiler to C

Tim Daly (Oct 24 2019 at 08:08):

A lean->lisp is perfect. Spad is a Domain-Specific Language on top of lisp specifically for mathematics.

Mario Carneiro (Oct 24 2019 at 08:08):

I'm saying you would have to do it yourself, but the tools to do it exist

Tim Daly (Oct 24 2019 at 08:09):

Can you give me a pointer to the transpiler? I might hack that into a lean->lisp (which ought to be somewhat easier, it seems to me, but then I'm a lisper)

Mario Carneiro (Oct 24 2019 at 08:09):

to C?

Mario Carneiro (Oct 24 2019 at 08:09):

in lean 4?

Mario Carneiro (Oct 24 2019 at 08:10):

I think it's just called a compiler and it's been a big part of the lean 4 project. See the "counting immutable beans" paper

Mario Carneiro (Oct 24 2019 at 08:10):

I think you would have better success just recursing on the expr type and throwing away type annotations

Tim Daly (Oct 24 2019 at 08:11):

I don't want to throw away the type annotations. I'd turn them into CLOS classes, which are also lisp types, so the program could be type-checked on the lisp side.

Tim Daly (Oct 24 2019 at 08:13):

The counting paper is this? https://arxiv.org/pdf/1908.05647.pdf

Tim Daly (Oct 24 2019 at 08:15):

The reference counting macro in the lean source code (which I read) is a great hack. I am impressed. Since I've written a garbage collector I could decode what was being done.

Unfortunately I was reading the lean sources to get the logical CIC rules actually used in lean and I failed to reverse-engineer them from the code. Can you point me at a paper that contains the implemented rules?

Mario Carneiro (Oct 24 2019 at 08:16):

Right now I think the best reference for the CIC rules used in lean is my masters thesis https://github.com/digama0/lean-type-theory/releases/tag/v1.0

Mario Carneiro (Oct 24 2019 at 08:17):

You won't be able to type check lean on the lisp side. It's too complicated for any simpler type system

Tim Daly (Oct 24 2019 at 08:17):

Excellent. I need to implement Lean's rules in Axiom so I can use them as part of the proof machinery.

Mario Carneiro (Oct 24 2019 at 08:18):

you can dynamically check it though

Mario Carneiro (Oct 24 2019 at 08:18):

as in, lean programs should not ever call a function with an object of the wrong type

Mario Carneiro (Oct 24 2019 at 08:19):

but unless you have a type system that is basically exactly lean's, you won't be able to prove this in advance of getting the object itself

Tim Daly (Oct 24 2019 at 08:20):

Axiom's type system is well beyond anything directly coded in lisp. Axiom knows the types of everything, everywhere, at all times (dependent types) and computes them in the compiler (or it won't compile). The interpreter computes them but has heuristics because the compile-time types are undecidable.

Mario Carneiro (Oct 24 2019 at 08:20):

I'm still skeptical that axiom's type system is strong enough for lean

Tim Daly (Oct 24 2019 at 08:21):

Well, there we differ. I'm certain Axiom can handle Lean. But, of course, that's my research task.

Mario Carneiro (Oct 24 2019 at 08:21):

Do you have a reference for axiom's type system?

Mario Carneiro (Oct 24 2019 at 08:22):

I'm just guessing

Mario Carneiro (Oct 24 2019 at 08:22):

For one thing, if you can handle lean then your axiomatic framework is stronger than ZFC

Tim Daly (Oct 24 2019 at 08:28):

Yeah, lots of papers. See Axiom's wikipedia page. Volume 10.1 Axiom Algebra: Theory has a bunch of papers. Volume 0 has a lot of examples.

https://en.wikipedia.org/wiki/Axiom_(computer_algebra_system)

Tim Daly (Oct 24 2019 at 08:29):

Buchberger implemented Axiom's Groebner basis code (later proved in Coq)

Tim Daly (Oct 24 2019 at 08:35):

I thought of trying to translate Coq's Groebner basis proof to Lean but there are a lot of issues to handle (e.g. term ordering) which have nothing to do with the issue. I have a much more modest goal of proving Axiom's 22 GCD algorithms using Lean. But I would like the proofs to "round trip" (spad->lean->spad).

Tim Daly (Oct 24 2019 at 08:37):

James Davenport is one of the original designers of Axiom and he is active in the proof community these days.

Mario Carneiro (Oct 24 2019 at 08:37):

I mean just the type system

Tim Daly (Oct 24 2019 at 08:39):

Axiom was designed in the late 1970s, prior to a lot of the type notation currently in use. There is no "set of rules", although that would make a great PhD thesis for someone who liked true agony.

Tim Daly (Oct 24 2019 at 08:41):

An expected result of my research (hmm, where is my cap and gown?) is that the CIC rules are used on the Axiom side while proving programs. That's why I was trying to reverse engineer Lean's current rule set. It really would help to include a few comments in the source code.

Tim Daly (Oct 24 2019 at 08:43):

Somehow we need to cross the "computational mathematics" bridge between proof tools and computer algebra. Since Axiom has an explicit scaffold of group theory (rather than general rewrite rules like Mathematica) it seems like the best bet to merge these two streams. James Davenport is the only person I've seen who is referenced in both (nearly disjoint) sets of bibliographies.

Mario Carneiro (Oct 24 2019 at 08:44):

If there isn't a clear statement of the axioms, I would be surprised if it was consistent

Tim Daly (Oct 24 2019 at 08:47):

It has monoids, groups, rings, fields, etc., all implemented as a hierarchy of typeclasses. I'm decorating the typeclasses with axioms and definitions and decorating the implementations (we call them domains, aka carriers and algorithms) with axioms, definitions, and lemmas. For example, NAT (which Axiom calls NonNegativeIntegers) has the usual decorations.

Tim Daly (Oct 24 2019 at 08:48):

So, like lean, when you get "down to the code", that is, down to proving or executing, you know a LOT about what you can assume since it is all inherited or explicitly stated.

Tim Daly (Oct 24 2019 at 08:49):

Is group theory consistent?

Mario Carneiro (Oct 24 2019 at 08:50):

From what I can tell from volume 0, the type system is like Java's: it protects against the most basic problems but not anything close to functional correctness

Mario Carneiro (Oct 24 2019 at 08:51):

In particular, I see nothing that indicates that something with the Ring domain has to satisfy the axioms of a ring

Tim Daly (Oct 24 2019 at 08:51):

Given that Buchberger's Groebner basis algorithm was proven in Coq and Bruno implemented it in Axiom, I suspect that it is possible to prove things like gcds.

Tim Daly (Oct 24 2019 at 08:54):

Re: "is like Java's"... hasty generalization. It took me several years to understand Axiom's types. It took me several years to understand CIC. So I'd guess that a 5-minute glance isn't sufficient.

Tim Daly (Oct 24 2019 at 08:55):

Then again, I'm rather slow in the uptake :-)

Mario Carneiro (Oct 24 2019 at 08:55):

Does the definition of a semigroup include anything about associativity? Chapter 12.6 suggests no

Tim Daly (Oct 24 2019 at 08:56):

Axiom has both associative and non-associative domains. It depends on the set of typeclasses (which Axiom calls categories) you inherit.

Mario Carneiro (Oct 24 2019 at 08:56):

12.7 suggests that this is "defined by a comment"

Mario Carneiro (Oct 24 2019 at 08:57):

I'm saying that you can implement SemiGroup without being associative, and nothing gives you any trouble

Tim Daly (Oct 24 2019 at 08:59):

Axiom uses "by definition" inheritance. You have to specify what you want to inherit. You can construct things like a Lie algebra (non-commutative).

For some papers that mentions Axiom, see https://swmath.org/software/63

Mario Carneiro (Oct 24 2019 at 09:00):

right, so I can say something wrong, like every ring is commutative, in the way that domains and categories are set up, and Axiom does not complain

Johan Commelin (Oct 24 2019 at 09:01):

In what sense would Lean complain, if I say that? (Of course I can't prove it. But that's a different thing, right?)

Mario Carneiro (Oct 24 2019 at 09:01):

that's how lean would complain

Mario Carneiro (Oct 24 2019 at 09:02):

when you make assertions in lean, you have a proof obligation

Mario Carneiro (Oct 24 2019 at 09:02):

when you make assertions in axiom, it says "okay"

Mario Carneiro (Oct 24 2019 at 09:03):

which means if you want to retrofit proofs, you have a ton of proof obligations to deal with

Tim Daly (Oct 24 2019 at 09:03):

Your point is that a system like Axiom needs a system like Lean, in order to prove the code correct. The converse point is that a system like Lean needs a system like Axiom in order to make proofs "effective".

Tim Daly (Oct 24 2019 at 09:04):

My point is that these two computational mathematics domains (proofs and computer algebra) both have 30+ year histories of ignoring each other. I'm trying to cross that divide in both directions.

Mario Carneiro (Oct 24 2019 at 09:05):

I think you would have a much better time trying to get axiom into lean rather than getting lean into axiom

Mario Carneiro (Oct 24 2019 at 09:06):

all the concepts in axiom seem to have a mirror in lean, but the converse isn't true

Tim Daly (Oct 24 2019 at 09:06):

I'm not trying to "get one into another". I'm trying to make them equivalent.

Mario Carneiro (Oct 24 2019 at 09:07):

They are clearly not equivalent. They have different goals and are different in almost every technical detail

Mario Carneiro (Oct 24 2019 at 09:09):

I think the best place to look for examples that straddle the "divide" are tactics like ring that do CAS algorithms in a theorem prover

Mario Carneiro (Oct 24 2019 at 09:10):

also Rob's mathematica link, which does bidirectional translation of expressions Lean <-> Mathematica

Tim Daly (Oct 24 2019 at 09:10):

Computational Mathematics is a subject that covers both proofs and computer algebra. So far, nobody, as far as I know, has "crossed the divide". There are examples, such as Coq's Groebner basis work, but no general convergence.

I find this unacceptable and, from all I've seen, there is a merger possible. That, of course, is a research statement and is up to me to show.

Mario Carneiro (Oct 24 2019 at 09:11):

what constitutes "crossing the divide" for you?

Mario Carneiro (Oct 24 2019 at 09:11):

I don't think general convergence is likely, because the aims are different

Tim Daly (Oct 24 2019 at 09:11):

Spad programs proven using Lean's rules. Lean's use of Axiom as a trusted Oracle in proofs.

Mario Carneiro (Oct 24 2019 at 09:12):

The Mathematica link did that

Tim Daly (Oct 24 2019 at 09:12):

I understand you don't think convergence is possible. You're not alone. That's why it is a research question.

Tim Daly (Oct 24 2019 at 09:13):

You trust Mathematica's programs in a proof step? Mathematica is a general rewrite system and is trivial to make inconsistent.

Mario Carneiro (Oct 24 2019 at 09:13):

It's not that I don't think it's possible, I don't think it makes sense. It's like asking if it's possible for more cars to be amphibious

Mario Carneiro (Oct 24 2019 at 09:13):

I don't trust Axiom any more than I trust Mathematica

Mario Carneiro (Oct 24 2019 at 09:14):

I don't think using CAS systems as an oracle is a good idea in general

Mario Carneiro (Oct 24 2019 at 09:15):

If you took those spad programs and just transcribe them in lean, you could prove that they are correct and they would still run just fine

Johan Commelin (Oct 24 2019 at 09:15):

@Tim Daly You should understand that we don't trust Mathematica at all when we use that link

Johan Commelin (Oct 24 2019 at 09:15):

We ask it for an answer, and for hints for the proof. But Lean still builds a formal proof.

Mario Carneiro (Oct 24 2019 at 09:16):

Rob discusses the possibility of using the result as an axiom, but that's only one of several options

Tim Daly (Oct 24 2019 at 09:17):

I understand your skepticism. You're not alone. Few people, as you can tell from the bibliographies, understand both domains. So if you ask proof people about proving CAS code they bring out your arguments. If you ask the CAS people about Lean they have no idea how to even read the rules in the literature. At a CCNY seminar, a 40-year CAS person, when shown a proof rule, asked "what is that funny fraction?"

Mario Carneiro (Oct 24 2019 at 09:18):

CAS people don't know or use formal proofs. They treat it like any other programming exercise

Mario Carneiro (Oct 24 2019 at 09:18):

I'm all for proving CAS algorithms correct, but you should not try to build that in a world that has no idea what a proof is

Tim Daly (Oct 24 2019 at 09:18):

I'm constructing the system so the compiler won't compile a program unless it proof-checks.

Mario Carneiro (Oct 24 2019 at 09:19):

in that case you need to have a formal system in mind

Tim Daly (Oct 24 2019 at 09:19):

"you should not try to build that in a world that has no idea what a proof is".... umm, research?

Tim Daly (Oct 24 2019 at 09:20):

I have a formal proof system in mind. That's why I'm trying to get a clear statement of Lean's CIC rules. I'd think that would be the most painfully obvious thing to find on page 1 of Lean's website.

Johan Commelin (Oct 24 2019 at 09:20):

@Tim Daly I don't follow. Why is the Mathematica link not a perfect example of what you have in mind. (Sure, it's limited in scope. But it's a start.)

Mario Carneiro (Oct 24 2019 at 09:20):

When you build a proof system in an environment that wasn't expecting it, there will be a thousand ways for the system to be made inconsistent by some peculiar interaction of features

Mario Carneiro (Oct 24 2019 at 09:21):

Even after all these years this is still a problem in HOL Light for example with Obj.magic

Tim Daly (Oct 24 2019 at 09:21):

The MMA link is fine. But you can't trust the result. Axiom's result would be proven and provide the proof as part of the response

Tim Daly (Oct 24 2019 at 09:22):

Thus Axiom's result is essentially a proven lemma you can use in the proof.

Mario Carneiro (Oct 24 2019 at 09:22):

I don't see how reimplementing lean in axiom is easier than writing a translator for axiom programs into lean

Mario Carneiro (Oct 24 2019 at 09:23):

is spad so complicated?

Tim Daly (Oct 24 2019 at 09:23):

I never said I was reimplementing lean in Axiom.

Johan Commelin (Oct 24 2019 at 09:23):

The MMA link is fine. But you can't trust the result. Axiom's result would be proven and provide the proof as part of the response

I definitely can trust the result when Lean says so.

Mario Carneiro (Oct 24 2019 at 09:23):

If you want to implement the CIC rules in axiom, you are reimplementing lean

Tim Daly (Oct 24 2019 at 09:25):

You can trust the result if Lean says so.... so if you need a groebner basis result and MMA provides an answer, you then proceed to re-prove the groebner basis algorithm in lean? Because you can't trust the MMA result as a lemma?

Mario Carneiro (Oct 24 2019 at 09:25):

you just need a witness in that case, so the proof is easy

Mario Carneiro (Oct 24 2019 at 09:26):

generating the witness is hard, proving that it works is easy

Tim Daly (Oct 24 2019 at 09:26):

Can you show that the "witness" is a proper basis?

Mario Carneiro (Oct 24 2019 at 09:26):

those are generally good choices for farming out to a CAS

Mario Carneiro (Oct 24 2019 at 09:27):

If the proof is not easy, then there was no point asking the CAS

Tim Daly (Oct 24 2019 at 09:27):

Axiom has 1100 categories and domains and 10,000+ algorithms. It would be great if Lean could use those as trusted lemmas

Mario Carneiro (Oct 24 2019 at 09:28):

Why? I don't trust them

Tim Daly (Oct 24 2019 at 09:28):

"Why? I don't trust them".... hence, my point.

Mario Carneiro (Oct 24 2019 at 09:28):

is there a reason I should trust them?

Tim Daly (Oct 24 2019 at 09:28):

Because they provide Lean proofs with their answers

Mario Carneiro (Oct 24 2019 at 09:29):

that's like saying I should assume false because it makes my proof easier

Mario Carneiro (Oct 24 2019 at 09:30):

I want a reason that they are trustworthy. The utility of those algorithms is not in question

Tim Daly (Oct 24 2019 at 09:30):

Ok. So you don't agree that this research is worth doing as it provides no value. The CAS people don't want proofs because proofs are too hard and test cases show that the programs are correct. The two computational mathematics camps have dual arguments.

Mario Carneiro (Oct 24 2019 at 09:31):

I think you misunderstand me. I think that having a wide variety of CAS algorithms in Lean would be very valuable. But I want to not have to assume false or dubious things as the price of admission

Tim Daly (Oct 24 2019 at 09:32):

Axiom's motto is "The 30 Year Horizon"... I'm looking out 30 years and thinking that these two streams of computational mathematics must eventually merge.

Mario Carneiro (Oct 24 2019 at 09:33):

As I've said, I think the best way to achieve your goals is to write a program to translate axiom algorithms into type correct computable lean terms

Tim Daly (Oct 24 2019 at 09:33):

I can't rewrite 10k+ algorithms in Lean. Lean doesn't provide a language for programming that I could convince anyone to use. Axiom's Spad language is very close to the mathematics.

Mario Carneiro (Oct 24 2019 at 09:34):

I did something very similar with metamath. I can't rewrite 20k+ theorems from metamath, but I can write a general translator

Tim Daly (Oct 24 2019 at 09:35):

I know that Axiom's GCD algorithm for NATs is correct. I just have to provide Lean with the proof.

Mario Carneiro (Oct 24 2019 at 09:35):

first you have to provide lean with the algorithm

Tim Daly (Oct 24 2019 at 09:36):

Actually, no.

Tim Daly (Oct 24 2019 at 09:38):

The algorithm is "provided" by the trusted Oracle link, along with the proof, as a lemma.

Tim Daly (Oct 24 2019 at 09:39):

There is no way someone will re-implement Bronstein's integration algorithm in lean.

Mario Carneiro (Oct 24 2019 at 09:39):

Again, write a translator

Mario Carneiro (Oct 24 2019 at 09:39):

lisp is trivially easy to turn into lean terms
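For instance, s-expressions themselves can be represented as a Lean inductive type, and any Lisp term becomes Lean data (a hypothetical sketch; `sexpr` is a made-up name):

```lean
-- A hypothetical encoding of Lisp s-expressions as a Lean inductive type.
inductive sexpr
| atom : string → sexpr
| nil  : sexpr
| cons : sexpr → sexpr → sexpr

-- The Lisp term (gcd 3 6), represented as data:
def gcd_call : sexpr :=
sexpr.cons (sexpr.atom "gcd")
  (sexpr.cons (sexpr.atom "3") (sexpr.cons (sexpr.atom "6") sexpr.nil))
```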

Tim Daly (Oct 24 2019 at 09:40):

How can I write self-modifying code in lean?

Mario Carneiro (Oct 24 2019 at 09:40):

Lots of ways, but you have to get more specific

Tim Daly (Oct 24 2019 at 09:41):

Heck, I'm just trying to find an algorithm in lean that is executable as a working example.

Mario Carneiro (Oct 24 2019 at 09:42):

def length {α} : list α → nat
| []       := 0
| (a :: l) := length l + 1

#eval length [1, 2, 3]
-- 3

Mario Carneiro (Oct 24 2019 at 09:42):

there are tons of examples in TPIL

Tim Daly (Oct 24 2019 at 09:43):

do you have a gcd example?

Mario Carneiro (Oct 24 2019 at 09:43):

#eval nat.gcd 3 6
-- 3

Mario Carneiro (Oct 24 2019 at 09:43):

don't ask me for gcd's of polynomials

Mario Carneiro (Oct 24 2019 at 09:44):

suffice it to say it can be done

Tim Daly (Oct 24 2019 at 09:44):

"suffice it to say it can be done".... and I plan to actually do it. I still have 42 months until the deadline.

Mario Carneiro (Oct 24 2019 at 09:45):

It can be done in far less than 42 months. If you have the algorithm in hand it should take no more than an hour

Mario Carneiro (Oct 24 2019 at 09:46):

but if you want all of axiom's algorithms in lean, then you should be writing a translator

Tim Daly (Oct 24 2019 at 09:47):

I plan to do it as a side-effect of constructing a system where it naturally fits. Nobody cares about a single algorithm. The GCD is only an instance of a more general convergence of proofs and computer algebra.

Mario Carneiro (Oct 24 2019 at 09:48):

Lean has lots of examples of algorithms. It doesn't have that one because we don't have computable polynomials

Mario Carneiro (Oct 24 2019 at 09:49):

There are lots of other examples of CAS algorithms of similar or higher complexity, like fourier-motzkin elimination in linarith

Mario Carneiro (Oct 24 2019 at 09:49):

and these are validated algorithms because they run in lean, they have no trusted oracle

Tim Daly (Oct 24 2019 at 09:50):

Axiom has hundreds of person-years and an estimated 42 million dollars of funding. I don't think Lean is going to see that level of support in the near future so trying to reproduce all of Axiom in Lean is a non-starter.

Mario Carneiro (Oct 24 2019 at 09:50):

WRITE A TRANSLATOR

Mario Carneiro (Oct 24 2019 at 09:50):

it's waaaay less work

Tim Daly (Oct 24 2019 at 09:51):

Methinks you trivialize a hard problem. But I may be wrong.

Mario Carneiro (Oct 24 2019 at 09:51):

it is literally the best decision I made for metamath zero

Mario Carneiro (Oct 24 2019 at 09:52):

I can't write 30000 proofs! but I can write a translator in a few hundred lines and boom, 30000 proofs served right up

Mario Carneiro (Oct 24 2019 at 09:52):

and now I have a data set to do testing and benchmarking

Tim Daly (Oct 24 2019 at 09:53):

Does metamath have polynomials?

Mario Carneiro (Oct 24 2019 at 09:53):

yes

Mario Carneiro (Oct 24 2019 at 09:54):

it's not a CAS, so don't ask if it has gcd's of polynomials

Tim Daly (Oct 24 2019 at 09:54):

so write a CAS :-)

Mario Carneiro (Oct 24 2019 at 09:56):

My goal was different. My point is, when you have a huge foreign data set you want to work with, you should find a way to translate it or large parts of it rather than dealing with it one bit at a time

Mario Carneiro (Oct 24 2019 at 09:57):

Writing a CAS is a huge amount of work. I can write individual algorithms but never enough. It would be better if I could just import a CAS :)

Tim Daly (Oct 24 2019 at 10:00):

The "translator" from Spad code to lisp code is probably half a million lines

Tim Daly (Oct 24 2019 at 10:01):

I'd rather work much closer to the mathematics.

Tim Daly (Oct 24 2019 at 10:05):

Anyway, by Daly's Axiom: "There is no such thing as a simple job", ...

Johan Commelin (Oct 24 2019 at 10:09):

@Tim Daly It might make sense to search for "metamath zero" on this chat. Mario ported some really serious results to Lean that way.

Tim Daly (Oct 24 2019 at 10:10):

Were the metamath theorems proven before they were imported?

Tim Daly (Oct 24 2019 at 10:12):

Because if they were known to be true before they were translated then you didn't have to worry that they were false. However, a "translator" from untrusted code can't make that assumption.

Mario Carneiro (Oct 24 2019 at 11:00):

A translator from untrusted code can still produce a term that does... something... in lean

Mario Carneiro (Oct 24 2019 at 11:00):

If I write a program in lean the only thing it checks is that it is type correct

Mario Carneiro (Oct 24 2019 at 11:01):

If I define "type correct" so that it is trivial then it is not hard to get lean to get out of the way

Mario Carneiro (Oct 24 2019 at 11:03):

So it's available in lean from the start, and executable too, but you have asserted no properties about it so it's still not much better than you started with. But now it's in lean so you can at least state properties of interest about these algorithms, and in principle prove them

Mario Carneiro (Oct 24 2019 at 11:03):

The better the translation, the more typing you can put into it, so that you have some but not all properties of interest validated at the start

Kevin Buzzard (Oct 24 2019 at 11:57):

I have a formal proof system in mind. That's why I'm trying to get a clear statement of Lean's CIC rules. I'd think that would be the most painfully obvious thing to find on page 1 of Lean's website.

The first few pages of Mario's thesis go through this.

Kevin Buzzard (Oct 24 2019 at 12:00):

I can't write 30000 proofs! but I can write a translator in a few hundred lines and boom, 30000 proofs served right up

...and this is why we can prove the prime number theorem in Lean even though no Lean person ever sat down and tried to figure out how to prove the prime number theorem in Lean :D

Johan Commelin (Oct 24 2019 at 12:57):

@Mario Carneiro Do you have a strategy for providing broader access to these results?

Johan Commelin (Oct 24 2019 at 12:58):

I guess we don't want to make mathlib depend on these results.

Johan Commelin (Oct 24 2019 at 12:59):

Is there room for some sort of mathlib² that depends on mathlib and also provides superpower access to results from metamath (and in the future possible other libraries)

Mario Carneiro (Oct 24 2019 at 17:11):

I guess I could run the translator and push the resulting files as a lean project (separate from mathlib)

Mario Carneiro (Oct 24 2019 at 17:11):

and then users could import both libraries if they want

Bryan Gin-ge Chen (Oct 24 2019 at 17:23):

Might it even be possible to set up a travis script to do it whenever e.g. set.mm changes?

Mario Carneiro (Oct 24 2019 at 17:28):

Is that possible? I don't think CI hooks run on anything other than commits to your own repo. I think we tried something like this for keeping core and mathlib in sync

Bryan Gin-ge Chen (Oct 24 2019 at 17:34):

You can set up travis to run cron jobs. I think mathlib is built with 3.5.0c daily and I set up something similar to keep the community web editor up to date.

Johan Commelin (Oct 24 2019 at 18:31):

@Mario Carneiro But this means that sooner or later all mathematicians will move to a mathlib_2 project built on top of mathlib and set.mm.lean. Because once a bunch of nice results are available there, we will want to use them. And we won't bother with reproving them in a "clean" or "native" setting. I'm not sure about all the implications. Part of me thinks that this is a wonderful world, where we can just import stuff from metamath. But maybe it also leads to fragmentation? How do you feel about this yourself?

Mario Carneiro (Oct 25 2019 at 00:17):

@Johan Commelin I think that's fine; we should do what we can to make sharing knowledge between theorem provers easy

Mario Carneiro (Oct 25 2019 at 00:18):

If and when that becomes a big thing, we can think about better caching etc. for being able to do this scalably

Mario Carneiro (Oct 25 2019 at 00:23):

Originally, mathlib was designed to be the monolith that it is because of maintenance questions. It is separate from lean repo because different teams work on the two projects. There is no need for mathlib to eat all other projects, only projects that are maintained by members of the mathlib team, because these are the ones that benefit the most from the interaction between mathlib and the project. For separately maintained projects (particularly those with their own team and contribution schedule), keeping them out of mathlib is reasonable.

Johan Commelin (Oct 25 2019 at 04:12):

But fragmentation leads to abandonware and bitrot. Somehow all these other projects need to keep up with mathlib and its frequent refactors. By far the easiest way to do this is to make sure they end up as part of mathlib. This is what I'm worried about.

Andrew Ashworth (Oct 25 2019 at 04:30):

A bit of devil's advocate here: a good way to burn out maintainers is with drive-by pull requests :)

Andrew Ashworth (Oct 25 2019 at 04:34):

my personal opinion on this topic is that giant mono-repositories that contain everything work better in corporate settings where people are paid to care about boring maintenance work

Andrew Ashworth (Oct 25 2019 at 04:34):

also: if mathlib gets huge, computer science people will riot. Perhaps they want only a subset of the library, but if they want to typecheck from source, they have to do the entire thing. There will be angry GitHub issues

Andrew Ashworth (Oct 25 2019 at 04:35):

both approaches have their pros and cons. i've seen both ways work in the wild...

Tim Daly (Oct 25 2019 at 06:20):

A "riot" by computer scientists? That's prime XKCD material there... The graphic designs for the protest signs alone would turn into a shed-painting exercise that would never converge.

Tim Daly (Oct 25 2019 at 06:23):

A "pride of lions", a "gaggle of geese", a "herd of cattle".... and now a "riot of computer scientists" :-) I am SO going to squeeze this into a talk somewhere.

Tim Daly (Oct 26 2019 at 04:38):

@Mario Carneiro I'm reading your MetaMath Zero paper. You mentioned "proof to the x86 architecture".

I worked in security and developed the semantics of the x86 architecture. The input is binary (since we were reverse engineering malware and needed to know EXACTLY what the malware does). The output is in several formats. One format is called Conditional-Concurrent Assignment (CCA). Basically it rolls a bunch of statements under a logical condition, which can be combined with other conditions. If you want to ensure you're getting the right code executed, you can compile your program and ask for the assembler output from my program, thus "round-tripping" your code.

The "Intel Instruction Semantics Generator" (http://daly.axiom-developer.org/TimothyDaly_files/publications/sei/intel/intel.pdf) is a literate program containing both the source code and the documentation, a fair amount of which is the individual intel instruction semantics.
It is an SEI/CERT research report I did at CMU.

The x86 architecture, while widely used, is, I can safely say based on my experience, rather a challenge to understand at the logical/semantics layer. It is especially challenging when you automate semantic generation. My advice to you would be to find the smallest possible subset (regardless of efficiency). Perhaps it might make sense to construct Mealy/Moore state machines for your logical operations and build semantics on verified versions of those.

Mario Carneiro (Oct 26 2019 at 04:40):

I'm already working down to the byte level, so there is no compiler

Tim Daly (Oct 26 2019 at 04:41):

You have my sympathy.

Mario Carneiro (Oct 26 2019 at 04:41):

Certainly I do not formalize the entire x86 arch, only the part that matters to me, about 30-40 instructions out of 1800 or so (even Intel can't get an accurate count)

Mario Carneiro (Oct 26 2019 at 04:42):

I'm not sure what CCA is; I looked it up and I get references to VHDL hardware logic

Mario Carneiro (Oct 26 2019 at 04:43):

which is a bit lower than I am going. I stop at the ISA level, but below that there is the micro architecture and the arrangement of logic gates, which is usually done in languages like VHDL and Verilog, and it's all closed source in Intel's case

Mario Carneiro (Oct 26 2019 at 04:45):

If you are interested in this part of the project you should check out my MM0 + x86 talk at ITP: https://youtu.be/7hAShC6K_vA

Mario Carneiro (Oct 26 2019 at 04:46):

you have a 5000 page document? That's even larger than the intel ISA

Mario Carneiro (Oct 26 2019 at 04:47):

"literate" seems like a stretch

Tim Daly (Oct 26 2019 at 04:47):

If you're working in VHDL, I did some work at CMU (for a startup idea I failed to start up). I have that code around somewhere but I'm not sure it will be of interest. I used an Altera FPGA. Unfortunately, once I seriously started VHDL coding, Altera was bought by Intel. The Xilinx FPGA is a different architecture. The FPGA tool sets are proprietary and I didn't have the funds or the ambition to start all over.

Tim Daly (Oct 26 2019 at 04:48):

The document is 60,000 lines of code and about 6000 pages formatted. Nobody expects to print it as it contains executable code.

Mario Carneiro (Oct 26 2019 at 04:48):

LaTeX thinks the TOC is too large; the numbers overflow

Tim Daly (Oct 26 2019 at 04:48):

Believe me, 5000 pages is small in the Intel world. The stack of x86 books makes that look like a pamphlet

Mario Carneiro (Oct 26 2019 at 04:49):

I assume you are trying to communicate, though...?

Mario Carneiro (Oct 26 2019 at 04:50):

are you talking to a person or a computer? Seems like trying for both at the same time leads to two poor jobs

Tim Daly (Oct 26 2019 at 04:50):

It is reasonably well organized by instruction. There are a LOT of instructions, as you've already noticed. At 1 instruction per page you already have several thousand pages.

Mario Carneiro (Oct 26 2019 at 04:51):

holy crap I'm 250 pages in and the TOC keeps going

Tim Daly (Oct 26 2019 at 04:52):

Literate programming, like Lisp, is an "Ah Ha! epiphany event". You don't "get it" until you "get it" and then you wonder why other people don't "get it". Once you "get it", you never want to go back.

Tim Daly (Oct 26 2019 at 04:52):

The Intel CPU is a complex beast

Tim Daly (Oct 26 2019 at 04:53):

I found mistakes in a lot of Intel assemblers.

Mario Carneiro (Oct 26 2019 at 04:53):

You should look at mine then

Tim Daly (Oct 26 2019 at 04:53):

Pointer?

Tim Daly (Oct 26 2019 at 04:54):

I want to understand things like how you generate lean output.

Tim Daly (Oct 26 2019 at 04:55):

I'm also cruising the Metamath.org website, per your recommendation.

Tim Daly (Oct 26 2019 at 04:59):

We're not that far apart in what we are trying to do, we are just coming at it from different backgrounds (proof vs computer algebra)

Mario Carneiro (Oct 26 2019 at 05:02):

x86 semantics is here -> https://github.com/digama0/mm0/blob/master/examples/x86.mm0

Tim Daly (Oct 26 2019 at 05:02):

By the way, I just got a guffaw from your MM0 paper. Back in the day, an Axiom build from scratch used to take 3 weeks on an RS6000.

Mario Carneiro (Oct 26 2019 at 05:02):

don't worry, we'll get there soon enough

Mario Carneiro (Oct 26 2019 at 05:03):

lean output uses the mm0-hs to-lean subcommand of the haskell MM0 toolchain

Tim Daly (Oct 26 2019 at 05:03):

Fortunately for me, Lisp allows runtime "patching".

Tim Daly (Oct 26 2019 at 05:09):

I've been doing open source development since 1997 (the Slashdot era). An Axiom "release" every 2 months for a large code base (>1M lines) costs me about 1 week. So if you build something large and complex, expect that it will come to dominate your time. I can't even read some of the posts here. I'm not sure how one would maintain a large mathematical collection. It will become a major issue.

Mario Carneiro (Oct 26 2019 at 05:11):

I tried to get stats on the AFP build time; here it says that it takes 42h of CPU time to build all of the AFP except the very_slow build group... it doesn't say how long these take

Tim Daly (Oct 26 2019 at 05:11):

Build issues (like travis), server issues, merging new code, and worst of all, dealing with the fact that the operating system and compiler changed so your code no longer compiles. One also has to deal with a community that (a) knows more than you about their code and (b) has strong opinions.

Tim Daly (Oct 26 2019 at 05:13):

Worse, since the axioms and definitions have no "human readable" form, they are quite opaque to the great unwashed in some areas (e.g. HoTT). That's where literate programming matters (but I'll let you have the epiphany on your own time :-) )

Mario Carneiro (Oct 26 2019 at 05:14):

Why can't axioms and definitions have a "human readable" form? The mathematics is mostly standard

Tim Daly (Oct 26 2019 at 05:19):

You are deep in the religion. It took me 10 courses (some of which you attended) to be able to read pages of Greek rules. I've invested about 4 years in the proof areas so far and I still struggle with Coq/Lean/Agda/Idris/Isabelle, not to mention reading ANY paper in the proof literature. As I mentioned before, a computer algebra colleague with 40 years in the computational mathematics business asked in a seminar "What is that funny fraction?" when confronted with a proof rule.

Believe me, unless you do this every day, a lot of it is completely opaque. I don't know who can possibly maintain it without words. The papers that get published usually only talk about "the interesting bits". And if you try to go back to Frege, like I did, you find his notation opaque.

Axiom, which accepts normal algebra-looking equations, is considered a "steep learning curve". Lean is a vertical smooth wall to most people.

Tim Daly (Oct 26 2019 at 05:24):

I took Harper's course on Cubical Type Theory and it was a real stretch for me.

Tim Daly (Oct 26 2019 at 05:50):

Every definition, lemma, and proof should each have a paragraph of natural language and a literature hyperlink.

Mario Carneiro (Oct 26 2019 at 05:57):

don't read frege, his notation is all over the place

Mario Carneiro (Oct 26 2019 at 05:57):

to be fair, he practically invented the idea of formal notation for logic, but it was a bit like edison's 1000 dead lightbulbs

Tim Daly (Oct 26 2019 at 07:22):

I'm looking deeply into Milawa http://www.cse.chalmers.se/~myreen/2015-jar-milawa.pdf for a system "proven down to the metal". It is particularly interesting because it takes a layered approach to the problem, aka the trusted kernel idea.

Tim Daly (Oct 26 2019 at 07:24):

The fact that it is lisp-based is also a feature as it is easy to translate logic into lisp notation. I've written machine vision code in C++ so I can read it. But I find that the Lean C++ code is opaque. I could not find how a single logic rule was implemented. (I'm looking at your master's thesis now)

Tim Daly (Oct 26 2019 at 07:25):

A Milawa implementation raises the possibility of using it against itself for checking.

Tim Daly (Oct 26 2019 at 07:36):

Another long-term issue with C++ is... which C++ "standard"? C++ emits a new standard every few years, so the C++ you write may not compile in the future. As an example, Axiom compiles to C under the covers. One of the issues is that the C library implementation has changed, which caused an amazing amount of maintenance grief. The new "C++ standard" will include "concepts", which will not compile in earlier versions. I also maintained the open source project Magnus (https://github.com/markuspf/magnus), which is a C++ computer algebra program that does infinite group theory. Keeping that alive in C++ is a nightmare. (Did you know that you can construct classes IN THE ARGUMENT LIST of a function definition at compile time?)

Mario Carneiro (Oct 26 2019 at 07:39):

as far as I'm aware, C++ has not made anything not compile that used to compile, as this is a very important aspect of the rationale for using C++ in the first place

Tim Daly (Oct 26 2019 at 07:40):

If Lean's goal is to import all of mathematics it needs a "30 year horizon" view, that is, it needs to be very, very maintainable.

Mario Carneiro (Oct 26 2019 at 07:40):

Usually, when a compiler update causes your code to break, that means you used undefined behavior

Tim Daly (Oct 26 2019 at 07:40):

Oh, how I wish that were true.

Giovanni Mascellani (Oct 26 2019 at 07:46):

If a compiler update breaks your code, then you probably relied on UB; but it is very well possible that a standard update breaks your code. Usually they try not to do that too much and have reasonable deprecation windows, but sometimes breaking changes happen. For example, C++17 removed std::auto_ptr.

Tim Daly (Oct 26 2019 at 07:50):

On the other hand, the common lisp code from the 1980s runs unchanged and there is only 1 standard.

Tim Daly (Oct 26 2019 at 07:56):

I've been associated with the Axiom code for nearly 40 years so I understand the issue of "long term maintenance". Where does Lean want to be in 40 years? Will it have 100,000 theorems, 50,000 definitions, 1/2 million lemmas, in dozens of areas of mathematics? Who, and how, is that going to be maintained?

Tim Daly (Oct 26 2019 at 07:58):

If @Kevin Buzzard wants all of mathematics to be written down in Lean, how is that going to be curated?

Giovanni Mascellani (Oct 26 2019 at 08:07):

One very nice thing of MM and MM0 is that they decouple the generation and storage of proofs. You can use whatever language you fancy to create and edit proofs. Then you save them in a (rather) simple format, specified independently of any language or tool, and you are perfectly interoperable with any other proof written for the same system. I don't know much about Lean, but I believe this should be the way to go for whoever wants to store information (not just proofs) in a durable fashion.

Together with durability, it also gives everybody the power to use their preferred tools and develop new tools using their preferred languages. This is pretty standard practice on the MM mailing list.

Scott Morrison (Oct 26 2019 at 08:09):

Well... "all of mathematics" is a big reach. I think at some point mathlib is going to need to start picking some more precise boundaries. I think if you interpret "all of mathematics" as "a strong undergraduate/masters education at top maths departments", it looks a lot more manageable.

Tim Daly (Oct 26 2019 at 08:09):

People leave projects for various reasons, such as their funding ran out, or they got a new job so they are not available for answering questions. (Several Axiom authors are dead and my Ouija skills aren't up to asking advanced math questions).

Scott Morrison (Oct 26 2019 at 08:10):

We're already a small but appreciable fraction of the way to that threshold, and it's not obvious to me that existing processes (a very public pull request process, with a gradually expanding crew of "permission to merge" reviewers) can't get us there.

Mario Carneiro (Oct 26 2019 at 08:10):

We've already seen a bit of a boundary with the external projects that have been done like flypitch

Mario Carneiro (Oct 26 2019 at 08:11):

ultimately I expect it to turn into something like the Isabelle library vs the AFP

Scott Morrison (Oct 26 2019 at 08:11):

And we're already marking some contributions as belonging in the "archive" directory, rather than mathlib core.

Scott Morrison (Oct 26 2019 at 08:11):

And Simon has been working on setting up a build process for "associated projects".

Scott Morrison (Oct 26 2019 at 08:12):

Certainly anything "research level", if we ever get there, won't belong in mathlib. (Moving it into mathlib might be like "someone writing the textbook".)

Tim Daly (Oct 26 2019 at 08:12):

My undergrad mathematics courses covered material known up to about 1800. That wasn't particularly useful for maintaining code written as part of a PhD thesis (e.g. the Risch Integration algorithm, or Cylindrical Algebraic Decomposition, or Groebner bases)

Mario Carneiro (Oct 26 2019 at 08:13):

most of that is target specific

Mario Carneiro (Oct 26 2019 at 08:13):

if you want Risch integration you have a bunch of lemmas to prove that have nothing to do with anything except Risch integration

Scott Morrison (Oct 26 2019 at 08:13):

I wasn't attempting to say that a good undergraduate education is all you need to be a useful maintainer of mathlib.

Tim Daly (Oct 26 2019 at 08:13):

HoTT is not target specific.

Mario Carneiro (Oct 26 2019 at 08:14):

HoTT is closer to library level

Scott Morrison (Oct 26 2019 at 08:14):

HoTT is an alternative foundational choice, which mathlib has pretty explicitly disavowed at this point.

Tim Daly (Oct 26 2019 at 08:15):

I'm clearly confused about Lean's and Kevin's goals.

Scott Morrison (Oct 26 2019 at 08:15):

(It is a lovely subject, but if you're not interested in it as a foundational system, I'd say it's "research level"; some undergraduates do learn it --- I've taught some --- but not many.)

Mario Carneiro (Oct 26 2019 at 08:15):

get mathematicians doing formal maths

Scott Morrison (Oct 26 2019 at 08:15):

Make theorem provers actually useful to mathematicians doing research.

Scott Morrison (Oct 26 2019 at 08:16):

The current mathlib library, realistically, is not going to survive a 30 year horizon. But it is a serious stress test of the design decisions of Lean, and an experiment to see if modern theorem provers can provide an environment that a mathematician can use.

Mario Carneiro (Oct 26 2019 at 08:16):

this is part of why kevin emphasizes definitions over theorems... theorems don't open up new areas to formalize, definitions do

Scott Morrison (Oct 26 2019 at 08:17):

Obviously we're not there yet --- there are a bunch of mathematicians using Lean now, but we've all invested a few years in getting up to speed, and none of us are close to being able to use it "in our day jobs".

Tim Daly (Oct 26 2019 at 08:18):

I'm a mathematician (an admittedly poor one) and I'm trying to do research in computer algebra so I'm likely part of your target audience.

Mario Carneiro (Oct 26 2019 at 08:18):

I just have to figure out how to archive the lean proofs in a more timeless format before the ship goes down

Tim Daly (Oct 26 2019 at 08:20):

I don't want the ship to go down. I'm looking to build my next 30 years on it.

Mario Carneiro (Oct 26 2019 at 08:20):

lean 3 will not last 30 years

Mario Carneiro (Oct 26 2019 at 08:20):

the writing is already on the wall, lean 4 exists

Scott Morrison (Oct 26 2019 at 08:21):

Lean 3 is hopefully dead in 2 years.

Tim Daly (Oct 26 2019 at 08:21):

Oh, boy! A new standard! Sigh.

Scott Morrison (Oct 26 2019 at 08:22):

I think the point to keep in mind is that all the current theorem provers _are really bad_. I like Lean, but because I think it's the best of a bad lot.

Mario Carneiro (Oct 26 2019 at 08:22):

I expect MM to survive the 30 year horizon, but lean is structurally not designed for that

Scott Morrison (Oct 26 2019 at 08:22):

Working on mathlib, and getting mathematicians to use it, is for the sake of helping birth theorem provers that are actually usable.

Scott Morrison (Oct 26 2019 at 08:22):

There's going to be some wreckage (libraries that bite the dust...) along the way.

Mario Carneiro (Oct 26 2019 at 08:22):

good UI is always associated with short lifespan

Mario Carneiro (Oct 26 2019 at 08:23):

because design has fads

Tim Daly (Oct 26 2019 at 08:23):

This isn't game programming. This is mathematics.

Mario Carneiro (Oct 26 2019 at 08:23):

you would be surprised how much that matters to people

Scott Morrison (Oct 26 2019 at 08:24):

I do see part of Mario's point here --- the editing experience in VS Code is pretty awesome, but obviously the VS Code plugin only has a lifespan of <5 years, before everyone moves on to the next best thing.

Mario Carneiro (Oct 26 2019 at 08:24):

look at apple products - they are known for their designs, but they only achieve that with a ridiculous release schedule

Scott Morrison (Oct 26 2019 at 08:25):

And so some of the extensions we'd like (a term inspector! a graphical rewriter! typeclass resolution parsers!) all have to provide ROI on 5 year timescales.

Mario Carneiro (Oct 26 2019 at 08:25):

not changing means becoming dated and ugly compared to the cool new thing

Scott Morrison (Oct 26 2019 at 08:26):

But I also wonder if Mario means something deeper by "good UI" than the actual editor experience.

Scott Morrison (Oct 26 2019 at 08:26):

In some sense even "dependent type theory" is "good UI".

Scott Morrison (Oct 26 2019 at 08:27):

It's lovely and expressive and feels like you're talking to the computer the way you actually think about things.

Mario Carneiro (Oct 26 2019 at 08:27):

I think eventually people will come around to my perspective that having a type theory is an editor feature

Tim Daly (Oct 26 2019 at 08:27):

I once collected over 100 "computer algebra systems" on a DVD and gave it out at a conference. All of those projects (and tens of thousands of github/sourceforge/savannah open source projects) died when the original authors left. You're saying that Lean is going to be one of the collection of "100 best theorem provers that died".

Scott Morrison (Oct 26 2019 at 08:27):

.... until you discover the dependent type theory is ... hard.

Scott Morrison (Oct 26 2019 at 08:27):

Well --- I was saying that I'll be happy when Lean 3 dies because Lean 4 has arrived and is better.

Scott Morrison (Oct 26 2019 at 08:28):

I don't think that counts as Lean dying.

Mario Carneiro (Oct 26 2019 at 08:28):

they call it the product lifecycle for a reason

Mario Carneiro (Oct 26 2019 at 08:29):

almost no software from 30 years ago is still worth using. The stuff that remains survived by being minimalistic and making that a selling point, like C

Scott Morrison (Oct 26 2019 at 08:30):

a long list of unix command line tools

Kevin Buzzard (Oct 26 2019 at 08:30):

Well... "all of mathematics" is a big reach. I think at some point mathlib is going to need to start picking some more precise boundaries. I think if you interpret "all of mathematics" as "a strong undergraduate/masters education at top maths departments", it looks a lot more manageable.

That's basically the goal of the Xena project, and I am hoping that this is a sufficiently "small" goal that it will be both (a) feasible and (b) not impossible to curate. It seems to me to be a very reasonable boundary for mathlib. If you look at the for_mathlib directory in the perfectoid project you can see that the stuff that ends up in there is either completely foundational (e.g. lemmas on uniform spaces/with_zero/nnreal) or stuff which people meet in the 4 year MSc course at Imperial (normed spaces, polynomials, primes, submodules etc). Stuff like 3200 LOC on the theory of valuations taking values in some funny linearly ordered monoid, which is at the heart of the definition of an adic space, is not pencilled in to go in there because you have to draw a line somewhere and we certainly don't teach that to the MSc students unless they're doing a specialised project. I agree that this is an interesting long-term question.

Scott Morrison (Oct 26 2019 at 08:30):

Mathematica is 30 years old, and still going pretty strong

Mario Carneiro (Oct 26 2019 at 08:31):

C++ too... I think the other mode is by growing continuously like a ponzi scheme

Scott Morrison (Oct 26 2019 at 08:32):

so far I would guess that ~90% of mathlib is covered in the union of "good masters degrees".

Scott Morrison (Oct 26 2019 at 08:32):

but we're probably only at most 20% of that union. :-)

Kevin Buzzard (Oct 26 2019 at 08:36):

My thinking currently is "oh look, here are a ton of young mathematicians which I have easy access to, if I go on about Lean all the bloody time then a few of them will get interested and start formalising stuff which they're seeing in lectures, and eventually there will be enough of them to make more of our degree just magically appear in mathlib". By the way, this week I suggested writing a new 3rd year UG course on doing maths in dependent type theory and the suggestion was met with enthusiasm.

Tim Daly (Oct 26 2019 at 08:37):

Axiom was one of "the big 3" (Mathematica, Maple, and Axiom). Axiom was sold commercially for a while and then withdrawn and given to me. Maple was sold to a Japanese company. Mathematica still exists but there are problems. The big problem is that companies last, on average, 15 years. A second problem is that, because MMA is the primary company asset it would need to be sold rather than open sourced, as happened with Macsyma from Symbolics when they went under.

Can you imagine the "hole" that will develop in mathematics when Mathematica is no longer available?

Axiom is trying to build a long term, solid foundation for computational mathematics. While it covers areas of computer algebra it has so far neglected the proof area. The merger of these two streams of computational mathematics, to provide a stable, proven basis is the project goal.

Product cycles are for games. Mathematics is forever.

Kevin Buzzard (Oct 26 2019 at 08:38):

I know life isn't as easy as this really, but that's the vague plan at this point. Occasionally you run into a Kenny or a Chris, and then all of a sudden you have localisation of rings or quadratic reciprocity appearing in mathlib. And one day the same thing will happen for the basic theory of differentiable functions in one real or complex variable.

Marc Huisinga (Oct 26 2019 at 09:47):

mathematics is not entirely forever, either. you can see the effect of "product cycles" in math as well, where old and clumsy ui is deprecated in favor of new and snappy ui and people using the new ui have a hard time working with the old ui. notation changes, the underlying abstractions and intuitions change, and the writing styles of papers change.
as a result, old papers can be pretty hard to read, and old results are adapted to the new ui.
itp is still relatively young and thus evolving through these cycles more quickly, especially with good ui being so important to the area. give it some time, let it evolve and maybe we will eventually be able to meet the 30 year horizon when we are sufficiently happy with the ui we've got!

Tim Daly (Oct 26 2019 at 10:32):

Unfortunately, mathematical software doesn't "evolve". People always start from scratch. As I mentioned above, there are hundreds of computer algebra systems, many started with the insight that C++ allows operator overloading so...Hey! We can do polynomials! These computer algebra systems were academic "hand waving" and ended up contributing nothing but papers to the field.

The big systems like Mathematica, Maple, and Axiom had many people contributing in a similar style over many years. Most important is that they had a continuous stream of funding.

Systems like Sage and Magnus got academic funding for a couple years. When the money ran out they ran into trouble. William Stein left the University of Washington to try to save Sage.

Coq seems to have a funding stream that has continued over many years. If Lean gets enough places using it for teaching and some research mathematicians using it, then it might survive. But developing it with "this week's game" mindset means you can't build on the work.

So my question becomes "Is Lean going to be around in 5 years?" Surely if we want to build a stable base that mathematicians can use both for teaching and research, we want a system that isn't going to devolve into papers.

Of all the software projects I've done over the last 50 years, only the mathematics software has any sign of being "forever". Everything else I've written disappeared.

Marc Huisinga (Oct 26 2019 at 10:54):

i don't think that we always start from scratch. you can write translators like mario did, port stuff manually like in mathlib in the past or add some sort of backwards compatibility like what has been discussed for lean 4 where lean 3 code could be imported into lean 4.

sure, there's a compromise - you most certainly cannot expect users to keep up with evolution that is too frequent, and hence you need to listen to your userbase. from what i've gathered over my time here, most people would welcome a switch to lean 4 in order to fix some of the shortcomings of lean 3. i did not get the impression that lean is developed with a "this week's game" mindset.

whether lean will be around in 5 years does not just depend on whether lean is retaining backwards compatibility, it might also depend on exactly the fact that lean is willing to evolve and cater to the needs of its users.

Kevin Buzzard (Oct 26 2019 at 10:56):

With Lean 4 we are starting from scratch, right?

Kevin Buzzard (Oct 26 2019 at 10:56):

Whatever fancy things there might be, we're still starting from scratch.

Marc Huisinga (Oct 26 2019 at 10:57):

there was some talk here about being able to import lean 3 code into lean 4, not sure whether that is actually feasible. i'll ask sebastian when he's back from vacation ;)

Kevin Buzzard (Oct 26 2019 at 10:58):

I have no real idea about when we will be able to start thinking about these question seriously.

Marc Huisinga (Oct 26 2019 at 11:04):

on an unrelated note, i think it would be better to have discussions like this in #general

Tim Daly (Oct 26 2019 at 11:13):

Actually, I fear I sound too negative and should stop. I really like Lean and am looking at ways to use it from Axiom.

Tim Daly (Oct 26 2019 at 12:40):

This paper might be of interest as a "crossover" between logic and computer algebra:
SC2: Satisfiability Checking Meets Symbolic Computation

Tim Daly (Oct 27 2019 at 09:06):

Does Lean support first-class dependent types? The paper "Type Classes for Mathematics in Type Theory" (https://arxiv.org/pdf/1102.1323.pdf) talked about essentially this in Coq.

Scott Morrison (Oct 27 2019 at 12:08):

Yes.
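As a small illustration (a sketch added here, not from the thread): in Lean 3 a function's return type can depend on the *value* of an argument, e.g. the length index of a `vector`:

```lean
import data.vector

-- First-class dependent types: the result type `vector ℕ n`
-- depends on the value of the argument `n`.
def zeros : Π n : ℕ, vector ℕ n
| 0       := vector.nil
| (n + 1) := vector.cons 0 (zeros n)
```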


Last updated: Dec 20 2023 at 11:08 UTC