Zulip Chat Archive

Stream: general

Topic: Lean vs Coq

Ramkumar Ramachandra (Jan 04 2020 at 12:16):

Is Lean slow because a VM is used for computations?

Mario Carneiro (Jan 04 2020 at 12:27):

This is a vague statement without a reference for comparison, but the short answer is yes. When you evaluate a lean tactic or lean function using #eval, it executes a bytecode in a similar way to Python, leading to the same order of speed (it's certainly not as well optimized as python though).

Mario Carneiro (Jan 04 2020 at 12:29):

One of the biggest changes with Lean 4 is an overhaul of this system. Lean code will now be transpiled to C and then compiled and run, so you are likely to see performance similar to Haskell (although again, optimization is an infinitely deep rabbit-hole)

Ramkumar Ramachandra (Jan 04 2020 at 12:32):

I see, that's good to hear. I was comparing the compile speeds of mathlib and math-comp, which are comparable in size.

Mario Carneiro (Jan 04 2020 at 12:32):

@Ramkumar Ramachandra You made a similar claim on coq-club regarding speed of Coq over lean for everyday elaboration. My personal experience is the opposite, so if you could clarify what kind of benchmark you are thinking of that would be helpful

Mario Carneiro (Jan 04 2020 at 12:33):

How fast does math-comp compile? Is the material comparable? Does this include the coq standard library?

Ramkumar Ramachandra (Jan 04 2020 at 12:33):

Just plain wall-clock time. Run make on mathlib or math-comp. math-comp takes about 10 mins to build, and the loc-count is similar.

Ramkumar Ramachandra (Jan 04 2020 at 12:34):

It doesn't include the standard library, and I think the material is comparable, but I'm a beginner.

Mario Carneiro (Jan 04 2020 at 12:35):

I think the Coq standard library is much bigger than the lean core lib

Mario Carneiro (Jan 04 2020 at 12:36):

Also, I suspect that math-comp is much more low level in its tactic usage, leading to faster run times. I could be wrong but Gonthier has some opinions in this area that are not shared elsewhere in the coq ecosystem

Ramkumar Ramachandra (Jan 04 2020 at 12:37):

lean's stdlib is 18k loc, while coq's is 122k. Coq's stdlib is indeed much larger.

Mario Carneiro (Jan 04 2020 at 12:37):

I think mathlib is closer to coq stdlib + math-comp

Mario Carneiro (Jan 04 2020 at 12:38):

but this is still a very vague equivalence because the details are all different

Ramkumar Ramachandra (Jan 04 2020 at 12:38):

That would be incorrect. math-comp uses ssreflect heavily.

Mario Carneiro (Jan 04 2020 at 12:38):

Right, that's what I mean

Mario Carneiro (Jan 04 2020 at 12:38):

ssreflect is a very direct imperative tactic style

Mario Carneiro (Jan 04 2020 at 12:39):

it's fast to work with that

Ramkumar Ramachandra (Jan 04 2020 at 12:39):

Ah; I wouldn't know.

Patrick Massot (Jan 04 2020 at 12:39):

Yes, Gonthier and his team are extremely proud of using almost no automation.

Mario Carneiro (Jan 04 2020 at 12:39):

A representative of the opposite camp would be isabelle, which uses lots of automated proof searching and "blast", "auto", "sledgehammer" type tactics

Ramkumar Ramachandra (Jan 04 2020 at 12:40):

Yes. I believe they're also starting to play with ML-based proof search.

Mario Carneiro (Jan 04 2020 at 12:40):

I'm pretty sure this split is the number 1 aspect that contributes to overall proof checking time of entire libraries

Ramkumar Ramachandra (Jan 04 2020 at 12:41):

... okay, but lean is similar enough to coq to compare on an loc-basis, no?

Mario Carneiro (Jan 04 2020 at 12:42):

Roughly, yes. I think lean is a bit more terse in general from my impressions

Ramkumar Ramachandra (Jan 04 2020 at 12:42):

Do you have an answer to my question about formalizing ∞-categories, by the way?

Mario Carneiro (Jan 04 2020 at 12:43):

I've been wanting to see how infinity cats are formalized for a long time, because I don't understand the math and will continue not to until someone actually writes a formal definition down instead of vague words

Mario Carneiro (Jan 04 2020 at 12:44):

There is nothing in principle stopping anyone from writing it down assuming they actually know what they want

Mario Carneiro (Jan 04 2020 at 12:45):

I will say that I think it is ludicrous to consider changing your foundations (kernel) in order to accomodate X mathematics. If you need to do so you've already lost

Ramkumar Ramachandra (Jan 04 2020 at 12:46):

I've just started working on formalizing ∞-categories in Coq. The textbook-definition is the most straightforward way to think about it, and you kind of have to bend over backwards in the Coq formalization -- it's quite ugly.

Mario Carneiro (Jan 04 2020 at 12:46):

Do you have a link?

Mario Carneiro (Jan 04 2020 at 12:47):

I think the lean way wouldn't be so different from that, so I'm interested to know what bending you have to do

Ramkumar Ramachandra (Jan 04 2020 at 12:49):

See https://gist.github.com/artagnon/18c5b8ef22704ef5e92a2111da627692 -- I haven't made much progress though.

Ramkumar Ramachandra (Jan 04 2020 at 12:51):

I couldn't write a currified version: it became too complicated.

Ramkumar Ramachandra (Jan 04 2020 at 13:03):

I will say that I think it is ludicrous to consider changing your foundations (kernel) in order to accomodate X mathematics. If you need to do so you've already lost

I would disagree with this, because HoTT is quite a deep change.

Mario Carneiro (Jan 04 2020 at 13:08):

I'm saying that if you want to be a theorem prover for general mathematics, you can't be playing games with the foundations all the time. Yes, this is throwing HoTT and Cubical type theories under the bus

Mario Carneiro (Jan 04 2020 at 13:10):

That definition, at least, would go through in lean with only syntactic changes

Mario Carneiro (Jan 04 2020 at 13:11):

You would have to define it as a pair like you did, though; they are mutually recursive so you need them both

Ramkumar Ramachandra (Jan 04 2020 at 13:12):

It's more complicated than garden-variety mutually-recursive types: the second definition takes the type of the first definition as an argument.

Ramkumar Ramachandra (Jan 04 2020 at 13:12):

I did indeed try with mutually-recusive types, before I hit this limitation.

Mario Carneiro (Jan 04 2020 at 13:12):

right, it's an inductive-recursive definition

Mario Carneiro (Jan 04 2020 at 13:13):

If you didn't have the index n this would be impossible to define in lean

Ramkumar Ramachandra (Jan 04 2020 at 13:13):

Indeed; I thought so too.

Mario Carneiro (Jan 04 2020 at 13:14):

but as it stands this is just a regular recursive function on nat

Ramkumar Ramachandra (Jan 04 2020 at 13:16):

I'm saying that if you want to be a theorem prover for general mathematics, you can't be playing games with the foundations all the time. Yes, this is throwing HoTT and Cubical type theories under the bus

That's a valid viewpoint, but Coq is a full-fledged programming language + proof assistant. They do play games with TT and PLT.

Mario Carneiro (Jan 04 2020 at 13:17):

I know. And I don't trust their metatheory at all

Ramkumar Ramachandra (Jan 04 2020 at 13:17):

Why so? "Coq Coq Correct!" is quite an accessible exposition.

Ramkumar Ramachandra (Jan 04 2020 at 13:18):

I would sympathize with the viewpoint that you don't care about good type theoretic properties because a VM "works in practice", but the trust in the metatheory is quite high.

Ramkumar Ramachandra (Jan 04 2020 at 13:19):

Note that Coq has been used to build CompCert, which was (and still is) an engineering marvel.

Ramkumar Ramachandra (Jan 04 2020 at 13:21):

Lean doesn't try to be a general-purpose programming language for building verified software stacks. It has a narrower focus: a big step-up for pure mathematicians, who are currently on mathematica and sage.

Mario Carneiro (Jan 04 2020 at 13:22):

I have not read "Coq Coq Correct" yet, but skimming it it looks to be a step in the right direction. But It is still a subset of Coq. I won't be satisfied until the entire kernel is covered

Mario Carneiro (Jan 04 2020 at 13:22):

Being able to build things in a proof assistant doesn't mean it doesn't have a bug in it

Mario Carneiro (Jan 04 2020 at 13:22):

That is only half of the equation

Ramkumar Ramachandra (Jan 04 2020 at 13:23):

Indeed, and work is in progress. Coq Coq Correct! already covers a very large subset, by the way.

Mario Carneiro (Jan 04 2020 at 13:24):

I wrote the paper on Lean's Type Theory because I didn't want Lean to make the same mistake. That paper covers everything the kernel is capable of doing (and some things it can't), so that soundness means something

Ramkumar Ramachandra (Jan 04 2020 at 13:24):

In practice, too, very few serious bugs have been reported in Coq, over the last twenty years.

Ramkumar Ramachandra (Jan 04 2020 at 13:25):

Right; although how much of your paper is mechanized in Lean? :)

Mario Carneiro (Jan 04 2020 at 13:25):

Furthermore, my view is that both of these systems are far overcomplicated and you can get by with a much much simpler metatheory (see metamath zero)

Ramkumar Ramachandra (Jan 04 2020 at 13:25):

MM0 is interesting, yes.

Mario Carneiro (Jan 04 2020 at 13:26):

That's where I'm really putting my money where my mouth is - it's a theorem prover that proves its own implementation

Ramkumar Ramachandra (Jan 04 2020 at 13:26):

Perhaps you missed it, but Coq Coq Correct! formalizes a large subset of the Coq engine in Coq.

Mario Carneiro (Jan 04 2020 at 13:27):

I didn't miss it - that's actually the best reference I've seen yet as to how to implement a coq kernel

Mario Carneiro (Jan 04 2020 at 13:27):

There is of course a huge front end too though

Mario Carneiro (Jan 04 2020 at 13:28):

and I don't hold much hope that this be diversified to other front ends

Mario Carneiro (Jan 04 2020 at 13:30):

For reimplementing a kernel, it's not necessary that the metatheory work actually be formalized, as long as it is mathematically precise and complete. Coq papers have historically had difficulty with the "complete" part

Mario Carneiro (Jan 04 2020 at 13:31):

Formalizing it is the next level, but it helps to have an informal version too

Ramkumar Ramachandra (Jan 04 2020 at 13:34):

Agreed. Keep in mind that Coq is twenty years old, and has accumulated quite a lot of features in that time.

Mario Carneiro (Jan 04 2020 at 13:35):

The solution to that problem is to not cultivate a community where problems are solved by extending the kernel

Mario Carneiro (Jan 04 2020 at 13:36):

which is why I said "don't play games with the foundations" above

Mario Carneiro (Jan 04 2020 at 13:36):

a large and complex kernel is the inevitable consequence

Ramkumar Ramachandra (Jan 04 2020 at 13:36):

Well, no so much "extending" as "rethinking foundational design choices". I do agree that extending the kernel makes it hard to trust the metatheory.

Patrick Massot (Jan 04 2020 at 13:38):

@Ramkumar Ramachandra Coq is older than you think it is...

Mario Carneiro (Jan 04 2020 at 13:38):

or time is further along

Patrick Massot (Jan 04 2020 at 13:38):

Or this is nat subtraction hell again.

Patrick Massot (Jan 04 2020 at 13:39):

Who knows what to expect of natural numbers subtraction?

Ramkumar Ramachandra (Jan 04 2020 at 13:40):

Preserving good type theoretic properties are important to the Coq community, so rethinking foundations is inevitable. They'd never be okay with a VM, I think.

Ramkumar Ramachandra (Jan 04 2020 at 13:41):

Indeed, Coq is thirty years old, not twenty.

Kevin Buzzard (Jan 04 2020 at 13:41):

There are several definitions of infinity category in the literature. My instinct would be to formalise the book by Riehl and Verity

Patrick Massot (Jan 04 2020 at 13:42):

Thirty years means you count only from the first official release, right?

Patrick Massot (Jan 04 2020 at 13:42):

I think the first version is from 1984.

Kevin Buzzard (Jan 04 2020 at 13:43):

http://www.math.jhu.edu/~eriehl/elements.pdf

Ramkumar Ramachandra (Jan 04 2020 at 13:47):

@Mario Carneiro Why don't you publish your paper on Lean's type theory at POPL or ICFP? It currently has very low visibility.

Mario Carneiro (Jan 04 2020 at 13:47):

Maybe I should... It was my masters thesis so publication was not a prerequisite

Mario Carneiro (Jan 04 2020 at 13:52):

From Riehl §1.2:

In §1.1, we presented “analytic” proofs of a few of the basic facts about quasi-categories. The category theory of quasi-categories can be developed in a similar style, but we aim instead to develop the “synthetic” theory of infinite-dimensional categories, so that our results will apply to many models at once. To achieve this, our strategy is not to axiomatize what these infinite-dimensional categories are, but rather axiomatize the “universe” in which they live.

I'm getting a bad feeling about this...

Patrick Massot (Jan 04 2020 at 13:53):

Doesn't it sound a bit Mochizuki-like?

Ramkumar Ramachandra (Jan 04 2020 at 13:55):

@Kevin Buzzard Thanks; it's a good book, with emphasis on homotopy coherent theory.

Ramkumar Ramachandra (Jan 04 2020 at 13:57):

I'm working through Lurie's Higher Topos Theory from 2009, but it's a tad bit too advanced for me at this point.

Kevin Buzzard (Jan 04 2020 at 13:58):

Different people mean different things by infinity category. @Emily Riehl and Verity are attempting to axiomatise the things everyone believes about them so their proofs will apply to all definitions. At least this is my understanding of it.

Mario Carneiro (Jan 04 2020 at 14:00):

While synthetic, our work is not schematic or hand-wavy, with the details of how to make things fully precise left to “the experts” and turtles all the way down.[2] Rather, we prove our theorems starting from a short list of clearly-enumerated axioms, and our conclusions are valid in any model of∞-categories satisfying these axioms.
...
[2] A less rigorous “model-independent” presentation of ∞-category theory might confront a problem of infinite regress,since infinite-dimensional categories are themselves the objects of an ambient infinite-dimensional category, and in developing the theory of the former one is tempted to use the theory of the latter. We avoid this problem by using a very concrete model for the ambient (∞,2)-category of ∞-categories that arises frequently in practice and is designed to facilitate relatively simple proofs. While the theory of (∞,2)-categories remains in its infancy, we are content to cut the Gordian knot in this way.

I don't know if I should be reassured by this

Mario Carneiro (Jan 04 2020 at 14:00):

They didn't define what "enriched over quasi-categories" means so I question their claim

Mario Carneiro (Jan 04 2020 at 14:01):

I guess they are still drawing on a large unspecified amount of background material

Mario Carneiro (Jan 04 2020 at 14:03):

oh wait, they have it in an appendix

Kevin Buzzard (Jan 04 2020 at 14:04):

I think their work is formalisable. Just like there is a category of all sets and a 2-category of all categories, there's an infinity-2 category of all infinity categories

Mario Carneiro (Jan 04 2020 at 14:06):

I'm just trying to figure out if I can trace through all the definitions leading to infinity cosmos without leaving the book

Kevin Buzzard (Jan 04 2020 at 14:06):

I think this stuff is going to be important in modern mathematics, for example Scholze's new preprint on ArXiv proves results about cohomology of schemes using infinity categories

Patrick Massot (Jan 04 2020 at 14:06):

Notice the book is unfinished.

Mario Carneiro (Jan 04 2020 at 14:07):

I should hope that this at least is done! (A book on infinity cats should define what is an infinity cat)

Mario Carneiro (Jan 04 2020 at 14:08):

certainly once you've got 700 pages in you should have a good answer to the question

Patrick Massot (Jan 04 2020 at 14:08):

Not necessarily, they could decide to change something after seeing that page 690 is more painful than it should be.

Patrick Massot (Jan 04 2020 at 14:09):

I know nothing about all this though. Maybe it's already completely defined.

Ramkumar Ramachandra (Jan 04 2020 at 14:09):

@Kevin Buzzard What I mean by ∞-categories are the (∞, 1)-categories formalized in HTT; I think that is the generally-accepted definition now. The general approach is to a type theoretic interpretation is to start with simplicial sets, build ∞-groupoids, and work out the coherence constraints that come with interpreting the model category. Indeed, Voevodsky started the HoTT line of work by defining types to be ∞-groupoids, and Shulman's [2013, 2019] work does solve the coherence problem, by which ∞-topoi can be strictified into Quillen Model Categories. I get the impression that he's not too fond of the ongoing cubical work, because he wants even stronger foundations.

[Please take my comments with a pinch of salt, since I'm a beginner]

Mario Carneiro (Jan 04 2020 at 14:10):

HoTT doesn't formalize (∞, 1)-categories at all AFAIK; they are the metatheory of HoTT

Ramkumar Ramachandra (Jan 04 2020 at 14:11):

Yes, that's what I meant.

Ramkumar Ramachandra (Jan 04 2020 at 14:11):

It's a very pleasant and elegant metatheory.

Kevin Buzzard (Jan 04 2020 at 14:11):

But are they Luries infinity categories?

Koundinya Vajjha (Jan 04 2020 at 14:12):

I think this stuff is going to be important in modern mathematics, for example Scholze's new preprint on ArXiv proves results about cohomology of schemes using infinity categories

@Kevin Buzzard could you link to that paper?

Ramkumar Ramachandra (Jan 04 2020 at 14:12):

HoTT doesn't formalize (∞, 1)-categories at all AFAIK; they are the metatheory of HoTT

HTT: Higher Topos Theory.

Ramkumar Ramachandra (Jan 04 2020 at 14:13):

But are they Luries infinity categories?

I can't say for sure, but Shulman's [2013, 2019] work should have the answer. I'm still working through them.

Johan Commelin (Jan 04 2020 at 14:14):

@Ulrik Buchholtz You are our local expert on this stuff... maybe you can enlighten us

Mario Carneiro (Jan 04 2020 at 14:14):

I have always been frustrated reading material in that area that assumes everyone knows what an infinity cat is as if it were a single well understood concept. If you just assume that a thing exists that has the properties you want (i.e. HoTT) then it's easy, but building up the theory formally has apparently never been done

Patrick Massot (Jan 04 2020 at 14:15):

Kevin probably means https://arxiv.org/abs/1912.10932

Kevin Buzzard (Jan 04 2020 at 14:22):

Yes -- sorry -- doing ten things at once. Cesnavicius-Scholze.

Ulrik Buchholtz (Jan 04 2020 at 14:29):

I wouldn't call myself an expert on the model-theory of HoTT, but here's my understanding of where we are: From Shulman, we know that any $(\infty,1)$ -topos (with Lurie's definition) can be presented by a super-nice model category that in turn can be strictified to a model of HoTT. I think it's still not quite what we want: what if we take different presentations, do we get different results (we shouldn't, up to homotopy/identification).

Ulrik Buchholtz (Jan 04 2020 at 14:31):

And it's true we don't know whether HoTT (meaning here MLTT+Univalence+HITs) can define the correct type of $(\infty,1)$ -categories, etc. (A majority conjecture that it's impossible without some extension to the theory, I'm agnostic)

Ulrik Buchholtz (Jan 04 2020 at 14:33):

So until we either figure out how to do it, or find an acceptable extension (one proposal is two-level type theory with Reedy limits), HoTT is not strong enough to do it's own model theory in the correct way – of course, quasicategories can be formalized but that's not adequate from the univalent point of view: it's not the correct type.

Kevin Buzzard (Jan 04 2020 at 14:34):

What I don't really understand here is the following. Maths is about lots of things, including infinity categories and lots of other stuff which is completely unrelated to infinity categories. It seems to be that something like UniMath starts with an infinity category of types and then says "that's it". But I want other infinity categories, and I want a gazillion things that aren't infinity categories. So it seems to me that I want to define infinity categories, like I would do in lean, rather than being stuck with them and nothing else

Ulrik Buchholtz (Jan 04 2020 at 14:35):

Absolutely, HoTT is currently inadequate to properly talk about $(\infty,1)$ -categories! But only HoTT has a chance of even doing so univalently, all other foundations must rely on models.

Ulrik Buchholtz (Jan 04 2020 at 14:35):

I'll say something about all this in my talk on Monday, BTW

Mario Carneiro (Jan 04 2020 at 14:35):

what's wrong with models?

Mario Carneiro (Jan 04 2020 at 14:36):

somehow I'm missing the added value here

Ulrik Buchholtz (Jan 04 2020 at 14:36):

With models you have to always prove that constructions are independent of representations: it's like doing group theory only with group presentations and no actual groups.

Mario Carneiro (Jan 04 2020 at 14:37):

But we have a definition of what a group is

Mario Carneiro (Jan 04 2020 at 14:38):

Apparently we can't decide what the definition of an infinity cat is, but at least Riehl & Verity have a definition of what a definition of an infinity cat is

Ulrik Buchholtz (Jan 04 2020 at 14:38):

Right, but the type of groups in 0-truncated foundations is, well, 0-truncated, so you have to prove that all group-constructions respect isomorphism by hand. The issue with higher structures is similar, just more complicated.

Mario Carneiro (Jan 04 2020 at 14:39):

that is an issue of automation not foundations

Ulrik Buchholtz (Jan 04 2020 at 14:39):

There are many models for $(\infty,1)$ -categories, but they are all equivalent: quasicategories with categorical equivalences, complete Segal spaces with equivalences etc

Mario Carneiro (Jan 04 2020 at 14:40):

it seems to me like there has been a change of terminology in calling those things "models"

Ulrik Buchholtz (Jan 04 2020 at 14:41):

that is an issue of automation not foundations

Depends on your point of view, I think. I think it's conceptually very appealing to be able to work with the types of mathematical objects themselves, and not just with models of them.

Ulrik Buchholtz (Jan 04 2020 at 14:41):

Well, “model” means a billion things in math. Here, it's like “presentation” in group presentations.

Mario Carneiro (Jan 04 2020 at 14:41):

those are not (particular) infinity cats

Ulrik Buchholtz (Jan 04 2020 at 14:42):

I though you wanted a definition of infty cats. Did I misunderstand your question?

Mario Carneiro (Jan 04 2020 at 14:42):

those are candidates for how you might enumerate all infinity cats

Mario Carneiro (Jan 04 2020 at 14:43):

I get why you call them models, but it's not like the situation with group presentations because in that case "models" would be individual groups

Daniel Gratzer (Jan 04 2020 at 14:44):

I don't think that these are models in the sense you're expecting: a single category is a model of (essentially algebraic) theory. This 'model' is more like a presentation of CAT itself. Though @Ulrik Buchholtz might correct me on this

Mario Carneiro (Jan 04 2020 at 14:44):

Right

Ulrik Buchholtz (Jan 04 2020 at 14:45):

In the case of group presentations and groups, they're like the presentations. They are different presentations of the same type: that of $(\infty,1)$ -categories.

Ulrik Buchholtz (Jan 04 2020 at 14:46):

Just like the Lean-type (or any 0-truncated type) of groups, together with the data of group isomorphisms present the 1-type of groups.

Mario Carneiro (Jan 04 2020 at 14:48):

So there shouldn't be any problem with just picking one of the "presentations" and proving all others equivalent?

Ulrik Buchholtz (Jan 04 2020 at 14:48):

The data of (quasicategories, categorical equivalences) is a presentation of the homotopy type of $(\infty,1)$ -categories. (With structure making it a large $(\infty,1)$ -category itself.)

Ulrik Buchholtz (Jan 04 2020 at 14:48):

Absolutely, which is what people have been doing

Mario Carneiro (Jan 04 2020 at 14:49):

sounds like a plan

Ulrik Buchholtz (Jan 04 2020 at 14:49):

Go for it :)

Mario Carneiro (Jan 04 2020 at 14:49):

in particular, nothing stops us from doing this in lean

Ulrik Buchholtz (Jan 04 2020 at 14:49):

Absolutely, because people are doing it in informal set-theoretic math, and Lean captures that quite well

Ulrik Buchholtz (Jan 04 2020 at 14:50):

We could also do it that way in univalent type theories, but there the level of ambition is higher: we'd want the presented type itself.

Mario Carneiro (Jan 04 2020 at 14:50):

and the reservations of your HoTT colleagues are about some kind of ambiguity preserving definition

Mario Carneiro (Jan 04 2020 at 14:52):

where by "the type itself" you mean that in the metatheory you can prove that this is equivalent to the... real... definition?

Mario Carneiro (Jan 04 2020 at 14:54):

it's not clear to me what the set theoretic definition is losing out on here. You still have the language to say what equivalences matter, and show that you have the right definition up to such and such kind of equivalence

Ulrik Buchholtz (Jan 04 2020 at 14:54):

That's one thing: that metatheoretically we could show that in an infinity-topos it externalizes to the stack of infinity-categories. Or just internally that it's a type with the correct identity type (meaning categorical equivalences)

Johan Commelin (Jan 04 2020 at 14:56):

@Mario Carneiro A group is a subset of matrix R n n (for some ring R) such that it is closed under multiplication, and inverses, and contains the unit matrix.

Ulrik Buchholtz (Jan 04 2020 at 14:56):

For the rock-bottom foundation, I don't think you lose out. But we want to use HoTT to talk about other infty-toposes than infty-groupods (animated sets!?), and there it matters:

Johan Commelin (Jan 04 2020 at 14:56):

Is that a satisfactory definition of groups?

Mario Carneiro (Jan 04 2020 at 14:56):

sure

Mario Carneiro (Jan 04 2020 at 14:56):

I mean, it's not what I find on wikipedia, so that's a black mark

Johan Commelin (Jan 04 2020 at 14:57):

Well, some people would argue that it's not a very elegant definition. It doesn't really capture what we want.

Johan Commelin (Jan 04 2020 at 14:57):

Something similar is the problem with inftycats

Mario Carneiro (Jan 04 2020 at 14:57):

It does capture what we want though, assuming we got the definition of "what we want" right

Ulrik Buchholtz (Jan 04 2020 at 14:58):

@Johan Commelin that's a good point: there's some hope that an eventual HoTT (or HoTT+extension) definition of infty-cats would be nicer/easier to work with than via quasicats

Mario Carneiro (Jan 04 2020 at 14:58):

As for finding an elegant definition, that's a process we can work on

Johan Commelin (Jan 04 2020 at 14:58):

@Mario Carneiro I think that working with Lurie's quasi-categories in a prover like Lean will be very painful.

Mario Carneiro (Jan 04 2020 at 14:59):

Usually community consensus works pretty well to find this though (e.g. matching wikipedia)

Johan Commelin (Jan 04 2020 at 14:59):

A synthetic definition would probably remove a lot of proof obligations, and lots "keeping-track of an infinite stack of homotopy-data"

Mario Carneiro (Jan 04 2020 at 15:00):

There is nothing stopping you from modeling the synthetic definition in the form of theorems that are crafted to make the proof obligations trivial

Ulrik Buchholtz (Jan 04 2020 at 15:00):

BTW, we already have a synthetic definition by working in the infty-topos of simplicial spaces: there we can synthetically define complete Segal spaces.

Johan Commelin (Jan 04 2020 at 15:00):

If doing something stupid results in a syntax error, instead of "hey, you've struggled for 37 minutes, maybe you just can't supply all the infinite amount of proof obligations, because you did something silly on the previous line"

Ulrik Buchholtz (Jan 04 2020 at 15:01):

If we could internalize this model, we could use it to prove facts about infty-cats at the ambient level

Mario Carneiro (Jan 04 2020 at 15:01):

there is no infinite amount of proof obligations, there never has been

Johan Commelin (Jan 04 2020 at 15:01):

That depends on how you count

Mario Carneiro (Jan 04 2020 at 15:01):

to the extent that HoTT makes sense at all, it's bookkeeping all this information. So it's not infinite

Johan Commelin (Jan 04 2020 at 15:01):

Of course the infinite amount can be supplied with finitely many keystrokes

Johan Commelin (Jan 04 2020 at 15:02):

Sure, but we don't do HoTT here

Mario Carneiro (Jan 04 2020 at 15:02):

It's infinite in the same sense that there are an infinite number of proof obligations to show + on nat is commutative

Mario Carneiro (Jan 04 2020 at 15:03):

If we can model HoTT, we can do HoTT in lean/set theory with epsilon overhead

Ulrik Buchholtz (Jan 04 2020 at 15:04):

That would actually be great: if you could formalize infty-toposes a la Lurie and prove Shulman's strictification and model result

Johan Commelin (Jan 04 2020 at 15:04):

If we can model HoTT, we can do HoTT in lean/set theory with epsilon overhead

How would that work?

Ulrik Buchholtz (Jan 04 2020 at 15:04):

But how nice is it to do proofs in an object logic/type theory? Is it really $\varepsilon$ overhead?!

Johan Commelin (Jan 04 2020 at 15:05):

I mean... we can define simplicial sets. I've done that. Working with them is another matter

Mario Carneiro (Jan 04 2020 at 15:06):

To make it epsilon overhead, you need a good set of theorems (this is a finite amount of work, to define what HoTT foundations are) and a theorem prover that is prepared to produce theorems in this embedded logic

Mario Carneiro (Jan 04 2020 at 15:06):

that's a UI problem, but it's completely feasible if you think it is possible to build native-HoTT theorem provers

Ulrik Buchholtz (Jan 04 2020 at 15:07):

Ideally, I'd like my foundations to make it super-easy and convenient to do proofs in embedded DSLs (domain-specific logics!)

Ulrik Buchholtz (Jan 04 2020 at 15:07):

And to be able to externalize via model constructions and interpretations!

Mario Carneiro (Jan 04 2020 at 15:08):

And then, once you are committed to this DSL style approach, you realize that the foundations don't have to be type theory at all, and they can be something super simple like string rewriting :)

Johan Commelin (Jan 04 2020 at 15:08):

Yes, yes, I know where this is going (-; :rolling_on_the_floor_laughing:

Ulrik Buchholtz (Jan 04 2020 at 15:08):

How does MM0 handle interpretations between DSLs?

Mario Carneiro (Jan 04 2020 at 15:10):

There are two parts to any such work: the one-time cost of proving relative consistency of one language to another, and the cost of translating individual target theorems to the other language (which requires changing syntax and stuff and so is O(n))

Mario Carneiro (Jan 04 2020 at 15:11):

You can prove the one time cost as a theorem, and the O(n) work is a suite of theorems that are applied in a syntax directed way by tactics

Mario Carneiro (Jan 04 2020 at 15:12):

This lets you model arbitrarily complicated syntactic theories like HoTT or anything else as long as you have a grasp on the metatheory (which tells you how to prove the translation theorems)

Ulrik Buchholtz (Jan 04 2020 at 15:16):

If I want to know that relative to some base theory (say of primitive recursive or polynomial time strength) that theory S interprets theory T, are you thinking of MM0 as the base theory, or should I fix another base theory to talk about the syntaxes of S and T?

Johan Commelin (Jan 04 2020 at 15:16):

@Mario Carneiro I wonder if the following is a down to earth test case: at some point someone (probably @Sebastien Gouezel) defined continuous linear maps and showed that identity and composition of such maps are again such maps. Tada, we have a category. Can some tactics now automatically PR https://github.com/leanprover-community/mathlib/blob/8e820e28bf4e1e1eb091608b1d5a5c2f4d8650f2/src/topology/algebra/module.lean#L368L446

Johan Commelin (Jan 04 2020 at 15:16):

Because that seems to be related to what you are suggesting

Mario Carneiro (Jan 04 2020 at 15:20):

MM0 is the theorem prover, you have to supply your own theory, but it can be something like PA or ZFC for a reasonable math foundation. This is viewed as a fixed background where you are doing "all of math". So you have theory S and T and you want to interpret one in the other, so you define both in ZFC in MM0 and then the mapping function is explicitly definable in the logic, and you can prove properties about it; and it is also computable (in the sense that you can figure out what it does on any given term), so you can "run" it using tactics.

Mario Carneiro (Jan 04 2020 at 15:22):

@Johan Commelin Doesn't lean already have tactics to very nearly automate all of this? It's really not a leap to think that we could do so if we cared enough

Yury G. Kudryashov (Jan 04 2020 at 15:22):

We have a constructor, not a tactic.

Yury G. Kudryashov (Jan 04 2020 at 15:23):

(I mean, a constructor for concrete categories)

Mario Carneiro (Jan 04 2020 at 15:25):

Lean 3 is limited in the kinds of definitions you can write by automation, but it already generates lots of theorems for inductive types. You could have a #eval make_category continuous_linear_map that just produces all these theorems, adds attributes, notations, etc, if it is sufficiently boilerplate

Johan Commelin (Jan 04 2020 at 15:27):

Well, by now we have 25 different kinds of equivs

Johan Commelin (Jan 04 2020 at 15:27):

I would say it is sufficiently boilerplate

Mario Carneiro (Jan 04 2020 at 15:27):

25?

Johan Commelin (Jan 04 2020 at 15:27):

They all follow the same pattern

Johan Commelin (Jan 04 2020 at 15:27):

Ooh, maybe its 13, maybe 37

Johan Commelin (Jan 04 2020 at 15:28):

equiv, add_equiv, mul_equiv, ring_equiv, linear_equiv, continuous_linear_equiv, continuous_equiv (or homeo, maybe)

Johan Commelin (Jan 04 2020 at 15:29):

And there's probably more that I haven't thought of immediately

Mario Carneiro (Jan 04 2020 at 15:29):

Keep in mind also that there is a tradeoff between when you want to switch to an explicit model of the commonalities (i.e. a theory of categories) that becomes more reasonable the more examples you have

Mario Carneiro (Jan 04 2020 at 15:29):

The reason we have all the duplication is because we actually care about the consequences in all of these disparate cases

Mario Carneiro (Jan 04 2020 at 15:30):

i.e. we want to have a way to say "compose these two linear equivs" which doesn't involve the overhead of category theory

Johan Commelin (Jan 04 2020 at 15:31):

Apparently...

Johan Commelin (Jan 04 2020 at 15:32):

Anyway, I guess we could make an attempt at defining quasi-categories à la Lurie

Johan Commelin (Jan 04 2020 at 15:32):

And then maybe some tactic writers can make them work nicely

Mario Carneiro (Jan 04 2020 at 15:32):

In the largest scale, if we have 50 of these categories, but they are all highly used in their own domains, we want the full product of theorems (composition of linear equivs is an equiv, an invertible group hom is an equiv, etc) but these theorems can be proven using a category theory

Ulrik Buchholtz (Jan 04 2020 at 15:33):

I find it funny that you think of using DSLs as having $\varepsilon$ overhead, but common organizational abstractions such as categories as overhead to be avoided? :wink:

Ulrik Buchholtz (Jan 04 2020 at 15:33):

Category theory is itself a kind of DSL

Mario Carneiro (Jan 04 2020 at 15:34):

It's a tradeoff. There is overhead to use the abstraction, vs overhead to state the specific variant (and prove using the abstraction)

Ulrik Buchholtz (Jan 04 2020 at 15:35):

I agree, but I want to know the value of $\varepsilon$ , that's all :big_smile:

Mario Carneiro (Jan 04 2020 at 15:36):

You are absolutely correct that category theory is a DSL. The epsilon overhead only happens after you put in the (possibly large) O(1) overhead of proving all the base theorems for the target category

Ulrik Buchholtz (Jan 04 2020 at 15:37):

But these are typically theorems you want anyway.

Mario Carneiro (Jan 04 2020 at 15:37):

right

Mario Carneiro (Jan 04 2020 at 15:37):

In Johan's example, you want that full product of theorems in any case

Mario Carneiro (Jan 04 2020 at 15:38):

so there isn't much point optimizing anything except a boilerplate generating tactic

Mario Carneiro (Jan 04 2020 at 15:40):

but my impression has been that the benefits of category theory in this area are pretty shallow. Proving that the composition of equivs is an equiv assuming that an equiv is a hom with an inverse is not a very large term

Mario Carneiro (Jan 04 2020 at 15:41):

And I don't see much demand for more complicated categorical facts (are there any interesting theorems about general categories? Yoneda seems like the pinnacle but it's still not that complicated)

Mario Carneiro (Jan 04 2020 at 15:42):

all the real theorems require tons of additional assumptions beyond just being a category

Ulrik Buchholtz (Jan 04 2020 at 15:45):

I don't know what differentiates a “real” theorem from other theorems. You're right that category theory proofs are short (many would consider this a virtue, but from a proof size point-of-view it means they can be inlined). I think for abelian categories and toposes, the “inlinings” would start to get longer.

Johan Commelin (Jan 04 2020 at 15:46):

@Mario Carneiro You goal is golfing pp.all output. My goal is golfing the number of keystrokes that I type.

Ulrik Buchholtz (Jan 04 2020 at 15:48):

Right, there's a difference between the original tactic entered by the user, and the storage of a proof, and the various other ways we'd want to re-display and re-check a proof after it's been completed. I think all systems could improve in these matters.

Mario Carneiro (Jan 04 2020 at 15:49):

As for what the value of epsilon is, in the best case you are looking at a single extra universal quantifier. So that means each generalized theorem application gets an extra argument pointing to the fact "X is a category and C is the objects, H is the homs" or however that needs to be stored. In lean's case this would usually be a typeclass arg so the visible overhead goes to zero, while the pp.all overhead is something like 5% or less (since most steps will not be these generalized theorems they make up a small proportion of the proof anyway)

Mario Carneiro (Jan 04 2020 at 15:50):

Even if we're measuring total keystrokes, there is still a tradeoff between how much work is needed to define the tactic that saves the work

Kevin Buzzard (Jan 04 2020 at 15:50):

I think I want to know the snake lemma in an arbitrary abelian category. That proof is much much longer than Yoneda and is used in the real world

Mario Carneiro (Jan 04 2020 at 15:51):

if you are thinking "well that's not my problem" then that's just an externalized cost model

Mario Carneiro (Jan 04 2020 at 15:51):

has the snake lemma been done formally before?

Mario Carneiro (Jan 04 2020 at 15:52):

I'm not actually sure that it's true that the proof is long, formally

Ulrik Buchholtz (Jan 04 2020 at 15:52):

The fact is that mathematicians like abstractions, both shallow and deep, and like to use them, so we should make it convenient to do so. That means making it easy to write the necessary tactics, too

Mario Carneiro (Jan 04 2020 at 15:53):

Right, there is always the argument that category theory et al is innately interesting, rather than my operational interest metric

Johan Commelin (Jan 04 2020 at 15:54):

has the snake lemma been done formally before?

It probably has been done somewhere. (Note that formally we'll need the snake definition before stating the snake lemma. I.e., the connecting homomorphism is a definition that is usually sn(e)aked into a lemma.)

Johan Commelin (Jan 04 2020 at 15:57):

Fun fact: the Snake Lemma is one of the few lemmas that has "proof by imdb"

Johan Commelin (Jan 04 2020 at 15:57):

lemma snake : bla bla bla :=
by imdb

Kevin Buzzard (Jan 04 2020 at 16:05):

I'm not sure all the details were given there though. I think this is a bad reference and the only reason it's so common is because it's funny.

Kevin Buzzard (Jan 04 2020 at 16:06):

I think there's a proof in kashiwara-schapiro? Of course anyone who cares could work out a proof by themselves...

Daniel Gratzer (Jan 04 2020 at 16:08):

Aluffi has some quote about it being 'bad manners to prove the snake lemma in public' I think...

Johan Commelin (Jan 04 2020 at 16:08):

The opposite of an abelian category is an abelian category. That allows you to cut the proof in half (plus epsilon, of course)

Johan Commelin (Jan 04 2020 at 16:10):

I think that may count as an application of category theory (@Mario Carneiro), since the category of modules over a ring does not have an opposite category equivalent to modules over a ring.

Mario Carneiro (Jan 04 2020 at 16:13):

It's hard to guess what the relative difficulty would be. You certainly give up some things by going to categories, e.g. you can't reason about points and defining functions pointwise anymore

Johan Commelin (Jan 04 2020 at 16:16):

Ooh, and you need to define abelian categories (-;

Johan Commelin (Jan 04 2020 at 16:16):

And there are multiple definitions, so you have to make a choice

Mario Carneiro (Jan 04 2020 at 16:18):

I'm just going to assume that an abelian category is a category where every diagram commutes

Kenny Lau (Jan 04 2020 at 16:47):

The opposite of an abelian category is an abelian category.

just category theory things

Jesse Michael Han (Jan 04 2020 at 16:51):

this might be a good excuse to do the mitchell embedding theorem (the diagram chasing proof of the snake lemma is less annoying than the purely diagrammatic proof)

Ramkumar Ramachandra (Jan 04 2020 at 17:51):

A summary of the discussion: https://artagnon.com/articles/leancoq

Johan Commelin (Jan 04 2020 at 17:57):

@Ramkumar Ramachandra Concerning the speed comparison at the end. You missed Mario's point that Lean uses a lot more automation (notably type class inference).

Johan Commelin (Jan 04 2020 at 17:57):

Besides that: thanks for this write-up!

Bryan Gin-ge Chen (Jan 04 2020 at 17:59):

Lean has much lower startup-cost for pure mathematicians, since its built-in features and math library are great for doing undergraduate-level group theory & topology, masters-level commutative algebra & category theory, but it plateaus quickly thereafter.

I might have missed something but is there really a sense that working with mathlib "plateaus quickly" in some areas?

Johan Commelin (Jan 04 2020 at 18:02):

Well, we still don't have sheaves and/or abelian categories. Even though we've been discussing them for >18 months

Kevin Buzzard (Jan 04 2020 at 18:23):

That's just lack of person-power though. It's hard to do things right.

Sebastian Ullrich (Jan 04 2020 at 18:26):

@Ramkumar Ramachandra That was fast! Minor nitpick: Lean 4 is written mostly in Lean, not C (at least as soon as we delete the old C++ frontend)

Ramkumar Ramachandra (Jan 04 2020 at 18:31):

@Sebastian Ullrich Thanks; fixed.

Ramkumar Ramachandra (Jan 04 2020 at 18:32):

@Johan Commelin Thanks, but I'm not sure I understand how to phrase it.

Ramkumar Ramachandra (Jan 04 2020 at 18:33):

I did say that it uses a VM; doesn't that cover "automation"?

Johan Commelin (Jan 04 2020 at 18:35):

Not really. In Lean, you can do the following: say X and Y are types. We can assume that they are endowed with a topology, by writing [topological_space X] and [topological_space Y]. If you later on write X × Y, and treat it as topological space, then Lean will figure out on its own that there is some piece of maths in the library that explains that it should use the product topology.

Johan Commelin (Jan 04 2020 at 18:36):

In other systems, you explicitly flag "Hey, I'm using the product of X and Y, and remember, it has the product topology"

Johan Commelin (Jan 04 2020 at 18:37):

So, summary: Lean is constantly figuring out a boatload of trivialities under the hood. The result is that statements in Lean can get pretty close to what it looks like in ordinary maths. (But this still varies a lot per subject. The current manifold code is not really recognisable to outsiders.)

Emily Riehl (Jan 04 2020 at 19:05):

I guess they are still drawing on a large unspecified amount of background material

Hi @Mario Carneiro. I'd contend our work isn't Mochizuki-like because the background material we rely upon is known (to a certain community of mathematicians at least). Someone who is familiar with what it's like to work with a simplicial model category could start right away in Chapter 1 (and not read any of the appendices) and understand precisely what we mean right away. On that same point, the material in Appendices A-D is all in the literature somewhere; by contrast Appendices E-F contain technical results about ∞-cosmoi and so are things we proved to support this work. So someone who knows a fair amount of categorical homotopy theory could read the book without ever looking at those appendices and would have a pretty easy go of it, at least for Parts I, III, and IV; II and V are somewhat more technical.

About to board a flight so I'll leave it here for now. Thanks for your interest!

Mario Carneiro (Jan 04 2020 at 19:13):

I wasn't intending to throw shade on the work at all; I'm obviously not the target audience but for formalization being able to find the definitions of everything without a stack of background is important for getting started. In any case, with the appendices I managed to find all the definitions that are needed for that first definition, so I think it passes the test :)

Ramkumar Ramachandra (Jan 05 2020 at 03:59):

@Johan Commelin Thanks; I've updated the post to use your example.

Reid Barton (Jan 07 2020 at 14:24):

Regarding quasicategories, I posted a definition on the maths stream a while ago. On mobile and don't know how to link to it, but searching for QCat should find it.

Reid Barton (Jan 07 2020 at 14:25):

By itself it's not good for much, of course.

Johan Commelin (Jan 07 2020 at 14:31):

Voila: https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/quasicategories/near/165386120

Last updated: May 02 2025 at 03:31 UTC