Zulip Chat Archive

Stream: Type theory

Topic: Comparison with Coq


Vincent Siles (Sep 22 2021 at 10:02):

Hi ! Some coworkers are considering using Coq vs Lean for some projects, and they are a bit lost as for the differences between the two tools, especially w.r.t. to the tool guarantees/correctness (lost of strong normalization, subject reduction ...)
is there some existing material that explain the main differences and maybe motivation Lean choices ? or the consequences of these choices ?

Jannis Limperg (Sep 22 2021 at 10:30):

The type theoretic differences are likely irrelevant to whatever you're doing. Exception: if you want to use axiomatised homotopy type theory, you must use Coq, and if you want to be purely constructive, Coq will be easier. The strong normalisation issue means that Lean's type checker can, in theory, loop, but the system is still consistent, so it won't accept anything wrong. (In practice, I believe no inconsistency has ever been reported for Lean 3 while Coq has had a few. But the Coq issues were also very niche and got patched quickly.)

More important differences (as of right now, not considering ongoing work on both sides):

  • Lean 3 has much weaker automation and much less libraries. The editor tooling is nicer than Coq's though.
  • Lean 3 has mathlib for mathematics, Coq has primarily mathcomp. The two have somewhat different foci and use very different methodologies.
  • Lean 3 is more convenient for metaprogramming (i.e. developing your own tactics), but can be slow for heavy-duty stuff.
  • Lean 3 has much weaker support for mutual and nested inductive types.
  • Lean 3 has nicer dependent pattern matching (but Coq's Equations plugin provides similar functionality).
  • Lean 3 has much worse educational resources.

Vincent Siles (Sep 22 2021 at 10:39):

when you say the system is still consistent, do you know how that's proved if it is not SN ?

Jannis Limperg (Sep 22 2021 at 10:43):

@Mario Carneiro did the proof in his MSc thesis. He'll be able to tell you more.

Vincent Siles (Sep 22 2021 at 10:43):

Thanks a lot :)

Chris B (Sep 22 2021 at 10:44):

If you want the 1-line version: https://leanprover.zulipchat.com/#narrow/stream/113489-new-members/topic/not.20p.20versus.20p.20implies.20false/near/209606915

Karl Palmskog (Nov 01 2021 at 11:50):

since the question of how Coq and Lean are different seemingly comes up all the time in both Coq and Lean forums, I'm curious if anyone from the Lean side (with significant Lean experience) is interested in an adversarial collaboration with me to document the differences, pros, cons, etc., of Coq and Lean in a joint GitHub repo.

For example, one could set a "review approval before merge" rule in the repo, and possibly aim to publish some joint document in the end or just make the repo public as a "Coq vs. Lean FAQ" and maintain it. This would not only be about documenting foundational differences with literature references , although that's an important cornerstone - it would also be about ecosystem, tooling, ...

Two points of reference are the proof engineering survey I co-wrote and the Awesome Coq list of resources I maintain.

Karl Palmskog (Nov 01 2021 at 11:56):

to give an alternative view on one issue above - Coq is not "primarily" about mathcomp. The core MathComp projects are something like a few hundreds of thousands lines of code. A simple check on some popular Coq projects yields that the following non-mathcomp projects that formalize mathematics amount to about one million lines of code:
category-theory coqprime coquelicot corn gaia GeoCoq goedel graph-theory HoTT hydra-battles math-classes UniMath

Kevin Buzzard (Nov 01 2021 at 12:02):

I don't know enough about type theory to help here, but I always enjoyed this page which for me really helped; it is a good reference for some of the main differences, and the fact that I simply don't know the meanings of phrases like strong normalisation, subject reduction and canonicity mean that I'm not too bothered that these are apparently broken in Lean. In fact I am wondering whether that website basically tells the whole story in terms of type-theoretic differences.

One thing that I think is worth stressing is that Lean's approach of a huge monolithic mathematics repo has hugely aided the development of what I sometimes rather rudely call "mainstream mathematics" in Lean, because such mathematics (algebra, number theory, geometry, topology, analysis) is quite interconnected and so beyond some point it's really important to have it all in one place. Mathlib developers seem to have no fear of the gigantic refactor, where a fundamental definition like group is changed and then everything breaks and needs to be fixed. I would imagine that the more standard approach, with smaller repos devoted to more specific tasks, would in practice be a nightmare when it came to things like this.

Of course another huge difference between Coq and Lean is that Coq has a huge community of people working on projects completely unrelated to mathematics, whereas Lean is still quite a new prover and although it might be a sociological coincidence that much project development work is mathlib-focussed (or more specialised mathematics repos which have mathlib as a dependency), it is still undoubtedly true.

Karl Palmskog (Nov 01 2021 at 12:03):

monorepo vs. non-monorepo is arguably a big issue in software development in general, but obviously that should be a prominent part of a proper Lean-vs.-Coq FAQ.

Kevin Buzzard (Nov 01 2021 at 12:06):

My impression is that what we're doing with mathlib looks completely horrific to the average software dev, but on the other hand as a mathematician the thought of many smaller repos fills me with just as much horror! I think that superficially one could say that Coq and Lean use basically the same type theory, however the kind of things they're being used to do are so different that this sometimes makes it difficult for the two communities to even understand each other.

Karl Palmskog (Nov 01 2021 at 12:06):

from my interactions with MathComp developers, I think they "eventually" want to integrate most general results into a monorepo (namely, https://github.com/math-comp/math-comp/), but it can take anything from a few months to a few years to many years for code to end up there, and until then, a contribution lives in a standalone repo

Kevin Buzzard (Nov 01 2021 at 12:08):

Am I right in thinking that MathComp doesn't use classical logic though? Projects like Coquelicot do I guess (that's the one which does analysis, right?)

Karl Palmskog (Nov 01 2021 at 12:08):

MathComp the library proper doesn't use classical logic, but some sub-projects do, like analysis: https://github.com/math-comp/analysis

Kevin Buzzard (Nov 01 2021 at 12:09):

Basically I'm saying that I'd be happy to be involved but I have no idea what "subject reduction" means, so would have nothing to say on any discussion of the form "is subject reduction important?" because my answer is "not to me, for sure"

Karl Palmskog (Nov 01 2021 at 12:11):

I think we need someone who can "verify" type theory details for Lean to make this work out (I'm not a proper type theorist, more of a proof engineer and programming languages computer scientist), but I think your input would also be welcome Kevin

Karl Palmskog (Nov 01 2021 at 12:16):

the whole "subject reduction is broken" thing is something where I would like teach the controversy more than anything else

Karl Palmskog (Nov 01 2021 at 12:28):

Don't know if Lean users saw this graph of GitHub and Stack Overflow popularity: https://redmonk.com/sogrady/files/2021/08/lang.rank_.0621-1024x807.png

Lean is in lower left quadrant, Coq is in the quadrant just above. Taken from this post.

In previous graphs, I don't think Lean was included at all, so clearly there is a lot of growth.

Alex J. Best (Nov 01 2021 at 13:32):

Wow! I'm amazed Lean is on there at all, certainly number of stackoverflow posts and number of distinct github projects can be measured in the 100s (maybe thousands now?), I'd expect some esoteric languages would have more of both than Lean does.

Chris B (Nov 01 2021 at 14:59):

There was a good amount of back and forth in this github issue, and there are links at the end pointing to some additional discussions: https://github.com/coq/coq/issues/10871

Karl Palmskog (Nov 01 2021 at 15:03):

indeed, but the most thorough comparison was arguably done by Gaëtan Gilbert here: https://coq.discourse.group/t/alpha-announcement-coq-is-a-lean-typechecker/581

Reid Barton (Nov 01 2021 at 16:04):

My impression is that there is more diversity of goals and values in the Coq user base compared to Lean. For example there is the formalization of the Hahn-Banach in Coq which is perfectly typical of the sort of math in mathlib, but Coq is also being used in serious ways by people interested in constructive logic, computation, program verification, homotopy type theory, the metatheory of type theory, and so on. Probably this diversity accounts for (or is related to, in a feedback loop) some of the other differences in the ecosystems, such as having a monorepo or not. Lean is trying to solve an easier problem in some sense, or at least a more narrowly-focused one.

One lesson for me from mathlib was that, as hard as it is to design a cohesive library for "all" of mathematics, it's far harder to design one that supports both classical and constructive reasoning well at the same time. Lean itself would support this just fine, but the problem is that classical and constructive mathematics are just not that similar--a classical notion (e.g., finiteness) might split into several notions constructively, or the classical description of an object (e.g. the spectrum of a ring) might be the "wrong" one constructively. Accepting these differences is fine (and necessary anyways) if you want to do constructive mathematics, but it's too high a price to impose on the user of a library for classical mathematics.

Karl Palmskog (Nov 01 2021 at 16:04):

oh and to comment on the possible horror of "lots of little repos" approach to formalization, we do have major consolidation projects ongoing in the Coq ecosystem, where code flows upwards from leaf repositories to popular and well maintained projects, and small repos get "archived" on GitHub, one recent example: https://github.com/coq-community/coqeal/pull/54

Karl Palmskog (Nov 01 2021 at 16:06):

ah, I think this formalization of Hahn-Banach is the one mentioned by Reid: https://github.com/math-comp/analysis/blob/hb/hahn_banach.v

Reid Barton (Nov 01 2021 at 16:06):

Regarding the github discussion linked above, people often enter these discussions without understanding how different other parties' goals (or even use of language!) might be. For instance it's equally unhelpful for a Lean user to say "quotient types are a non-issue and work fine both for reasoning and computation" or for a Coq user to say "quotient types break Very Important Properties of the type system, trust us"--even if both statements are "true" in their own communities.

Reid Barton (Nov 01 2021 at 16:08):

Recently someone who works on another proof assistant told me that Lean was not a programming language--that was an interesting conversation.

Reid Barton (Nov 01 2021 at 16:10):

(I saw a talk which was not this exact one, but similar: https://lipn.univ-paris13.fr/~kerjean/slides/slidesTYPES19.pdf)

Karl Palmskog (Nov 01 2021 at 16:10):

we settled on something like this to describe Coq to computer programmers:

Formal language and environment for programming and specification

Karl Palmskog (Nov 01 2021 at 16:13):

we seem to have about 3-4 approaches to handle quotient types in the Coq community, yet they are not even needed by most computer scientists. The closest to Lean's quotients is probably private inductives, which was used by Gaëtan in his translation from Lean 3 to Coq. But private inductives are also considered a hack by some type theorists, even if they to my knowledge are used heavily in the HoTT Coq project.

Kevin Buzzard (Nov 01 2021 at 16:15):

Yes I remember someone else saying that quotients are not used by many computer scientists. They are absolutely everywhere in the kind of pure mathematics we do in mathlib; a lot of "universal constructions" are made via quotient types.

Reid Barton (Nov 01 2021 at 16:19):

Right so ironically they are even more important in constructive mathematics, since you need quotients (AFAIK at least) to define the polynomial ring on a set of variables that doesn't have decidable equality, for instance

Reid Barton (Nov 01 2021 at 16:20):

or if the coefficient ring doesn't have decidable equality

Jeremy Avigad (Nov 01 2021 at 16:24):

One funny difference is that mathematicians say that a function "descends" to a quotient, whereas computer scientists say that it "lifts." I think this paper was the model for Leo's implementation in Lean: http://www.cs.nott.ac.uk/~psztxa/publ/defquotients.pdf. (See definition 2.)

Johan Commelin (Nov 01 2021 at 16:26):

Good point. That difference might be more significant than the silly linguistic difference suggests. It points to a different mental model. And mental models are crucial in communication.

Karl Palmskog (Nov 01 2021 at 16:52):

somewhat amusing that the top answer here is by Thorsten Altenkirch, who says that he "personally prefer[s] Agda [over both Coq and Lean]": https://www.quora.com/Is-LEAN-better-than-Coq

Chris B (Nov 01 2021 at 17:07):

Reid Barton said:

Recently someone who works on another proof assistant told me that Lean was not a programming language--that was an interesting conversation.

What was the thrust of their argument?

Reid Barton (Nov 01 2021 at 17:09):

I think what they meant was that it doesn't have properties like strong normalization or canonicity.

Reid Barton (Nov 01 2021 at 17:09):

I didn't ask them whether, say, Python was a programming language.

Chris B (Nov 01 2021 at 17:14):

Ah, okay. Thanks.

Patrick Stevens (Nov 02 2021 at 21:46):

Kevin Buzzard said:

My impression is that what we're doing with mathlib looks completely horrific to the average software dev, but on the other hand as a mathematician the thought of many smaller repos fills me with just as much horror!

I don't think that's quite true (I'm a software dev who happens to have done the MMath) - I think most non-microservice-dogmatic devs agree that it's easier to work in a monorepo until it scales so large that all the tooling breaks. The place where people tend to start actually having disagreements is where you hit the real world with consumers you can't change when you update your code, and when you start releasing different bits of your code at different times. Mathlib doesn't have the tooling problems yet (although I hear the typeclass system may be creaking a bit), and external consumers have so far been happy to pick one version of mathlib and forever stick to it, or else have been happy to pay the cost of regularly upgrading their dependency on the monolith. It's one to keep an eye on as the situation evolves, but it's not inherently horrifying.

Mac (Nov 02 2021 at 22:29):

Patrick Stevens said:

I think most non-microservice-dogmatic devs agree that it's easier to work in a monorepo until it scales so large that all the tooling breaks. The place where people tend to start actually having disagreements is where you hit the real world with consumers you can't change when you update your code, and when you start releasing different bits of your code at different times.

I think your thrust is right -- most devs would probably agree that working on a monorepo is generally much easier and the problem arises when people start having disagreements -- but I think your estimation of where that occurs is wrong.

When consuming, most developers like small repos because they can mix and match according to their preference. When developing, a monorepo is generally preferable because it is quicker and easier to maintain. What causes a monorepo problems is when one party wants things done one way and another party wants it done another way (and both have sufficient manpower to promote their side). This tends to be inevitable once there is a sufficiently large group of maintainers and/or consumers. Said disagreements on how things should be done lead to forks, splits, new libraries, etc. This also tends to lead to smaller projects as the fork / reinvention is usually more focused on the particularly thing they want to do differently rather than the whole of the original project.

After encountering said trend one too many times, developers then choose to just skip the whole monorepo stage and just build small projects from the get go (where possible). Only when the project in question necessitates a large breadth or heavy standardization (e.g., compiler toolchains, OSes, browsers, etc.) do large projects tend to stick around.

I think mathlib works as a monorepo largely because classical mathematics (as Kevin puts it sometimes, the "mathematics that win Fields Medals") tends to have a relatively agreed upon standard that mathlib can embody. Computer science adjacent mathematics, on the other hand, tends to be much less standardized and much more open to alternative interpretations (like much of CS itself) and thus it is much harder for communities with that kind of focus (which Coq is much more trendy in) to create a single product. Which approach one views as better is usually the result of which camp the user identifies with more (or what they wish to achieve).

Kyle Miller (Nov 02 2021 at 23:17):

I think mathlib is also an exciting experiment in "applied ontology": can all of modern mathematics really be comfortably put into a single framework? with some organizational scheme that reflects how mathematicians tend to think about the whole enterprise?

Mathematicians seem to believe that math is all part of a single cohesive whole (even if different disciplines might lack mutual intelligibility), and an ambitious project like mathlib is a good way to put that belief to the test.(*) Somehow, it's so far been largely successful, with failings seeming to be more like technical challenges that can still be overcome. (With the exception of including math based on non-classical foundations -- so far it's hard to mix these in the same framework. Classical ideas can generalize constructively in many ways, for example, as Reid mentioned.)

Maybe if mathematicians had more experience with needing to write code that interfaces with the messy Real World they'd be less keen on the idea of a universal framework, and instead they'd opt for smaller projects, which as @Mac was suggesting one composes into an appropriate language/ontology for a specific problem. However, I'm looking forward to see how mathlib progresses on its current path: fragmentation would put less pressure on testing whether all of (a certain kind of) math can be unified!

(*) Mathematicians are very aware of the problem of differing terminology and differing notations across papers and textbooks, but I've never heard anyone say they didn't think that with sufficient resources it couldn't all be made consistent.

Junyan Xu (Nov 02 2021 at 23:17):

Found a relevant wikipedia article: https://en.wikipedia.org/wiki/Monorepo
"Google's monorepo, speculated to be the largest in the world, meets the classification of an ultra-large-scale system and must handle tens of thousands of contributions every day in a repository over 80 terabytes large."
Now mathlib's future looks bright to me ...


Last updated: Dec 20 2023 at 11:08 UTC