Zulip Chat Archive

Stream: maths

Topic: challenges in the translation from a set proof to DTT?

Emily Riehl (Nov 12 2024 at 15:18):

@Floris van Doorn gave a very interesting recent talk in the HoTTEST seminar. In the Q&A afterwards, I raised a question that I'd also like to ask here:

Emily Riehl (Nov 12 2024 at 15:19):

What differences/difficulties have folks encountered in the translation between paper proofs (implicitly in set theory) to Lean proofs (in extensional dependent type theory)?

Emily Riehl (Nov 12 2024 at 15:22):

Floris mentioned that this hasn't proven to be a large obstacle in the formalization of Carleson's theorem, but it seems to be a much bigger deal in the projects I've been involved with. What I haven't figured out is:

(a) Whether this is some inherant difficulty involving category theory in dependent type theory, or
(b) Whether I'm just bad at Lean.

If (a), then are other subfields similarly affected by the foundation system?

If (b), then how do I get better at formalizing category theory in Lean (and help others do the same)?

Emily Riehl (Nov 12 2024 at 15:26):

I'll give a little more detail on what might be special about category theory.

On paper, category theory has a notion of "evil," the primary example being categorical statements that involve equality between objects. The reason such statements are "evil" is that they often fail to be invariant under equivalence between categories.

In Lean, it seems relatively easy to prove those "evil" statements that are true. But what can be more difficult is proving that two morphisms are equal because morphisms live in dependent types (over their domain and codomain objects) and if the objects are provably equal rather than definitionally equal, the morphisms belong to different types.

Emily Riehl (Nov 12 2024 at 15:27):

Another difficulty I've encountered relates to something that is also difficult conceptually. (Aside: maybe this is a third mode of proof construction: "on paper", "in a proof assistant", "in your head"?) Namely, the challenge of switching back and forth between category levels.

Sometimes in category theory you are working inside the category of categories. You build some category of mathematical objects of some type and then build another category of mathematical objects of another type. Then you build a functor from the first category to the second. In Lean, we have a special notation ⥤ for functors, which makes sense from this perspective.

But at other times you "go up a category level" and just think of Cat as an example of a category (at a larger universe level). Now functors are just morphisms in this particular category and it makes sense to use the stand morphism notation ⟶ in Lean.

A big annoyance that @Mario Carneiro and I encountered over the summer was translating back and forth between these perspectives. We were trying to build an adjunction involving Cat (treating Cat from the higher category level) but had to manually tell Lean that some result about functors had implications for the morphisms inside Cat.

Yaël Dillies (Nov 12 2024 at 15:28):

Emily Riehl said:

What differences/difficulties have folks encountered in the translation between paper proofs (implicitly in set theory) to Lean proofs (in extensional dependent type theory)?

Do you specifically mean paper proofs in category theory?

Emily Riehl (Nov 12 2024 at 15:28):

No, I specifically don't exclusively mean this. My experience is with category theory but I'm curious what comes up in other areas.

Riccardo Brasca (Nov 12 2024 at 15:32):

I think that a good example, that you probably already know, is what happened with chain complexes (the fact that we need morphisms from C_i to C_j for all i and j). I don't think this is related to category theory, it's the fact that in writing f ∘ g Lean needs to understand that the codomain of g is the domain of f automatically (i.e. they must be definitionally equal)

Kevin Buzzard (Nov 12 2024 at 15:33):

I think this is an interesting question. My experience in formalising mathematics (99.9% of which was nothing to do with category theory) was that I had to do a certain amount of reorganization in my head because I was working on the assumption that subgroups were groups and that $\N\subseteq\Z$ . In Lean groups are types and subgroups are terms, and invisible maps between numeric types are explicitly present. In short I think that this realignment which I had to go through is probably present across mathematics and is really just an artifact of formalization being hard rather than it being anything to do with category theory per se, but perhaps it's even worse in category theory than in number theory? Not sure. It's definitely present though, and I think is all part of the (rather steep!) learning curve that all mathematicians have to go through when formalizing.

I also suspect that this might not just be to do with dependent type theory, I wonder that even if I had learnt a set theory system instead I would have still had to do a bunch of re-learning stuff (because e.g. $\N\subseteq\Z$ is probably not true in set theory either!)

Emily Riehl (Nov 12 2024 at 15:34):

@Riccardo Brasca I don't know about the chain complex example. Can you say more?

Kevin Buzzard (Nov 12 2024 at 15:35):

Oh chain complexes was a great example! I'll let Riccardo explain :-)

Riccardo Brasca (Nov 12 2024 at 15:35):

Give me 1 second to find some code I've written

Kevin Buzzard (Nov 12 2024 at 15:37):

This was a great example of "dependent type hell", so probably specific to DTT (and utterly invisible on paper).

Kevin Buzzard (Nov 12 2024 at 15:37):

(sorry, on a train)

Kevin Buzzard (Nov 12 2024 at 15:38):

(deleted)

Riccardo Brasca (Nov 12 2024 at 15:39):

Here is the problem. Let's define cochain complex of abelian groups. A natural definition is

structure Complex where
  X : ℤ → AddCommGrp
  d : ∀ i, X i ⟶ X (i+1)
  d_comp : ∀ i, d i ≫ d (i+1) = 0

so we have the function that gives the groups, the function that gives the differentials and the assumption on the composition. So far so good. Something like

example (C : Complex) (i : ℤ) : C.d i ≫ C.d (i+1) = 0 := by
  exact C.d_comp i

works well.

Riccardo Brasca (Nov 12 2024 at 15:43):

Now, maybe at some point you need to prove that C.d (i+1) ≫ C.d (i+2) = 0, where i : ℤ is given. You may think that exact C.d_comp (i+1) would work, but if you try to state

example (C : Complex) (i : ℤ) : C.d (i+1) ≫ C.d (i+2) = 0 := by
  sorry

you see that there is an error in the statement. The error is

application type mismatch
  C.d (i + 1) ≫ C.d (i + 2)
argument
  C.d (i + 2)
has type
  C.X (i + 2) ⟶ C.X (i + 2 + 1) : Type ?u.1048
but is expected to have type
  C.X (i + 1 + 1) ⟶ ?m.957 : Type ?u.1048

Riccardo Brasca (Nov 12 2024 at 15:45):

The problem is that for Lean, C.d (i + 2) goes from C.X (i + 2) to C.X (i + 2 + 1), not from C.X (i + 1 + 1) to something else. One can prove that C.X (i + 2) and C.X (i + 1 + 1) are propositionally equal, but Lean doesn't care about propositional equality here.

Riccardo Brasca (Nov 12 2024 at 15:47):

This caused a lot of headache, since to fix it you need to use the marvelous CategoryTheory.eqToHom or maybe its friend CategoryTheory.eqToIso, something it's clearly impossible to do in a complicated proof.

Riccardo Brasca (Nov 12 2024 at 15:50):

One partial solution is the following. Let's define a complex as (I just wrote this, so it's probably not very good, but the idea is there)

structure Complexbetter where
  X : ℤ → AddCommGrp
  d : ∀ i j, X i ⟶ X j
  d_0 : ∀ i j, j ≠ i+1 → d i j = 0
  d_comp : ∀ i, d i (i+1) ≫ d (i+1) (i+2) = 0

Riccardo Brasca (Nov 12 2024 at 15:54):

We now have a differential d i j : X i ⟶ X j for all i and j, and the assumption that the relevant composition is 0. We also assume that d i j = 0 unless j = i+1. Mathematically this is the same as before, but now we can write

example (C : Complexbetter) (i : ℤ) : C.d (i+1) (i+2) ≫ C.d (i+2) (i+3) = 0 := by
  sorry

Eric Wieser (Nov 12 2024 at 15:56):

The paper I wrote with @Jujian Zhang on Graded rings in Lean walks through a similar issue, instead around dealing with a multiplication mul : A i → A j → A (i + j) and its associativity.

Matthew Ballard (Nov 12 2024 at 15:58):

Just to add:

$ rg 'omega' Mathlib/Algebra/Homology | wc -l
$ 271

Riccardo Brasca (Nov 12 2024 at 16:00):

What we actually do in mathlib is

structure Complexevenbetter where
  X : ℤ → AddCommGrp
  d : ∀ i j, X i ⟶ X j
  d_0 : ∀ i j, j ≠ i+1 → d i j = 0
  d_comp : ∀ i j k, j=i+1 → k=j+1 → d i j ≫ d j k = 0

Riccardo Brasca (Nov 12 2024 at 16:01):

the difference between Complexbetter and Complexevenbetter is that in the last field we say that d i j ≫ d j k = 0 if j=i+1 and k=j+1. Again mathematically this the same as before, but now every time we know that these conditions are satisfied (and those are Prop, so we don't care how we proved them) we can use d_comp.

Riccardo Brasca (Nov 12 2024 at 16:03):

example (C : Complexevenbetter) (i : ℤ) : C.d (i+1) (i+2) ≫ C.d (i+2) (i+3) = 0 :=
  C.d_comp (i+1) (i+2) (i+3) (by omega) (by omega)

is a typical example.

Riccardo Brasca (Nov 12 2024 at 16:05):

(if you look at CochainComplex you see that this is an abbreviation for HomologicalComplex, whose definition is even more complicated, but this is only to allow complexes indexed by various things, like ℕ, ℤ or Fin n, and allowing triangles etc)

Riccardo Brasca (Nov 12 2024 at 16:08):

mathlib's solution is great, and everything was done, IIRC, by @Kim Morrison

Emily Riehl (Nov 12 2024 at 16:10):

Very interesting. So, to summarize, the initial obstacles were circumvented by reconfiguring the definition. The DTT notion of chain complex is logically equivalent to the set theoretic notion but structurally more complicated (essentially has more explicitly structure). Have there been propagating difficulties in proofs involving chain complexes related to this?

Riccardo Brasca (Nov 12 2024 at 16:12):

I believe this has eliminated all the DTT issues, though it requires writing more variables (often just an underscore will do)

Riccardo Brasca (Nov 12 2024 at 16:13):

I remember a talk by @Johan Commelin about this, but I don't know how to find it

Riccardo Brasca (Nov 12 2024 at 16:23):

Another (much easier) difference wrt pen and paper proof is all the troubles related to natural number subtraction/division.

Emily Riehl (Nov 12 2024 at 16:33):

Riccardo Brasca said:

Another (much easier) difference wrt pen and paper proof is all the troubles related to natural number subtraction/division.

Could you (or someone) elaborate on this as well? I've been warned that subtraction is bad but haven't yet encountered such issues myself.

Daniel Weber (Nov 12 2024 at 16:48):

In paper proofs when we subtract/divide natural numbers we usually either implicitly move to $\mathbb{Z}$ or $\mathbb{Q}$ , or we know that the result will be a natural number. In general we like homogeneous functions - we don't want someone writing 2 - 1 and getting 1 : ℤ, and in DTT functions which take proofs are usually harder to work with.
To resolve this we make subtraction and division truncate, and take appropriate inequality/divisibility hypotheses to theorems when required. This can be confusing at first, as 1 - 2 = 0 and 3 / 2 = 1

Daniel Weber (Nov 12 2024 at 16:51):

Also see https://xenaproject.wordpress.com/2020/07/05/division-by-zero-in-type-theory-a-faq/

Patrick Massot (Nov 12 2024 at 17:01):

I think the subtraction on natural number is not a good example because it actually also doesn’t work in set theory. The set of natural number is not a subset of integers, unless you have a weird construction. The inclusion map already exist, and there is no good subtraction function on Nat × Nat.

Patrick Massot (Nov 12 2024 at 17:01):

This is completely different from the homological algebra thing which is specific to DTT.

Kevin Buzzard (Nov 12 2024 at 17:04):

Emily Riehl said:

Could you (or someone) elaborate on this as well? I've been warned that subtraction is bad but haven't yet encountered such issues myself.

"Strange" operations like natural subtraction (which satisfies 0 - 1 = 0 and natural division (which satisfies 5 / 2 = 2) are things which some computer scientists might expect, but very few mathematicians do. This has the following quite unfortunate consequence. A mathematician hears about Lean, thinks "this is cool, I might try it", thinks of the easiest possible thing they can prove by induction, which is often $\sum_{i=1}^ni=\frac{n(n+1)}{2}$ , tries to formalize the statement, discovers that computer scientists seem to prefer $\{0,1,2,\ldots,n-1\}$ to $\{1,2,\ldots,n\}$ , rewrites as $\sum_{i=0}^{n-1}i=\frac{n(n-1)}{2}$ , and off they go on their first Lean adventure!

import Mathlib

example (n : ℕ) : ∑ i in Finset.range n, i = n * (n - 1) / 2 := by
  induction n with
  | zero => simp -- base case easy
  | succ n ih =>
    -- use sum to n+1 = last term plus sum to n
    rw [Finset.sum_range_succ]
    -- use inductive hypothesis
    rw [ih]
    -- goal now looks completely obvious
    -- `⊢ n * (n - 1) / 2 + n = (n + 1) * (n + 1 - 1) / 2`
    ring -- fails
    -- now totally stuck
    sorry

Kevin Buzzard (Nov 12 2024 at 17:06):

And then they come and ask here, and are informed that they are in a real mess, and sometimes people come up with arguments which finish the job from here but it's always clunky and horrible, maybe n=0 is a special case because there's an n-1 in there, you will probably have to invoke the fact that n is either even or odd to show that $n(n-1)/2 * 2 = n(n-1)$ and so on. It's a really disastrous introduction to formal proof verification because it turns out that the question did not say what they thought it said, as the basic symbols - and / didn't mean what they thought they meant.

Riccardo Brasca (Nov 12 2024 at 17:07):

Sorry I am with my phone so I will let others write detailed explanations, but two more examples:

PNat is not included in Nat (something true in set theory regardless of N not included in Z).
Our definition of a group includes the (unique) Z-action.

Kevin Buzzard (Nov 12 2024 at 17:09):

This is the fix I advertise:

import Mathlib

-- same question but use rationals, where - and / work as
-- a mathematician expects
example (n : ℕ) : ∑ i in Finset.range n, (i : ℚ) = n * (n - 1) / 2 := by
  induction n with
  | zero => simp -- base case easy
  | succ n ih =>
    -- use sum to n+1 = last term plus sum to n
    rw [Finset.sum_range_succ]
    -- use inductive hypothesis
    rw [ih]
    -- weird up-arrows everywhere because ℕ ⊆ ℚ is false
    push_cast
    -- but at least this now works
    ring

Riccardo Brasca (Nov 12 2024 at 17:12):

I think that the main problem here is that what you write in Lean is not what you have in mind (that is much closer to Kevin's code)

Mauricio Collares (Nov 12 2024 at 17:17):

Riccardo Brasca said:

the difference between Complexbetter and Complexevenbetter is that in the last field we say that d i j ≫ d j k = 0 if j=i+1 and k=j+1. Again mathematically this the same as before, but now every time we know that these conditions are satisfied (and those are Prop, so we don't care how we proved them) we can use d_comp.

Also known as "don't touch the green slime" (unless I misunderstood the issue), as in page 4 of https://personal.cis.strath.ac.uk/conor.mcbride/PolyTest.pdf

Kevin Buzzard (Nov 12 2024 at 17:28):

Here is an example coming from commutative algebra which I think you might like Emily. It is precisely an issue of set theory v type theory, and early on in my Lean journey I was quite concerned about this issue. When I started pushing Lean at Imperial and getting undergraduates involved, they wanted projects, and I wanted commutative algebra because it's the backbone of alg geom and it's nice to do in Lean, so I got them doing commutative algebra. One student was formalising a theorem in a textbook involving three rings $A\subseteq B\subseteq C$ and they realised that there were many ways to say this in Lean, because we have Subring C (the type of subrings of C, which would make A and B terms), and we also have Algebra A B (meaning that B is an A-algebra, i.e. A and B are types and there's a canonical ring map A -> B). Every possibility the student tried led to problems. For example if A and B have type Subring C then we don't have A : Subring B, we have A <= B. If we have Algebra A B then we can't use any theorems about subrings. Most of the solutions are furthermore asymmetric, with A a term and C a type, whereas in the paper proof A and C are treated on the same footing.

I had read the Coq paper on formalization of the odd order theorem, where they have the analogous problem with groups $A\subseteq B\subseteq C$ , and in that paper they made what I thought at the time was an extraordinary decision: they have some underlying "universe group $G$ ", and every group they consider has type Subgroup G. So now at least we have the fact that all groups are being treated the same -- they're all terms, apart from this universe group which is just playing a weird role in the background. This was a design decision all the way through the project; if $A$ and $B$ were two random groups then they would be considered as subgroups of $G:=A\times B$ , and more generally it's easy to see that given any finite collection of finite groups, you can realise them all as a subgroup of a bigger group (the product). Whilst I thought that this was a strange design decision, I figured that if it could be used to prove the odd order theorem then it could be used to do graduate commutative algebra, and for a week or so we experimented with proving theorems in ring theory via this set-up. But then to my horror a student pointed out to me that given two random rings $A$ and $B$ , it was not in general possible to find a "universe ring" $R$ containing $A$ and $B$ as subrings, because Lean's definition of subrings (which I was very happy with as it's the convention used by Grothendieck all through alg geom) demanded that the 1 of the ring coincided with the 1 of the subring, so $R:=A\times B$ doesn't work! Indeed it's very easy to convince yourself that there's no ring which contains both $\Z/2\Z$ and $\Z/3\Z$ as subrings. (1/2)

Daniel Weber (Nov 12 2024 at 17:35):

This is the problem that docs#IsScalarTower resolves, right?

Kevin Buzzard (Nov 12 2024 at 17:38):

By this point I was quite upset, because the whole idea was that doing commutative algebra in a theorem prover was supposed to be easy, and here we were wrestling not with mathematics but with the underlying axiomatic system. I became convinced that we needed a solution where all rings were treated equally, and if they can't all be terms then they were going to have to all be types. I haven't yet mentioned the problem with this set-up, but here it is: we now have A B C : Type, [CommRing A] [CommRing B] [CommRing C] and [Algebra A B] [Algebra A C] [Algebra B C] to indicate that each ring is a subring of the other rings later in the alphabet. What happens now is that typeclass inference is great at tracking the maps between the rings (as they're in the typeclass system with the [Algebra] instances), but sometimes you have an element of A which you treat as an element of B and then as an element of C, and now you're in trouble because you can also treat an element of A directly as an element of C (all the time applying these canonical maps coming from the Algebra instances) and now the typeclass system doesn't know that these two ways of getting from A to C are equal. The brilliant solution to this, discovered by Kenny Lau, was to just make another typeclass [AlgebraTower A B C], which was just the proposition saying that these maps were the same. This worked! After Kenny had written the API, we were able to stop worrying about foundations and just get back to the mathematics, which went back to being painless. Then people like Eric Wieser came along and realised that in fact we could even do [IsScalarTower A B C], which is the axiom (a • b) • c = a • (b • c), which makes sense for algebras but also makes sense in far more generality, is equivalent to the claim that the triangle commutes in the case when A->B->C are rings with A acting on B via applying the canonical ring hom and then multiplying, and which is the set-up that we use today. (2/3)

Kevin Buzzard (Nov 12 2024 at 17:45):

The moral of the story here was that if you go ahead and try and do graduate level commutative algebra in a dependent type theory system in 2020, then you run into issues simply because nobody tried this already. And so it's on you to solve these issues! But because of the amazing community we have here, the issues were eventually solved. You are frustrated because you're also running into issues which are non-issues from a mathematical perspective but which impede the process of formalization, and it might simply be the case that what's happening is that there will be pain points when formalizing cosmoi in DTT because nobody tried it before and so nobody found the solutions. Potential solutions for your problems could perhaps come from a number of places: some trick involving typeclasses, or some bespoke tactic which does some of the work for you, or a reorganization/generalization of the material, or who knows what else. But once the community begins to understand properly what the problems are, it would not surprise me if the community is able to solve them to the point where they stop being painful. This might not happen instantly though. But the reason it's painful now might simply be that nobody ran into these issues before. You might want to try and convey your frustration to Agda experts, who might perhaps have run into this kind of question and come up with ways of solving it. However this might not help, even if they've seen analogous problems before -- for example the Coq solution which worked for groups didn't work for rings, and we needed something else.

Kevin Buzzard (Nov 12 2024 at 17:47):

(Anyone who wants to relive my pain can see the original thread here )

Floris van Doorn (Nov 12 2024 at 18:08):

I think what Kevin is saying is really important. Often there are many ways to state something, and most of them are really annoying to work with, and one if them is (almost) as natural as doing it on paper. Finding this natural way of doing things is really hard work.

In analysis, I've not encountered much of these problems. One minor one is working with $L^p$ -functions. It is inconvenient to work with docs#MeasureTheory.Lp, because it is a quotient of maps up to a.e.-equality, and so properties like $(f + g)(x) = f(x) + g(x)$ don't hold everywhere, only almost everywhere. It is much nicer to just work with functions directly, and require conditions on them almost everywhere.

In differential geometry, working with manifolds (in explicit charts) is still quite painful, which suggest to me that we haven't found the best way to work with them yet. Or maybe we just need better automation to prove properties like "this point lies in this chart". mfld_set_tac is a tactic that tries to do this, but is quite limited.

Bhavik Mehta (Nov 12 2024 at 18:35):

Emily Riehl said:

I'll give a little more detail on what might be special about category theory.

On paper, category theory has a notion of "evil," the primary example being categorical statements that involve equality between objects. The reason such statements are "evil" is that they often fail to be invariant under equivalence between categories.

In Lean, it seems relatively easy to prove those "evil" statements that are true. But what can be more difficult is proving that two morphisms are equal because morphisms live in dependent types (over their domain and codomain objects) and if the objects are provably equal rather than definitionally equal, the morphisms belong to different types.

I spent quite a lot of time doing category theory and topos theory in Lean a couple of years ago, and I agree there's frustrating when you can't prove those two morphisms are equal. But eventually I got into the habit of doing something like this: if A = A' and B = B', and I have morphisms f : A -> B and g : A' -> B', then where I'd informally write f = g, in Lean I'll say f >> eqToHom _ = eqToHom _ >> g, and now I have a nice commutative square. And this works pretty well, you can combine these pretty easily and (in my experience) everything came out quite nicely (when I get two eqToHom next to each other, they can combine, and if I thereby end up with eqToHom (_ : A = A) then by proof irrel it vanishes). But I can also imagine there's a distinction when you're in 1-category land like I was, and when you're in higher categories, so maybe my experience isn't transferrable to your context

Bhavik Mehta (Nov 12 2024 at 18:36):

Emily Riehl said:

maybe this is a third mode of proof construction: "on paper", "in a proof assistant", "in your head"

(On this note, I've always viewed formalisation as not paper -> proof assistant, but two distinct processes paper -> head then head -> proof assistant. And sometimes the second one goes quite a bit smoother if I intentionally leave a time gap between the two)

Dan Velleman (Nov 12 2024 at 20:33):

What I've been doing in Lean is much more elementary than most of what is being discussed here. In How To Prove It with Lean, I was writing for undergraduates who are learning to write proofs. My goal was to use Lean as a tool to teach proof writing, not to teach Lean, so I used only the simplest tactics and the simplest definitions and theorems from Mathlib.

One challenge I ran into was in the last chapter, which is on cardinality. I wanted to be able to talk about cardinalities of sets, not types. I could have converted sets to subtypes and then talked about bijections, but I didn't need subtypes anywhere else in the book, and I wasn't sure it was worth it to introduce them. And using subtypes causes some awkwardness. For example, one theorem I wanted to prove is that if a set $A$ has $n+1$ elements and $a \in A$ , then $A \setminus \{a\}$ has $n$ elements. (I needed this for proofs about finite sets by induction on the number of elements in the set.) Every element of $A \setminus \{a\}$ is an element of $A$ , but if you convert them to subtypes then the correspondence between them is more complicated. I worried that, for my audience, those complications would just be a distraction. In the end, I defined two sets to be equinumerous if there is a relation pairing up their elements one-to-one (no need for subtypes). That made a few things harder--for example, to prove that "equinumerous" is a transitive relation, I couldn't just quote a theorem about the composition of bijections being a bijection. But for the most part things worked out OK.

Ruben Van de Velde (Nov 12 2024 at 20:38):

@Dan Velleman I guess I'm not quite following your point; would you not use Set.ncard or Set.encard in such a situation?

Dan Velleman (Nov 12 2024 at 20:45):

I was trying to develop the theory from scratch as an illustration for students, rather than using theorems in Mathlib. Also, I was doing both finite and infinite cardinalities.

Kim Morrison (Nov 12 2024 at 20:51):

(Thanks, Emily, for asking such a great question, and everyone for their interesting answers! Even having lived through it all, I really appreciated reading this recap.)

Last updated: May 02 2025 at 03:31 UTC