Zulip Chat Archive

Stream: maths

Topic: Feasibility of formalizing Grid Homology for summer pro...

WHY (Jun 30 2025 at 17:30):

Hello everyone!

We are a group of undergraduate students working on a summer research project, and we're hoping to get some advice from the community.

Our Project: Our goal is to formalize the basics of Grid Homology, based on the combinatorial definition given in the recent lecture notes by Wong and Zampa and Chapter 3 and Chapter 4 on the book Grid Homology for Knots and Links. We plan to tackle the simplest case: the "tilde" or "hat" version of the chain complex ( $\tilde{GC}(\mathbb{G})$ or $\hat{GC}(\mathbb{G})$ ) over the field $\mathbb{F}_2$ . Our plan is
1. Define a grid diagram as a combinatorial object in Lean. (as a double-permutation)
2. Define the chain complex generators as bijections of horizontal and vertical circles given by a toroidal grid diagram
3. Define the differential $\partial$ by counting "empty rectangles" between two generators.
4. Prove that $\partial^2 = 0$ , which is the core of the project.
5. Define the resulting homology .
Our Situation: I have some basic experience in Lean but my collaborators are new to Lean and are actively learning the basics now. Our supervisor is a topologist who is very supportive of the project but does not have prior experience with Lean, so we're all learning the practical side of formalization together.
Our Initial Assessment: Based on our research, it seems the algebraic infrastructure we need (chain complexes, homology, vector spaces over $\mathbb{F}_2$ ) is already very well-supported in Mathlib. The main challenge appears to be the combinatorial proof that $\partial^2 = 0$ , which is the Proposition 1.4 in the notes. The proof in the notes is visual, based on a double counting argument for how rectangles can be combined . Our idea is to translate this visual argument into a formal proof by translating into combinatorial language (phrased in terms of permutations and coordinates on $\{0,...,n-1\}^2$ )
Our Questions:
1. Does this project appear feasible and appropriately scoped for a team of beginners to accomplish over the course of a summer (approximately 2 months)? Additionally, is it worthwhile and advisable to formalize this project?
2. Are there any major "gotchas" or tricky parts in formalizing this kind of combinatorial pairing argument that we might be underestimating?
3. Are there any existing projects in Mathlib or elsewhere that formalize similar "counting arguments for a differential" that we could study as a good example?

Thanks so much for any advice or insights you can offer!

Kevin Buzzard (Jun 30 2025 at 22:28):

I would feel more confident about this if you had at least one member of the team who was a Lean expert (whether or not they knew anything about the mathematics) or had easy access to such a person.

Formalization is an art. There will be an art to formalizing the proof of 1.4. I feel almost like 1.4 is cheating and the correct thing to formalize is Remark 1.5, because the correct approch to formalisation is to formalise the right theorem the first time around, as opposed to corollaries of it, because someone somewhere will need the right theorem at some point. For example Heegaard Floer homology is mentioned in recent serious papers in the Annals and if one wants to formalize the statements of those papers one wants to have the foundational theory done in the correct generality. However if this is only a one-off undergraduate project then of course you can formalize what you like. It would be a shame to miss this opportunity to get Heegaard Floer homology into mathlib. However you can regard your experiment as something which teaches the community what does and doesn't work.

Other than finding someone who is really good at Lean and prepared to work with you (which is my number 1 piece of advice and might well be necessary to stop the project fro failing) I think my main suggestion would be to really understand why 1.4 is true before you write any Lean code at all. Write down a super-detailed proof in LaTeX (as part of your blueprint -- every project needs a blueprint) and make sure that there are absolutely no parts of it which are vague in any way. If it comes out to ten pages for something which takes 1 page in the pdf then so be it. Can you simplify your proof? Try and find several proofs in the literature. Think about the right generality. Are those rows and colums really indexed by {1,2,..,n}? Would an arbitrary totally ordered finite set work? Or is no ordering needed -- would Z/nZ work? An arbitrary non-ordered finite set? What is the correct level of generality of all the constructions? What the heck does "a matching is a one-to-one correspondence between the vertical and horizontal circles." (p3) mean? I personally don't have a clue -- what ever is a vertical circle? do you understand this? did you read [31]? Does it help? Did you read the original papers where this was defined, rather than the first thing you found on the internet? The original sources for an idea often contain insights which get forgotten later. What is the correct abstraction? Someone with some Lean experience might really be able to help you here, which is why I am a bit nervous -- you seem to be full of enthusiasm and have many mathematicians available, but where is the person who will explain to you the right way to think about Lean?

WHY (Jul 01 2025 at 03:04):

Kevin Buzzard said:

I would feel more confident about this if you had at least one member of the team who was a Lean expert (whether or not they knew anything about the mathematics) or had easy access to such a person.

Formalization is an art. There will be an art to formalizing the proof of 1.4. I feel almost like 1.4 is cheating and the correct thing to formalize is Remark 1.5, because the correct approch to formalisation is to formalise the right theorem the first time around, as opposed to corollaries of it, because someone somewhere will need the right theorem at some point. For example Heegaard Floer homology is mentioned in recent serious papers in the Annals and if one wants to formalize the statements of those papers one wants to have the foundational theory done in the correct generality. However if this is only a one-off undergraduate project then of course you can formalize what you like. It would be a shame to miss this opportunity to get Heegaard Floer homology into mathlib. However you can regard your experiment as something which teaches the community what does and doesn't work.

Other than finding someone who is really good at Lean and prepared to work with you (which is my number 1 piece of advice and might well be necessary to stop the project fro failing) I think my main suggestion would be to really understand why 1.4 is true before you write any Lean code at all. Write down a super-detailed proof in LaTeX (as part of your blueprint -- every project needs a blueprint) and make sure that there are absolutely no parts of it which are vague in any way. If it comes out to ten pages for something which takes 1 page in the pdf then so be it. Can you simplify your proof? Try and find several proofs in the literature. Think about the right generality. Are those rows and colums really indexed by {1,2,..,n}? Would an arbitrary totally ordered finite set work? Or is no ordering needed -- would Z/nZ work? An arbitrary non-ordered finite set? What is the correct level of generality of all the constructions? What the heck does "a matching is a one-to-one correspondence between the vertical and horizontal circles." (p3) mean? I personally don't have a clue -- what ever is a vertical circle? do you understand this? did you read [31]? Does it help? Did you read the original papers where this was defined, rather than the first thing you found on the internet? The original sources for an idea often contain insights which get forgotten later. What is the correct abstraction? Someone with some Lean experience might really be able to help you here, which is why I am a bit nervous -- you seem to be full of enthusiasm and have many mathematicians available, but where is the person who will explain to you the right way to think about Lean?

Thank you for your detailed suggestions, Prof. Buzzard. Your feedback are very helpful and valuable for us!

When my supervisor proposed this topic, I was uncertain if formalizing this specific result aligned with the philosophy of Mathlib. While the proof is combinatorial, my own exploration led me to believe the formalization would be extremely difficult and perhaps not the most necessary target.

I will follow your advice and discuss finding a Lean expert with my supervisors. I will also raise these concerns to our supervisors and if we agree the project is not the right fit, I hope we can switch topics. Thank you again!

Notification Bot (Jul 01 2025 at 03:04):

WHY has marked this topic as resolved.

Patrick Massot (Jul 01 2025 at 07:58):

Kevin Buzzard said:

It would be a shame to miss this opportunity to get Heegaard Floer homology into mathlib.

At the current rate of progress, Mathlib is hundreds of years away from getting Heegaard Floer homology. Fortunately, this is not what we are discussing here.

Kevin Buzzard (Jul 01 2025 at 08:50):

Even just the definition?

Patrick Massot (Jul 01 2025 at 08:56):

Oh yes.

Michael Rothgang (Jul 01 2025 at 09:40):

I should emphasize that "at the current rate of progress" is load-bearing here: I dearly hope this changes.

Kyle Miller (Jul 01 2025 at 11:27):

Formalizing the combinatorics of grid diagrams and grid link homology would be valuable.

It might come from deeper geometry @Kevin Buzzard, but this stuff can stand on its own. Someone could calculate some invariants of some concrete knots and formally prove it.

It's ambitious for two months, but getting the combinatorics going and sketching out a blueprint seems doable.

You're right Kevin about ordered sets. I think technically "pointed cyclic orders" (made that terminology up) is the right setting, since the diagrams are sort of on a torus with a distinguished fundamental domain.

Kevin Buzzard (Jul 01 2025 at 12:18):

Yeah, reading the article again it seemed that order was important (because of crossings) but conversely they were emphasizing that it should be thought of as a torus!

WHY (Jul 02 2025 at 18:14):

Thank everyone for the suggestions and comments!

Notification Bot (Jul 12 2025 at 00:13):

WHY has marked this topic as unresolved.

WHY (Jul 20 2025 at 20:34):

Hello everyone,

We have an update on our project and some new questions.

First, we are currently working on the blueprint to understand the combinatorics involved and are not writing much Lean code yet, aside from some basic definitions that we may revise later. So far, we have tried three ways to formulate the grid diagrams:

(1) Permutations on $\{0, ..., n-1\}$ .

(2) Permutations on $\Z/ n\Z$ with the natural order defined.

(3) A "double cover" approach, extending the grid to $2n \times 2n$ , but define the states as coordinates.

Our experience has been that approach (1) leads to a large number of cases for the $\partial^2= 0$ proof, which are hard to formalize even in natural language as they are often distinguished visually by drawing graphs. Approach (2) was very similar and did not significantly simplify the case analysis. (We do not delve deeply into (2), as it closely resembles (1). Perhaps this approach with some motification is more suitable.)

We are now focusing on (3), the double cover, which has been more promising. It seems to reduce the number of basic cases to 8 with more combined cases of basic cases. We have developed an algorithmic proof for $\partial^2 = 0$ that constructs the required pairing. However, we are still working on issues with its well-definedness, such as proving that our construction is consistent across the four quadrants of the larger grid.

We have a few questions:

Has anyone tried to formalize the combinatorial arguments for the $\partial^2 = 0$ proof before, even just in a detailed natural language paper?
The original papers describe this proof as "elementary checking" without any detail. Does there exists a better and smarter abstraction that captures the symmetries in the case analysis to reduce the amount of manual work?
If we cannot find a better abstraction, how well does Lean handle a very constructive, algorithmic proof that involves a large number of cases checking (e.g., over 30)?

Thank you for your time and any suggestions!

Kevin Buzzard (Jul 20 2025 at 21:01):

Not a concrete suggestion but I remember once falling into a similar trap with group cohomology. I had a Masters student define group cohomology as ker(d)/im(d) where d was a map between spaces of inhomogeneous cochains, and to prove that this definition made sense we had to prove $d^2=0$ which was a real mess. Later on we realised that had we defined it using homogeneous cochains instead then the proof of d^2=0 was easy and we could have then checked that the square with the homogeneous and inhomogeneous d's in the rows and the isomorphism of homogeneous and inhomogeneous cochains in the columns commuted and deduced that d^2=0 for inhomogeneous cochains for free. So it's definitely worth trying to find the right abstraction because there might be something which makes it all easy.

Last updated: Dec 20 2025 at 21:32 UTC