Zulip Chat Archive

Stream: general

Topic: Ph.D. on formalizing complexity classes


Martin Dvořák (Aug 07 2021 at 06:33):

Hello everyone!

As my Ph.D. project (dissertation), I want to formalize complexity classes and some relationships between them (I would like to reprove stuff such as PSPACE = NPSPACE and I would like to formalize some chapters of fixed-language (V)CSP complexity, too). Do you think it is better to use Lean or Coq for that?

[crossposting to Zulip chat for Coq]

Mario Carneiro (Aug 07 2021 at 06:52):

I would say use Coq if you are interested in reusing the existing complexity theory library. Lean doesn't have much of a library in this area besides the foundations of computability theory, but if you are interested in reproving from scratch then this may eliminate Coq's head-start on the topic. Beyond that, you should use whichever system you are more comfortable with; I would suggest lean but I am of course biased.

Martin Dvořák (Aug 07 2021 at 07:08):

Do you think that this Coq library https://github.com/uds-psl/cook-levin/tree/paper-cook-levin will be of any use if I want to go beyond NP? I have no idea whether there is a reasonable way to model space complexity in the lambda calculus L.

Martin Dvořák (Aug 07 2021 at 07:11):

There is also this library https://github.com/uds-psl/tm-verification-framework for reasoning about Turing Machines in Coq; however, they didn't manage to do much with the library. There are lots of abstractions but not many results.

Patrick Massot (Aug 07 2021 at 10:08):

Don't you have a PhD advisor that could help you take this kind of strategic decision?

Martin Dvořák (Aug 07 2021 at 10:13):

My Ph.D. supervisor is a very good complexity theorist but he has no experience with formal proving. I will basically have to do my dissertation on my own. You guys are the best advisors I can get and I really appreciate you!

Patrick Massot (Aug 07 2021 at 10:15):

This is a pretty weird way to be a PhD advisor...

Martin Dvořák (Aug 07 2021 at 10:18):

I mean, I know I didn't make the best decision. However, I am already in the gradschool, I already have the funding and I probably need to make a compromise. I was working with my supervisor on two other projects but he was dissatisfied with me because I was too slow in proving his conjectures. We ended up agreeing it will be better to do something on my own, in my own tempo; and he will prove his conjectures with someone else, somebody better and faster than I am.

Martin Dvořák (Aug 07 2021 at 10:22):

You probably see that my decision (to change my research topic to something where my supervisor cannot guide me) is a bad strategy. In general. However, my experience so far was like... "This is an open problem, this is my guess what the answer will be, prove it." I realized I couldn't do it anymore.

Patrick Massot (Aug 07 2021 at 10:35):

I wasn't commenting your choices but your advisor choices. I understand the situation better now.

Martin Dvořák (Aug 07 2021 at 10:54):

Mario Carneiro said:

Beyond that, you should use whichever system you are more comfortable with;

I cannot decide which system I am more comfortable with and be responsible about it. I know a little bit of Lean. I never tried Coq. Thus, I am currently "more comfortable" in Lean but that doesn't mean much. I could probably switch to Coq and catch up with my current skill level from Lean to Coq in about one or two weeks.

The only thing I can say about my preference right now is that I find Lean's syntax nicer (I don't claim it to be better for my purposes or better in general; I just consider it more aesthetically pleasing to me eyes) and I find Lean a little bit more elegant in general. However, my main motivation to stay with Lean (as opposed to Coq) is this community. You guys are the best!

Alistair Tucker (Aug 07 2021 at 12:15):

It's always possible that your supervisor's conjectures were false. (I've been there.)

Martin Dvořák (Aug 07 2021 at 16:01):

Alistair Tucker said:

It's always possible that your supervisor's conjectures were false. (I've been there.)

Yes, it is possible, have have been there; however, it seems more likely that I am just stupid right now.

Martin Dvořák (Aug 07 2021 at 16:02):

Let us get back to the topic. I want to formalize Complexity theory in Type theory. Which language should I choose?

Johan Commelin (Aug 07 2021 at 16:26):

I can't really help you with complexity theory. But I do want to warn you for another disappointment. If I were you, please make very sure that your advisor is happy with your new topic/direction. There are many mathematicians out there that don't give a thing for formalisation, and that do not consider it research.
You will end up spending 80% of your time writing libraries that prove stuff that is covered in the first two introductory courses on complexity theory. Your advisor will find it trivial, and you won't learn much maths from this.

You really don't want to spend 3 years working on this, and then learn from your divisor that what you did is not going to be rewarded with a PhD title.

This all sounds very negative. And there are plenty scenarios where it turns out a lot more positive than what I said. But what I sketched above is a serious possibility, and you need to make sure that you don't go down that path.

Mario Carneiro (Aug 07 2021 at 17:10):

That said, you absolutely can parlay work in formalized mathematics into a PhD: Floris van Doorn , Rob Lewis and I all basically did that

Mario Carneiro (Aug 07 2021 at 17:12):

and I'm sure you can get more specific help about lean here, so if that matters to you and you prefer lean's interface then I would say go for it. Lean/Coq foundational differences aren't going to matter for the most part. You will have to talk to the Coq people to see if those libraries are good for you, and if you can collaborate with one of them then that might be a reason to use Coq

Mario Carneiro (Aug 07 2021 at 17:14):

Certainly having someone with a connection to a practicing complexity theorist would be a very good thing from our end. I imagine we could make a lot of progress into areas we are currently lacking

Martin Dvořák (Aug 07 2021 at 17:39):

Johan Commelin said:

If I were you, please make very sure that your advisor is happy with your new topic/direction. There are many mathematicians out there that don't give a thing for formalisation, and that do not consider it research.

My supervisor approved of this new direction. However, I will have to pass a qualifying exam by 2021-11-15; part of which is a defense of my research proposal. There will be several committee members to decide whether my topic is good enough for Ph.D.. After I pass this exam, only my own failure might prevent me from getting a Ph.D., not other people's opinions on whether formalization is a worthy activity / real research.

Martin Dvořák (Aug 07 2021 at 17:41):

Johan Commelin said:

You will end up spending 80% of your time writing libraries that prove stuff that is covered in the first two introductory courses on complexity theory.

This is exactly what I want to do! I want to understand the foundations of my field in the finest detail. I don't have ambitions to study things that are conceptually difficult or very advanced.

Martin Dvořák (Aug 07 2021 at 17:49):

Mario Carneiro said:

Certainly having someone with a connection to a practicing complexity theorist would be a very good thing from our end. I imagine we could make a lot of progress into areas we are currently lacking

Which particular areas would you like to see implemented in Lean? Apart from the complexity classes, probably...

Mario Carneiro (Aug 07 2021 at 18:09):

NP completeness + cook's theorem would be very useful

Martin Dvořák (Aug 07 2021 at 18:29):

Mario Carneiro said:

NP completeness + cook's theorem would be very useful

How difficult do you think it would be to translate the Kunze's and Forster's proof from Coq to Lean?

Mario Carneiro (Aug 07 2021 at 18:48):

I don't know much about it. I worked on turing machines for a while, you might need to set up an appropriate computational model infrastructure, and then write a SAT verifier as a turing machine

Martin Dvořák (Aug 07 2021 at 19:00):

Mario Carneiro said:

I don't know much about it. I worked on turing machines for a while, you might need to set up an appropriate computational model infrastructure, and then write a SAT verifier as a turing machine

What I meant in my last question was more like ...
Would it be valuable (for the Lean community) if I "mindlessly transcribed" the Coq proof into Lean?

Hunter Monroe (Aug 10 2021 at 22:56):

@Martin Dvořák it would be a very bad idea to try to write a PhD thesis without an advisor and department who can support you in writing it and later advocate that you had done something useful. As it stands, there is a risk that your supervisor would say if asked "he could not solve the problems I gave him, and instead of doing original work he mechanically formalized others' work". If you want to do a PhD on formalization, move to somewhere like CMU, Cornell, or TUM with the right support system. It is a real prospect that your topic would not be approved in November unless you shift to something sexier, like using AI to come up with better tactics or automating formalization (presenting a plan B might be a good idea). You might also fail to complete your thesis if you head in the wrong direction--the Coq paper states 'They conclude, after implementing a framework for programming TMs consisting of 19K lines of Coq code, that “TMs as model of computation are inherently infeasible for the formalisation of any computability or complexity theoretic result”, owing to the non-compositionality and low-level structure of TMs.' Your supervisor's job is to point you toward interesting problems and steer you away from dead ends. Your convex grabbing paper sounds interesting--perhaps that is a thesis chapter and you can find others with a similar theme. Caveat: my PhD is in economics, although I have published a paper in computational complexity.

Martin Dvořák (Aug 12 2021 at 20:01):

In fact, I have plans to switch to developing AI for "automatic" theorem proving at some point in my career (maybe for a postdoc -- if I manage to finish my Ph.D.); however, before I do that, I want to have several years of solid experience with "manual" theorem proving and formalization (and I also want to create something meaningful and beautiful during this epoch of my life).

Martin Dvořák (Aug 18 2021 at 07:21):

I was reading and thinking about the formalization of complexity classes. I've come to the conclusion that aiming directly for formalizing complexity classes in computational model would not be a good Ph.D. idea. First of all, it seems to be bloody hard! Second, Lennard Gäher and Fabian Kunze seem to be going to publish more about complexity classes in Coq quite soon, and so, there would be a risk that my work would end up being just a subset of what these two great guys will have done.

I would like to base my research proposal around the following notes. Do you have any comments on that, please? Did I miss an important paper from this area that already formalized something relevant that I am not aware of having been done?

[crossposting to Zulip chat for Coq]

Already in Lean/mathlib:

  • regular grammars (regular languages) and regular expressions
  • general grammars (recursively-enumerable languages)

Already in Coq:

  • regular grammars (regular languages) and regular expressions
  • context-free grammars (context-free languages)
  • general grammars (recursively-enumerable languages)

I would like to add to Lean/mathlib:

  • context-free grammars (context-free languages) and their subclasses
  • conjunctive grammars
  • context-sensitive grammars (context-sensitive languages a.k.a. NSPACE(n))
  • their inclusion in respective complexity classes (if their definitions become available)

Plan B:

  • switch to Coq
  • prove the remaining 19 out of 21 Karp's NP-complete problems (currently published: SAT and Clique)
  • formalize some hardness results from VCSP complexity (without proving that LP is in P — this would be an assumption)

Last updated: Dec 20 2023 at 11:08 UTC