Zulip Chat Archive

Stream: new members

Topic: Computability, P (and NP)


Pim Spelier (Aug 28 2020 at 10:40):

Hi all,
@Daan van Gent and I have been working on defining problems, P, and hopefully even NP. We were wondering how exactly this development process should go, as we feel this subject is something that should eventually be in mathlib.

Right now, our questions are: when is a first version ready to be pushed to mathlib? There are already some definitions and theorems that we believe to be of interest; could they already be PR'ed? Is there anything we need to do before they're ready to be PR'ed? Basically, this is something we're having fun with and want to continue, but we don't know exactly how the process works.

For the interested reader, the state of the project right now: we have forked mathlib to this repository, and while this is very much a work in progress (we're doing a partial refactoring now), we currently have

  • A definition of problem
  • A definition of reduction
  • A definition of computable (in polytime) (using Turing machines)
  • A definition of P
    and the lemmas:

  • A proof that the identity is (polytime) computable

  • A proof that the problem on bool with tt being the only yes-instance is in P

Also, on the TM front we're working on increasing the functional power of a Turing machine, for example by making a call stack and formalising libraries (both still very much a work in progress).
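
To give a feeling for what the definitions of problem and reduction look like, here is a rough sketch (illustrative names only, not our actual code, and with the polytime condition left out):

-- a problem as a predicate on bit strings, and a many-one reduction as a
-- map sending yes-instances to yes-instances; the real definitions also
-- require the reduction to be (polytime) computable by a Turing machine
def problem : Type := list bool → Prop

def reduces (A B : problem) : Prop :=
∃ f : list bool → list bool, ∀ w, A w ↔ B (f w)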

Our goals (in order of short term to long term):

  • Prove that the composition of (polytime) computable functions is (polytime) computable (this will give that a problem that reduces to a problem in P is itself in P)
  • Prove that all four problems on bool are in P
  • Prove that is_even is in P
  • Prove that is_odd is in P by giving a reduction to is_even
  • Define NP (this should be very doable already)
  • Show that P is a subset of NP
  • Define NPC
  • Enhance the functional power of Turing machines further
  • Prove that SAT is in NPC

Johan Commelin (Aug 28 2020 at 10:55):

Wow! That's a huge list. You make it seem like the LftCM workshop was ages ago!

Johan Commelin (Aug 28 2020 at 10:56):

@Mario Carneiro is probably the maintainer who knows best how, what, and where this should go in mathlib. He wrote most of the computability/TM stuff that is in mathlib.

Johan Commelin (Aug 28 2020 at 10:57):

Concerning PRs, just to get a feeling for the process, I would start with a very small PR (say 30 lines).
In general we encourage PRs to be at most (say, roughly) 300 lines. [Of course there are exceptions.]

Scott Morrison (Aug 28 2020 at 11:31):

"early and often" is the advice for PRs, I think. As soon as something feels a bit stable (e.g. you've already been through one cycle of throwing it out in disgust and starting over, but this time the next few steps seem to work as well), try to split off that chunk of your work into a PR.

Mario Carneiro (Aug 28 2020 at 18:33):

I would have suggested a PR by about the second item on the list

Mario Carneiro (Aug 28 2020 at 18:33):

it sounds like you have 5 or 6 PRs here

Mario Carneiro (Aug 28 2020 at 18:36):

oops, I missed that the second half is only goals and not completed stuff. I would suggest 2 PRs: one for the definition of P and one for work on the TM library

Pim Spelier (Aug 29 2020 at 11:11):

Thanks for all the advice! I didn't mention this, but we work with self-defined encodings instead of encodable (encodable seems better suited to partial recursive functions than to Turing machines). This has already been refactored and used, so could that be the first PR? And for the second PR, we can probably do the definitions of (polytime) computable by a Turing machine and the definition of P. Does that make sense?
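
To sketch what such an encoding interface could look like (the field names here are illustrative and may not match our fork exactly): a value is encoded as a string over a tape alphabet Γ, with decode a left inverse of encode, which is much closer to what a Turing machine actually manipulates than the ℕ-valued encodable class.

import data.fintype.basic

-- sketch of an encoding interface (illustrative field names)
structure encoding (α : Type) :=
(Γ : Type)
(encode : α → list Γ)
(decode : list Γ → option α)
(decode_encode : ∀ a, decode (encode a) = some a)

-- for Turing machines one also wants the tape alphabet to be finite
structure fin_encoding (α : Type) extends encoding α :=
(Γ_fin : fintype Γ)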

Pim Spelier (Aug 29 2020 at 11:12):

By the way, how do I PR something? Is there a link somewhere where it's explained?

Bryan Gin-ge Chen (Aug 29 2020 at 11:14):

You can start here. Feel free to ask for clarification or help if anything is unclear!

Pim Spelier (Aug 29 2020 at 12:54):

Thanks, that's really useful! I see we do need write access to non-master branches of the mathlib repository; can we get that? Our usernames are MadPidgeon and pimsp

Bryan Gin-ge Chen (Aug 29 2020 at 13:05):

Invites sent! https://github.com/leanprover-community/mathlib/invitations

Mario Carneiro (Aug 29 2020 at 19:35):

@Pim Spelier While I understand the reason you need a different encodable interface, I would like to have only one, so you should keep in mind how the existing encodable definitions can be retrofitted onto a more TM-friendly definition.

Pim Spelier (Aug 30 2020 at 09:38):

What exactly do you mean by retrofitting the existing definitions? The difference between encodable and encoding is something we've been thinking about, and our plan was to eventually show that there are two functions $f : \mathbb{N} \to \text{list } \Gamma_{01}$ and $g : \text{list } \Gamma_{01} \to \mathbb{N}$ such that $g \circ f$ is primitive recursive and $f \circ g$ is computable by a Turing machine. That should give a way to go from encodables to encodings and vice versa.
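
As a toy example, just to make the types concrete (with the alphabet bool and a unary encoding, so not something one would use for polytime bounds):

import data.list.basic

-- a trivial instance of such a pair: f writes n as a string of tt's and
-- g reads the length back, so g ∘ f is the identity (and in particular
-- primitive recursive); f ∘ g replaces every symbol of a string by tt
def f : ℕ → list bool := λ n, list.repeat tt n

def g : list bool → ℕ := list.length

example (n : ℕ) : g (f n) = n := list.length_repeat tt n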

Pim Spelier (Aug 30 2020 at 09:38):

Does that sound like a good solution, or do you have other suggestions/comments?

Martin Dvořák (Dec 23 2021 at 12:04):

Hi there! I am really interested in this project! Has it been abandoned completely?

Alex J. Best (Dec 23 2021 at 12:16):

We should probably tag @Pim Spelier and @Daan van Gent as otherwise it seems unlikely they will read a message posted to a very old thread.

Aporías (Nov 03 2022 at 11:00):

I'm very interested in working on this. Is it still an ongoing project?

Mario Carneiro (Nov 03 2022 at 11:05):

The latest work I'm aware of is https://github.com/prakol16/lean_complexity_theory_polytime_trees by @Praneeth Kolichala

Praneeth Kolichala (Nov 14 2022 at 03:34):

As you might have seen, the existing encodings don't work for defining P because the encodings of lists (and, more generally, of any recursive structure) grow exponentially with the size of the structure. In general, working with nat is hard and ugly and requires a complicated pairing function, though it is possible (this was originally #13213, before I gave up on it because the encoding function was already complicated, and it has a log overhead which is still annoying).
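
To illustrate the blow-up (using mathlib's nat.mkpair as a concrete pairing function): each application of the pairing function roughly squares its arguments, so the number of digits doubles with every level of nesting.

import data.nat.pairing

-- nat.mkpair a b = if a < b then b * b + a else a * a + a + b,
-- so nesting pairs doubles the number of digits at each level
#eval nat.mkpair 1000 1000                                       -- 1002000
#eval nat.mkpair (nat.mkpair 1000 1000) (nat.mkpair 1000 1000)   -- 1004006004000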

Instead, I've been working on using tree-based encodings. These offer several advantages: they are the most natural way to encode recursive structures, including lists; the pairing functions are non-arbitrary and, in particular, satisfy nice definitional equalities; and they are still countable, with a reasonable encoding/decoding to nat. The main disadvantage is that both the literature and the existing infrastructure usually use nat.
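
As a sketch of the idea (not the exact encoding from my repository): a recursive structure embeds into tree unit with only constant overhead per constructor, e.g. a list of booleans becomes a right comb with one node per cons.

import data.tree

-- sketch: encode list bool into tree unit; the size of the encoding is
-- linear in the length of the list, with no pairing-function blow-up
def encode_bool : bool → tree unit
| ff := tree.nil
| tt := tree.node () tree.nil tree.nil

def encode_list_bool : list bool → tree unit
| []       := tree.nil
| (b :: l) := tree.node () (encode_bool b) (encode_list_bool l)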

I've started with the PRs #16715 and #16415, after which #16416 (refactoring primitive recursive to use trees) can go through. Obviously this is a pretty big refactor, but it is the only way I currently see to define good encodings for objects so that polynomial-time things work.

If anyone is interested in contributing to this project, DM me; obviously things progress quite slowly when it's just me.

Yuyang Zhao (Nov 16 2022 at 06:53):

I would like to know how you are going to define LOGSPACE with this method; I am having great difficulties here.

Praneeth Kolichala (Nov 17 2022 at 23:37):

I haven't thought about it too much, but here is a sketch of an idea:

inductive logspace : (list bool → tree unit → tree unit) → Prop
| nil : logspace (λ _ _, tree.nil)
| left {f} : logspace f → logspace (λ i x, (f i x).left)
| right {f} : logspace f → logspace (λ i x, (f i x).right)
| pair {f g} : logspace f → logspace g → logspace (λ i x, (f i x) △ (g i x)) -- △ is the tree.node function
| comp {f g} : logspace f → logspace g → logspace (λ i, (f i) ∘ (g i))
| cases {c f g} : logspace c → logspace f → logspace g → logspace (λ i x, if c i x = tree.nil then f i x else g i x)
-- Extra functions: get the input length and get the `n`th bit of the input
| input_len : logspace (λ i _, encode i.length)
| get : logspace (λ i n, encode (i.nth (decode ℕ n).iget))
-- Iteration constraint: repeated application of `f i` to `x` uses `O(x)` memory
| iterate {f} : logspace f → (∃ M : ℕ, ∀ (n : ℕ) x i, ((f i)^[n] x).num_nodes ≤ M * (x.num_nodes + 1)) →
  /- condition: repeatedly applying `f` terminates -/ sorry → logspace (λ x, /- repeatedly applying `f i` to `x` until some condition (e.g. the left branch is nil) becomes true -/ sorry)

def logspace_fun (f : list bool → tree unit) := logspace (λ i _, f i)

Here, logspace_fun is the set of LOGSPACE functions whose outputs are logspace-bounded in the input. There are some sorries to fill in, but the essential idea is that the initial argument list bool is the read-only input, while the second argument is modifiable "working space memory." Here, we assume the encoding of nats uses the usual (efficient) base-2 encoding.

We can easily modify the input to be a tree instead of list bool as well -- we just have to change get to return the nth node via tree.get. It's written with list bool here to more clearly distinguish the input from the working memory.
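
For instance, a tree.get-style accessor could follow a path of left/right moves into the tree (this is only a sketch; the actual indexing scheme may differ):

import data.tree

-- follow a list of left (ff) / right (tt) moves and return the subtree
-- reached, if the path stays inside the tree
def get_subtree {α : Type} : list bool → tree α → option (tree α)
| []        t                 := some t
| (_ :: _)  tree.nil          := none
| (ff :: p) (tree.node _ l _) := get_subtree p l
| (tt :: p) (tree.node _ _ r) := get_subtree p r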

The slightly tricky bit is with iterate. The idea is straightforward -- we can iterate a function with linear memory overhead until some condition is met. But this function might not terminate, even though we know the only way it can fail to terminate is by repeating an input (since its range is finite for any particular i and x). Maybe there is a better way to state it; I'll have to think about it more and look at existing definitions of L in the literature.

I do believe that this is contained in L and contains all decision problems in L. Intuitively, given a TM, we can write the step function using the constructors other than iterate and then call iterate to run the Turing machine (and the condition for iterate is satisfied if the TM runs in log space).


Last updated: Dec 20 2023 at 11:08 UTC