Zulip Chat Archive
Stream: general
Topic: Computational Complexity Theory
Parker Bjur (Dec 09 2021 at 15:45):
Is there a method for writing proofs about computational complexity in Lean? I have not been able to find anything of the sort, but I am also not very sure where to look.
Anne Baanen (Dec 09 2021 at 15:49):
There is an apparently convenient way to do so in Coq through the λ calculus: https://drops.dagstuhl.de/opus/frontdoor.php?source_opus=13915
Anne Baanen (Dec 09 2021 at 15:51):
See also these previous threads:
- https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/complexity.20theory
- https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Ph.2ED.2E.20on.20formalizing.20complexity.20classes
- https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there.20code.20for.20X.3F/topic/Complexity.20theory
- https://leanprover.zulipchat.com/#narrow/stream/113489-new-members/topic/Computability.2C.20P.20.28and.20NP.29
Parker Bjur (Dec 09 2021 at 16:01):
Thank You!
Martin Dvořák (Dec 23 2021 at 12:07):
This is a long shot but ... is anyone working on any complexity theory in Lean?
Martin Dvořák (Dec 24 2021 at 14:06):
How difficult would it be to transcribe Fabian Kunze et al.'s library from Coq to Lean? Does anyone have a rough estimate?
https://github.com/uds-psl/cook-levin
Mario Carneiro (Dec 24 2021 at 21:43):
I don't see a good reason to actually transcribe it, those proofs don't look easy to follow. Probably you should just use the theorems as inspiration
Martin Dvořák (Dec 24 2021 at 21:59):
Mario Carneiro said:
I don't see a good reason to actually transcribe it, those proofs don't look easy to follow. Probably you should just use the theorems as inspiration
Thank you for the answer!
At what level would you recommend taking inspiration from it? Should I break the problem down into equivalent lemmata one-to-one?
Bolton Bailey (Dec 24 2021 at 23:58):
I've been working on a PR for the mu-recursive definition of computability. It seems like https://github.com/uds-psl/cook-levin and https://arxiv.org/pdf/1102.5495.pdf both use definitions that have a similar approach to the model of computation. Perhaps after this PR is made, we can build up from there in terms of defining the time complexity of various algorithms.
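For concreteness, the rough shape of such a μ-recursive definition is a small inductive type of codes; the sketch below is a hypothetical toy, not the contents of the PR or mathlib's `nat.partrec.code`.

```lean
-- Toy syntax of μ-recursive (partial recursive) function codes; all names made up.
inductive MuRec
  | zero                               -- the constant-zero function
  | succ                               -- successor
  | proj (i : Nat)                     -- projection onto the i-th argument
  | comp (f : MuRec) (gs : List MuRec) -- composition f ∘ (g₁, …, gₖ)
  | prec (f g : MuRec)                 -- primitive recursion
  | mu (f : MuRec)                     -- unbounded search (the μ-operator)
```

A time-complexity measure would then be a second function defined over this syntax, which is the part the PR and the rest of this thread are aiming at.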
Bolton Bailey (Dec 25 2021 at 00:08):
Ok, the PR is #11046. @Martin Dvořák , if you or anyone else wants to contribute, feel free to push to this branch. It feels like someone comes along every few months looking for a complexity library and not finding one. Maybe the issue is that we just need to get started, and once people see there's something to build off of, it'll grow from there.
Kevin Buzzard (Dec 25 2021 at 00:14):
That's exactly what happened with graph theory BTW. Consider making your own stream!
Arthur Paulino (Dec 25 2021 at 11:17):
Great initiative!
Martin Dvořák (Dec 25 2021 at 11:53):
It seems to me that I don't have permission to create new streams. Will any admin or moderator help me?
Martin Dvořák (Dec 25 2021 at 12:03):
Bolton Bailey said:
I've been working on a PR for the mu-recursive definition of computability. It seems like https://github.com/uds-psl/cook-levin and https://arxiv.org/pdf/1102.5495.pdf both use definitions that have a similar approach to the model of computation. Perhaps after this PR is made, we can build up from there in terms of defining the time complexity of various algorithms.
How big is the time overhead of partial recursive functions over Turing Machines? And over RAM, is it known?
Arthur Paulino (Dec 25 2021 at 15:48):
Bolton Bailey said:
Ok, the PR is #11046. Martin Dvořák , if you or anyone else wants to contribute, feel free to push to this branch. It feels like someone comes along every few months looking for a complexity library and not finding one. Maybe the issue is that we just need to get started, and once people see there's something to build off of, it'll grow from there.
Idea if more than 1 person wants to contribute: open PRs to that branch instead of committing and pushing directly to it (as if it were a master branch)
Arthur Paulino (Dec 25 2021 at 23:19):
Would someone more knowledgeable mind laying down a roadmap on this subject? I am having a hard time understanding the structure of the Coq repo
Martin Dvořák (Dec 25 2021 at 23:50):
I can sketch some preliminary roadmap, but somebody more knowledgeable will have to check and correct it after me. Would it be easier this way for thee, knowledgeable person?
Arthur Paulino (Dec 25 2021 at 23:52):
@Martin Dvořák I'd say go for it :D. It's a brand new branch so np
Martin Dvořák (Dec 26 2021 at 15:29):
My rough sketch of the roadmap towards complexity theory in Lean is the following. We will follow the approach of Kunze et al.
Paper: https://drops.dagstuhl.de/opus/volltexte/2021/13915/pdf/LIPIcs-ITP-2021-20.pdf
Code (Coq): https://github.com/uds-psl/coq-library-complexity
Please note that we won't use the letter L for the name of the complexity class (deterministic logarithmic space).
(1) Define the notion of a decision problem in the weak call-by-value λ-calculus L.
(2) Define the class P in L.
(3) Define the class NP in L.
(4) Define polytime reductions in L.
(5) Prove that, if A polytime reduces to B and B is in P, then A is in P. (A minimal Lean sketch of items (1)–(5) appears right after this list.)
(6) Define the classes NP-hard and NP-complete in L.
(7) Define a natural NP-complete problem in L.
(8) Prove that, if NP-hard A polytime reduces to B, then B is NP-hard.
(9) Define abstract heap machines and Turing machines — just so that we can state the auxiliary decision problems about them; we will not program in them; the reductions will always be programmed in L.
(10) State and prove the polytime reductions (very challenging).
(11) State your favourite version of the tiling problem.
(12) Prove the NP-hardness of the tiling problem using the tools above (the vanilla TM will have to be reduced to it).
(13) Reduce your tiling problem to some form of SAT.
(14) We can continue building the complexity theory from here on without getting our hands dirty with the intricacies of various computational models.
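To make the first few items concrete, here is a minimal Lean sketch of the shape that (1)–(5) could take. Everything in it is hypothetical: an object language `Term` standing for the terms of L, an application constructor, encodings of inputs and Booleans, and a step-indexed evaluation relation are merely assumed as parameters; none of this is existing mathlib API.

```lean
section LSketch

-- Assumed interface to the object language L (all hypothetical placeholders).
variable (Term : Type)                        -- terms of L
variable (app : Term → Term → Term)           -- application
variable (encodeInput : List Bool → Term)     -- encoding of input strings
variable (encodeBool : Bool → Term)           -- encoding of Booleans
variable (EvalIn : Term → Nat → Term → Prop)  -- `EvalIn t k v`: t evaluates to v within k steps

abbrev DecisionProblem := List Bool → Prop

-- (2): some L-term decides the problem within `c * (n ^ k + 1)` steps.
def InP (P : DecisionProblem) : Prop :=
  ∃ (t : Term) (c k : Nat), ∀ x : List Bool, ∃ b : Bool,
    EvalIn (app t (encodeInput x)) (c * (x.length ^ k + 1)) (encodeBool b) ∧ (P x ↔ b = true)

-- (4): an L-term computes the reduction function within a polynomial step bound.
def PolyReduces (A B : DecisionProblem) : Prop :=
  ∃ (t : Term) (c k : Nat) (g : List Bool → List Bool),
    (∀ x, A x ↔ B (g x)) ∧
    ∀ x, EvalIn (app t (encodeInput x)) (c * (x.length ^ k + 1)) (encodeInput (g x))

end LSketch
```

Item (5) would then be a lemma composing the two witnesses, with the polynomial bound obtained by substituting one polynomial into the other.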
Yaël Dillies (Dec 26 2021 at 15:42):
Turing machines already exist in the file `computability.turing_machine`.
Martin Dvořák (Dec 26 2021 at 15:49):
I sent an e-mail to Fabian Kunze requesting his help.
Martin Dvořák (Dec 26 2021 at 15:56):
I don't understand why ye are grinning. I did the best I could with my very superficial understanding of Kunze et al.'s approach.
Martin Dvořák (Dec 26 2021 at 15:57):
I am aware of not providing much of a hint regarding the implementation.
Stuart Presnell (Dec 26 2021 at 16:09):
It looks like an impressive and ambitious plan of work. I don’t know enough to make suggestions, but I’m very pleased to see that someone’s taking this on.
Arthur Paulino (Dec 26 2021 at 16:27):
Martin Dvořák said:
I don't understand why ye are grinning.
Sorry, I didn't mean it in a mocking way. I was just happy that you actually did it :smiley:
Zulip's mobile app doesn't have the "on hover" functionality that shows the actual emoji "word"
Mario Carneiro (Dec 26 2021 at 21:50):
I disagree with the idea of requiring everything to be done in L. Not only is that not the computational model we have chosen for the existing computability material, it is also overly restrictive; users should be able to use any computational basis they want that is relevantly equivalent to Turing machines or the RAM model or whatever is the best computational basis for complexity theory. Currently, `computability.turing_machine` contains 3 different computational bases that are all equivalent (although the framework for making use of the equivalences for application to complexity classes like P does not exist yet), and users should be able to write e.g. polynomial time TM2 programs. The language L is also polytime equivalent to TM0 so you should be able to write programs in L if you want to, but I would want to allow as much freedom in this area as possible.
Dima Pasechnik (Dec 26 2021 at 23:19):
Perhaps Fagin's equivalence of NP and sentences in existential 2nd order logic is something that should be more prominent. (In this vein, polynomial-time solvability becomes definability in the 1st order logic).
Gabriel Ebner (Dec 27 2021 at 10:44):
I disagree with the idea of requiring everything to be done in L.
A big advantage of L is that it is much more ergonomic to use, since Lean quite directly maps to L and most features are directly supported (higher-order functions, partial applications, etc.) with pretty much the runtime and space complexity you'd expect. You can directly compute the runtime of List.filter as defined in Lean. It is also straightforward to show that concrete Lean definitions are computable in L; this is a lot more exhausting in the current computability library (particularly with recursive functions, or n-ary functions with n>=3). I don't even want to imagine a formalization using turing machines.
users should be able to use any computational basis they want that is relevantly equivalent to turing machines
L is one such model, I'm not sure I see your point.
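To illustrate the "directly compute the runtime of List.filter" point: the intended artifact is roughly a cost function with the same recursion structure as the program, parameterized by the cost of the predicate. The names below are hypothetical, not an existing library:

```lean
-- Sketch: the running-time recurrence of `List.filter`, one unit per constructor step,
-- plus whatever the predicate costs on each element (all names made up).
def filterTime {α : Type} (pTime : α → Nat) : List α → Nat
  | [] => 1
  | x :: xs => pTime x + filterTime pTime xs + 1

#eval filterTime (fun _ : Nat => 1) [1, 2, 3]   -- 7
```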
Mario Carneiro (Dec 27 2021 at 10:49):
Well, the work on TMs is not yet complete as far as connecting them to the primrec model. I would like to see a library closer to what we have for proving primrec, which does cover higher-order stuff like list.filter.
Mario Carneiro (Dec 27 2021 at 10:49):
The underlying computational model is somewhat orthogonal to this, you can build a higher order notion of time complexity on any model
Mario Carneiro (Dec 27 2021 at 10:54):
From what I can tell of the complete implementation of L used to derive time bounds, it's not all that different from what you get from constructions like TM_to_partrec. It still has a pretty concrete operational model with an explicit evaluation context, so it's not as compositional as pure lambda calculus. The differences don't seem big enough to make it worth introducing yet another computational model and connecting it to the other models
Gabriel Ebner (Dec 27 2021 at 10:56):
I realize that you've put a lot of effort into this development, but I don't think it's a good direction. The `primrec`/... stuff is really unergonomic to work with, e.g. for `list.filter` you need several lemmas showing that `list.filter` is primrec/computable/partrec if the predicate is primrec/computable/partrec. If we wanted to tackle time complexity, I feel like we'd end up with another computational model and set of theorems anyhow.
Mario Carneiro (Dec 27 2021 at 11:01):
I do think we can borrow some ideas from Kunze et al for the "frontend" part of this, like h.o. time complexity
Mario Carneiro (Dec 27 2021 at 11:03):
`primrec` is limited to first order functions on naturals, which is why there is the proliferation of theorems. With `nat.partrec.code` it becomes more practical to express that a higher order function is "nice" once and for all by proving that it is h.o. equivalent to a single `code`. This option is not available in `primrec` itself because it is still bootstrapping
Mario Carneiro (Dec 27 2021 at 11:05):
In other words, I think we have the tools we need to simulate everything you would do with a computational model like L (in particular all the typeclasses and such that are necessary for an ergonomic library) without actually changing the basis
Mario Carneiro (Dec 27 2021 at 11:07):
I think we need to do some design work on that frontend though; I don't think transliterating the coq code precisely will actually work all that well in lean, we will need to make some modifications but it's going in the right direction
Gabriel Ebner (Dec 27 2021 at 11:09):
It still has a pretty concrete operational model with an explicit evaluation context, so it's not as compositional as pure lambda calculus.
Yes, proving that the time complexity of `f t` is the runtime of `t` + the runtime of `f` on the resulting value is the same in either formalism. What you get with L is in my eyes the following:
- You get a single theorem for computability (namely, a definition of a lambda term that computes the function for all values). Properties like complexity, termination, etc. are then theorems about this lambda term. This also works for higher-order functions without any friction. (To be fair, we could also do this with partial recursive functions and maybe Turing machines---but I'm not sure if there's a canonical Turing machine that suffices for complexity analysis).
- It's obvious how to represent data in L. `List.take 10 xs` has `O(10)` runtime, no matter what the type of the elements in `xs` is. Depending on how lists / function arguments / return values are represented in a Turing machine, this might entail copying the elements, the whole list, etc. (Though we could also solve that with other computational models, like random-access machines.)
Gabriel Ebner (Dec 27 2021 at 11:10):
I think we need to do some design work on that frontend though; I don't think transliterating the coq code precisely will actually work all that well in lean, we will need to make some modifications but it's going in the right direction
Completely agreed on that. Transliterating is certainly the wrong way. But I think their ideas are good and sound, and deserve to be used as inspiration.
Mario Carneiro (Dec 27 2021 at 11:18):
I think there is value in having a more complex intermediate language than L for the sake of having nice primitive operations with crafted time bounds. For example, throwing in the natural numbers with arithmetic operations on simply typed lambda calculus
Gabriel Ebner (Dec 27 2021 at 11:18):
In other words, I think we have the tools we need to simulate everything you would do with a computational model like L (in particular all the typeclasses and such that are necessary for an ergonomic library) without actually changing the basis
I think it's clear that we could make a nicer frontend for showing computability, that just seems like an engineering problem. The part that I don't see is how such a frontend would scale to showing time or space complexity. Maybe you already have this figured out though.
Mario Carneiro (Dec 27 2021 at 11:19):
Certainly it is nice to be able to say that `list.take 10 xs` has time `O(10)`. It would be even better if it takes exactly 10 steps (and lots of other basic functions like it also have exactly the right number of steps)
Mario Carneiro (Dec 27 2021 at 11:21):
we probably won't be able to get that for everything, but a big haskell-ish language would make it easier to do the "writing concrete programs" part, which I expect to be much more important / time consuming than any work needed to set up the model and prove it is polynomially simulated by a TM or partial recursive function.
Gabriel Ebner (Dec 27 2021 at 11:23):
For example, throwing in the natural numbers with arithmetic operations on simply typed lambda calculus
Practically speaking, this won't make any difference at all for any common NP problems because all natural numbers will be bounded by a very low bound. Even the most basic implementation will be fast enough there. We just need to replace `Nat` by `Num` (which is straightforward in the L approach).
Mario Carneiro (Dec 27 2021 at 11:24):
Gabriel Ebner said:
I think it's clear that we could make a nicer frontend for showing computability, that just seems like an engineering problem. The part that I don't see is how such a frontend would scale to showing time or space complexity. Maybe you have already this figured out though.
I've been thinking mostly about time complexity here. I would borrow Kunze's work for this part. Regarding space complexity, I haven't thought much about it, and we might need a RAM model to get all the results that complexity theorists expect here
Gabriel Ebner (Dec 27 2021 at 11:25):
Certainly it is nice to be able to say that `list.take 10 xs` has time `O(10)`. It would be even better if it takes exactly 10 steps (and lots of other basic functions like it also have exactly the right number of steps)
It won't take exactly 10 steps in any model of computation (for a non-specialized take implementation), because decrementing natural numbers has some non-constant runtime.
Mario Carneiro (Dec 27 2021 at 11:27):
Gabriel Ebner said:
For example, throwing in the natural numbers with arithmetic operations on simply typed lambda calculus
Practically speaking, this won't make any difference at all for any common NP problems because all natural numbers will be bounded by a very low bound. Even the most basic implementation will be fast enough there. We just need to replace `Nat` by `Num` (which is straightforward in the L approach).
Yes, but that's not the point. The idea behind the haskell-ish intermediate language is to calibrate the time bounds to be exactly some nice numbers. For example, to set the cost of `m * n` to `m.size * n.size + 1` or whatever we can get away with (and not have to introspect further on how these operations work)
Mario Carneiro (Dec 27 2021 at 11:28):
The actual replacement of `Num` for `Nat` will happen in the once-and-for-all proof that this IL is polytime simulated by a TM
Gabriel Ebner (Dec 27 2021 at 11:28):
but a big haskell-ish language would make it easier to do the "writing concrete programs" part
The whole idea about the L approach (and I believe what they do / want to do in Coq) is that this can be almost fully automated, since the host language maps so closely to the object language. You write a function in pure idiomatic Lean, and then `deriving instance Computable for List.filter` and you have a realizing lambda term + theorems for the runtime of the cons and the nil case.
Mario Carneiro (Dec 27 2021 at 11:31):
Gabriel Ebner said:
Certainly it is nice to be able to say that `list.take 10 xs` has time `O(10)`. It would be even better if it takes exactly 10 steps (and lots of other basic functions like it also have exactly the right number of steps)
It won't take exactly 10 steps in any model of computation (for a non-specialized take implementation), because decrementing natural numbers has some non-constant runtime.
This depends on how tight we need the simulation to be. If we can afford a log factor in the simulation then we can make decrementing numbers O(1)
Gabriel Ebner (Dec 27 2021 at 11:33):
Mario Carneiro said:
Yes, but that's not the point. The idea behind the haskell-ish intermediate language is to calibrate the time bounds to be exactly some nice numbers. For example, to set the cost of `m * n` to `m.size + n.size + 1` or whatever we can get away with (and not have to introspect further on how these operations work)
That would be a wonderful breakthrough. I believe the best known multiplication algorithms are O(n log n).
Gabriel Ebner (Dec 27 2021 at 11:37):
Generally I think the focus on precise numbers is a distraction here. Even in the best case there's going to be lots of annoying +1s and -1s that nobody cares about. The runtime should be stated as a big-O, which I think is very much the standard in complexity theory. Just imagine you'd have to account for all the index computations in a sorting algorithm...
Mario Carneiro (Dec 27 2021 at 11:38):
Sure, we can't build in everything so that's inevitable
Gabriel Ebner (Dec 27 2021 at 11:39):
Right, and if it's inevitable we might as well accept it and use big-O everywhere.
Mario Carneiro (Dec 27 2021 at 11:39):
but unfortunately I think that in practice we won't be able to use big-O nearly as much as the books would have you believe
Mario Carneiro (Dec 27 2021 at 11:40):
lots of proofs require that you fix the big-O constant after the fact in some way that is not easily explained by any concrete definition of big-O
Gabriel Ebner (Dec 27 2021 at 11:41):
Seriously, the biggest issue with L in my eyes is with subpolynomial complexity. I mean sure, you can prove that an algorithm has O(n log n) runtime in L, but I'm not sure how convincing that result is to somebody else. Then there's the question of what LOGSPACE is in L.
Mario Carneiro (Dec 27 2021 at 11:42):
Right. I think that for the question of what computational basis to use for our definitions, we should be looking at the tightest complexity classes, because those can sometimes distinguish the bases
Gabriel Ebner (Dec 27 2021 at 11:42):
Mario Carneiro said:
lots of proofs require that you fix the big-O constant after the fact in some way that is not easily explained by any concrete definition of big-O
Do you have any concrete examples?
Mario Carneiro (Dec 27 2021 at 11:45):
You generally need the existential constant to lie outside all variables, including parameters to the algorithm. This is generally very cumbersome to write, and it is easy to leave out a variable and then get stuck when you try to use the function recursively in a proof much later
Gabriel Ebner (Dec 27 2021 at 11:56):
I see what you mean (in a paper I would always interpret big-O to quantify over everything unless specified otherwise), but I don't think it means big-O is problematic. It just means we always need to quantify over all arguments, i.e. `is_O (fun (m,n) => runtime (m + n)) (fun (m,n) => (m+n) * (m+n).log)`, which is indeed a bit verbose (but fixable).
Mario Carneiro (Dec 27 2021 at 12:02):
Here's a more concrete example. Suppose we have a function like:
def foo : Nat -> Nat
| 0 => 0
| n+1 => do f n; g n; foo n
Let's suppose that `f n` is `O(n)` and `g n` is `O(n^2)`. If we wanted to keep big-O everywhere, we might want to argue like so:
time(f n) is O(n)
time(g n) is O(n^2)
time(foo n) is O(n^3)
time(f n; g n; foo n) is O(n + n^2 + n^3) = O(n^3)
but of course this doesn't work because of the recursion. So we have to introduce an explicit constant:
time(f n) is O(n)
time(g n) is O(n^2)
time(foo n) <= c n^3
time(f n; g n; foo n) is O(n + n^2 + c n^3) <= c (n+1)^3
but this also doesn't work because that last step isn't valid. The correct way to prove this is:
time(f n) is O(n)
time(g n) is O(n^2)
time(f n) + time(g n) is O(n^2)
Suppose time(f n) + time(g n) <= d n^2
Pick c := d + 1 (or something)
Prove by induction: time(foo n) <= c n^3
time(f n; g n; foo n) =
time(f n) + time(g n) + time(foo n) <=
d n^2 + (d+1) n^3 <= (d+1) (n+1)^3
The big-O is really not helping here; if we just had explicit constants for everything, for example if we knew `time(f n) <= 4 n` and `time(g n) <= 37 n^2`, then this proof would be a lot simpler.
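For what it's worth, here is roughly what the explicit-constant version of this toy example could look like in Lean. The cost functions, the constant 41, and the use of `nlinarith` are all illustrative choices for the sketch, not an existing library design:

```lean
import Mathlib.Tactic

-- Toy stand-ins for `time(f n)` and `time(g n)`.
def timeF (n : Nat) : Nat := 4 * n
def timeG (n : Nat) : Nat := 37 * n ^ 2

-- `time(foo n)`: the cost of the body plus the recursive call.
def timeFoo : Nat → Nat
  | 0 => 0
  | n + 1 => timeF n + timeG n + timeFoo n

-- With the constant (here 41) fixed up front, the inductive step is plain arithmetic.
theorem timeFoo_le (n : Nat) : timeFoo n ≤ 41 * n ^ 3 := by
  induction n with
  | zero => simp [timeFoo]
  | succ n ih =>
    simp only [timeFoo, timeF, timeG, Nat.succ_eq_add_one]
    nlinarith [ih, Nat.zero_le n, Nat.zero_le (n ^ 2)]
```

With a big-O statement instead of the explicit 4, 37 and 41, the same induction needs the existential constants eliminated first, exactly as described above.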
Mario Carneiro (Dec 27 2021 at 12:05):
Plus, you never know when a more precise bound might come in handy
Dima Pasechnik (Dec 27 2021 at 12:07):
The whole business about L seems to be geared towards various questions related to complexity of various λ-calculi. For the classical complexity theory this seems to be more of a sidetrack (which however might have technical benefits).
Mario Carneiro (Dec 27 2021 at 12:08):
I really don't want to prove anything about Church–Rosser here, that seems like entirely a sidetrack if the goal is P and NP
Mario Carneiro (Dec 27 2021 at 12:09):
(To be fair, you don't need to take such a sidetrack if formalizing L; but it does seem likely to come up if you start thinking about compositionality in the lambda calculus)
Gabriel Ebner (Dec 27 2021 at 12:12):
if we just had explicit constants for everything
Luckily, we can eliminate the big-O again if we need to fall back to explicit constants. (Just like we can fall back to ε-δ-proofs if more general methods fail.)
Dima Pasechnik (Dec 27 2021 at 12:12):
Isn't the problem we see with big O here due to the true nature of big O being asymptotic? One needs to argue that O(n^k)<O(n^{k+1}) as n->oo.
(OK in sense of proving something is in P).
Dima Pasechnik (Dec 27 2021 at 12:13):
One sweeps these explicit constants into exponents; that's OK if it is only done a fixed number of times.
Gabriel Ebner (Dec 27 2021 at 12:16):
Isn't the problem we see with big O here due to the true nature of big O being asymptotic?
This doesn't make any real difference. If `f n = O(g n)` as `n → ∞`, then there exists a `c` such that `f n ≤ c * (g n + 1)` and vice versa (assuming `f, g ≥ 0`).
Mario Carneiro (Dec 27 2021 at 12:17):
I think it would be okay to have theorems of the form `\exists N, \all n >= N, time(f n) <= 37 n`, where we fix the `c` but not `N`
Mario Carneiro (Dec 27 2021 at 12:18):
since I don't think that `N` interferes with recursive proofs as much as `c` does
Gabriel Ebner (Dec 27 2021 at 12:18):
That's logically equivalent to a big-O so I'm not opposed. But it's a bit unfortunate if we can't reuse the `is_O` definition.
Mario Carneiro (Dec 27 2021 at 12:19):
is it? big-O requires a multiplicative constant
Mario Carneiro (Dec 27 2021 at 12:19):
or do you mean something like time(f n) - 37 n in O(1)
Mario Carneiro (Dec 27 2021 at 12:21):
We can still have `is_O`-stated theorems for "presentation" purposes (and who knows, maybe they will be usable in some proofs) but for the really hard time complexity theorems I will expect us to skip those and go straight for the ones with algebraically more complicated but logically simpler bounds
Gabriel Ebner (Dec 27 2021 at 12:21):
The whole business about L seems to be geared towards various questions related to complexity of various λ-calculi.
No, that's certainly not the intention. The idea behind L is that it is much closer to a functional language (like Lean) in syntax and operational semantics, so you don't have to do as much encoding as with Turing machines. Like Mario is saying, a lot of basic operations (like applying a function or reading the first element of a list, etc.) take exactly one step in L, but can take several or even many steps in a Turing machine because of copying, skipping, etc.
Gabriel Ebner (Dec 27 2021 at 12:24):
Mario Carneiro said:
We can still have `is_O`-stated theorems for "presentation" purposes (and who knows, maybe they will be usable in some proofs) but for the really hard time complexity theorems I will expect us to skip those and go straight for the ones with algebraically more complicated but logically simpler bounds
I'm not sure I understand you. Are you saying that `37` is better than `c`, because presumably `norm_num` can simplify it? While I recognize the pain, I feel like the solution here should be better automation, and not theorems with hard-coded bounds.
Mario Carneiro (Dec 27 2021 at 12:25):
Not (just) because norm_num can work with it, but also because we don't need to deal with existential elimination and the resulting scoping. It's a lot easier to automate such a proof if everything is in the empty context
Dima Pasechnik (Dec 27 2021 at 12:26):
Mario Carneiro said:
is it? big-O requires a multiplicative constant
Yes, but for the purposes of proving something in P, one can replace a constant by a sufficiently slowly growing function, e.g. `n`, or `log n`.
Gabriel Ebner (Dec 27 2021 at 12:27):
So `Σ' c, time (f n) ≤ c * n^2` would be ok then? (No elimination, no scope.) If you allow classical logic, you can also avoid the existential elimination with an epsilon.
Mario Carneiro (Dec 27 2021 at 12:30):
I think that still suffers from the scoping over parameters problem. The normal way to write that would imply that `c` can depend on anything in scope, whereas if it is an explicit `37` then you know it doesn't. I think Lean might still be able to work with it if it can unfold the value to `37`, but it can still possibly trigger scoping issues if it is not reduced
Mario Carneiro (Dec 27 2021 at 12:31):
Another way to hide the argument is to have `time_f_constant := 37` and then `time (f n) ≤ time_f_constant * n^2`, where you can control the parameters needed in `time_f_constant`
Gabriel Ebner (Dec 27 2021 at 12:34):
Maybe I'm naive, but is this so much easier than `rcases time_f_spec with ⟨time_f_constant, N, time_f_le⟩` at the beginning of the proof?
Dima Pasechnik (Dec 27 2021 at 12:35):
I guess I'm talking about completely eliminating big-O, in the sense that O(n)<n^2 as n->oo, etc. This works for showing something in P.
Mario Carneiro (Dec 27 2021 at 12:36):
For showing things are in P, I think you can avoid constants altogether and instead use closure properties for most things
Mario Carneiro (Dec 27 2021 at 12:37):
The big-O stuff only comes up for tighter bounds like sorting in O(n log n)
Mario Carneiro (Dec 27 2021 at 12:38):
@Gabriel Ebner Maybe. I haven't worked it all out, so consider me moderately dubious but willing to try things and see what sticks
Arthur Paulino (Dec 27 2021 at 13:45):
This is a subject I'm very interested in and I will try to follow the discussions as closely as I can.
Can we have a separate stream `complexity theory` for it? It's getting more and more laborious to backtrack every discussion made about this topic. Different topics are spread over different streams and mixed/smashed into the same threads.
Sorry for my ignorance, but what's the core idea for computing time and space complexity formally? The proofs that I've done/seen all my life about these were extremely informal, like counting loops inside loops and making some logical connection with the input size.
Yaël Dillies (Dec 27 2021 at 13:48):
Streams are usually made to divert attention away from a subject which isn't interesting to most people. Simply cut that thread into smaller ones if you feel like you can't follow the conversation.
Martin Dvořák (Dec 27 2021 at 17:58):
Dima Pasechnik said:
Perhaps Fagin's equivalence of NP and sentences in existential 2nd order logic is something that should be more prominent. (In this vein, polynomial-time solvability becomes definability in the 1st order logic).
As a definition? Or an additional theorem about our classes?
Mario Carneiro (Dec 28 2021 at 02:54):
Arthur Paulino said:
Sorry for my ignorance, but what's the core idea for computing time and space complexity formally? The proofs that I've done/seen all my entire life about these were extremely informal, like counting loops inside loops and making some logical connection with the input size.
The basic idea is pretty simple: count all the basic steps evaluated by a program. You can very often get exact formulas for this, like `2 n^2 + 3 x - 7`; formally this is often an easier task than doing big-O analysis, although it can start to help once the formulas get too complicated. Once the tools to state such theorems are in place, the proofs are pretty trivial, it usually amounts to proofs like `\sum i < n, 1 + 2 + (3 i - 1) = O(n^2)`.
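As a concrete instance of the kind of closed-form computation meant here (just expanding the example sum, read over the integers so the $-1$ is harmless):

$$\sum_{i<n} \bigl(1 + 2 + (3i - 1)\bigr) = \sum_{i<n} (3i + 2) = \frac{3n(n-1)}{2} + 2n = \frac{3n^2 + n}{2} = O(n^2).$$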
Mario Carneiro (Dec 28 2021 at 02:57):
For space complexity, I'm not sure what the best setting is. Probably we want a computational model with a finite memory like an FSM (but with parametric memory size `k`, possibly depending on the input `n`, instead of `O(1)`), and then a space complexity bound talks about whether such a limited model is capable of computing the desired function.
Dima Pasechnik (Dec 29 2021 at 11:54):
Martin Dvořák said:
Dima Pasechnik said:
Perhaps Fagin's equivalence of NP and sentences in existential 2nd order logic is something that should be more prominent. (In this vein, polynomial-time solvability becomes definability in the 1st order logic).
As a definition? Or an additional theorem about our classes?
either way would be useful - IMHO it's a very convenient way to think about the class NP, as it does not invoke weird stuff such as nondeterministic Turing machines.
Martin Dvořák (Dec 29 2021 at 12:18):
What is the "natural" NP-complete problem in this formalism?
Dima Pasechnik (Dec 29 2021 at 12:36):
Martin Dvořák said:
What is the "natural" NP-complete problem in this formalism?
the usual, e.g. SAT. For instance, see https://hal.archives-ouvertes.fr/hal-00017602/document
Dima Pasechnik (Dec 29 2021 at 12:39):
this brings up the question of a "natural reduction" - in the classical theory, Karp vs Turing reductions. Here we have "1st order reductions", as well.
Martin Dvořák (Dec 30 2021 at 09:48):
I will be free to start working on that after my Qualifying Exam on 2022-01-25. However, I'd need a lot of guidance in order to succeed in this task.
Arthur Paulino (Dec 31 2021 at 14:35):
I am super curious to see how it would be built up in Lean :eyes:
Fabian Kunze (Jan 04 2022 at 11:02):
Martin Dvořák said:
(1) Define the notion of a decision problem in the weak call-by-value λ-calculus L.
(2) Define the class P in L.
(3) Define the class NP in L.
(4) Define polytime reductions in L.
(5) Prove that, if A polytime reduces to B and B is in P, then A is in P.
(6) Define the classes NP-hard and NP-complete in L.
(7) Define a natural NP-complete problem in L.
(8) Prove that, if NP-hard A polytime reduces to B, then B is NP-hard.
(9) Define abstract heap machines and Turing machines — just so that we can state the auxiliary decision problems about them; we will not program in them; the reductions will always be programmed in L.
(10) State and prove the polytime reductions (very challenging).
(11) State your favourite version of the tiling problem.
(12) Prove the NP-hardness of the tiling problem using the tools above (the vanilla TM will have to be reduced to it).
(13) Reduce your tiling problem to some form of SAT.
(14) We can continue building the complexity theory from here on without getting our hands dirty with the intricacies of various computational models.
Hi, I'm one of the authors of the Cook-Levin mechanisation in Coq.
I think you are missing (or at least not explicitly acknowledging) the part where a 'time-invariant' simulation of L in terms of Turing machines is needed (if L is to be the chosen model of computation): as NP-completeness talks more or less about simulating all poly-time verifiers with a single problem, one needs to transform the L-algorithms into TMs, i.e. implement the abstract heap machine as a TM. (Or is that part (10)?)
This part is quite hard: we developed a framework to verify Turing machines (see our publication with Wuttke).
Fabian Kunze (Jan 04 2022 at 11:02):
On a more general note:
Our experience there is also why we think that explicitly constructing TMs (or other low-level models of computation, like mu-recursive functions) just does not scale: there is so much overhead, like "how do I encode different types of data (lists, terms, programs, ...)?", or "how do I verify simple composition of functions and other straightforward, non-recursive (or "non-loop-containing") programs without too much effort?", and of course "how can I verify loops/recursion?"
Maybe, with a lot of engineering work, one can come up with a scaling verification framework, but our conclusion is that functional programming is such a sweet spot in ITPs that one would waste time not using a functional language whenever possible.
It might be very much possible to support sublinear space in a lambda calculus, for example by introducing oracles for the input. No matter the model of computation, it seems that sublinear space will make it a lot more tedious to have composability of programs, as one cannot use the same space measure for a program "at the top level" and "a program called as a subroutine of another program".
Fabian Kunze (Jan 04 2022 at 11:16):
My judgement is that 1,2,3,4,6,7,9 are trivial.
5 and 8 are very easy once one has a basic understanding of how to compose programs.
Proving that (7) is indeed NP-complete of course needs a universal L-program that runs in poly-time, for instance implementing the heap machine in L. With a good verification framework for the model of computation (in this case, L), this is not very hard, but without a good framework, this is very, very tedious, and maybe even infeasible.
(I commented on 10 above, if that is the L-to-TM reduction, i.e. the implementation of an L-interpreter as a TM. It took a very talented student with an excellent bachelor's thesis on this topic to come up with a verification framework for TMs, and after his thesis, it still took over 100 hours of work to implement and verify the TMs needed for this step.)
11 and 12 were also tackled in a bachelor's thesis with another extraordinary student, and needed a lot of proof engineering to solve the many, many cases that can occur when simulating a computation with a tiling problem.
13 again is trivial, assuming one can write and verify basic programs in the model of computation (of course, as always, assuming one can also verify running times).
Mario Carneiro (Jan 04 2022 at 11:17):
@Fabian Kunze I'm curious whether by "TM" you mean TMs in the strict sense defined by Turing, or a more type-level-programming-friendly version. I think that while TMs construed literally are not good for verification, it is possible to get a more abstract model with user-chosen state spaces to do TM proofs without as much pain. I think that it is possible to get these looking something like WHILE programs or finite state machines with O(1) programs in between, which seems like a good balance between the native logic part used to write the programs, and the explicit step representation you need to do time and space bounds (within a constant factor).
Fabian Kunze (Jan 04 2022 at 11:20):
Yes, we abstract away the TM behind while-program-like "combinators" that construct TMs in the background:
"The Turing machine verification framework allows giving algorithms in the style of a
register-based while-language, and a corresponding machine is automatically constructed
behind the scenes. Separate correctness and verification proofs are then inclusion proofs
between the automatically derived and the user-given relations for the constructed machine"
But there still is the problem that the "registers" contain a single tape/string, and that one might want to change the alphabet when composing machines.
Mario Carneiro (Jan 04 2022 at 11:21):
As an example of what can be done with such a framework, see the program `tm_to_partrec` and surrounding section. That one is implemented in a multi-stack machine, but if you have ideas for a more abstract model that still supports some nice operations I'm all ears
Mario Carneiro (Jan 04 2022 at 11:21):
You can change the alphabet when composing machines
Mario Carneiro (Jan 04 2022 at 11:23):
Ah. I think, based on your description, that one difference between our formalisms is that the mathlib one actually changes the model of computation rather than using combinators over the original model (and there are general theorems about reduction from one model to another)
Fabian Kunze (Jan 04 2022 at 11:23):
Yes, we change the alphabet, it is all possible, and we are quite happy with our final, Hoare-like verification framework that hides most or even all of those details.
Fabian Kunze (Jan 04 2022 at 11:25):
Yes, we only have a shallow representation of the while-language, not a deep one. One aspect was that this allows adding new data types (lists of other things, for example) on the fly, as long as one person writes the "basic" combinators to interact with that new data.
Fabian Kunze (Jan 04 2022 at 11:26):
Partrec seems to be very close to L in my eyes by the way, and I have no doubt that this approach can work.
Mario Carneiro (Jan 04 2022 at 11:27):
That's my impression as well. I think that the `tm_to_partrec` automaton is a rough equivalent of your L-to-TM reduction
Fabian Kunze (Jan 04 2022 at 11:27):
Is the theoretical problem of coming up with a time measure for partrec functions that is invariant w.r.t. Turing machines solved?
Mario Carneiro (Jan 04 2022 at 11:28):
what do you mean by invariant?
Fabian Kunze (Jan 04 2022 at 11:28):
that the TM and the partrec function simulate each other with just polynomial overhead
Fabian Kunze (Jan 04 2022 at 11:28):
probably, count all mu-rec-loop-executions, is it?
Mario Carneiro (Jan 04 2022 at 11:28):
As long as you can accept an O(n) slowdown (e.g. anything P or larger), you can do the simulation
Fabian Kunze (Jan 04 2022 at 11:29):
the slowdown of implementing mu-rec programs in TMs is only linear?
Fabian Kunze (Jan 04 2022 at 11:29):
sorry, other way around
Mario Carneiro (Jan 04 2022 at 11:30):
We don't have the actual time bounds proved yet, but the current setup should have O(1) from TM0 to TM1, O(n) slowdown for TM2 from TM1, and O(log n) slowdown for partrec from TM2
Mario Carneiro (Jan 04 2022 at 11:30):
TM2 is multi-stack machines
Fabian Kunze (Jan 04 2022 at 11:30):
Ah, nice.
Mario Carneiro (Jan 04 2022 at 11:33):
Well, to be precise the last step is not partrec, it is this model which is proven equivalent to the partrec model but has a more explicit cost model
Mario Carneiro (Jan 04 2022 at 11:34):
partrec itself is mostly only focused on computability in the abstract, some of those algorithms are really bad (like double exponential) if run literally
Mario Carneiro (Jan 04 2022 at 11:35):
but that's why partrec is defined as a Prop with no computational content, you shouldn't be extracting an actual algorithm from it
Fabian Kunze (Jan 04 2022 at 11:35):
The TM2 model looks interesting. Is there documentation/intuition on what kind of types are allowed as elements on the stacks, and what kind of "elementary" operations are permitted?
Mario Carneiro (Jan 04 2022 at 11:35):
The stack element can be any finite type
Mario Carneiro (Jan 04 2022 at 11:36):
The elementary operations are basic operations on stacks here
Fabian Kunze (Jan 04 2022 at 11:37):
and sigma is a finite type, I assume?
Mario Carneiro (Jan 04 2022 at 11:37):
yes, that's the "local state"
Mario Carneiro (Jan 04 2022 at 11:38):
in the simulation using basic TMs the basic states are pairs of a local state value and a label
Fabian Kunze (Jan 04 2022 at 11:39):
Very nice abstractions. In hindsight, more transformations when going from single-tape to multi-tape TMs would have made our life simpler.
Mario Carneiro (Jan 04 2022 at 11:47):
One of the things I like a lot about the TM models is that the type `Λ` of "labels" (aka TM states) is not required to be finite, but rather finitely supported (only a finite number of labels are reachable from any given label). That means that label types can be simple infinite inductive types like this one, and we prove separately that for any starting point of interest there is a finite set of reachable states from it. In that example, we have a single label space covering all partrec programs, and finiteness is guaranteed by the fact that you only visit programs corresponding to subterms of the original program
Fabian Kunze (Jan 06 2022 at 09:57):
We are quite happy with the fact that in our Hoare-style framework, we only need one lemma for each total TM, following the execution once, to get all properties of the TM we need (and the lemmas for the Hoare rules handle the well-formedness conditions in the background).
Mario Carneiro (Jan 06 2022 at 10:21):
what properties are those? Functional correctness and a time bound? Do you have an example?
Fabian Kunze (Jan 11 2022 at 16:07):
Yes, the specification of functional correctness and the running time. Well-formedness is already "baked into" the type of Turing machines, for instance that the alphabet is finite etc.
Martin Dvořák (Jan 16 2022 at 10:17):
Is anybody going to work on the implementation in the foreseeable future?
Bolton Bailey (Jan 16 2022 at 20:13):
I am going to continue working on it in #11046. Notwithstanding all the discussion here, I'm just going to start by trying to finish a function `time` with the same type as docs#nat.partrec.code.eval. I'm not sure how long this will take me, but it seems like the best combination of achievability and forward progress to me.
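To give a feel for what such a `time` function looks like, here is a toy total mini-language with an `eval` and a `time` of the same shape; this is a hypothetical illustration, not mathlib's `nat.partrec.code` and not the contents of the PR:

```lean
-- Toy language: a `time` with the same recursion structure as `eval`, where the cost
-- of a composition is the cost of `g`, plus the cost of `f` on `g`'s output, plus one.
inductive Code
  | zero
  | succ
  | comp (f g : Code)

namespace Code

def eval : Code → Nat → Nat
  | .zero, _ => 0
  | .succ, n => n + 1
  | .comp f g, n => eval f (eval g n)

def time : Code → Nat → Nat
  | .zero, _ => 1
  | .succ, _ => 1
  | .comp f g, n => time g n + time f (eval g n) + 1

end Code

#eval Code.eval (.comp .succ .succ) 3   -- 5
#eval Code.time (.comp .succ .succ) 3   -- 3
```

The real definition is harder mainly because `nat.partrec.code.eval` is partial, so a `time` with the same type has to be a partial function as well.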
Arthur Paulino (Jan 18 2022 at 03:11):
Mario Carneiro said:
The basic idea is pretty simple: count all the basic steps evaluated by a program. You can very often get exact formulas for this, like `2 n^2 + 3 x - 7`; formally this is often an easier task than doing big-O analysis, although it can start to help once the formulas get too complicated. Once the tools to state such theorems are in place, the proofs are pretty trivial, it usually amounts to proofs like `\sum i < n, 1 + 2 + (3 i - 1) = O(n^2)`.
What if you have, say, `f(a, b) = a + b`, would you say that this function is `O(a)` (or `O(b)` depending on how addition is defined) or `O(1)`?
Mario Carneiro (Jan 18 2022 at 03:38):
It depends on whether that is a primitive operation or not. If it is primitive, then it is O(1). For a lot of purposes it might be reasonable to take it as such; the Word RAM model of computation allows addition to be O(1), although the addends must be w bits where w, the "word size", is Ω(log n)
Mario Carneiro (Jan 18 2022 at 03:39):
If addition is not primitive, then I would expect O(log a + log b)
Martin Dvořák (Jan 20 2022 at 19:56):
Mario Carneiro said:
It depends on whether that is a primitive operation or not. If it is primitive, then it is O(1). For a lot of purposes it might be reasonable to take it as such; the Word RAM model of computation allows addition to be O(1), although the addends must be w bits where w, the "word size", is Ω(log n)
Why Omega?
Martin Dvořák (Jan 20 2022 at 20:27):
Could this Isabelle/HOL project be used as inspiration for us?
https://github.com/wimmers/poly-reductions
I don't know whether I understand what they did. They have the polytime reductions. Unfortunately, they don't have any problem proved to be NP-complete yet. Nevertheless, it is a part of their plan and their framework should be capable of that. Am I right?
Mario Carneiro (Jan 20 2022 at 21:00):
I think it's usually stated as an inequality w ≥ log n; the asymptotics of it don't matter so much. It's a hypothesis, but a fairly reasonable one for word RAM: it's saying that the entire problem fits in the address space of the machine
Mario Carneiro (Jan 20 2022 at 21:01):
For a regular computer, this is saying that 64-bit machines can work on problems of size at most 2^64
Mario Carneiro (Jan 20 2022 at 21:02):
For asymptotic complexity results, this is convenient because it means that as n grows so does w, so you can do larger operations in O(1)
Martin Dvořák (Jan 20 2022 at 21:02):
But can we choose word size to be much higher than Omega(log n) to reduce the time complexity in some cases?
Mario Carneiro (Jan 20 2022 at 21:03):
Yes, but the complexity bounds are stated as a function of both n and w in this case
Martin Dvořák (Jan 20 2022 at 21:03):
Oh!
Mario Carneiro (Jan 20 2022 at 21:04):
For a really intense application of word RAM capabilities check out fusion trees
Bolton Bailey (Jan 20 2022 at 21:12):
After thinking about the partial recursive implementation more, I'm not sure this approach actually works. If we are working over the naturals and the only primitive operation for mutating a natural is the successor function, it seems like there's no way for addition to be O(log n), since to construct `a + b` from `a` and `b`, you have to apply the successor at least `(b - a)` times. It also seems like you can't represent a natural in binary, since if you fix the number of registers you need for a natural at k, you can represent at most `t ^ k` numbers in `O(t)` time. I may need to rethink the approach.
Martin Dvořák (Jan 20 2022 at 21:17):
Basic question, possibly very stupid:
What is the difference between partial recursive functions and the call-by-value lambda calculus, please?
Bolton Bailey (Jan 20 2022 at 21:23):
That's a good question. I was a bit confused when @Fabian Kunze said that mu-recursive functions wouldn't scale, after saying that the call-by-value lambda calculus was what he and his co-authors used, since I would have said that those models of computation are similar, or at least that it wouldn't be too hard to prove slowdown lemmas about one with respect to the other. Perhaps he or someone else can elaborate on the differences, or if this is indeed difficult?
Mario Carneiro (Jan 20 2022 at 21:30):
@Bolton Bailey This is why you probably shouldn't define `time` based on the partrec evaluation rules. It's only intended for defining computability, not complexity, and there are several reductions involved that are exponential or double exponential
Mario Carneiro (Jan 20 2022 at 21:34):
Any of the TM-based computation models should all be polynomially equivalent: `TM0`, `TM1`, `TM2`. You can also probably code binary in the docs#turing.to_partrec.code model, but the built in nats in that model are unary
Bolton Bailey (Jan 20 2022 at 21:34):
It occurs to me that if only the naturals were represented in binary and I could do an O(1) doubling or halving operation on them, I would have stacks and one could use multiple stacks to implement a Turing machine, so I imagine time in this model of computation would be within a polynomial factor of the usual definition.
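A minimal sketch of that "binary natural as a stack of bits" idea (hypothetical helper names, not mathlib API): pushing and popping a bit are each a single doubling/halving-style arithmetic step, which is exactly what the O(1) doubling and halving would buy.

```lean
-- Treat a `Nat` as a stack of bits: the low bit is the top of the stack.
def pushBit (b : Bool) (n : Nat) : Nat :=
  2 * n + (if b then 1 else 0)

def popBit (n : Nat) : Bool × Nat :=
  (n % 2 == 1, n / 2)

#eval popBit (pushBit true 5)   -- (true, 5)
```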
Mario Carneiro (Jan 20 2022 at 21:36):
Take a look at https://leanprover-community.github.io/mathlib_docs/computability/tm_to_partrec.html#simulating-sequentialized-partial-recursive-functions-in-tm2, that's exactly what it does
Mario Carneiro (Jan 20 2022 at 21:38):
There are computational models similar to primitive recursive functions which compute functions in P, you have to switch out the `prec` constructor with "recursion on notation" aka docs#nat.binary_rec
Fabian Kunze (Jan 21 2022 at 07:38):
Regarding the observation about "not enough space to add numbers":
That is exactly the complication: function composition does not behave like before, where one could just concatenate the resource consumption functions.
For TMs, one approach is to have input (and output) tapes, on which one can only read (or write and move one step to the right). I have not looked into it that deeply, but as far as I understood, you can arrange things such that one can compose "online" algorithms; for instance, one can think of addition as a function that traverses two binary representations and, on the fly, returns the stream of bits of the sum. If the function called on that result directly "consumes" the output (or, more explicitly, pauses and resumes the computation of the bits as needed), one can add numbers inside a log(n) algorithm.
Another approach (which I think is more common) is to represent numbers not as numbers, but as algorithms that, given an index i, return the i-th bit of the number. This allows representing numbers of size up to n in log(n) space (but only a constant amount of them, because the code has to exist in the program itself). As long as one is only interested in space, this very inefficient representation does work.
But this inability to write down intermediate results explicitly makes formalizations of space really complicated.
Fabian Kunze (Jan 21 2022 at 07:45):
Bolton Bailey said:
That's a good question. I was a bit confused when Fabian Kunze said that mu-recursive functions wouldn't scale, after saying that the call-by-value lambda calculus was what he and his co-authors used,
I'm sorry, I was inconsistent: when I said "[I have no doubts that] partrec [] can work", this was meant as a retraction of my previous statement that I was sceptical about using murec for time complexity. I used those two names, partrec and murec, interchangeably, as they mean the same thing.
The one complication I see is that without an abstraction layer for encodings (Gödel numbers?), the statements and programs get a bit messy when programming a universal machine. With the lambda calculus, we more or less get any algebraic data type "for free" using Scott encodings (which are similar to Church encodings, but for cbv languages).
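As a reminder of what Scott encodings are (standard textbook material, nothing specific to the Coq library): the Scott numerals are

$$\overline{0} = \lambda z\,s.\;z, \qquad \overline{n+1} = \lambda z\,s.\;s\,\overline{n},$$

so case analysis is just application — $\overline{n}\;t_{\mathrm{zero}}\;(\lambda m.\;t_{\mathrm{succ}})$ reduces in a constant number of steps to $t_{\mathrm{zero}}$ when $n = 0$, and to $t_{\mathrm{succ}}$ with $m$ bound to $\overline{n-1}$ otherwise. Church numerals, by contrast, fold over the whole number, which is why Scott encodings are the convenient choice when the cost of pattern matching should stay constant.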
Yannick Forster (Jan 21 2022 at 15:56):
Mario Carneiro said:
The idea behind the haskell-ish intermediate language
Do you have any more concrete thoughts on how a haskell-ish intermediate language might look, Mario?
Yannick Forster (Jan 21 2022 at 15:57):
Mario Carneiro said:
We don't have the actual time bounds proved yet, but the current setup should have O(1) from TM0 to TM1, O(n) slowdown for TM2 from TM1, and O(log n) slowdown for partrec from TM2
Do you see a chance to find a good notion of space complexity for `partrec`?
Mario Carneiro (Jan 21 2022 at 20:26):
I think the most natural thing would be to use a computational model which is similar in structure to `partrec` but uses binary strings as the basic data type instead of `nat`
Bolton Bailey (Jan 21 2022 at 21:33):
Ok, thanks Fabian for responding, I guess I'll start thinking about a binary-string `partrec`.
Bolton Bailey (Jan 21 2022 at 21:36):
Yannick Forster said:
Do you see a chance to find a good notion of space complexity for `partrec`?
I think that space-complexity for `partrec` can look very similar to time-complexity for `partrec` in structure. Instead of `time (comp f g) = time f + time g` we have something like `space (comp f g) = max (space f) (space g)` (with appropriate inputs to `f` and `g`).
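A toy rendering of that suggestion, in the same spirit as the earlier mini-language sketch (again hypothetical, not mathlib API): the space of a composition is a `max` rather than a sum, on the idea that the work space used for `g` can be reused while running `f`.

```lean
inductive Prog
  | zero
  | succ
  | comp (f g : Prog)

namespace Prog

def eval : Prog → Nat → Nat
  | .zero, _ => 0
  | .succ, n => n + 1
  | .comp f g, n => eval f (eval g n)

-- Space: composition takes a max, unlike time, which would take a sum.
def space : Prog → Nat → Nat
  | .zero, _ => 1
  | .succ, n => n + 2
  | .comp f g, n => Nat.max (space g n) (space f (eval g n))

end Prog
```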
Martin Dvořák (Jan 21 2022 at 22:20):
Fabian Kunze said:
For TMs, one approach is to have input (and output) tapes, on which one can only read (or write and move one step to the right).
Would the TM always have a constant number of tapes?
Martin Dvořák (Jan 21 2022 at 22:29):
Would the number of tapes be baked into the TM type?
Yannick Forster (Jan 22 2022 at 14:24):
Mario Carneiro said:
I think the most natural thing would be to use a computational model which is similar in structure to `partrec` but uses binary strings as the basic data type instead of `nat`
What makes `partrec` tedious in my opinion is that you don't have a lambda, so all programs are combinatory and passing around input is not as easy as in functional programming languages. But, of course, with a lambda you buy into the harder simulation proof. And why restrict to binary strings? Intuitively I'd rather argue for binary trees (like in the computability book by Jones). On top of that I would imagine that one can build an even more expressive simply typed language with natural numbers, lists, options and trees (which is immediately reduced to the simpler case with just one base type)
Yannick Forster (Jan 22 2022 at 14:28):
Bolton Bailey said:
Yannick Forster said:
Do you see a chance to find a good notion of space complexity for `partrec`?
I think that space-complexity for `partrec` can look very similar to time-complexity for `partrec` in structure. Instead of `time (comp f g) = time f + time g` we have something like `space (comp f g) = max (space f) (space g)` (with appropriate inputs to `f` and `g`).
It's not obvious to me that this works. (But of course it would be great if it did!) To implement this on a Turing machine, you'd probably need some kind of environment implemented with pointers, and pointers can grow linear-logarithmically instead of just linearly. At least, that's what happens with lambda calculus and why, for e.g. the full lambda calculus, space complexity is still an open problem. For RAM machines, defining space is also way more subtle than one might think (that was the content of the first paper by Slot and van Emde Boas, where they came up with the conjecture that all reasonable models of computation can simulate each other with polynomial overhead in time and constant-factor (i.e. linear) overhead in space).
Bolton Bailey (Feb 02 2022 at 05:15):
Hey @Mario Carneiro, I'm noticing that the definition of docs#turing.to_partrec.code.eval for `code.fix` could be changed to
| (code.fix f) := pfun.fix $ λ v, pure (if v.head = 0 then sum.inl v.tail else sum.inr v.tail)
Which is shorter than what is there currently. I believe this essentially changes it from a "while" to a "do-while" semantics. I think this doesn't change the model too much and might be cleaner in the long run (if I can fix what breaks in this file). Is there a reason I'm missing for it to be defined the way we have it?
Mario Carneiro (Feb 02 2022 at 05:17):
you don't use `f` in that snippet?
Bolton Bailey (Feb 02 2022 at 05:17):
Uhh, oops, you're right, just a sec
Bolton Bailey (Feb 02 2022 at 05:22):
I think this is what I meant:
| (code.fix f) := pfun.fix $ λ v, if v.head = 0 then pure (sum.inl v.tail) else ((f.eval v.tail).map sum.inr)
Bolton Bailey (Feb 02 2022 at 05:25):
I guess it doesn't end up that much shorter
Martin Dvořák (Dec 27 2022 at 18:07):
I think we should establish some regular Zoom meetings on formalizing complexity theory in Lean.
Many of us are interested in having such a library, but it will require a lot of collaboration, which it seems we haven't started yet.
Martin Dvořák (Dec 28 2022 at 11:21):
I suggest we start in the 3rd week of 2023. Indicate your time preferences please:
https://doodle.com/meeting/participate/id/ep84nEQe
Yaël Dillies (Dec 28 2022 at 11:59):
Also note that graph algorithms are of interest to me, if you happen to need one at some point.
Henrik Böving (Dec 28 2022 at 12:56):
Martin Dvořák said:
I suggest we start in the 3rd week of 2023. Indicate your time preferences please:
https://doodle.com/meeting/participate/id/ep84nEQe
I signed up, but mostly because this has been a topic of interest for me for a while, not necessarily because I'm actually competent enough to contribute something from a purely mathematical POV, so if my vote ends up being a deal breaker just ignore me.
Shreyas Srinivas (Dec 28 2022 at 15:40):
Henrik Böving said:
Martin Dvořák said:
I suggest we start in the 3rd week of 2023. Indicate your time preferences please:
https://doodle.com/meeting/participate/id/ep84nEQe
I signed up, but mostly because this has been a topic of interest for me for a while, not necessarily because I'm actually competent enough to contribute something from a purely mathematical POV, so if my vote ends up being a deal breaker just ignore me.
Complexity theory is broad enough as a subject that it is unlikely that everyone knows all the relevant math beforehand, beyond the basic stuff.
Henrik Böving (Dec 28 2022 at 16:12):
My knowledge ends at "There is P and NP and there is a couple of fun reductions to SAT you can do to show things are in NP" :P
Hanting Zhang (Dec 29 2022 at 03:19):
:O I'm very interested as well! It seems I can't sign up for the meeting though. May I have an invite?
Martin Dvořák (Dec 29 2022 at 07:51):
Everyone is invited! Just vote in the poll, please.
Hanting Zhang (Dec 29 2022 at 08:21):
Oh apologies I was doing something wrong -- I've signed up now
Martin Dvořák (Dec 29 2022 at 08:23):
No need to apologize. Just clarify, please, did you just indicate that only one time works for you?
Martin Dvořák (Dec 29 2022 at 08:24):
Doodle asks every participant for a subset of time slots.
Patrick Johnson (Dec 29 2022 at 08:52):
Don't forget to record the meetings and post on YouTube.
Martin Dvořák (Dec 29 2022 at 08:53):
Is it necessary?
Martin Dvořák (Dec 29 2022 at 08:55):
Recording meetings makes many people afraid of embarrassing themselves.
Patrick Johnson (Dec 29 2022 at 11:26):
Participants may request to remove any problematic part of the video before posting on youtube, but it will rarely happen. Most Lean meetings are public. See for example mathlib4 porting meetings at leanprover community youtube channel. People who can't participate for whatever reason should have access to video materials.
Arthur Paulino (Dec 29 2022 at 13:16):
2c: I would suggest that the first meetings not be recorded, because it's important that participants feel as comfortable speaking as possible. And the group still has to find its dynamics. Recording will be more natural when there's a minimal shape to begin with. (I'm not participating, but I have some experience starting informal study groups.)
Arthur Paulino (Dec 29 2022 at 13:18):
Please don't let anything hinder the spontaneity of the movement. It's an important ingredient!
Praneeth Kolichala (Jan 01 2023 at 20:42):
I'd very much like to attend but this is the same week as the POPL conference (Principles of Programming Languages). I only marked times as available when they didn't conflict with this (i.e. before 9:00 AM).
On the other hand, it seems there are quite a few talks about Lean, which presumably means other Lean people will be there. So while I can't really do a Zoom call, if other people will be at POPL I am willing to do an in-person meeting!
Martin Dvořák (Jan 02 2023 at 08:12):
I am sorry I chose an inconvenient week for the first call. Given how many people have responded to the poll, I will not change it now. I will be happy if you join us for the next meetings!
Shreyas Srinivas (Jan 06 2023 at 16:44):
This discussion on elementary probability is very relevant to us. Most theoretical CS people may not be fluent in measure theory either.
Martin Dvořák (Jan 10 2023 at 09:13):
We shall meet on Monday 2023-01-16 at 15:00 GMT.
Martin Dvořák (Jan 10 2023 at 09:19):
Can someone with Zoom Pro generate a Zoom link for our meeting please? My links are limited to 40 minutes.
Alex J. Best (Jan 10 2023 at 12:54):
If you use the "add global time" button when composing a message, you can add a time which will be rendered in everyone's local time, e.g.:
Martin Dvořák (Jan 10 2023 at 12:58):
Yes, that is correct.
Martin Dvořák (Jan 10 2023 at 20:17):
Hot news! The Cook-Levin theorem has been proved in Isabelle!
https://www.isa-afp.org/entries/Cook_Levin.html
Hanting Zhang (Jan 10 2023 at 20:23):
Wow!! Only two days ago as well :O
Martin Dvořák (Jan 10 2023 at 20:54):
@Fabian Kunze Can you please tell us a brief comparison with your proof in Coq?
Shreyas Srinivas (Jan 10 2023 at 21:27):
The first 360-odd pages in the proof doc are just for establishing the basic machinery of TMs and other basic stuff. I wonder how much of this we can shave off with effective use of mathlib.
Shreyas Srinivas (Jan 10 2023 at 21:28):
and how much of this machinery related to TMs is reusable for the rest of the theorems in that book
Shreyas Srinivas (Jan 10 2023 at 21:29):
A significant chunk of the TM related proofs is about writing TMs for specific languages and proving their correctness and complexity.
Henrik Böving (Jan 10 2023 at 21:30):
In general I would usually expect isabelle results to be shorter than ours because they can cut proofs quite short thanks to superior automation.
Shreyas Srinivas (Jan 11 2023 at 09:13):
I don't mean shorter through automation, but shorter in an amortized sense, because we have a library which might come in handy for proving this theorem, and the theorems we develop for this might come in handy elsewhere.
Martin Dvořák (Jan 11 2023 at 20:58):
Can somebody please generate the Zoom link?
Shreyas Srinivas (Jan 11 2023 at 23:28):
Martin Dvořák said:
Can somebody please generate the Zoom link?
I need to check if our institutions still have licenses.
Shreyas Srinivas (Jan 11 2023 at 23:29):
I'll be able to check if I can create a room by Friday evening
Hanting Zhang (Jan 11 2023 at 23:31):
I think I can generate one under my institution
Hanting Zhang (Jan 11 2023 at 23:32):
https://usc.zoom.us/j/96278860441?pwd=YXFjdlJ2aFVlOUovMUVscitNWHZCdz09
Hanting Zhang (Jan 11 2023 at 23:32):
Does this work lol
Praneeth Kolichala (Jan 12 2023 at 05:06):
Over the past couple weeks, I've been working on my reworked definitions of polynomial time computation. I've uploaded the repository here with a reasonably detailed README. I feel like I have a reasonable path towards Cook-Levin, although the machinery for circuits is still quite basic. Essentially, we need to define uniform circuit families and show that composition (which is mostly straightforward with circuits) can actually be computed in polynomial time.
If anyone would like to help out or contribute, that would be very appreciated (especially now that winter break has ended, I will probably be busy again).
Martin Dvořák (Jan 12 2023 at 07:12):
Your README looks great! When I have more time, I will READYOU more carefully.
Martin Dvořák (Jan 12 2023 at 07:16):
Hanting Zhang said:
https://usc.zoom.us/j/96278860441?pwd=YXFjdlJ2aFVlOUovMUVscitNWHZCdz09
Is it all right that I can join the call already now?
Hanting Zhang (Jan 12 2023 at 07:22):
uhhhh yeah idk how to restrict it
Martin Dvořák (Jan 12 2023 at 07:23):
Do other people use the same link?
Martin Dvořák (Jan 12 2023 at 09:46):
Can I distribute the link without a risk that the recipients will land in unrelated meetings?
Martin Dvořák (Jan 12 2023 at 19:22):
Martin Dvořák said:
Fabian Kunze Can you please tell us a brief comparison with your proof in Coq?
The author of the new proof in Isabelle compares his proof to the proof in Coq as follows:
image.png
Fabian Kunze (Jan 12 2023 at 20:26):
Martin Dvořák said:
Fabian Kunze Can you please tell us a brief comparison with your proof in Coq?
We use a different intermediate problem to get from n-tape TMs to SAT: instead of oblivious 2-tape TMs, we use 1-tape TMs, and then look at the configuration history, which is a 2-dimensional table: line $i$ corresponds to the tape, head position and state after $i$ computation steps, and each cell in that line is more or less one cell of the tape. Then, we express the transition function of the Turing machine as a set of 2x3-cell "dominos", and each 2x3 block of cells in the configuration history must be one of the allowed dominos.
This means that validity of the computation is a very local property, and we can just AND that each position in the table is one of the different allowed dominos.
Both the intermediate problem we use and the one used by the authors of the Isabelle formalization are standard proof ideas sketched in textbooks on complexity theory.
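Schematically (a paraphrase of the construction described above, not the exact Coq statement), writing $T$ for the configuration-history table and $\mathcal{D}$ for the finite set of allowed dominos, the formula produced asserts $\mathrm{Valid}(T) \iff \bigwedge_{i,j} \bigvee_{d \in \mathcal{D}} \big( T[i..i{+}1,\; j..j{+}2] = d \big)$, i.e. one big AND of constant-size local window checks, which is what makes the encoding into SAT straightforward.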
Fabian Kunze (Jan 12 2023 at 20:29):
On a higher level, our approach of using the lambda calculus as a model of computation enables us to more or less generate the witnesses of computability (i.e., the programs computing reduction functions etc) directly from the (functional) specification of those functions in Coq. We still have to verify many Turing machines (a TM interpreting lambda-calculus, and a translation from multi-tape TMs to single-tape-TMs), but at least not all the reduction functions need to be modeled as TMs.
Fabian Kunze (Jan 12 2023 at 20:33):
Praneeth Kolichala said:
I feel like I have a reasonable path towards Cook-Levin, although the machinery for circuits is still quite basic. Essentially, we need to define uniform circuit families and show that composition (which is mostly straightforward with circuits) can actually be computed in polynomial time.
Interesting, I would not have thought that circuits are a natural path towards formalizing Cook-Levin; doesn't one still need another model of computation to define "uniformity" of circuit families?
Praneeth Kolichala (Jan 13 2023 at 00:51):
Yes, we still need a model of computation to define uniformity (and this is the main model of computation; it is based on an inductive definition akin to the current docs#nat.primrec). The main thing is that (hopefully) we avoid having to deal with TMs directly.
Shreyas Srinivas (Jan 13 2023 at 00:56):
Praneeth Kolichala said:
Yes, we still need a model of computation to define uniformity (and this is the main model of computation; it is based on an inductive definition akin to the current docs#nat.primrec). The main thing is that (hopefully) we avoid having to deal with TMs directly.
Depends on your goal. If it is to just prove variants of complexity results in your model, then this will certainly do. There will have to be suitable adaptations of statements. For instance, in the time/space hierarchy theorems. If however your goal is to build a larger library of results which the TCS community finds useful, and builds upon further, then TM and RAM are unavoidable. There are relatively few uniform translations of complexity results from one model to another, and usually one can do better for various classes of problems. Most people in this field care about them. There was a similar discussion in another thread.
Shreyas Srinivas (Jan 13 2023 at 01:03):
If you wish to try model agnostic approaches, then descriptive complexity might be a nice approach.
Andrew Carter (Jan 15 2023 at 00:34):
Shreyas Srinivas said:
If you wish to try model agnostic approaches, then descriptive complexity might be a nice approach.
I've actually been playing around with something in this vein (I use "vein" loosely here; mostly it's the idea of quantifying over models; there is another thread in Is there code for X? with some of my earlier thoughts leading up to here). Basically, define a model of computation and then we can write theorems about all models.
https://github.com/calcu16/lean_complexity/blob/67ebab10624a272cf7c6c2a5345d26b2096e1c55/src/complexity/basic.lean
For better or worse I've been trying to abuse instances so that in theory I could autogenerate a complexity bound from the lean definitions.
For example, I've defined nat.mul generically on any model to have complexity based on an equivalent formulation using nat.iterate:
instance mul_complexity [has_encoding m ℕ] [has_encoding m (ℕ×ℕ)]
[cf: has_complexity m
(curry (compose
prod.fst
((compose
(uncurry (nat.iterate (fork (uncurry nat.add) prod.snd)))
(((fork prod.snd (fork (@const ℕ (ℕ×ℕ) 0) prod.fst))))))))]:
has_complexity m nat.mul := ...
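A quick sanity check of that iterate reformulation (Lean 4 syntax; iter and mulByIterate are ad hoc names, written so that nothing beyond core Lean is needed):
def iter {α : Type} (f : α → α) : Nat → α → α
  | 0, a => a
  | n + 1, a => iter f n (f a)

-- x * y computed as an x-fold iteration of "add y" on an accumulator,
-- the same shape as the nat.iterate formulation used in the instance above
def mulByIterate (x y : Nat) : Nat :=
  (iter (fun p : Nat × Nat => (p.1 + p.2, p.2)) x (0, y)).1

#eval mulByIterate 6 7  -- 42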
Then, assuming you have a model with the complexities of the necessary primitives defined, the complexity will "pop" out.
def complexity {α β γ: Type} [has_equiv β] [preorder γ] [has_add γ]
(m: complexity.model α β γ) {δ: Type} [complexity.has_encodable_function m δ]
(f: δ) [c: complexity.has_complexity m f]:
complexity.cost_function' m δ := c.value.cost
For example in my case I'm using a model of counting beta reductions in untyped lambda calculus and I can get a complexity for it without having to write(or cost) nat.mul myself in lambda calculus.
example: (complexity distance_model nat.mul) ≤ (λ (x y: ℕ), ((30*y + 43):ℕ)) :=
begin
intros n m,
simp only [complexity, complexity.cost_function.less_than_or_equal, has_complexity.value, fork, const, uncurry],
ring_nf,
norm_num,
induction m,
{ simp only [iteration_complexity.cost_zero],
ring_nf },
simp only [iteration_complexity.cost_succ', function.iterate_succ', nat.mul_succ, add_mul, nat.add_assoc],
nlinarith,
end
One cute thing with this approach is that you don't even need a model: you could say something basically akin to "for all models with properties X, Y, Z, my function will have complexity better than C".
One thing that's not clear to me is how to interact with general recursion using the instance strategy (or even if the instance strategy is a good idea). Additionally functions need to be written in a point-free style, so pointers on how I could get generic pattern matching against lambdas would also be of use I think.
Martin Dvořák (Jan 16 2023 at 08:06):
[not relevant anymore]
Martin Dvořák (Jan 16 2023 at 17:46):
We have discussed that we would like to meet probably twice a month.
Let's vote for a regular meeting — it will happen every other week, starting with the week that we vote about!
https://doodle.com/meeting/participate/id/e791Lj8b
Martin Dvořák (Jan 23 2023 at 11:35):
Please vote. I would like to announce the time of our meetings tomorrow.
Martin Dvořák (Jan 26 2023 at 11:28):
Next meeting:
Andrew Carter (Jan 30 2023 at 23:47):
Probably can't make it tomorrow, but I was looking into @Shreyas Srinivas's suggestion of the matching algorithm, and one concern I have is that most of the math doesn't seem to be there yet (blossoms but no augmenting paths). Also, I have some concerns about how cardinality is handled: I've had to add noncomputable to indicate finiteness; hopefully if I have a fintype I can avoid it, but it does make me worried (I don't think Berge's lemma holds on infinite graphs, for example).
Anyway, I think ideally for the initial algorithms we explore, the math exists (i.e. provable correctness) but the complexity is the missing piece. Are there any other suggestions? (I had been working towards Karatsuba, but maybe that's not impressive enough?)
Martin Dvořák (Jan 31 2023 at 14:46):
Hanting Zhang said:
https://usc.zoom.us/j/96278860441?pwd=YXFjdlJ2aFVlOUovMUVscitNWHZCdz09
Can we use the same link today?
Martin Dvořák (Jan 31 2023 at 15:57):
I got there. You can join me.
Shreyas Srinivas (Jan 31 2023 at 17:10):
https://cs-uni-saarland-de.zoom.us/j/82352578108?pwd=REhLWHFQN2VCVUd4cW9CbG0wL0Z5UT09
Shreyas Srinivas (Jan 31 2023 at 22:42):
Do we meet same time two weeks from now?
Shreyas Srinivas (Feb 01 2023 at 10:24):
@Mario Carneiro I recall that you had a question about promise problems yesterday : here's a well-motivated presentation: https://www.wisdom.weizmann.ac.il/~oded/PSX/prpr-r.pdf
Martin Dvořák (Feb 07 2023 at 07:09):
Shreyas Srinivas said:
Do we meet same time two weeks from now?
Yes, we will meet then, not today.
Martin Dvořák (Feb 14 2023 at 12:51):
The old Zoom link does not work anymore. Can someone generate a new link, please?
Mario Carneiro (Feb 14 2023 at 13:38):
https://cmu.zoom.us/j/97134152768?pwd=K2NxRW9SMEg0dTdXQ21oUjNUd1dLQT09
Martin Dvořák (Feb 14 2023 at 17:42):
Is this definition of RAM in Coq useful for us?
https://link.springer.com/chapter/10.1007/978-3-540-45208-9_5
Martin Dvořák (Feb 14 2023 at 19:01):
FYI, we didn't schedule another meeting yet. When we have a concrete topic to discuss, we will call a meeting.
Andrew Carter (Feb 19 2023 at 04:29):
I've sketched out a complexity model based on @Shreyas Srinivas memory-bank suggestion here:
https://github.com/calcu16/lean_complexity/blob/96c3d3580dcdd6b457728fb2d9fdec0ebc9ede64/src/membank/basic.lean
I've golfed it a little bit, but the basic premise is that you start with a fixed amount of memory, and behind each memory cell there is even more memory. The golfing: rather than having each memory bank hold n data cells and n more memory banks, each memory bank only holds 1 data cell, but this just pushes things one level down, because the n child banks with 1 data cell each give you the n data cells.
All data starts zeroed out (and I use has_zero rather than inhabited for this reason), I allow arbitrary binary operations and random-access through any (constant) number of memory banks. I figure we can always put a filter over the instructions later if someone doesn't want that.
As @Shreyas Srinivas suggested, I've added a "call" instruction which takes one of these memory banks and limits the call's visibility inside of it. When you run out of instructions the function returns (and this works at the top level to create the final "result"). This makes proving composability easier, which I've proved here (https://github.com/calcu16/lean_complexity/blob/main/src/membank/encoding/basic.lean) - the proof is messy right now.
One tricky thing is allowing recursion: right now recurse essentially creates a do-while around the current function and calls continue. I'm not the happiest about this, and am open to suggestions. I've considered a giant map of function names (numeric or otherwise), but I feel like that would impede composability since you'd have to limit the scope, possibly deal with alpha-equivalence, etc.
I currently allow any type as the memory bank, but I think bool or finset 2 is all that is required. However, leaving it general does allow infinite data types such as natural numbers or even reals which could be fun to think about (especially by constraining the binary op).
Andrew Carter (Feb 19 2023 at 04:40):
Actually, recurse is more like the y-combinator than do-while since you can call it multiple times and get new stacks etc.
Andrew Carter (Feb 27 2023 at 17:58):
I've worked through the master theorem modulo asymptotic analysis (it's just a recursive function at the moment, although it could easily be converted into a sum, which I will probably do at some point).
https://github.com/calcu16/lean_complexity/blob/45dbf0afed8bea0cea6fed6b1e43b366a1ea979c/src/membank/master.lean
(I haven't cleaned up the proofs yet)
Basically culminating in
theorem divide_and_conquer_cost_sound {p: program α}
{fr: bank α → ℕ} (hfr: ∀ inp arg, recurse_arg p inp arg → fr arg < fr inp)
{fc: ℕ → ℕ} (hfc: ∀ inp, call_cost_le ⟨p, p, inp⟩ (fc (fr inp))):
∀ inp, p.cost_le inp (divide_and_conquer_cost p fc (fr inp))
Where divide_and_conquer_cost is defined as
def divide_and_conquer_cost (p: program α) (fc: ℕ → ℕ): ℕ → ℕ
| 0 := internal_cost_bound p + fc 0
| (n+1) := internal_cost_bound p + fc (n + 1) + divide_and_conquer_cost n * recurse_count p
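Unrolled, the recursion above is just a sum (a sketch, writing I for internal_cost_bound p, r for recurse_count p, and f for fc): divide_and_conquer_cost p fc n = $\sum_{i=0}^{n} r^{\,n-i}\,\bigl(I + f(i)\bigr)$, which is the "converted into a sum" form mentioned earlier.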
I changed some of the definitions to make things work, but they are basically the same as before with extra properties that are essentially transparent to the "programmer".
This should in theory make proving the complexity of basic recursive functions straightforward, and I think I've implemented merge_sort, @Martin Dvořák:
https://github.com/calcu16/lean_complexity/blob/main/src/membank/encoding/list.lean
Although I haven't proved it (but the bugs are presumably minor) and I haven't proved complexity yet either, but those are next on my docket (it should be less heavy lifting and more just building ergonomic machinery for stepping through programs).
Martin Dvořák (Feb 27 2023 at 18:01):
Can you please make a unit test for your mergesort?
Andrew Carter (Feb 27 2023 at 18:16):
I was just going to go straight for the proof, probably a similar amount of work
Martin Dvořák (Feb 27 2023 at 18:18):
Ah. I was eager to see something executable.
Andrew Carter (Feb 27 2023 at 18:22):
Well memory access is done via a function definition (bank.mem.memory has type α → bank) so I'm not sure #eval would work (although maybe it'd be fine, "step" claims to be computable) but even if it was I'd have to define an encoding for the elements of the list as well as a comparator before I could run anything.
Andrew Carter (Feb 27 2023 at 18:59):
On second thought if I define an inverse of encode (dare say decode) I think it'll work, how would you feel about a unit test of sorting a list of "fin 2"s?
Andrew Carter (Feb 28 2023 at 05:16):
Martin Dvořák said:
Can you please make a unit test for your mergesort?
https://github.com/calcu16/lean_complexity/blob/main/src/membank/encoding/example.lean
def test: list (frame ℕ) := ((stack.step^[1000]) ((merge_sort nat_cmp).apply (push_list [9, 3, 6, 5, 2] bank.null))).val
#eval (list.length test) -- 1
#eval list.length (frame.current (list.ilast test)) -- 0
#eval decode_list (frame.register (list.ilast test)) -- [2, 3, 5, 6, 9]
1000 is probably (hopefully?) overkill
Martin Dvořák (Feb 28 2023 at 15:33):
Thanks a lot !!!!!
Martin Dvořák (Feb 28 2023 at 15:35):
FYI, for sorting a list of twenty numbers, nearly 3500 steps are needed.
Martin Dvořák (Feb 28 2023 at 15:38):
BTW, I get an error when building src/local/maximal_matching.lean on line 330.
Does it work on your computer?
Andrew Carter (Feb 28 2023 at 16:17):
Oh yeah it does. I was trying to work on Shreyas's graph matching problem, but then switched over to exploring the new membank/RAM model. I'll comment out that theorem for now; the rest of the file looks fine. (I had just realized I needed alternating paths in addition to augmenting paths in order to prove Berge's lemma, so I was copying the proofs from augmenting and patching them.)
Split takes: 8n + 2 steps
Merge takes less than 26n + 5 steps
The merge sort internals take 18 steps, so for a list of length 2^m the recurrence relation should be
def cost (n: ℕ) (f: ℕ → ℕ): ℕ := (∑ i : fin n.succ, f (n - i) * 2 ^ (i:ℕ))
#eval cost 5 (λ n, 24*2^n + 35) -- 6813
Which for m=5 gives 6813, so you got off easy.
(I had an off-by-one error in the original recurrence definition.)
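For the record, that 6813 checks out by hand: $\sum_{i=0}^{5} \bigl(24\cdot 2^{\,5-i} + 35\bigr)\,2^{i} = 6\cdot 24\cdot 2^{5} + 35\,(2^{6}-1) = 4608 + 2205 = 6813$.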
Shreyas Srinivas (Mar 08 2023 at 14:51):
I had a chance to read part of the Isabelle Proof of the Cook Levin Theorem. A few observations:
- They spend a lot of time doing the work we have already done in mathlib e.g. Arithmetic with bit representations. Much of this is TM-independent
- They spend a lot of time formalising basic theorems about TMs. These are the kind of things any library on complexity should have.
- They spend a lot of time defining and formalising concepts about complexity classes and TMs for them.
- They follow the Barak-Arora book for definitions.
Conclusion:
- We should try to replicate this work in Lean
- We should make sure we are not doing redundant things. We can probably achieve more modularity and reduce work.
- Much of the things we formalise will be of use in other places. So we have to organise the definitions and proofs like designing a library for future use, not just proving this theorem.
Shreyas Srinivas (Mar 08 2023 at 14:52):
Finally : We can probably make savings in effort if we are willing to use loose inequalities in the number of steps that TMs take to compute their functions
Shreyas Srinivas (Mar 08 2023 at 14:54):
The SAT stuff should probably go in a separate folder. This is because it is likely to be of much wider interest.
Martin Dvořák (Mar 08 2023 at 15:20):
BTW has anybody explored the project by @Andrew Carter ? What do you think about the "infinitely hierarchical" memory model?
Shreyas Srinivas (Mar 08 2023 at 15:26):
I don't understand the hierarchy very well yet. But to be fair I should invest a little more time into it first.
Shreyas Srinivas (Mar 08 2023 at 15:35):
for instance, is it like a call stack, except each call stack has one cell?
Shreyas Srinivas (Mar 08 2023 at 15:36):
But I like the prospect that in principle the cell can be of any type. This allows us a lot of flexibility.
Martin Dvořák (Mar 08 2023 at 15:36):
I think the best description of the memory bank is "tree".
Martin Dvořák (Mar 08 2023 at 15:37):
The nice thing is that you can pass a subtree to a subprogram.
Martin Dvořák (Mar 08 2023 at 15:38):
The subprogram has no way of damaging the rest of the memory.
Martin Dvořák (Mar 08 2023 at 15:40):
Therefore, you can express a clear "contract" for each routine.
Shreyas Srinivas (Mar 08 2023 at 15:42):
Shreyas Srinivas said:
I had a chance to read part of the Isabelle Proof of the Cook Levin Theorem. A few observations:
- They spend a lot of time doing the work we have already done in mathlib e.g. Arithmetic with bit representations. Much of this is TM-independent
- They spend a lot of time formalising basic theorems about TMs. These are the kind of things any library on complexity should have.
- They spend a lot of time defining and formalising concepts about complexity classes and TMs for them.
- They follow the Barak-Arora book for definitions.
Conclusion:
- We should try to replicate this work in Lean
- We should make sure we are not doing redundant things. We can probably achieve more modularity and reduce work.
- Much of the things we formalise will be of use in other places. So we have to organise the definitions and proofs like designing a library for future use, not just proving this theorem.
Another thing in this paper is that a lot of effort is spent on proving that, given a function f : A -> B and an m : TM, m computes f in t x.card steps for input x. The proof typically says: x : A has such-and-such bit-string encoding, the output of the TM is y, and decoding y gives f x. This suggests two things:
- A tactic which extracts the proof obligations of this translation and presents them to us.
- A typeclass, say ComputableIn, which has three parameters: the function, the TM, and the bound on the step count.
Point 2 should significantly ease our proof burden, and in particular allow a somewhat top-to-bottom proof.
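For concreteness, a minimal sketch of the shape such a class could take (Lean 4 syntax; Machine, Encoding, runFor and every field name here are hypothetical stand-ins, not mathlib's TM library; a real version would also need outParams and an actual machine model):
structure Machine (σ : Type) where
  step : σ → σ   -- one computation step on configurations σ (halting, tapes, etc. omitted)

def runFor {σ : Type} (M : Machine σ) : Nat → σ → σ
  | 0, s => s
  | n + 1, s => runFor M n (M.step s)

class Encoding (α σ : Type) where
  encode : α → σ
  decode : σ → Option α

-- "f is computed by M within t (size x) steps on every input x"
class ComputableIn {α β σ : Type} [Encoding α σ] [Encoding β σ]
    (f : α → β) (M : Machine σ) (size : α → Nat) (t : Nat → Nat) : Prop where
  correct : ∀ x : α,
    Encoding.decode (runFor M (t (size x)) (Encoding.encode x)) = some (f x)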
Shreyas Srinivas (Mar 08 2023 at 15:44):
Martin Dvořák said:
I think the best description of the memory bank is "tree".
Ah okay. Now I can map this to what I said in the meeting. If we can express the idea that this is equivalent to what is classically understood as RAM (up to asymptotics) and then show some simple algorithms' time and space complexities, this should probably qualify as a single publishable unit. The critical thing is to demonstrate that we can reason about algorithms without the memory bank structure getting in our way.
Martin Dvořák (Mar 08 2023 at 15:57):
RAM can be expressed as a special case of the memory-bank model afaik. So we should eventually write a transpiler from a general memory-bank machine to the RAM-like memory-bank machine.
Martin Dvořák (Mar 08 2023 at 15:59):
Shreyas Srinivas said:
Another thing in this paper is that a lot of effort is spent on proving that, given a function f : A -> B and an m : TM, m computes f in t x.card steps for input x.
What is t x.card, please?
Shreyas Srinivas (Mar 08 2023 at 16:07):
x is the input. x.card is a hypothetical name for the size of the encoding of x. t : ℕ → NNReal is a time complexity function.
Andrew Carter (Mar 10 2023 at 04:09):
Shreyas Srinivas said:
- A typeclass, say ComputableIn, which has three parameters: the function, the TM, and the bound on the step count.
Point 2 should significantly ease our proof burden, and in particular allow a somewhat top-to-bottom proof.
So I'm not sure the TM is necessary for the typeclass; I chose to just use the function and the bound on step count (and the model, which in this case is just TM as a model). What is the point of having the specific TM as part of the typeclass?
class has_complexity (m: model α β γ) {δ: Type} [has_encodable_function m δ] (f: δ) :=
(value: is_complexity m f)
Underneath is_complexity is witness which takes an element of the model (so a specific TM in your case)
def witness: Π (enf : encodable_function m), α → enf.unwrap → (cost_function enf) → Prop
| (encodable_function.result en) := λ prog data, en.model.accepts_with_cost prog (en.encode data)
| (encodable_function.application en b) := λ prog f cost, ∀ arg : en.type,
witness b (encoding.application en prog arg) (f arg) (cost arg)
Andrew Carter (Mar 10 2023 at 04:11):
That being said I have doubts about the typeclass model at least as far as introspecting lambdas goes (point-free functions seem straightforward with typeclasses - but generally annoying to work with), and am coming around to @Praneeth Kolichala's idea of a compile tactic (I think he calls it complexity).
Andrew Carter (Mar 10 2023 at 04:18):
Shreyas Srinivas said:
Ah okay. Now I can map this to what I said in the meeting. If we can express the idea that this is equivalent to what is classically understood as RAM (upto asymptotics) and then show some simple algorithms' time and space complexities, this should probably qualify as a single publishable unit. The critical thing is to demonstrate that we can reason about algorithms without the memory bank structure getting in our way.
So I think proving that any RAM program is a membank program should be straightforward (always use bank "0", which is effectively executed in place by bank); the only wrinkle is that I use recurse, which is essentially the Y combinator, instead of GOTO. But GOTO can be simulated easily with a giant WHILE(true)...IF...IF...IF...
Going the other direction is trickier and probably requires a simulator that flattens the memory banks into a single much larger RAM program. We can also restrict membank operations after the fact to make this more palatable. The biggest wrinkle is that I allow copying a whole memory bank at once, but copy-on-write imperative semantics should smooth that out I think. Otherwise you would have to define a copy function for every datatype which could get tedious.
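The WHILE(true) + IF-ladder trick mentioned above, in miniature (Lean 4 do-notation; a toy program that sums 1..n, everything here ad hoc, with pc playing the role of the GOTO label):
def run (n : Nat) : Nat := Id.run do
  let mut pc := 0          -- "program counter": which labelled block runs next
  let mut i := 0
  let mut acc := 0
  let mut running := true
  while running do
    match pc with
    | 0 =>                 -- label 0: loop test
      if i < n then pc := 1 else pc := 2
    | 1 =>                 -- label 1: loop body, then GOTO 0
      i := i + 1
      acc := acc + i
      pc := 0
    | _ =>                 -- label 2: halt
      running := false
  return acc

#eval run 5  -- 15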
Shreyas Srinivas (Mar 10 2023 at 09:09):
Andrew Carter said:
So I'm not sure if the TM is necessary for the typeclass, I chose to just use the function and the bound on step count (and the model which in this case is just TM as a model). What is the point of having the specific TM as part of the typeclass?
In the context of Cook Levin theorem as formulated in textbooks and this Isabelle proof, having the TM is useful because, with the relevant TM and suitable TM composition laws, you get new TMs, and that is basically how this version of the proof goes.
Shreyas Srinivas (Mar 10 2023 at 09:10):
The purpose of the typeclass is to abstract away the equivalence between a function and a TM that computes the function.
Andrew Carter (Mar 10 2023 at 16:37):
Shreyas Srinivas said:
The purpose of the typeclass is to abstract the away the equivalence between a function and a TM that computes the function.
But do you actually care what the TM is, or just that a TM with certain properties exists (in this case step count, but things like memory usage could also be included)? You can still "get" a TM from the existential quantifier, you just don't know what it looks like at proof time.
Shreyas Srinivas (Mar 10 2023 at 16:39):
The TMs are the meat of this particular proof. So I do care about what the TM looks like.
Shreyas Srinivas (Mar 10 2023 at 16:40):
Whether that is necessary for "a" proof of this theorem is a different matter. Being able to construct and reason about TMs is more broadly useful in complexity theory results, so I can see some value in doing things this way
Andrew Carter (Mar 10 2023 at 17:00):
Shreyas Srinivas said:
The TMs are the meat of this particular proof. So I do care about what the TM looks like.
Then I'm not really sure what the typeclass buys you over just the appropriate theorems; if you care what the TM is each time, then I wonder what the auto-resolution buys you (i.e. what if it picks the wrong resolution). Then again, I'm not an expert in correct uses of typeclasses, and I think I'm already abusing them.
Shreyas Srinivas (Mar 10 2023 at 17:00):
Top down proofs
Andrew Carter (Mar 10 2023 at 17:17):
Shreyas Srinivas said:
Top down proofs
This I don't think I understand; the proof in my mind is exactly the same with typeclasses vs. theorems. The only benefit I've noticed is the auto-resolution of apply_instance. Do you have a simple example of a top-down proof via typeclasses vs. what the equivalent theorem-based approach would look like?
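For what it's worth, here is a toy of the shape the instance-driven style takes (a hypothetical HasCost class with the actual cost statement trivialised to True; not either project's real API). The point is only that once the pieces have instances, the bound for a composite is assembled by instance search rather than restated by hand:
class HasCost (f : Nat → Nat) (c : Nat) : Prop where
  bound : True   -- stand-in for "f is computable within cost c" in some fixed model

instance compCost (f g : Nat → Nat) (cf cg : Nat)
    [HasCost f cf] [HasCost g cg] : HasCost (f ∘ g) (cf + cg) := ⟨True.intro⟩

instance idCost : HasCost id 1 := ⟨True.intro⟩

-- the bound is written as 1 + 1 (not 2) so that it matches the compCost instance syntactically
example : HasCost (id ∘ id) (1 + 1) := inferInstance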
Andrew Carter (Apr 02 2023 at 17:55):
Ok, I did another draft, and now I have merge_sort's complexity of
((37 + c) * l.length * (nat.clog 2 l.length) + 5)
where c is the cost of a comparison.
https://github.com/calcu16/lean_complexity/blob/aae684cc55a9beb196d4acabd06a2c22951b810f/src/hmem/encoding/list_complexity.lean
I renamed the model to hmem (for hierarchical memory) and made it more tree-like; hopefully this makes it easier for people to understand the model (based on @Martin Dvořák and @Shreyas Srinivas's discussion above). I also hid the memory behind a quotient so we can use eq.subst in proofs. Assuming a simulator interns memory.null, then it would use memory.usage_le memory; if it interns everything, it would use memory.unique_usage_le:
https://github.com/calcu16/lean_complexity/blob/aae684cc55a9beb196d4acabd06a2c22951b810f/src/hmem/memory.lean
I haven't written any proofs about memory usage yet though.
I've also modified stack so that it embeds well-formedness within the data structure because it was annoying to work with,
and I've added a new concept trace which captures the essence of an execution that's required for proofs.
https://github.com/calcu16/lean_complexity/blob/aae684cc55a9beb196d4acabd06a2c22951b810f/src/hmem/trace.lean
Ideally (I think) the future "compile" tactic would take a lean defined function and generate a program with its corresponding trace.
Also, the math is quite onerous; it would be nice to deal with asymptotics, since the majority of the complexity proof for merge_sort_complexity is just proving a less-than-or-equal condition.
Notification Bot (Apr 02 2023 at 18:15):
Junaid Ahmed Mohammed has marked this topic as resolved.
Notification Bot (Apr 02 2023 at 18:15):
Junaid Ahmed Mohammed has marked this topic as unresolved.
Shreyas Srinivas (Apr 19 2023 at 11:50):
Has anyone tried to prove complexity bounds for distributed graph algorithms? These should be simpler in principle because we don't count the computations that happen in a node, only the communications happening between them (in lockstep). It would be nice to see the Cole-Vishkin algorithm on a directed path formalised.
Andrew Carter (Apr 19 2023 at 21:28):
I don't know about that specifically - I did see some coloring work in the graph folder though (although not sure about directedness). But for any of the graph things I would worry much more about proving correctness rather than complexity per se, especially since the inner loop of the algorithm is trivially constant time. I'd imagine you'd get better visibility by asking about this as a graph question rather than a complexity question.
Andrew Carter (Apr 21 2023 at 03:58):
I'm a little bit stuck, although mostly in a "not letting it go" way, not in an actually stuck way. I've defined memory as a quotient of hidden.memory as follows:
universe u
namespace hidden
inductive memory (α: Type u)
| leaf: memory
| node (value: α) (children: α → memory): memory
variables {α: Type u} [has_zero α] [decidable_eq α]
def getvp: memory α → list α → α
| (memory.leaf) _ := 0
| (memory.node v _) [] := v
| (memory.node _ vs) (a::as) := getvp (vs a) as
end hidden
def memory (α: Type u) [has_zero α] [decidable_eq α]: Type* := @quotient (hidden.memory α) infer_instance
and it seems to me that I should be able to computably define a constructor for memory equivalent to hidden.memory.node using some form of lift; the noncomputable version is
noncomputable def mk (value: α) (children: α → memory α): memory α :=
quotient.mk (hidden.memory.node value (quotient.out ∘ children))
I think this is sound in the sense that no matter what representative is chosen, mk has equal output. This doesn't strictly matter, because (a) we can still use the noncomputable version in proofs and (b) if α is a fintype then I can actually generate a canonical hidden.memory α and so I have a computable quotient.out. Also, I could use the hidden form instead; it just makes the proofs more annoying since I can't rw as easily.
That being said, the ability to #eval has been cool, and nat is slightly easier to work with when testing than fin, so it would be nice to have it working.
Does anyone have any ideas on how to achieve this?
One thing I've noticed is that a lot of things I would normally define mathematically as quotients (such as integers and rationals) are actually defined in terms of canonical representatives (as dependent types) instead. So maybe there is a better way of defining memory?
Kevin Buzzard (Apr 21 2023 at 07:45):
My understanding is that these mathematically ugly (ie non-quotient) definitions of int and rat have been defended on the grounds of speed and/or efficiency (so in particular it's nothing to do with mathematics, it's an implementation issue). In practice it makes the basic API much harder to develop but if you're prepared to pay this price then you get a quotient-avoiding definition.
Shreyas Srinivas (Apr 25 2023 at 15:26):
This is another matching algorithm paper appearing in ITP: https://arxiv.org/pdf/2302.13747.pdf
Although it is in Isabelle, I think it illustrates what I have in mind when I speak of verifying things at a higher level of abstraction
Shreyas Srinivas (Jun 05 2023 at 23:04):
How far can we replicate this in Lean?
https://lawrencecpaulson.github.io/2023/02/22/Binary_GCD.html
I was able to verify the equivalent for the normal Euclidean GCD algorithm, but proofs about binary GCD quickly became rather long (caveat: I used conv to make the proofs feel natural and to control unfold). I never got around to finishing the binary version.
Mario Carneiro (Jun 05 2023 at 23:10):
do you have lean code?
Mario Carneiro (Jun 05 2023 at 23:11):
The normal euclidean algorithm for GCD is already implemented in lean core and verified in std/mathlib
Shreyas Srinivas (Jun 05 2023 at 23:14):
First part of it below. Warning: it has import Mathlib at the beginning. This was in my scratch repo so it isn't online.
import Mathlib
import Std
inductive BinaryGCD : ℤ → ℤ → ℤ → Prop where
| dzero (x : ℤ) : BinaryGCD x 0 x
| deven : BinaryGCD x y z → BinaryGCD (2*x) (2*y) (2*z)
| dodd : BinaryGCD x y z → ¬(2 ∣ y) → BinaryGCD (2*x) y z
| dstep : BinaryGCD (x - y) y z → y < x → BinaryGCD x y z
| dswap : BinaryGCD x y z → BinaryGCD y x z
theorem BinaryGCD_Div : (BinaryGCD x y z) → z ∣ x ∧ z ∣ y := by
intro h
induction h
case dzero => simp
case deven =>
rename_i x y z a a_ih
cases' a_ih with left right
cases' left with w wh
cases' right with v vh
constructor
· use w
rw[wh, Int.mul_assoc]
· use v
rw[vh, Int.mul_assoc]
case dstep =>
rename_i x y z step _ a_h
cases' a_h with left right
have div_sum := Int.dvd_add left right
cases' div_sum with w wh
constructor
case intro.left =>
use w
rw [←wh]
ring
case intro.right =>
exact right
case dswap =>
rename_i x y z a a_ih
cases' a_ih with left right
constructor
exact right
exact left
case dodd =>
rename_i x y z a _ a_ih
cases' a_ih with left right
cases' left with w wh
constructor
case intro.intro.right =>
exact right
case intro.intro.left =>
use 2*w
rw [wh]
ring
done
Shreyas Srinivas (Jun 05 2023 at 23:15):
There's more, but it is too messy for me to make sense a month later.
Mario Carneiro (Jun 05 2023 at 23:16):
import Mathlib.Tactic.Ring is sufficient
Shreyas Srinivas (Jun 05 2023 at 23:19):
My first question is : Does it even make sense to pursue this style?
Mario Carneiro (Jun 05 2023 at 23:20):
here's some minor style cleanup:
theorem BinaryGCD_Div (h : BinaryGCD x y z) : z ∣ x ∧ z ∣ y := by
induction h with
| dzero => simp
| deven _ a_ih =>
let ⟨⟨w, wh⟩, v, vh⟩ := a_ih
constructor
· use w
rw [wh, Int.mul_assoc]
· use v
rw [vh, Int.mul_assoc]
| dstep _ _ a_h =>
let ⟨left, right⟩ := a_h
let ⟨w, wh⟩ := Int.dvd_add left right
constructor
· use w
rw [← wh]
ring
· exact right
| dswap _ a_ih =>
let ⟨left, right⟩ := a_ih
constructor
exact right
exact left
| dodd _ _ a_ih =>
let ⟨⟨w, wh⟩, right⟩ := a_ih
constructor
· use 2*w
rw [wh]
ring
· exact right
Mario Carneiro (Jun 05 2023 at 23:21):
what do you mean by "does it make sense to pursue this style"? What level of this proof / formalization are you questioning
Mario Carneiro (Jun 05 2023 at 23:22):
I think mathlib would generally not have the definition BinaryGCD unless it is required by something else
Mario Carneiro (Jun 05 2023 at 23:22):
this is also (as pointed out in the blog post) not really an algorithm
Mario Carneiro (Jun 05 2023 at 23:22):
it's a bunch of legal moves you can do but no decision procedure among them
Mario Carneiro (Jun 05 2023 at 23:23):
for example you could just keep applying dswap
and not really approach a solution
Mario Carneiro (Jun 05 2023 at 23:24):
if the goal is just to replicate the formalization in the blog post then this looks correct
Shreyas Srinivas (Jun 05 2023 at 23:28):
Mario Carneiro said:
what do you mean by "does it make sense to pursue this style"? What level of this proof / formalization are you questioning
Exactly. It isn't an algorithm. It is just an inductively defined predicate, and we check if objects satisfying this predicate have the properties we seek.
Shreyas Srinivas (Jun 05 2023 at 23:29):
It doesn't deal with messy implementation details
Shreyas Srinivas (Jun 05 2023 at 23:30):
only with the structural aspects. This is close to the level a lot of algorithms folks operate in.
Mario Carneiro (Jun 05 2023 at 23:32):
we arguably already have the structural aspects, this is just theorems like gcd (x + y) y = gcd x y
Mario Carneiro (Jun 05 2023 at 23:32):
where gcd
is here being treated as just the abstract gcd and not a particular algorithm to compute it
Mario Carneiro (Jun 05 2023 at 23:33):
each of the BinaryGCD
constructors should correspond to an existing theorem about gcd
Shreyas Srinivas (Jun 05 2023 at 23:43):
Mario Carneiro said:
each of the BinaryGCD constructors should correspond to an existing theorem about gcd
Okay. Question 2: at present, what is the best Lean equivalent for auto in Isabelle?
Mario Carneiro (Jun 05 2023 at 23:43):
By the way, BinaryGCD does not compute the same result as Int.gcd, because the first clause says dzero (x : ℤ) : BinaryGCD x 0 x but Int.gcd x 0 = x.natAbs
Mario Carneiro (Jun 05 2023 at 23:43):
i.e. this gcd function produces negative results but ours never does
Mario Carneiro (Jun 05 2023 at 23:44):
Shreyas Srinivas said:
Question 2: at present, what is the best Lean equivalent for auto in Isabelle?
aesop
Shreyas Srinivas (Jun 05 2023 at 23:44):
I switched to Int to use ring. Because of the subtraction in dstep, zify doesn't work very well if I use Nat.
Mario Carneiro (Jun 05 2023 at 23:45):
oh I see, the original was about Nat
Mario Carneiro (Jun 05 2023 at 23:45):
you shouldn't change the theorem statement, that's cheating
Shreyas Srinivas (Jun 05 2023 at 23:46):
Mario Carneiro said:
Shreyas Srinivas said:
Question 2: at present, what is the best Lean equivalent for auto in Isabelle?
aesop
aesop seems to require a lot of fiddling to make it work. I read the paper, and they are quite explicit that it is a whitebox search tactic. Most of the time when I just call aesop, I get "failed to prove the goal after exhaustive search".
Mario Carneiro (Jun 05 2023 at 23:46):
inductive BinaryGCD : ℕ → ℕ → ℕ → Prop where
| dzero (x : ℕ) : BinaryGCD x 0 x
| deven : BinaryGCD x y z → BinaryGCD (2*x) (2*y) (2*z)
| dodd : BinaryGCD x y z → ¬(2 ∣ y) → BinaryGCD (2*x) y z
| dstep : BinaryGCD (x - y) y z → y < x → BinaryGCD x y z
| dswap : BinaryGCD x y z → BinaryGCD y x z
theorem BinaryGCD_Div (h : BinaryGCD x y z) : z ∣ x ∧ z ∣ y := by
induction h with
| dzero => simp
| deven _ a_ih =>
let ⟨⟨w, wh⟩, v, vh⟩ := a_ih
constructor
· use w
rw [wh, Nat.mul_assoc]
· use v
rw [vh, Nat.mul_assoc]
| dstep _ yx a_h =>
let ⟨left, right⟩ := a_h
let ⟨w, wh⟩ := Nat.dvd_add left right
rw [Nat.sub_add_cancel (Nat.le_of_lt yx)] at wh
constructor
· use w
rw [← wh]
· exact right
| dswap _ a_ih =>
let ⟨left, right⟩ := a_ih
constructor
exact right
exact left
| dodd _ _ a_ih =>
let ⟨⟨w, wh⟩, right⟩ := a_ih
constructor
· use 2*w
rw [wh]
ring
· exact right
Mario Carneiro (Jun 05 2023 at 23:47):
auto
also requires a lot of fiddling, the library is marked up quite a bit to make it work
Shreyas Srinivas (Jun 05 2023 at 23:48):
My nat version was longer than this. At some point I remember that library_search was not returning useful results (this is before recent improvements as I recall).
Mario Carneiro (Jun 05 2023 at 23:49):
I'm sure it was, Nat.sub_add_cancel is a really important lemma to use here and our tactics won't use it automatically for the most part
Shreyas Srinivas (Jun 05 2023 at 23:55):
In any case, the length of the proof compared to Isabelle's was a sufficient deterrent for me to put this off back then.
Mario Carneiro (Jun 05 2023 at 23:56):
well the proof can still be golfed a lot
theorem BinaryGCD_Div (h : BinaryGCD x y z) : z ∣ x ∧ z ∣ y := by
induction h with
| dzero => simp
| deven _ ih => exact ⟨mul_dvd_mul_left _ ih.1, mul_dvd_mul_left _ ih.2⟩
| dstep _ yx h =>
exact ⟨by simpa [Nat.sub_add_cancel (Nat.le_of_lt yx)] using Nat.dvd_add h.1 h.2, h.2⟩
| dswap _ ih => exact ⟨ih.2, ih.1⟩
| dodd _ _ ih => exact ⟨dvd_mul_of_dvd_right ih.1 _, ih.2⟩
Mario Carneiro (Jun 05 2023 at 23:58):
it's not really in the same style as just calling blast
on everything, but that's probably possible
Shreyas Srinivas (Jun 05 2023 at 23:59):
This syntax is quite convenient for induction. In my syntax I was getting a whole different set of variables than the one I started with (BinaryGCD x y z), and just keeping track of which variable was getting used at what point was painful.
Mario Carneiro (Jun 05 2023 at 23:59):
well it is the induction tactic after all :)
Shreyas Srinivas (Jun 06 2023 at 00:03):
blast is a tableau prover right? In principle shouldn't it be possible to implement a lean version (within whatever strange limitations that come with DTT unification)
Mario Carneiro (Jun 06 2023 at 00:03):
lean has tauto and itauto for propositional proving
Mario Carneiro (Jun 06 2023 at 00:04):
lean 3 had cc and eblast and a smt_tactic mode but it was not maintained
Mario Carneiro (Jun 06 2023 at 00:15):
Here's a more blasty proof:
theorem BinaryGCD_Div (h : BinaryGCD x y z) : z ∣ x ∧ z ∣ y := by
induction h with
| dstep _ yx h =>
exact ⟨by simpa [Nat.sub_add_cancel (Nat.le_of_lt yx)] using Nat.dvd_add h.1 h.2, h.2⟩
| _ => aesop (add safe mul_dvd_mul_left, unsafe 50% dvd_mul_of_dvd_right)
Mario Carneiro (Jun 06 2023 at 00:17):
to mimic the isabelle proof we need a better Nat.dvd_sub lemma
Mario Carneiro (Jun 06 2023 at 00:23):
theorem Nat.dvd_sub_iff_left {k m n : ℕ} (mn : n ≤ m) (h : k ∣ n) : k ∣ m ↔ k ∣ m - n := by
simpa [Nat.sub_add_cancel mn] using (Nat.dvd_add_iff_left (m := m - n) h).symm
theorem BinaryGCD_Div (h : BinaryGCD x y z) : z ∣ x ∧ z ∣ y := by
induction h with
| dstep _ yx h => exact ⟨(Nat.dvd_sub_iff_left (Nat.le_of_lt yx) h.2).2 h.1, h.2⟩
| _ => aesop (add safe mul_dvd_mul_left, unsafe 50% dvd_mul_of_dvd_right)
maybe there is a way to get aesop to find the dstep proof, not sure
Shreyas Srinivas (Jun 06 2023 at 15:21):
So having read a bit about aesop rule sets and attributes, and having taken a peek at the continuity tactic, would a similar divisibility tactic written on top of aesop make sense? This is of course to avoid having to configure aesop posthoc after figuring out which theorems are needed.
Scott Morrison (Jun 06 2023 at 22:44):
Yes. Although first we should fix the current aesop-based continuity implementation, which is broken in many places. It's not clear to me that it's possible to implement it properly on top of aesop, although I hope so. Just look for porting notes mentioning continuity to find test cases. :-)
Shreyas Srinivas (Jun 07 2023 at 12:31):
Mario Carneiro said:
this is also (as pointed out in the blog post) not really an algorithm
@Mario Carneiro I found one of the papers I was looking for, to explain why I think inductively defining invariants and proving them might be tenable : https://dl.acm.org/doi/pdf/10.1145/3519939.3523707
Mario Carneiro (Jun 07 2023 at 12:32):
there is nothing about this formalization that isn't doable, after all we basically did it above
Mario Carneiro (Jun 07 2023 at 12:32):
it's just, not really an algorithm
Mario Carneiro (Jun 07 2023 at 12:33):
if your goal is to describe an algorithm you probably want something you can #eval
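For contrast, an executable sketch of binary GCD (fuel-passing so that termination is trivial; binGCD is an ad hoc name, and this is neither mathlib's gcd nor a verified definition):
def binGCD (fuel x y : Nat) : Nat :=
  match fuel with
  | 0 => 0   -- out of fuel; never reached if enough fuel is supplied
  | f + 1 =>
    if x == 0 then y
    else if y == 0 then x
    else if x % 2 == 0 && y % 2 == 0 then 2 * binGCD f (x / 2) (y / 2)
    else if x % 2 == 0 then binGCD f (x / 2) y
    else if y % 2 == 0 then binGCD f x (y / 2)
    else if x ≤ y then binGCD f x (y - x)
    else binGCD f (x - y) y

#eval binGCD 100 48 36  -- 12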
Shreyas Srinivas (Jun 07 2023 at 12:34):
Mario Carneiro said:
there is nothing about this formalization that isn't doable, after all we basically did it above
The other paper is cited by this one and is more relevant : https://link.springer.com/chapter/10.1007/978-3-642-35308-6_9
Shreyas Srinivas (Jun 07 2023 at 12:34):
It shows how to extract a total computable function from an inductive relation
Shreyas Srinivas (Jun 07 2023 at 12:39):
Mario Carneiro said:
it's just, not really an algorithm
Agreed that it isn't an algorithm and that it is doable. I am making the case that this level of abstraction captures what algorithms people normally do and, as the papers I linked argue, functions can be extracted from some of them.
Mario Carneiro (Jun 07 2023 at 12:40):
I disagree with that
Mario Carneiro (Jun 07 2023 at 12:40):
algorithms people don't write inductive types
Mario Carneiro (Jun 07 2023 at 12:40):
they write pseudocode
Shreyas Srinivas (Jun 07 2023 at 12:42):
Mario Carneiro said:
they write pseudocode
Which would barely qualify as an algorithm these days (which is why implementing most modern algorithms in code is non-trivial).
Shreyas Srinivas (Jun 07 2023 at 13:03):
Would it help to adopt an approach that combines defining terminating functions and inductive properties? One part of it would be to write functions that do the computation and prove that they satisfy some inductive property (whose arms are invariants). Another part would be proving theorems about the inductive prop. According to the papers I linked, it should be possible to extract functions from (some?) such inductive props.
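A very small instance of that pattern (a toy, Lean 4 syntax; Sums, sum and sum_spec are ad hoc names, not taken from the papers cited): a function does the computation, an inductive Prop states the specification, and a lemma connects the two.
inductive Sums : List Nat → Nat → Prop where
  | nil : Sums [] 0
  | cons (x : Nat) {xs : List Nat} {s : Nat} : Sums xs s → Sums (x :: xs) (x + s)

-- the computation
def sum : List Nat → Nat
  | [] => 0
  | x :: xs => x + sum xs

-- the connection: sum satisfies the inductive specification
theorem sum_spec : ∀ xs, Sums xs (sum xs)
  | [] => Sums.nil
  | x :: xs => Sums.cons x (sum_spec xs)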
Martin Dvořák (Jun 09 2023 at 07:13):
Shreyas Srinivas said:
Mario Carneiro said:
they write pseudocode
Which would barely qualify as an algorithm these days (which is why implementing most modern algorithms in code is non-trivial).
What's wrong with publishing algorithms in pseudocode?
Shreyas Srinivas (Jun 09 2023 at 08:29):
Martin Dvořák said:
Shreyas Srinivas said:
Mario Carneiro said:
they write pseudocode
Which would barely qualify as an algorithm these days (which is why implementing most modern algorithms in code is non-trivial).
What's wrong with publishing algorithms in pseudocode?
Nothing if they are written well. Big emphasis on "if"
Shreyas Srinivas (Jun 09 2023 at 20:08):
I think I should elaborate on that point. I have been discussing the idea of building a library of formalized algorithms, and at least their correctness, with a few leading people in the field. This issue of how to specify algorithms came up quite often. We agreed that if one starts looking at the algorithms literature, extracting something implementable out of the pseudocode provided in most papers is non-trivial enough to warrant a year-long master's thesis for each of them. This applies to anything beyond basic CLRS (or other equivalent textbook) material.
This is a problem entirely separate from actually constructing a model in which to specify such an implementation and prove something about it. This is not true for all papers of course, and quite a lot of good ones exist. But the current state of affairs is highly non-ideal for pursuing an implementation-focused approach. Just implementing them in a normal imperative language and measuring their performance, without proving anything about them, would be worth a paper or two.
Shreyas Srinivas (Jun 09 2023 at 20:17):
(deleted)
Shreyas Srinivas (Jun 09 2023 at 20:17):
I guess that this is one of the reasons that there are so few entries in Isabelle's AFP for algorithms and data structures, relative to how much content there is out there. There are 40 entries under algorithms and 65 entries under data structures, stretching back to 2004. Not all of them are directly concerned with topics that an algorithms textbook would consider, while some are about mathematical results that algorithms use.
Martin Dvořák (Sep 05 2023 at 07:36):
Is the conversion from TM1 to TM2 missing in https://github.com/leanprover-community/mathlib4/blob/master/Mathlib/Computability/TuringMachine.lean or did I just overlook it?
Mario Carneiro (Sep 05 2023 at 09:27):
There is the TM2to1 namespace... if you mean a simulation of TM1 machines on TM2, there isn't much need; the interesting direction is that TM2 machines can be simulated on TM1. For all the reverse implications it is easiest to just have something close the cycle and prove everything Turing-equivalent.
Martin Dvořák (Sep 05 2023 at 11:14):
Yes, this is exactly what I had in mind!
So, for example, if I prove that TM0 can be simulated by Grammar and that Grammar can be simulated by TM2, then all of TM0, TM1, TM2, and Grammar will be proven equivalent regarding computability, right?
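Abstractly, yes: the "close the cycle" argument is just transitivity chaining. A sketch (hypothetical Simulates relation, reading Simulates a b as "a can be simulated by b"; nothing here is mathlib's actual API), taking the two new simulations plus the two existing directions as hypotheses:
example {Model : Type} (Simulates : Model → Model → Prop)
    (trans : ∀ {a b c : Model}, Simulates a b → Simulates b c → Simulates a c)
    (TM0 TM1 TM2 Grammar : Model)
    (h0g : Simulates TM0 Grammar) (hg2 : Simulates Grammar TM2)
    (h21 : Simulates TM2 TM1) (h10 : Simulates TM1 TM0) :
    Simulates TM1 Grammar ∧ Simulates Grammar TM0 :=
  ⟨trans h10 h0g, trans (trans hg2 h21) h10⟩
Every other pairwise simulation follows by the same kind of chaining around the cycle.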
Last updated: Dec 20 2023 at 11:08 UTC