Zulip Chat Archive
Stream: lean4
Topic: Expressing complexity of algorithms in Lean
Geoffrey Irving (Jul 22 2024 at 20:37):
I was wandering through https://leanprover-community.github.io/mathlib4_docs/Batteries/Data/UnionFind/Basic.html, and it seems like:
1. The implementation does both path compression and union-by-rank, so it should have complexity O(n α(n)).
2. There's no proof of this.
3. Of course (2) has to be true, as there's no way of expressing the complexity of an algorithm written in Lean in the same file.

Is (3) right? I suppose the best one could do is write a lifting macro that extracts a representation of the algorithm in some other data structure, then prove the complexity of that lift. But this would require assuming complexities for all primitive operations, so it would be quite a heavyweight lifting algorithm.
I've done a few (way easier) proofs of complexity for the debate formalization, but that was a much simpler setting where (1) the Lean code computed a monad representing the computation and (2) I was only counting oracle query complexity, not some broader class of primitive operations.
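Roughly along these lines (a toy sketch, not the actual Comp construction from the debate repo; OComp and its oracle type are invented here):
-- hypothetical: a computation with access to an oracle o, where the
-- state tracks the number of oracle queries made
abbrev OComp (o : Nat → Bool) (α : Type) := StateM Nat α

-- the intended way to consult the oracle: each call bumps the counter
def OComp.query (o : Nat → Bool) (n : Nat) : OComp o Bool := do
  modify (· + 1)
  return o n

-- e.g. a computation making two queries against a concrete oracle
def twoQueries : OComp (fun n => n % 2 == 0) Bool := do
  return (← OComp.query _ 3) && (← OComp.query _ 4)

#eval twoQueries.run 0  -- (false, 2): the result and the query count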
Mario Carneiro (Jul 22 2024 at 20:41):
Yes, all three points are true
Mario Carneiro (Jul 22 2024 at 20:42):
My dream system for doing this would be to reflect on the Lean Expr itself and have some complexity theory of lambda-calculus evaluation over that representation
Henrik Böving (Jul 22 2024 at 20:43):
Geoffrey Irving said:
But this would require assuming complexities for all primitive operations, so it would be quite a heavyweight lifting algorithm.
Note that due to sharing, in particular of Array, you'd have to either:
- overestimate things like setting an array element with O(n), because it needs to be copied in the worst case, or
- write a perfect linearity analysis system, in which case you have solved one of the most annoying issues that we have right now.
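For illustration, a minimal sketch of the two cases (the O(1)/O(n) claims in the comments describe Lean's reference-counting runtime behavior, not anything proved in Lean):
-- `a` is still referenced after the update, so the runtime must copy: O(n)
def shared (a : Array Nat) : Array Nat × Array Nat :=
  (a.set! 0 1, a)

-- `a` is used linearly (reference count 1 at the update),
-- so `set!` mutates in place: O(1)
def linear (a : Array Nat) : Array Nat :=
  a.set! 0 1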
Mario Carneiro (Jul 22 2024 at 20:43):
and/or to have a compiler into a simpler language with explicit runtime accounting
Geoffrey Irving (Jul 22 2024 at 20:44):
Ouch, the sharing aspect is rough.
Henrik Böving (Jul 22 2024 at 20:45):
But hey, overestimating with O(n) would still get you a correct O(f(x)) in the end :P
Mario Carneiro (Jul 22 2024 at 20:45):
well no, you would not want to introduce an extra linear factor on top of an inverse-Ackermann bound
Geoffrey Irving (Jul 22 2024 at 20:46):
I assume @Henrik Böving is saying that if you add in O(n) once it's fine.
Geoffrey Irving (Jul 22 2024 at 20:46):
Indeed you can safely add it in 4 times before you exceed α(n).
Mario Carneiro (Jul 22 2024 at 20:46):
but it's not once, it's once per operation, i.e. it would completely dominate the actual contribution of the algorithm
Mario Carneiro (Jul 22 2024 at 20:46):
one O(n) is already enough to beat O(alpha(n))
Henrik Böving (Jul 22 2024 at 20:46):
Right, but since O is only an upper bound, it's fine to exceed it
Henrik Böving (Jul 22 2024 at 20:47):
You just won't get the results that you want, but you'll get technically correct results!
Mario Carneiro (Jul 22 2024 at 20:47):
I'm very confused, that's like saying it's fine if your binary search algorithm is O(n), that's not the theorem we wanted to prove
Henrik Böving (Jul 22 2024 at 20:48):
Yes of course you would not want to practically do that, but it would yield a correct result. I'm just making a silly joke.
Geoffrey Irving (Jul 22 2024 at 20:49):
There is certainly a correct result that if I'm handed a UnionFind of size n, and I do n consecutive operations on it, then the overall complexity is O(n α(n)), since at most the first operation copies.
Still not following the joke, unfortunately. Unless it's some sort of cojoke.
Mario Carneiro (Jul 22 2024 at 20:49):
Anyway it's very likely true that a solution to automating complexity theory will solve our sharing issues, although that's true in the same sense as "proving the correctness of the program will solve the problem of having bugs in the code"
𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Jul 22 2024 at 21:01):
As an aside, there is some cool recent work on internally expressing complexity in type theory, e.g. calf (a cost-aware logical framework).
Shreyas Srinivas (Jul 22 2024 at 21:31):
I await the day when one of these lambda-calculus-based complexity models actually doesn't have polynomial blow-ups in complexity over normal RAM-model computations
Shreyas Srinivas (Jul 22 2024 at 21:33):
Currently all this work happens entirely outside algorithms groups, and nobody seems to ask the algorithmists or complexity theorists what would be useful for them (hint: an ALGOL-like lightweight DSL with CLRS-style annotations)
Henrik Böving (Jul 22 2024 at 21:35):
What are CLRS annotations?
Shreyas Srinivas (Jul 22 2024 at 21:35):
That's not a technical term. See how the famous CLRS book (Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms) annotates its pseudocode with per-line costs
Shreyas Srinivas (Jul 22 2024 at 21:39):
Another point: I don't see how a solution to complexity theory can be automatic in Lean's current expression language since by funext, all algorithms which compute the same function are equal to each other regardless of complexity.
Shreyas Srinivas (Jul 22 2024 at 21:40):
Arthur pointed this out last year on Discord.
Shreyas Srinivas (Jul 22 2024 at 21:41):
It is a symmetrical but distinct problem that modern algorithms research doesn't explore the complexity of algorithms in functional-language models. These lines of research are more or less in separate silos.
Mario Carneiro (Jul 22 2024 at 21:43):
Shreyas Srinivas said:
I await the day when one of these lambda calculus based complexity models actually doesn't have polynomial blow ups of complexity over normal RAM model computations
Currently all this work happens entirely outside algorithms groups and nobody seems to ask the algorithmists or complexity theorists about what would be useful for them (hint: ALGOL like lightweight DSL with CLRS style annotations)
Especially for the Expr version I mentioned above, the complexity model I'm most interested in is the one the Lean compiler actually uses
Shreyas Srinivas (Jul 22 2024 at 21:47):
Apart from that funext issue, that's actually the most useful model for Lean (I guess some quantitative type theory is involved)
Shreyas Srinivas (Jul 22 2024 at 21:48):
But that also means that we don't have the same complexity model in Lean as algorithmists do, and the results from there may not apply
Shreyas Srinivas (Jul 22 2024 at 21:48):
Which is more or less fine for Lean users
Shreyas Srinivas (Jul 22 2024 at 21:49):
I say "may not" because there are several niches in algorithms theory which are not concerned with the complexity of evaluation of a lean function, only the number of certain function calls. This includes several query models and communication models.
Geoffrey Irving (Jul 22 2024 at 21:50):
It seems like the question is how long it takes and reasonable approximations thereof, not niche theory.
Shreyas Srinivas (Jul 22 2024 at 21:51):
What I mean is, union find might not be as efficient as you read in your undergrad algorithms textbook
Geoffrey Irving (Jul 22 2024 at 21:52):
If we care about oracle complexity, https://github.com/google-deepmind/debate/blob/d9ca51ac43e06a8bab5937c47e7c0c68387efea5/Comp/Defs.lean#L23 works.
Shreyas Srinivas (Jul 22 2024 at 21:53):
Yes. I am formalizing a different line of work where computational complexity is irrelevant
Geoffrey Irving (Jul 22 2024 at 21:54):
Then you should use custom monads to express algorithms, and not complain that a thread about time complexity of Lean code would involve Lean details.
Shreyas Srinivas (Jul 22 2024 at 21:59):
I am not complaining. On the contrary, I think:
- Mario's solution makes the most sense for Lean code.
- Calling it algorithmic complexity might be misleading, in that it won't match traditional algorithms analysis. It isn't esoteric theory; in fact, functional programming models are the esoteric ones as far as mainstream algorithms research goes.
Shreyas Srinivas (Jul 22 2024 at 22:08):
Mario Carneiro said:
I'm very confused, that's like saying it's fine if your binary search algorithm is O(n), that's not the theorem we wanted to prove
You will get results like this (not this exactly, but similar blow-ups in complexity) unless you somehow manage to ensure in-place mutation everywhere you need it. Textbook algorithmic results may not apply as is.
Shreyas Srinivas (Jul 22 2024 at 22:28):
A difficulty with working in Lean's complexity model (let's call it that for now) will be coming up with the correct asymptotics for theorem statements, especially as the Lean functions get more complicated, since it is unlikely that Lean itself will produce the final simplified asymptotic form.
Mario Carneiro (Jul 23 2024 at 03:02):
I don't think this is a major issue. Textbook algorithms assume mutable data structures, and this maps straightforwardly to linear usage of mutable variables
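For example (a minimal sketch; reverseInPlace is a made-up name, and Array.swapIfInBounds is the current core name for the bounds-checked swap, formerly swap!):
-- textbook in-place array reversal, written so the array is threaded
-- linearly through the loop and the runtime can mutate it in place
def reverseInPlace {α : Type} (a : Array α) : Array α := Id.run do
  let mut a := a
  let mut i := 0
  let mut j := a.size - 1
  while i < j do
    a := a.swapIfInBounds i j
    i := i + 1
    j := j - 1
  return a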
Mario Carneiro (Jul 23 2024 at 03:03):
Shreyas Srinivas said:
unless you somehow manage to ensure in place mutation everywhere you need it.
Yes that's the point
Mario Carneiro (Jul 23 2024 at 03:03):
I'd like my proof to actually show the absence of bugs, thank you very much
Mario Carneiro (Jul 23 2024 at 03:04):
that applies just as much to complexity bounds as correctness
Mario Carneiro (Jul 23 2024 at 03:04):
if my program has a (sharing) bug in it I shouldn't be able to prove the theorem
Mario Carneiro (Jul 23 2024 at 03:04):
otherwise what's the point of the theorem?
Daniel Weber (Jul 23 2024 at 03:41):
I'd love to be able to prove something along the lines of
(f : (α : Type) → [LinearOrder α] → (l : List α) → {l₂ : List α // l₂.Perm l ∧ List.Sorted (· ≤ ·) l₂}) → (complexity f = Ω(n log(n)))
Is there any approach which could, in theory, allow for something like this?
Shreyas Srinivas (Jul 23 2024 at 08:42):
Mario Carneiro said:
I'd like my proof to actually show the absence of bugs thank you very much
How would you make an asymptotically tight prediction to put into a theorem statement when predicting linearity is hard? Would it be trial and error? Otherwise, when a bound doesn't get proved, especially a complex one, how would you determine whether the proof is stuck because of the tactics or because the statement is false?
Geoffrey Irving (Jul 23 2024 at 08:46):
I don’t think Mario is expecting to have a perfect decision rule for complexity.
Shreyas Srinivas (Jul 23 2024 at 08:48):
Daniel Weber said:
I'd love to be able to prove something along the lines of
(f : (α : Type) → [LinearOrder α] → (l : List α) → {l₂ : List α // l₂.Perm l ∧ List.Sorted (· ≤ ·) l₂}) → (complexity f = Ω(n log(n)))
Is there any approach which could, in theory, allow for something like this?

This statement is flawed as is. You are parametrizing over f (and not using it, but that's maybe just a typo). For an upper bound you need a specific f. For a lower bound this is only true for functions based on swapping operations
Shreyas Srinivas (Jul 23 2024 at 08:50):
Geoffrey Irving said:
I don’t think Marco is expecting to have a perfect decision rule for complexity.
It isn't about perfection. These complexity recurrences can quickly get ugly and require some combinatorial theorems and clever application of inequalities and approximations to simplify into a reasonable shape
Daniel Weber (Jul 23 2024 at 09:37):
Shreyas Srinivas said:
Daniel Weber said:
I'd love to be able to prove something along the lines of
(f : (α : Type) → [LinearOrder α] → (l : List α) → {l₂ : List α // l₂.Perm l ∧ List.Sorted (· ≤ ·) l₂}) → (complexity f = Ω(n log(n)))
Is there any approach which could, in theory, allow for something like this?

This statement is flawed as is. You are parametrizing over f (and not using it, but that's maybe just a typo). For an upper bound you need a specific f. For a lower bound this is only true for functions based on swapping operations.

I am using it, in complexity f = Ω(n log(n)). Why is this lower bound not true, for arbitrary α whose only interface is the linear order instance? Can you do anything other than comparisons?
Shreyas Srinivas (Jul 23 2024 at 11:01):
Radix sort
Shreyas Srinivas (Jul 23 2024 at 11:05):
Your lower bound is only true for comparison sorts, i.e. a model where you can access the array in the following two ways:
- Given two indices i and j, you make a compare(i,j) query and get the comparison of the corresponding elements.
- Given two indices, you make a swap(i,j) query to exchange the elements at those indices.
Luigi Massacci (Jul 23 2024 at 11:10):
Shreyas Srinivas said:
Radix sort
or, for an easier example, counting sort, but indeed any other sorting algorithm in that family. As @Shreyas Srinivas is saying, you will need to either specify in a lot more detail what f does (in a way that I think is far from trivial in Lean) or, much more likely (and easily), prove your result for a specific implementation of a specific algorithm
Daniel Weber (Jul 23 2024 at 11:15):
But f takes the type; it can't depend on it being Nat, for instance
Shreyas Srinivas (Jul 23 2024 at 11:25):
That's beside the point. You can generalize it to all hashable types by constructing a suitable hash function with low collision probability. Further, the specific theorem you are going for wouldn't apply as such. You need a way to restrict operations to three-way comparisons and swaps on indices. If any other operation were involved, for instance even being able to access an array element directly, then you could sort into an external bucket array.
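To make the bucket-array point concrete (a minimal sketch of counting sort for naturals below a known bound k; countingSort is a made-up name, and elements ≥ k are silently dropped to keep it short):
-- array indexing, not comparison, is the operation doing the work here,
-- which is why the Ω(n log n) comparison lower bound does not apply
def countingSort (k : Nat) (l : List Nat) : List Nat := Id.run do
  let mut counts := Array.mkArray k 0  -- newer toolchains: Array.replicate
  for x in l do
    if x < k then
      counts := counts.set! x (counts[x]! + 1)
  let mut out := []
  for i in [0:k] do
    out := out ++ List.replicate counts[i]! i
  return out

#eval countingSort 10 [3, 1, 2, 1]  -- [1, 1, 2, 3]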
Shreyas Srinivas (Jul 23 2024 at 11:32):
Also, currently in Lean, you can do something like the following:
import Mathlib

variable (Complexity : (α → β) → (ℕ → ℕ) → Prop)

-- assume f and g both compute the same function with different complexities
example (f g : α → β) (T : ℕ → ℕ) (h_ext : ∀ x : α, f x = g x)
    (ch : Complexity f T) : Complexity g T := by
  observe h_eq : f = g
  rw [h_eq] at ch
  exact ch
Shreyas Srinivas (Jul 23 2024 at 11:34):
So your f could be radix sort and g could be merge sort. You could prove radix sort is linear time and then use the above trick to "prove" that merge sort is linear time.
Shreyas Srinivas (Jul 23 2024 at 11:41):
Your only way to prove this lower bound is to build a mini DSL that only lets you perform three-way comparisons and swaps on elements at a given pair of indices, interpret it on a given array, prove that a proper sorting algorithm in this DSL indeed sorts, and prove that it needs Ω(n log n) queries to do this.
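A minimal sketch of such a DSL (all names are made up here, and the compare query is simplified to a Boolean ≤ test; the interpreter threads a query counter):
inductive SortProg where
  | done
  -- query the order of the elements at indices i and j, then continue
  | compare (i j : Nat) (k : Bool → SortProg)
  -- exchange the elements at indices i and j, then continue
  | swap (i j : Nat) (k : SortProg)

-- interpret a program on an array, counting the queries it makes
def SortProg.run {α : Type} [Inhabited α] (le : α → α → Bool) :
    SortProg → Array α → Array α × Nat
  | .done, a => (a, 0)
  | .compare i j k, a =>
    let (a', n) := (k (le a[i]! a[j]!)).run le a
    (a', n + 1)
  | .swap i j k, a =>
    let (a', n) := k.run le (a.swapIfInBounds i j)
    (a', n + 1)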
Shreyas Srinivas (Jul 23 2024 at 11:42):
A neat trick to generalise this DSL could be to parametrise it by a type that represents all the basic operations allowed. Then you could do proofs in several kinds of query models and perhaps even some RAM models (at least as far as time complexity goes).
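That parametrisation might look like the following free-monad-style sketch (QueryProg, Q, R, and cost are all hypothetical names; R q gives the answer type of each allowed operation q):
-- programs over an abstract set of query operations Q
inductive QueryProg (Q : Type) (R : Q → Type) (α : Type) where
  | pure (a : α)
  | query (q : Q) (k : R q → QueryProg Q R α)

def QueryProg.bind {Q : Type} {R : Q → Type} {α β : Type} :
    QueryProg Q R α → (α → QueryProg Q R β) → QueryProg Q R β
  | .pure a, f => f a
  | .query q k, f => .query q (fun r => (k r).bind f)

instance {Q : Type} {R : Q → Type} : Monad (QueryProg Q R) where
  pure := .pure
  bind := .bind

-- count the queries a program makes, given an oracle answering them
def QueryProg.cost {Q : Type} {R : Q → Type} {α : Type}
    (oracle : (q : Q) → R q) : QueryProg Q R α → Nat
  | .pure _ => 0
  | .query q k => (k (oracle q)).cost oracle + 1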
Shreyas Srinivas (Jul 23 2024 at 11:44):
Daniel Weber said:
I am using it, in complexity f = Ω(n log(n)). Why is this lower bound not true, for arbitrary α whose only interface is the linear order instance? Can you do anything other than comparisons?

Yeah, but where is the evidence that the sorting was done using f?
Daniel Weber (Jul 23 2024 at 12:28):
What do you mean? f by its type must be a sorting function, and I do not see how you can have a sensible complexity measure where the statement "if you have a function which takes an arbitrarily typed list then its worst-case time for a list of size n is Ω(n log(n))" fails. Do you have an example of such a function?
Daniel Weber (Jul 23 2024 at 12:30):
Shreyas Srinivas said:
Also, currently in Lean, you can do something like the following:
import Mathlib

variable (Complexity : (α → β) → (ℕ → ℕ) → Prop)

-- assume f and g both compute the same function with different complexities
example (f g : α → β) (T : ℕ → ℕ) (h_ext : ∀ x : α, f x = g x)
    (ch : Complexity f T) : Complexity g T := by
  observe h_eq : f = g
  rw [h_eq] at ch
  exact ch

Yes, this is a problem, but my type was just an illustration; the actual type can be Algorithm `(a : Type) <| Algorithm `([LinearOrder a]) <| Algorithm `(List a) <| ... or something along those lines
Daniel Weber (Jul 23 2024 at 12:33):
Shreyas Srinivas said:
That's beside the point. You can generalize it to all hashable types by constructing a suitable hash function with low collision probability. Further, the specific theorem you are going for wouldn't apply as such. You need a way to restrict operations to a three-way comparison and swaps on indices. If any other operation were to be involved, for instance, even being able to access the array element, then you can do sorting in an external bucket array.
Have you noticed that the type of f is (α : Type) → [LinearOrder α] → (l : List α) → {l₂ : List α // l₂.Perm l ∧ List.Sorted (· ≤ ·) l₂}? How could any other operation on α be involved, when the only data f gets regarding it is the linear order instance?
Henrik Böving (Jul 23 2024 at 12:35):
In this case you have a List a, so you can use every trick about List that we have available to us.
Daniel Weber (Jul 23 2024 at 12:38):
Yes, you could call List.mergeSort, for instance, but I'm assuming complexity also counts the runtime of functions you call
Shreyas Srinivas (Jul 23 2024 at 13:48):
You are looking for an array lower bound, not a list lower bound
Daniel Weber (Jul 23 2024 at 14:08):
What is the difference?
Shreyas Srinivas (Jul 23 2024 at 14:09):
Element access time given an index
Shreyas Srinivas (Jul 23 2024 at 14:13):
To clarify, the lower bound itself is independent of the underlying data structures, but in a Lean proof using lists or arrays the differences could matter. You would use different lemmas. With my query model you could bundle both proofs into one.
Shreyas Srinivas (Jul 23 2024 at 14:16):
Further, to answer your question about the model/complexity measure: pointer machines are supposed to be able to sort in linear time.
Shreyas Srinivas (Jul 23 2024 at 14:16):
https://dl.acm.org/doi/abs/10.1007/BF01936140
Shreyas Srinivas (Jul 23 2024 at 14:18):
As for your proposal with an Algorithm typeclass, it will only work if you have a way of distinguishing between different algorithms; otherwise you will still find yourself running into the issue I illustrated. Whence the DSL. Btw, this was discussed last year in a thread in the #general channel.
Shreyas Srinivas (Jul 23 2024 at 14:19):
I need to check out of this discussion now, but I'd be happy to hear about any developments on algorithms complexity formalisation in any reasonable model.
Geoffrey Irving (Aug 16 2024 at 20:42):
It is true that mergeSort is O(n log n), and this is optimal, BTW.
Matthew Ballard (Aug 16 2024 at 20:43):
Very nice
Eric Wieser (Feb 20 2025 at 21:00):
I know this is an old thread and Geoffrey has a nice solution already; but does this approach work?
import Mathlib

/-! Step one: define the operations that you wish to count. -/

class MonadicLE (m : Type → Type*) (α : Type*) where
  leM : α → α → m Bool

/-! Step two: implement your abstract computation using only this monadic interface. -/

variable {m} [Monad m] [MonadicLE m α]

def merge (xs ys : List α) : m (List α) := do
  match xs, ys with
  | [], ys => return ys
  | xs, [] => return xs
  | x :: xs, y :: ys =>
    if ← MonadicLE.leM x y then
      return x :: (← merge xs (y :: ys))
    else
      return y :: (← merge (x :: xs) ys)

def mergeSort : List α → m (List α)
  | [] => pure []
  | [a] => pure [a]
  | a :: b :: l => do
    let s := List.MergeSort.Internal.splitInTwo ⟨a :: b :: l, rfl⟩
    merge (← mergeSort s.1) (← mergeSort s.2)
termination_by s => s.length

/-! Step three: implement the typeclass to compute your cost function. -/

instance MonadicLE.ofDecidable (α) [LE α] [DecidableLE α] :
    MonadicLE (StateM Nat) α where
  leM x y := do
    modify (· + 1)
    return x ≤ y
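A usage sketch, for what it's worth (the output shape in the comment is my description, not checked against a toolchain):
-- run the monadic mergeSort in the counting monad, starting the counter at 0
#eval (mergeSort (m := StateM Nat) [3, 1, 2]).run 0
-- returns the sorted list paired with the number of leM calls, e.g. ([1, 2, 3], k)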
Shreyas Srinivas (Feb 20 2025 at 21:01):
FWIW, I discussed the query approach with someone who has done algorithms formalisation in Isabelle and he suggested that having a more systematic combinator language around queries would bring uniformity and clarity to the algorithm specifications.
Eric Wieser (Feb 20 2025 at 21:01):
Clearly you can instantiate MonadicLE (DComp ..) using @Geoffrey Irving's construction, so this approach certainly works for writing down the algorithm; though maybe you still need Comp/DComp to state anything nicely
Eric Wieser (Feb 20 2025 at 21:02):
Shreyas Srinivas said:
and he suggested that having a more systematic combinator language around queries
I guess my claim is that you can just write out all your possible queries as fields of a typeclass, and use a regular monad for the combinators.
Eric Wieser (Feb 20 2025 at 21:04):
This also handles things like @Geoffrey Irving's debate protocol, where some functions in your system only have access to subsets of the queries; you build a typeclass hierarchy, and instead of needing allow and allowAll to call a few-query-permitted algorithm in a many-query-permitted algorithm, you just let typeclass inference run.
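A sketch of that hierarchy (class and function names are made up here, not the debate repo's actual allow/allowAll interface):
-- the weaker interface: comparison queries only
class MonadCompare (m : Type → Type) (α : Type) where
  cmpM : α → α → m Ordering

-- the stronger interface extends it with swap queries
class MonadSwap (m : Type → Type) (α : Type) extends MonadCompare m α where
  swapM : Nat → Nat → m Unit

-- an algorithm that only needs comparisons...
def minM {m : Type → Type} {α : Type} [Monad m] [MonadCompare m α]
    (x y : α) : m α := do
  match ← MonadCompare.cmpM x y with
  | .gt => return y
  | _ => return x

-- ...can be called directly from a context that also permits swaps:
-- typeclass inference finds the MonadCompare parent of MonadSwap
def fromSwapContext {m : Type → Type} {α : Type} [Monad m] [MonadSwap m α]
    (x y : α) : m α :=
  minM x y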
Shreyas Srinivas (Feb 20 2025 at 21:05):
My idea differs from Geoffrey's in that it clearly syntactically separates calls to queries from other operations.
I also discussed the combinator approach with people at ICFP (including Joachim), and one thing that struck us is that perhaps we could automate some of the parts related to deducing complexity recurrences and use external tools like Sage to simplify them
Shreyas Srinivas (Feb 20 2025 at 21:05):
The weakness is that it would take some effort to achieve the same level of expressivity that Geoffrey's approach gets for free from Lean
Eric Wieser (Feb 20 2025 at 21:07):
I don't understand either what the claimed idea of yours is, or why you want queries to be syntactically differentiated beyond the way that ← already distinguishes monadic operations in Geoffrey's approach.
Shreyas Srinivas (Feb 20 2025 at 21:09):
What stops me from non-monadically calling the operations that the queries perform?
Shreyas Srinivas (Feb 20 2025 at 21:09):
And hide them deep in the algorithm
Shreyas Srinivas (Feb 20 2025 at 21:09):
So that someone has to carefully read everything to spot it
Eric Wieser (Feb 20 2025 at 21:14):
Shreyas Srinivas said:
What stops me from non monadically calling the operations that the queries perform?
The type of your function doesn't provide you access to those operations, as it's general over types and monads, without mentioning the relation or the counting monad.
Eric Wieser (Feb 20 2025 at 21:15):
But maybe that argument doesn't work with choice, since a function can use if m = StateM Nat then ...
Shreyas Srinivas (Feb 20 2025 at 21:33):
One can always lift pure functions into the monadic world.
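Concretely, against the MonadicLE interface above, if the signature also happens to demand the pure order instances, nothing forces the work through the counted interface (a sketch; cheatSort is a made-up name, and List.mergeSort's Bool comparator argument is assumed from current core):
def cheatSort {m : Type → Type} {α : Type} [Monad m] [MonadicLE m α]
    [LE α] [DecidableLE α] (l : List α) : m (List α) :=
  -- sorts purely and never touches the counted leM interface,
  -- so the StateM Nat instance above would report zero comparisons
  pure (l.mergeSort (· ≤ ·))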
Last updated: May 02 2025 at 03:31 UTC