Zulip Chat Archive

Stream: CSLib

Topic: Does Query Model allow matrix mul in O(N^2) time ?


TongKe Xue (Dec 06 2025 at 18:24):

Suppose we allow +, -, *, / on arbitrary-precision numbers in one op:
Consider a vector of n elements, each of log n bits. We can encode it as a single big number of (n log n) bits in O(n) time.
We can take the dot product of two vectors in O(1) amortized time: amortize the encoding, do one ordinary integer multiplication of two (n log n)-bit numbers, then use shift/mod to read off the answer.
This would allow multiplying two n x n matrices with O(log n)-bit entries in O(n^2) time:

  1. encode all the rows of matrix A in O(n) time
  2. encode all the rows of matrix B in O(n) time
  3. for each cell of output, do dot product of corresponding row/col in O(1) time

How does Query Model prevent such a formalization?

[The RAM model breaks this because, for some fixed constant c, we can only multiply two (c log n)-bit words in O(1) time; an (n log n)-bit number would be forced to occupy at least n/c cells.]
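As a concrete illustration of the trick in steps 1-3 (a minimal sketch, not from the original thread; the function names `pack` and `dot_via_bigmul` are made up for this demo), the dot product can be read off from one big-integer multiplication, using Python's arbitrary-precision ints as the "one op per arithmetic operation" model:

```python
# Sketch: dot product of two length-n vectors via ONE big-integer multiply.
# Each vector is packed into a single integer with k bits per "digit"; k is
# chosen large enough that the digit-wise convolution never carries over.

def pack(v, k):
    """Encode vector v as sum(v[i] * 2**(k*i)) -- one big integer."""
    x = 0
    for i, a in enumerate(v):
        x |= a << (k * i)
    return x

def dot_via_bigmul(a, b):
    n = len(a)
    w = max(max(a + b), 1).bit_length()      # bits per entry
    k = 2 * w + n.bit_length() + 1           # digit width: n * (2^w)^2 < 2^k
    A = pack(a, k)
    B = pack(b[::-1], k)                     # reverse b so indices i+j = n-1
                                             # pick out exactly a[i]*b[i]
    prod = A * B                             # the single multiplication
    # The coefficient of 2^(k*(n-1)) in prod is sum(a[i]*b[i]).
    return (prod >> (k * (n - 1))) & ((1 << k) - 1)

a = [3, 1, 4, 1, 5]
b = [2, 7, 1, 8, 2]
print(dot_via_bigmul(a, b))  # 35, same as sum(x*y for x, y in zip(a, b))
```

In the RAM model this offers no speedup, because the (n log n)-bit operands span Theta(n) machine words; the question in the thread is what stops a query-model formalization from charging it as O(1).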

Shreyas Srinivas (Dec 06 2025 at 18:25):

What queries are you using?

Shreyas Srinivas (Dec 06 2025 at 18:27):

you keep speaking of "the query model". It's actually many models wrapped into one. If you provide a matrixMult query with input matrices, then you could even accomplish it in one query.

Shreyas Srinivas (Dec 06 2025 at 18:28):

When you want to reason about matrix multiplication, you would fix a parameter w for word size and allow input entries to range from 0 to 2^w - 1

Shreyas Srinivas (Dec 06 2025 at 18:30):

You are not going to get some big gotcha with encoding tricks since you'd have to choose the right kind of queries to get the result you seek

TongKe Xue (Dec 06 2025 at 18:32):

> You are not going to get some big gotcha with encoding tricks since you'd have to choose the right kind of queries to get the result you seek

I disagree with this. With the CLRS-RAM model, you only have to get the model right once.

With the "many different query models" approach, it seems every time you define a new query model, you have to make sure that model defines its "ticks" correctly (a limit on word size, for example).

Shreyas Srinivas (Dec 06 2025 at 18:36):

I hate to break it to you but that's how algorithms theory works. I admit that we don't teach these levels of abstraction in undergrad since we are not particularly concerned about these translations. The combinatorial stuff is much more central to the subject. You can get even more gotchas with the knapsack NP Hardness result. The knapsack problem is NP Hard but has a pseudopolynomial algorithm. This is fine. We know when we need to care about these representation issues and when we don't

Shreyas Srinivas (Dec 06 2025 at 18:36):

There are a gazillion models and variants out there

Shreyas Srinivas (Dec 06 2025 at 18:50):

Btw, there's something called the pointer machine model. Check out how much time integer multiplication takes there

TongKe Xue (Dec 06 2025 at 19:17):

Philosophically:

  1. I want a tiny "trusted" base (ex: CLRS style RAM model).
  2. I do NOT want to have a new "trusted" base for every class of algorithms where we manually assign costs to constructors and then make an informal argument that it is correct.
  3. I believe the way to avoid (2) is to have everyone agree on a CLRS-style RAM model, and then for new "model creators" to provide lean4 verifiable formal proofs their model "compiles down" to CLRS-style RAM models for time/memory costs.

Shreyas Srinivas (Dec 06 2025 at 19:58):

3 is not happening. Sorry.

Shreyas Srinivas (Dec 06 2025 at 19:58):

Many of those models have useful purposes.

Shreyas Srinivas (Dec 06 2025 at 19:59):

The queries framework is the best way to capture all of them. One can always define translations between queries to do reductions for example.

Shreyas Srinivas (Dec 06 2025 at 20:00):

But the idea of “everyone agreeing to a strict definition of a ram model” and agreeing to write their algorithms fully formally in it is just not happening. Informally everybody agrees in some agnostic notion of a RAM model, but even there many other models exist for good reasons.

TongKe Xue (Dec 06 2025 at 20:10):

I agree with 99% of your idea. The only part I disagree on is this notion that for each Query Model, humans assign a cost to each constructor, and there is some informal, socially accepted "proof" that the cost of each constructor is correct.

I believe this "informal + socially accepted proof" should be replaced by a Lean4-verified "mini compiler" that targets some commonly accepted CLRS-style RAM model, together with a Lean4 proof that each constructor indeed takes only X computation and Y memory.

I think this is an accurate stance on what we agree/disagree on (and it appears neither of us will convince the other of whether there should be a formal lean4 proof of the cost of each constructor of each query model.)

Shreyas Srinivas (Dec 06 2025 at 20:27):

You are misunderstanding the purpose of this subject. It is not to produce implementable algorithms. That’s a nice added bonus every theorist will like, but not enough to put in the necessary grunt work that offers no fundamentally new insights. The evidence lies in how many procedures of the last 20 years are actually simple enough to be implementable, let alone implemented. Even where implementation is possible, it can be done at a different level of abstraction so as to not obscure the key insights of an algorithm.

Shreyas Srinivas (Dec 06 2025 at 20:30):

What is of interest is “given a set of basic operations, what can I compute with so and so complexity”

Shreyas Srinivas (Dec 06 2025 at 20:32):

For example, given matrix multiplication, can I maintain such and such data structures dynamically ?

Ching-Tsun Chou (Dec 06 2025 at 21:11):

Would it be possible to formalize the Zeno machines as a query model? If yes, let’s do that and collapse all those pesky complexity classes to O(1) and bring the whole field of algorithm design to a successful conclusion. (Needless to say, the preceding paragraph is not meant to be taken seriously.)

Shreyas Srinivas (Dec 06 2025 at 21:12):

It’s also not a great argument.

TongKe Xue (Dec 06 2025 at 21:30):

I want:

  1. assume X
  2. we can show that algorithm i has running time t(i) and memory usage s(i)

You want:

  1. for i = 0, ... ; assume model_i
  2. then for algorithm A_j in query_model_i, it uses time t(j) and memory usage s(j)

In the above, X = CLRS style ram model.

You want lots of "islands" each with their own assumptions. I just want a single assumption.

I think that is the core of our difference, and neither of us will convince the other.

Shreyas Srinivas (Dec 06 2025 at 21:33):

those islands can be related by someone who wants them related

Shreyas Srinivas (Dec 06 2025 at 21:33):

Others can continue building their islands

Shreyas Srinivas (Dec 06 2025 at 21:33):

Typically you will find that there are many ways to link those islands.

Shreyas Srinivas (Dec 06 2025 at 21:33):

That's for the interested person to solve

Marcelo Fornet (Dec 07 2025 at 20:37):

Both approaches seem compatible to me: we develop the query meta-model, and on top of it we develop the RAM model … and other models.

Then everyone can work on the part they care the most! In particular a lot of the algorithms will be developed for the RAM model due to its practical importance, but connections between models can be made anyway.

Shreyas Srinivas (Dec 07 2025 at 21:04):

Exactly. If anything this might introduce a tower of abstractions connected through layers of interpretation functions.

Eric Wieser (Dec 07 2025 at 21:05):

The query complexity design ought to permit something like "evaluate the queries using this RAM model"

Shreyas Srinivas (Dec 07 2025 at 21:05):

It doesn’t forbid it

Eric Wieser (Dec 07 2025 at 21:06):

I don't remember if FreeM already has a monadic run that lets you implement the queries in another monad; but it would be easy enough to add it if it is missing.

Shreyas Srinivas (Dec 07 2025 at 21:06):

You can always translate from one query model to another. The point is, both conceptually and for formalisation, I think it is the abstraction that works best.

Shreyas Srinivas (Dec 07 2025 at 21:11):

it cleanly separates algorithmic insights for a particular algorithm in a model from insights into how its primitives can be implemented. There are bound to be many ways for the latter. So you can also structure the library to allow multiple complexity theorems for each algorithm corresponding to different translations.

Shreyas Srinivas (Dec 07 2025 at 21:14):

This separation is incidentally only implicit in the manual tick model, but will be automatic in the query model (by the query type)

Eric Wieser (Dec 07 2025 at 21:14):

One thing that is hard with the query model is space complexity, since you can't prevent the implementation using storage in the Lean type theory

Eric Wieser (Dec 07 2025 at 21:16):

This is out of my area of expertise, but probably systems with more exotic things like linear types can protect against this.


Last updated: Dec 20 2025 at 21:32 UTC