Zulip Chat Archive

Stream: CSLib

Topic: Proposal on Time Complexity

Sorrachai Yingchareonthawornchai (Sep 25 2025 at 09:11):

I am thinking about the simplest way to analyze the time complexity of an algorithm in Lean. Later, we can port these algorithms into more elegant machinery such as Boole.

I have written an example of merge sort using the Cost Monad to track the running time. What do you feel about this type of analysis? What I like about this simple approach is that the correctness and runtime proofs remain separate.

Comments/suggestions are welcome.

structure CostM (α : Type) where
  val : α
  cost : ℕ

namespace CostM

def pure (a : α) : CostM α :=
  ⟨a, 0⟩

def bind (m : CostM α) (f : α → CostM β) : CostM β :=
  let r := f m.val
  ⟨r.val, m.cost + r.cost⟩

instance : Monad CostM where
  pure := pure
  bind := bind
-- Increment cost
def tick {α : Type} (a : α) (c : ℕ := 1) : CostM α :=
  ⟨a, c⟩
-- We define `@[simp]` lemmas for the `.cost` field, similar to how we did for `.val`.
@[simp] theorem cost_of_pure (a : α) : (pure a).cost = 0 := rfl
@[simp] theorem cost_of_bind (m : CostM α) (f : α → CostM β) : (CostM.bind m f).cost = m.cost + (f m.val).cost := rfl
@[simp] theorem cost_of_tick {α} (a : α) (c : ℕ) : (tick a c).cost = c := rfl
@[simp] theorem val_bind (m : CostM α) (f : α → CostM β) : (CostM.bind m f).val = (f m.val).val := rfl

def merge (xs ys : List ℕ) : CostM (List ℕ) :=
  let rec go : List ℕ → List ℕ → CostM (List ℕ)
  | [], ys => tick ys ys.length
  | xs, [] => tick xs xs.length
  | x::xs', y::ys' =>
    if x ≤ y then
      do let rest ← go xs' (y::ys')
         tick (x :: rest)
    else
      do let rest ← go (x::xs') ys'
         tick (y :: rest)
  go xs ys


def mergeSort ( xs: List ℕ ) : CostM (List ℕ) :=
  if xs.length < 2 then .pure xs
  else
  let n := xs.length
  let half := n / 2
  let left := xs.take half
  let right := xs.drop half
  do
    let sortedLeft ← mergeSort left
    let sortedRight ← mergeSort right
    merge sortedLeft sortedRight
termination_by xs.length decreasing_by (
  · simp only [List.length_take, inf_lt_right, not_le, gt_iff_lt]
    subst n
    simp_all
    rename_i h
    exact Nat.log2_terminates xs.length h
  · grind
)

Shreyas Srinivas (Sep 25 2025 at 10:59):

This is effectively what @Geoffrey Irving already did in the debate formalisation. They also prove the comparison sort lower bound

Shreyas Srinivas (Sep 25 2025 at 11:01):

This method can be generalised to a combinator language parameterised by a query language that I have mentioned many times on Zulip and discord. That allows us to choose custom cost models.

Shreyas Srinivas (Sep 25 2025 at 11:04):

The upside to the monad approach is that you get to reuse lean machinery for the combinator side. The downside is you might be able to sneak in some pure computations. For example you might be able to call merge sort inside a let statement and then just pass the result to the cost function

Shreyas Srinivas (Sep 25 2025 at 11:05):

The ideal method will allow writing Dafny style require and ensure conditions that allow you to write a functional sorting algorithm and then say “our computation is identical to this” and then move to the functional realm for the rest. Then the computational complexity bit will be handled in the Dafny style code listing.

Bas Spitters (Sep 25 2025 at 12:20):

There's also this line of work:
https://www.cs.cmu.edu/~rwh/papers/calf/popl22.pdf

Shreyas Srinivas (Sep 25 2025 at 13:32):

Bas Spitters said:

There's also this line of work:
https://www.cs.cmu.edu/~rwh/papers/calf/popl22.pdf

how much of this translates to imperative code? Wouldn't this require writing a cost aware DSL?

Sorrachai Yingchareonthawornchai (Oct 02 2025 at 07:23):

here is correctness and running time analysis

Shreyas Srinivas (Oct 02 2025 at 09:19):

What stops me from sneaking in the tick function with argument 0 somewhere?

Shreyas Srinivas (Oct 02 2025 at 09:28):

To improve this approach a bit, maybe the array should be put behind the monad and made accessible through some query operations. I think this was the approach in the debate formalisation. Otherwise, nothing stops me from calling a normal function for some operations on some sub arrays which are not counted by this Cost monad.

Alex Meiburg (Oct 02 2025 at 14:49):

Yeah, to me this is not a very satisfactory notion of time complexity. If you proved various bounds on the "time complexity" a function this way, you could never even in principle translate that to (say) something about how quickly a Turing machine can compute it.

Alex Meiburg (Oct 02 2025 at 14:50):

I think this kind of tick operation could maybe be used for talking about query or communication complexity, where a get or send function is the only place you need a tick, and so you somehow bundle those together.

Shreyas Srinivas (Oct 02 2025 at 14:53):

I am formalising communication complexity (right now) iand it's much simpler to leave the complexity stuff to the operational semantics

Alex Meiburg (Oct 02 2025 at 14:53):

Or, in your example, it's already surprising to me: typically sorting algorithms are timed based on the number of comparisons needed, so while you wrote

def merge (xs ys : List ℕ) : CostM (List ℕ) :=
  let rec go : List ℕ → List ℕ → CostM (List ℕ)
  | [], ys => tick ys ys.length
  | xs, [] => tick xs xs.length
  | x::xs', y::ys' =>
    if x ≤ y then
      do let rest ← go xs' (y::ys')
         tick (x :: rest)
    else
      do let rest ← go (x::xs') ys'
         tick (y :: rest)
  go xs ys

I think it would be more standard to express

def merge (xs ys : List ℕ) : CostM (List ℕ) :=
  let rec go : List ℕ → List ℕ → CostM (List ℕ)
  | [], ys => pure ys
  | xs, [] => pure xs
  | x::xs', y::ys' =>
    let c ← tick (x ≤ y)
    if c then
      do let rest ← go xs' (y::ys')
         pure (x :: rest)
    else
      do let rest ← go (x::xs') ys'
         pure (y :: rest)
  go xs ys

Alex Meiburg (Oct 02 2025 at 14:53):

Shreyas Srinivas said:

I am formalising communication complexity (right now) iand it's much simpler to leave the complexity stuff to the operational semantics

Yeah, I agree that's better. Just trying to explain how I feel about a tick-based approach

Shreyas Srinivas (Oct 02 2025 at 14:54):

In general I think if the user can/has to specify the tick calls, then it leaves room for bad specifications

Shreyas Srinivas (Oct 02 2025 at 14:55):

And with CSLib we need to be extra careful since there has been some mention of AI generated code being a part of the project.

Shreyas Srinivas (Oct 02 2025 at 14:58):

One way to make this tick approach work better (but not ideal) is to define a query monad for the compare and swap operations on the array, and a small step op semantics for this monad, which tracks calls to the queries and interprets their operations on an array as well as counts the number of query calls. That way local modifications of the array have no effect on the semantics side.

Sorrachai Yingchareonthawornchai (Oct 02 2025 at 15:48):

Thank you for the good discussion so far. The only downside is that tick instrumentation requires manual verification to ensure the ticks are counted correctly. Other than that, it is pretty flexible to count in a way you think can be implemented. Ideally, it would be preferable not to have any manual checks at all, and we aim to achieve this in the long run.

Sorrachai Yingchareonthawornchai (Oct 02 2025 at 15:50):

Shreyas Srinivas said:

One way to make this tick approach work better (but not ideal) is to define a query monad for the compare and swap operations on the array, and a small step op semantics for this monad, which tracks calls to the queries and interprets their operations on an array as well as counts the number of query calls. That way local modifications of the array have no effect on the semantics side.

I think you are referring to formalizing a query model where you are not allowed to "cheat" the system by performing list manipulation outside the query. This is a good idea if you want to formalize a comparison-based model.

Shreyas Srinivas (Oct 02 2025 at 16:26):

I think the bigger issue with this approach is that for TCS we need a proper specification system that cannot be easily gamed

Shreyas Srinivas (Oct 02 2025 at 16:27):

Because our specifications are usually much larger than math and correspondingly it is an engineering nightmare to make sure they are correct.

Shreyas Srinivas (Oct 02 2025 at 16:27):

My query idea is more general than comparison based models.

Shreyas Srinivas (Oct 02 2025 at 16:28):

Almost all computational models in TCS can be expressed as query models. These queries of the model are the unit operations we perform. This not only applies to standard models like word RAM and Real RAM but also for those situations where you want to additionally perform blackbox queries

Shreyas Srinivas (Oct 02 2025 at 16:29):

So for example, if you want to make blackbox queries to the FOCS'22 max flow algorithm or SSSP algorithm, you can extend it to include these calls as queries.

Shreyas Srinivas (Oct 02 2025 at 16:30):

Additionally you can include non-traditional models such as the online model or the LCA model as query models.

This generalisation is useful because one can reason about the combinators (loops, conditionals, sequential operations, non-deterministic calls) independently of the underlying query model

Shreyas Srinivas (Oct 02 2025 at 16:32):

Another thing I want to emphasise is that in the current approach you are allowing the manipulation of semantic information about the algorithm (namely how many steps the algorithm counts) with the syntax (the program spec). These are two separate concerns. It will be hard-to-impossible to prove that your semantics are sound (if at all they are). Conceptually, your costing mechanism should be inaccessible inside the syntax. In some program logic settings they call such parameters ghost variables.

Sorrachai Yingchareonthawornchai (Oct 02 2025 at 16:41):

Happy to discuss more. Do you have the implementation of your query models or sample code to start the discussion?

Shreyas Srinivas (Oct 02 2025 at 19:05):

@Sorrachai Yingchareonthawornchai : in addition to the example I mailed in DM see the deepmind groups formalisation of debate as well as sampcert (published at PLDI)

Shreyas Srinivas (Oct 02 2025 at 19:38):

https://github.com/girving/debate/blob/f5a4cf1692c9d501c7a922081dc3cf3965e57802/Comp/Sort.lean

Shreyas Srinivas (Oct 02 2025 at 19:44):

Also you can see that this discussion about designing the monad to minimise the amount of code that needs to be manually checked has happened here

https://leanprover.zulipchat.com/#narrow/channel/144837-PR-reviews/topic/.2318629.20runtime.20complexity.20of.20sorting.20a.20list/near/483290907

Juanjo Madrigal (Oct 06 2025 at 06:05):

I've been reading about the CALF framework mentioned above and related work; it is very interesting and solves some of the issues discussed here. I had a couple of questions

Do you think it would be better to build a very general framework in Lean (which may be quite difficult) and after that start estimating algorithms, or perhaps it is better to start contributing estimations with a simpler approach and adapt them progressively as a more general framework takes shape?
I read somewhere that the goal is not only to verify algorithms from a theoretical point of view, but to make them available as verified algorithms that can be used elsewhere. Perhaps adapting these algorithms to a very general framework may incur in loss of performance in runtime that make them unusable?

Quang Dao (Oct 06 2025 at 08:40):

I mentored an undergrad (Ian Martin) this summer and he worked on translating Calf to Lean. The translation is incomplete, but may be worth taking a look in any case: https://github.com/ianm1129/Polytime-Formalizations/tree/master

Jonathan Sterling (Oct 06 2025 at 12:46):

Hi! I don't check this Zulip regularly, but someone pointed me to this thread regarding Calf. The idea of building something like this in Lean is really cool! Let me comment on one potential pitfall to be careful of.

One of the main ideas of Calf is to use an axiomatic phase distinction to prevent intensional/quantitative information (e.g. instrumentation of programs with time steps) from interfering with the extensional behaviour of functions. The simplest semantic model of this would be in the category Arr(Set), i.e. the category whose objects are functions between sets and whose morphisms are commuting squares. Then the "extensional" stuff lives in the codomains and the "quantitative" stuff lives upstairs. The codomain functor Arr(Set) -> Set lets you purge out the intensional/quantitative stuff and reason extensionally about programs. This is incorporated internally to calf by means of a phase modality. In addition to enforcing noninterference, it also gives you a simple way to state theorems like "mergesort == insertionsort modulo cost".

Anyway, this aspect cannot be incorporated into a Lean implementation because Lean has classical logic built-in. In the presence of LEM, the only "phases" are the total phase and the empty phase, which means that the calf system becomes degenerate when written in Lean.

You can still do cool stuff without the phase distinction. But I just wanted to warn you in advance, that to use a phase distinction like calf, you need to work in a type theory that doesn't assume LEM globally. One thing that is interesting about the Arr(Set) model I mentioned is that although Arr(Set) refutes LEM, we can recover Set inside Arr(Set) in two different ways using two modalities; one of them is the "phase modality" that purges all the quantitative stuff, and the other is the corresponding "sealing modality" that "moves" all the extensional/non-quantitative stuff into the quantitative world. (For topos theorists, this is an open/closed partition of the topos; others can feel free to ignore that remark). Anyway, what this means is that you could have a system where LEM is present but restricted in certain ways so as to apply only in these modal subuniverses.

Shreyas Srinivas (Oct 06 2025 at 14:13):

Not even in a deeply embedded DSL?

Shreyas Srinivas (Oct 06 2025 at 14:13):

I thought that was the proposal, given the juxtaposition with Boole.

Jonathan Sterling (Oct 08 2025 at 13:49):

Hi! A deep embedding could potentially work, depending on the details.

Shreyas Srinivas (Oct 08 2025 at 15:05):

I also wonder whether we couldn’t use a typeclass to capture prop types with the desired phase separation. We do use this idea in a different situation (keeping linear and non linear contexts) for BI logic in iris

Shreyas Srinivas (Oct 10 2025 at 16:45):

This is partially relevant:

https://leanprover.zulipchat.com/#narrow/channel/113486-announce/topic/Loom.3A.20framework.20for.20Multi.20Modal.20program.20verifiers.20in.20Lean/near/544200084

Fabrizio Montesi (Nov 27 2025 at 10:27):

Can somebody exemplify how (some of) the code in cslib#165 would look in the 'non-gameable' design? Is it gonna work by means of annotations?

Shreyas Srinivas (Nov 27 2025 at 10:44):

I have commented a link to an example in the PR.

Shreyas Srinivas (Nov 27 2025 at 10:44):

Sorry mentioned it.

Shreyas Srinivas (Nov 27 2025 at 10:45):

But this topic has come up several times and I actually linked the comparison sort Lower bound

Fabrizio Montesi (Nov 27 2025 at 10:46):

this? https://github.com/girving/debate/blob/f5a4cf1692c9d501c7a922081dc3cf3965e57802/Comp/Sort.lean

Shreyas Srinivas (Nov 27 2025 at 10:51):

Yes

Shreyas Srinivas (Nov 27 2025 at 10:52):

It works because iirc they work entirely with queries and implement them separately (at least that’s my memory from 1.5 years ago)

Shreyas Srinivas (Nov 27 2025 at 10:53):

At least the merge sort function is not annotated
https://github.com/girving/debate/blob/f5a4cf1692c9d501c7a922081dc3cf3965e57802/Comp/Sort.lean#L229

Fabrizio Montesi (Nov 27 2025 at 11:25):

Is the key this line then? https://github.com/girving/debate/blob/f5a4cf1692c9d501c7a922081dc3cf3965e57802/Comp/Sort.lean#L226

Fabrizio Montesi (Nov 27 2025 at 11:37):

I agree that fighting manual annotation errors is worthy, but that code looks much more complex. I wonder if it could be made simpler/more ergonomic.
An alternative would be to take the TimeM approach in cslib#165 and create a 'private' library of carefully-vetted and growing-over-time lifted definitions of existing Lean defs, like List.cons (with appropriate costs). Then the 'userspace' algorithms we wanna prove things about could just invoke the API in that library.

But I'd like to get started from somewhere first, so that we can compare code alternatives. Furthermore, even 'time' is just one of the interesting metrics one can have on programs, and in the future we'll have to figure out a multi-metric approach. So I'd be happy with a polished version of cslib#165 that can then later be used as reference for future discussions.

Shreyas Srinivas (Nov 27 2025 at 11:39):

You’ll find that the review burden scales poorly

Shreyas Srinivas (Nov 27 2025 at 11:40):

In the case of algorithms community, there is very little awareness of these issues. CSLib will be seen as an authority on correct formalisation practices

Shreyas Srinivas (Nov 27 2025 at 11:40):

I’d rather we don’t have false starts.

Shreyas Srinivas (Nov 27 2025 at 11:41):

If you want to get algorithmic content in CSLib I have a different suggestion

Shreyas Srinivas (Nov 27 2025 at 11:43):

When proving correctness of algorithms I anticipate that we will find it easier to prove that a given Boole program has the same input-output relationship as a functional programming implementation

Shreyas Srinivas (Nov 27 2025 at 11:43):

And then we prove the correctness properties for this implementation.

Shreyas Srinivas (Nov 27 2025 at 11:43):

Essentially it’s a proof of Boole code refining a functional spec

Shreyas Srinivas (Nov 27 2025 at 11:44):

We can start by writing purely functional versions of the algorithms and proving only their correctness

Shreyas Srinivas (Nov 27 2025 at 11:44):

When Boole arrives we will only need to prove the time complexity and refinement stuff

Shreyas Srinivas (Nov 27 2025 at 11:46):

Proving correctness of algorithms in the functional style is already feasible in lean. We have some basic results in lean core and mathlib.

Shreyas Srinivas (Nov 27 2025 at 11:46):

But there is a whole lot we could still do.

Shreyas Srinivas (Nov 27 2025 at 11:47):

There is the entire “Functional Algorithms verified” book in Isabelle which can be ported to lean

Shreyas Srinivas (Nov 27 2025 at 11:48):

And in graph algorithms we can write purely functional and set-theoretic versions of these algorithms and show that they satisfy their invariants

Fabrizio Montesi (Nov 27 2025 at 11:49):

Proving the functional correctness of Lean code is something we can start doing now for sure, and PRs about that are welcome.
But what about their complexity properties (of Lean code; I see what you mean with Boole code)?

Shreyas Srinivas (Nov 27 2025 at 11:49):

I’d say that given that we don’t yet have functional correctness we could start working there until Boole arrives

Shreyas Srinivas (Nov 27 2025 at 11:50):

Incidentally I think we can also retrofit loom with a different monad and use that for time complexity. But Ilya’s group is already working on it.

Shreyas Srinivas (Nov 27 2025 at 11:51):

From both a technical and social viewpoint I think it is important that we get it right as far as time complexity goes. A false start will be a huge setback given how little the community understands the difficulties involved and will therefore rely on the example set by authority figures.

Shreyas Srinivas (Nov 27 2025 at 11:53):

Let’s not rush the time complexity side in haste when we might very well have a good solution in a couple of months, especially when we still have a lot of work to do on the functional correctness side (especially since this can be put to use in verifying the imperative side later)

Fabrizio Montesi (Nov 27 2025 at 11:54):

Shreyas Srinivas said:

From both a technical and social viewpoint I think it is important that we get it right as far as time complexity goes. A false start will be a huge setback given how little the community understands the difficulties involved and will therefore rely on the example set by authority figures.

I understand, but that can be dealt with by being informative and very direct about limitations in the docstrings. I'm afraid 'not doing anything until we have the perfect solution' will lead to less progress, not more. We should be working on the perfect solution, and we should have some code to use as benchmark.

Fabrizio Montesi (Nov 27 2025 at 11:54):

Shreyas Srinivas said:

Let’s not rush the time complexity side in haste when we might very well have a good solution in a couple of months, especially when we still have a lot of work to do on the functional correctness side (especially since this can be put to use in verifying the imperative side later)

What solution will appear in a couple of months?

Shreyas Srinivas (Nov 27 2025 at 11:55):

Fabrizio Montesi said:

Shreyas Srinivas said:

From both a technical and social viewpoint I think it is important that we get it right as far as time complexity goes. A false start will be a huge setback given how little the community understands the difficulties involved and will therefore rely on the example set by authority figures.

I understand, but that can be dealt with by being informative and very direct about limitations in the docstrings. I'm afraid of 'not doing anything until we have the perfect solution'. We should be working on the perfect solution, and we should have some code to use as benchmark.

This makes sense if we can’t make any progress at all. But here we can. There is an enormous amount of work to be done on the correctness and discrete math side.

Shreyas Srinivas (Nov 27 2025 at 11:56):

Fabrizio Montesi said:

Shreyas Srinivas said:

Let’s not rush the time complexity side in haste when we might very well have a good solution in a couple of months, especially when we still have a lot of work to do on the functional correctness side (especially since this can be put to use in verifying the imperative side later)

What solution will appear in a couple of months?

I’m told that the creators of Loom have already managed to create a DSL on top of it, similar to Velvet m, which can count complexity.

Fabrizio Montesi (Nov 27 2025 at 11:57):

I'd be very (but pleasantly) surprised if Loom machinery were CSLib-ready and in an accepted PR in a couple of months. :-)

Shreyas Srinivas (Nov 27 2025 at 12:03):

It’s more refined than manually counting ticks inside a monad explicitly in its current state because of its verification condition generator. But my original point remains. We have work to do that will be necessary when Boole arrives and which will simplify our task when it does.

Shreyas Srinivas (Nov 27 2025 at 12:04):

For what it’s worth, counting ticks has been proposed several times on this Zulip since early 2023 and rejected for the same reasons. It’s something every TCS person who is new to lean comes up with.

Fabrizio Montesi (Nov 27 2025 at 12:06):

What's the argument against having the separate library of 'tick annotations' for Lean functions?

Shreyas Srinivas (Nov 27 2025 at 12:07):

Same as above. He who controls the ticks determines the complexity. The larger the spec, the harder it is to check.

Shreyas Srinivas (Nov 27 2025 at 12:07):

In algorithms our specs (defs, algorithms, theorem statements) are much larger than in pure math.

Fabrizio Montesi (Nov 27 2025 at 12:08):

Sounds like the specs themselves will pose interesting ergonomics issues.

Fabrizio Montesi (Nov 27 2025 at 12:09):

Anyway, I'll of course be very happy to see PRs about functional correctness only, without time complexity.

Shreyas Srinivas (Nov 27 2025 at 12:09):

I suggest not going ahead with cslib#165

Fabrizio Montesi (Nov 27 2025 at 12:10):

I just don't see how getting started with something about time complexity, as imperfect as it may be, won't be useful to make progress (w/ appropriate strong warnings about reliability). It's the only proposal I have right now.

Shreyas Srinivas (Nov 27 2025 at 12:10):

It will set the wrong precedent and you will have to redo the work again later.

Shreyas Srinivas (Nov 27 2025 at 12:11):

And you will have to explain to people who have no clue about the difficulties of theorem proving why you are going through all this and then need Boole.

Shreyas Srinivas (Nov 27 2025 at 12:13):

There is a stupendous communication gap here because of how little understanding there is of ITP among algorithms theory folks in general have. The few people on this Zulip and some courses students are the exception. Not the norm. A few docstrings will not address this communication gap.

Shreyas Srinivas (Nov 27 2025 at 12:13):

Sometimes, nothing is better than something, if that something is going to prove misleading.

Shreyas Srinivas (Nov 27 2025 at 12:16):

And it’s not like we don’t have enough to do outside Boole.

Shreyas Srinivas (Nov 27 2025 at 12:17):

We hardly have any spectral graph theory. We hardly have basic probability lemmas like concentration bounds. We don’t even have definitions for so many graph decompositions or reduction gadgets.

Shreyas Srinivas (Nov 27 2025 at 12:21):

All of which is to say, if we want to make progress that will also prove useful when Boole arrives, there are many more basic things we can (and should) do, instead of false starts in time complexity that will have to be repeated later and will cause review burden in the meantime.

Shreyas Srinivas (Nov 27 2025 at 12:36):

Actually I just checked and we don’t even have a definition of strongly connected components of a Digraph because that API is so new. Only a few hours ago I posted a recent definition of Walks in #graph theory, which does not use dependent indices (which lead to massive headaches in definitions on Walks). There is currently no definition of Walks for Digraph in mathlib.

Shreyas Srinivas (Nov 27 2025 at 12:40):

Given all these tasks at hand, I don’t see a reason to pursue a naive and flawed approach to time complexity which is supposed to be a stopgap for a more systematic solution, and also adds substantial review burden, and later, repetition of all this work for Boole.

Shreyas Srinivas (Nov 27 2025 at 12:46):

Btw, if you want concrete reachable goals, I can give some right away. I am half way through developing a lot of Digraph API for flow problems. There are many possibilities here.

Fabrizio Montesi (Nov 27 2025 at 14:16):

Many of those are interesting things, which I'd be happy to see PRs for.

Getting back to this specific topic: why is the TimeM approach doomed? As far as I can see, the approach is simple and modular. So instead of invoking 'naked' operations directly and weaving the ticks in the algorithm, in the very near future we could already invoke lifted 'annotated' operations (like merge sort itself, but also more basic ones like List.cons). @Sorrachai Yingchareonthawornchai is that correct?
In the future we might even be able to build some semi-automated support for this.

Fabrizio Montesi (Nov 27 2025 at 14:28):

From all the code linked and approaches discussed so far, I haven't seen one that'd give a substantially better situation -- unless we switch to a deeply-embedded language, but then we'd still have to trust the translation from such language to Lean, the verification of which would require... a time-complexity analysis for Lean code, as proposed in here. This all looks a bit circular to me, so starting from the simplest and immediately available thing (Lean) currently seems pretty reasonable. Am I missing something?

Shreyas Srinivas (Nov 27 2025 at 14:29):

@Fabrizio Montesi when you need to review a thousand lines of specification before you get to a proof, you don’t want there to be too much leeway to tinker with standard things in the spec.

Shreyas Srinivas (Nov 27 2025 at 14:30):

Fabrizio Montesi said:

From all the code linked and approaches discussed so far, I haven't seen one that'd give a substantially better situation -- unless we switch to a deeply-embedded language, but then we'd still have to trust the translation from such language to Lean, the verification of which would require... a time-complexity analysis for Lean code, as proposed in here. This all looks a bit circular to me, so starting from the simplest and immediately available thing (Lean) currently seems pretty reasonable. Am I missing something?

That’s a much more systematic approach (and most likely what Boole is doing and loom does)?

Shreyas Srinivas (Nov 27 2025 at 14:30):

There is a very good reason PL researchers invest so much effort into proof engineering

Fabrizio Montesi (Nov 27 2025 at 14:30):

Shreyas Srinivas said:

Fabrizio Montesi when you need to review a thousand lines of specification before you get to a proof, you don’t want there to be too much leeway to tinker with standard things in the spec.

Are you talking about time complexity specs? Can you give an example?

Shreyas Srinivas (Nov 27 2025 at 14:31):

Try writing any modern STOC/FOcS paper in this framework. The specs (def/theorem statements alone) will alone exceed a thousand lines easily

Shreyas Srinivas (Nov 27 2025 at 14:32):

Actually just check how big some of these theorem statements are in pen and paper, when fleshed out in detail, including all the custom definitions.

Shreyas Srinivas (Nov 27 2025 at 14:32):

Math paper theorem statements are tiny in comparison

Shreyas Srinivas (Nov 27 2025 at 14:33):

Please keep in mind that the algorithm code is part of the spec. As are the tick annotations

Shreyas Srinivas (Nov 27 2025 at 14:39):

Fabrizio Montesi said:

From all the code linked and approaches discussed so far, I haven't seen one that'd give a substantially better situation ?

That substantially better situation comes from a DSL like Boole and Loom. They have all these proofs for translation. It takes time and effort to engineer them. Haphazardly putting something together looks nice but this means you are going to repeatedly prove concrete theorems again and again instead of abstracting the essentials, and allowing too much flexibility in tinkering with complexity computations which should be purely determined by the composition of operations that are being counted.

Shreyas Srinivas (Nov 27 2025 at 14:49):

And this is not waiting for something perfect. This is waiting for something that’s good enough to make formalisation of algorithms reasonably systematic. If ticks were a solution, we would have long since had algorithms theory in lean. As I mentioned earlier, I have seen this idea on this very Zulip a few times now.

Fabrizio Montesi (Nov 27 2025 at 15:08):

Are you saying we won't be able to modularly use theorems like that about the time complexity of merge sort in proofs about the time complexity of algorithms that invoke it?

Shreyas Srinivas (Nov 27 2025 at 15:17):

No I am saying it would depend on me annotating all calls correctly. This is far too dangerous in a spec which has to be manually checked, especially when the algorithms get bigger. Another way to put it, the composition theorems for complexity would be only valid for correctly annotated functions, and there is no automated check for that or a good formal definition of “correctly annotated”.

In an acceptable approach, it should not be so easy to sneak in mistakes. The complexity of a function should follow from the selection of basic operations and composition theorems without any human annotation on what to count.

Sebastian Ullrich (Nov 27 2025 at 16:22):

Shreyas Srinivas said:

The complexity of a function should follow from the selection of basic operations and composition theorems without any human annotation on what to count.

To me that sounds like a metaprogram that transforms regular Lean functions into TimeM programs. The metaprogram rules would be basically identical to the timing rules a DSL would have to declare and so should result in similar assurances about correct counting, but your specification remains a natural Lean program that can be executed (and proofread) without tick annotation overhead, best of both worlds.

Shreyas Srinivas (Nov 27 2025 at 16:24):

Actually if you get the time complexity before the translation then you wont need to translate the program into TimeM

Shreyas Srinivas (Nov 27 2025 at 16:24):

Just a normal lean function would be fine

Shreyas Srinivas (Nov 27 2025 at 16:25):

My issue with the current PR is, it just lets the algorithm writer choose when and how much to tick an operation.

Sebastian Ullrich (Nov 27 2025 at 16:25):

Not sure what you mean by "get" here. Are you saying the metaprogram could be a verifier instead of annotater?

Shreyas Srinivas (Nov 27 2025 at 16:26):

Oh you mean you go from "lean function" to "TimeM"

Shreyas Srinivas (Nov 27 2025 at 16:26):

That could work if we could tell the metaprogram which function calls to count systematically I guess

Sebastian Ullrich (Nov 27 2025 at 16:27):

Yes. Same a DSL would have to decide on

Shreyas Srinivas (Nov 27 2025 at 16:29):

Yeah that would work. That would remove the manual tick annotation. Would we need to prove that the metaprogram worked correctly?

Shreyas Srinivas (Nov 27 2025 at 16:31):

Is it simpler to write this metaprogram than do what Geoffrey did in the sorting lower bound?

Sebastian Ullrich (Nov 27 2025 at 16:32):

The metaprogram could construct a proof that throwing away the ticks results in the input program. I don't think there's much to prove about the ticks part, it would be as axiomatic as in any DSL approach.

Sebastian Ullrich (Nov 27 2025 at 16:37):

Shreyas Srinivas said:

Is it simpler to write this metaprogram than do what Geoffrey did in the sorting lower bound?

Their approach seems to be very specific to counting comparisons via an oracle, I don't see this scaling to arbitrary Lean functions in a natural way

Shreyas Srinivas (Nov 27 2025 at 16:39):

Yeah this sounds much nicer. I once asked Claude to generate an AST for a combinator language parameterised by a Query type. I thought that would generalise his approach.

Claude Generated this. It might have errors. I have a slightly different version without non-determinism

-- Algorithmic combinator type parametrized by query type
inductive Combinator (Query : Type) (α : Type) : Type where
-- Primitive query operation
| query : Query → Combinator Query α

-- Pure value (no computation)
| pure : α → Combinator Query α

-- Sequential composition / monadic bind
| bind : Combinator Query β → (β → Combinator Query α) → Combinator Query α

-- Conditional branching
| cond : Combinator Query Bool → Combinator Query α → Combinator Query α → Combinator Query α

-- Iteration/loops - while loop with state
| while : (σ → Combinator Query Bool) → (σ → Combinator Query σ) → σ → Combinator Query σ

-- Bounded iteration - for loop
| for : Nat → σ → (Nat → σ → Combinator Query σ) → Combinator Query σ

-- Procedure call/recursion - fixed point combinator
| fix : ((α → Combinator Query β) → (α → Combinator Query β)) → α → Combinator Query β

-- Parallel composition (when supported by model)
| par : Combinator Query α → Combinator Query β → Combinator Query (α × β)

-- Choice/nondeterminism (when supported)
| choice : List (Combinator Query α) → Combinator Query α

-- Syntactic sugar for common patterns
def Combinator.map (f : α → β) (c : Combinator Query α) : Combinator Query β :=
Combinator.bind c (Combinator.pure ∘ f)

def Combinator.seq (c1 : Combinator Query α) (c2 : Combinator Query β) : Combinator Query β :=
Combinator.bind c1 (fun _ => c2)

-- If-then-else sugar
def Combinator.ite (b : Combinator Query Bool) (t e : Combinator Query α) : Combinator Query α :=
Combinator.cond b t e

-- Repeat n times
def Combinator.repeat (n : Nat) (c : Combinator Query Unit) : Combinator Query Unit :=
Combinator.for n () (fun _ _ => Combinator.seq c (Combinator.pure ()))

-- Sequential composition of list of computations
def Combinator.sequence : List (Combinator Query α) → Combinator Query (List α)
| [] => Combinator.pure []
| c :: cs => Combinator.bind c (fun x =>
Combinator.bind (Combinator.sequence cs) (fun xs =>
Combinator.pure (x :: xs)))

I didn't pursue this beyond the basics after CSlib was announced due to time constraints and the fact that they were actually paying someone to do this.

Shreyas Srinivas (Nov 27 2025 at 16:40):

The complexity composition theorems can be defined on this type and the AST could be compiled to a lean function (partial maybe?)

Sebastian Ullrich (Nov 27 2025 at 16:42):

Right, though at that point you're basically writing programs in a DSL again, and indeed you would likely want syntax sugar on top of that

Shreyas Srinivas (Nov 27 2025 at 16:46):

True. The nice thing is, I can define all the models in sequential computation as well as custom models as an inductive type (any number of RAM variants, pointer machines, Real number models). So these are the only operations which are counted and for which I have to provide an interpretation.

Sebastian Ullrich (Nov 27 2025 at 16:49):

Note though that with any such semi-shallow representation, you have zero control over the complexity of pure subterms. With a metaprogram, you have all the control.

Shreyas Srinivas (Nov 27 2025 at 16:49):

So in my version, i didn't actually have a constructor for pure, par and choice. I was afraid that I could stuff everything inconvenient into pure

Sebastian Ullrich (Nov 27 2025 at 16:50):

But surely your bind still takes a Lean function that you have no control over

Shreyas Srinivas (Nov 27 2025 at 16:50):

Yes. I never found a satisfactory fix for that.

Sebastian Ullrich (Nov 27 2025 at 16:51):

In effect, the only way to make this completely robust would be for a metaprogram to create this representation, but at that point I think we are at the overkill stage compared to more targeted metaprogram approaches we were discussing.

Shreyas Srinivas (Nov 27 2025 at 16:54):

One work around might be to put the data structures behind the wall of interpretation.

Shreyas Srinivas (Nov 27 2025 at 16:56):

I didn't implement it yet so I don't have a concrete idea of how well it will work. But basically you also parametrise the interpret function by a global type of all data structures you use, which can only be accessed by queries. It is complicated to return the value of type\alpha though. Concretely, I am not sure what the \alpha ought to represent.

Shreyas Srinivas (Nov 27 2025 at 16:56):

In the context of mergesort, this data type would an array and queries would be get (i : Nat), compare (i j : Nat).. and swap (i j : Nat)

Shreyas Srinivas (Nov 27 2025 at 16:58):

since Combinator isn't carrying this array around, a pure function or bind function can't actually touch the array directly. You only have the explicit data structure parameter in the "interpret" function that interprets the combinator AST into a real lean function

Fabrizio Montesi (Nov 27 2025 at 17:11):

Sebastian Ullrich said:

Shreyas Srinivas said:

The complexity of a function should follow from the selection of basic operations and composition theorems without any human annotation on what to count.

To me that sounds like a metaprogram that transforms regular Lean functions into TimeM programs. The metaprogram rules would be basically identical to the timing rules a DSL would have to declare and so should result in similar assurances about correct counting, but your specification remains a natural Lean program that can be executed (and proofread) without tick annotation overhead, best of both worlds.

This is very near to the 'semi-automated' help I was writing about. Plus, possibly, restricting access to pure and tick to the trusted library of operations (like the annotated version of List.cons).
Metaprogramming could perhaps also help with some automatic annotations.

Shreyas Srinivas (Nov 27 2025 at 17:13):

Fabrizio Montesi said:

Sebastian Ullrich said:

Shreyas Srinivas said:

The complexity of a function should follow from the selection of basic operations and composition theorems without any human annotation on what to count.

To me that sounds like a metaprogram that transforms regular Lean functions into TimeM programs. The metaprogram rules would be basically identical to the timing rules a DSL would have to declare and so should result in similar assurances about correct counting, but your specification remains a natural Lean program that can be executed (and proofread) without tick annotation overhead, best of both worlds.

This is very near to the 'semi-automated' help I was writing about. Plus, possibly, restricting access to pure and tick to the trusted library of operations (like the annotated version of List.cons).
Metaprogramming could perhaps also help with some automatic annotations.

It wouldn’t be semi automated. Imagine you have an attribute on top of a lean function that lists function calls you care about. The metaprogram would fully automatically generate a TimeM version with tick at each call of the function. So it would be fully automated.

Shreyas Srinivas (Nov 27 2025 at 17:14):

Sebastian Ullrich said:

In effect, the only way to make this completely robust would be for a metaprogram to create this representation, but at that point I think we are at the overkill stage compared to more targeted metaprogram approaches we were discussing.

Btw as Fabrizio’s message reminds me, you can also cheat in the TimeM monad as it exists in the PR

Fabrizio Montesi (Nov 27 2025 at 17:15):

That part, yes (I don't think you need the list btw, if we have a database of such functions). What I mean is that we'd still have to manually curate the database of lifted basic operations (like operations on Lean lists, arrays, ..., from the standard library).

Fabrizio Montesi (Nov 27 2025 at 17:16):

Shreyas Srinivas said:

Sebastian Ullrich said:

In effect, the only way to make this completely robust would be for a metaprogram to create this representation, but at that point I think we are at the overkill stage compared to more targeted metaprogram approaches we were discussing.

Btw as Fabrizio’s message reminds me, you can also cheat in the TimeM monad as it exists in the PR

Yes, that's why I'm talking about restricting access to pure and tick.

Shreyas Srinivas (Nov 27 2025 at 17:18):

I am still keen on making the combinator approach work with the fix I suggested above because it represents a good opportunity to define several sequential models that are masquerading as “the” RAM model in the literature currently.

I don’t think you should rely on the lean module system as a way to stop accessing functions. The goal is to formally be able to prove things about them while still not being able to use them in code to cheat.

Fabrizio Montesi (Nov 27 2025 at 17:20):

Wouldn't that give us very non-idiomatic lean code? (I admit I might be a bit lost in all the messages now.)

Btw, with the TimeM approach, I also get to use the current manually-indtrumented implementation as a test for the future metaprogramming facility.

Shreyas Srinivas (Nov 27 2025 at 17:20):

It would give us idiomatic algorithms pseudocode which is then compiled into lean code.

Shreyas Srinivas (Nov 27 2025 at 17:21):

Which is arguably a nice thing to have.

Shreyas Srinivas (Nov 27 2025 at 17:21):

And it would allow us to reason about algorithms whose implementation details have not been fleshed out fully.

Chris Henson (Nov 27 2025 at 17:22):

From the perspective of reviewing/maintenance, I worry about introducing this before we have had a chance to see Boole land in the main branch, which to my understanding will cover some of the same ground? After seeing the subtleties discussed in this thread, I don't believe I personally have the expertise or bandwidth to review.

Shreyas Srinivas (Nov 27 2025 at 17:22):

(Which I think is important in modern algorithms theory)

Shreyas Srinivas (Nov 27 2025 at 17:24):

Chris Henson said:

From the perspective of reviewing/maintenance, I worry about introducing this before we have had a chance to see Boole land in the main branch, which to my understanding will cover some of the same ground? After seeing the subtleties discussed in this thread, I don't believe I personally have the expertise or bandwidth to review.

that would also be my suggestion for the near term. This combinator query is something I want to mess with independently, but any PR to do complexity without waiting for Boole will create a lot of overlapping work. And as I argued before, we have a lot we can do that will link up to Boole code in the future anyway.

Fabrizio Montesi (Nov 27 2025 at 17:34):

Shreyas Srinivas said:

It would give us idiomatic algorithms pseudocode which is then compiled into lean code.

Ah, I see what you mean. But this would indeed overlap with Boole and other WIP, so I'd wait at least a little bit to hear progress on that.

The TimeM approach is appealing precisely because it's 'lightweight', albeit this makes it more complicated to build a trust layer around it.

Shreyas Srinivas (Nov 27 2025 at 17:36):

My take is, we had several variants of TimeM floating around since the end of mathlib3->mathlib4 port. We waited for a proper language because of the difficulties of managing this lightweight approach. I think the Boole people and initial CSLib steering committee members who backed it knew this as well.

Shreyas Srinivas (Nov 27 2025 at 17:44):

Shreyas Srinivas said:

since Combinator isn't carrying this array around, a pure function or bind function can't actually touch the array directly. You only have the explicit data structure parameter in the "interpret" function that interprets the combinator AST into a real lean function

@Sebastian Ullrich : Would love to hear your take on this

Shreyas Srinivas (Nov 27 2025 at 17:54):

Also I just noticed that the Claude definition of fix is wrong (it's coinductive)

Fabrizio Montesi (Nov 27 2025 at 18:02):

@Shreyas Srinivas Again, I get what you mean, but it's a different problem. How will I analyse the time complexity of arbitrarily-structured Lean code with that approach?

Shreyas Srinivas (Nov 27 2025 at 18:04):

You can’t. But lean code is also not exactly great to verify time complexity with. Sebastian’s solution gets rid of the manual annotation of each function call. But this is still verifying our interpretation of lean function’s time complexity.

Shreyas Srinivas (Nov 27 2025 at 18:06):

To be clear we are now talking about two related things.

The TimeM monad approach as it exists in the PR
Sebastian’s suggestion to metaprogram lean functions into TimeM and automatically tag calls to pre-specified functions.

Shreyas Srinivas (Nov 27 2025 at 18:06):

1 is not ideal for all the reasons discussed above. 2 is nice because it is lightweight and requires no manual annotation of ticks inside the def of an algorithm.

Sebastian Ullrich (Nov 27 2025 at 18:08):

1 is also good prototyping towards 2 though, I don't find it objectionable if it comes with a clear plan

Shreyas Srinivas (Nov 27 2025 at 18:09):

For 2, you need a metaprogram that converts lean functions and a theorem that shows that the lean function and its TimeM counterpart are extensionally equal.

Fabrizio Montesi (Nov 27 2025 at 18:11):

Shreyas Srinivas said:

To be clear we are now talking about two related things.

The TimeM monad approach as it exists in the PR

Sebastian’s suggestion to metaprogram lean functions into TimeM and automatically tag calls to pre-specified functions.

Exactly, these are not disjoint, planning wise. As I mentioned earlier on, the idea is to use 1 to work on the interface (TimeM) and then work on 2 as we gain experience through 1. The intermediate step is to build a good library of functions so that we can start asking to use little to no ticks in PRs, or something like that.

So, in a nutshell, 1 still looks very useful to me, with the warnings that I talked about.

Shreyas Srinivas (Nov 27 2025 at 18:13):

Well yeah the warnings are important. We should also have some sort of a policy on merging PRs which come with tick annotated functions until 2 is ready.

Fabrizio Montesi (Nov 27 2025 at 18:19):

My idea was to push people towards making and expanding this 'trusted' library of lifted operations and time-verified algos.

We could even make some static checkers and safe/unsafe blocks to help with checking the code.

Shreyas Srinivas (Nov 27 2025 at 18:20):

I think you are headed towards the complexity sebastian was talking about. Currently even if there is a safe function with tick, you still need to call tick on it when you call the function.

Shreyas Srinivas (Nov 27 2025 at 18:21):

The goal should be to get the metaprogram ready asap. the mergesort code can prove useful as a test for the metaprogram (and so migrate to a file in the test folder). EDIT: Actually any manually annotated TimeM program which you think would help device the metaprogram probably ought to be used in testing anyway.

Shreyas Srinivas (Nov 27 2025 at 18:23):

Fabrizio Montesi said:

We could even make some static checkers and safe/unsafe blocks to help with checking the code.

~~People have discussed the difficulties of adding linear type theory and quantitative type theoretic frameworks on top of lean. I think the conclusion was that this was practically impossible.~~ EDIT : Oh you mean for the test cases? Yeah that makes sense

Shreyas Srinivas (Nov 27 2025 at 18:29):

Ideally the main repository folder should not contain this unsafe code.

Fabrizio Montesi (Nov 27 2025 at 18:29):

Shreyas Srinivas said:

I think you are headed towards the complexity sebastian was talking about. Currently even if there is a safe function with tick, you still need to call tick on it when you call the function.

(This is now more a bit of stream of consciousness..)
Couldn't we give the user the possibility to declare the function as safe (or unsafe, conversely) somehow and then activate a linter/checker that complains every time you don't tick a call to a TimeM term?

Shreyas Srinivas (Nov 27 2025 at 18:30):

I don't think it is a good idea to have the unsafe stuff in the main library

Shreyas Srinivas (Nov 27 2025 at 18:31):

Since they are test cases for the metaprogram, it makes sense to have them as objects for building future tests in the corresponding folder

Fabrizio Montesi (Nov 27 2025 at 19:21):

I think we can have a somewhat controllable directory or small unsafe blocks, and then safe algorithms like merge sort that build on them. We'll need a trusted Lean layer for a long time, even with Boole or similar approaches, so I want to start exploring this soon.

Tanner Duve (Dec 04 2025 at 09:10):

I've also mostly been in the "wait for Boole to work on time complexity" camp but I do have a suggestion for this query complexity stuff which uses a free monad to avoid the issue of gaming the tick function. It resembles @Shreyas Srinivas combinator language. Something like the following:

Define an AST/functor describing the primitive ops of our computation model:

inductive QueryF : Type → Type where
  | read  : Nat → QueryF Nat
  | write : Nat → Nat → QueryF PUnit
  | cmp   : Nat → Nat → QueryF Bool
  -- etc: whatever the RAM / comparison model needs

Define a free monad of programs over the model:

abbrev Prog (α : Type) := FreeM QueryF α

With some basic control flow like this:

namespace Prog

def cond {α} (b : Prog Bool) (t e : Prog α) : Prog α :=
  b.bind (fun b' => if b' then t else e)

def forLoop (n : Nat) (body : Nat → Prog PUnit) : Prog PUnit :=
  let rec go : Nat → Prog PUnit
    | 0       => pure ()
    | i + 1   =>
      body i >>= fun _ => go I
  go n

end Prog

Then all programs are written in Prog and their complexity is measured by interpreting/folding into CostM instead of manually annotating with ticks:

def costOfQuery : {ι : Type} → QueryF ι → Nat
  | _, .read _       => 1
  | _, .write _ _    => 1
  | _, .cmp _ _      => 1

def costInterp : {ι : Type} → QueryF ι → CostM ι
  | _, .read i      => CostM.tick 0 (costOfQuery (.read i))
  | _, .write i v   => CostM.tick PUnit.unit (costOfQuery (.write i v))
  | _, .cmp i j     => CostM.tick false (costOfQuery (.cmp i j))

def costProg {α : Type} (p : Prog α) : Nat :=
  (p.liftM costInterp).cost

This doesn't give full time complexity but seems to address some of the issues wrt query complexity

Tanner Duve (Dec 04 2025 at 09:24):

We can write another interpreter that actually executes the program:

def evalQuery : {ι : Type} → QueryF ι → ι
  | _, .read _      => 0
  | _, .write _ _   => PUnit.unit
  | _, .cmp x y     => x ≤ y

def evalProg {α : Type} (p : Prog α) : α :=
  FreeM.foldFreeM id
    (fun {ι} (op : QueryF ι) (k : ι → α) =>
      k (evalQuery op))
    p

And then write a mergeSort as follows:

def cmpVal (x y : Nat) : Prog Bool :=
  FreeM.lift (QueryF.cmp x y)

def merge : List Nat → List Nat → Prog (List Nat)
  | [], ys => pure ys
  | xs, [] => pure XS
  | x :: xs', y :: ys' => do
      let b ← cmpVal x y
      if b then
        let rest ← merge xs' (y :: ys')
        pure (x :: rest)
      else
        let rest ← merge (x :: xs') ys'
        pure (y :: rest)

def split (xs : List Nat) : List Nat × List Nat :=
  let rec go : List Nat → List Nat → List Nat → List Nat × List Nat
    | [], accL, accR => (accL.reverse, accR.reverse)
    | [x], accL, accR => ( (x :: accL).reverse, accR.reverse )
    | x :: y :: xs, accL, accR => go xs (x :: accL) (y :: accR)
  go xs [] []

-- partial for now didn't feel like writing termination proof
partial def mergeSort : List Nat → Prog (List Nat)
  | []      => pure []
  | [x]     => pure [x]
  | XS      =>
    let (left, right) := split XS
    do
      let sortedLeft  ← mergeSort left
      let sortedRight ← mergeSort right
      merge sortedLeft sortedRight

#eval evalProg (mergeSort [5,3,8,6,2,7,4,1])
#eval costProg (mergeSort [5,3,8,6,2,7,4,1])

And our #eval outputs are [1, 2, 3, 4, 5, 6, 7, 8] and 12 respectively

Shreyas Srinivas (Dec 04 2025 at 12:25):

I love this idea.

Shreyas Srinivas (Dec 04 2025 at 12:25):

Just one extra step : abstract the query type as a parameter and declare a type class that requires the query type to have an interpretation function and counting function

Shreyas Srinivas (Dec 04 2025 at 12:27):

(deleted)

Shreyas Srinivas (Dec 04 2025 at 19:16):

@Fabrizio Montesi thoughts?

Shreyas Srinivas (Dec 04 2025 at 19:19):

@Tanner Duve what’s your definition of free monads btw?

Shreyas Srinivas (Dec 04 2025 at 19:19):

Is this using freer monads

Tanner Duve (Dec 04 2025 at 19:43):

Yeah its a freer monad, the traditional free monad definition (eg Haskell) fails termination checking due to strict positivity

Tanner Duve (Dec 04 2025 at 19:43):

All free monad stuff is in here: https://github.com/leanprover/cslib/tree/main/Cslib/Foundations/Control/Monad

Tanner Duve (Dec 04 2025 at 19:44):

The definition is here

Ching-Tsun Chou (Dec 05 2025 at 19:07):

I’m no expert in this topic. But I wonder if there is any plan to ground these rather abstract complexity models on something more “concrete “ like RAM model?

Shreyas Srinivas (Dec 05 2025 at 19:08):

The proposal for Boole is such a concrete model in some sense. But realistically that would be too restrictive for algorithms research and theory.

Ching-Tsun Chou (Dec 05 2025 at 19:20):

But surely there need to be some sort of grounding so that one knows that the abstract model is within (say) a constant factor of a more concrete model like the RAM model? Otherwise one would not know reliably whether a complexity result in terms of the more abstract model actually says anything about real-world performance, right?

Shreyas Srinivas (Dec 05 2025 at 20:19):

Not really. No theoretical complexity model has any meaningful connection with what you run and measure on a recent computer.

Shreyas Srinivas (Dec 05 2025 at 20:21):

Not even the RAM model. It’s a good line of inquiry to build more realistic models and relate them to query models (outside lean), but that’s also orthogonal to formulating the theory in terms of queries in lean in the first place.

Shreyas Srinivas (Dec 05 2025 at 20:24):

Secondly a lot of theoretical models are designed purely for theoretical insight. You want to measure the theoretically best achievable result in a very hypothetical model that is weaker than real models or prove lower bounds in a stronger model that then translate to a weaker model.

Ching-Tsun Chou (Dec 05 2025 at 20:48):

Shreyas Srinivas said:

Not really. No theoretical complexity model has any meaningful connection with what you run and measure on a recent computer.

Then I have to say I don’t understand why you would care about whether the “tick” can be gamed in the first place.

Shreyas Srinivas (Dec 05 2025 at 20:55):

Scale: If you pre-select queries (like Tanner illustrated above) and the complexity is compositionally determined, then you only need to check the query type to know what operations are counted and the statement of correctness and complexity of the algorithm. In the manual tick approach you also have to check that each occurrence of a specific operation is marked with a tick inside the algorithm definition. Additionally, you don’t actually get to make a more fine grained count of each query separately.

Sorrachai is perfectly correct that algorithms theory is more directly concerned with the lightweight measurement of queries rather than exact complexity. If you search this Zulip and the lean discord, I am sure you will find that I have made this point several times in the last two years . My objection to the time monad is how much manual checking this will entail of the spec as opposed to the combinator/freer monad approach. The time monad approach is not nearly as compositional as the latter. Scale problems will generally get worse as we have more algorithms of increasing complexity.

Shreyas Srinivas (Dec 05 2025 at 20:59):

Another benefit of a parametrised query approach is how many different standard models and custom oracles it could subsume in one set of composition theorems for complexity. This benefit is lost in the manual tick approach because of loss of compositionality.

Ching-Tsun Chou (Dec 05 2025 at 23:06):

Note that my comments were not about the particulars of time monad or any other models. I was merely asking whether there is any plan to ground the more abstract models on the more concrete models, perhaps in the form of some simulation theorem where (for example) the run time on the former can be bounded by the run time on the latter with a constant factor. I have no skin in which model should be chosen, but I do care about whether the results proved using a model has some connection to real-world performance. The connection doesn’t have to be direct or even very close, but it should be there and we should have theorems about it.

Shreyas Srinivas (Dec 05 2025 at 23:09):

You are asking for something that is far too complicated for modern commodity hardware + compilers + tool stack. The best we can mathematically do is what theorists already do. Asymptotic measurement of the number of calls to specific primitives (queries).

Ching-Tsun Chou (Dec 06 2025 at 00:07):

Remember that I brought up the RAM model?

Shreyas Srinivas (Dec 06 2025 at 00:07):

Yes. But the RAM model itself is not one model.

Ching-Tsun Chou (Dec 06 2025 at 00:10):

Pick the one you like. Is there a relationship between that RAM model and the query based model you can prove?

Shreyas Srinivas (Dec 06 2025 at 00:11):

Probably. But it is uninteresting and tedious

Ching-Tsun Chou (Dec 06 2025 at 00:11):

Why?

Shreyas Srinivas (Dec 06 2025 at 00:12):

Because it is full of implementation details that are trivially understood but tedious to pin down precisely. You are welcome to experiment with it.

Shreyas Srinivas (Dec 06 2025 at 00:12):

the RAM "model" would just be another query type with a rather long list of queries

Shreyas Srinivas (Dec 06 2025 at 00:13):

Further as I mentioned before, algorithms theory only cares about this connection "morally". Our primary work is combinatorial procedures that explain how we construct something from access to some other procedure (query)

TongKe Xue (Dec 06 2025 at 04:10):

Is there a paper/survey I can read up on to understand how this "Query model" measures time ?

I understand what a "step" of a TM is.
For the "RAM" model, I approximate it as: imagine MIPs, except instead of 32bit registers/words, we have registers/words that are k*logn size for some k, where n = size of input.

Then we can compile a language like a mini-OCaml/mini-Haskell to RAM model.

Then we can write NP reductions succintly in mini-OCaml/Haskell, if a few log n factors here & there don't matter; or write it in "mips assembly w/ k*log n size registers) if getting the right running time really matters [like Dijkstras].

How does the "query model" compare to this ?

Eric Wieser (Dec 06 2025 at 05:08):

Tanner Duve said:

With some basic control flow like this:

I thought the point of the monad is that you can just use do notation with native for loops and ifs?

Shreyas Srinivas (Dec 06 2025 at 10:07):

TongKe Xue said:

Is there a paper/survey I can read up on to understand how this "Query model" measures time ?

I understand what a "step" of a TM is.
For the "RAM" model, I approximate it as: imagine MIPs, except instead of 32bit registers/words, we have registers/words that are k*logn size for some k, where n = size of input.

Then we can compile a language like a mini-OCaml/mini-Haskell to RAM model.

Then we can write NP reductions succintly in mini-OCaml/Haskell, if a few log n factors here & there don't matter; or write it in "mips assembly w/ k*log n size registers) if getting the right running time really matters [like Dijkstras].

How does the "query model" compare to this ?

The query model is a general idea that subsumes multiple models. There query here is an inductive type of basic operations (possibly parametrised by more inputs).

The RAM model is one such query model. You write a query type whose constructors are the unit operations of the RAM model and assign unit cost to all of them.
The operations of a Turing machine can be encoded as another query type. Mathlib already has such a type of operations for one of its Turing machine types.

As for literature, this idea seems to be evolving folklore. I can give you my perspective of the history, but that’s a long and off-topic answer.

Shreyas Srinivas (Dec 06 2025 at 10:09):

But part of it is my own story of convincing both the lean community and my corner of the algorithms to understand this intermediate lightweight abstraction of complexity which is closer to what algorithms theorists write in their papers than focus on implementations. I got a concrete version of this idea from Wouter Swierstra’s PiWare in roughly Spring 2023 . But it’s also basically one of the oldest ideas in the PL world : write a DSL that captures all the compositions you care about and compile it down to other code. Once you abstract away the basic operations type, you can obtain multiple models and complexities of algorithms in them from the same composition theorems for complexity.

TongKe Xue (Dec 06 2025 at 11:24):

It is not clear to me the advantage of this over

define a RAM model once; each instr = 1 tick
let people build whatever High Level Language they want in the form of a compiler to the RAM model
there's no way to "cheat" time/space in the HLL, as we measure both from the RAM model

Shreyas Srinivas (Dec 06 2025 at 11:25):

There is no cheating happening. It is a perfectly valid question to ask how may DFS or cut queries or matrix multiplication queries it took to get an algorithm working.

Shreyas Srinivas (Dec 06 2025 at 11:26):

I am sorry but if you think algorithms theory at the research is about producing implementable code you write in undergrad classes and LeetCode, you are simply mistaken.

Shreyas Srinivas (Dec 06 2025 at 11:28):

Also there is no “the RAM model”. I encourage you to try writing a Real RAM to word RAM compiler including handling all the floating point issues. You will see how many many trivial issues lead to formalisation headaches. It’s also far removed from algorithmic content

Shreyas Srinivas (Dec 06 2025 at 11:30):

Secondly even in complexity theory, oracles are a thing independent of the standard models. These are by definition implementation agnostic

TongKe Xue (Dec 06 2025 at 11:32):

I think you and I have different motives. I want to formalize (1) CLRS style proofs and (2) https://theory.cs.princeton.edu/complexity/ style complexity theory.

Shreyas Srinivas (Dec 06 2025 at 11:32):

On the contrary we both have the same motives. I just recognise that CLRS style formalisation won’t scale for research level CS theory.

Shreyas Srinivas (Dec 06 2025 at 11:33):

And the query + combinator model will give you a better way of working with Turing machines as well than raw transition function applications.

Shreyas Srinivas (Dec 06 2025 at 11:33):

But we are beginning to veer off-topic here

Shreyas Srinivas (Dec 06 2025 at 11:34):

Boole is supposed to be designed to implement CLRS style algorithms per the announcement of CSLib. Maybe that’s something you want to wait for.

Shreyas Srinivas (Dec 06 2025 at 11:38):

Do note that even CLRS plays a bit fast and loose with their RAM model as long as the word size is irrelevant (they become relevant when you have pseudo polynomial time algorithms)

Shreyas Srinivas (Dec 06 2025 at 11:39):

If you want to go as far as implementing word ram, you might end up implementing your own int and float types and that would be overkill.

TongKe Xue (Dec 06 2025 at 11:40):

I'm curious why you insist in assigning costs at constructors of your "Query Model" vs just writing a compiler from your "Query Model" to the CLRS-RAM Model, and just counting instrs at the CLRS-RAM model level.

Shreyas Srinivas (Dec 06 2025 at 11:43):

Because I understand the abstraction at which algorithms theory works and the pointless tedium of trying to implement and/or compile something to some real world model and its complete irrelevance to algorithmic complexity

Shreyas Srinivas (Dec 06 2025 at 11:43):

We discussed all this way back in 2023 in a thread about computational complexity. I think I was talking about the query model more vaguely back then.

Shreyas Srinivas (Dec 06 2025 at 11:45):

I also see the unifying power of a query model to abstract dozens of computational models as well as hundreds of bespoke query models. Just yesterday I was chatting with someone about property testing substring matches and naturally that also fits into a query model.

Shreyas Srinivas (Dec 06 2025 at 11:46):

Also I object to “the CLRS RAM model” being a single well defined term. There is almost certainly more than one model. They have a chapter on geometric algorithms iirc

Shreyas Srinivas (Dec 06 2025 at 11:50):

This discussion has dragged on and I have answered all your questions. To get back on track

I have a PR to @Tanner Duve’s PR where I abstract the query type as a parameter

TongKe Xue (Dec 06 2025 at 17:41):

Does your query model allow people to do * and + ?

Consider a problem of size n, where we will be using (log n) sized registers.

Multiplying two numbers of log-n bits is O(1) on RAM model.
Multiplying two numbers of size (n log n) bits is atleast big-Omega(n) <-- just to read the input and output.

In your query model, does both count as "1 multiplication" ?

The way this is resolved in the RAM model is that registers are of size k log n for some fixed constant k; thus, we can't stuff a (n log n)-bit number in a single register. So we need to spread it out in big-Theta(n/k) different registers. Even in your "Query Model", I don't see how you get around the issue of tracking that all your registers/numbers are O(log n) bits.

Shreyas Srinivas (Dec 06 2025 at 17:43):

I think you are missing the point. You pick the unit operations. To be up front if we did care about bit complexity of multiplication we can use the corresponding cost function. You should know that even in these models, you assume something like bigint and count the bit size of numbers to deduce cost.

Shreyas Srinivas (Dec 06 2025 at 17:45):

The cost is simply abstracted away by taking log2 of those numbers and assigning to operations their bit complexity. How they are distributed among registers is an irrelevant implementation detail.

Shreyas Srinivas (Dec 06 2025 at 17:51):

(deleted)

Shreyas Srinivas (Dec 06 2025 at 17:54):

I am answering your questions, but we are veering substantially off-topic here. Both the time monad and query-combinator approach are being pushed by people who actually work in this area of research (Sorrachai and me, respectively). If you wish to continue this discussion, I believe a new topic should be started. Our discussion above is specifically about how to implement this lightweight query approach.

Tanner Duve (Dec 06 2025 at 18:35):

Eric Wieser said:

Tanner Duve said:

With some basic control flow like this:

I thought the point of the monad is that you can just use do notation with native for loops and ifs?

Yeah you could just use do notation, the control flow stuff I wrote wasn't necessary

Tanner Duve (Dec 06 2025 at 18:36):

Shreyas Srinivas said:

This discussion has dragged on and I have answered all your questions. To get back on track

I have a PR to Tanner Duve’s PR where I abstract the query type as a parameter

just merged it into my branch

Fabrizio Montesi (Dec 06 2025 at 19:39):

Catching up. Question: this is merge sort in cslib#201,

def merge : List Nat → List Nat → SortProg (List Nat)
  | [], ys => pure ys
  | xs, [] => pure xs
  | x :: xs', y :: ys' => do
      let b ← cmpVal x y
      if b then
        let rest ← merge xs' (y :: ys')
        pure (x :: rest)
      else
        let rest ← merge (x :: xs') ys'
        pure (y :: rest)

What prevents me from 'cheating' with a pure normal comparison rather than invoking cmpVal? (I haven't read all the code yet, but would appreciate a pointer.)

Eric Wieser (Dec 07 2025 at 08:55):

Writing the algorithm over an abstract type without any comparison operators helps a bit

Eric Wieser (Dec 07 2025 at 08:58):

When you come to proving correctness of the algorithm, you'll get stuck because I could pass in a comparator that reverses the order, and you'd be ignoring it and using the regular one

Fabrizio Montesi (Dec 07 2025 at 09:20):

What do you mean? Changing the signature of that merge sort example so not to use Nat?

Fabrizio Montesi (Dec 07 2025 at 09:21):

Won't we want to specify that algo and query model over any type with a comparison op?

Shreyas Srinivas (Dec 07 2025 at 09:56):

My PR to Tanners PR was incomplete at the point of merging. I might try to change it so that the query monad returns a program ast (a sequence or tree of queries), such that one has to evaluate them to produce a program. Tanner goes further and lifts from this monad to the time monad.

Shreyas Srinivas (Dec 07 2025 at 09:56):

The monad should eventually just return an ast that is interpreted.

Eric Wieser (Dec 07 2025 at 21:11):

Fabrizio Montesi said:

Won't we want to specify that algo and query model over any type with a comparison op?

You indeed want to generalize to any type, but you don't want to include the comparison operation in the signature of the algorithm; the whole point is to force the algorithm to make a query to obtain it

Eric Wieser (Dec 07 2025 at 21:13):

You can then prove something like ((mySortAlg l).run rel).Sorted rel to show the algorithm is correct, and look at the "ast" of (mySortAlg l) to evaluate complexity. This is what @Geoffrey Irving's proofs of sorting complexity do, roughly.

Shreyas Srinivas (Dec 08 2025 at 00:15):

The idea is that the underlying data structure (in this case the array) is never directly accessible to the program in the Prog monad.

Shreyas Srinivas (Dec 08 2025 at 00:15):

It can only create query tokens in an AST.

Shreyas Srinivas (Dec 08 2025 at 00:16):

It is the interpretation/evaluation function that interprets those queries over the array.

Frederick Pu (Dec 08 2025 at 04:13):

i dont know if someone mentioned this already, but there should probably be a shallow embedding of simply typed lambda caclulus into lean

Frederick Pu (Dec 08 2025 at 04:15):

so some sort of a notation cdef that allows you to simultaneously define a quoted functino in simply typed lambda calulcus along with it's corresponding monadic function and a proof of equivalence between the two. That way you can just prove properties about the unquoted functions for correctness and analyze the time complexity of the quoted function (the quoted function wont be part of a function type so you can just induct over the type of simply typed functions)

Frederick Pu (Dec 08 2025 at 04:16):

and then if you want to embed lean structures (preferrably in teh Type univserse) then you can have a QuotedEncodable typeclass that you get an instance of using `driving QuotedEncodable

Fabrizio Montesi (Dec 08 2025 at 19:42):

Eric Wieser said:

Fabrizio Montesi said:

Won't we want to specify that algo and query model over any type with a comparison op?

You indeed want to generalize to any type, but you don't want to include the comparison operation in the signature of the algorithm; the whole point is to force the algorithm to make a query to obtain it

So this targets something different than 'let's verify the time complexity in something that looks like annotated Lean code', instead aiming at 'offering a hard-to-get-wrong-by-AST-trickery approach to algorithm specification' through the 'usual' Free Monad interpretation trick. Correct?

Shreyas Srinivas (Dec 08 2025 at 19:43):

It’s a more systematic approach to :check: monad

Shreyas Srinivas (Dec 08 2025 at 19:43):

It makes the basic operations and their meaning conceptually separated from the complexity counting.

Fabrizio Montesi (Dec 08 2025 at 19:44):

that's pretty clear (and gives also the parametric complexity model advantage). I'm trying to pinpoint the cons.

Shreyas Srinivas (Dec 08 2025 at 19:44):

It allows us to explicitly relate different models in different ways. By mapping basic operations in one query to another, we can get a free translation of algorithms into different models

Shreyas Srinivas (Dec 08 2025 at 19:46):

The main difference between my approach and tanners is that he uses the free monad (essentially the first two constructors in the code sample I pasted above)

Shreyas Srinivas (Dec 08 2025 at 19:46):

But he goes further and compiles to the time monad

Shreyas Srinivas (Dec 08 2025 at 19:46):

And provides the interpretation function for evaluation

Shreyas Srinivas (Dec 08 2025 at 19:47):

The rest of the combinators come from Lean’s monadic constructions

Shreyas Srinivas (Dec 08 2025 at 19:49):

Now I am trying to parametrise the query type.

Sorrachai Yingchareonthawornchai (Dec 08 2025 at 21:11):

The advantage of a simple TimeMonad is the flexibility. In TCS papers, it is rare for mistracking time to be the correctness concern, since most algorithms in theory papers are easy to track. The main advantage is the flexibility in expressing algorithms and running time across many models.

Let's say we want to analyze graph algorithms. In TCS papers, they usually write graph algorithms in set-theoretic notation rather than raw code (e.g., as in a programming contest), where the underlying implementation is typically straightforward. More concretely, let's consider the BFS algorithm. I can express it easily using Mathlib's graph library. But hold on, Mathlib objects do not support time complexity. TimeMonad allows us to tick it and analyze based on the time.

Simplicity and flexibility are assets here. On the other hand, if it requires too much effort just to state the algorithms, I would not want to use it. Of course, making time-complexity models rigorous and easy to use is a massive project by itself. This is my personal view. Of course, some people may view it differently depending on their preferences. Still, I believe that time-ticking misalignment can be easily corrected, whereas inconvenience may be too burdensome for users.

Let's not worry about one another's approach; I would say let's do it any way you like. At some point, time will tell which approach people want to use.

Shreyas Srinivas (Dec 08 2025 at 21:13):

@Sorrachai Yingchareonthawornchai I agree with you that we want simplicity

Shreyas Srinivas (Dec 08 2025 at 21:14):

I have also lobbied hard for two years to do complexity formalisation at a high level of abstraction

Shreyas Srinivas (Dec 08 2025 at 21:14):

But flexibility comes at a huge cost

Shreyas Srinivas (Dec 08 2025 at 21:15):

If I annotate every algorithm individually as I please, then not only is there room for mistakes, there is also complete loss of information about whether we are even talking about the same (custom/standard) model or not

Shreyas Srinivas (Dec 08 2025 at 21:15):

For example, if you want to have two subtly distinct cut queries or matrix multiplication queries, you can't even distinguish them

Shreyas Srinivas (Dec 08 2025 at 21:15):

Let alone formalise their relationship

Shreyas Srinivas (Dec 08 2025 at 21:16):

The reason you go for modularity and composability (topics of decades of PL research) is precisely to avoid mistakes of scale, separate reduction orthogonal concerns, avoid mixing up abstractions, and avoiding mistakes

Shreyas Srinivas (Dec 08 2025 at 21:16):

I should know. I wrote the bits about scalability issues for the equational theories paper where I was maintainer (the preprint should ideally be up tomorrow night)

Sorrachai Yingchareonthawornchai (Dec 08 2025 at 21:18):

Let's not worry about one another's approach; I would say let's do it any way you like. Less talking, more doing.

Shreyas Srinivas (Dec 08 2025 at 21:18):

I respectfully disagree. Pushing a flawed approach has costs. Coding just for the sake of just getting "something" done is something we teach undergrads not to do.

Shreyas Srinivas (Dec 08 2025 at 21:20):

The free monad approach of Tanner actually builds on top of your time monad approach.
The query monad compiles to the time monad, but essentially annotates ticks compositionally. It would be nice to combine them and merge the combined PR after it is ready.

Also this still means as @Fabrizio Montesi agreed (see : #CSLib > Proposal on Time Complexity @ 💬 about docstrings and a further reference here in the second paragraph that's a single line : #CSLib > Proposal on Time Complexity @ 💬 ), the module docstring of the mergesort example should note the risks of manual annotation. I have also pointed this out in review comments (for e.g. here, here, here ,and here) which have gone unaddressed completely so far.

Shreyas Srinivas (Dec 08 2025 at 21:26):

I am currently neatly parametrizing his approach with a generic query type and filling in the monad instances. This will take a few days (given availability of time outside work hours). But wit this, we will have a clean definition of models and relationships between them, as well as no need to manually annotate ticks. I think the wait is worth it.

Sorrachai Yingchareonthawornchai (Dec 08 2025 at 21:48):

There is no real harm in accepting multiple proposals. It is healthy to have many options to choose from, depending on the context. While there may be many proposals at the start, it is too premature to rule out good ones before they have a chance to grow. If someone believes in their design, we should let them shine and do their thing. In the end, good design will stand the test of time.

Shreyas Srinivas (Dec 08 2025 at 21:49):

I don't think mathlib followed that approach. That's actually more akin to Isabelle's approach and the result is a heavily fragmented library, where you have to carefully choose which definitions you pick and avoid for the same concept.

Shreyas Srinivas (Dec 08 2025 at 21:49):

It is critical to get good definitions into a large library from the beginning in formalisation and keep it coherent.

Shreyas Srinivas (Dec 08 2025 at 21:49):

and stick to one (iteratively improved) definition where possible to avoid fragmentation.

Shreyas Srinivas (Dec 08 2025 at 21:52):

Also if we are pursuing an each-to-their-own approach, then what even is the point of a cslib

Shreyas Srinivas (Dec 08 2025 at 21:53):

I urge some patience in merging this PR (cslib#165) , especially since there is a possibility (cslib#201 still WIP) of systematising its approach.

Shreyas Srinivas (Dec 08 2025 at 21:57):

Another issue is, the approach that will “win” will likely be the one that is easy in the short term and very suboptimal in the long run, but then it will be a much bigger headache to fix. So there is very real harm.

Eric Wieser (Dec 08 2025 at 23:14):

Can you edit "This PR" to reference the PR in question?

Shreyas Srinivas (Dec 08 2025 at 23:17):

sure. Also linked all referenced messages and github comments.

Snir Broshi (Dec 09 2025 at 00:53):

Can you list the proposals in a bulleted list? I'm still confused, especially since cslib#165 doesn't mention "tick monad" anywhere but the code definitely uses ticks. Which PR is which proposal? Is one of them awaiting a refactor?

Shreyas Srinivas (Dec 09 2025 at 00:54):

It’s called TimeM.

Shreyas Srinivas (Dec 09 2025 at 00:55):

To write code in it you explicity (and arbitrarily) pick operations you want to count in the middle of the algorithms definition.

Shreyas Srinivas (Dec 09 2025 at 00:55):

(deleted)

Shreyas Srinivas (Dec 09 2025 at 00:56):

That’s cslib#165

Shreyas Srinivas (Dec 09 2025 at 00:58):

@Tanner Duve proposed a FreeM based abstraction for a specific query model. It has some flaws that Eric pointed out but in principle it can represent algorithms as calls to queries from a query type.

Shreyas Srinivas (Dec 09 2025 at 00:59):

Then a translation function translates into the TimeM monad with automated annotation of ticks where queries are counted.

Shreyas Srinivas (Dec 09 2025 at 01:00):

And an evaluation function computes the function that the algorithm represents. At least that’s the idea. It is still a big WIP in cslib#201

Shreyas Srinivas (Dec 09 2025 at 01:02):

I think you can find these PRs mentioned in the last big message I sent. cslib#201 is decidedly not ready for review. There is still some work to be done. But once it is complete we can get a clean definition of a sorting model and a merge sort algorithm in it. In principle (and unlike cslib#165) we can also get the comparison sort lower bound.

Shreyas Srinivas (Dec 09 2025 at 01:02):

And much of this thread is me arguing that fixing a unified and good definition is important for a useful and coherent and correct library that is relatively easy to add content to.

Fabrizio Montesi (Dec 09 2025 at 14:01):

I still need answers on whether my observations are correct. To help the discussion as far as supporting different approaches in CSLib goes, here are questions I need clear answers for:

In cslib#165, when writing a TimeM def: say I call another TimeM def. Is it easy enough to remember that I should include its time complexity in that of the invoking def? It seems so. What requires care seems to be ticking calls to untimed stuff. Correct?
Can I still mess up in cslib#201 by calling something and wrapping it into a pure or something like that?
Is it fair to say that cslib#165 supports writing pretty idiomatic and arbitrary Lean code, whereas cslib#201 requires changing the way one works a bit? (This wouldn't make me be against cslib#201, I'm aware of its advantages; it's the cons that I wanna understand well before considering merging it once it's cooked enough.)

Fabrizio Montesi (Dec 09 2025 at 14:03):

Satisfactory answers that would motivate supporting both approaches could be in the space 'yes, easy enough', 'yes, I can still mess up'/'no, I can't', 'fair enough'. ;-)

Fabrizio Montesi (Dec 09 2025 at 14:03):

I think it's very important to understand this clearly because we need to equip the two approaches with appropriate docstrings/docs that document their pros and cons (things to be careful with).

Shreyas Srinivas (Dec 09 2025 at 14:05):

cslib#201 also supports writing idiomatic code. It is just a different monad.

Shreyas Srinivas (Dec 09 2025 at 14:07):

In cslib#201 I am still modifying the approach a bit, but the goal is to not be able to sneak in pure code. I can confirm this when I have nailed down the design completely. But even now, I would say it helps us clearly identify dangerous queries (as Eric did when discussing alternative ways to write the sorting example in DM)

Shreyas Srinivas (Dec 09 2025 at 14:08):

“Easy enough” is subjective. Both authors and reviewers must be careful to vet everything for cslib#165. cslib#201 forces you to be more explicit about your basic operations and cost functions. I also dislike that I can’t clearly distinguish subtle model differences easily in cslib#165.

Fabrizio Montesi (Dec 09 2025 at 14:23):

We gotta define/use a query model and invoke operations therein instead of using whatever def/typeclass/etc. in Lean. So it looks different in that sense to me. Correct?

Fabrizio Montesi (Dec 09 2025 at 14:24):

Fair enough. And the approach would still be interesting anyway.

Fabrizio Montesi (Dec 09 2025 at 14:25):

If I invoke mergeSort from another timed def (with TimeM), can I just 'forget' to tick the invocation? Or is the 'default' that it counts as ticked? I think here we could actually devise quite a few mechanisms to identify this kind of occurrences.

Shreyas Srinivas (Dec 09 2025 at 14:42):

It’s one thing to identify mistakes in the cslib#165 using some heuristic metaprogram. It’s another thing to formally define the property of “correctly annotated” programs which is actually impossible. So how does one get compositionally correct complexity results?

Shreyas Srinivas (Dec 09 2025 at 15:24):

For what it is worth, I don't see these approaches as entirely separate. cslib#201 builds a way of using cslib#165's TimeM monad in a systematic manner.

Shreyas Srinivas (Dec 09 2025 at 16:16):

Fabrizio Montesi said:

We gotta define/use a query model and invoke operations therein instead of using whatever def/typeclass/etc. in Lean. So it looks different in that sense to me. Correct?

The model is already conceptually out there. We assume it implicitly when we decide to use certain procedures without caring about their implementations and count the complexity of calls to them (with their inputs). With cslib#201 I want formalisers to make it explicit and abstract the implementation away.

Eric Wieser (Dec 10 2025 at 01:42):

Fabrizio Montesi said:

If I invoke mergeSort from another timed def (with TimeM), can I just 'forget' to tick the invocation?

You could do something like pure (TimeM.run mergeSort).result to cheat and discard the tick count

Fabrizio Montesi (Dec 10 2025 at 11:32):

Thanks, that matches my understanding! I think I know enough to know what to ask for in the docstrings of cslib#165 (will work on that), and it looks like an approach worth keeping.
I look forward to further developments of cslib#201 as well.

Shreyas Srinivas (Dec 10 2025 at 22:49):

I do think the that cslib#165 must explicitly record in docstrings that it is an unsafe and risky approach and that contributors are to be encouraged to build on more safe abstractions. That is why I recommend waiting a bit for cslib#201 (I am only asking for a week roughly).

Fabrizio Montesi (Dec 11 2025 at 10:03):

(deleted)

Fabrizio Montesi (Dec 12 2025 at 11:14):

cslib#165 got quite a few changes recently and looks ready to me now. I'll be interested in future developments of both that and cslib#201 (as well as new approaches).
I might have a few ideas for improvements to both cslib#165 (safety) and cslib#201 (ergonomics), but these can happen at a later stage.

Jakub Nowak (Dec 26 2025 at 08:53):

This topic was mentioned here: #Is there code for X? > Converting Lean.Expr to untyped lambda calculus
@Shreyas Srinivas You mention that's it's possible to relate query model to computational model like TM. Can you elaborator on how this can be done? What I would like to have, is to at the end build one Turing Machine for all possible inputs, and be able to prove stuff about it's time complexity (i.e. a function input -> Nat). Tbh, the first thing I don't understand is how do we ensure with TimeM monad that it's not possible to e.g. arbitrarily access tape/stacks? And the second, how can I have a time that depends on the input, while still ensuring that the turing machine doesn't depend on the input?

Shreyas Srinivas (Dec 26 2025 at 12:00):

Good question. But it’s WIP and I’ll respond when I am finished which is after the holidays.

You can interpret queries over a suitable data structure
TimeM in itself won’t stop you from accessing the TM’s internals

Last updated: Feb 28 2026 at 14:05 UTC