Zulip Chat Archive

Stream: general

Topic: Formalization of Runtime Complexity of Sorting Algorithms

Tomaz Gomes Mascarenhas (May 27 2022 at 21:03):

Hi there!
About this message,
I've completed the formalization last year, and now I'm adjusting everything according to mathlib's guide for contribution. It's implemented in this repository. I would really appreciate if someone could take a look and give me feedback before I make a PR. I know that specially the proof about merge sort is quite large and may be hard to get, so feel free to ask any questions about it. This is my first contribution to mathlib, and I don't have an extensive background in Lean, so probably there is some code smells that I'm not aware of.

The proofs are all based on a new version of Insertion Sort and Merge Sort that I've defined, that counts the number of operations done besides sorting the input list. In the formalization, I've included proofs that they produce the same output as their original counterparts, defined in data/list/sort.lean. Everything is documented on my bachelor thesis (Report.pdf in the repository). Unfortunately, there is only the version in brazilian portuguese :/.

Bhavik Mehta (May 27 2022 at 23:07):

I'm taking a quick look at your insertion sort, and I don't completely understand this part:

#eval ordered_insert (λ m n : ℕ , m ≤ n) 2 [5, 3, 1, 4]
-- ([2, 5, 3, 1, 4], 0)

I would expect this to have one comparison: checking that 2 <= 5, and looking at the definition of ordered_insert it seems that it should be 1 as well - is the comment just outdated?

Bhavik Mehta (May 27 2022 at 23:07):

As an aside, you can also write this as #eval ordered_insert (≤) 2 [5, 3, 1, 4]

Eric Wieser (May 27 2022 at 23:14):

I just spun up a gitpod instance to try it out (https://github.com/tomaz1502/RunTimeFormalization/pull/1): yes, the comment is just outdated, it gives what you expect

Eric Wieser (May 27 2022 at 23:16):

One other comment: your log_monotone lemma is exactly @nat.log_monotone 2

Eric Wieser (May 27 2022 at 23:17):

Generally though it looks very well written!

Tomaz Gomes Mascarenhas (May 27 2022 at 23:20):

Bhavik Mehta said:

I'm taking a quick look at your insertion sort, and I don't completely understand this part:
#eval ordered_insert (λ m n : ℕ , m ≤ n) 2 [5, 3, 1, 4]
-- ([2, 5, 3, 1, 4], 0)
I would expect this to have one comparison: checking that 2 <= 5, and looking at the definition of ordered_insert it seems that it should be 1 as well - is the comment just outdated?

yes, the commentary is outdated, thanks for pointing that out

Eric Wieser (May 27 2022 at 23:22):

Here's a shorter proof of your div_two as well:

lemma div_two (b a : ℕ) : 2 * a ≤ b → a ≤ b / 2 :=
by simp_rw [nat.le_div_iff_mul_le _ _ zero_lt_two, mul_comm, imp_self]

Eric Wieser (May 27 2022 at 23:22):

(I found the main lemma by typing nat.le_div and hitting ctrl+space to get suggestions)

Tomaz Gomes Mascarenhas (May 27 2022 at 23:28):

Eric Wieser said:

One other comment: your log_monotone lemma is exactly @nat.log_monotone 2

oh! Maybe it was added recently? I remember not being able to find it about a year ago, when I was proving this.. anyway, thanks a lot! I will remove it

Eric Wieser (May 27 2022 at 23:33):

I'm surprised that I couldn't find the nat.log b b = 1 lemma (which you have for b=2) in mathlib

Bhavik Mehta (May 28 2022 at 04:19):

Eric Wieser said:

I'm surprised that I couldn't find the nat.log b b = 1 lemma (which you have for b=2) in mathlib

It should be a very special case of docs#nat.log_eq_one_iff

Bhavik Mehta (May 28 2022 at 04:21):

Alternatively, of docs#nat.log_pow

Eric Wieser (May 28 2022 at 07:09):

I think it's still worth having as a lemma

Tomaz Gomes Mascarenhas (May 28 2022 at 21:47):

can someone give me write access to non-master branches of mathlib? So I can push this to the branch and create a PR

Eric Wieser (May 28 2022 at 22:05):

Invite sent

Eric Wieser (May 28 2022 at 22:07):

It's not immediately clear to me whether this type of formalization is in scope for mathlib, but certainly some of the lemmas you needed along the way will be; and the easiest place to have that discussion will probably be in a PR. Perhaps @Mario Carneiro has some opinion on whether and/or where this would best fit into mathlib

Mario Carneiro (May 28 2022 at 22:09):

as long as it is written in a way where people can clearly build on it, I don't see why not to put it in mathlib, perhaps somewhere in computability or near the other complexity theory work

Mario Carneiro (May 28 2022 at 22:11):

I think there are some aspects of the "framework" that need work, for example, is there a principled explanation for where the "ticks" are supposed to go?

Tomaz Gomes Mascarenhas (May 31 2022 at 00:56):

Mario Carneiro said:

I think there are some aspects of the "framework" that need work, for example, is there a principled explanation for where the "ticks" are supposed to go?

Can you give me a more detailed explanation on what do you mean by "where the ticks are supposed to go"? By "tick" you mean each operation, right? You mean that I should define precisely what is an operation in my framework?

Last updated: May 02 2025 at 03:31 UTC