Zulip Chat Archive

Stream: Is there code for X?

Topic: Proofs on toChunks list function

Colin ⚛️ (Jun 13 2025 at 02:09):

Are there proofs for these statements in Mathlib?

import Mathlib

variable (s : List α) (n : ℕ)

theorem One1 : ([]:List α).toChunks n = [] := by
  induction n <;> rfl

theorem Two2 : (([]:List α).toChunks n).length = 0 := by
  rw [List.length_eq_zero, One1]

example (hn : 0 < n) : (s.toChunks n).length = Rat.ceil (s.length / n) := by
  induction s
  · simp only [List.length_nil, CharP.cast_eq_zero, zero_div]
    have h : Rat.ceil 0 = 0 := rfl
    rw?
    sorry
  · sorry

I have to still finish the third proof.

Chris Bailey (Jun 13 2025 at 02:48):

There don't seem to be any theorems referencing List.toChunks in batteries or mathlib, so my guess would be 'no'.

Colin ⚛️ (Jun 14 2025 at 16:10):

Would it be good to add them to Mathlib? Or are they too simple?

Chris Bailey (Jun 14 2025 at 17:06):

I presume lemmas are always welcome, this might be a batteries thing. That being said I'm not sure this is how they would want the lemma stated, they would probably would want it to be about natural number division somehow.

Kim Morrison (Jun 15 2025 at 06:00):

Yes, this should be in Batteries (as that's where the function is defined), and it should not use Rat.

Please make sure to include all the characterising theorems:

join is the original list
the length is as you say (but without Rat)
the length of every list except the last is n
the length of the last list is length % n

Colin ⚛️ (Jun 16 2025 at 14:40):

What should I use instead of Rat?

Colin ⚛️ (Jun 16 2025 at 14:40):

Nat?

Chris Bailey (Jun 16 2025 at 16:04):

Yes

Colin ⚛️ (Jun 17 2025 at 00:32):

Okay working on it

Colin ⚛️ (Jun 17 2025 at 00:53):

Here's the pull request link

Colin ⚛️ (Jun 17 2025 at 01:00):

There is a problem with the imports. The function Nat.ceil requires the file containing List.toChunks, so to import Nat.ceil we would define List.toChunks multiple times if I add the proof to Batteries/Data/List/Basic.lean.

Should I create a new file outside of Basic.lean to avoid this issue?

Kim Morrison (Jun 17 2025 at 01:22):

Don't use Nat.ceil. Just use nat division (adding 1 somewhere, presumably)

Colin ⚛️ (Jun 17 2025 at 01:23):

Oh okay

Aaron Liu (Jun 17 2025 at 01:46):

I don't see a simple way to get ceiling division with only Nat operations

Kim Morrison (Jun 17 2025 at 01:49):

if is fine?

Aaron Liu (Jun 17 2025 at 01:50):

if d ∣ n then n / d else n / d + 1 is kind of long

Kim Morrison (Jun 17 2025 at 01:55):

Colin ⚛️ said:

if I add the proof to Batteries/Data/List/Basic.lean.

There shouldn't be any proofs in this file. Proofs go in List/Lemmas.lean

Colin ⚛️ (Jun 17 2025 at 01:57):

Thanks for letting me know

Colin ⚛️ (Jun 17 2025 at 01:59):

I'm also quite stuck on this proof, if anyone wants to take a shot at it

theorem Four4 (xs : List α) (n : Nat) : (xs.toChunks n).flatten = xs := by
  simp only [toChunks]
  induction xs generalizing n with
  | nil =>
    simp [toChunks.go]
  | cons x xs ih =>
    cases n with
    | zero =>
      simp [toChunks.go, flatten]
    | succ n' =>
      simp only [toChunks.go]
      have : (0 + 1 < succ n') ∨ ¬(0 + 1 < succ n') := Decidable.em (0 + 1 < n'.succ)
      cases this with
      | inl lt =>
        sorry
        rw [ih]
        simp [take_append_drop]
      | inr ge =>
        sorry

Colin ⚛️ (Jun 17 2025 at 01:59):

I'm not sure if I need a hypothesis (h : xs ≠ [])

Colin ⚛️ (Jun 17 2025 at 01:59):

I think it can proven without it

Kim Morrison (Jun 17 2025 at 02:00):

Please read #mwe

Kim Morrison (Jun 17 2025 at 02:00):

toChunks is defined in terms of toChunks.go. Work out what theorem you need to state about toChunks.go first, and prove that.

Colin ⚛️ (Jun 19 2025 at 09:55):

#eval ([]:List Nat).toChunks 1
#eval [1].toChunks 1

The first evaluates to [] and the second evaluates to [[1]]. This is a problem for proofs such as:

theorem Three3 (xs : List α) (h : xs ≠ []) : xs.toChunks 0 = [xs] := by
induction xs
  · contradiction
  · simp only [toChunks]

As they require a stipulation that xs \= [] as toChunks will turn it into an empty list ([]) instead of a list with the empty list ([[]]). The reason for that is in the first match case for toChunks.

def toChunks {α} : Nat → List α → List (List α)
  | _, [] => [] --  HERE IS WHERE THE ISSUE ORIGINATES
  | 0, xs => [xs]
  | n, x :: xs =>
    let rec
    go : List α → Array α → Array (List α) → List (List α)
    | [], acc₁, acc₂ => acc₂.push acc₁.toList |>.toList
    | x :: xs, acc₁, acc₂ =>
      if acc₁.size == n then
        go xs ((Array.mkEmpty n).push x) (acc₂.push acc₁.toList)
      else
        go xs (acc₁.push x) acc₂
    go xs #[x] #[]

We could change the first line to | _, [] => [] to | _, [] => [[]] to make it more consistent and remove the need for stipulation. Thoughts?

Chris Bailey (Jun 19 2025 at 10:29):

I would expect it to be [] and not [[]]; is there a reason why the behavior described by Three3 is expected to be true of the empty list?

Wrenna Robson (Jun 19 2025 at 11:21):

Hey Colin. I am interested in helping with your PR - let me take a look.

Wrenna Robson (Jun 19 2025 at 11:25):

https://hackage.haskell.org/package/split-0.2.5/docs/Data-List-Split.html#v:chunksOf

toChunks is the same as the Haskell function chunksOf, I believe: they are clear there that that definition for [] is deliberate.

Wrenna Robson (Jun 19 2025 at 11:26):

Because the following property should be true: xs.toChunks n ++ ys.toChunks n == (xs ++ ys).toChunks n.

Wrenna Robson (Jun 19 2025 at 11:35):

I have to say I am not sure why toChunks is defined in terms of arrays... seems odd to me!

Wrenna Robson (Jun 19 2025 at 11:35):

Especially when we lack a similar function for arrays.

Wrenna Robson (Jun 19 2025 at 12:00):

... and toChunksAux appears to be useless now. When did this get changed? I feel like toChunksAux ought to have been deprecated at the same time.

Wrenna Robson (Jun 19 2025 at 12:03):

In fact it has always been like that...

Colin ⚛️ (Jun 19 2025 at 13:44):

Chris Bailey said:

I would expect it to be [] and not [[]]; is there a reason why the behavior described by Three3 is expected to be true of the empty list?

I am wondering if it would just make it easier. It's fine if we keep it how it is; we just have to add the stipulation to many theorems that xs \= []. Here are the theorems I want to prove so far. Pay no mind to the names lol.

theorem One1 : ([]:List α).toChunks n = [] := by
  induction n <;> rfl

theorem Two2 : (([]:List α).toChunks n).length = 0 := by
  rw [List.length_eq_zero, One1]

theorem Three3 (xs : List α) (h : xs ≠ []) : xs.toChunks 0 = [xs] := by
  induction xs
  · contradiction
  · simp only [toChunks]

theorem length_element (x xs : List α) (n : Nat) (h : xs ∈ x.toChunks n) : xs.length ≤ n := by
  simp only [toChunks] at h
  unfold toChunks.go at h
  split at h
  · rw [mem_nil_iff] at h
    contradiction
  · sorry

theorem Three' (xs : List α) (n : Nat) (h : xs.length ≤ n) : xs.toChunks n = [xs] := by
  sorry

theorem Four4 (xs : List α) (n : Nat) : (xs.toChunks n).flatten = xs := by
  by_cases h : xs = []
  · rw [h, One1, flatten_nil]
  · by_cases h' : n = 0
    · rw [h', Three3, flatten_singleton]
      exact h
    · simp only [toChunks]
      split
      · contradiction
      · contradiction
      · rename_i n' x' xe xs' ht
        rw [List.toChunks.go.eq_def]
        sorry

example (s : List α) (hn : 0 < n) : (s.toChunks n).length = Nat.ceil (s.length / n) := by
  sorry

Colin ⚛️ (Jun 19 2025 at 13:45):

Wrenna Robson said:

... and toChunksAux appears to be useless now. When did this get changed? I feel like toChunksAux ought to have been deprecated at the same time.

I noticed this too. The function toChunksAux seems to have no use.

Colin ⚛️ (Jun 19 2025 at 13:46):

If you want to help, here is the PR link: https://github.com/leanprover-community/batteries/pull/1279

Aaron Liu (Jun 19 2025 at 14:32):

import Batteries.Data.List.Basic

theorem List.toChunks.length_go {α : Type u} {n : Nat} {xs : List α}
    {acc₁ : Array α} (acc₂ : Array (List α)) (hacc₁ : acc₁.size ≤ n)
    (hpos₁ : 0 < acc₁.size) :
    (List.toChunks.go n xs acc₁ acc₂).length =
      (acc₁.size + xs.length) / n + acc₂.size + if n ∣ acc₁.size + xs.length then 0 else 1 := by
  induction xs generalizing acc₁ acc₂ with
  | nil =>
    rw [List.toChunks.go]
    by_cases hn : n ∣ acc₁.size
    · have h1 : acc₁.size = n := by
        apply Nat.le_antisymm hacc₁
        exact Nat.le_of_dvd hpos₁ hn
      simp [← h1, Nat.div_self hpos₁, Nat.add_comm]
    · have h1 : acc₁.size < n := by
        apply Nat.lt_of_le_of_ne hacc₁
        rintro rfl
        simp at hn
      simp [Nat.div_eq_of_lt h1, hn]
  | cons x xs ih =>
    rw [List.toChunks.go]
    rw [Nat.le_iff_lt_or_eq] at hacc₁
    obtain hacc₁ | rfl := hacc₁
    · have hr : acc₁.size + 1 + xs.length = acc₁.size + (xs.length + 1) := by
        rw [Nat.add_right_comm, Nat.add_assoc]
      rw [if_neg (mt eq_of_beq (Nat.ne_of_lt hacc₁)),
        ih acc₂ (by simpa [Nat.add_one_le_iff] using hacc₁) (by simp)]
      simp [hr]
    · rw [if_pos BEq.rfl, ih _ (by simpa using hpos₁) (by simp)]
      simp [Nat.add_div_left _ hpos₁, Nat.one_add, Nat.add_assoc,
        ← Nat.dvd_add_iff_right (Nat.dvd_refl acc₁.size)]

theorem List.length_toChunks {α : Type u} {n : Nat} {xs : List α} :
    (xs.toChunks n).length =
      xs.length / n + if n ∣ xs.length then 0 else 1 := by
  cases xs with
  | nil => simp [List.toChunks]
  | cons x xs =>
    cases n with
    | zero => simp [List.toChunks]
    | succ n =>
      rw [List.toChunks, List.toChunks.length_go _ (by simp) (by simp)]
      all_goals simp [Nat.one_add]

Colin ⚛️ (Jun 19 2025 at 14:46):

Thanks for the help @Aaron Liu !

Wrenna Robson (Jun 19 2025 at 15:18):

@Colin ⚛️ Just a tip - it is good to get into the habit of well-naming your theorems and/or lemmas. It can help with your own thinking about a problem.

Wrenna Robson (Jun 19 2025 at 15:20):

Colin ⚛️ said:

Wrenna Robson said:

... and toChunksAux appears to be useless now. When did this get changed? I feel like toChunksAux ought to have been deprecated at the same time.

I noticed this too. The function toChunksAux seems to have no use.

Personally I think we should have:

a "toChunks" function on List that does not use Arrays (but also does not aim for performance)
a "toChunks" function on Arrays (and for a bonus, one on Vector too!) that uses the approach of toChunks.go
A theorem linking the two in the appropriate way.

Colin ⚛️ (Jun 19 2025 at 15:31):

Aaron Liu said:

import Batteries.Data.List.Basic

theorem List.toChunks.length_go {α : Type u} {n : Nat} {xs : List α}
    {acc₁ : Array α} (acc₂ : Array (List α)) (hacc₁ : acc₁.size ≤ n)
    (hpos₁ : 0 < acc₁.size) :
    (List.toChunks.go n xs acc₁ acc₂).length =
      (acc₁.size + xs.length) / n + acc₂.size + if n ∣ acc₁.size + xs.length then 0 else 1 := by
  induction xs generalizing acc₁ acc₂ with
  | nil =>
    rw [List.toChunks.go]
    by_cases hn : n ∣ acc₁.size
    · have h1 : acc₁.size = n := by
        apply Nat.le_antisymm hacc₁
        exact Nat.le_of_dvd hpos₁ hn
      simp [← h1, Nat.div_self hpos₁, Nat.add_comm]
    · have h1 : acc₁.size < n := by
        apply Nat.lt_of_le_of_ne hacc₁
        rintro rfl
        simp at hn
      simp [Nat.div_eq_of_lt h1, hn]
  | cons x xs ih =>
    rw [List.toChunks.go]
    rw [Nat.le_iff_lt_or_eq] at hacc₁
    obtain hacc₁ | rfl := hacc₁
    · have hr : acc₁.size + 1 + xs.length = acc₁.size + (xs.length + 1) := by
        rw [Nat.add_right_comm, Nat.add_assoc]
      rw [if_neg (mt eq_of_beq (Nat.ne_of_lt hacc₁)),
        ih acc₂ (by simpa [Nat.add_one_le_iff] using hacc₁) (by simp)]
      simp [hr]
    · rw [if_pos BEq.rfl, ih _ (by simpa using hpos₁) (by simp)]
      simp [Nat.add_div_left _ hpos₁, Nat.one_add, Nat.add_assoc,
        ← Nat.dvd_add_iff_right (Nat.dvd_refl acc₁.size)]

theorem List.length_toChunks {α : Type u} {n : Nat} {xs : List α} :
    (xs.toChunks n).length =
      xs.length / n + if n ∣ xs.length then 0 else 1 := by
  cases xs with
  | nil => simp [List.toChunks]
  | cons x xs =>
    cases n with
    | zero => simp [List.toChunks]
    | succ n =>
      rw [List.toChunks, List.toChunks.length_go _ (by simp) (by simp)]
      all_goals simp [Nat.one_add]

I will simplify and modify these proofs and put them into the pull request

Colin ⚛️ (Jun 19 2025 at 15:31):

Wrenna Robson said:

Colin ⚛️ said:

Wrenna Robson said:

... and toChunksAux appears to be useless now. When did this get changed? I feel like toChunksAux ought to have been deprecated at the same time.

I noticed this too. The function toChunksAux seems to have no use.

Personally I think we should have:

a "toChunks" function on List that does not use Arrays (but also does not aim for performance)

a "toChunks" function on Arrays (and for a bonus, one on Vector too!) that uses the approach of toChunks.go

A theorem linking the two in the appropriate way.

Should we depreciate the toChunksAux function, or just get rid of it?

Kim Morrison (Jun 19 2025 at 23:05):

Wrenna Robson said:

A theorem linking the two in the appropriate way.

Even better, this should be a csimp theorem, so that at runtime the List version is fast, but you only have to reason about the simple implementation.

Wrenna Robson (Jun 20 2025 at 07:18):

Right! Is that as simple as tagging it @[csimp]?

Franklin Harding (Nov 28 2025 at 15:40):

It seems like the more important theorem states that if the original list is of length N then the chunked list consists of floor(N/n) lists of length n followed by one list of length N % n, if N % n is non-zero. I can work on this (no promises because I'm new to lean, but hopefully Aaron's code will serve as a good template). It's too bad that the PR died though.

Last updated: Feb 28 2026 at 14:05 UTC