Zulip Chat Archive

Stream: Is there code for X?

Topic: Induction on repeated elements in list

Violeta Hernández (Sep 15 2024 at 00:31):

Do we have anything like this?

theorem replicateRecOn {α : Type*} {p : List α → Prop} (m : List α)
    (h0 : p []) (hr : ∀ a n, p (replicate n a))
    (hi : ∀ a b n l, a ≠ b → p (b :: l) → p (replicate n a ++ b :: l)) : p m := sorry

Basically, I want to split my list into parts repeating a single element and induct on those parts.

Daniel Weber (Sep 15 2024 at 01:39):

You can use docs#List.groupBy, although there doesn't seem to be any API for it

Kim Morrison (Sep 15 2024 at 04:49):

You just need to prove that every list is either [], a replicate, or of the form replicate n a ++ b :: l for some a \ne b.

Violeta Hernández (Sep 15 2024 at 04:49):

Yeah, I was able to prove this myself. Just asking if something like this already existed, or whether it'd be worth adding in Mathlib.

Kim Morrison (Sep 15 2024 at 05:03):

I'll add

theorem eq_replicate_or_eq_replicate_append_cons {α : Type _} {l : List α} :
    (l = []) ∨ (∃ n a, l = replicate n a ∧ 0 < n) ∨
      (∃ n a b l', l = replicate n a ++ b :: l' ∧ 0 < n ∧ a ≠ b) := ...

Violeta Hernández (Sep 15 2024 at 05:05):

Nice! I think a way to generate data from this might also be valuable.

Kim Morrison (Sep 15 2024 at 05:05):

Do you mean the Sort valued version of what you wrote above?

Violeta Hernández (Sep 15 2024 at 05:05):

Yep

Kim Morrison (Sep 15 2024 at 05:09):

Hmm, what theorem would you want characterising the data producing one?

Kim Morrison (Sep 15 2024 at 05:10):

I don't really see a good statement except in terms of groupBy, in which case one should just use groupBy to begin with ...

Violeta Hernández (Sep 15 2024 at 05:11):

Kim Morrison said:

Hmm, what theorem would you want characterising the data producing one?

The expected stuff, such as replicateRecOn [] = h0, replicateRecOn (replicate n a) = hr a n, etc.

Violeta Hernández (Sep 15 2024 at 05:12):

How would this work in terms of groupBy? I'm a bit confused on how to apply it to my use case.

Kim Morrison (Sep 15 2024 at 05:12):

what's the etc :-)

Violeta Hernández (Sep 15 2024 at 05:13):

(h : a ≠ b) → replicateRecOn (replicate n a ++ b :: l) = hi _ _ _ _ h (replicateRecOn (b :: l))

Kim Morrison (Sep 15 2024 at 05:13):

Slightly tedious to use groupBy, because it is not obvious that the runs have positive length, and different elements.

Kim Morrison (Sep 15 2024 at 05:14):

One might want to define runLengthEncoding which packages that.

Violeta Hernández (Sep 15 2024 at 05:14):

That would be great! I think it's a much better way to work with this.

Kim Morrison (Sep 15 2024 at 05:15):

induction principles are fun, but not always the answer :-)

Kim Morrison (Sep 15 2024 at 05:16):

Apologies the groupBy API is non-existent still.

Violeta Hernández (Sep 15 2024 at 05:17):

Would Data/List/RunLength be fine for this? I'll gladly add a definition and some basic API.

Violeta Hernández (Sep 15 2024 at 05:30):

I'm having trouble understanding the definition of groupBy. What's this loop function doing?
image.png

Kim Morrison (Sep 15 2024 at 05:32):

It's written like that to be tail-recursive.

Kim Morrison (Sep 15 2024 at 05:32):

g is the group we are building up at the moment

Kim Morrison (Sep 15 2024 at 05:32):

gs are the groups we're finished with

Kim Morrison (Sep 15 2024 at 05:33):

and ag is the "previous" element we've considered

Kim Morrison (Sep 15 2024 at 05:46):

So in particular the missing theorems about groupBy are:

(l.groupBy r).join = l
each list is not []
each list satisfies Chain r
the last element of each list is not related to the first element of the next

Kim Morrison (Sep 15 2024 at 05:46):

and then might be fun to show that those properties uniquely characterize l.groupBy

Violeta Hernández (Sep 15 2024 at 05:58):

It's quite confusing how loop assembles both each individual group and the list of groups backwards. But I think I can figure this out.

Violeta Hernández (Sep 15 2024 at 05:59):

Should the API for groupBy go in this RunLength file? Or do I first make a GroupBy file where I prove these characterizations?

Kim Morrison (Sep 15 2024 at 05:59):

I'd first make a GroupBy. I would love to steal it to move to the core repository, to be honest. GroupBy has been low on my list, but I'll get there eventually. :-)

Violeta Hernández (Sep 15 2024 at 06:21):

There's an issue with stating the Chain' condition - List.Chain' expects α → α → Prop, while List.groupBy expects α → α → Bool.

Violeta Hernández (Sep 15 2024 at 06:23):

Ah nevermind, apparently Bool can be casted to Prop.

Violeta Hernández (Sep 15 2024 at 07:12):

I'm getting the hang of this! nil_not_mem_groupBy and chain'_of_mem_groupBy have been proven!

Daniel Weber (Sep 15 2024 at 07:31):

docs#List.Pairwise might be a better conclusion, and it should be easy from docs#List.chain'_iff_pairwise

Kim Morrison (Sep 15 2024 at 07:31):

Only if the relation is transitive.

Kim Morrison (Sep 15 2024 at 07:32):

groupBy still works and is sometimes even useful for non-transitive relations, so best to use Chain

Daniel Weber (Sep 15 2024 at 07:32):

Oh, right, the relation isn't always equality

Kim Morrison (Sep 15 2024 at 07:32):

Rather than nil_not_mem_groupBy I'd prove that if you're in groupBy, you're not nil. More useful.

Violeta Hernández (Sep 15 2024 at 07:34):

I think both are potentially useful statements

Violeta Hernández (Sep 15 2024 at 07:35):

join_groupBy has been proven too!

Violeta Hernández (Sep 15 2024 at 07:41):

I'm not quite sure what the best way to state this last theorem is. Maybe like this?

theorem chain'_last_ne_head_groupBy : (l.groupBy r).Chain' (fun a b ↦ a.getLast? ≠ b.head?) := sorry

Daniel Weber (Sep 15 2024 at 07:44):

Perhaps (l.groupBy r).Chain' (fun a b ↦ ∃ ha hb, ¬r (a.getLast ha) (b.head hb))?

Violeta Hernández (Sep 15 2024 at 07:46):

That seems more useful, since we get to assert both that the lists are non-empty and that the last and first element are distinct

Violeta Hernández (Sep 15 2024 at 07:46):

Also yeah, should have used r instead of ≠

Violeta Hernández (Sep 15 2024 at 09:05):

Proven! All I'm missing is the characterization theorem, and then I'll submit a PR.

Violeta Hernández (Sep 15 2024 at 09:46):

Actually, I'm finding the characterization a bit hard to prove. It's also getting quite late over here, so I'll just PR the rest and I'll try and prove the characterization later down the line, if it's needed.

Violeta Hernández (Sep 15 2024 at 09:48):

#16818

Kim Morrison (Sep 15 2024 at 23:24):

Left some comments.

Kim Morrison (Sep 15 2024 at 23:25):

If you want to prove the characterisation, then the first step is theorem groupBy_join, which is that if you have a List (List X) satisfying some hypotheses, and you join it and then groupBy it, you get back the original List (List X).

After that, one can prove theorem groupBy_eq_iff : l.groupBy = L \iff ....

Violeta Hernández (Sep 16 2024 at 00:25):

The result on attach you mention can actually be generalized:

theorem chain'_attachWith {l : List α} {p : α → Prop} (h : ∀ x ∈ l, p x)
    {r : {a // p a} → {a // p a} → Prop} :
    (l.attachWith p h).Chain' r ↔ l.Chain' fun a b ↦ ∃ ha hb, r ⟨a, ha⟩ ⟨b, hb⟩ := sorry

I'm having some trouble proving this due to some missing API though. For instance, I wanted to use this theorem

theorem head?_attachWith {α : Type*} {l : List α} {p : α → Prop} (h : ∀ x ∈ l, p x) :
    (l.attachWith p h).head? =
    l.head?.attach.map (fun ⟨x, hx⟩ ↦ ⟨x, h x <| mem_of_mem_head? hx⟩) := by
  cases l <;> rfl

Here attach is Option.attach : Π a : Option α, Option {x // x ∈ a}, which judging by how the analogous docs#getLast?_attach is written, doesn't seem to exist either...

Violeta Hernández (Sep 16 2024 at 00:27):

The definition would just be this

def Option.attach {α : Type*} (a : Option α) : Option {x // x ∈ a} :=
  match a with
  | none => none
  | some x => some ⟨x, rfl⟩

Violeta Hernández (Sep 16 2024 at 00:36):

Another basic theorem on attachWith seemingly missing is

@[simp]
theorem attachWith_cons {α : Type*} {l : List α} (a : α) {p : α → Prop} (h : ∀ x ∈ a :: l, p x) :
    (a :: l).attachWith p h = ⟨a, h a (mem_cons_self a l)⟩ ::
      l.attachWith p (fun x hx ↦ h x (mem_cons_of_mem a hx)) :=
  rfl

Kim Morrison (Sep 16 2024 at 00:48):

Oh, good catch, the API for attach is developed much further than the API for attachWith.

Kim Morrison (Sep 16 2024 at 01:39):

@Violeta Hernández, would you mind taking a look at lean#5352 and seeing if that fills the gaps you noticed?

I did not (yet?) introduce Option.attach, but I'll consider that for later.

Violeta Hernández (Sep 16 2024 at 01:50):

Yeah, I think that fills in the gaps for at least the lemmas I need.

Violeta Hernández (Sep 16 2024 at 01:51):

I also noticed something similar happening with head! and head. In particular, we don't have these two lemmas:

theorem head_mem_self : ∀ {l : List α} (h : l ≠ []), head l h ∈ l
  | [], h => by contradiction
  | a :: l, _ => mem_cons_self a l

theorem head_mem_head? : ∀ {l : List α} (h : l ≠ []), head l h ∈ head? l
  | [], h => by contradiction
  | a :: l, _ => rfl

Kim Morrison (Sep 16 2024 at 02:39):

The first is just List.head_mem. I agree the second is missing, and I'll add that now.

Kim Morrison (Sep 16 2024 at 02:42):

lean#5353

Violeta Hernández (Sep 16 2024 at 03:45):

By the way, I've been able to simplify the proofs about groupBy a bit by proving this first

private theorem groupBy_loop_eq_append {r : α → α → Bool} {l : List α} {a : α} {g : List α}
    (gs : List (List α)) : groupBy.loop r l a g gs = gs.reverse ++ groupBy.loop r l a g [] := sorry

Violeta Hernández (Sep 16 2024 at 03:45):

I'll put that in the PR shortly

Violeta Hernández (Sep 16 2024 at 04:24):

I proved the characterization!

Violeta Hernández (Sep 16 2024 at 04:26):

It's about 100 extra lines of code, so I think it'll have to go in a subsequent PR.

Violeta Hernández (Sep 16 2024 at 04:27):

I ended up using this auxiliary lemma:

theorem join_head {l : List (List α)} (hl : l ≠ []) (hl' : l.head hl ≠ []) :
    l.join.head ((join_ne_nil _).2 ⟨_, head_mem hl, hl'⟩) = (l.head hl).head hl' := by
  cases l with
  | nil => contradiction
  | cons a l =>
    simp_rw [join_cons, head_cons]
    exact head_append_of_ne_nil _

Is this something we want in Mathlib, or is it too specific? In the former case, where should I put it?

Violeta Hernández (Sep 16 2024 at 04:33):

For now I've put the characterization in #16837. I imagine it will have to stay as a WIP until a new core update drops.

Kim Morrison (Sep 16 2024 at 05:38):

Looks good to me. Lemma should be called head_join, however, and probably should be head_join_of_head_ne_nil, as one could imagine a more complicated result that went looking for the first thing that wasn't [], and took its head!

Kim Morrison (Sep 16 2024 at 05:39):

Once bump/v4.13.0 is on a sufficiently recent nightly (usually within 24 hours of release), I don't see why you couldn't make a PR direct to that, and have that reviewed in the usual manner. It's not how we usually do things, but it seems better than waiting.

Kim Morrison (Sep 16 2024 at 05:41):

e.g. head_join itself would have RHS ((l.filter (\. \ne [])).head sorry).head sorry

Violeta Hernández (Sep 23 2024 at 06:20):

@Kim Morrison Is it ok for #16818 to land as is (modulo further review), and for me to add the version of chain'_last_ne_head_groupBy without the existential once we have chain'_attachWith?

Edward van de Meent (Oct 16 2024 at 11:45):

i'd like to note that batteries#948 would benefit from the docs#List.ne_nil_of_mem_groupBy lemma being upstreamed, as it would provide a proof enableing prefix search for library notes, rather than exact search.

Kim Morrison (Oct 16 2024 at 23:58):

This is meta code: using head! is completely acceptable. :-)

Antoine Chambert-Loir (Oct 17 2024 at 06:36):

Violeta Hernández said:

Do we have anything like this?

theorem replicateRecOn {α : Type*} {p : List α → Prop} (m : List α)
    (h0 : p []) (hr : ∀ a n, p (replicate n a))
    (hi : ∀ a b n l, a ≠ b → p (b :: l) → p (replicate n a ++ b :: l)) : p m := sorry

Basically, I want to split my list into parts repeating a single element and induct on those parts.

Just by curiosity, what application do you have in mind?

Violeta Hernández (Oct 17 2024 at 07:32):

I use this in my Rubik's cube project to define the "deduplication" of a sequence of moves, i.e. the list where any sequence of four consecutive moves gets deleted

Violeta Hernández (Oct 30 2024 at 08:34):

#16837 is now review-ready!

Last updated: May 02 2025 at 03:31 UTC