Zulip Chat Archive

Stream: Is there code for X?

Topic: a deduped list is longer than a subset deduped

Jared green (Jan 12 2025 at 01:03):

i was trying to prove this, but i couldnt figure out how to do it

import Mathlib.Data.List.Dedup
variable (a : Type) [h : DecidableEq a]
theorem dedup_length (l1 l2 : List a) : l1.Subset l2 -> l1.dedup.length  ≤ l2.dedup.length :=
  by
  sorry

Kim Morrison (Jan 12 2025 at 01:07):

Please try to make #mwe's work: you are missing a typeclass. You can click the button in the top right of the code block to open your code in the sandbox to verify that it is what you intended.

Jared green (Jan 12 2025 at 01:07):

try again

Kim Morrison (Jan 12 2025 at 01:08):

Do you know how Subset is defined? Perhaps try inducting on l1 and unfolding the definition of Subset.

Jared green (Jan 12 2025 at 01:08):

i tried that

Kim Morrison (Jan 12 2025 at 01:08):

Can you show what you got to, then?

Kim Morrison (Jan 12 2025 at 01:09):

(It's important to avoid "low effort" sorries; if you've already tried something, show that in your question! If you have an idea for how the proof should working informally, explain that in text.)

Jared green (Jan 12 2025 at 01:11):

but i didnt know what to do next

Kim Morrison (Jan 12 2025 at 01:12):

Yes, that's why I'm asking you to show what you have done, so someone can help you work out what to do next.

Kim Morrison (Jan 12 2025 at 01:13):

To be blunt, asking "low effort" sorries is just asking for someone to do your work for you, rather than asking to learn.

Jared green (Jan 12 2025 at 01:19):

import Mathlib.Data.List.Dedup

variable (a : Type) [h : DecidableEq a]

theorem dedup_length (l1 l2 : List a) : l1.Subset l2 -> l1.dedup.length  ≤  l2.dedup.length :=
  by
  induction' l1 with hd tl ih
  simp
  unfold List.Subset
  simp
  intro hh hhh
  have hhhh : (hd :: tl).dedup.length ≤ tl.dedup.length := by {
    unfold List.dedup
    simp
    unfold List.pwFilter
    --here
  }
  sorry

Kim Morrison (Jan 12 2025 at 01:31):

Hm, yeah, List.dedup as defined in Mathlib has pretty terrible properties, because it keeps the last occurrence of each element, making it very hard to do inductive arguments about it.

Kim Morrison (Jan 12 2025 at 01:31):

List.eraseDups in Lean keeps the first occurrence, so you should be able to prove a useful theorem eraseDups_cons which isn't really possible for dedup_cons.

Kim Morrison (Jan 12 2025 at 01:32):

I guess you can do cases splits and then use dedup_cons_of_mem and `dedup_cons_of_not_mem.

Kim Morrison (Jan 12 2025 at 01:34):

Maybe this is out of scope for what you're trying to do, but changing the definition of List.dedup might be sane. :-)

Jared green (Jan 12 2025 at 01:39):

case split on what?

Kim Morrison (Jan 12 2025 at 01:39):

btw, things will be much happier if you write l1 ⊆ l2 rather than l1.Subset l2, because simp lemmas will actually fire. Why did you start using .Subset to begin with?

Kim Morrison (Jan 12 2025 at 01:39):

case splits on hd ∈ tl.

Kim Morrison (Jan 12 2025 at 01:43):

e.g.

import Mathlib.Data.List.Dedup

variable (a : Type) [h : DecidableEq a]

open List

theorem dedup_length (l1 l2 : List a) : l1 ⊆ l2 → l1.dedup.length ≤ l2.dedup.length := by
  induction l1 with
  | nil => simp
  | cons hd tl ih =>
    simp
    intro w₁ w₂
    by_cases h : hd ∈ tl
    · simp_all
    · simp_all
      sorry

gets you to the right goal. It's still tricky from there!

Kim Morrison (Jan 12 2025 at 01:44):

Maybe there's a short-cut, but the brute force way is to deduce from w₁ that l2 = xs ++ hd :: ys (there's a lemma saying that), and then deduce from w₂ and h that tl ⊆ xs ++ ys, and then use lemmas about dedup/append.

Jared green (Jan 12 2025 at 02:08):

of course i had a similar problem, i had to also induct on the second list, but in doing so the proof seemed to repeat itself

Jared green (Jan 12 2025 at 02:35):

how about this?(still not done)

import Mathlib.Data.List.Dedup

variable (a : Type) [h : DecidableEq a]

theorem dedup_length (l1 : List a) : ∀ l2 : List a, l1 ⊆ l2 -> l1.dedup.length  ≤  l2.dedup.length :=
  by
  induction' l1 with hd tl ih
  simp
  simp
  intro l3 hh hhh
  cases' Classical.em (hd ∈ tl) with hl hr
  rw [List.dedup_cons_of_mem]
  apply ih
  exact hhh
  exact hl
  rw [List.dedup_cons_of_not_mem]
  simp
  have htl2 : tl ⊆ l3.filter (fun x => x ≠ hd) := by {
    intro x hx
    rw [List.mem_filter]
    constructor
    exact hhh hx
    simp
    by_contra hhd
    apply hr
    rw [← hhd]
    exact hx
  }
  have h3 := ih (l3.filter (fun x => x ≠ hd)) htl2
  have h4 : l3.dedup.length = (l3.filter (fun x => x ≠ hd)).dedup.length + 1 := by {
    sorry
  }
  rw [h4]
  simp only [ne_eq, Nat.add_le_add_iff_right, ge_iff_le]
  exact h3
  exact hr

Kim Morrison (Jan 12 2025 at 02:49):

You should write structured proofs! Whenever there are multiple goals, using focusing dots, etc. It's pretty unreadable otherwise. :-)

Kim Morrison (Jan 12 2025 at 02:49):

Also, I'm still curious, where did the .Subset come from in your original formulation?

Jared green (Jan 12 2025 at 02:50):

i hadnt bothered typing in the subset symbol before

Jared green (Jan 12 2025 at 04:02):

here it is:

variable (a : Type) [h : DecidableEq a]

theorem nin_filter (b : a) (l : List a): b ∉ l -> l = l.filter (fun x => b ≠ x) :=

  by

  intro hb

  induction' l with hd tl ih

  simp

  rw [List.filter_cons]

  rw [if_pos]

  simp at hb

  have i2 := ih hb.2

  rw [i2]

  simp

  simp

  simp at hb

  exact hb.1

theorem dedup_filter_length : ∀ l : List a,∀ b ∈ l, l.dedup.length = (l.filter (fun x => b ≠ x) ).dedup.length + 1 :=

  by

  intro l b hb

  induction' l with hd tl ih

  simp

  contradiction

  cases' Classical.em (hd ∈ tl) with hl hr

  rw [List.dedup_cons_of_mem]

  cases' Classical.em (b = hd) with hlb hbr

  rw [← hlb]

  simp

  simp only [decide_not] at ih

  apply ih

  rw [hlb]

  exact hl

  cases hb

  contradiction

  rw [List.filter_cons]

  simp

  rw [if_neg]

  rw [List.dedup_cons_of_mem']

  simp only [decide_not] at ih

  apply ih

  assumption

  rw [List.mem_dedup]

  rw [List.mem_filter]

  constructor

  exact hl

  simp

  exact hbr

  exact hbr

  exact hl

  rw [List.dedup_cons_of_not_mem hr]

  cases' Classical.em (b = hd) with hbl hbr

  simp

  rw [List.filter_cons]

  rw [if_neg]

  rw [← hbl] at hr

  clear ih

  have ht : tl = tl.filter (fun x => b ≠ x) := by {

    apply nin_filter

    exact hr

  }

  simp at ht

  nth_rewrite 1 [ht]

  rfl

  simp

  exact hbl

  cases' hb

  contradiction

  rw [List.filter_cons]

  rw [if_pos]

  rw [List.dedup_cons_of_not_mem]

  simp

  simp only [decide_not] at ih

  apply ih

  assumption

  rw [List.mem_filter]

  simp

  contrapose

  intro _

  exact hr

  simp

  exact hbr

theorem dedup_length (l1 : List a) : ∀ l2 : List a, l1 ⊆ l2 -> l1.dedup.length  ≤  l2.dedup.length :=

  by

  induction' l1 with hd tl ih

  simp

  simp

  intro l3 hh hhh

  cases' Classical.em (hd ∈ tl) with hl hr

  rw [List.dedup_cons_of_mem]

  apply ih

  exact hhh

  exact hl

  rw [List.dedup_cons_of_not_mem]

  simp

  have htl2 : tl ⊆ l3.filter (fun x => hd ≠ x) := by {

    intro x hx

    rw [List.mem_filter]

    constructor

    exact hhh hx

    simp

    by_contra hhd

    apply hr

    rw [ hhd]

    exact hx

  }

  have h3 := ih (l3.filter (fun x => hd ≠ x)) htl2

  have h4 : l3.dedup.length = (l3.filter (fun x => hd ≠ x)).dedup.length + 1 := by {

    apply dedup_filter_length

    exact hh

  }

  rw [h4]

  simp only [ne_eq, Nat.add_le_add_iff_right, ge_iff_le]

  exact h3

  exact hr

Kim Morrison (Jan 12 2025 at 05:56):

Nice, well done! Clean it up a bit (e.g. structured proofs) and PR to Mathlib! :-)

Jared green (Jan 12 2025 at 13:40):

surely there must be some other way to control the inductive hypotheses when doing induction multiple times, right?

Eric Wieser (Jan 12 2025 at 15:40):

Maybe the theorem we actually want is that l₁ ⊆ l₂ ↔ l₁.dedup.Subperm l₂.dedup?

Eric Wieser (Jan 12 2025 at 15:43):

A good midway point would be the Iff version of docs#List.Nodup.subperm

Eric Wieser (Jan 12 2025 at 15:59):

And similar for docs#List.subset_dedup

Jared green (Jan 12 2025 at 19:06):

i am using this to prove termination of a recursive function that makes some lists longer but is limited by a list that may get shorter

Kim Morrison (Jan 12 2025 at 21:55):

Eric's suggestions seem good to me. They will prove your result out of simpler components.

Last updated: May 02 2025 at 03:31 UTC