Zulip Chat Archive
Stream: Is there code for X?
Topic: a deduped list is longer than a subset deduped
Jared green (Jan 12 2025 at 01:03):
i was trying to prove this, but i couldnt figure out how to do it
import Mathlib.Data.List.Dedup
variable (a : Type) [h : DecidableEq a]
theorem dedup_length (l1 l2 : List a) : l1.Subset l2 -> l1.dedup.length ≤ l2.dedup.length :=
by
sorry
Kim Morrison (Jan 12 2025 at 01:07):
Please try to make #mwe's work: you are missing a typeclass. You can click the button in the top right of the code block to open your code in the sandbox to verify that it is what you intended.
Jared green (Jan 12 2025 at 01:07):
try again
Kim Morrison (Jan 12 2025 at 01:08):
Do you know how Subset
is defined? Perhaps try inducting on l1
and unfolding the definition of Subset.
Jared green (Jan 12 2025 at 01:08):
i tried that
Kim Morrison (Jan 12 2025 at 01:08):
Can you show what you got to, then?
Kim Morrison (Jan 12 2025 at 01:09):
(It's important to avoid "low effort" sorries; if you've already tried something, show that in your question! If you have an idea for how the proof should working informally, explain that in text.)
Jared green (Jan 12 2025 at 01:11):
but i didnt know what to do next
Kim Morrison (Jan 12 2025 at 01:12):
Yes, that's why I'm asking you to show what you have done, so someone can help you work out what to do next.
Kim Morrison (Jan 12 2025 at 01:13):
To be blunt, asking "low effort" sorries is just asking for someone to do your work for you, rather than asking to learn.
Jared green (Jan 12 2025 at 01:19):
import Mathlib.Data.List.Dedup
variable (a : Type) [h : DecidableEq a]
theorem dedup_length (l1 l2 : List a) : l1.Subset l2 -> l1.dedup.length ≤ l2.dedup.length :=
by
induction' l1 with hd tl ih
simp
unfold List.Subset
simp
intro hh hhh
have hhhh : (hd :: tl).dedup.length ≤ tl.dedup.length := by {
unfold List.dedup
simp
unfold List.pwFilter
--here
}
sorry
Kim Morrison (Jan 12 2025 at 01:31):
Hm, yeah, List.dedup
as defined in Mathlib has pretty terrible properties, because it keeps the last occurrence of each element, making it very hard to do inductive arguments about it.
Kim Morrison (Jan 12 2025 at 01:31):
List.eraseDups
in Lean keeps the first occurrence, so you should be able to prove a useful theorem eraseDups_cons
which isn't really possible for dedup_cons
.
Kim Morrison (Jan 12 2025 at 01:32):
I guess you can do cases splits and then use dedup_cons_of_mem
and `dedup_cons_of_not_mem
.
Kim Morrison (Jan 12 2025 at 01:34):
Maybe this is out of scope for what you're trying to do, but changing the definition of List.dedup
might be sane. :-)
Jared green (Jan 12 2025 at 01:39):
case split on what?
Kim Morrison (Jan 12 2025 at 01:39):
btw, things will be much happier if you write l1 ⊆ l2
rather than l1.Subset l2
, because simp lemmas will actually fire. Why did you start using .Subset
to begin with?
Kim Morrison (Jan 12 2025 at 01:39):
case splits on hd ∈ tl
.
Kim Morrison (Jan 12 2025 at 01:43):
e.g.
import Mathlib.Data.List.Dedup
variable (a : Type) [h : DecidableEq a]
open List
theorem dedup_length (l1 l2 : List a) : l1 ⊆ l2 → l1.dedup.length ≤ l2.dedup.length := by
induction l1 with
| nil => simp
| cons hd tl ih =>
simp
intro w₁ w₂
by_cases h : hd ∈ tl
· simp_all
· simp_all
sorry
gets you to the right goal. It's still tricky from there!
Kim Morrison (Jan 12 2025 at 01:44):
Maybe there's a short-cut, but the brute force way is to deduce from w₁
that l2 = xs ++ hd :: ys
(there's a lemma saying that), and then deduce from w₂
and h
that tl ⊆ xs ++ ys
, and then use lemmas about dedup/append.
Jared green (Jan 12 2025 at 02:08):
of course i had a similar problem, i had to also induct on the second list, but in doing so the proof seemed to repeat itself
Jared green (Jan 12 2025 at 02:35):
how about this?(still not done)
import Mathlib.Data.List.Dedup
variable (a : Type) [h : DecidableEq a]
theorem dedup_length (l1 : List a) : ∀ l2 : List a, l1 ⊆ l2 -> l1.dedup.length ≤ l2.dedup.length :=
by
induction' l1 with hd tl ih
simp
simp
intro l3 hh hhh
cases' Classical.em (hd ∈ tl) with hl hr
rw [List.dedup_cons_of_mem]
apply ih
exact hhh
exact hl
rw [List.dedup_cons_of_not_mem]
simp
have htl2 : tl ⊆ l3.filter (fun x => x ≠ hd) := by {
intro x hx
rw [List.mem_filter]
constructor
exact hhh hx
simp
by_contra hhd
apply hr
rw [← hhd]
exact hx
}
have h3 := ih (l3.filter (fun x => x ≠ hd)) htl2
have h4 : l3.dedup.length = (l3.filter (fun x => x ≠ hd)).dedup.length + 1 := by {
sorry
}
rw [h4]
simp only [ne_eq, Nat.add_le_add_iff_right, ge_iff_le]
exact h3
exact hr
Kim Morrison (Jan 12 2025 at 02:49):
You should write structured proofs! Whenever there are multiple goals, using focusing dots, etc. It's pretty unreadable otherwise. :-)
Kim Morrison (Jan 12 2025 at 02:49):
Also, I'm still curious, where did the .Subset
come from in your original formulation?
Jared green (Jan 12 2025 at 02:50):
i hadnt bothered typing in the subset symbol before
Jared green (Jan 12 2025 at 04:02):
here it is:
variable (a : Type) [h : DecidableEq a]
theorem nin_filter (b : a) (l : List a): b ∉ l -> l = l.filter (fun x => b ≠ x) :=
by
intro hb
induction' l with hd tl ih
simp
rw [List.filter_cons]
rw [if_pos]
simp at hb
have i2 := ih hb.2
rw [i2]
simp
simp
simp at hb
exact hb.1
theorem dedup_filter_length : ∀ l : List a,∀ b ∈ l, l.dedup.length = (l.filter (fun x => b ≠ x) ).dedup.length + 1 :=
by
intro l b hb
induction' l with hd tl ih
simp
contradiction
cases' Classical.em (hd ∈ tl) with hl hr
rw [List.dedup_cons_of_mem]
cases' Classical.em (b = hd) with hlb hbr
rw [← hlb]
simp
simp only [decide_not] at ih
apply ih
rw [hlb]
exact hl
cases hb
contradiction
rw [List.filter_cons]
simp
rw [if_neg]
rw [List.dedup_cons_of_mem']
simp only [decide_not] at ih
apply ih
assumption
rw [List.mem_dedup]
rw [List.mem_filter]
constructor
exact hl
simp
exact hbr
exact hbr
exact hl
rw [List.dedup_cons_of_not_mem hr]
cases' Classical.em (b = hd) with hbl hbr
simp
rw [List.filter_cons]
rw [if_neg]
rw [← hbl] at hr
clear ih
have ht : tl = tl.filter (fun x => b ≠ x) := by {
apply nin_filter
exact hr
}
simp at ht
nth_rewrite 1 [ht]
rfl
simp
exact hbl
cases' hb
contradiction
rw [List.filter_cons]
rw [if_pos]
rw [List.dedup_cons_of_not_mem]
simp
simp only [decide_not] at ih
apply ih
assumption
rw [List.mem_filter]
simp
contrapose
intro _
exact hr
simp
exact hbr
theorem dedup_length (l1 : List a) : ∀ l2 : List a, l1 ⊆ l2 -> l1.dedup.length ≤ l2.dedup.length :=
by
induction' l1 with hd tl ih
simp
simp
intro l3 hh hhh
cases' Classical.em (hd ∈ tl) with hl hr
rw [List.dedup_cons_of_mem]
apply ih
exact hhh
exact hl
rw [List.dedup_cons_of_not_mem]
simp
have htl2 : tl ⊆ l3.filter (fun x => hd ≠ x) := by {
intro x hx
rw [List.mem_filter]
constructor
exact hhh hx
simp
by_contra hhd
apply hr
rw [ hhd]
exact hx
}
have h3 := ih (l3.filter (fun x => hd ≠ x)) htl2
have h4 : l3.dedup.length = (l3.filter (fun x => hd ≠ x)).dedup.length + 1 := by {
apply dedup_filter_length
exact hh
}
rw [h4]
simp only [ne_eq, Nat.add_le_add_iff_right, ge_iff_le]
exact h3
exact hr
Kim Morrison (Jan 12 2025 at 05:56):
Nice, well done! Clean it up a bit (e.g. structured proofs) and PR to Mathlib! :-)
Jared green (Jan 12 2025 at 13:40):
surely there must be some other way to control the inductive hypotheses when doing induction multiple times, right?
Eric Wieser (Jan 12 2025 at 15:40):
Maybe the theorem we actually want is that l₁ ⊆ l₂ ↔ l₁.dedup.Subperm l₂.dedup
?
Eric Wieser (Jan 12 2025 at 15:43):
A good midway point would be the Iff
version of docs#List.Nodup.subperm
Eric Wieser (Jan 12 2025 at 15:59):
And similar for docs#List.subset_dedup
Jared green (Jan 12 2025 at 19:06):
i am using this to prove termination of a recursive function that makes some lists longer but is limited by a list that may get shorter
Kim Morrison (Jan 12 2025 at 21:55):
Eric's suggestions seem good to me. They will prove your result out of simpler components.
Last updated: May 02 2025 at 03:31 UTC