Zulip Chat Archive

Stream: Polynomial Freiman-Ruzsa conjecture

Topic: Proof of Kaimonovich-Vershik - how to improve?

Terence Tao (Nov 20 2023 at 19:13):

Hi everyone. I just managed to complete a proof of the Kaimonovich-Vershik inequality https://teorth.github.io/pfr/blueprint/sect0003.html#kv at https://github.com/teorth/pfr/blob/master/PFR/ruzsa_distance.lean :

/-- Obvious extension of Fin.forall_fin_two -/
lemma Fin.forall_fin_three {p : Fin 3 → Prop} : (∀ (i : Fin 3), p i) ↔ p 0 ∧ p 1 ∧ p 2 :=
  Fin.forall_fin_succ.trans <| and_congr_right fun _ => Fin.forall_fin_two

/-- H[X + Y + Z] - H[X + Y] \leq H[Y+Z] - H[Y]. -/
lemma Kaimonovich_Vershik {X Y Z : Ω → G} (h: iIndepFun (fun i ↦ hG) ![X,Y,Z] μ) (hX: Measurable X) (hY: Measurable Y) (hZ: Measurable Z) [IsProbabilityMeasure μ]: H[ X + Y + Z ; μ] - H[ X + Y ; μ] ≤ H[ Y + Z ; μ] - H[ Y; μ ] := by
  suffices : (H[X; μ] + H[Y;μ] + H[Z;μ]) + H[ X + Y + Z ; μ] ≤ (H[X;μ] + H[ Y + Z ; μ]) + (H[Z;μ] + H[ X + Y ; μ])
  . linarith
  have : ∀ (i : Fin 3), Measurable (![X,Y,Z] i) := by
    rw [Fin.forall_fin_three]
    constructor
    . dsimp
      exact hX
    constructor
    . dsimp
      exact hY
    dsimp
    exact hZ
  convert entropy_triple_add_entropy_le _ hX hZ (Measurable.add' hX (Measurable.add' hY hZ)) using 2
  . calc
      H[X; μ] + H[Y;μ] + H[Z;μ] = H[⟨ X, Y ⟩; μ] + H[Z;μ] := by
        congr 1
        symm; apply entropy_pair_eq_add' hX hY
        have := iIndepFun.indepFun h (show 0 ≠ 1 by decide)
        simp at this
        assumption
      _ = H[⟨ ⟨ X, Y ⟩, Z ⟩; μ] := by
        symm; apply entropy_pair_eq_add' (Measurable.prod_mk hX hY) hZ
        exact iIndepFun.indepFun_prod h this 0 1 2 (by decide) (by decide)
      _ = H[⟨ X, ⟨ Z , X + (Y+Z) ⟩ ⟩; μ] := by
        apply entropy_of_comp_eq_of_comp μ _ _ (fun x ↦ (x.1.1, (x.2, x.1.1+x.1.2+x.2))) (fun y ↦ ((y.1, y.2.2-y.1-y.2.1), y.2.1))
        . unfold prod
          funext ω
          dsimp
          congr 2
          abel
        . unfold prod
          funext ω
          dsimp
          congr 2
          abel
        . exact Measurable.prod_mk (Measurable.prod_mk hX hY) hZ
        apply Measurable.prod_mk hX (Measurable.prod_mk hZ _)
        exact Measurable.add' hX (Measurable.add' hY hZ)
  . rw [add_assoc]
  . refine entropy_pair_eq_add' hX (hY.add hZ) ?_ |>.symm.trans ?_
    . apply IndepFun.symm
      exact iIndepFun.add h this 1 2 0 (by decide) (by decide)
    symm
    exact entropy_of_shear_eq hX (hY.add hZ)
  refine entropy_pair_eq_add' hZ (hX.add hY) ?_ |>.symm.trans ?_
  . apply IndepFun.symm
    exact iIndepFun.add h this 0 1 2 (by decide) (by decide)
  rw [show X + (Y + Z) = Z + (X + Y) by abel]
  symm
  exact entropy_of_shear_eq hZ (hX.add hY)

As you can see, the proof is... not elegant. I found myself surprisingly bogged down by minutae, firstly to show (see the first have statement) that every element of the vector ![X,Y,Z] was measurable, and secondly to show that the random variables ⟨⟨X, Y⟩, Z⟩ and ⟨X, ⟨Z, X + (Y + Z)⟩⟩ were functions of each other, and hence had the same entropy. I did introduce a new lemma entropy_of_shear_eq https://teorth.github.io/pfr/docs/PFR/ruzsa_distance.html#entropy_of_shear_eq which did help with some of the calculations, but it is still rather lengthy, and many of the other lemmas still to be proven in the project will face similar challenges. Any suggestions on how to make life easier and the proofs shorter and more readable would be welcome. For instance, could there be a measurability tactic to automate the verification of the measurability of variables such as ⟨X, ⟨Z, X + (Y + Z)⟩⟩? As one can see, this wasn't the most painful part of the proof (it's the part involving Measurable.prod_mk and Measurable.add, which is only three lines long), but every little bit would help.

Heather Macbeth (Nov 20 2023 at 19:15):

This seems like an interesting test case. I should have some time to look at it later today, if others don't get there first.

Adam Topaz (Nov 20 2023 at 19:23):

There already is a measurability tactic in mathlib with docs#Measurable.prod tagged as a lemma for it to use. Does that make things any easier?

Terence Tao (Nov 20 2023 at 19:25):

Adam Topaz said:

There already is a measurability tactic in mathlib with docs#Measurable.prod tagged as a lemma for it to use. Does that make things any easier?

Ooh, it works! Okay, that's one less issue then.

Adam Topaz (Nov 20 2023 at 19:29):

And in case it helps, here's a trick to deal with the three cases:

import Mathlib

open MeasureTheory

example {X Y Z : Ω → G} [MeasurableSpace Ω] [MeasurableSpace G]
    (hX: Measurable X) (hY: Measurable Y) (hZ: Measurable Z) :
    ∀ (i : Fin 3), Measurable (![X,Y,Z] i) := by
  intro i ; fin_cases i <;> assumption

Adam Topaz (Nov 20 2023 at 19:30):

The <;> is a tactic combinator where t <;> s means "apply the tactic s to all goals produced by t".

Terence Tao (Nov 20 2023 at 19:37):

These help, thanks! Here is a slightly cleaned up version (now also using all_goals to reduce the clutter a little).

/-- $$H[X + Y + Z] - H[X + Y] \leq H[Y+Z] - H[Y].$$ -/
lemma Kaimonovich_Vershik {X Y Z : Ω → G} (h: iIndepFun (fun i ↦ hG) ![X,Y,Z] μ) (hX: Measurable X) (hY: Measurable Y) (hZ: Measurable Z) [IsProbabilityMeasure μ]: H[ X + Y + Z ; μ] - H[ X + Y ; μ] ≤ H[ Y + Z ; μ] - H[ Y; μ ] := by
  suffices : (H[X; μ] + H[Y;μ] + H[Z;μ]) + H[ X + Y + Z ; μ] ≤ (H[X;μ] + H[ Y + Z ; μ]) + (H[Z;μ] + H[ X + Y ; μ])
  . linarith
  have : ∀ (i : Fin 3), Measurable (![X,Y,Z] i) := by
    intro i ; fin_cases i <;> assumption
  convert entropy_triple_add_entropy_le _ hX hZ (show Measurable (X + (Y+Z)) by measurability) using 2
  . calc
      H[X; μ] + H[Y;μ] + H[Z;μ] = H[⟨ X, Y ⟩; μ] + H[Z;μ] := by
        congr 1
        symm; apply entropy_pair_eq_add' hX hY
        have := iIndepFun.indepFun h (show 0 ≠ 1 by decide)
        simp at this
        assumption
      _ = H[⟨ ⟨ X, Y ⟩, Z ⟩; μ] := by
        symm; apply entropy_pair_eq_add' (Measurable.prod_mk hX hY) hZ
        exact iIndepFun.indepFun_prod h this 0 1 2 (by decide) (by decide)
      _ = H[⟨ X, ⟨ Z , X + (Y+Z) ⟩ ⟩; μ] := by
        apply entropy_of_comp_eq_of_comp μ (by measurability) (by measurability) (fun x ↦ (x.1.1, (x.2, x.1.1+x.1.2+x.2))) (fun y ↦ ((y.1, y.2.2-y.1-y.2.1), y.2.1))
        all_goals {
          unfold prod
          funext ω
          dsimp
          congr 2
          abel
        }
  . rw [add_assoc]
  . refine entropy_pair_eq_add' hX (hY.add hZ) ?_ |>.symm.trans ?_
    . apply IndepFun.symm
      exact iIndepFun.add h this 1 2 0 (by decide) (by decide)
    symm
    exact entropy_of_shear_eq hX (hY.add hZ)
  refine entropy_pair_eq_add' hZ (hX.add hY) ?_ |>.symm.trans ?_
  . apply IndepFun.symm
    exact iIndepFun.add h this 0 1 2 (by decide) (by decide)
  rw [show X + (Y + Z) = Z + (X + Y) by abel]
  symm
  exact entropy_of_shear_eq hZ (hX.add hY)

Terence Tao (Nov 20 2023 at 19:45):

Having an independence tactic to deduce, for instance, independence of X and Y+Z from the joint independence of ![X Y Z] wouuld be nice.

Adam Topaz (Nov 20 2023 at 19:47):

For defining functions out of products, there is an option to write things like fun ((x,y),z) ↦ (x, (z, x+y+z))) and avoid all the .1.2 overload.

Adam Topaz (Nov 20 2023 at 19:51):

So for example the few lines in the middle of your proof could be written as follows:

        apply entropy_of_comp_eq_of_comp μ _ _
          (fun ((x,y),z) ↦ (x, (z, x+y+z))) (fun (x,y,z) ↦ ((x, z-x-y), y))
        . funext ω ; dsimp [prod] ; ext <;> dsimp ; abel
        . funext ω ; dsimp [prod] ; ext <;> dsimp ; abel
        . measurability
        · measurability

Adam Topaz (Nov 20 2023 at 19:52):

(or you could inline measurability as in the most recent edit)

Terence Tao (Nov 20 2023 at 19:53):

Nice! That definitely enhances readability. New version:

/-- $$H[X + Y + Z] - H[X + Y] \leq H[Y+Z] - H[Y].$$ -/
lemma Kaimonovich_Vershik {X Y Z : Ω → G} (h: iIndepFun (fun _ ↦ hG) ![X,Y,Z] μ) (hX: Measurable X) (hY: Measurable Y) (hZ: Measurable Z) [IsProbabilityMeasure μ]: H[ X + Y + Z ; μ] - H[ X + Y ; μ] ≤ H[ Y + Z ; μ] - H[ Y; μ ] := by
  suffices : (H[X; μ] + H[Y;μ] + H[Z;μ]) + H[ X + Y + Z ; μ] ≤ (H[X;μ] + H[ Y + Z ; μ]) + (H[Z;μ] + H[ X + Y ; μ])
  . linarith
  have : ∀ (i : Fin 3), Measurable (![X,Y,Z] i) := by
    intro i ; fin_cases i <;> assumption
  convert entropy_triple_add_entropy_le _ hX hZ (show Measurable (X + (Y+Z)) by measurability) using 2
  . calc
      H[X; μ] + H[Y;μ] + H[Z;μ] = H[⟨ X, Y ⟩; μ] + H[Z;μ] := by
        congr 1
        symm; apply entropy_pair_eq_add' hX hY
        convert iIndepFun.indepFun h (show 0 ≠ 1 by decide)
      _ = H[⟨ ⟨ X, Y ⟩, Z ⟩; μ] := by
        symm; apply entropy_pair_eq_add' (Measurable.prod_mk hX hY) hZ
        exact iIndepFun.indepFun_prod h this 0 1 2 (by decide) (by decide)
      _ = H[⟨ X, ⟨ Z , X + (Y+Z) ⟩ ⟩; μ] := by
        apply entropy_of_comp_eq_of_comp μ (by measurability) (by measurability) (fun ((x,y),z) ↦ (x, (z, x+y+z))) (fun (a,(b,c)) ↦ ((a, c-a-b), b))
        all_goals { funext ω; dsimp [prod]; ext <;> dsimp; abel }
  . rw [add_assoc]
  . refine entropy_pair_eq_add' hX (hY.add hZ) ?_ |>.symm.trans ?_
    . apply IndepFun.symm
      exact iIndepFun.add h this 1 2 0 (by decide) (by decide)
    symm
    exact entropy_of_shear_eq hX (hY.add hZ)
  refine entropy_pair_eq_add' hZ (hX.add hY) ?_ |>.symm.trans ?_
  . apply IndepFun.symm
    exact iIndepFun.add h this 0 1 2 (by decide) (by decide)
  rw [show X + (Y + Z) = Z + (X + Y) by abel]
  symm
  exact entropy_of_shear_eq hZ (hX.add hY)

Terence Tao (Nov 20 2023 at 19:56):

If I make changes such as replacing Measurable.prod_mk hX hY by by measurability would that significantly reduce performance? It's a minor improvement in readability so not sure if it is worth it.

Adam Topaz (Nov 20 2023 at 19:59):

I think it's fine. In fact, I think it might make sense to just use by measurability as the default value for the various measurability assumptions you have around.

Terence Tao (Nov 20 2023 at 20:00):

OK. I'm taking a break for now but can implement these later (or Heather can, if she wants to look at it). It's all pushed to Github.

Terence Tao (Nov 20 2023 at 20:01):

If we had a similar independence tactic to resolve all the independence hypotheses then actually this begins to track the human proof reasonably well.

Last updated: May 02 2025 at 03:31 UTC