Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: discrete probability

Daniel Selsam (Apr 08 2021 at 15:35):

What is the mathlib way of encoding the following problem statement involving discrete probabilities:
aime-2021-day1-p1.png (the probability part, not the find part)

Floris van Doorn (Apr 08 2021 at 17:28):

The closest existing definition in mathlib is probably docs#pmf for finite probability distributions. However, that takes values in the nonnegative reals, and here you really want to use the fact that the probabilities are rational, so maybe the best encoding is just to define the probability function as a function into ℚ (without the additional info that it is a probability distribution)?

Floris van Doorn (Apr 08 2021 at 18:00):

Here is a way to state this problem by hand:

import data.rat.basic
import algebra.big_operators.basic
import tactic.derive_fintype

open_locale big_operators

abbreviation n := 6

@[derive [decidable_eq, fintype]] inductive racers
| Zou
| Chou

export racers

abbreviation races : Type := fin n → racers

def number_of_times_Zou_won (r : races) : ℕ :=
(finset.univ.filter (λ i : fin n, r i = Zou)).card

def probability_distribution (r : races) : ℚ :=
if r 0 = Chou then 0 else
  ∏ i in finset.univ.erase (0 : fin n), if r i = r (i - 1) then 2 / 3 else 1 / 3

def probability : ℚ :=
∑ r in finset.univ.filter (λ r : races, number_of_times_Zou_won r = 5), probability_distribution r

lemma AIME2021_1 : probability.num + probability.denom = sorry := sorry

I don't think current mathlib will help you all that much simplifying the problem statement. We have finitary products of measures (not yet the fact that the product of probability measures is a probability measure, though that is trivial to add), but we don't have that for discrete probabilities.
Future mathlib could have better support for pmf, including iterated events like here. However, when writing a formal proof for a problem like this, it is probably convenient to stay in ℚ and never mention real numbers.

Floris van Doorn (Apr 08 2021 at 19:24):

Thinking about it more, I think we can generalize pmf to a more general setting if we wanted, something like

def pmf (α β : Type*) [linear_ordered_field β] [topological_space β] : Type* :=
{ f : α → β // has_sum f 1 ∧ ∀ x, 0 ≤ f x }

If we develop the theory pmf a bit more, we could also compute the probability using pmf:

import tactic.derive_fintype
import measure_theory.probability_mass_function

open_locale big_operators nnreal

noncomputable theory

namespace pmf
/-- probability of a set `s` under a pmf `p`. -/
def prob {α : Type*} (p : pmf α) (s : set α) : ℝ≥0 :=
∑' i , s.indicator p i

/-- The pmf when iterating events. The pmf at event `i` can depend on the results of `j < i`. -/
def iterate {ι : Type*} [fintype ι] [partial_order ι] {β : ι → Type*}
  (p : Π i, (Π j < i, β j) → pmf (β i)) : pmf (Π i, β i) :=
⟨λ f, ∏ i, p i (λ j _, f j) (f i), sorry⟩

end pmf
open pmf

abbreviation n := 6

abbreviation racers := bool -- we now define racers to be `bool`, so that we can use `bernoulli` below
abbreviation Zou := tt
abbreviation Chou := ff

abbreviation races : Type := fin n → racers

def number_of_times_Zou_won (r : races) : ℕ :=
(finset.univ.filter (λ i : fin n, r i = Zou)).card

lemma fin.sub_one_lt_self {n} {i : fin (n+1)} (hi : 0 < i) : i - 1 < i := sorry

def probability_distribution : pmf races :=
iterate $ λ i r,
  bernoulli (if h : 0 < i
      then if r (i - 1) (fin.sub_one_lt_self h) = Zou then 2 / 3 else 1 / 3
      else 1)
    (by split_ifs; rw [←nnreal.coe_le_coe]; norm_num)

def probability : ℝ≥0 :=
probability_distribution.prob { r : races | number_of_times_Zou_won r = 5}

If anyone wants to make a PR to mathlib generalizing pmf or proving basic properties of pmf.prob and pmf.iterate, that would definitely be welcome.

Mario Carneiro (Apr 09 2021 at 02:13):

Is this probability computable by #eval? It seems like it would be

Mario Carneiro (Apr 09 2021 at 02:16):

At least with the first one, it seems so:

#eval probability.num + probability.denom -- 97 :)
lemma AIME2021_1 : probability.num + probability.denom = 97 := rfl -- fails :(

Daniel Selsam (Apr 09 2021 at 03:29):

@Floris van Doorn Thank you! Some of the pmf abstractions are indeed nice (iterate, bernoulli, prob). I am not sure if they are worth the cost of proofs-inside-the-statements though. At least in this simple example, the first-principles statement you gave first seems like a fine compromise.

Last updated: May 02 2025 at 03:31 UTC