# Zulip Chat Archive

## Stream: IMO-grand-challenge

### Topic: discrete probability

#### Daniel Selsam (Apr 08 2021 at 15:35):

What is the mathlib way of encoding the following problem statement involving discrete probabilities:

aime-2021-day1-p1.png (the probability part, not the *find* part)

#### Floris van Doorn (Apr 08 2021 at 17:28):

The closest existing definition in mathlib is probably docs#pmf for finite probability distributions. However, that takes values in the nonnegative reals, and here you really want to use the fact that the probabilities are rational, so maybe the best encoding is just to define the probability function as a function into `ℚ`

(without the additional info that it is a probability distribution)?

#### Floris van Doorn (Apr 08 2021 at 18:00):

Here is a way to state this problem by hand:

```
import data.rat.basic
import algebra.big_operators.basic
import tactic.derive_fintype
open_locale big_operators
abbreviation n := 6
@[derive [decidable_eq, fintype]] inductive racers
| Zou
| Chou
export racers
abbreviation races : Type := fin n → racers
def number_of_times_Zou_won (r : races) : ℕ :=
(finset.univ.filter (λ i : fin n, r i = Zou)).card
def probability_distribution (r : races) : ℚ :=
if r 0 = Chou then 0 else
∏ i in finset.univ.erase (0 : fin n), if r i = r (i - 1) then 2 / 3 else 1 / 3
def probability : ℚ :=
∑ r in finset.univ.filter (λ r : races, number_of_times_Zou_won r = 5), probability_distribution r
lemma AIME2021_1 : probability.num + probability.denom = sorry := sorry
```

I don't think current mathlib will help you all that much simplifying the problem statement. We have finitary products of measures (not yet the fact that the product of probability measures is a probability measure, though that is trivial to add), but we don't have that for discrete probabilities.

Future mathlib could have better support for `pmf`

, including iterated events like here. However, when writing a formal proof for a problem like this, it is probably convenient to stay in `ℚ`

and never mention real numbers.

#### Floris van Doorn (Apr 08 2021 at 19:24):

Thinking about it more, I think we can generalize `pmf`

to a more general setting if we wanted, something like

```
def pmf (α β : Type*) [linear_ordered_field β] [topological_space β] : Type* :=
{ f : α → β // has_sum f 1 ∧ ∀ x, 0 ≤ f x }
```

If we develop the theory `pmf`

a bit more, we could also compute the probability using `pmf`

:

```
import tactic.derive_fintype
import measure_theory.probability_mass_function
open_locale big_operators nnreal
noncomputable theory
namespace pmf
/-- probability of a set `s` under a pmf `p`. -/
def prob {α : Type*} (p : pmf α) (s : set α) : ℝ≥0 :=
∑' i , s.indicator p i
/-- The pmf when iterating events. The pmf at event `i` can depend on the results of `j < i`. -/
def iterate {ι : Type*} [fintype ι] [partial_order ι] {β : ι → Type*}
(p : Π i, (Π j < i, β j) → pmf (β i)) : pmf (Π i, β i) :=
⟨λ f, ∏ i, p i (λ j _, f j) (f i), sorry⟩
end pmf
open pmf
abbreviation n := 6
abbreviation racers := bool -- we now define racers to be `bool`, so that we can use `bernoulli` below
abbreviation Zou := tt
abbreviation Chou := ff
abbreviation races : Type := fin n → racers
def number_of_times_Zou_won (r : races) : ℕ :=
(finset.univ.filter (λ i : fin n, r i = Zou)).card
lemma fin.sub_one_lt_self {n} {i : fin (n+1)} (hi : 0 < i) : i - 1 < i := sorry
def probability_distribution : pmf races :=
iterate $ λ i r,
bernoulli (if h : 0 < i
then if r (i - 1) (fin.sub_one_lt_self h) = Zou then 2 / 3 else 1 / 3
else 1)
(by split_ifs; rw [←nnreal.coe_le_coe]; norm_num)
def probability : ℝ≥0 :=
probability_distribution.prob { r : races | number_of_times_Zou_won r = 5}
```

If anyone wants to make a PR to mathlib generalizing `pmf`

or proving basic properties of `pmf.prob`

and `pmf.iterate`

, that would definitely be welcome.

#### Mario Carneiro (Apr 09 2021 at 02:13):

Is this probability computable by `#eval`

? It seems like it would be

#### Mario Carneiro (Apr 09 2021 at 02:16):

At least with the first one, it seems so:

```
#eval probability.num + probability.denom -- 97 :)
lemma AIME2021_1 : probability.num + probability.denom = 97 := rfl -- fails :(
```

#### Daniel Selsam (Apr 09 2021 at 03:29):

@Floris van Doorn Thank you! Some of the `pmf`

abstractions are indeed nice (`iterate`

, `bernoulli`

, `prob`

). I am not sure if they are worth the cost of proofs-inside-the-statements though. At least in this simple example, the first-principles statement you gave first seems like a fine compromise.

Last updated: Aug 16 2022 at 18:20 UTC