Simprocs for the Working Mathematician
Lean v4.6.0 (back in February 2024!) added support for custom simplification procedures, aka simprocs. This blog post is the first in a series of three aimed at explaining what a simproc is, what kind of problems can be solved with simprocs, and what tools we have to write them.
This post describes purely informally what simprocs are and do. The second post will be a walkthrough to writing a simple simproc in three different ways.
To understand what a simproc is and how it works, we will first explain how simp
works.
Then we will give some examples and non-examples of simprocs.
How simp
works
simp
is made of two components.
The first component is rewriting rules.
A rewriting rule has a left hand side (either an expression or an expression pattern), which simp
will use to decide when to use the rewriting rule, and a right hand side, which will be computed from the left hand side and must be equal to it as a mathematical object.
Most rewriting rules are lemmas tagged with @[simp]
that are of the form LHS = RHS
or LHS ↔ RHS
(lemmas that prove P
which is not of the form _ = _
or _ ↔ _
are turned into P = True
), but we will soon see less straightforward examples.
The second one is the simp
tactic, which we will refer to simply as simp
.
When run on a goal ⊢ e
, simp
iteratively looks for a subexpression of e
that matches the left hand side of some rewriting rule, and replaces that subexpression with the right hand side of that rule.
For example, here's the proof steps simp
follows to close the goal 37 * (Nat.fib 0 + 0) = 0
⊢ 37 * (Nat.fib 0 + 0) = 0 ⊢ 37 * (0 + 0) = 0 ⊢ 37 * 0 = 0 ⊢ 0 = 0 ⊢ True
The order in which
simp
traverses an expression is relatively intricate. As a simple approximation, simplification is performed from the inside-out. See the next blog post for more details on the algorithm.If you write
set_option trace.Meta.Tactic.simp true in example : MyGoal := by simp
, you will see the list of simplification stepssimp
performs onMyGoal
.
What is a simproc?
In the previous section, we explained how simp
uses rewriting rules of the form LHS = RHS
or LHS ↔ RHS
to simplify expressions.
Let's now talk about the rewriting rules that are not of this form, aka simprocs.
A simproc is a rewriting rule which, given an expression LHS
matching its left hand side, computes a simpler expression RHS
and constructs a proof of LHS = RHS
on the fly.
The concept of a simproc is genuinely more powerful than that of a simp lemma. Indeed, we will soon see an example of a simproc taking the place of infinitely many simp lemmas.
Examples of simprocs
In this section, we exemplify four simprocs that cover the following use cases:
- Avoiding combinatorial explosion of lemmas
- Computation
- Performance optimisation
Avoiding combinatorial explosion of lemmas: The existsAndEq
simproc
The existsAndEq
simproc is designed to simplify expressions of the form ∃ x, ... ∧ x = a ∧ ...
where a
is some quantity independent of x
by removing the existential quantifier and replacing all occurences of x
by a
.
example (p : Nat → Prop) : ∃ x : Nat, p x ∧ x = 5 := by simp only [existsAndEq] -- Remaining goal: `⊢ p 5` example (p q : Nat → Prop) : ∃ x : Nat, p x ∧ x = 5 ∧ q x := by simp only [existsAndEq] -- Remaining goal: `⊢ p 5 ∧ q 5`
To give simp
this functionality without a simproc, one would have to write infinitely many simp lemmas.
Indeed, the equality x = a
could be hidden arbitrarily deep inside the ∧
.
Technically, one could implement this functionality using finitely many lemmas:
and_assoc
to left-associate all the∧
,and_left_comm (b := _ = _)
to move the=
left-ward,exists_eq_left
to eliminate the=
when it reaches the∃
. This is not useful in practice since it could possibly loop (e.g. if there are two=
, they could be commuted forever) and modifies the expression in unwanted ways, such as reassociating all the∧
, even those outside an∃
.
When presented with a left hand side of the form ∃ x, P x
, where P x
is of the form _ ∧ ... ∧ _
, existsAndEq
does the following:
- Recursively traverse
P x
inside the existential quantifier looking for an equalityx = a
for somea
. - If an equality is found, construct a proof that
∀ x, P x → x = a
. - Return the right hand side
P a
together with the proof obtained from the following lemma:lean lemma exists_of_imp_eq {α : Sort u} {p : α → Prop} (a : α) (h : ∀ x, p x → x = a) : (∃ x, p x) = p a
Computation
Computations are an integral part of theorem proving, and as such there are many ways to perform them.
For example, you will find that the decide
tactic closes most of the examples in this subsection.
There are a few reasons why simprocs are interesting for computation regardless:
-
decide
relies on decidability instances. Not everything one may want to compute is decidable, and not every decidability instance is efficient. In fact, mostDecidable
instances in Lean and Mathlib are very generic, and therefore unspecific and inefficient. Using a simproc gives the opportunity to use a domain-specific algorithm, which is more likely to be efficient and does not rely on the decidability machinery. -
decide
cannot compute the right hand side, given the left hand side only.decide
only works on goals that do not contain any metavariables or free variables. This rules out usingdecide
to find out what a left hand side is equal to. One would need to write down the right hand side we are looking for in order fordecide
to show that it's equal to the left hand side. In particular, using a simproc means we can perform a computation and then continue to simplify the resulting expression within a singlesimp
call.
The Nat.reduceDvd
simproc
The Nat.reduceDvd
simproc is designed to take expressions of the form a | b
where a, b
are numerals (i.e. actual numbers, like 0, 1, 37
, as opposed to variables or expressions thereof), and simplify them to True
or False
.
example : 3 ∣ 9 := by simp only [Nat.reduceDvd] -- `decide` cannot close this goal. example (f : Prop → Prop) : f (2 ∣ 4) = f True := by simp only [Nat.reduceDvd] example : ¬ 2 ∣ 49 := by simp only [Nat.reduceDvd] -- Remaining goal: `¬ False` simp example (a b : Nat) : a ∣ a * b := by simp only [Nat.reduceDvd] --fails
To reiterate one of the points made earlier, this simproc is useful despite doing something that decide
can already do, as it allows the simp
tactic to get rid of certain expressions of the form a ∣ b
, and then simplify the resulting goal (or hypothesis) further.
When presented with a left hand side of the form a ∣ b
where a
and b
are natural numbers, Nat.reduceDvd
does the following:
- Check that
a
andb
are numerals. - Compute
b % a
. - If
b % a
is zero, then return the right hand sideTrue
together with the proofNat.dvd_eq_true_of_mod_eq_zero a b rfl
that(b % a = 0) = True
. - If
b % a
isn't zero, then return the right hand sideFalse
together with the proofNat.dvd_eq_false_of_mod_ne_zero a b rfl
that(b % a = 0) = False
.
The Finset.Icc_ofNat_ofNat
simproc
Note: this
simproc
is not in Mathlib yet (see #22039).
If a
and b
are in a (locally finite) partial order (if you don't know what this means, you can safely ignore these terms and think of the natural numbers instead), then Finset.Icc a b
for a
and b
is the finite set of elements lying between a
and b
.
The Finset.Icc_ofNat_ofNat
simproc is designed to take expressions of the form Finset.Icc a b
where a
and b
are numerals, and simplify them to an explicit set.
example : Finset.Icc 1 0 = ∅ := by simp only [Icc_ofNat_ofNat] example : Finset.Icc 1 1 = {1} := by simp only [Icc_ofNat_ofNat] example : Finset.Icc 1 4 = {1, 2, 3, 4} := by simp only [Icc_ofNat_ofNat] example (a : Nat) : Finset.Icc a (a + 2) = {a, a + 1, a + 2} := by -- fails: the bounds of the interval aren't numerals! simp only [Icc_ofNat_ofNat]
When presented with a left hand side of the form Finset.Icc a b
where a
and b
are natural numbers, Finset.Icc_ofNat_ofNat
does the following:
- Check that
a
andb
are numerals. - compute the expression
{a, ..., b}
recursively onb
, along with the proof that it equalsFinset.Icc a b
.
Performance optimisation: The reduceIte
simproc
The reduceIte
simproc is designed to take expressions of the form if P then a else b
(aka ite P a b
) and replace them with a
or b
, depending on whether P
simplifies to True
or False
.
This can be achieved with simp lemmas too, but it would be less efficient:
When encountering ite P a b
, simp lemmas would first simplify b
, then a
, then P
, and finally ite P a b
.
Assuming P
was simplified to True
, ite P a b
would be simplified to a
and all the work spent on simplifying b
would be lost.
If P
was simplified to False
, then the work spent on simplifying a
would be lost instead.
The point of reduceIte
is that it can be made to simplify P
, then ite P a b
, without simplifying a
and b
first.
For this to happen, one needs to call reduceIte
as a preprocedure, which is done by adding ↓
in front of its name, i.e. simp only [↓reduceIte]
.
Recall that, as a simple approximation, simplification is performed from the inside-out. What we just explained is a concrete example of simplification happening from the outside-in.
Let's see a few examples:
example : (if 37 * 0 = 0 then 1 else 2) = 1 := by -- Works since `simp` can simplify `37 * 0 = 0` to `True` -- because it knows the lemma `mul_zero a : a * 0 = 0`. simp only [↓reduceIte] example (P : Prop) (a b : Nat) : (if P ∨ ¬ P then a else b) = a := by -- Works since `simp` can simplify `P ∨ ¬ P` to `True` using `Decidable.em`. simp only [↓reduceIte, Decidable.em] open scoped Classical in -- Can be removed once Kevin Buzzard is done with the FLT project ;) example : (if FermatLastTheorem then 1 else 2) = 1 := by --This fails because `simp` can't simplify `FermatLastTheorem` to `True` or `False` simp only [↓reduceIte] -- Remaining goal: `⊢ (if FermatLastTheorem then 1 else 2) = 1` sorry -- See https://imperialcollegelondon.github.io/FLT for how to solve this `sorry` ;)
When presented with a left hand side of the form ite P a b
, reduceIte
does the following:
- Call
simp
onP
to get a simplified expressionP'
and a proofh
thatP = P'
. - If
P'
isTrue
then return the right hand sidea
together with the proofite_cond_eq_true r
thatite P a b = a
. - If
P'
isFalse
then return the right hand sideb
together with the proofite_cond_eq_false r
thatite P a b = b
.
Many more applications!
In the second blog post, we will see how to build step by step a simproc for computing a variant of List.range
when the parameter is a numeral.
A few caveats
The current design of simprocs comes with a few restrictions that are worth keeping in mind:
- By definition, a simproc can only be used in
simp
(and tactics that callsimp
under the hood, such assimp_rw
,simpa
,aesop
,norm_num
, etc...), even though the notion of a "modular lemma" could be useful in other rewriting tactics likerw
. -
One cannot provide arguments to a simproc to restrict the occurrences it rewrites.
In contrast, this is possible for lemmas in all rewriting tactics: e.g.
rw [add_comm c]
turns⊢ a + b = c + d
into⊢ a + b = d + c
whererw [add_comm]
would instead have turned it into⊢ b + a = c + d
.