Simprocs for the Working Mathematician
Lean v4.6.0 (back in February 2024!) added support for custom simplification procedures, aka simprocs. This blog post is the first in a series of three aimed at explaining what a simproc is, what kind of problems can be solved with simprocs, and what tools we have to write them.
This post describes purely informally what simprocs are and do. The second post will be a walkthrough to writing a simple simproc in three different ways.
To understand what a simproc is and how it works, we will first explain how simp works.
Then we will give some examples and non-examples of simprocs.
How simp works
simp is made of two components.
The first component is rewriting rules.
A rewriting rule has a left hand side (either an expression or an expression pattern), which simp will use to decide when to use the rewriting rule, and a right hand side, which will be computed from the left hand side and must be equal to it as a mathematical object.
Most rewriting rules are lemmas tagged with @[simp] that are of the form LHS = RHS or LHS ↔ RHS (lemmas that prove P which is not of the form _ = _ or _ ↔ _ are turned into P = True), but we will soon see less straightforward examples.
The second one is the simp tactic, which we will refer to simply as simp.
When run on a goal ⊢ e, simp iteratively looks for a subexpression of e that matches the left hand side of some rewriting rule, and replaces that subexpression with the right hand side of that rule.
For example, here's the proof steps simp follows to close the goal 37 * (Nat.fib 0 + 0) = 0
⊢ 37 * (Nat.fib 0 + 0) = 0 ⊢ 37 * (0 + 0) = 0 ⊢ 37 * 0 = 0 ⊢ 0 = 0 ⊢ True
The order in which
simptraverses an expression is relatively intricate. As a simple approximation, simplification is performed from the inside-out. See the next blog post for more details on the algorithm.If you write
set_option trace.Meta.Tactic.simp true in example : MyGoal := by simp, you will see the list of simplification stepssimpperforms onMyGoal.
What is a simproc?
In the previous section, we explained how simp uses rewriting rules of the form LHS = RHS or LHS ↔ RHS to simplify expressions.
Let's now talk about the rewriting rules that are not of this form, aka simprocs.
A simproc is a rewriting rule which, given an expression LHS matching its left hand side, computes a simpler expression RHS and constructs a proof of LHS = RHS on the fly.
The concept of a simproc is genuinely more powerful than that of a simp lemma. Indeed, we will soon see an example of a simproc taking the place of infinitely many simp lemmas.
Examples of simprocs
In this section, we exemplify four simprocs that cover the following use cases:
- Avoiding combinatorial explosion of lemmas
- Computation
- Performance optimisation
Avoiding combinatorial explosion of lemmas: The existsAndEq simproc
The existsAndEq simproc is designed to simplify expressions of the form ∃ x, ... ∧ x = a ∧ ... where a is some quantity independent of x by removing the existential quantifier and replacing all occurences of x by a.
example (p : Nat → Prop) : ∃ x : Nat, p x ∧ x = 5 := by simp only [existsAndEq] -- Remaining goal: `⊢ p 5` example (p q : Nat → Prop) : ∃ x : Nat, p x ∧ x = 5 ∧ q x := by simp only [existsAndEq] -- Remaining goal: `⊢ p 5 ∧ q 5`
To give simp this functionality without a simproc, one would have to write infinitely many simp lemmas.
Indeed, the equality x = a could be hidden arbitrarily deep inside the ∧.
Technically, one could implement this functionality using finitely many lemmas:
and_assocto left-associate all the∧,and_left_comm (b := _ = _)to move the=left-ward,exists_eq_leftto eliminate the=when it reaches the∃. This is not useful in practice since it could possibly loop (e.g. if there are two=, they could be commuted forever) and modifies the expression in unwanted ways, such as reassociating all the∧, even those outside an∃.
When presented with a left hand side of the form ∃ x, P x, where P x is of the form _ ∧ ... ∧ _, existsAndEq does the following:
- Recursively traverse
P xinside the existential quantifier looking for an equalityx = afor somea. - If an equality is found, construct a proof that
∀ x, P x → x = a. - Return the right hand side
P atogether with the proof obtained from the following lemma:lean lemma exists_of_imp_eq {α : Sort u} {p : α → Prop} (a : α) (h : ∀ x, p x → x = a) : (∃ x, p x) = p a
Computation
Computations are an integral part of theorem proving, and as such there are many ways to perform them.
For example, you will find that the decide tactic closes most of the examples in this subsection.
There are a few reasons why simprocs are interesting for computation regardless:
-
deciderelies on decidability instances. Not everything one may want to compute is decidable, and not every decidability instance is efficient. In fact, mostDecidableinstances in Lean and Mathlib are very generic, and therefore unspecific and inefficient. Using a simproc gives the opportunity to use a domain-specific algorithm, which is more likely to be efficient and does not rely on the decidability machinery. -
decidecannot compute the right hand side, given the left hand side only.decideonly works on goals that do not contain any metavariables or free variables. This rules out usingdecideto find out what a left hand side is equal to. One would need to write down the right hand side we are looking for in order fordecideto show that it's equal to the left hand side. In particular, using a simproc means we can perform a computation and then continue to simplify the resulting expression within a singlesimpcall.
The Nat.reduceDvd simproc
The Nat.reduceDvd simproc is designed to take expressions of the form a | b where a, b are numerals (i.e. actual numbers, like 0, 1, 37, as opposed to variables or expressions thereof), and simplify them to True or False.
example : 3 ∣ 9 := by simp only [Nat.reduceDvd] -- `decide` cannot close this goal. example (f : Prop → Prop) : f (2 ∣ 4) = f True := by simp only [Nat.reduceDvd] example : ¬ 2 ∣ 49 := by simp only [Nat.reduceDvd] -- Remaining goal: `¬ False` simp example (a b : Nat) : a ∣ a * b := by simp only [Nat.reduceDvd] --fails
To reiterate one of the points made earlier, this simproc is useful despite doing something that decide can already do, as it allows the simp tactic to get rid of certain expressions of the form a ∣ b, and then simplify the resulting goal (or hypothesis) further.
When presented with a left hand side of the form a ∣ b where a and b are natural numbers, Nat.reduceDvd does the following:
- Check that
aandbare numerals. - Compute
b % a. - If
b % ais zero, then return the right hand sideTruetogether with the proofNat.dvd_eq_true_of_mod_eq_zero a b rflthat(b % a = 0) = True. - If
b % aisn't zero, then return the right hand sideFalsetogether with the proofNat.dvd_eq_false_of_mod_ne_zero a b rflthat(b % a = 0) = False.
The Finset.Icc_ofNat_ofNat simproc
Note: this
simprocis not in Mathlib yet (see #22039).
If a and b are in a (locally finite) partial order (if you don't know what this means, you can safely ignore these terms and think of the natural numbers instead), then Finset.Icc a b for a and b is the finite set of elements lying between a and b.
The Finset.Icc_ofNat_ofNat simproc is designed to take expressions of the form Finset.Icc a b where a and b are numerals, and simplify them to an explicit set.
example : Finset.Icc 1 0 = ∅ := by simp only [Icc_ofNat_ofNat] example : Finset.Icc 1 1 = {1} := by simp only [Icc_ofNat_ofNat] example : Finset.Icc 1 4 = {1, 2, 3, 4} := by simp only [Icc_ofNat_ofNat] example (a : Nat) : Finset.Icc a (a + 2) = {a, a + 1, a + 2} := by -- fails: the bounds of the interval aren't numerals! simp only [Icc_ofNat_ofNat]
When presented with a left hand side of the form Finset.Icc a b where a and b are natural numbers, Finset.Icc_ofNat_ofNat does the following:
- Check that
aandbare numerals. - compute the expression
{a, ..., b}recursively onb, along with the proof that it equalsFinset.Icc a b.
Performance optimisation: The reduceIte simproc
The reduceIte simproc is designed to take expressions of the form if P then a else b (aka ite P a b) and replace them with a or b, depending on whether P simplifies to True or False.
This can be achieved with simp lemmas too, but it would be less efficient:
When encountering ite P a b, simp lemmas would first simplify b, then a, then P, and finally ite P a b.
Assuming P was simplified to True, ite P a b would be simplified to a and all the work spent on simplifying b would be lost.
If P was simplified to False, then the work spent on simplifying a would be lost instead.
The point of reduceIte is that it can be made to simplify P, then ite P a b, without simplifying a and b first.
For this to happen, one needs to call reduceIte as a preprocedure, which is done by adding ↓ in front of its name, i.e. simp only [↓reduceIte].
Recall that, as a simple approximation, simplification is performed from the inside-out. What we just explained is a concrete example of simplification happening from the outside-in.
Let's see a few examples:
example : (if 37 * 0 = 0 then 1 else 2) = 1 := by -- Works since `simp` can simplify `37 * 0 = 0` to `True` -- because it knows the lemma `mul_zero a : a * 0 = 0`. simp only [↓reduceIte] example (P : Prop) (a b : Nat) : (if P ∨ ¬ P then a else b) = a := by -- Works since `simp` can simplify `P ∨ ¬ P` to `True` using `Decidable.em`. simp only [↓reduceIte, Decidable.em] open scoped Classical in -- Can be removed once Kevin Buzzard is done with the FLT project ;) example : (if FermatLastTheorem then 1 else 2) = 1 := by --This fails because `simp` can't simplify `FermatLastTheorem` to `True` or `False` simp only [↓reduceIte] -- Remaining goal: `⊢ (if FermatLastTheorem then 1 else 2) = 1` sorry -- See https://imperialcollegelondon.github.io/FLT for how to solve this `sorry` ;)
When presented with a left hand side of the form ite P a b, reduceIte does the following:
- Call
simponPto get a simplified expressionP'and a proofhthatP = P'. - If
P'isTruethen return the right hand sideatogether with the proofite_cond_eq_true rthatite P a b = a. - If
P'isFalsethen return the right hand sidebtogether with the proofite_cond_eq_false rthatite P a b = b.
Many more applications!
In the second blog post, we will see how to build step by step a simproc for computing a variant of List.range when the parameter is a numeral.
A few caveats
The current design of simprocs comes with a few restrictions that are worth keeping in mind:
- By definition, a simproc can only be used in
simp(and tactics that callsimpunder the hood, such assimp_rw,simpa,aesop,norm_num, etc...), even though the notion of a "modular lemma" could be useful in other rewriting tactics likerw. -
One cannot provide arguments to a simproc to restrict the occurrences it rewrites.
In contrast, this is possible for lemmas in all rewriting tactics: e.g.
rw [add_comm c]turns⊢ a + b = c + dinto⊢ a + b = d + cwhererw [add_comm]would instead have turned it into⊢ b + a = c + d.