Zulip Chat Archive

Stream: new members

Topic: Any articles or references about the implement of `simp`?

White Chen (Oct 15 2024 at 03:06):

I want to understand the specific implementation principles of the simp tactic. I am currently reading the source code for simp, but I feel that I lack some background knowledge (or prerequisite knowledge), which makes it somewhat difficult. I would like to ask if there are any articles that directly and thoroughly explain the specific principles behind simp?

I have already read Simp, but this article does not delve into the specific implementation details.

White Chen (Oct 15 2024 at 03:09):

I have already read most of Type Theory and Formal Proof: An Introduction. To understand the implementation of the simp tactic, are there any additional books or articles I should read?

Johan Commelin (Oct 15 2024 at 07:10):

Are you familiar with how rw works?

White Chen (Oct 15 2024 at 14:25):

I apologize for the late reply.

Before reading the simp code, I hadn't really read the implementation of rw. However, I've just studied it, and I believe I can understand the implementation of rw. To ensure I truly understand it, I'd like to briefly explain my understanding of rewrite:

I think the key to rewrite lies in the _root_.Lean.MVarId.rewrite function. In this function, I believe the most important lines are:

let eAbst ← withConfig (fun oldConfig => { config, oldConfig with }) <| kabstract e lhs config.occs

and

let eNew := eAbst.instantiate1 rhs

These two lines accomplish the rewriting behavior we want.

The first line abstracts all occurrences of p in our target expression e as a loose bound var. To match all occurrences of p in e, we define a recursive visit function in kabstract. This recursive function first checks if the entire expression e is definitionally equal to p (actually, it first checks e.toHeadIndex and e.headNumArgs for performance). If not equal, it uses visitChildren to recursively check if e's subexpressions are definitionally equal to p.

The second line then instantiates this newly created loose bound var with rhs.

Of course, this isn't the entirety of rewrite. Subsequently, rewriteTarget or rewriteLocalDecl is needed to replace the old goal or local declaration, as well as replaceMainGoal. My understanding of this part isn't very deep, but I don't think it affects my understanding of the rewrite implementation mechanism, so I'll skip it.

This is my understanding of rewrite. If there are any errors or omissions, please feel free to point them out and provide additional information.

Next, I'd like to add to my original question.

Regarding the motivation for studying simp: Initially, I planned to study (or even participate in) the aesop project, but due to unfamiliarity with meta-programming and functional programming, I was forced to temporarily abandon this plan. Instead, I decided to study some commonly used tactics first. (Of course, I should have started with rw; simp seems more complex than I imagined.)

While reading the simp-related code, I did encounter some specific difficulties, such as DiscrTree, Simproc, Method, etc., but I guess it would be better to create dedicated topic threads to inquire about these?

Kyle Miller (Oct 15 2024 at 15:31):

simp is a fairly complicated tactic, since it has a lot of features yet needs to be efficient, but the core idea is that given a term e, simp will produce a proof of an equality h : e = e'. If ?g is the goal and ?g : e, then you can use h to form a new goal ?g' : e' and assign ?g := cast (Eq.symm h) ?g'.

The algorithm is recursive and iterative. Suppose's looking at e. In its basic form:

It looks for a @[simp↓] lemma (a "pre simp lemma") whose LHS matches e exactly. Then e' is the resulting RHS and the simp lemma is the proof of e = e'. If this succeeds, start step 1 again with e' to get e'' and use Eq.trans to chain the equalities.
It constructs a congruence lemma using e. For example, if e is f x y, it will construct a lemma a = a' -> b = b' -> f a b = f a' b'. Simp recurses, simplifying x to x' and y to y'. The resulting equalities are plugged into the congruence lemma, and the result is f x' y'. Or, if e is a fun, it will use funext to "enter" the function.
It tries step one again but with the @[simp] lemmas (the "post simp lemmas", the default). If it succeeds, start again from step 1.
Otherwise, return that nothing happened. The result is e with proof rfl : e = e.

Here are a couple explanations I found that I'd written before on simp and its relationship to rw (it's a very different algorithm):
https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/simp.20vs.20rw.20again/near/391348828
https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/SimpTheorems/near/408624455
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/algorithmic.20documentation.20of.20simp/near/278613746

Kyle Miller (Oct 15 2024 at 15:32):

The conv tactic uses simp-like machinery (congruence lemmas and funext) to navigate into subexpressions.

Kyle Miller (Oct 15 2024 at 15:52):

DiscrTree is a data structure to accelerate looking for applicable simp lemmas.

A simproc is a simplification procedure that can do arbitrary transformations to an expression. Each simp lemma can be thought of as a simple simproc that matches on some specific LHS pattern and replaces it with a specific RHS. You can make simprocs to add together number literals for example, something you can't do with simp lemmas.

A Method corresponds to actions that occur at different steps of the simp algorithm, like whether to apply @[simp] lemmas at all. It's highly configurable.

Last updated: May 02 2025 at 03:31 UTC