# The `rw_search`

tactic #

`rw_search`

attempts to solve an equality goal
by repeatedly rewriting using lemmas from the library.

If no solution is found,
the best sequence of rewrites found before `maxHeartbeats`

elapses is returned.

The search is a best-first search, minimising the Levenshtein edit distance between
the pretty-printed expressions on either side of the equality.
(The strings are tokenized at spaces,
separating delimiters `(`

, `)`

, `[`

, `]`

, and `,`

into their own tokens.)

The implementation avoids completely computing edit distances where possible, only computing lower bounds sufficient to decide which path to take in the search.

# Future improvements #

We could call `simp`

as an atomic step of rewriting.

The edit distance heuristic could be replaced by something else.

No effort has been made to choose the best tokenization scheme, and this should be investigated. Moreover, the Levenshtein distance function is customizable with different weights for each token, and it would be interesting to try optimizing these (or dynamically updating them, adding weight to tokens that persistently appear on one side of the equation but not the other.)

The `rw_search`

tactic will rewrite by local hypotheses,
but will not use local hypotheses to discharge side conditions.
This limitation would need to be resolved in the `rw?`

tactic first.

Separate a string into a list of strings by pulling off initial `(`

or `]`

characters,
and pulling off terminal `)`

, `]`

, or `,`

characters.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Pull off leading delimiters.

Pull off trailing delimiters.

Tokenize a string at whitespace, and then pull off delimiters.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Data structure containing the history of a rewrite search.

- mk' :: (
The lemmas used so far.

- mctx : Lean.MetavarContext
The metavariable context after rewriting. We carry this around so the search can safely backtrack.

- goal : Lean.MVarId
The current goal.

- type : Lean.Expr
The type of the current goal.

- ppGoal : String
The pretty printed current goal.

The tokenization of the left-hand-side of the current goal.

The tokenization of the right-hand-side of the current goal.

Whether the current goal can be closed by

`rfl`

(or`none`

if this hasn't been test yet).The edit distance between the tokenizations of the two sides (or

`none`

if this hasn't been computed yet).- )

## Instances For

What is the cost for changing a token?
`Levenshtein.defaultCost`

just uses constant cost `1`

for any token.

It may be interesting to try others.
the only one I've experimented with so far is `Levenshtein.stringLogLengthCost`

,
which performs quite poorly!

## Equations

- Mathlib.Tactic.RewriteSearch.SearchNode.editCost = Levenshtein.defaultCost

## Instances For

Check whether a goal can be solved by `rfl`

, and fill in the `SearchNode.rfl?`

field.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Fill in the `SearchNode.dist?`

field with the edit distance between the two sides.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Represent a search node as string, solely for debugging.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Construct a `SearchNode`

.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Construct an initial `SearchNode`

from a goal.

## Equations

## Instances For

Add an additional step to the `SearchNode`

history.

## Equations

- Mathlib.Tactic.RewriteSearch.SearchNode.push n expr symm k g ctx = Mathlib.Tactic.RewriteSearch.SearchNode.mk (Array.push n.history (k, expr, symm)) g ctx

## Instances For

Report the index of the most recently applied lemma, in the ordering returned by `rw?`

.

## Equations

- Mathlib.Tactic.RewriteSearch.SearchNode.lastIdx n = match Array.back? n.history with | some (k, snd) => k | none => 0

## Instances For

## Equations

- One or more equations did not get rendered due to their size.

A somewhat arbitrary penalty function.
Note that `n.lastIdx`

penalizes using later lemmas from a particular call to `rw?`

at a node,
but once we have moved on to the next node these penalties are "forgiven".

(You might in interpret this as encouraging
the algorithm to "trust" the ordering provided by `rw?`

.)

I tried out a various (positive) linear combinations of
`.history.size`

, `.lastIdx`

, and `.ppGoal.length`

(and also the `.log2`

s of these).

`.lastIdx.log2`

is quite good, and the best coefficient is around 1.`.lastIdx / 10`

is almost as good.`.history.size`

makes things worse (similarly with`.log2`

).`.ppGoal.length`

makes little difference (similarly with`.log2`

). Here testing consisting of running the current`rw_search`

test suite, rejecting values for which any failed, and trying to minimize the run time reported by

```
lake build && \
time (lake env lean test/RewriteSearch/Basic.lean; \
lake env lean test/RewriteSearch/Polynomial.lean)
```

With a larger test suite it might be worth running this minimization again, and considering other penalty functions.

(If you do this, please choose a penalty function which is in the interior of the region where the test suite works. I think it would be a bad idea to optimize the run time at the expense of fragility.)

## Equations

## Instances For

The priority function for search is Levenshtein distance plus a penalty.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

We can obtain lower bounds, and improve them, for the Levenshtein distance.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Given a `RewriteResult`

from the `rw?`

tactic, create a new `SearchNode`

with the new goal.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Given a pair of `DiscrTree`

trees
indexing all rewrite lemmas in the imported files and the current file,
try rewriting the current goal in the `SearchNode`

by one of them,
returning a `MLList MetaM SearchNode`

, i.e. a lazy list of next possible goals.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

Perform best first search on the graph of rewrites from the specified `SearchNode`

.

## Equations

- One or more equations did not get rendered due to their size.

## Instances For

`rw_search`

attempts to solve an equality goal
by repeatedly rewriting using lemmas from the library.

If no solution is found,
the best sequence of rewrites found before `maxHeartbeats`

elapses is returned.

The search is a best-first search, minimising the Levenshtein edit distance between
the pretty-printed expressions on either side of the equality.
(The strings are tokenized at spaces,
separating delimiters `(`

, `)`

, `[`

, `]`

, and `,`

into their own tokens.)

You can use `rw_search [-my_lemma, -my_theorem]`

to prevent `rw_search`

from using the names theorems.

## Equations

- One or more equations did not get rendered due to their size.