Documentation

Mathlib.Tactic.RewriteSearch

The rw_search tactic #

rw_search attempts to solve an equality goal by repeatedly rewriting using lemmas from the library.

If no solution is found, the best sequence of rewrites found before maxHeartbeats elapses is returned.

The search is a best-first search, minimising the Levenshtein edit distance between the pretty-printed expressions on either side of the equality. (The strings are tokenized at spaces, separating delimiters (, ), [, ], and , into their own tokens.)

The implementation avoids completely computing edit distances where possible, only computing lower bounds sufficient to decide which path to take in the search.

Future improvements #

We could call simp as an atomic step of rewriting.

The edit distance heuristic could be replaced by something else.

No effort has been made to choose the best tokenization scheme, and this should be investigated. Moreover, the Levenshtein distance function is customizable with different weights for each token, and it would be interesting to try optimizing these (or dynamically updating them, adding weight to tokens that persistently appear on one side of the equation but not the other.)

The rw_search tactic will rewrite by local hypotheses, but will not use local hypotheses to discharge side conditions. This limitation would need to be resolved in the rw? tactic first.

Separate a string into a list of strings by pulling off initial ( or ] characters, and pulling off terminal ), ], or , characters.

Equations
  • One or more equations did not get rendered due to their size.
Instances For

    Pull off leading delimiters.

    Pull off trailing delimiters.

    Tokenize a string at whitespace, and then pull off delimiters.

    Equations
    • One or more equations did not get rendered due to their size.
    Instances For

      Data structure containing the history of a rewrite search.

      • mk' :: (
        • history : Array ( × Lean.Expr × Bool)

          The lemmas used so far.

        • The metavariable context after rewriting. We carry this around so the search can safely backtrack.

        • The current goal.

        • type : Lean.Expr

          The type of the current goal.

        • ppGoal : String

          The pretty printed current goal.

        • lhs : List String

          The tokenization of the left-hand-side of the current goal.

        • rhs : List String

          The tokenization of the right-hand-side of the current goal.

        • rfl? : Option Bool

          Whether the current goal can be closed by rfl (or none if this hasn't been test yet).

        • dist? : Option

          The edit distance between the tokenizations of the two sides (or none if this hasn't been computed yet).

      • )
      Instances For

        What is the cost for changing a token? Levenshtein.defaultCost just uses constant cost 1 for any token.

        It may be interesting to try others. the only one I've experimented with so far is Levenshtein.stringLogLengthCost, which performs quite poorly!

        Equations
        Instances For

          Check whether a goal can be solved by rfl, and fill in the SearchNode.rfl? field.

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For

            Fill in the SearchNode.dist? field with the edit distance between the two sides.

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Represent a search node as string, solely for debugging.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                Construct a SearchNode.

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For

                  Add an additional step to the SearchNode history.

                  Equations
                  Instances For

                    Report the index of the most recently applied lemma, in the ordering returned by rw?.

                    Equations
                    • n.lastIdx = match n.history.back? with | some (k, snd) => k | none => 0
                    Instances For
                      Equations
                      • One or more equations did not get rendered due to their size.

                      A somewhat arbitrary penalty function. Note that n.lastIdx penalizes using later lemmas from a particular call to rw? at a node, but once we have moved on to the next node these penalties are "forgiven".

                      (You might in interpret this as encouraging the algorithm to "trust" the ordering provided by rw?.)

                      I tried out a various (positive) linear combinations of .history.size, .lastIdx, and .ppGoal.length (and also the .log2s of these).

                      • .lastIdx.log2 is quite good, and the best coefficient is around 1.
                      • .lastIdx / 10 is almost as good.
                      • .history.size makes things worse (similarly with .log2).
                      • .ppGoal.length makes little difference (similarly with .log2). Here testing consisting of running the current rw_search test suite, rejecting values for which any failed, and trying to minimize the run time reported by
                      lake build &&  \
                      time (lake env lean test/RewriteSearch/Basic.lean; \
                        lake env lean test/RewriteSearch/Polynomial.lean)
                      

                      With a larger test suite it might be worth running this minimization again, and considering other penalty functions.

                      (If you do this, please choose a penalty function which is in the interior of the region where the test suite works. I think it would be a bad idea to optimize the run time at the expense of fragility.)

                      Equations
                      • n.penalty = n.lastIdx.log2 + n.ppGoal.length.log2
                      Instances For
                        @[reducible, inline]

                        The priority function for search is Levenshtein distance plus a penalty.

                        Equations
                        Instances For
                          @[reducible, inline]

                          We can obtain lower bounds, and improve them, for the Levenshtein distance.

                          Equations
                          Instances For

                            Given a RewriteResult from the rw? tactic, create a new SearchNode with the new goal.

                            Equations
                            • n.rewrite r k = Lean.Meta.withMCtx r.mctx do let goal'n.goal.replaceTargetEq r.result.eNew r.result.eqProof let __do_liftLean.getMCtx n.push r.expr r.symm k goal' (some __do_lift)
                            Instances For

                              Given a pair of DiscrTree trees indexing all rewrite lemmas in the imported files and the current file, try rewriting the current goal in the SearchNode by one of them, returning a MLList MetaM SearchNode, i.e. a lazy list of next possible goals.

                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For

                                Perform best first search on the graph of rewrites from the specified SearchNode.

                                Equations
                                • One or more equations did not get rendered due to their size.
                                Instances For

                                  rw_search attempts to solve an equality goal by repeatedly rewriting using lemmas from the library.

                                  If no solution is found, the best sequence of rewrites found before maxHeartbeats elapses is returned.

                                  The search is a best-first search, minimising the Levenshtein edit distance between the pretty-printed expressions on either side of the equality. (The strings are tokenized at spaces, separating delimiters (, ), [, ], and , into their own tokens.)

                                  You can use rw_search [-my_lemma, -my_theorem] to prevent rw_search from using the names theorems.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For