Zulip Chat Archive

Stream: mathlib4

Topic: rw? does not find lemma

Michael Stoll (Nov 09 2023 at 20:53):

In the following tactic state

xy: ℂ
⊢ -Real.pi < (Complex.log y).im + (Complex.log x).im

(with import Mathlib.Analysis.SpecialFunctions.Complex.Log)
rw? does not suggest using Complex.log_im, which I think it should do (and high up in the list).

Looking at

import Mathlib.Analysis.SpecialFunctions.Complex.Log

example (x : ℂ) : x.log.im = 0 := by rw?

example (x : ℂ) : -Real.pi < x.log.im := by rw?

example (x : ℂ) : -Real.pi < x.log.im + x.re := by rw?

suggests that this is a problem with sorting the results (Complex.log_im shows up fairly early (no. 2) in the first example, quite a bit further down (no. 11) in the second, and not at all among the 21 suggestions in the third).
@Scott Morrison ?

Scott Morrison (Nov 10 2023 at 00:21):

Sorting results is hard. Before trying to make further progress on that, we would need to work out how to test/compare proposed changes.

Scott Morrison (Nov 10 2023 at 00:22):

Ideas on that very welcome.

Kevin Buzzard (Nov 10 2023 at 07:18):

I pointed out in another thread once that a problem would be solved "if the results were sorted better" and someone responded that if the results were sorted better then I had just made a powerful new automation tactic which we don't have, and thus the problem of sorting results better was probably hard. It was this observation which made me understand the heart of the problem.

Michael Stoll (Nov 10 2023 at 07:20):

Would it be possible to give rw? some hints à la loogle? E.g., rw? "log" or similar, to get it to propose (only) lemmas that have "log" in the name.

Joachim Breitner (Nov 10 2023 at 10:11):

Once the loogle backend (#find) reaches mathlib or std or so, composing the two should certainly be possible!

Thomas Murrills (Nov 10 2023 at 13:36):

Scott Morrison said:

Sorting results is hard. Before trying to make further progress on that, we would need to work out how to test/compare proposed changes.

One metric could arise from looking at existing rw's in mathlib: replace every rw with rw?, and see how far up the actually-used rules are in the generated list.

There are (at least) two tricky parts: (1) getting a number (2) handling chains of rewrites.

Let's assume for now that we just want rw? to produce the first lemma in a rw list (assuming it's a non-backwards use of a constant). The most basic data we get from a test like this is then a set of Option Nats—for each rw usage, either rw? includes the lemma at some index in its list, or not at all. It might be worth just looking at a histogram—to get a single number out of this, you need some way of weighting the different outcomes or something, and you'd lose important information. (But maybe you could eventually do regression/train a neural net on the basis of this...)
Each rw in mathlib can be a chain rw [lem1, lem2, lem3...]. This actually gives us a lot of data, as we can split it into rw [lem1]; rw [lem2]; rw [lem3]; .... We want to ignore non-constant, non-forwards rewrite rules. But there's another problem—what about when rewrites commute? The order shouldn't be special—we can't ask rw? to know which should come first on the basis of an example like this. Further, now we have more data: if rw [lem1, lem2] ~ rw [lem2, lem1], then rw? can "get points" for suggesting lem2 after rw [lem1], and for suggesting lem1 after rw [lem2]. Figuring out how to deal with this would probably involve a lot of gruesome details. Depending on how often rewrites actually can commute in practice, though, it might be worth it.

Jireh Loreaux (Nov 10 2023 at 13:52):

Thomas, I suspect the approach you suggested won't do at all what you want. For instance, it's going to overweight generally applicable rewrite lemmas (e.g., things like add_comm may show up high on the list, especially because they are not @[simp] lemmas), and disregard the rarely used (but super important and applicable) ones.

Jireh Loreaux (Nov 10 2023 at 13:54):

In fact, I would almost guess that reversing the order you suggested would be the more useful one.

Thomas Murrills (Nov 10 2023 at 14:16):

I'm responding specifically to the question of how to gauge changes to rw?, not proposing how to weight lemmas in the first place! I agree that finding highly specific lemmas is of paramount importance to the weighting question, though. But you can phrase that need within the framework of the method I'm proposing, by weighting different (goal, lemma, ranking) frequencies in a manner sensitive to e.g. the frequency of the lemmas' usage in rw, the "volatility" of the goal (how many different it can be usefully rewritten), maybe some measure of "rewrite complexity", etc. I'm making no statements on how things ought to be weighted to determine rw?'s "score", besides the most general parts of the procedure! :)

Last updated: May 02 2025 at 03:31 UTC