Zulip Chat Archive

Stream: lean4

Topic: simp tries more general lemmas first

Jovan Gerbscheid (Feb 21 2025 at 23:10):

Discrimination tree lookup gives the result (generally) in order from most general to most specific match. Type class search is smart and traverses this array backwards. simp is not so smart and traverses the array forwards, meaning that more general lemmas are tried before more specific lemmas.

This caused a performance problem here. I believe @Joachim Breitner looked into this at some point, but I'm not sure what the status of that is.

Or am I missing a reason why more generall lemmas would better be applied first?

Jovan Gerbscheid (Feb 21 2025 at 23:13):

The file Mathlib.Data.NNRat.Defs contains this library note, which would become unnecessary if this is fixed in simp: (written by @Yaël Dillies)

library_note "specialised high priority simp lemma" /--
It sometimes happens that a @[simp] lemma declared early in the library can be proved by simp
using later, more general simp lemmas. In that case, the following reasons might be arguments for
the early lemma to be tagged @[simp high] (rather than @[simp, nolint simpNF] or
un@[simp]ed):

There is a significant portion of the library which needs the early lemma to be available via
simp and which doesn't have access to the more general lemmas.
The more general lemmas have more complicated typeclass assumptions, causing rewrites with them
to be slower.
-/

Joachim Breitner (Feb 22 2025 at 07:29):

Not sure if I remember all the details. I think the issue is the relative ordering of the entries for star and for a concrete key. But trying to be clever about the the meaning of the array that comes out of that isn't great software design, it depends too much on the internals of that trie data structure.

So if the relative order of lemmas matter, maybe better to use explicit priorities (and comments!) for a robust approach. Or else, sort the matching lemma using some clearly defined valuation function (could be size of the lhs or number of side conditions or so).

Joachim Breitner (Feb 22 2025 at 07:31):

Ah, here is a related issue:
https://github.com/leanprover/lean4/issues/4173

Jovan Gerbscheid (Feb 22 2025 at 08:17):

I would disagree that being clever about the meaning of the array is a bad idea. In fact, when thinking about type classes, this ordering is very important and convenient and is relied on in many places. In the vast majority of cases, a more specific instance should be aplied first.

I do agree though that it would be better if this order was documented more explicitly, or even was explicitly mainained with some matching score (such as in my RefinedDiscrTree). For example currently, the pattern (_ + (2 + 3)) is counted as more general than (1 + _), due to the greedy nature of the order (it first looks in the left argument). This could then be turned around.

Joachim Breitner (Feb 22 2025 at 08:28):

Jovan Gerbscheid said:

a more specific instance should be aplied first.

Absolutely! But what it means to be “more specific” shouldn’t depend on whether the DiscrTree happens to use a * or a something else. A change like yours making the tree more efficient with forall and lambda shouldn't affect the semantics. And if someone rips out the discrtree and does a linear search through all instances they should still get the same behavior. This, to me, clearly points towards a solution that sorts the theorems post-hoc, and where “more specific ” is a property of the theorem, not their discr tree or storage point in the tree.

Jovan Gerbscheid (Feb 22 2025 at 08:40):

I think there is a reason to have 'specificness' be decided in the discrimination tree. For example the pattern for instances foo bar * and foo * bar are the same global specificness. But when applied to the target foo bar *, you would want the first lemma to be considdered more specific. (This argument only applies to type class search, where * patterns in the target are allowed)

Jovan Gerbscheid (Feb 22 2025 at 08:41):

And I think it's OK that if the discrimination tree does switch to a better indexing, that then the sorting of specificness improves as well.

Jovan Gerbscheid (Feb 22 2025 at 18:56):

By the way, when I was working on the extensible push_neg tactic, I encountered the case where I wanted x<y to be negated to y≤x, but for e.g. natural numbers, 0<n should be negated to n=0. I then got confused why the lemma didn't apply.

My point is that I would like to be able to rely on simp picking the closer match (discrimination tree wise) and if it's not clear which match is closer, then I (as the simp set creator) can step in and add priorities.

Jovan Gerbscheid (Feb 24 2025 at 23:52):

I started with a PR to change the order in which simp lemmas of equal priority are tried. A side effect of this change is that more recently added simp lemmas are tried before older lemmas (when they have the same priority). Is this a desirable change?
lean4#7195

Jovan Gerbscheid (Feb 25 2025 at 00:10):

That is, for lemmas that have the same priority and the same pattern, the most recent one is tried.

Joachim Breitner (Feb 25 2025 at 08:15):

I think this observation supports my argument that ideally the lemma order should be independent of the implementation details of some index data structure.

Note that I am not saying that your vision for meassuring “specificness” is wrong, quite the contrary! But I’d like that to be an explicit notion; say a function that takes the goal and the lhs and calculates a score, or compares the applicable lemmas, independent of how the set of applicable lemmas were found.

This would also help with proof stability under import reordering.

Or put differently: simp -index should (ideally) behave the same, just slower.

And the DiscrTree should be only concerned with efficiently retrieving those lemmas/instances that could be defeq to the key, not more, not less. This gives a very clear specification to this somewhat complex beast, and nicely decouples it from the rest of the system.

Jovan Gerbscheid (Feb 25 2025 at 11:36):

I agree that this shows that the ordering should be independent from the implementation details of the DiscrTree. I suppose that decoupling this would also mean that simp and type class search could both specify their own heuristics for this ordering, if that turns out to be desirable (e.g. simp trying older lemmas first and type class search trying newer instances first)

Kevin Buzzard (Feb 25 2025 at 12:51):

I've never been particularly convinced by the "try newer instances first" heuristic for typeclass inference. Mathlib defines groups and rings very early on, and instantly proves that a group is a monoid and a ring is a monoid, and then much later on you create some far more exotic reasons why something might be a monoid, and then lean tries all of these first before trying to see if the thing is a group or a ring, which is often the right answer. Fortunately it's often quick to fail, but there are times when it wastes a lot of time trying to see if something is an ordered monoid or a topological field or whatever when the answer is right in front of it

Jovan Gerbscheid (Feb 25 2025 at 13:39):

What sort of heuristic do you think would be good for simp and type class search when choosing what to apply first?

It would be nice if it isn't too complicated, so that it is predictable. I think the specificness of the discrimination tree match is quite a good heuristic. Another option is to look at how complicated the (type class) hypotheses of the lemma are, but that seems quite complicated to measure.

Jovan Gerbscheid (Feb 25 2025 at 13:39):

For simp I found some cases where you do want the more general lemma to apply: for example in False ↔ False, you might prefer the lemma about p ↔ p instead of False ↔ p. And for if p then False else False the lemma about if p then x else x instead of if p then False else x. But I think it would be natural to tag these with simp high to explicitly enforce this. Althoug in my RefinedDiscrTree these get the same match score, because the repeated p or x makes the match be more specific, since RefinedDiscrTree keeps track of which pattern variables are the same.

Kevin Buzzard (Feb 25 2025 at 19:05):

My understanding is that the problems we sometimes see with lean having a local context full of algebra and then suddenly typeclass inference taking a voyage into topology or order theory are going to be solved (and perhaps in some cases have already been solved) by refactoring the typeclass hierarchies so that these non-algebra things are mixins.

Last updated: May 02 2025 at 03:31 UTC