Zulip Chat Archive

Stream: general

Topic: Making library_search smarter

Stuart Presnell (May 19 2022 at 19:59):

I'm sure this has been asked before, but how difficult would it be to make library_search a little smarter and a little less frustrating?

For example, the following fails:

import data.nat.basic
open nat

example {m : } (h : m  0) (n : ) : 0 < m + n :=

because library_search can't fill in h.bot_lt : 0 < m that would then let it find the solution exact add_pos_left h.bot_lt n.

A similar obstacle arises when we have something of the form h : x < y in context but a lemma takes x ≤ y, and library_search can't produce h.le to use the lemma.

So how difficult would it be to get library_search to (virtually) add things like this to the current context before searching?

Adam Topaz (May 19 2022 at 20:02):

Library search has a fairly new feature where you can write library_search with h (maybe it's using instead of with? I can't remember). Does this work in your example?

Yaël Dillies (May 19 2022 at 20:02):

iIRC this only filters the results. It doesn't make library_search stronger per se.

Adam Topaz (May 19 2022 at 20:02):

Ah okay.

Adam Topaz (May 19 2022 at 20:04):

In such situations I tend to write have := h.bot_lt and try library search after that, and modify the code after the fact

Stuart Presnell (May 19 2022 at 20:22):

Yeah, I've resorted to doing the same hand-holding to help library_search do its job, but if you have several such inequalities this can be a lot of effort.

Damiano Testa (May 19 2022 at 22:42):

Lines 60-63 of tactic.suggest, in docs#tactic.suggest.allowed_head_symbols, seem to give special support for ≤ and < on ℕ. Maybe, simply adding some support for ≠ 0 could be a simple modification with good results. The main heuristic would be that the missed conversion n ≠ 0 ↔ 0 < n may be one of the most frustrating failures of library_search.

I have not looked deeper at the tactic, so I'm not really sure if this is a good idea or not.

Kevin Buzzard (May 20 2022 at 03:17):

Every tweak of the algorithm to make one random case work better of course comes with the risk that it makes everything else worse. library_search is already slow when there are heavy imports. But I think there's a more fundamental reason why one shouldn't be doing this.

It's pretty clear what the job of library_search is, and that's to find exact matches against results in the library rather than to do any thinking for itself. That's the formal spec of the tactic. If that spec isn't right for you then this doesn't mean the tactic needs "fixing", it means you're using the wrong tactic. Like Adam I use the workaround he describes: if I'm not sure what the input will be to the function I'm hoping is there, I hedge by adding more stuff to the local context, like h.le or h.ne' or whatever.

I was once frustrated by linarith for similar reasons: I couldn't see why it couldn't use h : a < b \and P to deduce a < b until it was spelt out to me that this sort of idea goes under "mission creep" which is not the sort of thing one wants in general from a tactic.

However my memory is that actually some minor tweaks were added to linarith to deal with a very small number of edge cases of this nature, and there is a chance that someone will come up with a minor tweak that solves the problem above. As I say there's then a risk that ten other people want other bells and whistles of this form. And if you want a tactic which applies two results from the library to solve a goal rather than one then this isn't library_search, it's a hammer, which typically takes much longer to run.

The alternative fix is to come up with a convention for how these functions should look in the library. For example we could try and stick to the convention that for hypotheses we always demand ne and for conclusions lt, or that hypotheses and conclusions should always have the same choice, and then the user will know when to add stuff to the local context to maximise the chances that things work. Another possibility is library_search! which starts by adding all these extra bells and whistles to the hypotheses and conclusions before starting, and of course comes with a health warning that because it's now trying to solve 2^n times as many problems it will be 2^n times slower, which might not be what everyone wants but at least they can now avoid it.

Violeta Hernández (May 20 2022 at 03:31):

I agree. library_search isn't a magic wand that finds your theorem no matter how you write it down, and it isn't meant to be either.

Violeta Hernández (May 20 2022 at 03:31):

(funnily enough I find that squeeze_simp often does a better job than library_search does, haha)

Alex J. Best (May 20 2022 at 07:17):

I'd definitely vote for a library_search! if not just adding these tweaks to library search, for me the goal of library search is to be maximally useful rather than appear in final proofs, so I don't mind it being a bit more hacky than other things spec-wise personally. The slowness argument is of course something to worry about, but I think changing nat ne 0 to 0 < would save a lot of people frustration in the long run.

Eric Wieser (May 20 2022 at 07:34):

Don't we already have library_search!?

Kevin Buzzard (May 20 2022 at 07:36):

OK then maybe library_search++

Ruben Van de Velde (May 20 2022 at 07:45):

I was going to suggest it would be useful to have library_search [h.le] instead of have := h.le, library_search, but the docs say that already works

Kevin Buzzard (May 20 2022 at 08:02):

@Stuart Presnell is the current tactic documentation (e.g. Rob's six talks starting here) good enough for you to try making the modification yourself? I know that @Damiano Testa recently decided to learn tactic writing and seemed to get up to speed very quickly.

Damiano Testa (May 20 2022 at 08:10):

Besides the excellent videos that Rob has and that Kevin mentioned, I also learned a lot of meta-programming thanks to the help of Arthur Paulino. Arthur wrote a preliminary version of the move_add tactic that I wanted to have and this has been incredibly beneficial for me. And of course, all the follow up discussions that we had about it!

I am not really in a position to prepare a canvas for the tactic, but I try to answer questions in the meta-programming stream, if I think that I can add anything helpful!

Damiano Testa (May 20 2022 at 08:15):

Also, I was initially intimidated by expr. It is a very sophisticated concept, but it is also something with which you have a lot of unconscious experience if you have played with formalization for a bit!

Stuart Presnell (May 20 2022 at 13:18):

Kevin Buzzard said:

Stuart Presnell is the current tactic documentation (e.g. Rob's six talks starting here) good enough for you to try making the modification yourself?

Thanks for the link! Looks like I've got myself a project for the weekend

Stuart Presnell (May 20 2022 at 13:24):

As I understand it, the purpose of library_search is "find the lemma that I expect mathlib already knows but whose name I can't remember/work out". At least, that's very much how I use it and how I imagine a lot of people use it. I'm not proposing that we change that, just that we make it more capable of doing that job.

Damiano Testa (May 20 2022 at 13:32):

While I agree with you in spirit, observe that to convert h : m ≠ 0 to h : 0 < m you are using a second lemma from the library. Thus, either you add a lemma add_ne_zero_left to the library or what you want is a 2-step library search...

I am still in favour of having an option of tweaking library_search to allow looking a little deeper into the library, but I agree with the comment that this should be an optional behaviour, not the standard one.

Stuart Presnell (May 20 2022 at 13:33):

I appreciate the concern about this change making library_search slower, and if that turns out to be a problem then the modified version should be a variant rather than a replacement.

For reference: when you manually add a bunch of have := h.le premises, do you find that library_search becomes noticeably slower?

Ruben Van de Velde (May 20 2022 at 13:35):

In my personal experience, it's already so slow that I couldn't say :)

Damiano Testa (May 20 2022 at 13:36):

I personally found that the speed of library search is mostly affected by the number of imports that I have: depending on how much I am importing, I can either expect it to not timeout, or to timeout. If it is ever successful, I normally do not care how long it took, though! :wink:

Stuart Presnell (May 20 2022 at 13:39):

Damiano Testa said:

Thus, either you add a lemma add_ne_zero_left to the library or [...]

In fact it was just such a situation that prompted this suggestion! I'd proposed (in #14245) adding mul_lt_one to algebra/order/ring as a counterpart to docs#mul_le_one when I found that mathlib apparently didn't know that a < 1 → b < 1 → a*b < 1. But as @Yaël Dillies pointed out, it doesn't make sense "to write 2^n version of each lemma where n is the number of times appears in it".

Damiano Testa (May 20 2022 at 14:00):

Maybe I should point out that I agree with not expanding the library: my suggestion was a hypothetical one with which I disagree!

Stuart Presnell (May 20 2022 at 14:07):

Oh, I agree! It's impractical to explicitly list all the things we consider 'obvious'. That's why I want to broaden the beam of the searchlight instead, so that lemmas like add_pos_left and mul_le_one gain an implicit neighbourhood of equally obvious statements that can also be found.

Eric Wieser (May 20 2022 at 14:59):

I don't understand the argument for not having docs#mul_lt_one

Yaël Dillies (May 20 2022 at 16:15):

The argument is that we already have it. It's called docs#mul_lt_one_of_nonneg_of_lt_one_left

Michael Stoll (May 20 2022 at 19:05):

Can I add a complaint/suggestion regarding suggest to this discussion?
When I use suggest, I frequently observe that when it does not come up with some exact ... suggestion, the list of refines it presents me with rarely contains anything useful. Say, if the goal is some equality, it seems that I mostly get the same list of stuff that can be used to deduce any kind of equality, but nothing that is specific to the terms occurring on the left and right. I assume that this comes down to the way suggest orders its results. Would it be possible to change that in such a way that more specific proposals appear higher up in the list? I guess the question here is how one can measure the "specificity", which may not be so easy to do...

Mario Carneiro (May 20 2022 at 19:21):

For the people suggesting to make library_search two-step, consider that assuming no other guidance, finding a one-step proof is O(n) in the size of the library while finding a two-step proof is O(n^2). Now consider how slow library_search is already and how long you would be waiting for the two step version... Brute force search is slow.

Yaël Dillies (May 20 2022 at 19:21):

Michael, we have suggest ... using ...

Michael Stoll (May 20 2022 at 19:23):

Yaël Dillies said:

Michael, we have suggest ... using ...

In my experience, that doesn't necessarily help.

Mario Carneiro (May 20 2022 at 19:24):

It's not quite as bad as that as long as you can give a small enumerated list of glue lemmas like has_lt.lt.ne but I think you will have difficulty avoiding extending the list, so it won't stay small for long

Michael Stoll (May 20 2022 at 19:24):

Also, in some cases, there are no local hypotheses that would be useful.

Last updated: Aug 03 2023 at 10:10 UTC