Zulip Chat Archive

Stream: new members

Topic: mechanisms to search through mathlilb


Kevin Lacker (Oct 13 2020 at 17:39):

How frequently do more experienced Leaners use library_search or suggest? Personally, I find myself using library_search a lot, and not so much suggest. Just in the general situation of, hmmm this seems like it could be a lemma in mathlib but I don't know if it is exactly. I'm wondering if it were possible to speed up or improve the depth of library_search if many people would use that. So I'm curious if:

  1. Most people just remember all the relevant lemmas
  2. People use library_search a lot
  3. People usually just look through the documentation for relevant lemmas
  4. People use some other similar tactic that I'm not familiar with

Bryan Gin-ge Chen (Oct 13 2020 at 17:44):

Being familiar with the naming conventions helps a lot with being able to guess enough of a name to get the Lean server's auto-complete to do the rest.

Adam Topaz (Oct 13 2020 at 17:53):

I use library_search mostly in the middle of tactic proofs for little things which I know should (or at least ought to) be in mathlib.

Adam Topaz (Oct 13 2020 at 17:56):

I don't think I've ever actually used suggest, but I know it's there.

Shing Tak Lam (Oct 13 2020 at 18:17):

I rarely use library_search or suggest, since finding something by name or by guessing which file it should be in is often faster for me. The Ctrl-T search gets me what i want most of the time, and I use VSCode search otherwise. Sometimes I use the docs when I want to get an overview of a file.

Using library_search to find instance definitions can be a bit slow, so I just use by show_term {apply_instance} when I want to find the definition of an instance.

Yakov Pechersky (Oct 13 2020 at 18:25):

Apart from regular library_search usage, I sometimes use suggest to see what related constructors might be present in the imported namespace

Kevin Lacker (Oct 13 2020 at 18:27):

ok, this is very enlightening for me as I was not considering vscode autocomplete as a logical tool for comparison, but that makes total sense now that you bring it up

Kevin Lacker (Oct 13 2020 at 18:29):

my takeaway is that library_search has some utility, but the latency is not competitive enough with vscode language features to supplant them

Yakov Pechersky (Oct 13 2020 at 18:36):

An additional search mechanism I use (in VSCode) is use suggest or library_search to prove something related or simpler, then F12 to go to definition in the file of that newly found lemma, and read the lemmas and proofs to get a sense of what the API is. Which is more involved than just a doc search, but a lot more informative for me.

Alex J. Best (Oct 13 2020 at 19:16):

I use library search a lot, even in situations where I'm reasonably sure I can work out the name / order of arguments, or even in place of assumption quite often. If you dont have too many imports it can be quicker in my experience just to do that / less mental energy. suggest I use when I want to find a possible rw as well as things to apply, as the rewrite-able lemmas come up as blah_iff_blah.mp.

Kevin Lacker (Oct 13 2020 at 22:11):

Does anybody use the library_search "hole command"? I have not been able to get it to work, but I only found out about its existence from looking at the code, so I'm not sure if it's in general use

Bryan Gin-ge Chen (Oct 13 2020 at 22:13):

Hole commands don't work in tactic mode, unfortunately.

Bryan Gin-ge Chen (Oct 13 2020 at 22:14):

So you'd have to do something like example : true := {! !} and then try to trigger the whole command there.

Kevin Lacker (Oct 13 2020 at 22:16):

I got that far, but I haven't been able to get it to work outside tactic mode either. the library_search hole command at least seems to spit out some extra characters

Kevin Lacker (Oct 13 2020 at 22:16):

but, it seems like it isn't generally what people are using anyway

Bryan Gin-ge Chen (Oct 13 2020 at 22:20):

Oh, you're right. There's a bug with the library_search hole command right now.

import tactic
example : true := {!!}
-- triggering the hole command leads to:
-- example : true := is: exact trivial

Kevin Lacker (Oct 13 2020 at 22:22):

yeah i was just looking into this - i bet the string chopping here in suggest.lean is the problem:

@[hole_command] meta def library_search_hole_cmd : hole_command :=
{ name := "library_search",
  descr := "Use `library_search` to complete the goal.",
  action := λ _, do
    script  library_search,
    -- Is there a better API for dropping the 'exact ' prefix on this string?
    return [((script.mk_iterator.remove 6).to_string, "by library_search")] }

Kevin Lacker (Oct 13 2020 at 22:22):

do you know if lean tactic language has a way to split strings on a string match? or just like, typical string manipulation library stuff

Bryan Gin-ge Chen (Oct 13 2020 at 22:23):

Yep, I was about to make a PR.

Kevin Lacker (Oct 13 2020 at 22:23):

ok cool go for it

Bryan Gin-ge Chen (Oct 13 2020 at 22:23):

Not off the top of my head, I was just going to change 6 to 16.

Kevin Lacker (Oct 13 2020 at 22:23):

lol

Kevin Lacker (Oct 13 2020 at 22:24):

everyone has all of these aesthetic principles when it comes to the mathematical proofs, but when it comes to splitting a string on a word, bah let's hard code in the length of the string prefix :grinning:

Bryan Gin-ge Chen (Oct 13 2020 at 22:26):

Hah. Well, I'm now digging around in the string API at the moment so there you go.

Alex J. Best (Oct 13 2020 at 22:26):

docs#string.split ?

Alex J. Best (Oct 13 2020 at 22:26):

Splits on a character/character list it seems

Alex J. Best (Oct 13 2020 at 22:28):

#eval "exact library_search".split (= ' ')

Eric Wieser (Oct 13 2020 at 22:49):

I started working on splitting strings and decided it wasn't worth my time - but that's why there are now lemmas relating docs#list.inits and docs#list.tails.

Bryan Gin-ge Chen (Oct 13 2020 at 23:05):

I ended up going with a list function instead. See #4609.

import data.list.defs

def script := "Try this: exact trivial"
def s := script.to_list
#eval ((s.get_rest "Try this: exact ".to_list).get_or_else s).as_string

Eric Wieser (Oct 13 2020 at 23:58):

Didn't see docs#list.get_rest, that's not so bad

Bryan Gin-ge Chen (Oct 14 2020 at 16:57):

Heh, I missed docs#string.get_rest ... :face_palm:

import data.list.defs

def script := "Try this: exact trivial"
#eval (script.get_rest "Try this: exact ".to_list).get_or_else script

Pedro Minicz (Oct 14 2020 at 18:46):

I use suggest a lot, as it is faster than library_search and doesn't fail if it can't close the goal. There is also a tactic named hint if I am not mistaken, but I seldom use it.

Scott Morrison (Oct 14 2020 at 22:06):

hint is completely different: it just tries a list of popular tactics and tells you if any close the goal, and if none do it tells you which ones "make progress".

Scott Morrison (Oct 14 2020 at 22:07):

library_search could likely be made a lot faster with some attention from a good meta-programmer.

Scott Morrison (Oct 14 2020 at 22:07):

The basic balance is between just relying on the fact that Leo made apply (i.e. unification) really fast, and so just try all the lemmas, vs doing some amount of analysis of the current goal and only applying a subset of the lemmas.

Scott Morrison (Oct 14 2020 at 22:08):

Currently we look at the head symbol of the goal, and filter the lemmas by that (after a bit of munging, to handle things like > vs <).

Scott Morrison (Oct 14 2020 at 22:08):

Perhaps there is scope for more caching here?

Scott Morrison (Oct 14 2020 at 22:10):

library_search tends to be extra slow with = goals, because unsurprisingly there are a great many lemmas that talk about =, and so library_search tries them all with apply, even if the types are wrong. Now unification is smart and so most of these fail very fast --- perhaps so fast that any amount of filtering in meta code is actually going to be a slow down. But it's not completely clear to me; particularly if someone can think up a good way to cache tables of lemmas.

Kevin Lacker (Oct 14 2020 at 22:25):

yeah, I was looking through it to profile it and see which parts are slow. I cannot currently claim to be "a good meta-programmer" though ;-) it seems like more time is being spent in solve_by_elim than in apply but they are both significant and it probably depends a lot on context.

Kevin Lacker (Oct 14 2020 at 22:29):

is apply fast because it's basically a wrapper around apply_core which is in c++? I think someone mentioned that lean 4 might allow c++ extensions, which might be nice for something like a tree search

Kevin Lacker (Oct 14 2020 at 22:31):

when a library_search is doing a million object allocations, it kind of seems like the most obvious performance thing is to push more of the logic into c++.

Floris van Doorn (Oct 14 2020 at 22:52):

Nowadays, the most common way that I search through mathlib is using the Search function of mathlib docs. It just searches through the names of theorems, so you need to know the naming convention a bit for it to use it well, but its fuzzy matching is really good. For example if I search
https://leanprover-community.github.io/mathlib_docs/find/continuous_implies_measurable
I get docs#continuous.measurable as third result. A common strategy is that if I want a theorem that involves both foo and bar I just search for foo_bar using this serach function. For example https://leanprover-community.github.io/mathlib_docs/find/tendsto_comp gives docs#filter.tendsto.comp (after some other suggestions).
In Chrome you can add http://leanprover-community.github.io/mathlib_docs/find/%s as search engine, with a keyword. In my case I just have to type ml tendsto_comp in the search bar to do the above search.

The other way I search is the regular search (ctrl+F) or search through all files (ctrl+shift+F) in VSCode.
I don't know how well these strategies translate to tips to beginners/intermediate users, because they probably work less well if you don't know the naming scheme well.

Pedro Minicz (Oct 14 2020 at 23:56):

Kevin Lacker said:

when a library_search is doing a million object allocations, it kind of seems like the most obvious performance thing is to push more of the logic into c++.

I haven't written C++ in years and don't plan to break my streak, but I would love a fastar library_search!

Scott Morrison (Oct 15 2020 at 02:28):

I'd be curious to see examples where solve_by_elim is taking the majority of the time. This means that many lemmas are successfully unifying with the goal, and so there are then many calls to solve_by_elim to "clean up the mess", i.e. enforce that all new goals can be solved from existing hypotheses.

Scott Morrison (Oct 15 2020 at 02:28):

It may be there are good heuristics for avoiding calling solve_by_elim (or just using assumption)

Scott Morrison (Oct 15 2020 at 02:28):

I had always assumed it was being called infrequently enough this wasn't worth optimising.

Scott Morrison (Oct 15 2020 at 02:33):

Also, if we had easier support for parallelism, library_search is embarrassingly parallelizable.

Kevin Lacker (Oct 15 2020 at 16:27):

here's an example i was profiling:

import data.nat.basic

set_option profiler true
-- set_option trace.suggest true

lemma foo {j k : } (h1 : 0 < j) (h2 : 0 < k) (h3 : j  k) : k * j  2 :=
by library_search

Kevin Lacker (Oct 15 2020 at 16:29):

47% of the time is in solve_by_elim, 36% spent in the apply that's called directly from suggest.lean

Kevin Lacker (Oct 15 2020 at 16:29):

it's a bit tricky to measure because solve_by_elim itself calls apply - do you know if the profiler is capable of outputting tree-style profiles? - but i mucked around renaming things to separate them out

Kevin Lacker (Oct 15 2020 at 16:30):

parallelism is a good point - are there any mechanisms for spawning a thread or similar things in lean?

Kevin Lacker (Oct 16 2020 at 04:49):

I guess io.cmd sort of lets you do parallelism, but it seems kinda terrible to do that


Last updated: Dec 20 2023 at 11:08 UTC