Zulip Chat Archive

Stream: general

Topic: Discussion for LeanSearchClient


Siddhartha Gadgil (Sep 18 2024 at 03:29):

This thread is for discussion of LeanSearchClient. A lot of discussion before announcement happened on the thread: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/LeanSearch/near/471011469

Siddhartha Gadgil (Sep 18 2024 at 03:30):

(deleted)

Jireh Loreaux (Sep 18 2024 at 03:42):

@Siddhartha Gadgil this is fantastic! Thanks!

Vincent Beffara (Sep 18 2024 at 08:21):

This is great! Is there any plan to give it access to the current proof state? It would likely be extremely useful to be able to ask the engines for all the theorems matching the current goal (kind of like apply?).

Kevin Buzzard (Sep 18 2024 at 08:22):

What am I doing wrong?

import Mathlib

def foo := 37

#loogle (_ ^ _) ^ _

/-
unexpected token '#'; expected search_cmd
-/

Sorry for being such a noob. This is on c5befd42d5d9755b17c292f81a721e72612b54e4 (most recent mathlib commit).

Riccardo Brasca (Sep 18 2024 at 08:23):

Maybe this?

Due to some technical issues, for now you need the additional import import LeanSearchClient.LoogleSyntax to use #loogle.

Siddhartha Gadgil (Sep 18 2024 at 08:26):

Vincent Beffara said:

This is great! Is there any plan to give it access to the current proof state? It would likely be extremely useful to be able to ask the engines for all the theorems matching the current goal (kind of like apply?).

It already does that. It tries apply, have and rw (in both directions) and only suggests tactics that work. There may be false positives and negatives due to bugs though (please report here are as an issue).

Siddhartha Gadgil (Sep 18 2024 at 08:28):

To clarify the above, it tries only the (by default 6) theorems returned by the search server. This is configurable with the options leansearch.queries, moogle.queries and loogle.queries.

Vincent Beffara (Sep 18 2024 at 08:30):

I mean, can I already say #loogle _ (or whatever syntax) and it passes the current goal to loogle as a query?

Siddhartha Gadgil (Sep 18 2024 at 08:31):

Oops sorry.

Siddhartha Gadgil (Sep 18 2024 at 08:32):

It filters by applicability of tactic state. It does not build a query.

Siddhartha Gadgil (Sep 18 2024 at 08:33):

What is the correct query is a bit subtle but if this can be clarified I can add that.

Siddhartha Gadgil (Sep 18 2024 at 08:53):

Thinking further, @Vincent Beffara I think that is a brilliant idea. I do not know how to convert a goal state into a loogle query (or queries depending on choices). If someone can describe that I can try to implement.

@Joachim Breitner Is there a nice way to go from goal state to loogle query?

Kim Morrison (Sep 18 2024 at 09:10):

I'm not sure there is something to do here that is better than exact?

Kim Morrison (Sep 18 2024 at 09:10):

You can look at the goal, have some heuristics for "the important constants", and issue a loogle query for those, but ...

Kim Morrison (Sep 18 2024 at 09:10):

This is premise selection, and not really a great fit for what loogle does.

Siddhartha Gadgil (Sep 18 2024 at 09:12):

The only difference I see is we can search all of Mathlib without importing it.

Joachim Breitner (Sep 18 2024 at 09:27):

You can look at the goal, have some heuristics for "the important constants", and issue a loogle query for those, but ...

Yeah, that’s the most plausible I can think of as well.

Joachim Breitner (Sep 18 2024 at 09:28):

Riccardo Brasca said:

Maybe this?

Due to some technical issues, for now you need the additional import import LeanSearchClient.LoogleSyntax to use #loogle.

Could #loogle warn if the import is missing, like for example (I think) run_meta does?

Kevin Buzzard (Sep 18 2024 at 09:32):

Siddhartha Gadgil said:

The only difference I see is we can search all of Mathlib without importing it.

I can count three times in the last couple of months where I've asked a question here about a missing result when exact? failed and it turned out that I just hadn't imported enough. I've currently trained myself to import Mathlib before asking...

Siddhartha Gadgil (Sep 18 2024 at 09:38):

Joachim Breitner said:

Riccardo Brasca said:

Maybe this?

Due to some technical issues, for now you need the additional import import LeanSearchClient.LoogleSyntax to use #loogle.

Could #loogle warn if the import is missing, like for example (I think) run_meta does?

I will first try to fix the technical issue. There is a clean solution suggested by @Joachim Breitner which I should test - it gets slightly complicated to test without risking breaking CI again.

Siddhartha Gadgil (Sep 18 2024 at 09:38):

Oops, I did not realize to whom I was replying - sorry for quoting you to yourself.

Joachim Breitner (Sep 18 2024 at 09:44):

Ah, that’s before the no_keyword

Siddhartha Gadgil said:

it gets slightly complicated to test without risking breaking CI again.

You can bump LeanSearchClient to a branch in mathlib’s lakefile in a mathlib PR test CI safely, can you?

Siddhartha Gadgil (Sep 18 2024 at 09:45):

I also do not want to push to the main of LeanSearchClient. I can push to a branch and on a branch of mathlib change dependencies.

Joachim Breitner (Sep 18 2024 at 09:46):

Right, that’s what I mean :-)

Siddhartha Gadgil (Sep 18 2024 at 09:47):

But I am getting a strange error parser 'nonReservedSymbol' was not found with your code even though
#check Lean.Parser.nonReservedSymbol works.

Is there something I should open other than Lean.Parser?

Siddhartha Gadgil (Sep 18 2024 at 10:10):

Never mind. Adding auxiliary definitions worked.

Jason Rute (Sep 18 2024 at 10:19):

Great job! Now we also just need to add these tools to the search in https://leanprover-community.github.io/mathlib4_docs/index.html.

Siddhartha Gadgil (Sep 18 2024 at 10:22):

Jason Rute said:

Great job! Now we also just need to add these tools to the search in https://leanprover-community.github.io/mathlib4_docs/index.html.

Thanks. It already works in a sense, but I should add docstrings.

Siddhartha Gadgil (Sep 18 2024 at 10:23):

I have made a PR to test if loogle syntax can be safely used. If CI is fine I can add to main.

Jason Rute (Sep 18 2024 at 10:27):

What do you mean it “already works in a sense”? You mean you are in the middle of implementing it? Or are we talking about different things. I’m talking about something similar to “search with Google” but using one of the other three tools.

Siddhartha Gadgil (Sep 18 2024 at 10:28):

I just meant searching for the command, not searching with the command.

Siddhartha Gadgil (Sep 18 2024 at 10:29):

Implementing this is an entirely different task. It is adding javascript to the website generation, maybe to docgen?

Siddhartha Gadgil (Sep 18 2024 at 10:30):

I agree that it would be very good.

Vincent Beffara (Sep 18 2024 at 11:10):

Kim Morrison said:

You can look at the goal, have some heuristics for "the important constants", and issue a loogle query for those, but ...

Yes, something like peel away quantifiers, truncate the parse tree and look for theorems that approximately fit. That would mean dropping the applycability filter for that mode though. And then reply something like "you would be able to apply this theorem, if only that subterm looked like such instead".

Vincent Beffara (Sep 18 2024 at 11:11):

But it's already a huge gain to be able to query from within vscode!

Siddhartha Gadgil (Sep 18 2024 at 11:13):

Vincent Beffara said:

Kim Morrison said:

You can look at the goal, have some heuristics for "the important constants", and issue a loogle query for those, but ...

Yes, something like peel away quantifiers, truncate the parse tree and look for theorems that approximately fit. That would mean dropping the applycability filter for that mode though. And then reply something like "you would be able to apply this theorem, if only that subterm looked like such instead".

I would agree with @Kim Morrison that this is a fundamentally different AI task: premise selection, and it would be best to have an AI tool for this (MagnusHammer for Lean). Then we can query that tool with similar syntax.

Siddhartha Gadgil (Sep 18 2024 at 15:43):

It seems auto-update has broken again, though I tested the syntax with nonReservedSymbol on a branch and CI passed there. It may be a different dependency causing the problem (the line with the error has no syntax similar to that of LeanSearchClient).

Asei Inoue (Sep 18 2024 at 22:16):

@Siddhartha Gadgil very nice!! Thank you.

but why #loogle command outputs is multiple info? I think first info info: Option ?a → ?a, "get!" is not needed.

import LeanSearchClient.LoogleSyntax

/--
info: Moogle Search Results
• #check Nat.Primes.infinite
• #check Nat.exists_infinite_primes
• #check Nat.exists_infinite_pseudoprimes
• #check Nat.infinite_setOf_prime
• #check Nat.infinite_setOf_prime_modEq_one
• #check Nat.infinite_setOf_pseudoprimes
-/
#guard_msgs in
  #moogle "primes exists infinitely."

/--
info: LeanSearch Search Results
• #check Nat.exists_infinite_primes
• #check Nat.Primes.infinite
• #check Nat.infinite_setOf_prime
• #check Nat.frequently_odd
• #check Nat.exists_prime_gt_modEq_one
• #check Nat.frequently_even
-/
#guard_msgs in
  #leansearch "primes exists infinitely."

/--
info: Option ?a → ?a, "get!"
---
info: Loogle Search Results
• #check Option.get!
-/
#guard_msgs in
  #loogle Option ?a  ?a, "get!"

Asei Inoue (Sep 18 2024 at 22:20):

(deleted)

Siddhartha Gadgil (Sep 19 2024 at 00:33):

@Asei Inoue Thanks. That was info that was there for debugging which I forgot to remove. Fixed this (though it will take a while to percolate to users).

Siddhartha Gadgil (Sep 19 2024 at 14:11):

The fix has made its way to Mathlib, and now the extra import is not needed for loogle.
@Kevin Buzzard @Joachim Breitner

Asei Inoue (Sep 19 2024 at 15:00):

@Siddhartha Gadgil

I have found an unexpected behavior. This raise no error.

import LeanSearchClient.LoogleSyntax

/--
info: Loogle Search Results
• #check Option.get!
-/
#guard_msgs in
  #loogle Option ?a  ?a, "get!"

But this fails.

import LeanSearchClient.LoogleSyntax

/--
info: Loogle Search Results
• #check Option.get!
-/
#guard_msgs in
  #loogle Option ?a  ?a, "get!"

/- hello -/

Siddhartha Gadgil (Sep 19 2024 at 15:06):

Can you make an issue? I am travelling so can't look at it till Saturday

Asei Inoue (Sep 19 2024 at 15:08):

@Siddhartha Gadgil Where to make an issue? mathlib4 repo?

Siddhartha Gadgil (Sep 19 2024 at 15:09):

Sorry I was not clear. No, the LeanSearchClient repo linked in the announcement

Asei Inoue (Sep 20 2024 at 03:42):

done https://github.com/leanprover-community/LeanSearchClient/issues/8

Jarod Alper (Oct 01 2024 at 07:58):

This is a great tool! However, I was unable to get any of the three search tools working without some frustration. After following the instructions of running lake update mathlib' and adding import Mathlib.Tactic', I was getting the error unexpected token '#'. It also doesn't seem to work in the Lean4 playground.

However, after running `lake update', it did start working for some reason. Just posting this here in case anyone finds it useful.

Siddhartha Gadgil (Oct 01 2024 at 08:02):

Thanks. This is not part of mathlib but is a dependency of mathlib. My guess is that lake update mathlib only updated mathlib, not dependencies, so lake update is needed.

Maybe the same can be done for the lean playground @Jon Eugster

Joachim Breitner (Oct 01 2024 at 08:24):

The playground runs in a sandbox, so it doesn’t have network access. But indeed the tool should ideally handle network errors more gracefully, and not say

PANIC at Option.get! Init.Data.Option.BasicAux:16:14: value is none
backtrace:
PANIC at Option.get! Init.Data.Option.BasicAux:16:14: value is none
backtrace:

or

Could not parse JSON from ; error: offset 0: unexpected end of input

Siddhartha Gadgil (Oct 01 2024 at 08:25):

Joachim Breitner said:

The playground runs in a sandbox, so it doesn’t have network access. But indeed the tool should ideally handle network errors more gracefully, and not say

PANIC at Option.get! Init.Data.Option.BasicAux:16:14: value is none
backtrace:
PANIC at Option.get! Init.Data.Option.BasicAux:16:14: value is none
backtrace:

or

Could not parse JSON from ; error: offset 0: unexpected end of input

Agreed. Will fix this.

Siddhartha Gadgil (Oct 01 2024 at 08:35):

@Joachim Breitner @Jarod Alper I see two options UI wise:

  • log a warning and have an empty set of suggestions.
  • throw an error but with a useful message.

Which do you think a user will prefer.

Joachim Breitner (Oct 01 2024 at 08:45):

If there is an error, throw an error :-)

Siddhartha Gadgil (Oct 01 2024 at 08:51):

I have modified to give informative errors.

Jon Eugster (Oct 01 2024 at 12:30):

Joachim Breitner said:

The playground runs in a sandbox, so it doesn’t have network access.

:point_up:

Siddhartha Gadgil said:

My guess is that lake update mathlib only updated mathlib, not dependencies, so lake update is needed.

Maybe the same can be done for the lean playground Jon Eugster

Here is how the playground updates:
https://github.com/leanprover-community/lean4web/blob/main/Projects/mathlib-demo/build.sh

concretely:

curl -L https://raw.githubusercontent.com/leanprover-community/mathlib4/master/lean-toolchain -o lean-toolchain
lake update -R
lake build

Jon Eugster (Oct 01 2024 at 12:30):

Do I understand it correctly that this means there is no action needed from my side?

Joachim Breitner (Oct 01 2024 at 12:30):

I think so

Siddhartha Gadgil (Oct 01 2024 at 12:31):

So do I. These services cannot run in a sandbox

Siddhartha Gadgil (Oct 01 2024 at 12:31):

If it is worth it I suppose limited web access can be allowed

Joachim Breitner (Oct 01 2024 at 12:45):

It’s a possibility, although “no network access” is certianly easier to keep secure than “some network access”

Jireh Loreaux (Oct 01 2024 at 13:25):

If people have a browser open already, I don't think it's too crazy that they need to switch to another tab, especially if the cost of the alternative is reduced security.

Eric Wieser (Oct 01 2024 at 23:57):

Presumably an alternative here would be for the commands to communicate with a widget, and the widget to run the web requests client-side?

Jesse Michael Han (Oct 03 2024 at 00:47):

this is really impressive @Siddhartha Gadgil. earlier today i learned that ~150K searches have been issued to Moogle since we launched it nearly a year ago. many more improvements are coming soon!


Last updated: May 02 2025 at 03:31 UTC