Zulip Chat Archive

Stream: general

Topic: Introducing Lean Finder – A Smarter Search for Lean


Wuyang (Aug 11 2025 at 20:24):

Dear Lean Users,

We DeLTA Lab (at Simon Fraser University) are building foundational tools to support mathematicians working on formal mathematics.

We are excited to introduce Lean Finder, a semantic search engine for Lean that understands latent intents behind your queries.

You can try Lean Finder in two ways:

  1. Web Service: With beautifully rendered LaTeX and code: https://leanfinder.github.io/
  2. Zulip Chat: You can simply ping our chatbot: @leanfinder + your inquiry.

Lean Finder supports four types of input:

  • Informalized statement – A full or partial natural-language description of the statement you’re looking for.
  • User question – Ask in your own words, e.g., “Is there code for …” · “Why does this give an error …” · “How to define or derive instances …” · “How to finish or shorten proofs …” · “How to apply this lemma or theorem …”
  • Proof state – Paste your current state directly from VS Code.
  • Statement definition – Provide any fragment or complete definition.

Finally, our web service includes feedback tools. Please vote or upvote retrievals you find useful. Your feedback directly helps us improve Lean Finder, making it an even more powerful resource for the Lean and Zulip communities.

Thank you!

Sincerely,
Wuyang

Wuyang (Aug 11 2025 at 20:26):

Some examples of use cases:

@leanfinder Let L/K be a field extension and let x, y ∈ L be algebraic elements over K with the same minimal polynomial. Then the K-algebra isomorphism algEquiv between the simple field extensions K(x) and K(y) maps the generator x of K(x) to the generator y of K(y); i.e. algEquiv(x) = y.

leanfinder (Aug 11 2025 at 20:26):

Here’s what I found:

Wuyang (Aug 11 2025 at 20:27):

@leanfinder K : Type u_1
E : Type u_2
inst✝ : RCLike K
z✝ z : K
⊢ |re z| ≤ ‖z‖
Transform the goal from proving that the absolute value of the real part of a complex number is less than or equal to its norm, to proving that the square of the real part is less than or equal to the squared norm of the number.

leanfinder (Aug 11 2025 at 20:27):

Here’s what I found:

Matthew Ballard (Aug 11 2025 at 20:31):

@Wuyang Don't cross post

Wuyang (Aug 11 2025 at 20:32):

Matthew Ballard said:

Wuyang Don't cross post

No more! I just posted in general and new members. Hope that could be fine.

Matthew Ballard (Aug 11 2025 at 20:32):

I have removed your post in #new members. Don't cross post.

Eric Wieser (Aug 11 2025 at 22:10):

Some initial feedback; this output is way too verbose to be suitable for zulip; I suggest following the loogle bot and giving a one-line summary and a link to the webpage.

Eric Wieser (Aug 11 2025 at 22:11):

Showing the proof for each theorem is also very strange, as this is not relevant at all when trying to search for a theorem to apply.

Wuyang (Aug 12 2025 at 19:01):

@Eric Wieser Thank you very much for this valuable feedback! It took me some time to integrate this change.
I give an example below. Do you like this format?

Wuyang (Aug 12 2025 at 19:01):

@leanfinder Let L/K be a field extension and let x, y ∈ L be algebraic elements over K with the same minimal polynomial. Then the K-algebra isomorphism algEquiv between the simple field extensions K(x) and K(y) maps the generator x of K(x) to the generator y of K(y); i.e. algEquiv(x) = y.

leanfinder (Aug 12 2025 at 19:01):

Here’s what I found:

minpoly.algEquiv_apply (score: 0.868)

algEquiv_apply (score: 0.865)

minpoly.algEquiv (score: 0.844)

More Retrievals...

Wuyang (Aug 12 2025 at 19:04):

@Eric Wieser One thing to note here is that we support more informative docstrings than mathlib. You may have noticed that many statements in mathlib have very short, or even no docstrings. This may confuse new members. Lean Finder will provide more detailed descriptions for all statements, which I have now put into the spoiler.

Kenny Lau (Aug 12 2025 at 19:45):

it feels like your queries are LLM-generated, is there a specific algorithm you used? is this available in Leanfinder? Leansearch also features an LLM feature called "augmentation"

Wuyang (Aug 12 2025 at 19:53):

@Kenny Lau Thanks!
You don't need to use LLM to augment your query.

Quantitatively, Lean Finder achieves higher retrieval rates on mathlib4 than Lean Search, even with their "augmentation" feature:

  • LeanSearch: 0.487
  • Lean Finder: 0.620

Below I am using another query, which came from a vote submitted by an active user.
@leanfinder Theorem to show that if n is a positive integer, then 1/n is positive

leanfinder (Aug 12 2025 at 19:53):

Here’s what I found:

lemma (score: 0.700)

succ_ge_zero(k (score: 0.699)

Nat.cast_pos (score: 0.699)

More Retrievals...

Jireh Loreaux (Aug 12 2025 at 20:11):

@Wuyang note that there is something screwy in the links (the first one is called "lemma" and links to docs#pow_pos, the second one is clearly misnamed too, and even more strangely, it links to a file in a completely different repository: https://github.com/optsuite/optlib/blob/26fcefb31b5aee7f7f5fc7194f462b437282675f/Optlib/Convex/ImageSubgradientClosed.lean, which contains a lemma succ_ge_zero)

Wuyang (Aug 12 2025 at 20:18):

@Jireh Loreaux Thanks!

  1. The first one seems have some weird rendering issue, which I will dig more
  2. Another feature of Lean Finder is that, it supports statements beyond mathlib. We can retrieve from more 3rd party contributions:

Wuyang (Aug 12 2025 at 20:18):

(deleted)

Wuyang (Aug 12 2025 at 20:19):

@leanfinder Theorem to show that if n is a positive integer, then 1/n is positive

leanfinder (Aug 12 2025 at 20:19):

Here’s what I found:

Nat.one_div_pos_of_nat (score: 0.762)

Nat.one_div_cast_pos (score: 0.762)

PNat.pos (score: 0.757)

More Retrievals...

Wuyang (Aug 12 2025 at 20:19):

@Jireh Loreaux now the rendering works :point_up:

Jireh Loreaux (Aug 12 2025 at 20:20):

Why are the scores different on this run versus the last one?

Wuyang (Aug 12 2025 at 20:21):

@Jireh Loreaux I think we have to call Lean Finder in separate messages instead of being mixed with other irrelevant text


Last updated: Dec 20 2025 at 21:32 UTC