Zulip Chat Archive

Stream: general

Topic: Where are we with AI suggestions for Lean 4?


Kevin Buzzard (Aug 28 2023 at 09:17):

I've always thought that these "let the AI guess the next tactic" tools were a bit of a gimmick, and the last time I tried to get one working was back in the Lean 3 days. But I'm giving a lecture tomorrow where I'm showcasing Lean in the context of AI and I think that showing off this functionality would go down well. Does it exist for Lean 4? What is the cheapest way of getting something like this up and running? Is there a chance that I can be using this by tomorrow on my laptop given that I don't have any relevant API keys etc today?

Scott Morrison (Aug 28 2023 at 10:57):

https://github.com/wellecks/llmstep

Scott Morrison (Aug 28 2023 at 10:57):

Is the current winner in terms of ease of use.

Scott Morrison (Aug 28 2023 at 10:57):

No API keys needed, but there's a large download (~10 GB) before you can use it.

Scott Morrison (Aug 28 2023 at 10:58):

If you see @Sean Welleck, @Zhangir Azerbayev, or me online, ping us and we can walk you through installing it.

Scott Morrison (Aug 28 2023 at 11:00):

If you want to show off #formalize #3808, which often works quite nicely, I can loan you an API key. :-)

Kevin Buzzard (Aug 28 2023 at 11:28):

OK, I will experiment and report back! In my lecture today I spend a lot of time making fun of ChatGPT, so it will be interesting to see it in action with Lean :-)

Alex J. Best (Aug 28 2023 at 12:43):

I find Copilot not bad, tbh; it certainly wins for ease of installation and use (if you have an academic email you are approved very quickly).

Kevin Buzzard (Aug 28 2023 at 12:44):

OK so for llmstep I downloaded the 5GB model fine, but now my issue is (even after a reboot):

buzzard@buster:~/lean-projects/llmstep$ python3 python/server.py
Loading model...
Killed
buzzard@buster:~/lean-projects/llmstep$

and just before it gets killed it fills up memory:

Tasks: 282 total,   3 running, 279 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.6 us,  3.0 sy,  0.0 ni, 84.2 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15714.0 total,    151.3 free,  12932.2 used,   2630.5 buff/cache
MiB Swap:   2048.0 total,   1199.6 free,    848.3 used.   2414.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   8241 buzzard   20   0   19.3g  12.3g 192148 R 118.0  80.4   0:11.92 python3
    129 root      20   0       0      0      0 R   5.7   0.0   0:04.64 kswapd0
   2090 buzzard   20   0  489688  22636   7524 S   0.3   0.1   0:02.05 Xorg
   2237 buzzard   20   0 4685192  65732  22608 S   0.3   0.4   0:03.87 gnome-s+
      1 root      20   0  168804   4916   2112 S   0.0   0.0   0:01.03 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd

Is a 16 gig laptop no good for this?

Kevin Buzzard (Aug 28 2023 at 12:56):

Copilot: thanks! I have Copilot up and running, with nothing to pay for 30 days...

Scott Morrison (Aug 28 2023 at 12:57):

Yeah, it probably needs more RAM.
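
For anyone hitting the same out-of-memory kill: loading the weights in half precision roughly halves the footprint versus float32. A minimal sketch, assuming the Hugging Face transformers library; the model id is illustrative (an assumption, not llmstep's actual server code — check the repo for the real one):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model id -- an assumption; check the llmstep README for the real one.
MODEL = "wellecks/llmstep-mathlib4-pythia2.8b"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=torch.float16,   # ~2 bytes per parameter instead of 4 for float32
    low_cpu_mem_usage=True,      # stream weights in rather than building a second full copy
)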

Utensil Song (Aug 28 2023 at 13:03):

Maybe try using Colab to start the server, then use environment variables to tell llmstep to connect to it: https://github.com/wellecks/llmstep/issues/6

Colab might shut down the server if the notebook is idle, so it's better to do something with the notebook (e.g. editing a cell) every few minutes, or to add a cell like this before starting the server and click to play the silent audio:

%%html
<audio src="https://oobabooga.github.io/silence.m4a" controls />


Miguel Marco (Aug 28 2023 at 16:08):

Are there smaller versions of the model that could run with less RAM?

Utensil Song (Aug 28 2023 at 16:41):

There's a WIP PR to quantize the model and load it using less RAM: https://github.com/zhangir-azerbayev/llmstep/tree/ggml#local-inference-with-llamacpp

Scott Morrison (Aug 29 2023 at 01:43):

Note that this is quantizing a different model (trained by @Zhangir Azerbayev), not the original LLMStep model.

Utensil Song (Aug 29 2023 at 02:53):

Ah, I missed that.

Unfortunately, llama.cpp (and hence the quantized GGML/GGUF formats) doesn't support the Pythia series that LLMStep was fine-tuned on. But quantization via GPTQ, bitsandbytes, or AWQ should still be feasible and relatively simple to implement. Honestly, though, Pythia is a bit too old to enjoy the ecosystem support that other open LLMs get.
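
For instance, 8-bit loading via bitsandbytes needs no format conversion at all; a minimal sketch, assuming a CUDA GPU, the bitsandbytes and accelerate packages installed, and an illustrative model id:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "wellecks/llmstep-mathlib4-pythia2.8b",  # illustrative id -- an assumption, not necessarily this fine-tune
    load_in_8bit=True,   # bitsandbytes int8 weights: roughly 1 byte per parameter
    device_map="auto",   # requires accelerate; places layers on the available devices
)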

Filippo A. E. Nuccio (Aug 29 2023 at 13:08):

Has the talk been recorded?

Alex Kontorovich (Aug 29 2023 at 13:58):

Yes, I've been running Copilot and have had a lot of surprising success (especially if you're expecting utter failure, which is the case a lot of the time...). This includes the time it correctly guessed the statement of the next theorem I wanted to prove, and gave a correct proof! Saved me like 20 mins of work...

Matthew Ballard (Aug 29 2023 at 14:10):

It certainly seems to know more about different parsers than I do right now

Kaiyu Yang (Sep 09 2023 at 17:24):

@Kevin Buzzard Too late for your lecture, but please check this out: https://leanprover.zulipchat.com/#narrow/stream/219941-Machine-Learning-for-Theorem-Proving/topic/LeanInfer.3A.20Neural.20Network.20Inference.20in.20Lean.204!

Kevin Buzzard (Sep 09 2023 at 17:34):

There will be plenty more lectures :-)


Last updated: Dec 20 2023 at 11:08 UTC