Zulip Chat Archive

Stream: Lean Together 2025

Topic: Jason Rute: The last mile: How do we make AI theorem pr...

Jason Rute (Jan 15 2025 at 14:59):

A thread for discussion about my talk. :smile:

Rendya Yuschak (Jan 15 2025 at 16:01):

Hi! I have no question, but just want to say thank you for the talk. It was really great talk. The thing about the gap between users and AI tools designer really opened my mind.

I am pretty new in lean, and one of the problem that i had was figuring out what tactic to use. I am basically Alice in the user stories :sweat_smile:.

So, anyway, thanks again for the talk. I probably going to learn more first before trying to contribute or anything.

Notification Bot (Jan 16 2025 at 00:55):

Kevin Buzzard has marked this topic as resolved.

Notification Bot (Jan 16 2025 at 01:05):

Pietro Monticone has marked this topic as unresolved.

Pietro Monticone (Jan 16 2025 at 01:05):

:video_camera: Video recording: https://youtu.be/Yr8dzfVkeHg

Terence Tao (Jan 16 2025 at 04:32):

Great talk! The last mile bottlenecks are really important.

The idea of a sorry-filling server seems like the most promising approach in the near term. Perhaps Lean playground can come with a premium paid version which comes with all the existing AI theorem proving tools? :grinning_face_with_smiling_eyes:

Alex Kontorovich (Jan 17 2025 at 18:30):

Coming back to @Jason Rute's idea about making things as try-sorry and letting an AI crawl around trying to fill the sorry in the background: suppose I wanted to try to implement such a thing on a project (and had grant money, in case that was needed for API calls) -- how would I try to implement that in practice?

Adam Topaz (Jan 17 2025 at 18:56):

As far as I know there is no easy way to do this right now but it is something that @Johan Commelin and I are planning to try to implement soon ;) (and we even wanted such a thing in the later stages of LTE)

Kim Morrison (Jan 18 2025 at 04:57):

A first step is just indexing all the available sorrys out there, and making sure that there is a way for an AI to "wake up" at the state of a given sorry. I proposed a setup for doing this at https://leanprover.zulipchat.com/#narrow/channel/219941-Machine-Learning-for-Theorem-Proving/topic/Incentives.20.26.20sorry-filling.20leaderboard/near/494456196.

Jason Rute (Jan 19 2025 at 22:19):

After some discussion with folks here, I realized some of my AI-produced images of Lean users were conveying the wrong message, reinforcing a negative gender stereotype. It wasn't my intention and I've since edited the slides. You can find the new ones below. We also took the YouTube video down until we can edit it. AI is a powerful assistant, but it is also important to be the final editor, making sure your AI-generated content, whether an image or computer code, really has the impact you intend to make. In this case, it didn't.

Jason Rute (Jan 19 2025 at 22:19):

Slides:
The Last Mile - Lean Together - Jason Rute.pdf

Alex Kontorovich (Jan 19 2025 at 22:52):

Kim Morrison said:

A first step is just indexing all the available sorrys out there, and making sure that there is a way for an AI to "wake up" at the state of a given sorry. I proposed a setup for doing this at https://leanprover.zulipchat.com/#narrow/channel/219941-Machine-Learning-for-Theorem-Proving/topic/Incentives.20.26.20sorry-filling.20leaderboard/near/494456196.

What if it didn't have to index all possible sorrys but just the ones the user selects to be attempted, with something like a try-sorry call?...

Kim Morrison (Jan 20 2025 at 01:55):

The usual "epsilon squared" explanation for the difficult of getting things off the ground: try-sorry is only useful to users once there are agents that attempt them, and agents can only be trained once there are try-sorry. Hence we just need to piggyback on something that already exists in the wild.

Julian Berman (Jan 20 2025 at 02:10):

Isn't proof_wanted such an existing try-sorry?

Jason Rute (Jan 20 2025 at 02:12):

I come at this from the view of a data scientist and AI researcher. I think more data is better. In the future, when we have more data, or if we have too many sorries, one could be more selective.

Kim Morrison (Jan 20 2025 at 04:02):

Julian Berman said:

Isn't proof_wanted such an existing try-sorry?

Yes, a failed one. How many are out there right now? I think there are ... two!

Pietro Monticone (Jan 20 2025 at 07:01):

The link to the video above has been updated.

Bolton Bailey (Jan 20 2025 at 16:18):

Kim Morrison said:

Julian Berman said:

Isn't proof_wanted such an existing try-sorry?

Yes, a failed one. How many are out there right now? I think there are ... two!

Am I wrong or are there at least 10 in mathlib?

Martin Dvořák (Jan 20 2025 at 19:18):

Julian Berman said:

Isn't proof_wanted such an existing try-sorry?

Yes but try sorry could have other things depending on it (not in Mathlib itself but in almost any other project).

Kim Morrison (Jan 21 2025 at 02:34):

Bolton Bailey said:

Am I wrong or are there at least 10 in mathlib?

Thank you, you're correct, I'm not sure how I missed those.

Last updated: Dec 20 2025 at 21:32 UTC