Zulip Chat Archive

Stream: new members

Topic: Machine learning tooling

Stanislas Polu (Dec 02 2019 at 10:32):

Hi! Very exciting to watch the vibrant community around Lean.

I was wondering if any of you was aware of existing efforts around tracing and packaging the Lean prover for machine learning (along the lines of Holist for Hol Light). cc @Daniel Selsam given the IMO challenge?

I'd also be interested in learning more about the current size of the mathlib, in particular not sure if [0] is up to date?

[0] http://www.cs.ru.nl/~freek/100/

Rob Lewis (Dec 02 2019 at 10:51):

I can't tell you anything about machine learning. But yes, the 100 theorems list should be mostly up to date. There's some discussion about the contents and size of mathlib in our new paper: https://leanprover-community.github.io/papers/mathlib-paper.pdf

Stanislas Polu (Dec 02 2019 at 10:58):

Thanks @Rob Lewis ! This is very useful.

Stanislas Polu (Dec 02 2019 at 11:00):

Looks like Lean has been architectured in way that should be amenable to machine learning workloads. Is there an official documentation of the Lean binaries available somewhere beyond the "Theorem Proving with Lean" book?

Kevin Buzzard (Dec 02 2019 at 11:02):

Here's a recent summary of what is in mathlib currently: https://xenaproject.wordpress.com/what-maths-is-in-lean/

Rob Lewis (Dec 02 2019 at 11:08):

By binaries, do you mean the .olean binaries that are produced after processing .lean files? The best "documentation" there is probably @Mario Carneiro 's olean-rs tool. https://github.com/digama0/olean-rs I don't think there's much written text about the format.

Rob Lewis (Dec 02 2019 at 11:09):

If you mean the Lean system itself, there's no official documentation. Bits and pieces are explained in various papers.

Stanislas Polu (Dec 02 2019 at 11:12):

@Rob Lewis Thanks, that's what I figured :grimacing: Looks like the service used for interactive editing (the javascript demo as an example) is capable of extracting current open goals for any given code line. Is there a place I can look to better understand that interface?

Stanislas Polu (Dec 02 2019 at 11:13):

And... inspecting the olean format might be an interesting approach indeed. I will look into that as well.

Rob Lewis (Dec 02 2019 at 11:14):

I think we usually point people to the VSCode extension as an example of editor interaction. https://github.com/leanprover/vscode-lean There's also @Patrick Massot 's format_lean tool that extracts line-by-line info. https://github.com/leanprover-community/format_lean

Stanislas Polu (Dec 02 2019 at 11:20):

Excellent this is extremely useful again. THanks! I'll be curious to hear from @Daniel Selsam plans on that subject as well but will dig into these three repositories in the meantime :+1:

Wojciech Nawrocki (Dec 02 2019 at 12:07):

There is also (very experimental) documentation of all definitions in mathlib : https://leanprover-community.github.io/mathlib_docs/

Bryan Gin-ge Chen (Dec 02 2019 at 14:09):

The editor interaction relies primarily on Lean's "server mode", which communicates with Lean using JSON. You can run it using lean --server. I think the best "documentation" for it (in terms of a list of capabilities, etc.) is still this file in the lean-client-js library.

Stanislas Polu (Dec 02 2019 at 14:16):

Thanks @Bryan Gin-ge Chen :+1:

Daniel Selsam (Dec 02 2019 at 17:35):

@Stanislas Polu Welcome to Zulip. There are many ways one could integrate ML, e.g. using some combination of the tactic framework, the foreign function interface, and the language server. Lean4 is still under development though and we don't have official support for any such integration yet. Once Lean4 has the necessary pieces in place, we plan to (among other things) help Sarah and Christian integrate HOList. If you have different architectural desiderata, let us know and we will do our best to help. I plan to just use tactics + FFI for my own preliminary ML experiments, and I am not sure yet what architecture I will want down the road.

I'll also mention as an aside that I (personally, currently) do not expect supervised learning on existing libraries to be anything close to a silver bullet for the IMO. I am personally more excited about generative-modeling/synthetic-data/self-play.

Stanislas Polu (Dec 02 2019 at 19:29):

@Daniel Selsam I'd be interested in participating in any discussion on that subject and happy to help.

From looking at Lean3's interactive demo, I feel like most pieces to interactively compute open subgoals from partial proof trees seem to be in place? Would it make sense for me to mess with Lean3 or is Lean4 going to be completely different from that perspective? What do you think?

What do you refer to by "FFI" ?

Concerning your aside mention, completely agreed with your statement, though they all require fast and interactive access to the prover :)

Patrick Massot (Dec 02 2019 at 19:30):

FFI means foreign function interface.

Patrick Massot (Dec 02 2019 at 19:30):

https://en.wikipedia.org/wiki/Foreign_function_interface

Stanislas Polu (Dec 02 2019 at 19:33):

@Patrick Massot thanks :+1:

Last updated: May 02 2025 at 03:31 UTC