Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: LeanUniverse


Jason Rute (Jan 11 2025 at 07:17):

Aram Markosyan (not on Zulip) posted a tool LeanUniverse on X: https://x.com/aramhmarkosyan/status/1877800019522470009?s=46

GitHub: https://github.com/facebookresearch/LeanUniverse

Jason Rute (Jan 11 2025 at 07:17):

It seems to be (yet another) tool for extracting Lean data. Note it is Creative Commons, so it can’t be used for commercial purposes.

Jason Rute (Jan 11 2025 at 07:17):

@Kaiyu Yang do you know what this is?

Jason Rute (Jan 11 2025 at 07:18):

(Also a pet peeve of mine is people posting this stuff on X and not here. They would be more likely to reach the right people here. Although, I’m not sure who the intended users are.)

Eric Wieser (Jan 11 2025 at 22:22):

Looking more closely, this is an aggregation tool built on top of LeanDojo

Kaiyu Yang (Jan 12 2025 at 02:11):

Yes, I know this, though I'm not a direct contributor to this project. From my understanding, to get a larger and more diverse Lean dataset for training, it crawls Lean repos from GitHub, use LeanDojo to extract data from them, and aggregate the resulting datasets.


Last updated: May 02 2025 at 03:31 UTC