Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: LeanUniverse
Jason Rute (Jan 11 2025 at 07:17):
Aram Markosyan (not on Zulip) posted a tool LeanUniverse on X: https://x.com/aramhmarkosyan/status/1877800019522470009?s=46
GitHub: https://github.com/facebookresearch/LeanUniverse
Jason Rute (Jan 11 2025 at 07:17):
It seems to be (yet another) tool for extracting Lean data. Note it is Creative Commons, so it can’t be used for commercial purposes.
Jason Rute (Jan 11 2025 at 07:17):
@Kaiyu Yang do you know what this is?
Jason Rute (Jan 11 2025 at 07:18):
(Also a pet peeve of mine is people posting this stuff on X and not here. They would be more likely to reach the right people here. Although, I’m not sure who the intended users are.)
Eric Wieser (Jan 11 2025 at 22:22):
Looking more closely, this is an aggregation tool built on top of LeanDojo
Kaiyu Yang (Jan 12 2025 at 02:11):
Yes, I know this, though I'm not a direct contributor to this project. From my understanding, to get a larger and more diverse Lean dataset for training, it crawls Lean repos from GitHub, use LeanDojo to extract data from them, and aggregate the resulting datasets.
Last updated: May 02 2025 at 03:31 UTC