Zulip Chat Archive

Stream: new members

Topic: Newcomer where-to-go


Anotherpieceofmyhead (Dec 05 2025 at 23:43):

Hello and pleased to meet you! Im new to the community, sorry if im already doing something wrong.
Im aiming to learn lean while contributing, if this is possible/positively accepted?
So basically im interested:
1) If I can choose some topics I like and want to dig into
2) If I can do some small lemmas and contribute them to the library
Is there a person I ask for where to go or a topic may be?

Shreyas Srinivas (Dec 06 2025 at 00:24):

This is the wrong channel. This post should ideally go to the #new members channel

Shreyas Srinivas (Dec 06 2025 at 00:25):

But in order to avoid duplication I hope one of the moderators will move this topic to that channel

Shreyas Srinivas (Dec 06 2025 at 00:25):

In the meantime, it is helpful to know what areas of math you are fluent in

Notification Bot (Dec 06 2025 at 00:26):

This topic was moved here from #mathlib4 > Newcomer where-to-go by Kevin Buzzard.

Anotherpieceofmyhead (Dec 06 2025 at 16:23):

oopsie, thanks for moving the topic to the right channel!
Im pretty confident in number theory (at least everything for understanding how RSA work), abstract algebra (rereading Aluffis Chptr 0 now), set theory and a bit of topology, open to learn new things (looking forward for categories/algebraic topology/homology stuff)

Snir Broshi (Dec 06 2025 at 16:53):

In my experience, the way to contribute is to pick a specific low/medium-sized thing to formalize, e.g. pick a theorem from https://topology.pi-base.org/theorems or a mathoverflow undergrad answer, and on the way to the result you'll both learn Lean and become more comfortable with the relevant area of Mathlib, and you'll find many small lemmas that are missing.
For an extreme example, formalizing IMO geometry problems reveals a ton of missing geometry lemmas; look at the huge list of dependencies in #32295

Snir Broshi (Dec 06 2025 at 17:02):

Since you mentioned number theory, here's a list of classes of numbers that I think are missing from Mathlib:

It's half number theory and half recreational math, but I think Mathlib could benefit from having more of these.
Most of them should go in Mathlib/NumberTheory/FactorisationProperties.lean.

Anotherpieceofmyhead (Dec 07 2025 at 14:40):

So, to sum it up:
I can go to the
-Stacks project
-nlab
-proofwiki
-mentioned here pi-base
-libraries of another provers like Isabelle and Metamath or CoQ
-even my Aluffi book

look if something is absent in lean and do work on it?

Anotherpieceofmyhead (Dec 07 2025 at 14:52):

Im asking because the part where I do check if the theorem is already in lean or not sounds pretty time consuming, so if there is a list like "do lemmas x y z from here and here" it would be nice to know before I dig

Snir Broshi (Dec 07 2025 at 17:20):

Anotherpieceofmyhead said:

if there is a list like "do lemmas x y z from here and here" it would be nice to know before I dig

What about the list I sent?

metakuntyyy (Dec 07 2025 at 18:09):

@Anotherpieceofmyhead To be fair I also have that problem in what to formalise. Most of the stuff in pretty hard to formalise in Mathlib generality for me and I am looking for easy and low hanging fruits.

I don't know if anyone is working on them but how would you like formalising probability distributions. https://en.wikipedia.org/wiki/List_of_probability_distributions There's a big list, (almost :sweat_smile:) surely that should be useful.

Snir Broshi (Dec 07 2025 at 18:14):

I think probability is a bit difficult for someone new to Lean, but if you're up for it you can start by reading this blog post.
I think this is the home of all probability distributions in Mathlib, maybe you can knock off more from the list.

Weiyi Wang (Dec 07 2025 at 18:14):

I think the first few project doesn't need to aim for Mathlib generality as a whole. You can try formalizing (a special case of) a theorem as its own project, and during the process you will likely discover missing lemmas in mathlib, which are easier to formalize in Mathlib generality and suitable for contribution.

Kevin Buzzard (Dec 07 2025 at 18:19):

I think that targeting mathlib for a first project is no longer a reasonable approach, things are very different to a few years ago and most of the low-hanging fruit is long gone. I just tell my students to find a theorem they like and prove it, and don't worry at all about whether it's already in the library. Writing code is the way to learn.

Kevin Buzzard (Dec 07 2025 at 18:22):

I think that if I were assigned a PR proposing adding some random class of number like primary pseudoperfect number or whatever, I would be sorely tempted to just close it as not something we particularly want in the library unless I could see a clear application.

Snir Broshi (Dec 07 2025 at 18:45):

Kevin Buzzard said:

I think that if I were assigned a PR proposing adding some random class of number like primary pseudoperfect number or whatever, I would be sorely tempted to just close it as not something we particularly want in the library unless I could see a clear application.

I guess that depends on what you classify as "random", but you did approve #14269. docs#Nat.Weird isn't used for anything.
Doesn't a theorem such as every highly abundant number above 3 is even count as an application?
I agree if we're talking about a random number class that has a single paper about it and just happens to have a Wikipedia page, but much like the 1000+ theorems list, isn't having a medium-sized Wikipedia article a witness to the number class not being super random?

Snir Broshi (Dec 07 2025 at 18:59):

Here's another number theory (adjacent?) project suggestion: prove that all the definitions of a perfect field on Wikipedia are equivalent. I'm not sure which ones already exist (I did find docs#perfectField_iff_splits_of_natSepDegree_eq_one in another file) but for example "The separable closure of k is algebraically closed" only has one side of the implication (search PerfectField, IsSepClosure on #loogle).

Kevin Buzzard (Dec 07 2025 at 19:14):

I think that I am nowadays very much on the side of "if there isn't a recent research paper in the Annals of Mathematics about it then why are we bothering with it" side of things; other maintainers have far more liberal views. I have found that targetting the mathematics being done by experts is the most profitable way of growing mathlib and this is now possible but it's in general difficult for beginners to do. My personal feeling is that moving in any other direction is wasting reviewer time. But as I say I am only one of 30 or so maintainers.

Developing more of the theory of perfect fields would be a great project. The issue with getting a beginner up to mathlib-contribution level if they don't have a local expert to hand is that it's again a very inefficient way of making progress. This is why I usually encourage beginners (for example the 30 people taking Imperial's formalization course every year) to just work on a project of their own choosing and not to target mathlib until they are proficient.

Joseph Myers (Dec 08 2025 at 00:04):

Some fields (e.g. combinatorics) tend to make more use of specialist journals and less of generalist journals; I don't think the culture of where people tend to publish in different fields is a particularly good basis for deciding what goes in mathlib (or, when we have multi-repository mathlib, what goes in the core mathlib repository versus one of the sub-mathlibs).

Michael Rothgang (Dec 08 2025 at 10:41):

I think Kevin's statement should be broadened to "Annals or a similar top-tier journal in the respective field" --- my impression is that this is what's meant anyway, and looking at the Annals is a starting point, not the final list.

Snir Broshi (Dec 08 2025 at 19:05):

By that logic no progress can happen in graph theory in Mathlib. Surely none of the basic work from the 1700s has appeared in journals recently. We will hopefully get there, but your bar is super high, it's like you're saying that if a graph theory lemma doesn't help proving Robertson-Seymour then it should be deleted.
For a silly example, I've added a proof that a graph is either connected or its complement is (docs#SimpleGraph.connected_or_connected_compl) and it probably won't be used to prove any result from a recent research paper.

Kevin Buzzard (Dec 08 2025 at 19:16):

Well you will be pleased to know that plenty of other maintainers have plenty of other views. I'm just presenting my personal, rather extreme, and highly outspoken viewpoint which conforms to my personal goals for mathlib (get it to the point where it understands fashionable modern research as quickly as possible because that's where the big prizes are) rather than the community's goals (which seem to me to be ill-defined but certainly for some maintainers will include formalizing random stuff because they are happy with small prizes).

But let's get back to the point. The OP is asking how they might contribute. I am saying "please no more PRs adding random predicates on naturals for no good reason" but that's a negative response. Perhaps a more positive response is "which areas of mathematics are you an expert in? What are the challenging problems in your area? Can you work towards those? Can we even state them?".

Michael Rothgang (Dec 08 2025 at 19:25):

Regarding your example: I believe that "obvious" little lemmas have a tendency to show up when you don't expect them --- as long as the statement is a reasonable lemma about objects we find interesting, that might already suffice. (For new definition, I'd be more careful.)

Anotherpieceofmyhead (Dec 08 2025 at 21:46):

wow, that was a pretty long "if you are not proficient enough you are not very much needed we have an army of bachelors for that ourselves" topic, thanks for participating, I guess.
Ill stick to @Snir Broshi links and recommedations, thanks. Anyways Ill also go through Aluffis book, so may be there are one or two lemmas that were forgotten.

Yan Yablonovskiy 🇺🇦 (Dec 08 2025 at 21:53):

Anotherpieceofmyhead said:

wow, that was a pretty long "if you are not proficient enough you are not very much needed we have an army of bachelors for that ourselves" topic, thanks for participating, I guess.
Ill stick to Snir Broshi links and recommedations, thanks. Anyways Ill also go through Aluffis book, so may be there are one or two lemmas that were forgotten.

I wouldn’t put it so bleakly, it is more that there are very few lean experts with very little time to fully mentor/teach new people, relative to the number of new people. And there are already so many pull requests into Mathlib as is.

That first part of the comment kind of describes interactions of mutual interest in general, and I think for most lean experts it is a fully selfless thing to spend their time mentoring someone very new.

For example , to practice I chose to formalise some things about algebraic wheel theory. Even though it’s missing , I knew the entire time this has no place or interest to be in Mathlib. I think the moral is that you can formalise new things , without it being linked to Mathlib.

For my own learning , since there are sparse resources , I also just review merged pull requests into Mathlib and read all the reviewer comments and discussion etc. Can learn a lot that way and eventually if you want you might end up doing something with Mathlib .

Also keep in mind there are lots of smaller projects you can try join

metakuntyyy (Dec 12 2025 at 09:41):

Keep in mind that the beginners of today become the experts of tomorrow.

I started lean about a year ago and due to the many questions I asked during that time I became a lot more proficient at it.
Thanks to some help from more experienced users (in particular @Aaron Liu and @Kenny Lau but also many more) which gracefully took their time to answer my sometimes silly questions I got good enough to answer simpler questions to newer users.

@Anotherpieceofmyhead Here's something which you can do, if you don't mind busywork. I find that there are tons of lemmas in the library that are missing grind annotations. Sometimes in my work grind is very smart and shortens a 10 line proof to one grind call. Sometimes it's not so smart and that's usually where some grind annotations might be missing. I'm still baffled how good grind is. Adding the correct grind annotations and making it even smarter for everyone downstream would even be better.

Kevin Buzzard (Dec 12 2025 at 10:23):

Anotherpieceofmyhead said:

wow, that was a pretty long "if you are not proficient enough you are not very much needed we have an army of bachelors for that ourselves" topic, thanks for participating, I guess.

I do realise that my comments come across as harsh. I have thought quite a bit about this thread since making them, not least because I wanted to understand why I instinctively gave such a hostile answer. I think that a perhaps more reasonable way of presenting my thoughts on "I'm new, how do I contribute to mathlib" is perhaps this.

Say I just learnt about C++ and thought "wow, this is a great language". Should my next thought be "I want to make C++ better, how do I contribute to the C++ language" and obviously this isn't the right idea at all; the C++ languge is maintained by a small group of experts and someone who's just come across it isn't yet ready to start changing the language itself. The correct thing to think next is "I want to learn more about C++, maybe I should make some C++ projects of my own".

Similarly here, I think the next step after "I've just discovered Lean" is to choose a topic which you personally are interested in and to develop some of the theory yourself in your own project without worrying at all about contributing to a library. So basically answer (1) in your original message.

The background to my original answer, which was basically "please don't do (2)" but expressed more harshly, is that I personally feel that mathlib is currently overwhelmed with people making random contributions which I feel is having a detrimental impact on the ecosystem, creating reviewer burden with very little obvious gain in terms of the big goals which the community has, and that it's already extremely hard work trying to get on top of this problem.

suhr (Dec 12 2025 at 11:05):

I want to make C++ better, how do I contribute to the C++ language

The funny thing is, I know a person who did exactly that.

suhr (Dec 12 2025 at 11:07):

But you don't start with contributing into C++ language, you start with diving into a C++ compiler and figuring out how to do thing you you want to do.

suhr (Dec 12 2025 at 11:12):

And unlike a compiler, there's not much difference between adding definitions to your fork of Mathlib and adding definitions to your own library that uses Mathlib. But the later is easier to maintain.

Bryan Wang (Dec 12 2025 at 12:06):

Suppose I wanted to ask "how do I contribute to mathematics?" Well, I have two options. One option is I could go look in the various mathematics research journals that are being published out there and start learning about the maths that's going on there. After some years I learn enough and maybe I'm able to make a contribution to a research journal. But if I choose this path then I probably won't be able to contribute directly to the journal while I'm still learning, I'll maybe have to start with learning from a textbook. The other option is that maybe after looking in these journals I decide it doesn't really interest me, and what really interests me is to, say, contribute to a recreational maths website/magazine, or an online database of my liking, etc etc (nowadays there are so many places online where one can make a contribution). That's perfectly fine and it could be way more fun as well. In this option you might not always be able to contribute to all those journals from before, but this is simply because these journals have their own editorial considerations (someone has to look through your contribution, after all). Either way, you're still contributing to mathematics nonetheless.

It's really no different for Lean. And in fact the great news is it probably doesn't take years to be able to contribute to mathlib, and there's already a wonderful online textbook to start with called Mathematics in Lean. The best advice, as already mentioned above, is to start with any piece of mathematics you're really interested in, and try to write that in Lean (and once you've tried some code you can even ask questions or get feedback on zulip, something you don't normally get that often in learning traditional mathematics). Whether your code eventually ends up in mathlib or another repository, you've still contributed to the entire Lean ecosystem and community nonetheless.

Artie Khovanov (Dec 16 2025 at 22:23):

I would draw a distinction between wanting to PR superduperabundantnumbers or whatever, and wanting to PR basic results about commonly-used objects without necessarily having a plan to connect that result to a piece of modern maths. I'm not a maintainer and have no say in the goals of Mathlib, but surely the latter is the entire point of Mathlib?

Kevin Buzzard (Dec 16 2025 at 22:33):

Contributing API lemmas which look useful is usually welcome. Indeed I opened a PR today #32975 which does just this -- because it came up in my own work and it seemed to be missing. Note that I discussed it beforehand here #Is there code for X? > countable fibres, countable target => countable source @ 💬 . The thing which caused my visceral reaction above was the long list of definitions. A PR making a definition is much harder to review than a PR which just proves lemmas, because each definition comes with a cost (making all the lemmas to make it useful) and a possibly hard question ("is this the idiomatic way to capture this idea in Lean's type theory?"). Lemmas are cheap as long as the proofs compile quickly.

Artie Khovanov (Dec 16 2025 at 22:37):

I understand, thanks for explaining. Not looking forward to the inevitable wave of "low hanging fruit" AI slop definitions..

Kevin Buzzard (Dec 16 2025 at 22:41):

If you were a moderator you'd get to see far more of the AI slop posts we're already frantically deleting!

Artie Khovanov (Dec 16 2025 at 22:56):

Yeah I wouldn't wish it on anyone


Last updated: Dec 20 2025 at 21:32 UTC