Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: new repo for IMO questions etc

Kevin Lacker (Oct 12 2020 at 22:37):

I did not have an opinion before, but I am starting to be in favor of a new repository for IMO questions

Kevin Lacker (Oct 12 2020 at 22:37):

it just seems like in practice, most of my effort formalizing problems does not go into the actual formalization, it goes into making changes to satisfy code review

Kevin Lacker (Oct 12 2020 at 22:39):

it's nice to add new lemmas to mathlib, but it doesn't seem necessary for the task of creating training data. in fact it might be harmful - for a realistic situation, you would be creating a proof that is not minimized by refactoring parts of it into mathlib

Kevin Lacker (Oct 12 2020 at 22:39):

so you wouldn't want training data that was created by refactoring a proof partly into mathlib

Kevin Lacker (Oct 12 2020 at 22:43):

but really, the key problem is that to train anything, we need data sets on the order of at least hundreds of instances. I think we should lower quality to increase throughput.

Daniel Selsam (Oct 12 2020 at 22:49):

Kevin Lacker said:

it just seems like in practice, most of my effort formalizing problems does not go into the actual formalization, it goes into making changes to satisfy code review

Do you expect this to be an ongoing cost, or for it to converge to zero as you get used to mathlib style?

Kevin Lacker (Oct 12 2020 at 22:53):

other people can feel free to chime in but it seems similarly bothersome for other people who are formalizing problems, looking at the pull requests

Kevin Lacker (Oct 12 2020 at 23:04):

it isn't "stylistic" really. it's that, for mathlib the spirit is, we should find the absolute best proof out there. so if there is some way to rewrite 100 lines of the proof to make it 80 lines that works in a totally different way, let's do that. and basically I do not think we should try to do that with IMO problems because we are so short of having a useful amount of data there.

Heather Macbeth (Oct 12 2020 at 23:11):

Hi @Kevin Lacker, a few thoughts (speaking only for myself and not anyone else involved with mathlib). First, I have really enjoyed seeing your IMO PRs! Like we were discussing last week, I think they often put a great spotlight on "easy" things mathlib is bad at doing, and I tend to see them as opportunities to think about why mathlib is bad at these easy things.

Heather Macbeth (Oct 12 2020 at 23:14):

Secondly, I think you were frustrated by this comment in particular I made on #4518

Idea 1: factor out the pure algebraic lemma that a ^ 2 + (2 * a ^ 2 - 1) ^ 2 + (4 * a ^ 3 - 3 * a) ^ 2 = 1 if and only if (2 * a ^ 2 - 1) * (4 * a ^ 3 - 3 * a) = 0. This lemma is a little nontrivial, since a ^ 2 + (2 * a ^ 2 - 1) ^ 2 + (4 * a ^ 3 - 3 * a) ^ 2 = 1 is directly equivalent to a * (2 * a ^ 2 - 1) * (4 * a ^ 3 - 3 * a) = 0, so to cancel the a factor you have to notice the presence of a second a factor in the third term. If this step were being automated, it would be by showing that the ideals generated by the two expressions have the same Jacobson radical.

and I am sorry if it seemed like I was doing this just to "find the absolute best proof out there" -- I was actually trying to figure out if there was a more automation-friendly proof! Cf this discussion.

Kevin Lacker (Oct 12 2020 at 23:17):

I do think we should check that pull request in, but this is actually only the 3rd most inordinately-delayed-in-my-opinion of my pull requests

Kevin Lacker (Oct 12 2020 at 23:18):

behind https://github.com/leanprover-community/mathlib/pull/4450 and https://github.com/leanprover-community/mathlib/pull/4366

Kevin Lacker (Oct 12 2020 at 23:19):

so, I don't think it's just you. it's more like, on every pull request there are many comments that are not like "this doesn't adhere to the style guide" or "this doesn't work", but more like "it is possible to write this part in 20 lines instead of 30 if you do it totally differently"

Kevin Lacker (Oct 12 2020 at 23:20):

most of my code here is actually getting discarded when reviewers rewrite it! i mean, it's getting better, certainly. but I wish that other people were submitting their own IMO formalizations with that energy, since we only have like, 6 of them

Heather Macbeth (Oct 12 2020 at 23:25):

I didn't closely follow the content of those other PRs (even though I did a massive review for style on one of them), so I don't know the details, but am I correct that it was from one of these PRs that the new norm_digits tactic was inspired? This seems like it will be useful for a lot of future IMO problems, and worth some delay.

Heather Macbeth (Oct 12 2020 at 23:26):

But in general, to bring up here something I wrote in #PR reviews , I would be happy to explicitly introduce a different standard for the acceptance of IMO PRs than other PRs. What do others think?

Kevin Lacker (Oct 12 2020 at 23:31):

to me, the lean programming language offers something incredible - the knowledge that if code compiles, it must be correct. the logical thing to do is to accept the benefits of this magical power, and use it to stop worrying about golfing down to the lowest amount of lines possible, and accept anything that both meets the style guide and compiles. :smile:

Heather Macbeth (Oct 12 2020 at 23:42):

This all seems reasonable to me, for the archive/imo folder. (Again, I hope lots of other people will chime in.). I do think that if a reviewer can spot a reorganization into pieces, where one or more of those pieces could naturally be automated in the future (norm_digits tactic, Grobner basis tactic, interval arithmetic tactic, Sledgehammer-like tactic), then there is (possibly, if not overly burdensome) merit in reorganizing in that way.

Joseph Myers (Oct 13 2020 at 00:34):

I think actually getting a formalization of a solution to the original problem added to a repository is somehow not the most important part of formalizing IMO problems right now. There are several other things about it I think are at least as important:

Discovering gaps in mathlib and adding definitions to fill those gaps.
Discovering gaps in mathlib and adding lemmas to fill those gaps. Many of those will be useful for non-olympiad formalizations as well. (Although you could go through chapter 2 of the IMO Compendium and formalize all the definitions and results there, inevitably that sort of thing misses lots of the too-trivial-to-state-in-textbooks lemmas that are useful for formalization.) In general, formalization seems to involve a very lemma-centric view of mathematics, where reusable lemmas can be more important than a "main result" that nothing will depend on.
Discovering gaps in automation, where something obvious to humans, but that doesn't clearly take the form of a general lemma that can go in mathlib, takes a lot of Lean code. Again, better automation is likely to be more generally relevant.
Learning generally how the length / difficulty of human solutions compares with that of formalizations in different areas, even if the differences don't clearly point to missing definitions, lemmas and automation. Having the formal proofs be cleaned up helps indicate the length difference is because of actual complexity in formalization rather than simply some proofs being more optimized than others.

The one way in which I think IMO PRs should be considered different from others is that if there are multiple, genuinely mathematically different approaches to a problem (as opposed to one just being a cleaned-up version of another), it makes sense to formalize all of them, to learn about all the above issues for each of the approaches, and so the fact that one approach might give a nicer formalization isn't a reason to exclude another approach (supposing each approach is given in a suitably cleaned-up form). (I like having lots of alternative solutions in general - hence the IMO 2019 shortlist including lots of them and ending up 100 pages long.)

Joseph Myers (Oct 13 2020 at 00:56):

Even if you don't expect a problem to lead to new definitions/lemmas/automation in mathlib, of course "it should be fun to formalize" or "I'll learn something myself by formalizing it" are also good reasons for formalizing it (but reviewers may bring their own perspectives on what mathlib can gain from the proof).

Floris van Doorn (Oct 13 2020 at 01:04):

I am happy to loosen the standards for proofs of IMO problems a bit, and allow long/inefficient proofs if a simpler proof also exists.

However, I think two things should still be satisfied:

Basic lemmas that are clearly mathlib-worthy should be separated out and put into mathlib proper.
Maintaining the proofs should not be much more effort than a typical mathlib proof. This means things like avoiding nonterminal simp/automation and no proofs that take a long time to compile.

Joseph Myers (Oct 13 2020 at 01:05):

Will these formalizations end up useful as training material for an AI? I've no idea. Maybe an AI could also learn from PR reviews or the mathlib git history, not just the versions that end up in the mathlib archive. There are lots of things different people might hope to learn, or hope for mathlib as a whole to learn, or for a future AI to learn, from formalizing solutions to problems.

Bryan Gin-ge Chen (Oct 13 2020 at 01:11):

I don't have much to add to what Heather, Joseph and Floris have already said, except that you should feel free to ask reviewers to implement their suggested changes if you don't feel up to it. The "help-wanted" tag is there for a reason!

Joseph Myers (Oct 13 2020 at 01:13):

Maybe "we have formalizations of all past IMO problems" or "all IMO <some-recent-year> problems" would mean more olympiad people trying playing with formalization (or that Kevin Buzzard could successfully encourage more olympiad people to do so). But I think either of those is still some way off (more geometry and graph theory still needed in mathlib).

Kevin Lacker (Oct 13 2020 at 03:58):

I appreciate the chipping in from other people to get PRs submitted.

Kevin Lacker (Oct 13 2020 at 04:05):

I still think we have a big problem - where "we" might not be the larger mathlib community, it might just be the people who would like to create a data set useful for AI IMO problem solving - that we are not really on track to have enough data to be useful within, say, a year and therefore we should figure out how to get more

Johan Commelin (Oct 13 2020 at 05:47):

Joseph Myers said:

Will these formalizations end up useful as training material for an AI? I've no idea. Maybe an AI could also learn from PR reviews or the mathlib git history, not just the versions that end up in the mathlib archive. There are lots of things different people might hope to learn, or hope for mathlib as a whole to learn, or for a future AI to learn, from formalizing solutions to problems.

We do throw away an insane amount of useful git history by squashing PRs into mathlib. Like 37 "wip" commits... all gone.

Johan Commelin (Oct 13 2020 at 05:54):

@Kevin Lacker I understand the frustration. But really, the hope is that by refactoring proofs, the workload will become easier for future problems, because you'll have reusable parts from earlier problems. I very much agree with what Joseph has said; in particular his point about multiple solutions. Until now, I hadn't really looked at it that way. But I guess there is nothing wrong with having 3 solutions in the file.
Anyhow, I hope that you will continue working with mathlib! And it's good to have this discussion!

Johan Commelin (Oct 13 2020 at 05:55):

I don't know enough about AI to know what kind of data is best for training data. Is quantity the most important factor? Or should it also be deduplicated, refactored, designed, ...
I simply don't know.

Kevin Buzzard (Oct 13 2020 at 06:34):

@Kevin Lacker maybe this point has been made already, but mathlib has two properties:

1) It's hard work to write Lean code which is acceptable (as you're discovering)

2) Conversely, it is maintained by a team of hard-working people, who ensure it always compiles, and this part of the job is made easier by contibutors keeping to stringent rules to make maintainers' lives easier.

The moment you "break away" and start formalising things using less stringent criteria and your own rules, you run the risk of having a repo which compiles in 2020 and then is broken in 2021 and of no use to anyone. I speak as someone who spent 8 months of their life defining perfectoid spaces with Patrick and Johan, and now have a repo which is broken and no use to anyone because we didn't do the work you're complaining about. The alternative is much worse.

The pain you're going through (and which we all go through -- I've had 5 line PR's which take a week or more to get accepted) is the price you currently pay for the privilege of having your code work forever in Lean 3. I've learnt the hard way that it's worth it.

Kevin Buzzard (Oct 13 2020 at 07:13):

The argument for "doing it properly now" is that the alternative is "waste time not doing it properly now, and then refactor later". The maintainers have a good eye for the correct level of generality that should be used in mathlib in order to be helpful for all.

Kevin Lacker (Oct 13 2020 at 07:32):

Well, perhaps we can come to a happy medium by staying in archive/imo and also reviewers become slightly more lax in there

Oliver Nash (Oct 13 2020 at 08:04):

I would also add:

Thank you very for your submissions @Kevin Lacker. I personally really appreciate the time you've generously spent on these PRs!
I've been contributing on and off for about a year and as far as I can tell (e.g., from the number of PRs arriving) the maintainers are busier than ever right now.
I have come to enjoy the review process. The community always finds improvements that I missed and much of what I now know was taught to me this way.
I don't have strong feelings on the separate repo question but my null hypothesis is that the natural time to split out the IMO material is when we have had an appropriately long / diverse sequence of solutions merged, of which sufficiently few also plugged gaps in src.

Patrick Massot (Oct 13 2020 at 08:09):

I don't know anything about machine learning and automation, but I would be very surprised if the most efficient solution was to put as many low quality solutions as possible.

Kevin Lacker (Oct 13 2020 at 08:28):

that's kind of the modern trend in machine learning, as much low quality data as possible

Patrick Massot (Oct 13 2020 at 08:29):

Except we will never reach the quantity of data required there.

Kevin Lacker (Oct 13 2020 at 08:30):

perhaps we will have to auto-generate even larger amounts of lower-quality proofs. but that's sort of getting off topic

Patrick Massot (Oct 13 2020 at 08:31):

Since this is the IMO grand challenge stream, it may be worth recalling that Daniel doesn't plan to simply throw data to a neural network and hope. Did you watch the talk he gave a couple of weeks ago? He definitely wants something much more structured.

Kevin Lacker (Oct 13 2020 at 08:33):

well, we'll see what we have to work with

Joseph Myers (Oct 13 2020 at 12:42):

Johan Commelin said:

We do throw away an insane amount of useful git history by squashing PRs into mathlib. Like 37 "wip" commits... all gone.

One version of the history for each PR is available via refs/pull (see git ls-remote for all the refs available in the repository). But I'm not sure if there's any good way to get multiple versions of the history from GitHub for a PR that was rebased.

Johan Commelin (Oct 13 2020 at 12:45):

rebasing makes it even worse... but it's good to know that the history isn't completely lost.

Heather Macbeth (Oct 16 2020 at 04:37):

Joseph Myers said:

The one way in which I think IMO PRs should be considered different from others is that if there are multiple, genuinely mathematically different approaches to a problem (as opposed to one just being a cleaned-up version of another), it makes sense to formalize all of them, to learn about all the above issues for each of the approaches, and so the fact that one approach might give a nicer formalization isn't a reason to exclude another approach (supposing each approach is given in a suitably cleaned-up form).

How do you want to structure the IMO archive to allow for multiple solutions? I have a variant solution to #4518 which I would like to contribute (it's the solution sketched here). Should I append it to the same file?

Yury G. Kudryashov (Oct 16 2020 at 04:41):

I think yes, multiple solutions in the same file.

Heather Macbeth (Oct 16 2020 at 04:42):

With lemma names like imo1962_q4, imo1962_q4' ?

Last updated: May 02 2025 at 03:31 UTC