Zulip Chat Archive

Would it make sense to develop a general competition problem solver by focusing on problems from easier competitions first? Competitions like Mathcounts, AMC, AIME. The questions from these competitions only require a correct answer (numerical or multiple-choice), rather than formal proofs. But there is nothing stopping you from providing a formal proof to accompany your answer, and indeed, an automated solver that reaches an answer without a formal proof arguably did not reach that answer legitimately. In my mind, (one of) the hard part(s) of the IMO Grand Challenge is developing an algorithm to explore the game-tree of possible logical deductions: identifying promising branches to take based on a distance-to-goal-node estimation function trained from examples. An AlphaZero-inspired approach seems logical, although there are important fundamental domain differences that likely demand new ideas. Easier competitions seem like a better training ground to develop this algorithm - easier to generate a large volume of problems and solutions, and we can operate the learning algorithm on a vastly pruned mathlib to limit tree branching factor to start. (I just joined here, so forgive me if I'm asking something already discussed.)

Scott Morrison (Sep 24 2020 at 03:29):

We have so few formalised problem statements in hand at this point, that it seems to me this is the obvious place to start.]

Scott Morrison (Sep 24 2020 at 03:31):

We're certainly happy to collect IMO problems with solutions in the archive/ directory of mathlib. Possibly we are happy to collect statements there as well (but I have a pretty maximalist view of "what belongs in mathlib").

Scott Morrison (Sep 24 2020 at 03:31):

Certainly a concerted effort to start formalising statements at least is important.

Scott Morrison (Sep 24 2020 at 03:31):

e.g. once the 2020 problems came out yesterday, we probably should have a collaborative bash at formalising all the statements.

Scott Morrison (Sep 24 2020 at 03:32):

(I did the statement of Q5, and would be very happy to hack on others with others, or to know where the formalised statements "belong".)

Daniel Selsam (Sep 24 2020 at 03:40):

@David Shin Hi David.

In my mind, (one of) the hard part(s) of the IMO Grand Challenge is developing an algorithm to explore the game-tree of possible logical deductions

But what game is the agent playing? My view is that devising the game might be the hardest part.

An AlphaZero-inspired approach seems logical, although there are important fundamental domain differences that likely demand new ideas.

I don't see how AlphaZero applies, since there is no symmetric 2-player game to bootstrap data from.

David Shin (Sep 24 2020 at 04:20):

Hi Daniel,

But what game is the agent playing? My view is that devising the game might be the hardest part.

The formal game, as I imagine it, is to find a directed path from a given start node to a given end node in the mathematics graph. Each vertex of this graph represents a set of mathematical statements, and a directed edge connects u to v if the application of a logical deduction rule allow you to deduce that the statements of v are true if the statements of u are true. The start node consists of the given assumptions of the problem, and the end node contains the statement to be proven. For a non-proof problem like in Mathcounts/AMC/AIME, the formalization would be slightly different.

I don't see how AlphaZero applies, since there is no symmetric 2-player game to bootstrap data from.

The properties of AlphaZero that I had in mind are that it learns from training examples, traverses the game tree randomly at first, learns an evaluation function that takes game state as input and produces output that induces a preference ranking over candidate branches, performs this learning based on terminal node outcomes from tree traversal trials, and applies that evaluation function with a tree-search algorithm mid-game. A loose analogy, but that is all I meant. I have trouble imagining how IMO-level problem solving could emerge without this sort of architecture in place.

Since you bring up bootstrapping - one idea I have to automatically generate truly large sets of problems is to create random generators of problems of a very particular structure. For example, "maximize f(x) subject to constraint c(x)", where f and c are generated by randomly combining mathematical operators, variables, and constants. Different combinations could demand different solution techniques, and a learning algorithm like what I outline could potentially develop an "intuition" of the most promising branches to take through the tree. Such a system might generate a solver that can only solve one particular type of problem, but I think it could be a useful way to start. With that said, I think a learning approach that doesn't require massive data (like deep learning does) will ultimately be needed here. Humans can learn new problem solving techniques from a single example, and so the right learning algorithm should be capable of the same.

David Shin (Sep 24 2020 at 04:26):

The idea of randomly generating problems and using deep learning to learn how to solve them was used in the domain of symbolic integration and differential equations here: https://arxiv.org/abs/1912.01412

Kevin Lacker (Sep 24 2020 at 06:56):

in practice, we don't have a large library of either IMO or AMC/AIME/mathcounts problems. if part of this is training an AI we might need to go even easier and do something like autogenerating some format of problem

Kevin Lacker (Sep 24 2020 at 06:57):

differential equations kinda make sense as something for computers to do, but they aren't really "on the path" to IMO problems

Kevin Lacker (Sep 24 2020 at 06:57):

i wonder if inequalities would be a good place to start... of all the IMO categories, inequalities really seem like, you don't use your deep mathematical intuition as much as you play this game with trying to arrange the variables in a place where one of the key theorems can work

Kevin Lacker (Sep 24 2020 at 06:58):

they also seem kinda easy to enter into lean as problem statements

Kevin Lacker (Sep 24 2020 at 06:59):

so under this strategy rather than adding a bunch of e.g. mathcounts problems we would enter a bunch from something like https://www.amazon.co.uk/Advanced-Olympiad-Inequalities-Algebraic-Geometric/dp/1794193928/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=1556093238&sr=1-1-catcorr

Kevin Lacker (Sep 24 2020 at 07:04):

@Daniel Selsam when is lean 4 going to be ready? from your talk it sounded like you thought writing problem-solving type of code should happen after lean4 came out

Joseph Myers (Sep 24 2020 at 10:10):

If we think "all 368 past IMO problems would be appropriate in archive/imo" (ignoring for now the questions of other competitions, and of detailed conventions for writing formal statements, and of how much semi-redundant definitions should be added to mathlib to allow more literal translations of statements into Lean), then one thing that should be done is to add proofs people already formalised but didn't add to mathlib at the time, such as those previously discussed of IMO 2019 problem 1 and IMO 2019 problem 4. Though I think the principle of putting lemmas and definitions in mathlib proper whenever they could reasonably be useful not just for that problem is an important one here; I don't know how many of the lemmas in that problem 4 solution are in mathlib by now.

Joseph Myers (Sep 24 2020 at 10:18):

If we might add statements without proofs, there's also the question of whether we can add corresponding definitions to mathlib (geometry especially needs lots more definitions added) without the associated API, given that a definition without associated API might well be inconvenient to use in practice, or actually mathematically incorrect. (A problem statement without a proof could also easily be incorrect by accident.)

Scott Morrison (Sep 24 2020 at 10:22):

Yes --- while I think statements of IMO problems without proofs would be great, I would still expect that they were written using "mathlib quality" definitions, that those definitions were PRd into mathlib proper, and our usual expectations about API were fulfilled. Obviously this will not be easy still for many problems.

Miroslav Olšák (Sep 24 2020 at 10:27):

David Shin said:

Would it make sense to develop a general competition problem solver by focusing on problems from easier competitions first? Competitions like Mathcounts, AMC, AIME. The questions from these competitions only require a correct answer (numerical or multiple-choice), rather than formal proofs. But there is nothing stopping you from providing a formal proof to accompany your answer, and indeed, an automated solver that reaches an answer without a formal proof arguably did not reach that answer legitimately.

Sure, there are many simpler competitions that can be reasonable to start with such as the national rounds before IMO, or for younger students. In the case of Czech republic (the one I am familiar with, not anything special), they are all mainly about proving something. I have heard that the olympiad in US is a bit different in that.
I also have a set of "folklore problems" collected by a friend -- the problems that are too famous to be used in a competition like that: http://www.olsak.net/mirek/folklore-problems.pdf

In my mind, (one of) the hard part(s) of the IMO Grand Challenge is developing an algorithm to explore the game-tree of possible logical deductions: identifying promising branches to take based on a distance-to-goal-node estimation function trained from examples. An AlphaZero-inspired approach seems logical, although there are important fundamental domain differences that likely demand new ideas. Easier competitions seem like a better training ground to develop this algorithm - easier to generate a large volume of problems and solutions, and we can operate the learning algorithm on a vastly pruned mathlib to limit tree branching factor to start. (I just joined here, so forgive me if I'm asking something already discussed.)

For such experiments, it looks reasonable to me to focus on a specific domain first. Euclidean gemetry (I don't mean combinatorial geometry) seems as a reasonable domain as it is about a quarter of IMO problems (and many problems from simpler competitions), yet the domain is significantly limited (about 5 types of objects) and requires just pretty simple logic.

Concerning AlphaZero, it depends what you mean by "AlphaZero-inspired". Sure, it is very likely that the AI for IMO will do some search guided by machine learning. On the other hand, AlphaZero is based on a tree search, and mathematical problem solving is not really a tree search.

David Shin (Sep 24 2020 at 12:34):

Miroslav Olšák said:

Concerning AlphaZero, it depends what you mean by "AlphaZero-inspired". Sure, it is very likely that the AI for IMO will do some search guided by machine learning. On the other hand, AlphaZero is based on a tree search, and mathematical problem solving is not really a tree search.

Might it be useful to think of some solutions in terms of a tree search? If I think about how I approach, say, an Olympiad inequality problem, I can describe my thought process as encountering branches. "This inequality looks like it might reduce to AM/GM or Cauchy-Schwarz, and I can think of a few different ways to declare the terms". That sounds like enumerate candidate branches. A machine-learning trained algorithm might identify these candidates as promising based on past examples. I, as a human, might execute a candidate branch, and then heuristically evaluate whether that got me closer or further to my goal. A machine-learning trained distance-to-goal evaluation function might serve as the analog to this heuristic evaluation.

In general, I imagine the AI will have a finite toolbox of techniques, perhaps expertly specified by humans, but ideally automatically self-learned from random exploration of training examples, and that it will perform branch-expansion down a tree by choosing from this toolbox.

Miroslav Olšák (Sep 24 2020 at 13:06):

I am not so good in solving inequalities. (I feel confident the most in combinatorics, perhaps geometry)
I agree that sometimes, there is something tree-like on mathematical problem solving. Typically when you make an irreversable step in the form "Let's try to prove this stronger hypothesis / hypothesis that could be useful.", or when you try to approach the problem with certain standard method. On the other hand, forward search can be also useful, and it is not tree-like -- for example search for "interesting" properties of a function from a functional equation, or just facts in a geometrical configuration.

And my impression is that even if I fail a branch in a tree-like search, I can often take away something out of it. Sometimes just intuition how the problem behaves, sometimes a proof of a subproblem. This feels on a very high-level as the CDCL algorithm for SAT solving -- even if I fail, there is something I could use for other branches of the same problem.

Last updated: May 02 2025 at 03:31 UTC

leanprover-community / mathlib

Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: Easier competitions

David Shin (Sep 24 2020 at 03:25):

Scott Morrison (Sep 24 2020 at 03:29):

Scott Morrison (Sep 24 2020 at 03:31):

Scott Morrison (Sep 24 2020 at 03:31):

Scott Morrison (Sep 24 2020 at 03:31):

Scott Morrison (Sep 24 2020 at 03:32):

Daniel Selsam (Sep 24 2020 at 03:40):

David Shin (Sep 24 2020 at 04:20):

David Shin (Sep 24 2020 at 04:26):

Kevin Lacker (Sep 24 2020 at 06:56):

Kevin Lacker (Sep 24 2020 at 06:57):

Kevin Lacker (Sep 24 2020 at 06:57):

Kevin Lacker (Sep 24 2020 at 06:58):

Kevin Lacker (Sep 24 2020 at 06:59):

Kevin Lacker (Sep 24 2020 at 07:04):

Joseph Myers (Sep 24 2020 at 10:10):

Joseph Myers (Sep 24 2020 at 10:18):

Scott Morrison (Sep 24 2020 at 10:22):

Miroslav Olšák (Sep 24 2020 at 10:27):

David Shin (Sep 24 2020 at 12:34):

Miroslav Olšák (Sep 24 2020 at 13:06):