Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: Alibaba competition 2024 - AI track


Junyan Xu (Jul 24 2024 at 15:16):

Let me mention that this year's Alibaba math competition (which involves more advanced math than the IMO) had an AI track (with prize money roughly 10% of the 1st AIMO progress prize), and the top two teams open-sourced their approaches (which did, however, make use of GPT and Claude):
1st place (a high school student in Shanghai): Twitter, blog post, GitHub
2nd place: GitHub, WeChat post, logs and output
There was also an online seminar (in Chinese) where the top three teams shared their approaches, which you can watch by scanning the left QR code below using WeChat (or DM me for a YouTube link).
52c653fce382e60b34f596ebbd703ec.png

Jason Rute (Jul 25 2024 at 07:51):

@Junyan Xu I think most of us are not familiar with this. Can you clarify:

  1. What kind of competition is this? You said harder than the IMO. Is it more like Putnam problems? How many problems are there? Did they need to supply proofs?
  2. How well did the winning teams do? How many problems did they get right, and how does that compare to humans who score at that level?
  3. What are your informal impressions? Were the winning team's results impressive, or is there a ways to go?
  4. (Optional) Do you have a quick summary of the winning teams' strategies? Did they use code? Did any of them use formal methods (probably not)? Or was it just straight LLM stuff?

Junyan Xu (Jul 25 2024 at 15:38):

@Jason Rute

  1. I was not saying that the Alibaba competition is harder than the IMO, only that it's more advanced in the sense that the problem statements and/or solutions are not restricted to the pre-university level. In 2023, the gold medalists were a sophomore and two PhD students at Peking University and a PhD student at Stanford. There are seven problems this year; proofs need to be supplied for some problems, but there are also multiple-choice problems. Contestants are required to submit their code (and its MD5 checksum) beforehand, and Alibaba afterwards verifies that the submitted solutions agree with the output of the code. (At least this is how I understand it works.)
  2. This year's AI track was held for the qualifying (preliminary) round only, and no AI qualified for the final round; the qualifying cutoff is 45 points, but the best-performing AI only got 34 points (the 2nd/3rd teams got 27 points). Note that the qualifying round is open-book for humans (consulting online and offline materials and programming are allowed). You can see all the 2024 qualifying problems here. (The total number of points is not shown but should be 120 according to this.) This is the sixth edition of the competition; you can view past qualifying problems here (under Past History), and under Competition News you can also find some final-round problems. The organizing committee includes Weinan E, Alessio Figalli, Gang Tian, and Yitang Zhang.
  3. I don't think any of the top three approaches used formal methods, but the top two approaches used multiple (at least two) agents. None of them trained or fine-tuned any models. I think the results are not super impressive, but decent.
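
The submit-hash-then-verify scheme described in point 1 (as I understand it) amounts to a simple commit-and-reveal protocol: contestants declare the MD5 of their code before the round, and the organizers later check both that the code matches the declared hash and that rerunning it reproduces the submitted answers. A minimal Python sketch of such a check (the function names and file layout are hypothetical, not Alibaba's actual tooling):

```python
import hashlib


def md5_of_file(path: str) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_submission(code_path: str, declared_md5: str,
                      rerun_output: str, submitted_answers: str) -> bool:
    """Accept a submission only if (1) the code matches the MD5 that was
    declared before the round, and (2) rerunning the code reproduces the
    answers the contestant submitted."""
    return (md5_of_file(code_path) == declared_md5
            and rerun_output == submitted_answers)
```

This prevents contestants from swapping in different code after seeing the problems, since any change to the code changes its MD5.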

Notification Bot (Jul 25 2024 at 15:39):

3 messages were moved here from #IMO-grand-challenge > AI Math Olympiad Prize by Junyan Xu.


Last updated: May 02 2025 at 03:31 UTC