Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: IMO 2020

Miroslav Olšák (Sep 23 2020 at 11:24):

Kevin Buzzard said:

(link to problems)

I am now looking (as a human, no formalism involved) to problem 5, looks interesting at first glance.

Joseph Myers said:

Now the IMO 2020 problems are out, we can see we're some way from being able to express problem 1, as that requires the concept of a list of four points forming a convex quadrilateral, and of the interior of such a convex quadrilateral (and of course we should try to define such things in a general form that also covers non-convex polygons, or the interior and exterior of a non-polygonal curve or surface, or a convex polyhedron with n faces). (We also don't have explicit concepts of "internal bisector" of an angle, but that's straightforward to define, or "perpendicular bisector" of a segment, though that can be expressed easily using mk' p d.orthogonal where p is the midpoint and d is the direction of the segment, but we may want specific definitions for both the midpoint (in an affine space context) and the perpendicular bisector.) And once the problem has been expressed, a condition involving ratios of angles seems outside the scope of any system where only equality of angles is used.

Problem 6 would be much more straightforward to express in a natural way, though we ought to add a definition of the distance between a point and a nonempty affine subspace (and of distance between two nonempty affine subspaces) rather than needing to use orthogonal_projection explicitly whenever we wish to refer to such a distance.

I see the problems quite the opposite. Problem 1 is a very typical geometry problem sharing functions / predicates with many other standard geometry problems and I believe it can be also solved in a similar way (although there can be an issue in how to obtain the required construction). I could play with it in GeoGebra / GeoLogic.

Problem 6 is combinatorial geometry -- a domain I consider among the hardest solve and also to write formally, although the problem statement itself does not need to be so difficult to express.

Kevin Buzzard (Sep 23 2020 at 11:26):

Wow I've never seen a note like that before in Q6. Does the point-awarding process only depend on the value of alpha, or is it also in some way related to the correctness of the proof? e.g. if someone correctly proves alpha=1/2 and then I incorrectly prove it, can I claim the points because of the note?

Oliver Nash (Sep 23 2020 at 11:31):

I've seen it before. See e.g., 2014Q6: http://olivernash.org/2017/08/05/visualising-imo-results/index.html?2014-6

Kevin Buzzard (Sep 23 2020 at 11:32):

and again the point-awarding process just depends on the value of c rather than validity of the proof :-)

Miroslav Olšák (Sep 23 2020 at 11:32):

The organizers checking the solutions usually agree on some rules for awarding points before they get the solutions from participants in order to make it fair and unbiased. But of course, the amount of points cannot be only strictly lead by such rules -- it is for example typical that someone makes a small mistake which can be fixed and he loose a point for that.
Here, the organizers are just being nice to reveal some idea of the scoring rules to the participants.

Miroslav Olšák (Sep 23 2020 at 11:35):

You don't get any points for completely invalid proof of course, even if your constant alpha is negative :-D.

Scott Morrison (Sep 23 2020 at 11:59):

I convinced Lean that 2 is amongst the n for Q5. :-)

Joseph Myers (Sep 23 2020 at 17:19):

I was a coordinator on IMO 2014 problem 6. The mark scheme gave ranges of marks for essentially correct proofs for c in various ranges (with the precise mark within that range of possible marks for the given c then depending on any flaws in the proof).

Scott Morrison (Sep 23 2020 at 23:17):

Scott Morrison said:

I convinced Lean that 2 is amongst the n for Q5. :-)

Even that though was much less easy than it should have been. We need more powerful and general case bashing tactics, so that a putative IMO solver has a chance to "work out the easy examples" first. We also need to bring slim_check up to the level that it is potentially useful during the first stage of thinking about a problem.

David Wärn (Sep 25 2020 at 16:40):

Is it feasible to solve Q2 using a lean tactic? Here's what I have in mind: we want to show that for any point $(a, b, c) \in[0, 1]^3$ , either the condition $a \ge b \ge c \ge 1-a-b-c > 0$ cannot hold, or the conclusion $r \cdot s \cdot t \cdot u \cdot v < 1$ must hold. Using some naive strategy, we try to bound $r, s, t, u, v$ separately to close the goal. If the naive strategy fails, we subdivide the cube into 8 smaller cubes, and recurse. This should terminate for compactness reasons

Kevin Buzzard (Sep 25 2020 at 17:23):

I think you're right

Kevin Buzzard (Sep 25 2020 at 17:25):

In practice of course, in lean, it will be a lot of work to do get $x^y$ sufficiently computable I should think -- one will certainly need a few decimal places and I'm pretty sure norm_num doesn't do this

Kevin Buzzard (Sep 25 2020 at 17:26):

I guess you could work with logs and then you'd just need to worry about accurate addition, multiplication and a log function

Kevin Buzzard (Sep 25 2020 at 17:27):

The powering function forces you to work over R not Q

Kevin Buzzard (Sep 25 2020 at 17:28):

Did you see the thread about interval arithmetic reals in #maths ?

Miroslav Olšák (Sep 25 2020 at 18:44):

The compactness argument does not really work -- we have a condition $1-a-b-c > 0$ there. In particular, if the variables are close to $(a,b,c,d) = (1,0,0,0)$ , then the value is close to 1, so a naive estimate can fail in arbitrarily small cubes close to that point.

Kevin Buzzard (Sep 25 2020 at 19:11):

but then the other estimate holds

Kevin Buzzard (Sep 25 2020 at 19:12):

Oh -- sorry -- eps^eps is close to 1 isn't it.

Miroslav Olšák (Sep 25 2020 at 19:21):

True

David Wärn (Sep 25 2020 at 19:40):

Oh you're right. The problem is harder than I thought

David Wärn (Sep 25 2020 at 20:38):

I suppose in principle points near $(1,0,0,0)$ should be easy to handle, since $\varepsilon ^\varepsilon < 1 - 999\varepsilon$ for small enough $\varepsilon$ . But this is getting complicated

David Wärn (Sep 25 2020 at 20:41):

Yes, arithmetic precision is probably also an issue here. In principle you can probably do everything in Q, by clearing denominators in exponents, but then you might end up with very large rationals

Joseph Myers (Sep 25 2020 at 22:29):

I tried the interval arithmetic approach (not formalised, using MPFR in C). If I skip $a \ge 3/4$ in the expectation that the hypothetical tactic can find a different argument using asymptotics there, my program searched 132 cubes (including the one skipped, excluding those ignored because $a \ge b \ge c \ge d = 1 - a - b - c > 0$ is trivially impossible), when using 11-bit (= IEEE binary16) or higher floating-point precision (of those cubes, 88 were leaf nodes of the search where the interval arithmetic gave a sufficiently good bound, the rest needed dividing into 8 smaller cubes each and recursing). Using a slightly lower precision still works but with more cubes needed; likewise, if the cut-off for using asymptotics instead needs to be larger than $3/4$ , it still works but with more cubes (and maybe more precision) needed. Note that this is with correctly rounded (upward) values of $x^x$ , so with higher precision used internally by MPFR to compute those values; if 11-bit precision were used internally to bound $x^x$ , the bounds would be less good so more cubes might be needed. Also note that when doing interval arithmetic, the bounds you get for $x^x$ as a one-argument function of an interval of values for $x$ are better than the bounds you get for $x^y$ as a two-argument function where the intervals for $x$ and $y$ are the same but $x$ need not equal $y$ .

Joseph Myers (Sep 26 2020 at 00:09):

Checking the bounds in more detail, $a+2b+3c+4d=1+b+2c+3d\le1+6b$ , and if $b\le1/512$ then $b^b<1-6b$ , so that covers cases with $b\le1/512$ , and, with 15-bit floating-point precision, this can be handled with 327 cubes including those skipped for small $b$ (224 cubes being leaf nodes where interval arithmetic gave a good-enough bound). I'm not sure this would be much easier for a tactic to find than a more human-style proof, given the need to prove some asymptotics along the way.

Miroslav Olšák (Sep 26 2020 at 14:31):

By the way, what is the current status of interval arithmetics in LEAN? I am considering using it for proving geometry statements in a neighborhood of a certain diagram (to check the degeneracy and ordering conditions).

Kevin Buzzard (Sep 26 2020 at 18:02):

There's a discussion about this in #maths but the short answer is that there isn't any

Miroslav Olšák (Sep 26 2020 at 18:11):

Ah, I see

Miroslav Olšák (Sep 26 2020 at 18:22):

But I remember from one of your talks that there is already a tactic in lean able to prove that 1:ℝ ≠ 2:ℝ, so there is at least something about real numbers.

Kevin Buzzard (Sep 26 2020 at 18:28):

Yes, norm_num can do that, but I'm pretty sure it's not using interval arithmetic. I have no idea what it's doing to be honest

Miroslav Olšák (Sep 26 2020 at 18:40):

The thing is whether it could also prove that $\pi \ne \sqrt2 + \sqrt3$ , or what are its limits.

Kevin Buzzard (Sep 26 2020 at 18:50):

Oh no way could it prove anything like that

Kevin Buzzard (Sep 26 2020 at 18:51):

It can't prove anything about irrational numbers I shouldn't think

Kevin Buzzard (Sep 26 2020 at 18:51):

It has no idea what sqrt(3) is to any number of decimal places

Miroslav Olšák (Sep 26 2020 at 18:54):

I see, it normalizes the numbers using rational arithmetic with +, -, *, /, but it cannot do anything with square roots or approximate values.

Mario Carneiro (Sep 26 2020 at 19:04):

In principle you could do something with cylindrical algebraic decomposition to do algebra on square roots and such but it's a lot more complicated than rational arithmetic

Miroslav Olšák (Sep 26 2020 at 19:19):

And also more complicated than calculating the result up to 3 decimal places. But I understand, the tactic is primarily meant for exact operations rather than (dis/in)equalities, and there is currently nothing in lean that would calculate numbers with a decimal point.

Mario Carneiro (Sep 26 2020 at 19:38):

well interval arithmetic is just the establishment of various carefully constructed inequalities between exact rational values

Mario Carneiro (Sep 26 2020 at 19:38):

The calculation of pi to 7 decimals can only be called an application of interval arithmetic

Miroslav Olšák (Sep 26 2020 at 20:09):

Of course, I meant it as an example, not as the entire topic. On the other hand, I would find reasoning like that relatively sufficient for now -- for the problems / applications we are discussing here.
So, how hard would it be to prove $\pi \ne \sqrt2 + \sqrt3$ in lean? The proof I have in mind is calculating both numbers to 3 decimal places and observing that they are different.

Kevin Buzzard (Sep 26 2020 at 21:07):

import tactic data.real.pi

open real

example : pi < sqrt 2 + sqrt 3 :=
begin
  have h1 : pi < 3.141593 := pi_lt_3141593,
  have h2 : 1.414 < sqrt 2,
  { rw ←sqrt_mul_self (show (0 : ℝ) ≤ 1.414, by norm_num),
    rw sqrt_lt; norm_num },
  have h3 : 1.732 < sqrt 3,
  { rw ←sqrt_mul_self (show (0 : ℝ) ≤ 1.732, by norm_num),
    rw sqrt_lt; norm_num },
  linarith
end

David Wärn (Sep 27 2020 at 13:05):

Joseph Myers said:

Checking the bounds in more detail, $a+2b+3c+4d=1+b+2c+3d\le1+6b$ , and if $b\le1/512$ then $b^b<1-6b$ , so that covers cases with $b\le1/512$ , and, with 15-bit floating-point precision, this can be handled with 327 cubes including those skipped for small $b$ (224 cubes being leaf nodes where interval arithmetic gave a good-enough bound). I'm not sure this would be much easier for a tactic to find than a more human-style proof, given the need to prove some asymptotics along the way.

Alternatively, $(a+2b+3c+4d)a^ab^bc^cd^d<(1+b+2c+3d)b^bc^cd^d < e^{b+2c+3d}b^bc^cd^d = (eb)^b(e^2c)^c(e^3d)^d$ . So we may assume $b > 1/e$ or $c > 1/e^2$ or $d>1/e^3$ .

Peter Jin (Sep 28 2020 at 13:23):

The 2020 results are now up: https://www.imo-official.org/year_statistics.aspx?year=2020 (Note the Q6 scores in particular; do they suggest a few participants found rather unique values for $\alpha$ ? @Kevin Buzzard @Miroslav Olšák )

Peter Jin (Sep 30 2020 at 03:21):

Also some context for 2020 Q6 on AoPS: https://artofproblemsolving.com/community/c6h2278656p17821829

Joseph Myers (Oct 04 2020 at 23:57):

David Wärn said:

Alternatively, $(a+2b+3c+4d)a^ab^bc^cd^d<(1+b+2c+3d)b^bc^cd^d < e^{b+2c+3d}b^bc^cd^d = (eb)^b(e^2c)^c(e^3d)^d$ . So we may assume $b > 1/e$ or $c > 1/e^2$ or $d>1/e^3$ .

If I generate the case division and rational bounds outside of Lean, I can get a complete formalised proof using this approach, with norm_num. It's very large (would be shorter using a smarter bound for x ^ x given x in an interval rather than treating it as x ^ y for x and y in the same interval but not necessarily equal) and slow, and surely not suitable for the mathlib archive (though there are a few mathlib-relevant lemmas). It also shows up some limitations in norm_num. I needed bounds on exp 1 for this approach; the bounds with 8 decimals after the point are the limit of what norm_num can handle without timing out; if I change 12 to 13 in exp_one_gt_271828182 I get a timeout from norm_num (-T100000) even though the expression it's trying to normalise is vastly shorter than in the IMO 1979 example people were trying unsuccessfully. And all the individual bounds on powers end with something like

  have h : (11 / 32 : ℝ) ^ (1 : ℕ) ≤ (1569 / 2048 : ℝ) ^ (4 : ℕ), by norm_num,
  repeat { rw ←rpow_nat_cast at h },
  norm_cast at h,
  convert h,
  norm_num

because norm_num can't handle expressions with rpow even once it simplifies the real exponent to an integer (in type ℝ) so it's necessary to use rpow_nat_cast in those proofs rather than norm_num doing that automatically.

I suppose that if each of those lemmas such as pow_5_8_11 could be proved by interval_arithmetic, that would indicate an interval arithmetic tactic that definitely does something useful, even if it can't manage other things such as the case split itself.

imo2020_q2_intervals.lean

Mario Carneiro (Oct 05 2020 at 00:14):

If you have a self contained example I can probably show you how to make norm_num do more advanced calculations

Mario Carneiro (Oct 05 2020 at 00:16):

Note that norm_num evaluates expressions in the "obvious way" from inside out, so some terms like (1569 / 2048 : ℝ) ^ (4 : ℕ) might end up being rather unnecessarily large

Mario Carneiro (Oct 05 2020 at 00:16):

it's usually possible to rearchitect the verification conditions to avoid this blowup

Joseph Myers (Oct 05 2020 at 00:23):

Here's a small example. I was hoping norm_num would be able to handle the whole proof rather than needing to use rpow_nat_cast manually.

import analysis.special_functions.pow

noncomputable theory
open_locale classical

open real

lemma pow_ineq : 11 / 32 ≤ (1569 / 2048 : ℝ) ^ (4 : ℝ) :=
begin
  have h : 11 / 32 ≤ (1569 / 2048 : ℝ) ^ (4 : ℕ), by norm_num,
  rw ←rpow_nat_cast at h,
  exact_mod_cast h
end

Mario Carneiro (Oct 05 2020 at 00:24):

holy crap that file looks like it was written by a computer

Joseph Myers (Oct 05 2020 at 00:25):

Almost all of it was.

Joseph Myers (Oct 05 2020 at 00:26):

Everything between -- Start automatically-generated content. and -- End automatically-generated content. was computer-generated.

Mario Carneiro (Oct 05 2020 at 00:26):

I see. In that case, you should probably do more to clean up the verification conditions for norm_num's consumption

Mario Carneiro (Oct 05 2020 at 00:26):

like don't use rpow, just hand it inequalities of natural numbers

Joseph Myers (Oct 05 2020 at 00:30):

The point is more to illustrate examples that an interval arithmetic tactic could try to handle, rather than as something it would be worth cleaning up in the absence of more advanced tactics. But I'd still hope (a) norm_num could be made to handle rpow a bit better, (b) it could be made to handle larger expressions without timing out when computing the bounds on exp 1. (Those bounds on exp 1 seem appropriate for mathlib, just like the bounds on pi.)

Mario Carneiro (Oct 05 2020 at 00:31):

You should certainly be able to turn up the bounds on exp, but you have to be careful to calculate with fixed precision because I expect your algorithm is exponential

Mario Carneiro (Oct 05 2020 at 00:31):

norm_num can't do that on its own because it doesn't know what precision is required

Mario Carneiro (Oct 05 2020 at 00:31):

it's not an approximation tactic, it assumes you want everything in exact rational arithmetic

Mario Carneiro (Oct 05 2020 at 00:33):

right now it would be difficult to handle rpow in norm_num because that would require importing rpow

Joseph Myers (Oct 05 2020 at 00:33):

The rpow cases I'd hope it could handle are ones where the result is obviously rational (because the exponent is an integer). Bounding rpow that way isn't very efficient, but the bounds that suffice for this problem are simple enough that exact rational arithmetic does work.

Mario Carneiro (Oct 05 2020 at 00:33):

which would almost certainly cause a dependency cycle

Joseph Myers (Oct 05 2020 at 00:34):

So norm_num might need better ways to be extensible (handle certain functions only if relevant files are imported)?

Mario Carneiro (Oct 05 2020 at 00:34):

It doesn't seem too onerous to me to require the use of other tactics to normalize rpow

Mario Carneiro (Oct 05 2020 at 00:34):

Yes, that's been on the agenda for a long time

Mario Carneiro (Oct 05 2020 at 00:36):

where do you even use rpow? I don't see it in the file

Mario Carneiro (Oct 05 2020 at 00:36):

except in the example lemma you sent, where it is in the input statement

Joseph Myers (Oct 05 2020 at 00:36):

^ on reals is notation for rpow.

Mario Carneiro (Oct 05 2020 at 00:36):

yes, I don't see that

Joseph Myers (Oct 05 2020 at 00:38):

And since the original IMO problem involves a ^ a for a : ℝ, it's unavoidable to have rpow somewhere.

Mario Carneiro (Oct 05 2020 at 00:38):

yes, but only for variables. Once you put in numerals, you should be able to switch to regular pow

Joseph Myers (Oct 05 2020 at 00:39):

Which is what the proofs of inequalities do in the end, via rpow_nat_cast, to make norm_num work.

Mario Carneiro (Oct 05 2020 at 00:40):

Still, these lines look pretty fearsome:

have h : (1 / 4 : ℝ) ^ (7 : ℕ) ≤ (1513 / 2048 : ℝ) ^ (32 : ℕ), by norm_num,

Mario Carneiro (Oct 05 2020 at 00:40):

2048 ^ 32 is a big number

Mario Carneiro (Oct 05 2020 at 00:41):

I would hope there is a way to work with logarithms or something to loosen the bounds a bit

Mario Carneiro (Oct 05 2020 at 00:42):

I mean I get that you just want to throw the computer at the problem but it's not going to do a very good job

Joseph Myers (Oct 05 2020 at 00:43):

I didn't try to weaken the bounds at all. They were all computed to 11-bit precision (because that suffices, while smaller precision still suffices but needs more cases) but likely weaker bounds work work in many cases.

Mario Carneiro (Oct 05 2020 at 00:43):

Where did you do the codegen?

Mario Carneiro (Oct 05 2020 at 00:43):

Do you have a python program kicking around or something?

Joseph Myers (Oct 05 2020 at 00:44):

There are of course simple human-level proofs of the original result that can be formalised, this just demonstrates the proposed interval arithmetic approach does work.

Mario Carneiro (Oct 05 2020 at 00:44):

Well I wouldn't call this interval arithmetic exactly, because the bounds are propagating in the wrong direction

Joseph Myers (Oct 05 2020 at 00:45):

Code generator in C with MPFR. Time to run the C program, about 0.015s.

Mario Carneiro (Oct 05 2020 at 00:45):

So the C program should be delivering a lot more help if it's so much faster

Mario Carneiro (Oct 05 2020 at 00:46):

You should calculate an approximation sequence that lets the intermediate values stay small

Mario Carneiro (Oct 05 2020 at 00:47):

Does the C program do interval arithmetic itself?

Joseph Myers (Oct 05 2020 at 00:48):

It relies on MPFR to calculate powers to 11 bit precision (rounded upward). It only ever needs to compute values rounded upward rather than tracking actual intervals through arithmetic.

Mario Carneiro (Oct 05 2020 at 00:48):

Right, that's my point

Mario Carneiro (Oct 05 2020 at 00:48):

You are calculating powers at 11 bit precision

Mario Carneiro (Oct 05 2020 at 00:48):

that yields a sequence of numbers such that n_i * x <= n_(i+1) more or less, yes?

Mario Carneiro (Oct 05 2020 at 00:49):

you should provide the whole sequence, not just the last element of the sequence

Mario Carneiro (Oct 05 2020 at 00:50):

every intermediate step in the interval arithmetic calculation should be put in the lean file, and norm_num should only have to calculate the small steps

Mario Carneiro (Oct 05 2020 at 00:50):

if you skip steps norm_num has to work at way more than 11 bits precision

Joseph Myers (Oct 05 2020 at 00:51):

Internally, MPFR is computing exp and log with error bounds (and some intermediate precision higher than 11 bits) and increasing the intermediate precision if necessary to get correct rounding. But yes, supplying intermediate steps that prove a bound on rpow would reduce the precision norm_num needs to use.

Mario Carneiro (Oct 05 2020 at 00:52):

Ideally, this would all be done behind the scenes, as MPFR does, but that's a separate tactic, built on norm_num

Joseph Myers (Oct 05 2020 at 00:54):

Yes. The hypothetical interval_arithmetic or similar tactic that tries to prove inequalities using heuristics for intermediate precision.

Mario Carneiro (Oct 05 2020 at 00:55):

I think there is still value in having a interval_arithmetic_with_certificate [...] tactic where you give it the sequence of approximations, because that will be very fast and is more suitable for going in mathlib

Mario Carneiro (Oct 05 2020 at 00:55):

in fact, I believe the pi bounds look like that now

Mario Carneiro (Oct 05 2020 at 00:55):

src#real.pi_gt_3141592

Mario Carneiro (Oct 05 2020 at 00:59):

Notice also that the core lemma:

lemma sqrt_two_add_series_step_down (a b : ℕ) {c d n : ℕ} {z : ℝ}
  (hz : z ≤ sqrt_two_add_series (a/b) n) (hb : 0 < b) (hd : 0 < d)
  (h : a ^ 2 * d ≤ (2 * d + c) * b ^ 2) : z ≤ sqrt_two_add_series (c/d) (n+1) :=

has as its main verification condition a simple fact about inequalities of naturals, there are no real numbers except for z that is not computed with

David Wärn (Oct 06 2020 at 10:10):

This is very interesting Joseph! Great to see this dumb approach realised :-)

Joseph Myers (Oct 11 2020 at 01:41):

I've now PRed a more conventional human-level approach to IMO 2020 Q2 (from the official solutions) as #4565. That takes 40 lines of Lean (plus a few more AM-GM variants added to mathlib proper) as opposed to 9000.

Yury G. Kudryashov (Oct 11 2020 at 04:25):

Now the whole file is 49 lines long, and the begin..end block occupies 23 lines (including begin and end).

Yury G. Kudryashov (Oct 11 2020 at 04:26):

And we need someone else to review it.

Last updated: May 02 2025 at 03:31 UTC