Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: problems that ask for both data and a proof


view this post on Zulip David Renshaw (May 17 2021 at 01:26):

Many math competition problems ask for some piece of data to found. For example:

Find the smallest positive integer x such that 30 * x ≡ 42 [mod 47].

We have been encoding these by providing the answer data (in this case 39) and just asking for a proof after it has been plugged in:

example : is_least { x : + | 30 * x  42 [MOD 47] } 39 := ...

This is a bit sad. As a human, when I solve the problem informally, most of my time is spent trying to determine the missing data. It feels like our encoding is not fully capturing the essence of the problem.

I wonder whether we can achieve a more faithful encoding with a specialized form of existential, like this:

inductive Solution (α : Type) (p : α  Prop) : Type
| intro (w : α) (h : p w) : Solution

Then the problem could be posed as

example : Solution + (λs, is_least { x : + | 30 * x  42 [MOD 47] } s) := ...

and a tactic-mode proof would start with use 39.

view this post on Zulip David Renshaw (May 17 2021 at 01:28):

I know that using a plain existential would not work, because nonconstructivity/propositional extensionality means that a solution does not necessarily need to provide the data in any legible form.

view this post on Zulip David Renshaw (May 17 2021 at 01:30):

I'm not 100% confident in my understanding of the necessary type theory, but I think that making Solution a Type forces solutions to construct explicit values?
Or at least things that can be passed to #reduce to get explicit values.

view this post on Zulip David Renshaw (May 17 2021 at 01:31):

So I suppose my main question is: why haven't we been encoding problems this way? Are there any issues with it?

view this post on Zulip Floris van Doorn (May 17 2021 at 01:38):

Note that your type Solution is the same as docs#subtype. With classical logic (docs#classical.subtype_of_exists) this is equivalent as requiring an existential witness.

This topic has been discussed before. If there is one canonical answer, things might not be too bad: one thing you can do is check whether the given term is syntactically (or definitionally) equal to the canonical answer. If there are multiple acceptible answers (say the answer is 2 ^ (n + m), then 2 ^ (m + n) and 2^n * 2 ^ m are probably also fine) it gets trickier. But maybe you can still have an explicit list of allowed answers, and you require that the given (elaborated) term is of the form ⟨n, pn⟩ where n is syntactically or definitionally equal to one of the allowed answers.

view this post on Zulip David Renshaw (May 17 2021 at 01:42):

Floris van Doorn said:

Note that your type Solution is the same as docs#subtype.

/me head explodes

view this post on Zulip David Renshaw (May 17 2021 at 01:46):

oof, I was hoping that something like subtype_of_exists was not possible.

view this post on Zulip David Renshaw (May 17 2021 at 01:51):

I want solvers to be allowed to use classical logic in the proof part of a solution (i.e. Solution.h) but not in the data part (i.e. Solution.w).

view this post on Zulip David Renshaw (May 17 2021 at 01:53):

Maybe that can be checked easily by just examining Solution.w for any instance of Classical.some?

view this post on Zulip David Renshaw (May 17 2021 at 02:03):

Maybe the noncomputable keyword is already exactly that? Like, as long as the solution is not noncomputable, then we can extract a normal form of the data?

view this post on Zulip Floris van Doorn (May 17 2021 at 02:07):

You don't want to accept all answers that are computable / do not use classical logic.
For example, suppose that you asked to find the smallest n such that some decidable property p(n) holds. An "incorrect" answer would be:

  • I first prove that p(10^10^10) holds
  • Now I write a program that tests p(0), p(1), p(2), until it finds a number k where p(k) holds. It then returns ⟨k, ...⟩ where ... is the proof that p(k) holds.

This gives a computable witness of the answer, but any reasonable person would disallow that answer.

view this post on Zulip Mario Carneiro (May 17 2021 at 02:14):

Actually you can even prove the pure existential \exists n, p(n) and use nat.find to get a (no-axioms) computable construction of the least n such that p(n)

view this post on Zulip Mario Carneiro (May 17 2021 at 02:15):

although if you prove \exists n, p(n) using no axioms then most likely(?) the existence property holds so that you have actually proved p(10^10^10) or something like that

view this post on Zulip Mario Carneiro (May 17 2021 at 02:17):

Conversely, some "reasonable" answers are not considered computable by lean, in particular anything using real numbers. For example if the answer is 1 / 2 ^ n : real then lean would mark it as noncomputable

view this post on Zulip Johan Commelin (May 17 2021 at 05:02):

Will Lean 4 gives us more tools to distinguish between "actual" solutions and "tricks"?

view this post on Zulip Miroslav Olšák (May 17 2021 at 07:23):

This topic was discussed here already a few times, with no ultimate answer afair. I even made a collection of such answers from IMO shortlists 2006 -- 2018: http://www.olsak.net/mirek/determine-answers.txt
My view is that we actually need a special inductive type for answers (probably problem-dependent), and an answer should be given in its raw form -- only using constructors. Even computability is too strong, some expressions are computable in theory but not really in practice (such as the exact 5th Ramsey number), and that can be the point of the IMO problem. For natural numbers, this structure can be just num, for reals something combining naturals with arithmetical operations including a real power plus constant pi. For some more advanced answers we probably should allow binders... if you look at the list of answers I have collected, some are pretty tricky.

view this post on Zulip Joseph Myers (May 17 2021 at 23:18):

I think that such a problem is most naturally represented as asking for a solver to provide more than one term:

def answer : + := sorry

theorem result : is_least { x : + | 30 * x  42 [MOD 47] } answer := sorry

where the solver must then replace each sorry by a term of the given type (and the type of the result involves the value given for answer), and, when one of the types is not a Prop, a human must inspect the given term to see if it's reasonable (rather than e.g. being a term that just restates the problem for a functional equation). A difficulty with supplying an inductive type for valid answers is that there isn't any standard definition of e.g. what the possible kinds of solution set for a functional equation are, and specifying an inductive type based on knowledge of the answer to the problem is giving the solver extra information.

The same structure applies when a problem has multiple parts asking for separate things to be proved. IMO 2015 Q1, for example, would ask for three terms (a proof for part (a), a set of integers for part (b), and a proof of correctness of that set).

Note that this allows for partial credit for such problems, if the mark scheme gives human contestants partial credit for solving only one part of the problem (and an AI might only replace some of the sorrys), or for producing a solution with a less good form of the solution set (the mark scheme for IMO 2019 Q4 put a script in the 7-minus part of the mark scheme if it reduced the problem to checking cases n1000n \le 1000 or k1000k \le 1000, for example, although any AI that could do such a reduction would probably also be able to finish solving the problem).

view this post on Zulip Miroslav Olšák (May 18 2021 at 05:29):

Providing a bit of extra information seems to me as a valid price for having the task formally specified. Solutions of functional equations on reals are usually rather tame -- some of the most tricky ones could be { (f(x) = C) | C = 0 or 1 <= C < 2 }, or { (f(x) = x, g(x) = x) } + { (f(x) = x^2+C, g(x) = x) | C in R } showing that parts of the set can be parametrized by a real (or more reals), and the real parameters can be constrained. Solutions of functional equations on integer are can be more tricky.


Last updated: Aug 05 2021 at 04:14 UTC