Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: The set of all even numbers

AaronHu (Aug 07 2024 at 16:23):

Hello everyone,

This is my first time using this community. In the solution to the first question on AlphaProof, the Lean file includes a theorem stating that \( a \) is the set of all even numbers. Could you please explain how this set of all even numbers is derived? Is this result provided by the AI, or is it given by humans?

Thank you!

Notification Bot (Aug 07 2024 at 16:23):

AaronHu has marked this topic as resolved.

Notification Bot (Aug 07 2024 at 16:24):

AaronHu has marked this topic as unresolved.

Jason Rute (Aug 07 2024 at 17:24):

Floris van Doorn said:

One question: what information was provided to AlphaProof? Something like
theorem imo_2024_p1 :
   {(α : ℝ) | ∀ (n : ℕ), 0 < n → (n : ℤ) ∣ (∑ i in Finset.Icc 1 n, ⌊i * α⌋)}
   = _ := by
  sorry
(plus maybe the English problem description)?

Jason Rute (Aug 07 2024 at 17:25):

Eric Wieser said:

Floris van Doorn said:

One question: what information was provided to AlphaProof [...]

Yes, with some elaborator tricks to prevent AlphaProof filling the _ with something that would change the type of the equality.

Jason Rute (Aug 07 2024 at 17:27):

I think that exchange is the information we have about this, but it is clear that the AI came up with the set of even numbers. It also suggests that the set was left as a metavariable that was filled in by the AI via the proof it provided. (I personally don’t understand the specifics.)

Eric Wieser (Aug 07 2024 at 22:09):

From the blog post:

When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean.

So {α : ℝ | ∃ k : ℤ, Even k ∧ α = k} is one such "solution candidate" as output by "the AI".

Eric Wieser (Aug 07 2024 at 22:11):

Jason Rute said:

It also suggests that the set was left as a metavariable that was filled in by the AI via the proof it provided.

The key thing is that raw text substitution would let you replace the _ with Int, which would be neither provable nor disprovable.

Jason Rute (Aug 08 2024 at 00:41):

Maybe you are willing to say @Eric Wieser. Is the solution candidate filled in first and then AlphaProof attempts to prove it (so nothing special with metavariables)? If so, that makes more sense to me.

Mario Carneiro (Aug 08 2024 at 06:45):

What prevents the AI from using {(α : ℝ) | ∀ (n : ℕ), 0 < n → (n : ℤ) ∣ (∑ i in Finset.Icc 1 n, ⌊i * α⌋)} itself as a solution candidate and proving it by rfl?

Eric Wieser (Aug 08 2024 at 09:48):

What prevents a human contestant doing the same?

Joseph Myers (Aug 08 2024 at 10:44):

This is at least a good example of a "determine" problem for which fully automated marking would be problematic, because there are multiple reasonable ways to express the answer. Whereas for P2, P5 and P6 you could more reasonably require an AI contestant to give a fully simplified answer, and it's clear (once you know what the answer is!) exactly what a fully simplified answer would look like for those problems.

Jason Rute (Aug 08 2024 at 11:03):

I bet with reasonable accuracy you could have an LLM judge if the solution was specific enough. In the prompt you would provide some examples of bad answers (literally correct but not specific) and good answers, and then ask the LLM to score this answer (giving a reason first). That would be automatic, and fast enough since you only have to score results which first pass Lean.

Jason Rute (Aug 08 2024 at 11:06):

(I’m responding to @Joseph Myers’s point, but this might also be related to what they do in AlphaProof to make sure it only gives good answers in the end.)

Last updated: May 02 2025 at 03:31 UTC