Zulip Chat Archive

Stream: new members

Topic: Getting GPT3 chatbots to write ZFC in Lean4

Lars Ericson (May 03 2023 at 23:58):

I was able to get OpenAI ChatGPT to write ZFC axioms in Lean 4. This was after failed attempts for Perplexity.AI and for Bing Chat.

In doing this, all I was really doing was feeding back the error messages from Lean. It then used the error messages to search it's memory of the Internet for better solutions. This seems like a loop that could be automated, given a starting point piece of code that it generated.

I'm just doing this for fun. So I'm not asking an actual question here, since ChatGPT finally closed the loop. I do have some ambitions along the lines of stepwise refinement of ChatGPT-produced explanations of Fermat's Last Theorem combined with its ability to generate Lean scripts, to maybe stepwise refine my way to a Lean 4 version of the proof of Fermat's Last Theorem. (Which you guys have probably already done, but again, this is just for fun.) Here is a description of an initial dialog I had with the these GPT chatbots on this topic (but before I found out they could write code).

Bulhwi Cha (May 04 2023 at 00:08):

Can you share the code here?

Scott Morrison (May 04 2023 at 00:56):

@Lars Ericson, have you looked at my sagredo demo? https://youtu.be/CEwRMT0GpKo This automates the loop of feeding back error messages or new goals to GPT.

Scott Morrison (May 04 2023 at 00:57):

Also, @Lars Ericson, you are barking up the wrong tree if you expect to get anywhere with Fermat's Last Theorem in Lean without (or especially with) GPT's help. This is a massive undertaking well beyond current technology / available expertise. :-)

Lars Ericson (May 04 2023 at 14:12):

Bulhwi Cha said:

Can you share the code here?

Here is the final code that ChatGPT produced which passed the Lean 4 type checker:

-- Define the type of sets
axiom Set : Type

-- Define the membership relation
axiom ElementOf : Set → Set → Prop
infix:50 " ∈ " => ElementOf

-- Define the empty set
axiom empty : Set

-- Define the axiom of extensionality
axiom extensionality : ∀ {a b : Set}, (∀ x, x ∈ a ↔ x ∈ b) → a = b

-- Define the axiom of regularity
axiom regularity : ∀ a : Set, a ≠ empty → ∃ b : Set, b ∈ a ∧ ∀ x, x ∈ b → ¬(x ∈ a)

-- Define the axiom of pairing
axiom pairing : ∀ a b : Set, ∃ c : Set, ∀ x, x ∈ c ↔ (x = a ∨ x = b)

-- Define the axiom of union
axiom union : ∀ a : Set, ∃ b : Set, ∀ x, x ∈ b ↔ ∃ y, y ∈ a ∧ x ∈ y

-- Define the axiom of power set
axiom power_set : ∀ a : Set, ∃ b : Set, ∀ x, x ∈ b ↔ ∀ y, y ∈ x → y ∈ a

-- Define the axiom of choice
axiom choice : ∀ {a : Set → Prop}, (∀ x, ∃ y, a x → y ∈ x) → ∃ f : Set → Set, ∀ x, a x → f x ∈ x

Lars Ericson (May 04 2023 at 14:19):

@Scott Morrison thanks for link to sagredo. I will add a link to the video to my LinkedIn article.

On Fermat's Last Theorem, I am somewhat familiar now with the process of drilling down to axiomatize mathematical ideas in Lean. It helps to have a grander project that "has an answer" that people accept but find difficult to accept, as a focus of attention/mountain to climb. I don't expect to get to the top of the mountain, but it is still fun to climb until it's not. It is also fun to try to get ChatGPT to teach me.

Lars Ericson (May 04 2023 at 17:32):

@Scott Morrison I am watching your video now and it occurred to me that at 3:33
image.png
what you are describing/implementing with the tactic is a kind of generative adversarial network.

Lars Ericson (May 04 2023 at 21:59):

Anyway it's just a matter of filling this out:

theorem fermat_last_theorem : ∀ (a b c n : ℕ), n > 2 → a^n + b^n ≠ c^n :=
begin
sorry
end

with a little help from my robot friends. Attached is the first
fmt.pdf
of about 1000 passes, the game can be played for a while.

Eric Wieser (May 04 2023 at 22:44):

If this eventually succeeds, it will be because you have run it for however long it takes for humans to have already published the full proof in mathlib

Eric Wieser (May 04 2023 at 22:45):

Note that the output is already nonsense at:

structure algebraic_curve :=
 (equation : K → K → Prop)
 (is_algebraic : x y : K, equation x y)

which allows only the entire plane to be an algebraic_curve.

Fabian Glöckle (May 04 2023 at 23:07):

Eric Wieser said:

Note that the output is already nonsense at:
structure algebraic_curve :=
 (equation : K → K → Prop)
 (is_algebraic : x y : K, equation x y)
which allows only the entire plane to be an algebraic_curve.

already at the statement of FLT ;)

Eric Wieser (May 04 2023 at 23:15):

Guess Kevin's going to have to cancel his funding application, FLT is false after all:

import tactic
theorem not_fermat_last_theorem : ¬∀ (a b c n : ℕ), n > 2 → a^n + b^n ≠ c^n :=
λ h, h default default default 3 (by norm_num) (by norm_num; refl)

Scott Morrison (May 04 2023 at 23:24):

Let me try to make stone soup here. The entirely expected utter failure of having GPT "think" about FLT can be turned into an argument that we need to include better automation tooling in the loop with GPT (i.e. what sagredo attempts). If GPT knew about slim_check, and wanted to use it (or if the automation loop used it), then we might see better progress:

import Mathlib.Tactic.SlimCheck

theorem not_exactly_fermat_last_theorem : ∀ (a b c n : Nat), n > 2 → a^n + b^n ≠ c^n := by
  slim_check

-- Prints something like:

===================
Found problems!
a := 0
b := 2
c := 2
n := 10
guard: 2 < 10
issue: 1024 ≠ 1024 does not hold
(0 shrinks)
-------------------

(Assuming, of course, that someone gets around to merging https://github.com/leanprover-community/mathlib4/pull/3114.)

Lars Ericson (May 04 2023 at 23:54):

Out of the blue, ChatGPT just threw this at me. Lean 4 fails on "import tactic". Either it's cribbing an actual proof from mathlib or it's a "hallucination" (term of art in Scott's video):

import tactic

theorem fermat_last_theorem : ∀ (a b c n : ℕ), n > 2 → a^n + b^n ≠ c^n :=
begin
  intros a b c n hn h,
  have h1 : a^n + b^n ≥ 2,
  { apply nat.succ_le_of_lt,
    rw [←nat.pow_two],
    apply add_lt_add,
    apply pow_lt_pow_of_lt_left (show 0 < 2, by norm_num) hn,
    apply pow_lt_pow_of_lt_left (show 0 < 2, by norm_num) hn },
  have h2 : c^n ≥ 1,
  { apply nat.succ_le_of_lt,
    rw [←nat.pow_two],
    apply lt_of_le_of_ne (add_le_add (nat.zero_le _) (nat.zero_le _)) h,
    intro h3,
    rw [h3, add_zero] at h,
    rw [nat.pow_two, nat.pow_two] at h,
    exact nat.succ_le_of_lt (lt_of_mul_lt_mul_left h (show 0 < 4, by norm_num)) },
  have h3 : a^n + b^n ≤ c^n,
  { apply nat.le_of_not_gt,
    intro h4,
    have h5 : c^n > 0 := pow_pos (lt_of_le_of_lt (nat.zero_le _) h2) _,
    rw [←nat.sub_add_cancel h1, nat.pow_add, ←mul_assoc, ←nat.mul_sub_left_distrib, nat.sub_lt_right_iff_lt_add h5] at h4,
    have h6 : 2 * a^n * b^n < c^n * c^n,
    { apply lt_of_le_of_lt (mul_le_mul_of_nonneg_left (add_le_add (nat.mul_le_mul_left _ h4) (nat.mul_le_mul_left _ h4)) (show 0 ≤ c^n, by linarith)) _,
      rw [←nat.pow_two],
      apply nat.pow_lt_pow_of_lt_left (show 0 < c, from lt_of_le_of_lt (nat.zero_le _) h4) (show 2 > 0, by norm_num) },
    rw [←nat.pow_two, ←nat.pow_two] at h6,
    have h7 : 2 * (a*b)^n < c^n * c^n,
    { rw [mul_pow, mul_pow, mul_pow],
      rw [←nat.pow_two, nat.pow_add, nat.pow_add, nat.pow_add],
      have h8 : ∀ (x y : ℕ), 2 * (x * y) = x * (2 * y),
      { intros x y,
        rw [mul_assoc, ←mul_assoc 2],
        rw [←nat.mul_succ, mul_comm],
        rw [mul_assoc, ←mul_assoc 2],
        rw [←nat.mul_succ, mul_comm],
        rw [mul_assoc, mul_assoc],
        rw [mul_comm y, mul_assoc],
        rw [←mul_assoc x, ←mul_assoc x],
        rw [mul_comm y],
        rw [mul_comm x],
        repeat {rw mul_assoc}},
      rw [h8, h8, h8],
      apply

If I take out the "import tactic" then I get a raft of errors starting with "failed to synthesize instance LT ℕ". I may nag ChatGPT about this for a while but what I was trying to do in pass 1 of 1000 was to do a breadth-first traversal (what are the main steps of the proof), then a depth-first traversal on every concept mentioned on that first version, rinse and repeat until I have the concepts well defined and then try to get the top level steps expressed as proprositions in terms of those concepts and then rinse and repeat on the steps. So I wouldn't really want to drill down too much on the above hallucination because it is not clearly motivated by the 7 steps ChatGPT started with:

Elliptic Curves: Wiles made use of elliptic curves, which are a fundamental mathematical object in algebraic geometry. He studied certain types of elliptic curves called semi-stable elliptic curves, focusing on their properties and connections to number theory.
Taniyama-Shimura-Weil Conjecture: One crucial aspect of Wiles' proof was the Taniyama-Shimura-Weil conjecture, which proposes a connection between elliptic curves and modular forms. Wiles showed that if Fermat's Last Theorem were false, there would exist an elliptic curve with specific properties that violated this conjecture.
Modular Forms: Wiles delved into the theory of modular forms, which are complex functions with certain transformation properties. He established intricate connections between modular forms and elliptic curves, laying the groundwork for his subsequent arguments.
Galois Representations: Wiles introduced Galois representations into the proof, which provide a way to study the symmetries of equations. He developed a deep understanding of these representations and their relation to modular forms and elliptic curves.
Frey Curve: Wiles constructed what is known as the "Frey curve," named after Jean-Pierre Serre's student, Ernst Frey. This curve had specific properties that allowed Wiles to reduce the problem of Fermat's Last Theorem to a contradiction in the framework of elliptic curves and modular forms.
Selmer Groups and Descent: Wiles employed sophisticated mathematical tools, such as Selmer groups and descent theory, to analyze the properties of elliptic curves and modular forms further. These tools helped him establish crucial results related to the non-existence of certain solutions to the Fermat equation.
Proof by Contradiction: Wiles ultimately showed that if a non-trivial solution to the Fermat equation existed, it would lead to a contradiction within the framework he established using elliptic curves, modular forms, and Galois representations. This contradiction demonstrated that the equation has no solutions, thus proving Fermat's Last Theorem.

Eric Wieser (May 04 2023 at 23:55):

That entire output is Lean 3 code, so Lean4 will accept approximately none of it

Scott Morrison (May 05 2023 at 01:13):

@Lars Ericson, it really seems you are not ready to listen to us saying that GPT4 is not a relevant tool for approaching FLT. I'm not sure why.

Mario Carneiro (May 05 2023 at 01:42):

Eric Wieser said:

Guess Kevin's going to have to cancel his funding application, FLT is false after all:

import tactic
theorem not_fermat_last_theorem : ¬∀ (a b c n : ℕ), n > 2 → a^n + b^n ≠ c^n :=
λ h, h default default default 3 (by norm_num) (by norm_num; refl)

This is reminding me of when I graded for Jeremy's intro to lean class, there was an exercise about stating FLT and I must have written this disproof 15 or so times

Joseph Myers (May 05 2023 at 02:05):

There are a lot of claimed "proofs" of FLT out there of the general form: do some trivial algebraic manipulations, do something that either has an algebraic mistake or isn't mathematics at all, do some more trivial algebraic manipulations, suddenly FLT drops out. I can certainly imagine that (a) various such false proofs are among the training material LLMs have read and (b) an LLM might suddenly decide to hallucinate an autoformalization of such a false proof that it once read instead of anything useful.

Johan Commelin (May 05 2023 at 05:29):

@Lars Ericson That outline you posted above claims that Wiles did this and Wiles did that, etc... But all of those steps were already done by others (Taniyama, Shimura, Serre, Frey, Ribet, and many, many others). What Wiles did, was prove the Taniyama-Shimura conjecture (step 2 in the outline). He didn't introduce Galois representations into the proof, others had already done that. Same for all the other steps ≠ 2.

Johan Commelin (May 05 2023 at 05:30):

Besides that, I think you are aiming too low with this project. FLT is old news. You should really be extracting a proof of Langlands out of GPT. That's the real deal.

Kevin Buzzard (May 05 2023 at 07:08):

(responding to Lars' points above) 2. Ribet showed that TSW=>FLT, not Wiles, indeed it was Ribet's work which inspired Wiles to work on FLT. 4. Galois representations were introduced into the theory many years before Wiles. It was Mazur and Tilouine who proposed the R=T conjecture -- Wiles just proved it. 5. Frey introduced the Frey curve and his name is Gerhard Frey; he wasn't a student of Serre (indeed Serre famously had no PhD students).

Kevin Buzzard (May 05 2023 at 07:10):

I could also add that the chances of getting a language model to do anything useful regarding formalising a proof of FLT (I don't mean proving it, I mean getting them to do anything remotely helpful which will aid the extremely long journey towards the proof) are right now zero.

Siddhartha Gadgil (May 05 2023 at 07:35):

Kevin Buzzard said:

I could also add that the chances of getting a language model to do anything useful regarding formalising a proof of FLT (I don't mean proving it, I mean getting them to do anything remotely helpful which will aid the extremely long journey towards the proof) are right now zero.

I don't have evidence to back this yet, but "something remotely helpful" does not seem hopeless - partial autoformalization with few enough errors so that correcting these is significantly less work than formalizing from scratch seems not too far away (the Draft, Sketch and Proof paper is probably the furthest along these lines).

Kevin Buzzard (May 05 2023 at 07:36):

Fair point. My first paragraph is facts, the second is my opinion.

Reid Barton (May 05 2023 at 10:58):

What are your opinions about these statements?

The ring tactic does something helpful which will aid the extremely long journey towards the proof of FLT.
There is a nonzero chance right now of getting a large language model to automatically do something comparably useful to ring.

Kevin Buzzard (May 05 2023 at 13:07):

Reid as you probably know I'm a sceptic -- I will only believe that LLMs are remotely useful for FLT or more generally for any mathematics which I'm interested in when I've actually seen it happening. Right now I've seen all kinds of "oh look it wrote list.append automatically" stuff but we already have list.append. I'm certainly watching and am well aware that people constantly talk about how these things are getting exponentially better but at present they have achieved nothing beyond school level mathematics. I am watching carefully though. Let me stress that I wrote right now zero.

Oliver Nash (May 05 2023 at 13:28):

Have they even achieved school-level mathematics? I hear they've passed some standardised tests but isn't that just because our tests at that level are largely memorisation?

Eric Rodriguez (May 05 2023 at 13:52):

The sagredo demo I was the most impressed with was the one that wrote the proof for some equation having some finite set of solutions - those kind of proofs are often tedious to write in Lean but also trivial in real life. I'm sure some of those cases will come up in the proof of FLT somehow, and therefore that's already somewhere that LLMs can help

Siddhartha Gadgil (May 05 2023 at 13:57):

Kevin Buzzard said:

Reid as you probably know I'm a sceptic -- I will only believe that LLMs are remotely useful for FLT or more generally for any mathematics which I'm interested in when I've actually seen it happening. Right now I've seen all kinds of "oh look it wrote list.append automatically" stuff but we already have list.append. I'm certainly watching and am well aware that people constantly talk about how these things are getting exponentially better but at present they have achieved nothing beyond school level mathematics. I am watching carefully though. Let me stress that I wrote right now zero.

The ring tactic also cannot do any non-trivial mathematics by itself. But it is certainly useful.
Personally I find copilot (which is LLM based) pretty useful, as do my students.

Even if the core of formalizing say FLT is people doing it, if there productivity is doubled the time needed is halved.

Reid Barton (May 05 2023 at 14:04):

lean-gptf already did some pretty neat stuff, though I guess the "right now" clause disqualifies it because afaik it is not possible to use it any more.

Lars Ericson (May 05 2023 at 14:05):

@Scott Morrison , I'm not not hearing you. I hear you. However, I have experienced ChatGPT as being a useful tool for summarizing research, finding relevant papers, and even sourcing poorly explained or motivated equations presented in talks. You can just type the LaTeX form of an equation at it, without any context, and it will tell you where it's from and how it's applied. I understand that it may not be as helpful for FMT and Lean4 being bare bones (no mathlib?) may not be as helpful. I'm still enjoying the exercise though.

Here is what I see from ChatGPT:

It doesn't know the difference between Lean 3 and Lean 4, at all
It will repeat itself, so it doesn't have that much memory or ability to keep track of the contents of its prior responses in the same dialogue. So it will cycle through wrong answers with repeated prompts.
It apologizes endlessly, even if you tell it not to
If you ask the same question twice, there is a degree of randomness in the generation, so it won't give you the same response twice.
It can be biased
It is not reliable
It will incorporate immediate feedback
As set up, it may be learning on the go from the feedback it receives

Regarding point 1, this is what ChatGPT says are the differences between Lean 3 and Lean 4:

"There are several major differences between Lean 3 and Lean 4, both syntactic and semantic. Here are some of the most significant differences:

"Syntax: The syntax of Lean 4 has been redesigned to be more consistent and easier to read. The notation for defining functions, for example, has been changed to use the def keyword instead of definition or theorem.

"Type system: Lean 4 has a more expressive type system than Lean 3. It features a dependent type system, which allows for more precise types that can depend on values. Additionally, the system supports type classes, which allow for overloading of functions based on the types of their arguments.

"Metaprogramming: Lean 4 introduces a new, more powerful metaprogramming system based on tactics. This allows for more advanced automation and proof search.

"Libraries: Lean 4 has a new standard library that is designed to be more modular and extensible than the one in Lean 3. Additionally, many of the standard tactics and functions have been redesigned or renamed to be more consistent and easier to use.

"Compiler: Lean 4 uses a new, faster and more efficient compiler than Lean 3. The new compiler is designed to be more incremental and interactive, which makes it easier to work with.

"Overall, Lean 4 represents a significant improvement over Lean 3 in many ways. While there is some backward compatibility between the two versions, it may take some effort to migrate code from Lean 3 to Lean 4."

I can try to get it to generate the above list of steps of FMT, and then feed back the corrections above. The initial prompt is

"Please give the detailed steps for the proof of Fermat's Last Theorem, in sufficient detail to be checked by a human mathematician."

ChatGPT says

"The proof of Fermat's Last Theorem is a complex and technical argument that spans several different areas of mathematics, including number theory, algebraic geometry, and modular forms. It was first published in 1994 by Andrew Wiles, and it has been widely regarded as one of the greatest achievements in the history of mathematics. Here is an overview of the main steps involved in the proof, presented at a high level of abstraction:

Elliptic Curves and Modular Forms: The proof of Fermat's Last Theorem relies heavily on the theory of elliptic curves, which are algebraic curves defined by polynomial equations with coefficients in a field. One of the key insights of the proof is the connection between elliptic curves and modular forms, which are analytic functions that satisfy certain transformation properties under modular transformations. In particular, there is a deep connection between the modular forms of weight 2 and the elliptic curves over the complex numbers.
Modularity Theorem: The Modularity Theorem, also known as the Taniyama-Shimura-Weil Conjecture, is a deep result in number theory that asserts that every elliptic curve over the rational numbers is modular, meaning that it can be associated with a certain modular form of weight 2. This theorem was first proposed by Yutaka Taniyama and Goro Shimura in the 1950s, and it was later proved by Andrew Wiles and Richard Taylor in the 1990s, using a combination of methods from algebraic geometry, modular forms, and Galois representations.
Frey Curve: One of the key insights of the proof of Fermat's Last Theorem is the construction of a certain elliptic curve, called the Frey curve, that arises from a hypothetical counterexample to Fermat's equation. This curve is defined over the rational numbers and has certain properties that are crucial for the proof.
Modularity Lifting Theorem: The Modularity Lifting Theorem, also known as the Taylor-Wiles Theorem, is a deep result in algebraic geometry that provides a way to lift the modularity of an elliptic curve over the rational numbers to a modular form over a certain ring of integers. This theorem was first proved by Andrew Wiles and Richard Taylor in the 1990s, using a combination of methods from algebraic geometry, modular forms, and Galois representations.
Selmer Groups: Selmer groups are certain cohomology groups associated with an elliptic curve over the rational numbers, and they play a crucial role in the proof of Fermat's Last Theorem. The Selmer groups are related to the Galois representations associated with the elliptic curve, and they provide a way to measure the failure of the elliptic curve to satisfy certain local conditions.
Main Theorem: The Main Theorem of the proof of Fermat's Last Theorem is a deep result in number theory that asserts that certain Selmer groups associated with the Frey curve are trivial, meaning that they have no non-trivial elements. This theorem was proved by Andrew Wiles in the 1990s, using a combination of methods from algebraic geometry, modular forms, and Galois representations.
Proof of the Main Theorem: The proof of the Main Theorem is a complex and technical argument that involves a number of different steps. One of the key steps is the construction of certain Galois representations, called modular Galois representations, that are associated with the modular form of weight 2 associated with the elliptic curve. Another key step is the use of the Langlands program, a far-reaching conjecture that provides a framework for studying the connections between different areas of mathematics.
Conclusion: The proof of the Main Theorem"

I then put in the comments above, we'll see what it says:

"What Wiles did, was prove the Taniyama-Shimura conjecture. He didn't introduce Galois representations into the proof, others had already done that. Ribet showed that TSW=>FLT, not Wiles, indeed it was Ribet's work which inspired Wiles to work on FLT. Galois representations were introduced into the theory many years before Wiles. It was Mazur and Tilouine who proposed the R=T conjecture -- Wiles just proved it. Frey introduced the Frey curve and his name is Gerhard Frey; he wasn't a student of Serre (indeed Serre famously had no PhD students)."

ChaGPT responds

"Thank you for pointing out those inaccuracies in my previous response. You are correct that Wiles' contribution to the proof of Fermat's Last Theorem was the proof of the Taniyama-Shimura conjecture, and that Galois representations were introduced into the theory before Wiles' work. Ribet's work showed that the Taniyama-Shimura conjecture implies Fermat's Last Theorem, not Wiles' work.

"Additionally, Frey introduced the Frey curve, and he was not a student of Serre. Thank you for bringing these errors to my attention."

Then I ask it to put it together:

"Please give the detailed steps for the proof of Fermat's Last Theorem, in sufficient detail to be checked by a human mathematician, incorporating this additional information."

It comes back with:

"Certainly, here are the detailed steps for the proof of Fermat's Last Theorem, incorporating the additional information you mentioned:

Step 1: Assume that there exists a solution to Fermat's Last Theorem, i.e., there exist integers a, b, c, and n > 2 such that a^n + b^n = c^n.

Step 2: Show that if there exists a solution to Fermat's Last Theorem, then there exists a corresponding elliptic curve, E, given by the equation y^2 = x(x-a^n)(x+b^n), and a modular form f of weight 2 and level N, where N is a positive integer.

Step 3: Prove the Taniyama-Shimura conjecture, which states that every elliptic curve over Q is modular, meaning it corresponds to a modular form. This was a major achievement of Andrew Wiles and Richard Taylor in the 1990s.

Step 4: Deduce that the elliptic curve corresponding to a solution of Fermat's Last Theorem is modular.

Step 5: Use the modularity of E to prove that the Frey curve, a certain elliptic curve over Q constructed by Gerhard Frey, is also modular.

Step 6: Use the modularity of the Frey curve to prove the conjecture R=T, where R is a certain condition on Galois representations and T is a condition on modular forms. This was first proposed by Mazur and Tilouine.

Step 7: Conclude that there are no solutions to Fermat's Last Theorem, since the existence of a solution would lead to a contradiction with the R=T conjecture.

In summary, the proof of Fermat's Last Theorem relied heavily on the Taniyama-Shimura conjecture, which was proven by Wiles and Taylor, and the R=T conjecture, which was first proposed by Mazur and Tilouine. The proof also used concepts from number theory, algebraic geometry, and analysis, and involved a deep understanding of elliptic curves, modular forms, Galois representations, and their interplay."

At first glance it seems like it is getting more technical and precise.

Reid Barton (May 05 2023 at 14:10):

It's clear that ChatGPT does not "know" how to prove FLT. Why not start your experiments from something much much easier, where there is some chance of success? Like even FLT for n=4?

Lars Ericson (May 05 2023 at 19:33):

@Reid Barton, ChatGPT in the end is a search engine which has ingested a giant corpus and has embedded it into a vector space that can be searched with the right prompts which are themselves translated into vectors. It reassembles the information into English or code as requested. The reassemblies are more or less competent. They are apt to be far less competent in an area requiring very high skill or an area where there is a low level of training material and the differences are subtle, like with Lean 3 versus Lean 4. I should have done this with Lean 3 as the baseline, but I am curious about Lean 4. I'm not trying to be the first kid on the block to automate a proof of FLT. I enjoy chasing definitions. Like compact subset to open cover to open set to set to ZFC. I got ChatGPT to eventually randomly generate a form of ZFC that Lean 4 would typecheck. So, following this particuilar game of Solitaire, I would then work my way back from set to open set to open cover to compact subset.

Eric Wieser (May 05 2023 at 21:57):

I got ChatGPT to eventually randomly generate a form of ZFC that Lean 4 would typecheck.

This is a pretty low bar, expecially if you allow it to use axiom. Did you check if the axioms were contradictory? The fact it type-checks doesn't tell you that.

Lars Ericson (May 07 2023 at 16:06):

Chewing on "every elliptic curve over the rational numbers is modular, meaning that it can be associated with a certain modular form of weight 2":

Elliptic curves:

Modular forms:

NB: Forecasters assess that there is a 72% likelihood that there we be a formal proof of the Modularity Theorem by 2029-05-01.

Patrick Nicodemus (May 08 2023 at 03:44):

That forecast is a weighted average of six individuals' guesses. It would be possible to interview every single one of them in less than an hour and figure out whether they knew anything about formal theorem proving. Frankly I'd be willing to bet at least 3 have no idea what the modularity theorem says or what the proof looks like. The number 72% means nothing.
It would be silly to listen to half a dozen random speculators over the practitioners in formal verification you are talking to here, some of whom also have experience in number theory.

Scott Morrison (May 08 2023 at 03:49):

Indeed, I happily bet that of those 6, (one of whom is a bot), exactly one knows anything about theorem proving, interactive or otherwise. :rofl:.

Bolton Bailey (May 08 2023 at 07:32):

As the creator of that market, I can confirm I absolutely could not even give a rough description of what the modularity theorem says :stuck_out_tongue_wink:.

Reid Barton (May 08 2023 at 19:53):

Definitely recommend some arbitrage with the formalized proof of FLT market!

Kevin Buzzard (May 08 2023 at 20:38):

The modularity theorem doesn't give you FLT for free -- you'll also need Mazur's theorem on torsion and also more, e.g. Ribet's level-lowering (and these won't be done in 6 years either ;-) ). There are ways of proving FLT without Ribet nowadays though (but you always need Mazur)

Bolton Bailey (May 09 2023 at 05:52):

Reid Barton said:

Definitely recommend some arbitrage with the formalized proof of FLT market!

Yes, creating an arbitrage opportunity to make the price of the FLT market a bit more accurate was part of the motivation to make the modularity theorem market. I think I misunderstood it a bit at the time - I thought that FLT was originally proved by proving the modularity theorem and proving that the modularity theorem implies FLT. I guess in actuality it's only a special case of the modularity theorem that's needed, I don't know how much harder the full modularity theorem would be to formalize.

Kevin Buzzard said:

The modularity theorem doesn't give you FLT for free -- you'll also need Mazur's theorem on torsion and also more, e.g. Ribet's level-lowering (and these won't be done in 6 years either ;-) ). There are ways of proving FLT without Ribet nowadays though (but you always need Mazur)

I was also totally unaware of these "other ways". Do they seem like the pathway of least resistance for an eventual formal proof?

Kevin Buzzard (May 09 2023 at 07:31):

Of FLT? Yes. But all proofs are still super long and technical and seem to need some trace formula in the noncompact case, that's one of many reasons why we won't have it by 2029

Lars Ericson (May 09 2023 at 20:47):

Nigel Boston: The proof of Fermat's Last Theorem: "An interested reader wanting a simple overview of the proof should consult Gouvea [13], Ribet [25], Rubin and Silverberg [26], or my article [1]. A much more detailed overview of the proof is the one given by Darmon, Diamond, and Taylor [6], and the Boston conference volume [5] contains much useful elaboration on ideas used in the proof. The Seminaire Bourbaki article by Oesterl´e and Serre [22] is also very enlightening. Of course, one should not overlook the original proof itself [38], [34] ."

Gouvea [13]. F. Gouvea. A marvelous proof. American Mathematical Monthly, 101:203–222, 1994.
Ribet [25]. K. Ribet. Galois representations and modular forms. Bulletin of AMS, 32:375–402, 1995.
Rubin and Silverberg [26]. K. Rubin and A. Silverberg. A report on Wiles’ Cambridge lectures. Bulletin of AMS, 31:15–38, 1994.
Boston [1]. N. Boston. A Taylor-made plug for Wiles’ proof. College Mathematics Journal, 26:100–105, 1995.
Darmon, Diamond, and Taylor [6]. H. Darmon, F. Diamond, and R. Taylor. Fermat’s last theorem. Elliptic curves, modular forms and Fermat’s last theorem (Hong Kong, 1993):2–140, 1997.
The Boston conference volume [5]. G. Cornell, J. Silverman, and G. Stevens. Modular forms and Fermat’s last theorem. Springer-Verlag, 1997.
Seminaire Bourbaki article by Oesterl´e and Serre [22] . J. Oesterl´e and J.-P. Serre. Travaux de Wiles (et Taylor, . . .). Ast´erisque, 237:319–355, 1996.
The original proof [38]. A. Wiles. Modular elliptic curves and Fermat’s last theorem. Ann. of Math., 141:443–551, 1995.
The original proof [34]. R. Taylor and A. Wiles. Ring-theoretic properties of certain Hecke algebras. Ann. of Math., 141:553–572, 1995.

Lars Ericson (May 10 2023 at 23:28):

Google Bard went into wide release today. Here it is on FLT and Lean. Performance seems to be worse than Perplexity.AI, Bing Chat and ChatGPT. Google Bard says it is memoryless with respect to queries so it can't incorporate information and results from prior queries.

Mario Carneiro (May 10 2023 at 23:33):

Does anyone in ML research know how systems like ChatGPT or Google Bard are trained to answer questions about themselves? It seems like if they were simply trained on the internet they would not be able to answer that kind of question at all and would just hallucinate something.

Scott Morrison (May 11 2023 at 00:10):

If Bard is saying that it is memoryless between queries in the same conversation, that is clearly hallucinating...

During the reinforcement-learning-on-human-feedback stage there is plenty of scope for it to "learn about itself".

Zhangir Azerbayev (May 11 2023 at 03:50):

Note that between unsupervised pre-training and RLHF, there is an intermediate phase of training called "instruction tuning". During instruction tuning the model is trained on dialogue-formatted samples of desired behavior: for example, saying it's a language model trained by openai, saying it doesn't know the answer to a question, and refusing to say naughty things. Then, the purpose of RLHF is to make sure that the model is calibrated (i.e says "I don't know" iff it doesn't know) and knows the precise boundary of what is and isn't allowed to say.

Fabian Glöckle (May 11 2023 at 08:19):

And also there is a system prompt saying something like: "The following is a conversation between NAME, a friendly, helpful and harmless AI assistant trained by COMPANY which ..."

Lars Ericson (May 13 2023 at 18:01):

Some problem space reductions:

3 is a regular prime so the n=3 proof is subsumed by the regular prime proof, when it is available. Given the regular prime proof in Lean, the remaining work in Lean would be on FLT for irregular primes.

I guess from the slides that @Riccardo Brasca 's project is ongoing. The project was also mentioned by @Kevin Buzzard in this paper on how computers can help us with reasoning.

Kevin Buzzard (May 13 2023 at 19:07):

Wiles' proof does not distinguish between the regular and irregular case. He proves semistable Taniyama-Shiumura, which doesn't have a prime in its statement.

Notification Bot (May 14 2023 at 03:55):

2 messages were moved from this topic to #new members > Using the LEM in a proof by Scott Morrison.

Last updated: May 02 2025 at 03:31 UTC