Zulip Chat Archive

Stream: general

Topic: Discussion: FATE


Wanyi He (Aug 26 2025 at 06:24):

Discussion thread for #announce > FATE: A Multi-level Formal Benchmark for Algebraic Problems

Kim Morrison (Aug 26 2025 at 06:50):

Why the v4.16.0 toolchain? This is more than 6 months old.

Eric Wieser (Aug 26 2025 at 06:55):

FATE-X is 4.19.0

Kim Morrison (Aug 26 2025 at 06:56):

Oh, even stranger that they are on different toolchains at different difficulties!

Jiang Jiedong (Aug 26 2025 at 06:57):

The initial version of FATE-M and H was created a long time ago. FATE-X is more recent…

Jiang Jiedong (Aug 26 2025 at 06:59):

Anyway it is not hard to update and unify the Lean toolchain. This is planned but we are not current working on it.

Kim Morrison (Aug 26 2025 at 07:03):

https://github.com/frenzymath/FATE-M/pull/1 is a start, to v4.17.0 v4.22.0. :-)

Jiang Jiedong (Aug 26 2025 at 07:12):

Wow!

Kim Morrison (Aug 26 2025 at 08:05):

I "randomly sampled" problem 37 from FATE-M, and gave it to Claude Code. It succeeded with the prompt:

Try to fill in the sorry in 37.lean. You may need to do some research online first.

Your TODO list should include writing a complete natural language proof in the module doc-string first.

and no further interaction from me except to approve web searches.

Kim Morrison (Aug 26 2025 at 08:06):

(All the pieces are already in Mathlib, it just had to find everything and put them together.)

Kim Morrison (Aug 26 2025 at 08:06):

import Mathlib

/-
Prove that $R$ is a P.I.D. if and only if $R$ is a U.F.D. that is also a Bezout Domain.

## Natural Language Proof:

We prove both directions of this fundamental characterization of Principal Ideal Domains.

**Forward Direction (PID ⟹ UFD ∧ Bezout):**
1. Every PID is a UFD: This is a classical result. In a PID, every ideal is principal (generated by one element).
   This implies that every irreducible element is prime (since the ideal it generates is maximal among proper
   principal ideals). The existence of factorizations follows from the ascending chain condition on principal
   ideals, and uniqueness follows from the primality of irreducibles.

2. Every PID is a Bezout domain: By definition, a Bezout domain is an integral domain where every finitely
   generated ideal is principal. Since in a PID every ideal (not just finitely generated ones) is principal,
   it is automatically a Bezout domain.

**Backward Direction (UFD ∧ Bezout ⟹ PID):**
The key insight is that a UFD satisfies the ascending chain condition on principal ideals (ACCP). Combined
with the Bezout property, we can show that R is Noetherian:

1. In a UFD with the Bezout property, every ideal generated by two elements is principal. This follows because
   for any a, b ∈ R, we can factor out their gcd d, leaving coprime elements. The Bezout property then gives
   us u, v such that au + bv = d, showing (a,b) = (d).

2. A UFD satisfies ACCP (ascending chain condition on principal ideals). This is because in a UFD, if we have
   an ascending chain of principal ideals (a₁) ⊆ (a₂) ⊆ ..., then each aᵢ₊₁ divides aᵢ. By unique factorization,
   this means aᵢ₊₁ has fewer prime factors than aᵢ (counting multiplicity), which must stabilize.

3. In a Bezout domain, ACCP implies the ring is Noetherian. This is because every ideal is the union of its
   finitely generated subideals, each of which is principal by the Bezout property.

4. A Noetherian Bezout domain is a PID: Since the ring is Noetherian, every ideal is finitely generated.
   By the Bezout property, every finitely generated ideal is principal. Therefore, every ideal is principal.
-/
theorem isPrincipalIdealRing_iff_uniqueFactorizationMonoid_and_isBezout
    {R : Type*} [CommRing R] [IsDomain R] :
    IsPrincipalIdealRing R  UniqueFactorizationMonoid R  IsBezout R := by
  constructor
  · intro h
    haveI : IsPrincipalIdealRing R := h
    constructor
    · exact inferInstance
    · exact inferInstance
  · intro hUFD, hBezout
    haveI : UniqueFactorizationMonoid R := hUFD
    haveI : IsBezout R := hBezout
    -- In a Bézout domain, UFD → PID (using IsBezout.TFAE)
    exact (IsBezout.TFAE.out 2 1).mp hUFD

Kim Morrison (Aug 26 2025 at 08:07):

I didn't look at its natural language proof at all, if anyone would like to check it I'd be curious how sane it is!

Kim Morrison (Aug 26 2025 at 08:08):

(Google doesn't index zulip, but if this counts as contamination I'm happy to delete Claude's code above!)

Jiang Jiedong (Aug 26 2025 at 08:11):

Yes! We tested general LLMs and put the results in the blog. Indeed, they are capable of using the current Mathlib to solve a lot of questions in FATE-M.

ResultsFig.png

ResultsTable.png

Kim Morrison (Aug 26 2025 at 08:14):

Problem 51 in FATE-H needs adjusting for v4.19.0. This statement suffices:

theorem adjoin_minpoly_coeffs_toSet_eq {F K : Type} [Field F] [Field K]
    [Algebra F K] {θ : K} (L : IntermediateField F K) (h : Fθ = ) :
    IntermediateField.adjoin F ((fun x : L => (x : K)) '' (minpoly L θ).coeffs.toSet : Set K) = L := by
  sorry

but I'm sure someone more familiar with intermediate fields could give a more colloquial way to state the cast.

Jiang Jiedong (Aug 26 2025 at 08:17):

Kim Morrison said:

I didn't look at its natural language proof at all, if anyone would like to check it I'd be curious how sane it is!

The natural language proof Claude generated appears to be very good and standard. The formalization is not so related to natural language because the type inference in Mathlib is so well-developed that the formal proof is nearly automatic.

Junyan Xu (Aug 26 2025 at 15:59):

Hmm this one is almost exactly IntermediateField.adjoin_minpoly_coeff_of_exists_primitive_element (added in a PR by Jz Pan on Steinitz theorem which I reviewed)


Last updated: Dec 20 2025 at 21:32 UTC