Zulip Chat Archive

Stream: general

Topic: Conditional no-cycle proof for Collatz — feedback welcome

Eric Merle (Feb 22 2026 at 15:23):

Hello everyone,

I've been working on a formalization in Lean 4 proving that no non-trivial cycle exists in the Collatz iteration, conditional on three explicit hypotheses (encoded as structure parameters, not axioms).

Repository: https://github.com/ericmerle3789/collatz-nocycle-lean4
https://github.com/ericmerle3789/collatz-nocycle-lean4

The main theorem is:

theorem no_nontrivial_cycle_phase59
    (baker : BakerSeparation) (barina : BarinaVerification)
    (cf : ContinuedFractionSeparation)
    (n k : ℕ) (hcyc : IsOddCycle n k) : False

What this proves: Under Baker's theorem on linear forms in logarithms (1966), Barina's computational verification up to 2^71 (2025), and a continued fraction separation hypothesis derived from Baker + CF theory of log₂3, no odd integer n > 1 can be periodic under the Syracuse map.

What this does NOT prove: This is not a proof of the full Collatz conjecture. It does not address divergence. The three hypotheses are not formalized from first principles — they are explicit parameters.

Build: 36 Lean files, 0 axioms, 0 sorry, Lean 4.27.0 + Mathlib 4.27.0. Some native_decide proofs require ~16 GB RAM (CF gap verifications involving 3^190537, a number with ~90K digits).

What I'd appreciate feedback on:

Is the IsOddCycle definition correct and standard? (n > 1 ∧ n % 2 = 1 ∧ k ≥ 1 ∧ nSeq n k = n)
Are the hypothesis structures (Baker, Barina, CF) faithful to the published results? Details in HYPOTHESES.md.
Is there anything in Mathlib that could replace or strengthen parts of this? For instance, continued fraction infrastructure that could eliminate the third hypothesis.
Any structural issues, naming problems, or bad patterns?

This was developed with AI assistance (Claude, Anthropic) as part of a larger research project. I'm not a professional mathematician — I'm an independent researcher learning Lean through this project. Any guidance from the community would be very valuable.

Thank you for your time.
Eric.

Ralf Stephan (Feb 22 2026 at 16:04):

Not sure if this helps but I have some related AI-formalized log(3) continued fraction lemmas in https://github.com/rwst/lean-code/blob/main/CRoz.lean

Sébastien Gouëzel (Feb 22 2026 at 16:06):

Eric Merle said:

Are the hypothesis structures (Baker, Barina, CF) faithful to the published results?

The assumption CF doesn't seem to be faithful to the published results. You should point to the precise statement in Simons & de Weger that would give this as I haven't found it there.

Eric Merle (Feb 22 2026 at 18:06):

@Sébastien Gouëzel Thank you — you are right.

ContinuedFractionSeparation was our own derivation, not a faithful encoding of any single published result. While auditing this, we found a second issue: SimonsDeWegerBound (k ≤ 982) was also misattributed — the number 982 comes from our Product Bound, not from Simons–de Weger. We also had an m/k confusion: SdW bounds m (local minima), not k (odd steps).

Both are now fixed in d2fa81a:

SimonsDeWegerBound → ProductBoundThreshold
ContinuedFractionSeparation → DerivedLargeKBound
Added HYPOTHESES.md separating published results from our derivations
Build: 0 sorry, 0 axioms

On a related note — the part of this project we find most compelling is actually the anti-divergence side (Porte 1), where the core question reduces to a drift problem: the accelerated Collatz map gives E[v₂] = 2 per odd step, yielding an expected log-ratio of log(4/3) > 0 per step. We have formalized this and verified descent certificates for 524,288 residue classes mod 2²⁰. The gap we cannot close is the passage from average drift to almost-sure convergence — whether trajectory bits can conspire indefinitely against the drift. Given your work on entropy and drift in hyperbolic groups, we would welcome any insight on whether concentration arguments could apply in this non-i.i.d. setting.

Thank you for the careful reading — it made the formalization significantly better.

Eric Merle (Feb 22 2026 at 18:13):

@Ralf Stephan Thank you — this looks relevant. Our DerivedLargeKBound relies on continued fraction convergents of log₂(3), so formalized CF lemmas for log(3) could be directly useful. I will look into your CRoz.lean and see if there is overlap with what we need. Our repo is at https://github.com/ericmerle3789/collatz-nocycle-lean4 (branch claude/phase57) if you want to compare approaches.

Sébastien Gouëzel (Feb 22 2026 at 21:32):

Your current assumption H3 (DerivedLargeKBound) is still not reasonable, as I don't think it would follow from any published result in the literature (and I don't think anyone would know how to prove it). You can have assumptions if you like, but they should correspond word for word to results in the literature, and then everything in your project should only depend on these. This is not the case in the current formalization.

Eric Merle (Feb 23 2026 at 18:48):

@Sébastien Gouëzel

Thank you, Sébastien — your feedback has been invaluable and we are grateful for your careful reading.

You are right, and we have not yet corrected this — here is our concrete plan.

DerivedLargeKBound will be removed. The cleaned result will rest on two hypotheses only, each anchored in published literature: Baker (1966 / Matveev 2000) for linear forms in logarithms, and Barina (2025) for computational verification up to 2⁷¹. Under these two hypotheses alone, we aim to establish: no non-trivial cycle with k ≤ 1322 odd steps. This is weaker than our previous claim, but honest.

For k > 1322, we are considering encoding the best approximation theorem for log₂3 (Hardy & Wright, Theorem 184) as a third hypothesis. Before doing so, do you have a view on its formalizability in Lean, or on a published result that would be more directly suitable?

Sébastien Gouëzel (Feb 25 2026 at 20:12):

Eric Merle said:

For k > 1322, we are considering encoding the best approximation theorem for log₂3 (Hardy & Wright, Theorem 184) as a third hypothesis. Before doing so, do you have a view on its formalizability in Lean, or on a published result that would be more directly suitable?

No idea. I don't know anything about this field, I just wanted to emphasize good practice (for formalized mathematics, but also in general): if you use a result of the litterature, give a precise reference to this result, and only use it in the form stated by the authors.

Last updated: Feb 28 2026 at 14:05 UTC