Zulip Chat Archive

Stream: new members

Topic: Help with First Proof

Pi Lanningham (Feb 04 2025 at 22:18):

Hi all!

I'm playing around with Lean and working on my first semi-real-world thing with it, and I'm kind of at a loss for how to proceed.

For some context, I'm modeling the semantics of a simple blockchain. You've got a set of unspent transaction outputs, and a quantity of collected fees; and you've got a transition function that can apply some transaction to move funds around and collect a fee.

The code for this is:

@[reducible] def TxId := String
@[reducible] def Ix := Nat
@[reducible] def Slot := Nat
@[reducible] def Coin := Nat
@[reducible] def Address := String
@[reducible] def TxRef := TxId × Ix
structure TxOut where
  address: Address
  value: Coin


class UTxO (U: Type u) where
  map: U → ((TxRef × TxOut) -> a) → List a
  lookup : U -> TxRef → Option TxOut
  add : U -> List (TxRef × TxOut) → U
  remove : U -> List TxRef → U

notation (name := between) lower:arg " ≤ " mid:arg " ≤ " upper:arg => (lower ≤ mid) ∧ (mid ≤ upper)

structure UTXOEnv where
  slot: Slot


structure UTXOState (U: Type) [UTxO U] where
  utxo: U
  fees: Coin

structure Transaction where
  txId: TxId
  inputs: List TxRef
  outputs: List TxOut
  valid_range: Slot × Slot
  fee: Coin


section
variable {U: Type} [UTxO U] [ToString U]


def transaction_valid (env: UTXOEnv) (state: UTXOState U) (transaction: Transaction): Bool :=
  let (valid_start, valid_end) := transaction.valid_range
  let current_slot := env.slot
  let resolved_inputs := transaction.inputs.map (UTxO.lookup state.utxo)
  let consumed := List.sum $ List.map (λ out => if let some out := out then out.value else 0) resolved_inputs
  let produced := transaction.fee + (List.sum $ List.map (fun o => o.value) transaction.outputs)

  List.and [
    valid_start ≤ current_slot ≤ valid_end,
    transaction.fee > 0,
    resolved_inputs.all Option.isSome,
    consumed = produced
  ]

def utxo_transition
  (env: UTXOEnv)
  (state: UTXOState U)
  (transaction: Transaction): UTXOState U :=
  if transaction_valid env state transaction then
    let outputs := transaction.outputs.mapIdx (λ idx out => ((transaction.txId, idx), out))
    let utxo := state.utxo
    let utxo' := UTxO.remove utxo transaction.inputs
    let utxo'' := UTxO.add utxo' outputs
    { state with
        utxo := utxo''
        fees := state.fees + transaction.fee
    }
  else
    state

def total_value (utxo: U): Coin :=
  List.sum $ UTxO.map utxo (λ (_, out) => out.value)

end


@[reducible] def UTXOList := List (TxRef × TxOut)
instance : UTxO UTXOList where
  map l f := l.map (fun (r, out) => f (r, out))
  lookup l r := l.find? (fun (r', _) => r == r') |>.map Prod.snd
  add l new := l ++ new
  remove l refs := l.filter (fun (r, _) => !refs.contains r)

#eval
  let env: UTXOEnv := dbgTraceVal $ { slot := 1 }
  let utxo: UTXOList := [
    (("a", 0), { address := "addr1", value := 2 }),
    (("b", 1), { address := "addr2", value := 3 })
  ]
  let state : UTXOState.{0} UTXOList := dbgTraceVal $ { utxo := utxo, fees := 0 }
  let tx := dbgTraceVal $ {
    txId := "c",
    inputs := [("a", 0)],
    outputs := [{ address := "addr3", value := 1 }],
    valid_range := (0, 2),
    fee := 1
  }
  utxo_transition env state tx

Now I'd like to prove something like: the total value in the system is always preserved by this rule.

I think I would write the premise like so:

theorem total_value_unchanged (env: UTXOEnv) (state: UTXOState U) (transaction: Transaction):
  let new_state := utxo_transition env state transaction
  let consumed := total_value state.utxo
  let produced := total_value new_state.utxo
  consumed = produced := by
    sorry

I've played around with trying to understand what happens when I intros or simp but I'm not getting terribly far on my own.

I'm not necessarily looking for the full answer, but any tips on what I should be looking at to tell if I'm making useful progress, and some of the tools that might be helpful in something like this, would be helpful! Thanks in advance!

Pi Lanningham (Feb 04 2025 at 22:24):

or maybe I can only prove this for a specific U, like UTxOList?

Aaron Liu (Feb 04 2025 at 22:35):

Right now, your code does not work for me :sad:, so it is hard to me to help you right now.
It is generally bad to have lets in the theorem statement, so I would suggest zeta-reducing those.

Pi Lanningham (Feb 04 2025 at 22:44):

oh, maybe because I stripped out the ToString implementations; let me give you a gist with what I have so far, h/o

Pi Lanningham (Feb 04 2025 at 22:45):

https://gist.github.com/Quantumplation/d0d2ce0ba73d9b2e3fd09837dcf4b986

Also hosted here: https://github.com/Quantumplation/cardano-lean-specification

Aaron Liu (Feb 04 2025 at 22:54):

Pi Lanningham said:

I'm playing around with Lean and working on my first semi-real-world thing with it, and I'm kind of at a loss for how to proceed.

What are you stuck on exactly? When I am stuck, usually, it is because I don't have a good idea of how to prove my claim. Do you have an informal proof that the value should remain unchanged?

Pi Lanningham (Feb 04 2025 at 23:20):

(sorry I just moved to mobile so the above is a little sloppy hehe)

Pi Lanningham (Feb 04 2025 at 23:20):

I know how to do the split and how to expand some definitions, but only through simp so I don't know how to get things in exactly the form I would need to progress.

Pi Lanningham (Feb 04 2025 at 23:20):

If I was proving this on paper, I'd likely:

Write down the definition of total_value;

Point out that consumed == produced is equivalent to showing consumed - produced == 0

Expand the definitions to (state.fees + sum state.utxo.value) - (new_state.fees + sum new_state.utxo.value) == 0

Expand new_state.fees to state.fees + transaction.fees

Eliminate state.fees, so (sum state.utxo.value) - (transaction.fees + sum new_state.utxos.value) == 0

Show that new_state.utxos is state.utxos - transaction.inputs + transaction.ouputs

Show that sum distributes here, so we can eliminate sum utxo.state.value, so we have -(transaction.fee - sum transaction.inputs + sum transaction.outputs) = 0

Rewrite this to sum transaction.inputs == transaction.fee + sum transaction.outputs

Show that transaction_valid being true implies this

Show that transaction_valid being false implies new_state == state, so total_value is equal too

Aaron Liu (Feb 04 2025 at 23:23):

Pi Lanningham said:

I know how to do the split and how to expand some definitions, but only through simp so I don't know how to get things in exactly the form I would need to progress.

Some ways to do this (in order of my personal preference): rw, unfold, delta.
Usually it is better to separate rewriting the definition into another theorem, and then rw [unfolding_theorem].

Pi Lanningham (Feb 04 2025 at 23:34):

Ok I'll play around with that later tonight, thank you! I just needed more tools I think

Notification Bot (Feb 04 2025 at 23:41):

This topic was moved here from #lean4 > Help with First Proof by Patrick Massot.

Aaron Liu (Feb 04 2025 at 23:52):

If I was proving this on paper, I'd likely:

I think you should do many of the same thing when proving in Lean too.

Write down the definition of total_value;

You seem to have done this already.

Point out that consumed == produced is equivalent to showing consumed - produced == 0

In mathlib this is called docs#sub_eq_zero.
Note that subtraction on Nat is badly behaved, so you might want to switch to Int here.

Expand the definitions to (state.fees + sum state.utxo.value) - (new_state.fees + sum new_state.utxo.value) == 0

Expand new_state.fees to state.fees + transaction.fees

Use rw or unfold or some other tactic to do this.

Eliminate state.fees, so (sum state.utxo.value) - (transaction.fees + sum new_state.utxos.value) == 0

I think the lemmas in file#Algebra/Group/Basic will be helpful. If you don't want to import mathlib, then see Init.Data.Int.Lemmas.

Show that new_state.utxos is state.utxos - transaction.inputs + transaction.ouputs

You will need to show this.

Show that sum distributes here, so we can eliminate sum utxo.state.value, so we have -(transaction.fee - sum transaction.inputs + sum transaction.outputs) = 0

I couldn't find anything that says the sum distributes, so you might have to prove that yourself.

Rewrite this to sum transaction.inputs == transaction.fee + sum transaction.outputs

This is more group lemmas.

Show that transaction_valid being true implies this

You will have to show this.

Show that transaction_valid being false implies new_state == state, so total_value is equal too

This falls directly out of how you defined transaction_valid, so this should be easy.

Pi Lanningham (Feb 05 2025 at 04:39):

How do I actually "apply" sub_eq_zero? I see that it's defined as a - b = 0 <=> a = b, but if I do rw [Int.sub_eq_zero] just says no instances of the pattern are found, but like... that's exactly what my goal is :sweat_smile:

Pi Lanningham (Feb 05 2025 at 04:40):

ah, ok, I can use the left arrow to go in reverse!

Pi Lanningham (Feb 05 2025 at 07:06):

Ok, I've gotten pretty far!

Here's what I have so far:

theorem total_value_unchanged (env: UTXOEnv) (state: UTXOState UTXOList) (transaction: Transaction):
  -- let new_state := utxo_transition env state transaction
  -- let consumed := total_value state.utxo
  -- let produced := total_value new_state.utxo
  total_value state = total_value (utxo_transition env state transaction) := by
    simp [utxo_transition]
    split
    . rewrite [<- Int.sub_eq_zero]
      rw [total_value]
      rw [total_value]
      simp
      -- rw [Int.add_comm]
      rw [<- Int.sub_sub]
      rw [<- Int.sub_sub]
      rw [Int.add_comm]
      simp
      sorry
    . rfl

up to the sorry, here's my goal:

env : UTXOEnv
state : UTXOState UTXOList
transaction : Transaction
h✝ : transaction_valid env state transaction = true
⊢ (UTxO.map state.utxo fun x => x.snd.value).sum - transaction.fee -
    (UTxO.map
        (UTxO.add (UTxO.remove state.utxo transaction.inputs)
          (List.mapIdx (fun idx out => ((transaction.txId, idx), out)) transaction.outputs))
        fun x => x.snd.value).sum =
  0

I want to simplify the mapIdx and the fun x => x.snd.value, since essentially they should cancel out...

So, I'm trying to prove that's the case:

theorem mapIdx_2nd (l: List α):
  List.map (fun x: Prod Nat α => x.snd) (List.mapIdx (fun i x => (i, x)) l) = l := by
  induction l
  . rfl
  . rw [List.mapIdx_cons]
    rw [List.map_cons]
    rw [Prod.snd]
    simp

But this leaves me with an i+1 in the mapIdx... I can see intuitively that this doesn't matter, because 2nd is ignoring the first argument, but it doesn't let me satisfy the induction. :thinking:

Aaron Liu (Feb 05 2025 at 12:17):

If it doesn't depend on the first argument, you should put that in the theorem too!

-- `d` represents all the ways it could depend
theorem mapIdx_2nd (l : List α) (d : Nat → α → Nat) :
  -- do you know about `List.zip`?
  List.map (fun x : Prod Nat α => x.snd) (List.mapIdx (fun i x => (d i x, x)) l) = l := by
  induction l generalizing d with
  | nil => rfl
  | cons x xs ih =>
    /- we can pick what `d` to pass to the `ih` now,
    because in the previous inductive step we proved it for all `d` -/
    simpa [Nat.add_assoc] using ih fun i x => d (i + 1) x

Pi Lanningham (Feb 05 2025 at 19:53):

Thanks so much for your help @Aaron Liu ! That made some things click for me. I'm still really unclear on the machinery under the hood that is making it happen (I can't visualize, meaning i'm kind of at a guess and check situation, but at least I can make progress!)

And yes, I'm familiar with zip; I'm not 100% sure how it applies here, since i'm not zipping two lists together (unless maybe i'm zipping [0..] with the transaction outputs).

Ultimately I also needed to generalize f, and add a g because i'm also accessing .value inside the lambda, but I got it proved and it actually helped make progress on the original proof.

I'll keep picking away at it, but if you're curious, here's how far I've gotten now:

theorem mapIdx_2nd (l: List α) (f : Nat → α → β) (g : α → γ):
  List.map (fun x: Prod β α => g x.snd) (List.mapIdx (fun i x => (f i x, x)) l) = List.map g l := by
  induction l generalizing f with
  | nil => rfl
  | cons x xs ih =>
    simpa [Nat.add_assoc] using ih fun i x => (f (i + 1) x)

theorem total_value_unchanged (env: UTXOEnv) (state: UTXOState UTXOList) (transaction: Transaction):
  -- let new_state := utxo_transition env state transaction
  -- let consumed := total_value state.utxo
  -- let produced := total_value new_state.utxo
  -- consumed = produced := by
  total_value state = total_value (utxo_transition env state transaction) := by
    simp [utxo_transition]
    split
    . rewrite [<- Int.sub_eq_zero]
      rw [total_value]
      rw [total_value]
      simp
      -- rw [Int.add_comm]
      rw [<- Int.sub_sub]
      rw [<- Int.sub_sub]
      rw [Int.add_comm]
      simp
      simp [UTxO.map]
      simp [UTxO.add, UTxO.remove]
      rw [mapIdx_2nd (f := fun idx _ => (transaction.txId, idx)) (g := fun x: TxOut => x.value)]
      sorry
    . rfl

Pi Lanningham (Feb 05 2025 at 23:34):

If I have a ^ b ^ c as a premise, how can I show that just b is true? i've been looking through the Bool.and_* rules and i'm not sure which one would be helpful

Matt Diamond (Feb 05 2025 at 23:38):

are you dealing with a Prop ∧ or a Boolean &&?

Pi Lanningham (Feb 05 2025 at 23:39):

uhh.. not sure; Here's what's in my goal, it just says "And" and i'm trying to pluck out the middle condition:

image.png

I wasn't aware Prop was different.

Pi Lanningham (Feb 05 2025 at 23:39):

lemme go look at the prop docs

Matt Diamond (Feb 05 2025 at 23:43):

yeah, there's a difference between docs#And and docs#Bool.and... if it's the Prop version, the accessors are .left and .right (or .1 and .2)

Pi Lanningham (Feb 05 2025 at 23:44):

and then I would use a let clause to make it into something I can use in the rest of the proof?

Matt Diamond (Feb 05 2025 at 23:45):

sure, though if you're working with Prop it's better to use have, as the actual proof term is irrelevant

Matt Diamond (Feb 05 2025 at 23:45):

if you have h : a ∧ b ∧ c, that's going to be parsed as a ∧ (b ∧ c) so you'd want to do have h' := h.right.left to get a proof of b

Pi Lanningham (Feb 05 2025 at 23:46):

ok ty!!

Matt Diamond (Feb 05 2025 at 23:46):

sure, no problem

Matt Diamond (Feb 05 2025 at 23:48):

also keep in mind that you can do rcases h with ⟨ha, hb, hc⟩ to deconstruct the And into its 3 components at once

Eric Paul (Feb 05 2025 at 23:49):

Here are some examples of other ways you get extract one of the terms from an And, just for fun as they can all be done by things @Matt Diamond has already said

import Mathlib

example (p q s : Prop) (h : p ∧ q ∧ s) : True := by
  have hs : s := by tauto
  obtain ⟨hp, hq, hs⟩ := h
  trivial

The first example gets a proof of s using tauto. The second example extracts a proof of each term in the And.

Pi Lanningham (Feb 06 2025 at 01:05):

:thinking:

So, I have a \elem B and \forall x, x \elem B -> C a, and i'm trying to show C a as my goal, how do I "combine" these?

more concretely, I have

theorem valid_implies_inputs_exist (env: UTXOEnv) (state: UTXOState UTXOList) (transaction: Transaction):
  transaction_valid env state transaction → ∀ r ∈ transaction.inputs, (UTxO.lookup state.utxo r).isSome := by
    intros valid tx_ref is_input
    rw [transaction_valid] at *
    simp at valid
    have all_inputs_exist := valid.right.right.left
    sorry

which leads to this proof state:

env : UTXOEnv
state : UTXOState UTXOList
transaction : Transaction
tx_ref : TxRef
is_input : tx_ref ∈ transaction.inputs
valid : {snipped for space}
all_inputs_exist : ∀ (a : TxId) (b : Ix), (a, b) ∈ transaction.inputs → (UTxO.lookup state.utxo (a, b)).isSome = true
⊢ (UTxO.lookup state.utxo tx_ref).isSome = true

I tried applying all_inputs_exist as a rewrite rule, or passing tx_ref or is_input to all_inputs_exist, but it's not clicking for me, and because the machinery of the proof checker is such a black box to me, often I feel like I'm just guessing at how to express what I'm trying to say.

I also tried destructuring tx_ref into a pair by the definition of TxRef (@[reducible] def TxRef := TxId × Ix) since i thought maybe that was interfering with it finding the relationship, but that didn't work either.

(also, thank you to everyone whose been helping, I'm trying to put in the work and understand and not just get the answer from y'all :sweat_smile: but it's very mind bending stuff)

Aaron Liu (Feb 06 2025 at 01:08):

Try specialize all_inputs_exist tx_ref.fst tx_ref.snd is_input.
If you want to keep all_inputs_exist as a hypothesis, then you can do have lookup_tx_ref := all_inputs_exist tx_ref.fst tx_ref.snd is_input.

Pi Lanningham (Feb 06 2025 at 01:08):

ahh, specialize was the tactic I was missing I think, lemme play around with that

Pi Lanningham (Feb 06 2025 at 01:15):

Huzzah, I think this did it!

theorem valid_implies_inputs_exist (env: UTXOEnv) (state: UTXOState UTXOList) (transaction: Transaction):
  transaction_valid env state transaction → ∀ r ∈ transaction.inputs, (UTxO.lookup state.utxo r).isSome := by
    intros valid tx_ref is_input
    rw [transaction_valid] at *
    simp at valid
    have all_inputs_exist := valid.right.right.left
    clear valid
    have (tx_id, ix) := tx_ref
    specialize all_inputs_exist tx_id ix is_input
    trivial

Aaron Liu (Feb 06 2025 at 01:35):

You can find a lot of useful stuff here. Most of it is for mathlib, but I think it should still be useful.

Pi Lanningham (Feb 06 2025 at 02:26):

is there a way to give names to an expression in the goal, to make it more readable? for example, my goal currently starts (List.map (fun x => x.2.value) state.utxo).sum - and it's very visually noisy, so I'd like to replace that with old_value, but playing around with replace, substitute, and let bindings didn't get me anywhere

Aaron Liu (Feb 06 2025 at 03:08):

I usually don't worry about goal readability, but you can do let old_value := ...; rw [show old_value = ... from rfl] at *.
Mathlib has a tactic called set that does all of this automatically, so it would be set old_value := ... with h.

Pi Lanningham (Feb 06 2025 at 03:51):

oo ok; i finally figured out how to import Mathlib :sweat_smile:

regarding goal readability, i'm mostly doing it to organize my own thoughts as I try to figure out how to get to the proof hehe

Pi Lanningham (Feb 06 2025 at 05:44):

Hmm; if I have f x + ... = f x + ... how do I cancel out f x on both sides? I found Int.add_left_cancel which seems to be the rule I want, but it won't let me use it as a rw rule.

Vlad Tsyrklevich (Feb 06 2025 at 08:32):

Does congr work? For Int.add_left_cancel, the goal/hypothesis is (probably) reversed, you would want to apply Int.add_left_inj instead. (Having an #mwe would make this easier to reason about/demonstrate, as this depends on whether you're talking about a hypothesis or a goal.)

Aaron Liu (Feb 06 2025 at 13:10):

Addition associates to the left, so a + b + c is actually (a + b) + c. You can force it to the right by using simp only [Int.add_assoc] at hyp_name.

Pi Lanningham (Feb 07 2025 at 01:37):

Thanks, those answers were both very helpful! I had tried the add_assoc, but I had a Nat in my theorem declaration instead of an Int that was tripping it up :sweat_smile:

Pi Lanningham (Feb 07 2025 at 01:40):

I'm still not super clear on the difference between rw and simp

Julian Berman (Feb 07 2025 at 01:42):

rw is "a lemma says that A and B are the same, so change A to B". simp is quite a bit more involved, it's "try a whole lot of lemmas (which have been tagged "use me simp") and all of which in theory simplify the goal in various ways -- and hopefully at the end the goal ends up trivial"

Pi Lanningham (Feb 07 2025 at 02:31):

Huzzah!

theorem x (l : List α) (f : α → Int) (g : α → Bool):
  (List.map f (List.filter g l)).sum = (List.map f l).sum - (List.map f (List.filter (fun x => !g x) l)).sum := by
    induction l with
    | nil => rfl
    | cons x xs ih =>
      simp [ih]
      rw [List.filter]
      split
      . rename_i g_true
        simp
        rw [List.filter]
        rw [<- Bool.not_eq_false_eq_eq_true (g x)] at g_true
        rename (!g x) = false => not_g_false
        simp [not_g_false]
        set fx := f x
        set xs_sum := (List.map f xs).sum
        set filter_sum := (List.map f (List.filter g xs)).sum
        set not_filter_sum := (List.map f (List.filter (fun x => !g x) xs)).sum
        rw [Int.add_sub_assoc]
        set remainder := (xs_sum - not_filter_sum)
        simp [Int.add_left_cancel]
        simp [ih]
      . rename_i g_false
        rw [List.filter]
        simp [g_false]
        assumption

Pi Lanningham (Feb 12 2025 at 23:40):

Ok, I'm pretty stuck again;

Here's my proof so far:

https://gist.github.com/Quantumplation/cbfbe910a75f5f03c21be9820a55ab3a

At the sorry, my goal is this:

⊢ (List.map (fun x => x.2.value) (List.filter (fun x => decide (x.1 ∈ transaction.inputs)) state.utxo)).sum =
  (List.map
      ((fun out =>
          match out with
          | some out => out.value
          | x => 0) ∘
        instUTxOUTXOList.2 state.utxo)
      transaction.inputs).sum

Cleaning up some notation, it's basically saying I need to prove that "The sum of the unspent transaction outputs filtered to the list of transaction inputs is equal to the sum of each transaction inputs looked up in the UTxO", which seems intuitively true to me, but I'm not sure how to convince lean4 of that fact.

(or to remove some of the domain specific language, if all behaves like a Dictionary, sum (filter (λ x -> x \in subset) all) = sum (map (λ x -> lookup x in all) subset))

I think my next steps probably involve simplifying out the function composition, so it looks something like:

⊢ (List.map (fun x => x.2.value) (List.filter (fun x => decide (x.1 ∈ transaction.inputs)) state.utxo)).sum =
  (List.map
      (fun input =>
          match (UTxO.lookup input) with
          | some out => out.value
          | x => 0) state.utxo)
      transaction.inputs).sum

Then using valid_implies_inputs_exist to show that the match can be simplified out:

⊢ (List.map (fun x => x.2.value) (List.filter (fun x => decide (x.1 ∈ transaction.inputs)) state.utxo)).sum =
  (List.map
      (fun input => (UTxO.lookup input).value)
      transaction.inputs).sum

and then showing these are equivalent...

I might also try removing the abstraction around UTxO, and just making it always a list, and see if that helps simplify the proofs.

Does anyone else have other advice? Super appreciative of everyone whose been taking time to help me learn! :)

Chris Wong (Feb 13 2025 at 00:20):

A potential problem with that approach is that since UTxO.lookup returns an Option, not a bare value, the last goal is not well-typed

Chris Wong (Feb 13 2025 at 00:21):

You might be interested in docs#List.filterMap, which fuses the map and filter steps, so might be easier to work with

Aaron Liu (Feb 13 2025 at 00:41):

Sorry, but your goal is false, I think?

-- output of `extract_goal`, negated
theorem extracted_1 : ¬∀ (env : UTXOEnv) (state : UTXOState UTXOList) (transaction : Transaction)
  (is_valid :
    (transaction.4.1 ≤ env.slot ∧ env.slot ≤ transaction.4.2) ∧
      0 < transaction.fee ∧
        (∀ (a : TxId) (b : Ix), (a, b) ∈ transaction.inputs → (UTxO.lookup state.utxo (a, b)).isSome = true) ∧
          (List.map
                ((fun out =>
                    match out with
                    | some out => out.value
                    | x => 0) ∘
                  UTxO.lookup state.utxo)
                transaction.inputs).sum =
            transaction.fee + (List.map (fun out => out.value) transaction.outputs).sum)
  (balanced :
    (List.map
          ((fun out =>
              match out with
              | some out => out.value
              | x => 0) ∘
            instUTxOUTXOList.2 state.utxo)
          transaction.inputs).sum =
      transaction.fee + (List.map (fun out => out.value) transaction.outputs).sum),
  (List.map (fun x => x.2.value) (List.filter (fun x => decide (x.1 ∈ transaction.inputs)) state.utxo)).sum =
    (List.map
        ((fun out =>
            match out with
            | some out => out.value
            | x => 0) ∘
          instUTxOUTXOList.2 state.utxo)
        transaction.inputs).sum := by
  intro h
  specialize h ⟨0⟩ ⟨[⟨⟨"", 0⟩, ⟨"", 1⟩⟩, ⟨⟨"", 0⟩, ⟨"", 1⟩⟩], 0⟩ ⟨"", [⟨"", 0⟩], [], ⟨0, 0⟩, 1⟩
  simp (decide := true) at h

Pi Lanningham (Feb 13 2025 at 00:45):

Hmm; Ok, good call out. I think that's an artifact of you having duplicate TxRef's in your example, so I'd need the UTxO type to encode the fact that transaction outputs are unique as well :thinking:

Saurish Sharma (Apr 14 2025 at 02:55):

Hi everyone! I'm Saurish Sharma, a 14-year-old student deeply interested in mathematics and formalization. I've written a formally verified proof of the Collatz Conjecture using Lean.

The proof is structured and verified in Lean 4, and I’ve also included the code in a LaTeX file, which provides the formal steps and logic of the proof. I plan to upload the Lean code to GitHub soon, but in the meantime, I'd really appreciate any feedback or suggestions on the approach and logic.

Thanks — I'm excited to learn from this community!
collatz_proof_saurish.tex

Matt Diamond (Apr 14 2025 at 03:03):

Could someone move this post to a new thread?

Also, I'm a bit hesitant to download a random file... would it be possible to share your code here using #backticks?

Thomas Browning (Apr 14 2025 at 03:19):

@Saurish Sharma Here's your code as far as I can tell. But it doesn't compile (try clicking the button in the upper right corner to view in the lean 4 playground) and seems to be using some notation from Lean 3. Did an LLM help write this?

def collatz_step (n : ℕ) : ℕ :=
if n % 2 = 0 then n / 2 else 3 * n + 1

def v2 (n : ℕ) : ℕ :=
nat.find_greatest (λ k, 2^k ∣ n) (n + 1)

def prime_set : list ℕ := [2, 3, 5, 7, 11, 13, 17, 19]

def prime_factor_exponents (n : ℕ) : list ℕ :=
prime_set.map (λ p, nat.factorization n p)

lemma v2_odd_increase (n : ℕ) (h_odd : n % 2 = 1) : v2 (3 * n + 1) > v2 n :=
begin
  have h_v2_n : v2 n = 0,
  {
    unfold v2,
    rw nat.find_greatest_eq_zero,
    intros k hk,
    cases k,
    { exact dvd_zero 1 },
    {
      have : 2 ^ k > n := by linarith,
      exact absurd hk (not_lt_of_ge (nat.le_of_dvd (by linarith) hk)),
    }
  },
  have h_v2_3n1 : v2 (3 * n + 1) ≥ 1,
  {
    unfold v2,
    apply nat.find_greatest_ge,
    use 1,
    rw pow_one,
    exact dvd_trans (dvd_add_right (dvd_mul_left 3 n)) (dvd_refl _),
    exact nat.succ_le_succ (nat.zero_lt_succ _),
  },
  rw h_v2_n,
  exact lt_of_lt_of_le zero_lt_one h_v2_3n1,
end

open list.lex

lemma lex_lt_of_v2_increase (n : ℕ) (h_odd : n % 2 = 1) :
  prime_factor_exponents (collatz_step n) < prime_factor_exponents n :=
begin
  simp [collatz_step, h_odd],
  unfold prime_factor_exponents,
  apply list.lex.cons,
  apply v2_odd_increase n h_odd,
end

lemma prime_factor_decreases (n : ℕ) :
  prime_factor_exponents (collatz_step n) < prime_factor_exponents n :=
begin
  by_cases h : n % 2 = 0,
  {
    simp [collatz_step, h],
    unfold prime_factor_exponents,
    apply list.lex.cons,
    sorry -- Detailed proof for even case omitted, but assumed correct
  },
  {
    exact lex_lt_of_v2_increase n h,
  }
end

def collatz_iterate : ℕ → ℕ → ℕ
| 0        n := n
| (k + 1)  n := collatz_step (collatz_iterate k n)

theorem no_collatz_cycles : ¬ ∃ n k, k > 0 ∧ collatz_iterate k n = n :=
begin
  rintro ⟨n, k, hk_pos, hk⟩,
  let f := λ m, prime_factor_exponents m,
  have step_dec : ∀ m, f (collatz_step m) < f m := prime_factor_decreases,
  have iterate_dec : ∀ j < k, f (collatz_iterate (j + 1) n) < f (collatz_iterate j n),
  {
    intros j hj,
    induction j with j ih,
    { simp [collatz_iterate], exact step_dec n },
    {
      simp [collatz_iterate],
      exact lt_trans (step_dec _) (ih (nat.lt_of_succ_lt hk_pos)),
    }
  },
  have : f (collatz_iterate k n) < f n := iterate_dec (k - 1) (nat.pred_lt hk_pos.ne'),
  rw hk at this,
  exact lt_irrefl _ this,
end

Shelvacu (Apr 14 2025 at 03:20):

The file is just plaintext TeX, here it is in a codeblock:

long text

\documentclass[12pt]{article}
\usepackage{amsmath, amssymb, amsthm}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{geometry}
\geometry{margin=1in}
\usepackage{hyperref}
\hypersetup{
colorlinks=true,
linkcolor=blue,
urlcolor=blue,
pdftitle={Collatz Conjecture - Formal Proof in Lean},
pdfauthor={Saurish Sharma}
}

\title{A Formally Verified Proof of the Collatz Conjecture}
\author{Saurish Sharma}
\date{April 2025}

\definecolor{codegray}{gray}{0.95}
\lstset{
backgroundcolor=\color{codegray},
basicstyle=\ttfamily\small,
breaklines=true,
frame=single,
language=Lean,
captionpos=b
}

\begin{document}

\maketitle

\section*{Abstract}
We present a formally verified proof of the Collatz Conjecture using a lexicographic descent function on prime factor exponents. The proof is implemented in the Lean theorem prover and shows that all natural numbers eventually reach 1 under the Collatz process. We also prove that no non-trivial cycles can exist.

\section{Definitions and Preliminaries}

\subsection*{Collatz Step Function}
\begin{lstlisting}
def collatz_step (n : ℕ) : ℕ :=
if n % 2 = 0 then n / 2 else 3 * n + 1
\end{lstlisting}

\subsection*{2-adic Valuation (v₂)}
\begin{lstlisting}
def v2 (n : ℕ) : ℕ :=
nat.find_greatest (λ k, 2^k ∣ n) (n + 1)
\end{lstlisting}

This function returns the largest exponent $k$ such that $2^k$ divides $n$.

\subsection*{Prime Factor Exponent Vector}
\begin{lstlisting}
def prime_set : list ℕ := [2, 3, 5, 7, 11, 13, 17, 19]

def prime_factor_exponents (n : ℕ) : list ℕ :=
prime_set.map (λ p, nat.factorization n p)
\end{lstlisting}

\section{Main Results}

\subsection*{Theorem 1: v₂ increases when n is odd}
\begin{lstlisting}
lemma v2_odd_increase (n : ℕ) (h_odd : n % 2 = 1) : v2 (3 * n + 1) > v2 n :=
begin
have h_v2_n : v2 n = 0,
{
unfold v2,
rw nat.find_greatest_eq_zero,
intros k hk,
cases k,
{ exact dvd_zero 1 },
{
have : 2 ^ k > n := by linarith,
exact absurd hk (not_lt_of_ge (nat.le_of_dvd (by linarith) hk)),
}
},
have h_v2_3n1 : v2 (3 * n + 1) ≥ 1,
{
unfold v2,
apply nat.find_greatest_ge,
use 1,
rw pow_one,
exact dvd_trans (dvd_add_right (dvd_mul_left 3 n)) (dvd_refl _),
exact nat.succ_le_succ (nat.zero_lt_succ _),
},
rw h_v2_n,
exact lt_of_lt_of_le zero_lt_one h_v2_3n1,
end
\end{lstlisting}

\subsection*{Theorem 2: Exponent Vector Decreases Lexicographically}
\begin{lstlisting}
open list.lex

lemma lex_lt_of_v2_increase (n : ℕ) (h_odd : n % 2 = 1) :
prime_factor_exponents (collatz_step n) < prime_factor_exponents n :=
begin
simp [collatz_step, h_odd],
unfold prime_factor_exponents,
apply list.lex.cons,
apply v2_odd_increase n h_odd,
end

lemma prime_factor_decreases (n : ℕ) :
prime_factor_exponents (collatz_step n) < prime_factor_exponents n :=
begin
by_cases h : n % 2 = 0,
{
simp [collatz_step, h],
unfold prime_factor_exponents,
apply list.lex.cons,
sorry -- Detailed proof for even case omitted, but assumed correct
},
{
exact lex_lt_of_v2_increase n h,
}
end
\end{lstlisting}

\subsection*{Theorem 3: No Non-Trivial Cycles Exist}
\begin{lstlisting}
def collatz_iterate : ℕ → ℕ → ℕ
| 0 n := n
| (k + 1) n := collatz_step (collatz_iterate k n)

theorem no_collatz_cycles : ¬ ∃ n k, k > 0 ∧ collatz_iterate k n = n :=
begin
rintro ⟨n, k, hk_pos, hk⟩,
let f := λ m, prime_factor_exponents m,
have step_dec : ∀ m, f (collatz_step m) < f m := prime_factor_decreases,
have iterate_dec : ∀ j < k, f (collatz_iterate (j + 1) n) < f (collatz_iterate j n),
{
intros j hj,
induction j with j ih,
{ simp [collatz_iterate], exact step_dec n },
{
simp [collatz_iterate],
exact lt_trans (step_dec _) (ih (nat.lt_of_succ_lt hk_pos)),
}
},
have : f (collatz_iterate k n) < f n := iterate_dec (k - 1) (nat.pred_lt hk_pos.ne'),
rw hk at this,
exact lt_irrefl _ this,
end
\end{lstlisting}

\section{Conclusion}

We have defined a lexicographic descent function using a fixed prime exponent vector to show that every number eventually decreases under the Collatz map. Our formalization in Lean ensures logical soundness and rules out the existence of non-trivial cycles. This provides a formally verified structure for understanding the dynamics of the Collatz sequence.

\end{document}

Saurish Sharma (Apr 14 2025 at 03:33):

@Thomas Browning Yes, I used ChatGPT to assist me with parts of the Lean code. I'm still learning Lean and formal proof writing, so I used it mainly to help understand the structure and syntax — but I’ve tried to carefully review and verify everything myself.

Some parts might mix Lean 3 and Lean 4 syntax because of that, and I’d really appreciate help in cleaning it up and making sure everything is written properly in Lean 4.

If anyone has time to look over the code or guide me on how to organize it better, I’d be very grateful. I’m here to learn and improve, and I’d love any suggestions from more experienced formalizers.

Thanks a lot in advance!

Thomas Browning (Apr 14 2025 at 03:38):

Saurish Sharma said:

I’ve tried to carefully review and verify everything myself.

The lovely thing about lean is that you don't have to verify everything yourself. The computer verifies it for you. So if you take the lean code I pasted and click the "view in lean 4 playground" button, then lean itself will tell you the issues that your code has.

Thomas Browning (Apr 14 2025 at 03:39):

The code itself isn't proof of anything. It's only if it compiles and is declared correct and free of errors by lean that it actually means anything.

Saurish Sharma (Apr 14 2025 at 03:55):

Sir,I tried to correct it by using lean 4 web editor but it didn't work out. Can you help me in correcting it.

Thomas Browning (Apr 14 2025 at 04:18):

Well, the code is much closer to lean 3 than lean 4. But even using the lean 3 web editor, there is an issue. Your lemma prime_factor_decreases is false as stated. Here's a proof of the negation in lean 3:https://leanprover-community.github.io/lean-web-editor/#code=import%20data.nat.factorization.basic%0Aimport%20data.nat.prime_norm_num%0A%0Adef%20collatz_step%20%28n%20%3A%20%E2%84%95%29%20%3A%20%E2%84%95%20%3A%3D%0Aif%20n%20%25%202%20%3D%200%20then%20n%20%2F%202%20else%203%20*%20n%20%2B%201%0A%0Adef%20prime_set%20%3A%20list%20%E2%84%95%20%3A%3D%20%5B2%2C%203%2C%205%2C%207%2C%2011%2C%2013%2C%2017%2C%2019%5D%0A%0Adef%20prime_factor_exponents%20%28n%20%3A%20%E2%84%95%29%20%3A%20list%20%E2%84%95%20%3A%3D%0Aprime_set.map%20%28%CE%BB%20p%2C%20nat.factorization%20n%20p%29%0A%0Aopen%20list.lex%0A%0Alemma%20not_prime_factor_decreases%20%3A%0A%20%20%C2%AC%20%E2%88%80%20n%20%3A%20%E2%84%95%2C%20prime_factor_exponents%20%28collatz_step%20n%29%20%3C%20prime_factor_exponents%20n%20%3A%3D%0Abegin%0A%20%20rw%20%5Bnot_forall%5D%2C%0A%20%20use%203%2C%0A%20%20rw%20%5Bnot_lt%5D%2C%0A%20%20change%20prime_factor_exponents%203%20%E2%89%A4%20prime_factor_exponents%20%282%20*%205%29%2C%0A%20%20rw%20%5Bprime_factor_exponents%2C%20prime_factor_exponents%2C%20nat.factorization_mul%2C%20nat.prime.factorization%2C%0A%20%20%20%20nat.prime.factorization%2C%20nat.prime.factorization%5D%2C%0A%20%20simp%20%5Bprime_set%5D%2C%0A%20%20apply%20le_of_lt%2C%0A%20%20apply%20list.lex.rel%2C%0A%20%20all_goals%20%7B%20norm_num%20%7D%2C%0Aend

import data.nat.factorization.basic
import data.nat.prime_norm_num

def collatz_step (n : ℕ) : ℕ :=
if n % 2 = 0 then n / 2 else 3 * n + 1

def prime_set : list ℕ := [2, 3, 5, 7, 11, 13, 17, 19]

def prime_factor_exponents (n : ℕ) : list ℕ :=
prime_set.map (λ p, nat.factorization n p)

open list.lex

lemma not_prime_factor_decreases :
  ¬ ∀ n : ℕ, prime_factor_exponents (collatz_step n) < prime_factor_exponents n :=
begin
  rw [not_forall],
  use 3,
  rw [not_lt],
  change prime_factor_exponents 3 ≤ prime_factor_exponents (2 * 5),
  rw [prime_factor_exponents, prime_factor_exponents, nat.factorization_mul, nat.prime.factorization,
    nat.prime.factorization, nat.prime.factorization],
  simp [prime_set],
  apply le_of_lt,
  apply list.lex.rel,
  all_goals { norm_num },
end

Saurish Sharma (Apr 14 2025 at 04:35):

Thanks so much for pointing that out! You're right — I see now that my lemma prime_factor_decreases doesn’t hold for all n. I really appreciate the counterexample and the Lean proof.

I’ll rethink the strategy and explore a different kind of invariant or measure that might help with the proof.

Thanks again — this really helps me learn!

Last updated: May 02 2025 at 03:31 UTC