Zulip Chat Archive

Stream: lean4

Topic: Treating `Float` as reals, inconsistent?


Tomas Skrivan (Jan 01 2022 at 22:19):

I would like to extract runnable code from Lean code that deals with real numbers. A simple way of doing this is to just assume the axioms of real numbers on top of Float. I'm curious whether this can lead to inconsistencies. Can you prove False from it?

Here is a partial set of those axioms. I was too lazy to write them all down, but feel free to add any of them if you can use it to prove False.

abbrev ℝ := Float

-- Axioms of a field
axiom add_assoc : ∀ x y z : ℝ, (x + y) + z = x + (y + z)
-- ...

-- Total order
axiom self_leq : ∀ x : ℝ, x ≤ x
-- ...

-- Compatibility of operations and inequality
axiom leq_add : ∀ x y z : ℝ, x ≤ y → x + z ≤ y + z
-- ...

-- Axiomatic definition of supremum
def upper_bound (A : ℝ → Prop) (h : ℝ) : Prop :=
  ∀ (x : ℝ), A x → x ≤ h

def bounded (A : ℝ → Prop) : Prop :=
  ∃ (h : ℝ), upper_bound A h

def supremum (A : ℝ → Prop) (s : ℝ) : Prop :=
  upper_bound A s
  ∧
  ∀ (h : ℝ), upper_bound A h → s ≤ h

constant sup : {A // bounded A} → ℝ
axiom real_sup : ∀ A, (h : bounded A) → supremum A (sup ⟨A, h⟩)


-- Axiomatic definition of some functions
def Float.abs (x : ℝ) := if x ≥ 0 then x else -x
axiom sqrt_sqr : ∀ (x : ℝ), Float.sqrt (x*x) = x.abs

axiom exp_of_zero : Float.exp 0 = 1
-- This would require a definition of the derivative
-- axiom der_exp : der Float.exp = Float.exp


-- Challenge: can you prove False?
example : False := sorry

David Wärn (Jan 01 2022 at 22:32):

Why not write a tactic which takes an expression representing a real and outputs a float? Just give it a dictionary like real.exp -> Float.exp, real.pi -> 3.14 etc.

Tomas Skrivan (Jan 01 2022 at 22:41):

How would you deal with something like Array real?

Patrick Stevens (Jan 01 2022 at 22:53):

Surely nan <= nan is false?

Tomas Skrivan (Jan 01 2022 at 23:03):

Sure it is, but give me a Lean proof of it.

Henrik Böving (Jan 01 2022 at 23:04):

Assuming that NaN is somehow exposed in the API, this should be possible, since less-equal is decidable for Float.

Henrik Böving (Jan 01 2022 at 23:05):

And if NaN is not exposed in the API yet I guess it will be only a matter of time until it is.
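Presumably something along these lines would do it, producing a NaN via 0.0 / 0.0 and deciding the comparison by evaluating compiled code (it has to go through nativeDecide, since the Float comparison instances are extern and the kernel cannot reduce them):

-- sketch: 0.0 / 0.0 evaluates to NaN at runtime, and NaN ≤ NaN is false
example : ¬ ((0.0 / 0.0 : Float) ≤ 0.0 / 0.0) := by nativeDecide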

Kevin Buzzard (Jan 01 2022 at 23:22):

Can it be proved that there's a surjection from nat to float? Because with the axioms defining the reals you can prove there's no such surjection

Joe Hendrix (Jan 01 2022 at 23:33):

Can floating point operations be evaluated? Floating point addition is not associative, so yes I think this could be used to prove false if floating point operations can be evaluated.
If this is just for code generation though, I'm not sure that it matters. Just don't use the axioms in files where you care about verification.

Henrik Böving (Jan 01 2022 at 23:56):

Evaluation is actually not an issue since the add function is hidden behind a constant so it is not possible to prove anything about it, the only computable things about float are the order relations.

Henrik Böving (Jan 02 2022 at 00:23):

As a general side remark, if it actually does turn out that treating Float as reals is an issue, we could, instead of using double from C, just bind to an arbitrary-precision library and let that one do the correct calculations for us, right? That way we could still evaluate computations on reals properly.

Mac (Jan 02 2022 at 06:05):

Henrik Böving said:

Evaluation is actually not an issue since the add function is hidden behind a constant so it is not possible to prove anything about it, the only computable things about float are the order relations.

This is not necessarily true. You can use nativeDecide to prove things about opaque (constant) definitions based on their evaluation.
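For instance, a small illustration (just the classic 0.1 + 0.2 rounding example, stated via the Bool-valued ==):

-- the compiled code evaluates the opaque Float operations,
-- so nativeDecide can decide facts the kernel cannot reduce
example : ((0.1 + 0.2 : Float) == 0.3) = false := by nativeDecide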

Tomas Skrivan (Jan 02 2022 at 08:45):

Mac said:

This is not necessarily true. You can use nativeDecide to prove things about opaque (constant) definitions based on their evaluation.

Great! The tactic nativeDecide answers my question: you can prove False when assuming:

axiom leq_antisymmetry : ∀ x y : ℝ, x ≤ y → y ≤ x → x = y
axiom leq_totality     : ∀ x y : ℝ, x ≤ y ∨ y ≤ x
axiom simplify1 : ∀ x y : ℝ, (x + y) - x = y

Here is the full proof:

abbrev ℝ := Float

axiom leq_antisymmetry : ∀ x y : ℝ, x ≤ y → y ≤ x → x = y
axiom leq_totality     : ∀ x y : ℝ, x ≤ y ∨ y ≤ x
axiom simplify1 : ∀ x y : ℝ, (x + y) - x = y

def dec (P : Prop) [d : Decidable P] : Decidable P := d

open Decidable in
instance (x y : ℝ) : Decidable (x = y) :=
match dec (x ≤ y), dec (y ≤ x) with
| isTrue hxy, isTrue hyx => isTrue (leq_antisymmetry x y hxy hyx)
| isFalse hxy, isTrue hyx =>
  isFalse (by intro xy; apply hxy; rw[xy] at hyx; rw[xy]; assumption )
| isTrue hxy, isFalse hyx =>
  isFalse (by intro xy; apply hyx; rw[xy] at hxy; rw[xy]; assumption )
| isFalse hxy, isFalse hyx =>
  isFalse (by induction (leq_totality x y)
              intro xy; apply hxy; assumption
              intro xy; apply hyx; assumption)

def x : Float := 1.0
def n : Float := 10000000000000000
def y : Float := x/n

theorem hoho : x ≠ y := by nativeDecide

theorem foo1 : (x + y) - x = 0.0 := by nativeDecide
theorem foo2 : (x + y) - x = y := simplify1 _ _

theorem inconsistency : False :=
by
  have h : 0.0 ≠ y := by nativeDecide
  apply h; rw[← foo1, foo2]

Mac (Jan 02 2022 at 08:49):

One thing to note is that nativeDecide does rely on two axioms ofReduceBool/ofReduceNat (see here) so you can avoid these inconsistencies if you ban theorems with those axioms.

Tomas Skrivan (Jan 02 2022 at 08:51):

Mac said:

One thing to note is that nativeDecide does rely on two axioms ofReduceBool/ofReduceNat (see here) so you can avoid these inconsistencies if you ban theorems with those axioms.

So if I ban these axioms, am I good to assume the axioms of real numbers for the Float type?

Mac (Jan 02 2022 at 08:58):

@Tomas Skrivan looking at the code for Float, it doesn't seem to me like it has any notable properties outside its native code, so you can probably safely assume just about anything about it outside that realm.

Tomas Skrivan (Jan 02 2022 at 08:59):

And how do I ban axioms?

Mac (Jan 02 2022 at 09:04):

@Tomas Skrivan you can use #print axioms <theorem_ident> to view the axioms used in a theorem. For example,

#print axioms inconsistency
/- 'inconsistency' depends on axioms: [leq_antisymmetry, leq_totality, Lean.ofReduceBool, simplify1] -/

As such, check that the theorems you prove do not use ofReduceBool, to ensure they do not depend on the possible native inconsistencies. With some metaprogramming work you could also probably automate this check (e.g., by creating an attribute that checks that the tagged theorem does not use a given axiom).

Gabriel Ebner (Jan 02 2022 at 12:25):

A simple way of doing this is to just assume axioms of real numbers on top of Float.

You shouldn't do this because it is a wrong model of floating-point numbers, and doesn't allow you to prove any useful properties about floating-point algorithms. For example, addition is not associative so you need to sum floating-point numbers carefully to avoid cancellation errors. And you want to be able to obtain bounds on the errors here. In general, a good rule of thumb is to assume that every floating-point operation introduces a small error, and the specification only guarantees a bound on this error. If you assume that Float = ℝ, then you cannot prove anything about the errors at all (since they are all zero). So you'll have algorithms that will fail even though you've proven them correct.
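(For reference, the standard rounding model from numerical analysis makes this precise: ignoring overflow and underflow, each basic operation returns fl(x op y) = (x op y)(1 + δ) with |δ| ≤ u, where u is the unit roundoff, 2^-53 for double precision. Error analyses of summation, dot products, and the like are built on exactly this kind of per-operation bound.)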

Gabriel Ebner (Jan 02 2022 at 12:30):

As others have already observed, you can consistently assume that Float = ℝ as long as you ignore the VM implementation (i.e., avoid nativeDecide and friends). The reason is simply that Float and its operations are defined as constants, and there are no axioms about floating-point numbers (in core Lean) which would be false in the real numbers.

Tomas Skrivan (Jan 02 2022 at 13:18):

Gabriel Ebner said:

A simple way of doing this is to just assume axioms of real numbers on top of Float.

You shouldn't do this because it is a wrong model of floating-point numbers, and doesn't allow you to prove any useful properties about floating-point algorithms. For example, addition is not associative so you need to sum floating-point numbers carefully to avoid cancellation errors. And you want to be able to obtain bounds on the errors here. In general, a good rule of thumb is to assume that every floating-point operation introduces a small error, and the specification only guarantees a bound on this error. If you assume that Float = ℝ, then you cannot prove anything about the errors at all (since they are all zero). So you'll have algorithms that will fail even though you've proven them correct.

This would be great if it works, but please show me some useful error bounds for Conjugate Gradient or GMRES, the type of algorithms I care about, using this technique. Until then, I'm going to prove stuff for an idealized machine that can do arithmetic on real numbers and follow best practices to avoid rounding errors.

Tomas Skrivan (Jan 02 2022 at 13:40):

What you are suggesting is basically interval arithmetic. I stumbled onto a recent paper that even proves a certain impossibility result for interval arithmetic when using neural networks for classification. (I didn't read the paper, but the abstract seems to be pretty clear.)

Tomas Skrivan (Jan 02 2022 at 13:47):

I'm trying to build something like certigrad and I want to take a similar approach to floats, see Daniel's talk.

Gabriel Ebner (Jan 02 2022 at 14:00):

please show me some useful error bounds for Conjugate Gradient

Conjugate gradient is an iterative method iirc, so the error bound for the result is trivial (convergence is the hard part and can fail in floating-point from what I understand).

Gabriel Ebner (Jan 02 2022 at 14:16):

What you are suggesting is basically interval arithmetic.

Not completely; some floating-point operations have precise specifications (such as the sign function, or comparisons). There it is perfectly fine to make a case distinction instead of error bounds. Interval arithmetic is much more limited in this regard: if you have an x that is close to zero, then interval arithmetic can only conclude that -1 ≤ sgn x ≤ 1.

Tomas Skrivan (Jan 02 2022 at 14:21):

In infinite precision, CG gives you the exact answer in at most n steps. In this sense, it is a direct method. However, because you can stop after fewer steps and still get a good approximation, people think of CG as an iterative method. In finite arithmetic, the convergence is delayed.

I studied this a long time ago, so I do not remember much. A quick Google search directed me to this article; to quote from it:

As it turns out, nobody has actually ever proved that any of the Conjugate Variant methods used in practice actually satisfy these conditions. Any proof that a given method satisfies these properties would constitute a significant theoretical advancement in the understanding of the conjugate gradient algorithm in finite precision. Similarly, finding weaker conditions which provide some sort of convergence guarantees for a finite precision CG implementation would also be of significant importance.

My point is that proving robustness in finite precision is so difficult that people can't even do it satisfactorily with pen and paper for an algorithm that is 13 lines of code! Therefore, attempting it in Lean is just a lost battle.

Gabriel Ebner (Jan 02 2022 at 14:22):

I'm going to use Lean to prove stuff for idealized [numbers]

One way to make this approach more principled would be to use a typeclass, make the functions polymorphic, and then use Real in proofs and Float when executing. That is something like this:

class FpOps α extends Add α, Mul α, .... where
  sin : α → α
  cos : α → α
  ...

def myFun [FpOps α] (x y : α) : α := ....
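A minimal, self-contained sketch of this pattern (the names FpOps', norm2, and the choice of operations are purely illustrative, not an actual API):

class FpOps' (α : Type) extends Add α, Mul α where
  sqrt : α → α

-- generic code is written once, against the typeclass ...
def norm2 [FpOps' α] (x y : α) : α :=
  FpOps'.sqrt (x * x + y * y)

-- ... and instantiated with Float for execution
-- (the parent instances are filled in explicitly here)
instance : FpOps' Float where
  toAdd := inferInstance
  toMul := inferInstance
  sqrt  := Float.sqrt

#eval norm2 (3.0 : Float) 4.0  -- 5.000000

A (noncomputable) instance for Real would then be used in the proofs instead.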

Tomas Skrivan (Jan 02 2022 at 14:25):

Gabriel Ebner said:

I'm going to use Lean to prove stuff for idealized [numbers]

One way to make this approach more principled would be to use a typeclass, make the functions polymorphic, and then use Real in proofs and Float when executing. That is something like this:

class FpOps α extends Add α, Mul α, .... where
  sin : α → α
  cos : α → α
  ...

def myFun [FpOps α] (x y : α) : α := ....

Yes, this is an alternative approach that I have to think about a bit more. However, I find it a bit too verbose. A somewhat longer counterargument is here, as we already discussed it in a different thread.

Tomas Skrivan (Jan 02 2022 at 14:28):

Therefore, I'm not entirely sure what I'm gaining by doing so. By using Float = ℝ I lose nativeDecide, but I have never used nativeDecide or felt the need to use it. Any other reason for using the more verbose approach?

Gabriel Ebner (Jan 02 2022 at 14:46):

In the other thread several people have suggested the polymorphic approach as well. I really think that's the best option here if you don't want to do floating-point verification (which is indeed hairy).

Gabriel Ebner (Jan 02 2022 at 14:48):

Any other reason for using the more verbose approach?

Here is a fun demo of what you can do if you postulate the wrong axioms for Float:

axiom A (x : Float) : x + 1 - x = 1

def foo (x : Float) : Nat :=
  if h1 : 1 == (1 : Float) then
    if h2 : x + 1 - x == 1 then
      1
    else
      have h : False := by rw [A] at h2; contradiction
      -- look, now we can safely cast the float to a string :-/
      let x' : String := cast h.elim x
      x'.length
  else
    3

#eval foo (2 ^ 60 : Nat).toFloat -- casts a float to a string

Tomas Skrivan (Jan 02 2022 at 15:16):

Exploiting the fact that at runtime you can reach a provably unreachable branch, and thus that the ability to prove False opens a wormhole to a different dimension, does not follow best practices in my book :) You need to give me a better example to convince me.

Mac (Jan 02 2022 at 19:04):

@Tomas Skrivan even after reading the previous thread you linked, what you are doing here still makes very little sense to me. If you want to prove things about an algorithm assuming that the values are reals, why not just do that proof on a version of the algorithm using reals? Alternatively, if you want to prove properties about an algorithm using floats, you need to model floats. Otherwise, you are not really proving anything about the actual algorithm. This is especially true as many floating point algorithms are useful (i.e., have the interesting properties they have) precisely because they use a fixed-length approximation instead of reals.

Tomas Skrivan (Jan 02 2022 at 19:35):

If you have an algorithm defined with reals, how do you actually generate runnable code where you replace reals with floats?

Joe Hendrix (Jan 02 2022 at 20:42):

Is this a Lean 3 question? Where is your Lean 4 theory of reals defined?

Tomas Skrivan (Jan 03 2022 at 07:18):

I'm using Lean 4 and currently I'm treating reals purely axiomatically until mathlib gets ported.

Joe Hendrix (Jan 03 2022 at 07:37):

Since you seem to primarily care about execution, why are you not just using the Lean 4 floating point numbers?

Henrik Böving (Jan 03 2022 at 07:39):

He just does def Real := Float and then assumes the stuff necessary about his Real for his work.

Christian Pehle (Jan 03 2022 at 14:43):

@Tomas Skrivan it might make sense to look into bindings to arb (https://arblib.org/using.html); as far as I've understood, the arithmetic used there produces "correct", albeit potentially worse than floating point, error bounds. I've been looking into building an autograd library in Lean as well, btw, but more towards the practical side of things, not so much focused on provability.

Tomas Skrivan (Jan 03 2022 at 15:12):

@Christian Pehle I will have a look into arb, it looks interesting. However, I'm more interested in speed than provable correctness, so I will definitely be using single or double floats.

We probably have a similar goal: I'm not so much interested in proving correctness as in using dependent types to help me do the math behind scientific computing; the prime example is automatic/symbolic differentiation.

I think I'm pretty far along with automatic differentiation and I would like to make it public in a few weeks or months. Currently, the best thing I'm capable of is automatically deriving the discrete Laplacian from the discrete Dirichlet energy. An example of a theorem I can prove with autograd:

example {n} (g : Fin n → ℝ) [NonZero n]
  :
    ∇ (λ (f : Fin n → ℝ) => ∑ i, ∥(f (i + 1) - f i)∥^2) g
    =
    (λ i => (2 : ℝ) * (g (i - 1 + 1) - g (i - 1) - (g (i + 1) - g i)))
  :=
by autograd; funext i; simp; done

I really want the ring tactic in Lean 4; then the result can be the nicer 2*(g i) - g (i - 1) - g (i + 1). Now I'm trying to get the continuous version working: ∇ (λ (f : (ℝ ⟿ ℝ)) => (∫ t, ∥∇ f t∥^2)) g = - 2 * ∇ (∇ g).

Then I'm planning on doing forward and reverse mode differentiation. I already have the basic definitions and the core theorem/rewrite rule proven: for the forward mode operator 𝓣, 𝓣 (f ∘ g) = (𝓣 f ∘ 𝓣 g), and for the reverse mode operator 𝓑, 𝓑 (f ∘ g) = (𝓑 f • 𝓑 g), where • composes the functions normally but their gradients in the reverse direction.

Christian Pehle (Jan 03 2022 at 15:21):

Yeah, I would definitely be interested in helping out, although this would be a side project for me and not research. I've followed your progress a bit (based on the questions you asked) and it seems like a cool approach. I got a basic C++ backend working (currently PyTorch based) and looked into various ways of deriving autograd code based on the IR, roughly like Jax / Dex does it. It would take a ton of work to make this fully functional, though, but I also strongly believe that Lean could be a really nice language for scientific computing.

Joe Hendrix (Jan 03 2022 at 15:57):

Clearly, assuming Float and Real are equivalent is going to let one prove false, but on the other hand, my impression is that people make similar transformations on floating point code all the time.

Tomas Skrivan (Jan 03 2022 at 16:23):

Joe Hendrix said:

Clearly, assuming Float and Real are equivalent is going to let one prove false, but on the other hand, my impression is that people make similar transformations on floating point code all the time.

You can prove False when using the axiom Lean.reduceBool. If not, the current consensus is that you can't do it. The question is really subtle and depends on the details of how Lean handles Float.

Joe Hendrix (Jan 03 2022 at 17:58):

What is subtle? Are you proposing removing the reduceBool axiom to accommodate your work?

Tomas Skrivan (Jan 03 2022 at 19:22):

By subtle I mean that from all those discussions it looks like there is a difference between how people think Lean handles floats and how it actually handles floats. Proving anything about floats, like 1.0 + 0.1 = 1.1, requires the reduceBool axiom (or reduceNat); proving 1.0 + 0.1 = 1.1 is not as straightforward as one might naively think. I'm definitely not for removing those axioms, but you can check what axioms are used. Thus, in my work I will make sure that I do not use these axioms, or at least not in conjunction with the axiomatic definition of reals over floats.
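For example, a small illustration with exactly representable values (so the stated equality genuinely holds):

example : ((1.0 + 0.5 : Float) == 1.5) = true := by nativeDecide
-- #print axioms for this example lists Lean.ofReduceBool;
-- as far as I can tell, plain `decide` gets stuck here because the kernel
-- cannot reduce the opaque Float operations.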

Chris Lovett (Jan 03 2022 at 22:04):

It is interesting to see how Intel uses formal theorem proving on their floating point implementations: https://www.cl.cam.ac.uk/~jrh13/slides/jnao-02jun06/slides.pdf - might be fun to port those proofs to Lean. And then if you also want to be able to deal with bit manipulations: https://dl.acm.org/doi/pdf/10.1145/2980983.2908107


Last updated: Dec 20 2023 at 11:08 UTC