Zulip Chat Archive

Stream: lean4

Topic: IEEE floats


Kevin Buzzard (Feb 20 2022 at 13:39):

"Does Lean 4 have IEEE floats", I am asked by someone in my department (Sheehan Olver). Further questions seem to indicate that they want to make statements of the form "if I know various facts about f:RRf:\R\to \R then I can prove that f(x)f'(x) is within ϵ\epsilon of F([x]+[h])/[h] where F : float -> float is float -> real -(f)-> real -> float, and [x] is the float associated to real x. For this one seems to need maps between float and real in both directions, plus some statements about error bounds when e.g. moving from real to float to real, and when comparing float addition to real addition (and similar for subtraction and multiplication and whatever else is defined on float). This all must be basic stuff in (some theory or other whose name I don't know). I am dimly aware that Lean 4 has floats, but am also very aware that I don't really know what I'm talking about. On the other hand I can just pretend that mathlib is ported to Lean 4 and then it has reals and these I understand.

What work is needed to rigorously prove basic things about floats? Is it a huge project which will involve new ideas, or just a case of porting some library which has been done 100 times over already in other systems? Will it be a huge amount of work? Is it on anyone's radar?
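For illustration only, the statement described above might be phrased roughly as follows, pretending (as Kevin says) that mathlib's reals are available. Neither Float.toReal nor Float.ofReal exists in Lean 4 today, so both are hypothetical placeholders, and finiteDiffClaim is a made-up name:

import Mathlib  -- assumed: mathlib ported, providing ℝ and deriv

-- Hypothetical round-trip maps between Float and ℝ; nothing like these
-- exists in core Lean 4 at the moment.
variable (toReal : Float → ℝ) (ofReal : ℝ → Float)

/-- Illustrative shape only: the finite-difference quotient computed in
Float is within ε of the true derivative of f at x. -/
def finiteDiffClaim (f : ℝ → ℝ) (F : Float → Float) (x h ε : ℝ) : Prop :=
  |deriv f x - toReal ((F (ofReal (x + h)) - F (ofReal x)) / ofReal h)| ≤ ε

Proving anything of this shape needs exactly the ingredients listed in the question: error bounds for the ofReal/toReal round trips and for each Float operation against its real counterpart.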

Sheehan Olver (Feb 20 2022 at 13:47):

FYI here’s a Jupyter notebook of the kind of results I’d like to verify:

https://nbviewer.org/github/Imperial-MATH50003/MATH50003NumericalAnalysis/blob/main/notebooks/Differentiation.ipynb

Henrik Böving (Feb 20 2022 at 13:48):

We do have floats here src#Float, however right now they are designed as an opaque data type that, via some @extern declarations on its operations, effectively links to the C implementation of IEEE floats; there is absolutely no logic about the actual IEEE behaviour implemented in Lean.
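For instance, the operations evaluate fine (through the compiled C implementation), but there is nothing to reason with beyond that; a quick illustration using only core Lean 4:

#eval (0.1 + 0.2 : Float)   -- computed by the C double addition
#eval Float.sqrt 2.0        -- ≈ 1.4142135623730951
#eval (1.0 / 0.0 : Float)   -- IEEE infinity
-- There are no lemmas connecting these operations to Real, and the opaque
-- @extern definitions by themselves give you nothing to prove with.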

There has been some discussion on the topic of integrating Floats with Reals in Lean; lots of links can be found here: https://leanprover.zulipchat.com/#narrow/stream/113489-new-members/topic/Real.20to.20Float.20and.20computable.20numbers/near/266659575 It appears some work along these lines has been done both in Lean and in other theorem provers, but I'm not aware of any implementation of it in Lean 4 right now.

Henrik Böving (Feb 20 2022 at 13:52):

Also since the question seems to be of numeric nature @Tomas Skrivan might be able to help further here, he's done a bit of work in that direction already.

Tomas Skrivan (Feb 20 2022 at 15:39):

Unfortunately, I know very little about proving facts about floats. I'm mostly interested in using Lean/dependent types for (semi)automatic code transformation and I don't have a good idea how to do that with regards to floats. So I did not spend much time studying this problem.

A quick Google search turned up the Coq library Flocq, a formalization of floating-point arithmetic; looking at its size and complexity could give a sense of the difficulty/amount of work needed.

Tomas Skrivan (Feb 20 2022 at 15:42):

Ugh they wrote a whole book about it :scared:

Tim Carstens (Feb 20 2022 at 17:13):

This is the floating point model used by the Verified Software Toolchain, which supports reasoning about floats (and numeric algorithms in general):

https://flocq.gitlabpages.inria.fr/

Tim Carstens (Feb 20 2022 at 17:18):

@Kevin Buzzard Reasoning about floats is "easy" in the sense that people know how to do it; it is "hard" in the sense that "numerical methods" is a field dedicated to efficiently using and reasoning about them

We have a small but active community of numerical people using VST + Coq in industry; happy to make introductions if your colleague has a project they'd like to get underway

Joseph Myers (Feb 20 2022 at 17:33):

I'm familiar with IEEE floating point (I'm one of the maintainers of the floating-point functions in glibc, for example), but I've never done floating-point formalization. I did, though, do an IMO 2020 Q2 solution by numerical bounds, which illustrates how you can prove concrete numerical bounds in Lean right now with norm_num (without floating point being involved at all in the formalization itself), as long as you do most of the work of reducing to specific integer / rational arithmetic cases outside of Lean.

I don't think any new ideas are needed for floating-point formalization - it's been done before in e.g. Coq and HOL - just a lot of work to set things up. You can see the sort of lemmas involved in proving things about error propagation in e.g. the MPFR algorithms document. Although the inequalities involved are mathematically routine, you definitely need mathlib first in order to do that sort of reasoning about real numbers at all.
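For a sense of what those error-propagation lemmas look like, the standard rounding model from numerical analysis (not specific to MPFR) is

fl(x ∘ y) = (x ∘ y)(1 + δ),  |δ| ≤ u = 2^(−p),

for ∘ ∈ {+, −, ×, ÷} under round-to-nearest with precision p (so u = 2^(−53) for binary64), ignoring underflow and overflow. An error-propagation proof is essentially a long chain of such facts combined with routine real-number inequalities, which is why mathlib's real analysis is a prerequisite.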

How complicated your setup for defining floating-point is may depend on what properties of IEEE floating-point you want to model. Do you want to model floating-point exceptions? If you want to prove correctness of an implementation of a standard library function, which has to satisfy rules about which exception flags are raised, then exceptions matter (including the different options IEEE 754 provides for tininess detection); if you want to prove bounds on the accuracy of a numerical algorithm, maybe they don't. Do you want to cover rounding modes other than roundTiesToEven? Again, that matters for correctness of standard library functions (the sort of functions I tend to be implementing with floating point!), but maybe not for a lot of numerical purposes. Do you care about decimal floating point? That's also defined in IEEE 754; it's more of a niche interest, but maximal generality would mean including it (and probably other bases as well, though IEEE 754 only covers binary and decimal).

What about alternate exception handling? Interchange encodings, as opposed to just dealing with floating-point values without regard to how they are encoded? Choice of NaN payload for results, which isn't generally specified in IEEE 754? If for example you want to provide a formal model of instruction set semantics for some processor architecture, all those things become relevant, and so allowing for them might be relevant when designing how to represent floating-point formally even if they're not your immediate interests. Do you want to model how the elementary arithmetic operations are actually mostly formatOf operations in IEEE 754 (results need not be in the same format as the arguments), although most processors don't implement that directly?

Do you want your Lean definitions of floating-point operations to be computable, or only suitable for proving things? The mathlib style would definitely be not to worry about computability, and to leave it to tactics to do computation if you ever want to prove something like "this floating-point operation, with these floating-point numbers as arguments, produces this floating-point number as a result" that involves concrete numbers rather than general bounds. All the operations in IEEE 754 can in principle be implemented computably (i.e. so that both the results are provably correct and Lean can see they do terminate), even the recommended transcendental operations in clause 9, but it may be much simpler if you don't care about computability.

Sheehan Olver (Feb 20 2022 at 18:17):

I definitely want other rounding modes, since they make it possible to implement interval arithmetic in floating point. An example is one of the problem sheet questions, which asks to rigorously compute e using floating point rounding; if this were made verifiable it would make it possible to verify other special functions. So that is an example of concrete numbers and bounds.
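For reference, the directed-rounding trick this relies on: with RD and RU denoting rounding toward −∞ and +∞,

[a, b] + [c, d] ⊆ [RD(a + c), RU(b + d)],

and similarly for the other arithmetic operations, so every computed floating-point interval provably encloses the exact real result.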

But other theorems (such as the error bound of finite differences) require just abstract bounds.

I don’t think decimal IEEE is needed. The applications with concrete numbers would be to make computations done in hardware verifiable. This is important if one wants to cite the results in a paper as an afterthought: that is, in the first instance one would do the computation using hardware and only verify when writing up (or potentially say “it is verifiable” without actually doing it).

Mario Carneiro (Feb 21 2022 at 04:34):

There is some work in mathlib on IEEE floats, but it is very experimental and although I was hoping someone would pick it up (there are some really easy sorrys in there) no one ever did: src#fp.float

Mario Carneiro (Feb 21 2022 at 04:36):

We could port Flocq if it comes to it, but this is definitely a long term project and there has historically not been much interest in pushing in this direction

Mario Carneiro (Feb 21 2022 at 04:39):

Certainly it is much more practical for the near term to use verified computation on Rat instead and do height reduction by adding approximation when necessary to keep the numbers manageable; the floating point unit will be cold but you can still do similar tricks to FP computation, interval arithmetic and all that
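A minimal sketch of the kind of height reduction meant here, assuming the Rat API (mkRat, Rat.num, Rat.den) as in current Lean/Std; truncRat is a made-up name:

/-- Replace q by a nearby rational whose denominator is 2 ^ prec, so that
repeated exact arithmetic doesn't blow up numerator/denominator sizes.
For nonnegative q this truncates toward zero (good enough for one side of
an interval bound); negative inputs need care about Int division conventions. -/
def truncRat (q : Rat) (prec : Nat) : Rat :=
  mkRat (q.num * 2 ^ prec / (q.den : Int)) (2 ^ prec)

#eval truncRat (355 / 113) 20   -- a dyadic approximation of 355/113

An interval-arithmetic layer would keep a lower bound rounded down and an upper bound rounded up in this dyadic form, mimicking floating point without touching the FPU.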

Mario Carneiro (Feb 21 2022 at 04:42):

Using literal IEEE floats is a rat's nest of problems because compilers sometimes do double rounding via conversion to 80-bit floats internally (under some optimization settings), and every processor has its own idea of what to do about NaN payloads, even though there is an IEEE operation that makes these bits observable... And custom rounding modes are of course important for interval arithmetic but support for this in C is nonexistent, which means you have to either write custom assembly or use compiler-specific intrinsics...

Mario Carneiro (Feb 21 2022 at 04:44):

Not to mention the issues with IEEE operations themselves, like == being broken both because it's not reflexive (let x := NaN in x == x is false) and because it fails congruence (let zero := 0, neg_zero := -zero in zero == neg_zero && 1/zero != 1/neg_zero is true)
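These are easy to reproduce with Lean 4's built-in Float, whose == is the IEEE comparison from the C implementation:

def nan : Float := 0.0 / 0.0

#eval nan == nan                             -- false: == is not reflexive
#eval nan.isNaN                              -- true
#eval (0.0 : Float) == (-0.0)                -- true: +0 and -0 compare equal
#eval (1.0 / 0.0 : Float) == 1.0 / (-0.0)    -- false: inf vs -inf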

Mario Carneiro (Feb 21 2022 at 04:51):

Do you want to model how the elementary arithmetic operations are actually mostly formatOf operations in IEEE 754 (results need not be in the same format as the arguments), although most processors don't implement that directly?

@Joseph Myers Could you elaborate on this one? What do you mean by formatOf, is this conversion to/from integral types or conversion between floating point types of different bit width? And why would they be a majority of operations?

Mario Carneiro (Feb 21 2022 at 04:58):

All the operations in IEEE 754 can in principle be implemented computably (i.e. so that both the results are provably correct and Lean can see they do terminate), even the recommended transcendental operations in clause 9

How did they achieve this? Requiring correct rounding for a transcendental operation is essentially a real number inequality, which is not computable in general; so asserting that some implementation of float64 sin() is correct amounts to 2^64 exact real number inequalities (of the form 0.123 <= sin(0.144) < 0.124). It might be possible to prove these if you are lucky and all 2^64 values land far enough from their claimed bounds that the inequalities can be established, but that still sounds like more work than anyone could ever have put into the problem.

Mario Carneiro (Feb 21 2022 at 05:00):

Not to mention that many processors disagree about the last few digits of these approximate transcendental functions. Hell, Intel got the value of π wrong when implementing FSIN, so you get really terrible results if you use it at large values

Sheehan Olver (Feb 21 2022 at 15:03):

I’m not an expert but in Julia sin(x) just uses the Taylor series/Horner:

https://github.com/JuliaLang/julia/blob/0b48b91c9881823d860eaf0a605072b7f20a4cbb/base/special/trig.jl#L76

The errors here are likely to be provable.

For concrete cases you are probably right: floats are a subset of rationals so inequalities can be verified that way. Though there may be practical issues with the fact that one gets extremely large numerators/denominators.
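(For scale, a standard fact about binary64: the double nearest to 0.1 is exactly 3602879701896397 / 2^55 = 3602879701896397 / 36028797018963968, so even individual float values already carry 16-digit numerators and denominators, and exact rational arithmetic on them grows quickly from there.)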

Reid Barton (Feb 21 2022 at 15:15):

Math tells us sin(q) is never rational for rational q ≠ 0, so any convergent sequence of bounds on sin(q) will eventually answer the question of whether r < sin(q) for rational r. But predicting in advance how close sin(q) could be to r (and so how well we have to approximate sin(q)) is a more difficult question in analytic/transcendental number theory.

Sheehan Olver (Feb 21 2022 at 15:45):

Floats and floating point arithmetic are more structured than general rationals. Again I’m not an expert, but a quick experiment shows that for 100,000 random floats the true sin(x), when rounded up or down, is equal to the Julia sin(x) (though it isn’t necessarily rounded in the right direction):

julia> x = randn(100_000); # 100k random Float64

julia> all(@.(sin(x) == Float64(sin(big(x))))) # some are not correctly rounded
false

julia> all(@.(sin(x) == Float64(sin(big(x)), RoundUp) || sin(x) == Float64(sin(big(x)), RoundDown))) # all are between two nearest floats
true

Mario Carneiro (Feb 21 2022 at 16:58):

Right, this is the Table maker's dilemma, which is essentially a variation on the busy beaver problem for concrete n. Not necessarily undecidable, but you don't know going into it how bad it will be, and claimed solutions are very short (the assertion that N digits of precision suffice for all numbers in the range) and very hard to verify

Mario Carneiro (Feb 21 2022 at 17:02):

If you allow incorrect rounding then the problem becomes much easier, because then you can apply regular error analysis to argue that some polynomial approximation is within epsilon of the answer where epsilon is set to 1 ULP or maybe 1/2 ULP

Dylan MacKenzie (ecstatic-morse) (Feb 21 2022 at 17:04):

I found this quite interesting as well. I didn't know about that line in the standard. A quick search came up with CR-libm, which was one of the first libms with formally verified correct rounding for transcendental functions. The basic approach seems to be to use clever tricks to prove that a given polynomial approximation gives correctly rounded results for some subset of the input, and then exhaustively check the rest. Notably, pow is not guaranteed to be rounded correctly, since the search space is twice as large and there are no clever tricks.

Dylan MacKenzie (ecstatic-morse) (Feb 21 2022 at 17:06):

(Table 1 in that first link gives the domain for which results are guaranteed to be correctly rounded for each function)

Kevin Buzzard (Feb 21 2022 at 17:07):

Floats might be rational but float addition can't be rational addition because the latter is associative.

Mario Carneiro (Feb 21 2022 at 17:09):

well, the IEEE rule is you perform the operation in exact real arithmetic and then round with the specified rounding mode

Mario Carneiro (Feb 21 2022 at 17:10):

so associativity is something like f(f(x + y) + z) = f(x + f(y + z)), where + is addition of rationals and f is a weird rounding function that is monotonic and idempotent and not much else

Mario Carneiro (Feb 21 2022 at 17:11):

so the fact that it fails shouldn't be too surprising
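Lean's built-in Float shows the failure concretely: fl(0.1) + fl(0.2) rounds to 0.30000000000000004…, while fl(0.2) + fl(0.3) happens to round to exactly 0.5, so the two triple sums end up one ulp apart:

#eval ((0.1 + 0.2) + 0.3 : Float) == (0.1 + (0.2 + 0.3))   -- false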

Dylan MacKenzie (ecstatic-morse) (Feb 21 2022 at 17:25):

Ah, I guess the only clever trick is that most of those functions overflow/underflow to ∞ for some interval, so you don't have to check the full 64-bit domain. The ones that don't (e.g. sin) are only guaranteed within some interval.

Joseph Myers (Feb 22 2022 at 02:36):

"formatOf indicates that the name of the operation specifies the floating-point destination format, which might be different from the floating-point operands’ formats. There are formatOf versions of these operations for every supported arithmetic format.". That is, those operations aren't homogeneous; the operands and the result need not be of the same format (e.g. you can add two binary128 operands, producing a binary32 result, with only a single rounding involved), though they do need to have the same radix. This applies to the operations: addition, subtraction, multiplication, division, squareRoot, fusedMultiplyAdd (conversion operations are also described as formatOf operations, since the type of the result is independent of the type of the input there).

Joseph Myers (Feb 22 2022 at 02:46):

The state of the art for correct rounding of special functions has advanced somewhat since crlibm; see e.g. this paper. So worst cases for correct rounding for binary64 can be found rather more efficiently than by an exhaustive search, and while a corresponding search isn't yet feasible for binary128, a smaller search is possible that bounds the internal precision needed rather better than the bounds obtainable from more general results not involving such a search at all.

(For all the transcendental functions with correctly rounded versions recommended in IEEE 754, it can be deduced from known results that they never produce a rational result for a rational argument for nontrivial reasons; mostly this follows from Lindemann's theorem that exp(x) is transcendental for nonzero algebraic x. And there are versions of Lindemann's theorem in the literature that give explicit, if rather large, bounds, at least one referenced in the paper linked above.)


Last updated: Dec 20 2023 at 11:08 UTC