Boost C++ Libraries of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for an old version of Boost. Click here to view this page for the latest version.
Theory behind floating point comparisons

The following is the most obvious way to compare two floating-point values u and v for being close at a given absolute tolerance epsilon:

abs(u - v) <= epsilon; // (1)

However, in many circumstances, this is not what we want. The same absolute tolerance value 0.01 may be too small to meaningfully compare two values of magnitude 10e12 and at the same time too little to meaningfully compare values of magnitude 10e-12. For examples, see Squassabia.

We do not want to apply the same absolute tolerance for huge and tiny numbers. Instead, we would like to scale the epsilon with u and v. The Unit Test Framework implements floating-point comparison algorithm that is based on the solution presented in Knuth:

   abs(u - v) <= epsilon * abs(u)
&& abs(u - v) <= epsilon * abs(v)); // (2)

defines a very close with tolerance epsilon relationship between u and v, while

   abs(u - v) <= epsilon * abs(u)
|| abs(u - v) <= epsilon * abs(v); // (3)

defines a close enough with tolerance epsilon relationship between u and v.

Both relationships are commutative but are not transitive. The relationship defined in (2) is stronger that the relationship defined in (3) since (2) necessarily implies (3).

The multiplication in the right side of inequations may cause an unwanted underflow condition. To prevent this, the implementation is using modified version of (2) and (3), which scales the checked difference rather than epsilon:

   abs(u - v)/abs(u) <= epsilon
&& abs(u - v)/abs(v) <= epsilon; // (4)
   abs(u - v)/abs(u) <= epsilon
|| abs(u - v)/abs(v) <= epsilon; // (5)

This way all underflow and overflow conditions can be guarded safely. The above however, will not work when v or u is zero. In such cases the solution is to resort to a different algorithm, e.g. (1).

Tolerance selection considerations

In case of absence of domain specific requirements the value of tolerance can be chosen as a sum of the predicted upper limits for "relative rounding errors" of compared values. The "rounding" is the operation by which a real value 'x' is represented in a floating-point format with 'p' binary digits (bits) as the floating-point value X. The "relative rounding error" is the difference between the real and the floating point values in relation to real value: abs(x-X)/abs(x). The discrepancy between real and floating point value may be caused by several reasons:

The first two operations proved to have a relative rounding error that does not exceed

half_epsilon = half of the 'machine epsilon value'

for the appropriate floating point type FPT [9]. Conversion to binary presentation, sadly, does not have such requirement. So we can't assume that float(1.1) is close to the real number 1.1 with tolerance half_epsilon for float (though for 11./10 we can). Non-arithmetic operations either do not have a predicted upper limit relative rounding errors.

[Note] Note

Note that both arithmetic and non-arithmetic operations might also produce others "non-rounding" errors, such as underflow/overflow, division-by-zero or "operation errors".

All theorems about the upper limit of a rounding error, including that of half_epsilon, refer only to the 'rounding' operation, nothing more. This means that the 'operation error', that is, the error incurred by the operation itself, besides rounding, isn't considered. In order for numerical software to be able to actually predict error bounds, the IEEE754 standard requires arithmetic operations to be 'correctly or exactly rounded'. That is, it is required that the internal computation of a given operation be such that the floating point result is the exact result rounded to the number of working bits. In other words, it is required that the computation used by the operation itself doesn't introduce any additional errors. The IEEE754 standard does not require same behavior from most non-arithmetic operation. The underflow/overflow and division-by-zero errors may cause rounding errors with unpredictable upper limits.

At last be aware that half_epsilon rules are not transitive. In other words combination of two arithmetic operations may produce rounding error that significantly exceeds 2*half_epsilon. All in all there are no generic rules on how to select the tolerance and users need to apply common sense and domain/ problem specific knowledge to decide on tolerance value.

To simplify things in most usage cases latest version of algorithm below opted to use percentage values for tolerance specification (instead of fractions of related values). In other words now you use it to check that difference between two values does not exceed x percent.

For more reading about floating-point comparison see references below.

Bibliographic references


The art of computer programming (vol II)

Donald. E. Knuth, 1998, Addison-Wesley Longman, Inc., ISBN 0-201-89684-2, Addison-Wesley Professional; 3rd edition. (The relevant equations are in §4.2.2, Eq. 36 and 37.)

Rounding near zero, in Advanced Arithmetic for the Digital Computer

Ulrich W. Kulisch, 2002, Springer, Inc., ISBN 0-201-89684-2, Springer; 1st edition


Comparing Floats: How To Determine if Floating Quantities Are Close Enough Once a Tolerance Has Been Reached

Alberto Squassabia, in C++ Report (March 2000)

The Journeyman's Shop: Trap Handlers, Sticky Bits, and Floating-Point Comparisons

Pete Becker, in C/C++ Users Journal (December 2000)


What Every Computer Scientist Should Know About Floating-Point Arithmetic

David Goldberg, pages 150-230, in Computing Surveys (March 1991), Association for Computing Machinery, Inc.

From Rounding Error Estimation to Automatic Correction with Automatic Differentiation

Philippe Langlois, Technical report, INRIA

William Kahan home page

Lots of information on floating point arithmetics.

[9] machine epsilon value is represented by std::numeric_limits<FPT>::epsilon()