Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

Why use a high-precision library rather than built-in floating-point types?

For nearly all applications, the built-in floating-point types double (and long double, if it offers higher precision than double) offer enough precision: a 64-bit IEEE 754 double carries about 15 to 17 significant decimal digits.

Some reasons why one would want to use a higher precision:

  • A much more precise result (many more digits) is simply a requirement.
  • The range of the computed value exceeds the range of the type: factorials are the textbook example.
  • Using double is (or may be) too inaccurate.
  • Using long double is (or may be) too inaccurate.
  • Using an extended-precision type implemented in software as double-double (Darwin) is sometimes unpredictably inaccurate.
  • Loss of precision or inaccuracy caused by extreme arguments or cancellation error.
  • An accuracy as good as possible for a chosen built-in floating-point type is required.
  • As a reference value, for example to determine the inaccuracy of a value computed with a built-in floating-point type (perhaps even using some quick-and-dirty algorithm). The accuracy of many functions and distributions in Boost.Math has been measured in this way against tables of very high precision (up to 1000 decimal digits).
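
The range limit mentioned in the list above can be made concrete with a minimal sketch. It assumes double is the 64-bit IEEE 754 format (largest finite value about 1.8e308); the function names factorial_double and largest_finite_factorial are illustrative and are not part of Boost.

```cpp
#include <cmath>

// Computes n! by repeated multiplication in double.
// Once the product exceeds the largest finite double it becomes +infinity.
double factorial_double(unsigned n) {
    double result = 1.0;
    for (unsigned k = 2; k <= n; ++k) {
        result *= static_cast<double>(k);
    }
    return result;
}

// Returns the largest n for which n! is still a finite double.
unsigned largest_finite_factorial() {
    unsigned n = 0;
    while (std::isfinite(factorial_double(n + 1))) {
        ++n;
    }
    return n;
}
```

Under the stated assumption, 170! (about 7.3e306) is still finite while 171! overflows to infinity, so any computation that needs larger factorials must use logarithms or a type with a wider exponent range.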

Many functions and distributions have differences from exact values that are only a few least significant bits - computation noise. Others, often those for which analytical solutions are not available, require approximations and iteration: these may lose several decimal digits of precision.
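
One way to express such differences is in units in the last place (ulps) of the target type, measured against a reference held in a more precise type. The sketch below is only an illustration: it uses double as the reference for a float result, where a real measurement would more likely use a multiprecision type such as cpp_dec_float_50. The helper name error_in_ulps is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Error of `computed` relative to a higher-precision `reference`, expressed in
// ulps of float at the reference value.
// Assumption: `reference` is finite, nonzero and within the normal range of float.
double error_in_ulps(float computed, double reference) {
    // Nearest float to the reference, and its upper neighbour.
    const float nearest = static_cast<float>(reference);
    const float above =
        std::nextafter(nearest, std::numeric_limits<float>::infinity());
    // Spacing between adjacent floats at this magnitude.
    const double ulp = static_cast<double>(above) - static_cast<double>(nearest);
    return std::fabs(static_cast<double>(computed) - reference) / ulp;
}
```

A correctly rounded float result scores at most 0.5 on this measure; "computation noise" of a few least significant bits shows up as a score of a few ulps.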

Much larger loss of precision can occur for boundary or corner cases, often caused by cancellation errors.

(Some of the worst and most common examples of cancellation error or loss of significance can be avoided by using complements: see "why complements?".)
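
A small sketch of the complement idea, using only the standard functions std::erf and std::erfc rather than Boost.Math itself: the upper tail of the standard normal distribution is computed once by subtracting the CDF from 1 and once directly from the complementary error function. The function names are illustrative, and IEEE 754 double is assumed.

```cpp
#include <cmath>

// Upper tail Q(x) = 1 - Phi(x) of the standard normal distribution,
// computed by subtraction. For large x the CDF rounds to exactly 1.0,
// so the subtraction cancels every significant digit.
double upper_tail_naive(double x) {
    const double cdf = 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
    return 1.0 - cdf;
}

// The same quantity computed from the complement, which never forms
// a value close to 1 and therefore keeps full relative precision.
double upper_tail_complement(double x) {
    return 0.5 * std::erfc(x / std::sqrt(2.0));
}
```

For x = 10 the naive version returns exactly 0, while the complement version returns a value close to 7.62e-24, which is the correct order of magnitude.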

If you require a value that is as accurate as can be represented in the floating-point type, that is, the closest representable value, with an error of at most half a least significant bit (ulp, unit in the last place), it may be useful to use a higher-precision type, for example cpp_dec_float_50, to generate this value. Conversion of this value to a built-in floating-point type (float, double or long double) will not cause any further loss of precision. A decimal digit string will also be read precisely by the compiler into a built-in floating-point type, to the nearest representable value.

Note

In contrast, reading a value from an std::istream into a built-in floating-point type is not guaranteed by the C++ Standard to give the nearest representable value.
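
As an illustration of the decimal digit string approach, the sketch below writes pi to 50 decimal places as a source-code literal and lets the compiler round it to each built-in type. It assumes a compiler that rounds literals to the nearest representable value (as the mainstream compilers do) and IEEE 754 float and double; the names pi_double, pi_float and round_trip_digits are illustrative only.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Pi to 50 decimal places, as might be generated with a 50-digit type.
// The compiler rounds each literal to the target type.
constexpr double pi_double =
    3.14159265358979323846264338327950288419716939937510;
constexpr float pi_float =
    3.14159265358979323846264338327950288419716939937510F;

// Formats a value with max_digits10 significant digits, enough to
// identify the stored binary value uniquely.
template <class T>
std::string round_trip_digits(T value) {
    std::ostringstream os;
    os << std::setprecision(std::numeric_limits<T>::max_digits10) << value;
    return os.str();
}
```

Under these assumptions round_trip_digits(pi_double) yields 3.1415926535897931 and round_trip_digits(pi_float) yields 3.14159274: the extra digits in the literal cost nothing and ensure that each type receives its nearest representable value.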

William Kahan coined the term Table-Maker's Dilemma for the problem of correctly rounding functions. Using a much higher precision (50 or 100 decimal digits) is a practical way of generating (almost always) correctly rounded values.

