Function std::numeric_limits<T>::max() returns the largest finite value that can be represented by the type T. If there is no such value (and numeric_limits<T>::is_bounded is false) then it returns T().
For built-in types there is usually a corresponding macro TYPE_MAX, where TYPE is CHAR, INT, FLT etc. Other types, including those provided by a typedef, may also provide a macro definition, for example INT64_MAX for int64_t.
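As a small illustration of how the macros relate to numeric_limits, the sketch below wraps the call in a helper and checks it against the C macros at compile time. The name `largest` is hypothetical and not part of any library. The extra parentheses around `max` are a common precaution against a function-like `max` macro defined by some platform headers.

```cpp
#include <cfloat>   // FLT_MAX, DBL_MAX
#include <climits>  // CHAR_MAX, INT_MAX
#include <cstdint>  // INT64_MAX, std::int64_t
#include <limits>

// Hypothetical helper: the largest finite value of T.
// The parentheses around the function name stop a function-like
// max() macro from being expanded here.
template <typename T>
constexpr T largest()
{
  return (std::numeric_limits<T>::max)();
}

// For built-in types the result agrees with the corresponding C macro.
static_assert(largest<char>() == CHAR_MAX, "char");
static_assert(largest<int>() == INT_MAX, "int");
static_assert(largest<std::int64_t>() == INT64_MAX, "int64_t");
static_assert(largest<float>() == FLT_MAX, "float");
static_assert(largest<double>() == DBL_MAX, "double");
```

Because everything is checked with static_assert, a mismatch would show up as a compile error rather than at run time.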
To cater for situations where no numeric_limits
specialization is available (for example because the precision of the type
varies at runtime), packaged versions of this (and other functions) are
provided using
#include <boost/math/tools/precision.hpp>
T value = boost::math::tools::max_value<T>();
Of course, these simply use std::numeric_limits<T>::max()
if available, but otherwise 'do something
sensible'.
Since C++11, std::numeric_limits<T>::lowest() is, for integral types, the same as min(); for floating-point types it is generally the negative of max() (but implementation-dependent).
-(std::numeric_limits<double>::max)() == std::numeric_limits<double>::lowest();
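One place where lowest() matters is seeding a search for a maximum. The sketch below is only an illustration under that assumption, and the helper name `largest_of` is hypothetical. Seeding with min() instead would give a wrong answer for floating-point data that is entirely negative, because min() is a tiny positive value for those types.

```cpp
#include <cassert>
#include <initializer_list>
#include <limits>

// Hypothetical helper: largest element of a non-empty list.
// The search starts from lowest(), so it also works when every
// element is negative.
template <typename T>
T largest_of(std::initializer_list<T> values)
{
  assert(values.size() != 0);  // Precondition: at least one element.
  T best = std::numeric_limits<T>::lowest();
  for (T v : values)
  {
    if (v > best)
    {
      best = v;
    }
  }
  return best;
}
```

For example, `largest_of<double>({-3.5, -1.25, -7.0})` returns -1.25.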
Function std::numeric_limits<T>::min()
returns the minimum finite value that can be represented by the type T.
For built-in types there is usually a corresponding macro TYPE_MIN, where TYPE is CHAR, INT, FLT etc. Other types, including those provided by a typedef, may also provide a macro definition, for example INT64_MIN for int64_t.
For floating-point types, it is more fully defined as the minimum positive normalized value.
See std::numeric_limits<T>::denorm_min() for the smallest denormalized value, provided std::numeric_limits<T>::has_denorm == std::denorm_present.
To cater for situations where no numeric_limits
specialization is available (for example because the precision of the type
varies at runtime), packaged versions of this (and other functions) are
provided using
#include <boost/math/tools/precision.hpp>
T value = boost::math::tools::min_value<T>();
Of course, these simply use std::numeric_limits<T>::min()
if available.
Function std::numeric_limits<T>::denorm_min() returns the smallest denormalized value, provided std::numeric_limits<T>::has_denorm == std::denorm_present.
std::cout.precision(std::numeric_limits<double>::max_digits10);
if (std::numeric_limits<double>::has_denorm == std::denorm_present)
{
  double d = std::numeric_limits<double>::denorm_min();
  std::cout << d << std::endl; // 4.9406564584124654e-324
  int exponent;
  double significand = frexp(d, &exponent);
  std::cout << "exponent = " << std::hex << exponent << std::endl; // fffffbcf
  std::cout << "significand = " << std::hex << significand << std::endl; // 0.50000000000000000
}
else
{
  std::cout << "No denormalization. " << std::endl;
}
The exponent is effectively reduced from -308 to -324 (though it remains encoded as zero and leading zeros appear in the significand, thereby losing precision until the significand reaches zero).
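The gradual loss of precision below min() can be observed by halving the smallest normalized value until it reaches zero. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes that denormalized values are supported and not flushed to zero, and that the default round-to-nearest mode is in effect.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true if x is a denormalized (subnormal) value.
template <typename T>
bool is_subnormal(T x)
{
  return std::fpclassify(x) == FP_SUBNORMAL;
}

// Hypothetical helper: number of times min() can be halved before
// the result becomes zero. Each halving below min() shifts another
// leading zero into the significand.
template <typename T>
int halvings_until_zero()
{
  // Precondition: denormalized values exist below min().
  assert(std::numeric_limits<T>::denorm_min() < (std::numeric_limits<T>::min)());
  T x = (std::numeric_limits<T>::min)();
  int count = 0;
  while (x != 0)
  {
    x /= 2;
    ++count;
  }
  return count;
}
```

Under those assumptions the count equals std::numeric_limits<T>::digits: one halving for every bit of the significand that is shifted out.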
Function std::numeric_limits<T>::round_error()
returns the maximum error (in units of ULP)
that can be caused by any basic arithmetic operation.
round_style == std::round_indeterminate;
The rounding style is indeterminable at compile time.
For floating-point types, when rounding is to nearest, only half a bit is lost by rounding, and round_error == 0.5. In contrast, when rounding is towards zero, or plus/minus infinity, we can lose up to one bit from rounding, and round_error == 1.
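For integer types, rounding is always towards zero, so at worst almost one unit can be lost by rounding. (The standard numeric_limits specializations for the built-in integer types nevertheless return 0 from round_error().)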
round_error()
can be used with std::numeric_limits<T>::epsilon()
to estimate the maximum potential error caused by rounding. For typical
floating-point types, round_error() = 1/2
, so half
epsilon is the maximum potential error.
double round_err = std::numeric_limits<double>::epsilon() // 2.2204460492503131e-016
                 * std::numeric_limits<double>::round_error(); // 1/2
std::cout << round_err << std::endl; // 1.1102230246251565e-016
There are, of course, many occasions when a much bigger loss of precision occurs, for example, caused by loss of significance (cancellation error) or by very many iterations.
Function std::numeric_limits<T>::epsilon()
is meaningful only for non-integral types.
It returns the difference between 1.0
and the next value representable by the floating-point type T. So it is
a one least-significant-bit change in this floating-point value.
For double (float64_t) it is 2.2204460492503131e-016, showing all 17 possibly significant decimal digits.
std::cout.precision(std::numeric_limits<double>::max_digits10);
double d = 1.;
double eps = std::numeric_limits<double>::epsilon();
double dpeps = d + eps;
std::cout << std::showpoint // Ensure all trailing zeros are shown.
  << d << "\n" // 1.0000000000000000
  << dpeps << std::endl; // 1.0000000000000002
std::cout << dpeps - d // 2.2204460492503131e-016
  << std::endl;
We can explicitly increment by one bit using the function boost::math::float_next()
and the result is the same as adding epsilon
.
double one = 1.;
double nad = boost::math::float_next(one);
std::cout << nad << "\n" // 1.0000000000000002
  << nad - one // 2.2204460492503131e-016
  << std::endl;
Adding half epsilon (or anything smaller) will have no effect on this value.
std::cout.precision(std::numeric_limits<double>::max_digits10);
double d = 1.;
double eps = std::numeric_limits<double>::epsilon();
double dpeps = d + eps/2;
std::cout << std::showpoint // Ensure all trailing zeros are shown.
  << dpeps << "\n" // 1.0000000000000000
  << eps/2 << std::endl; // 1.1102230246251565e-016
std::cout << dpeps - d // 0.00000000000000000
  << std::endl;
So rounding leaves the values equal, despite adding half epsilon.
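Note that epsilon is the spacing between adjacent values only near 1.0; the spacing grows with the magnitude of the value. The sketch below measures it with the standard function std::nextafter, which plays a similar role to boost::math::float_next for built-in types. The helper name `gap_above` is hypothetical, and IEEE 754 double arithmetic is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next representable
// double above it.
inline double gap_above(double x)
{
  assert(std::isfinite(x));  // Precondition: x is a finite value.
  return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

Under the stated assumption, `gap_above(1.0)` equals epsilon, `gap_above(2.0)` equals twice epsilon, and `gap_above(1e16)` is 2.0, so a tolerance based on epsilon alone needs scaling by the size of the values being compared.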
To achieve greater portability over platform and floating-point type, Boost.Math and Boost.Multiprecision provide a package of functions that 'do something sensible' if the standard numeric_limits is not available. To use these, #include <boost/math/tools/precision.hpp>.
A tolerance might be defined using this version of epsilon thus:
RealType tolerance = boost::math::tools::epsilon<RealType>() * 2;
epsilon
is very useful
to compute a tolerance when comparing floating-point values, a much more
difficult task than is commonly imagined.
For more information than you probably want (but still need) see What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The naive test comparing the absolute difference between two values and a tolerance does not give useful results if the values are too large or too small.
So Boost.Test uses an algorithm first devised by Knuth for reliably checking if floating-point values are close enough.
See Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd edition, Addison-Wesley, 1998, ISBN 0-201-89684-2.
See also:
Alberto Squassia, Comparing floats
Alberto Squassia, Comparing floats code
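To make the contrast between the naive absolute test and a relative test concrete, here is a minimal sketch in the spirit of Knuth's "close enough" comparison. It is not Boost.Test's implementation; both function names are hypothetical, and it requires the difference to be within the given fraction of both magnitudes.

```cpp
#include <cassert>
#include <cmath>

// Sketch of a relative comparison: true when |a - b| is within
// `fraction` of BOTH |a| and |b|.
inline bool is_close_fraction(double a, double b, double fraction)
{
  assert(fraction >= 0);  // Precondition: a non-negative tolerance.
  if (a == b)
  {
    return true;  // Covers two zeros and two equal infinities.
  }
  if (std::isinf(a) || std::isinf(b))
  {
    return false;  // An infinity is only close to itself.
  }
  const double diff = std::fabs(a - b);
  // If either value is NaN, both comparisons are false.
  return diff <= fraction * std::fabs(a) && diff <= fraction * std::fabs(b);
}

// The naive absolute test, for contrast.
inline bool is_close_absolute(double a, double b, double tolerance)
{
  return std::fabs(a - b) <= tolerance;
}
```

With a tolerance of 1e-12, the absolute test rejects 1e20 and 1e20 + 1e5 even though they agree to about 15 significant digits, and it accepts 1e-20 and 2e-20 even though they differ by a factor of two. The relative test gives the opposite, more useful, answer in both cases.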
For example, if we want a tolerance that might suit about 9 arithmetical operations, say sqrt(9) = 3, we could define:
T tolerance = 3 * std::numeric_limits<T>::epsilon();
This is very widely used in Boost.Math testing with Boost.Test's macro BOOST_CHECK_CLOSE_FRACTION, used thus:
T expected = 1.0;
T calculated = 1.0 + std::numeric_limits<T>::epsilon();
BOOST_CHECK_CLOSE_FRACTION(expected, calculated, tolerance);
(There is also a version using tolerance as a percentage rather than a fraction).
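The choice of 3 for about 9 operations can be generalized. The sketch below assumes that rounding errors accumulate like a random walk, so that the expected error grows with the square root of the number of operations; that assumption, and the helper name `tolerance_for`, are illustrative and not taken from Boost.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: a tolerance (as a fraction) intended to suit
// roughly `operations` arithmetic operations, assuming the rounding
// errors grow as the square root of the operation count.
template <typename T>
T tolerance_for(unsigned operations)
{
  assert(operations > 0);  // Precondition: at least one operation.
  const double factor = std::sqrt(static_cast<double>(operations));
  return static_cast<T>(factor) * std::numeric_limits<T>::epsilon();
}
```

`tolerance_for<double>(9)` gives the same value as 3 * std::numeric_limits<double>::epsilon(). Computations that suffer cancellation may need a much larger tolerance than this estimate suggests.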
using boost::multiprecision::number;
using boost::multiprecision::cpp_dec_float;
using boost::multiprecision::et_off;
typedef number<cpp_dec_float<50>, et_off > cpp_dec_float_50; // 50 decimal digits.
Note: Boost.Test does not yet allow floating-point comparisons with expression templates on, so the default expression template parameter has been replaced by et_off.
cpp_dec_float_50 tolerance = 3 * std::numeric_limits<cpp_dec_float_50>::epsilon();
cpp_dec_float_50 expected = boost::math::constants::two_pi<cpp_dec_float_50>();
cpp_dec_float_50 calculated = 2 * boost::math::constants::pi<cpp_dec_float_50>();
BOOST_CHECK_CLOSE_FRACTION(expected, calculated, tolerance);
For floating-point types only, for which std::numeric_limits<T>::has_infinity
== true
,
function std::numeric_limits<T>::infinity()
provides an implementation-defined representation for ∞.
The 'representation' is a particular bit pattern reserved for infinity.
For IEEE 754 systems (for which std::numeric_limits<T>::is_iec559 == true), positive and negative infinity are assigned bit patterns for all defined floating-point types.
Confusingly, the string resulting from outputting this representation is also implementation-defined, as is the string that can be input to generate the representation.
For example, the output is 1.#INF
on Microsoft systems, but inf
on most *nix platforms.
This implementation-defined behavior has hampered use of infinity (and NaNs). Boost.Math and Boost.Multiprecision work hard to provide a sensible representation for all floating-point types, not just the built-in types. Together with suitable facets to define the input and output strings, this makes it possible to use these features portably, including with Boost.Serialization.
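To show why fixing the output strings helps, here is a much simpler stand-in that uses only the standard library. It is not how the Boost.Math facets work; the function name `to_portable_string` and the spellings "inf", "-inf" and "nan" are choices made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format x as text, spelling non-finite values
// the same way on every platform instead of relying on the
// implementation-defined stream output.
inline std::string to_portable_string(double x)
{
  if (std::isnan(x))
  {
    return "nan";
  }
  if (std::isinf(x))
  {
    return x > 0 ? "inf" : "-inf";
  }
  assert(std::isfinite(x));  // Only finite values reach the stream.
  std::ostringstream os;
  os.precision(std::numeric_limits<double>::max_digits10);
  os << x;
  return os.str();
}
```

A facet-based approach has the advantage that it also handles input and works with any stream that has been imbued with the locale, whereas this helper only covers output of double.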
For floating-point types only, for which std::numeric_limits<T>::has_quiet_NaN
== true
,
function std::numeric_limits<T>::quiet_NaN()
provides an implementation-defined representation for NaN.
NaNs are values that indicate that the result of an assignment or computation is meaningless. A typical example is 0/0 but there are many others.
NaNs may also be used to represent missing values: for example, these could, by convention, be ignored in calculations of statistics like means.
Problems with the representation of Not-A-Number, similar to those with infinity, have hampered portable use.
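The missing-value convention mentioned above might look like the following minimal sketch for the built-in double type. The helper name `mean_ignoring_nan` is hypothetical, and returning NaN when no usable value exists is a design choice made for this example.

```cpp
#include <cassert>  // For checks when exercising the helper.
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

static_assert(std::numeric_limits<double>::has_quiet_NaN, "double needs a quiet NaN");

// Hypothetical helper: arithmetic mean that treats NaN entries as
// missing values. Returns NaN if there is no usable value at all.
inline double mean_ignoring_nan(const std::vector<double>& values)
{
  double sum = 0.0;
  std::size_t count = 0;
  for (double v : values)
  {
    if (std::isnan(v))
    {
      continue;  // Skip missing values.
    }
    sum += v;
    ++count;
  }
  if (count == 0)
  {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return sum / static_cast<double>(count);
}
```

For the values 1.0, NaN and 3.0 this returns 2.0. The test uses std::isnan rather than a comparison, because a NaN never compares equal to anything, including itself.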
NaN can be used with binary multiprecision types like cpp_bin_float_quad
:
using boost::multiprecision::cpp_bin_float_quad;

if (std::numeric_limits<cpp_bin_float_quad>::has_quiet_NaN == true)
{
  cpp_bin_float_quad tolerance = 3 * std::numeric_limits<cpp_bin_float_quad>::epsilon();
  cpp_bin_float_quad NaN = std::numeric_limits<cpp_bin_float_quad>::quiet_NaN();
  std::cout << "cpp_bin_float_quad NaN is " << NaN << std::endl; // cpp_bin_float_quad NaN is nan
  cpp_bin_float_quad expected = NaN;
  cpp_bin_float_quad calculated = 2 * NaN;
  // Comparisons of NaN's always fail:
  bool b = expected == calculated;
  std::cout << b << std::endl;
  BOOST_CHECK_NE(expected, expected);
  BOOST_CHECK_NE(expected, calculated);
}
else
{
  std::cout << "Type " << typeid(cpp_bin_float_quad).name() << " does not have NaNs!" << std::endl;
}
But using Boost.Math and suitable facets can permit portable use of both NaNs and positive and negative infinity.
See boost:/libs/math/example/nonfinite_facet_sstream.cpp and we also need
#include <boost/math/special_functions/nonfinite_num_facets.hpp>
Then we can equally well use a multiprecision type cpp_bin_float_quad:
using boost::multiprecision::cpp_bin_float_quad;
typedef cpp_bin_float_quad T;
using boost::math::nonfinite_num_put;
using boost::math::nonfinite_num_get;
{
  std::locale old_locale;
  std::locale tmp_locale(old_locale, new nonfinite_num_put<char>);
  std::locale new_locale(tmp_locale, new nonfinite_num_get<char>);
  std::stringstream ss;
  ss.imbue(new_locale);
  T inf = std::numeric_limits<T>::infinity();
  ss << inf; // Write out.
  assert(ss.str() == "inf");
  T r;
  ss >> r; // Read back in.
  assert(inf == r); // Confirms that the floating-point values really are identical.
  std::cout << "infinity output was " << ss.str() << std::endl;
  std::cout << "infinity input was " << r << std::endl;
}
infinity output was inf
infinity input was inf
Similarly we can do the same with NaN (except that we cannot use assert to compare the value read back, because any comparison with NaN returns false):
{
  std::locale old_locale;
  std::locale tmp_locale(old_locale, new nonfinite_num_put<char>);
  std::locale new_locale(tmp_locale, new nonfinite_num_get<char>);
  std::stringstream ss;
  ss.imbue(new_locale);
  T n;
  T NaN = std::numeric_limits<T>::quiet_NaN();
  ss << NaN; // Write out.
  assert(ss.str() == "nan");
  std::cout << "NaN output was " << ss.str() << std::endl;
  ss >> n; // Read back in.
  std::cout << "NaN input was " << n << std::endl;
}
NaN output was nan
NaN input was nan
For floating-point types only, for which std::numeric_limits<T>::has_signaling_NaN
== true
,
function std::numeric_limits<T>::signaling_NaN()
provides an implementation-defined representation for NaN that causes a
hardware trap. It should be noted however, that at least one implementation
of this function causes a hardware trap to be triggered simply by calling
std::numeric_limits<T>::signaling_NaN()
,
and not only by using the value returned.