std::numeric_limits<T>::is_specialized is true
for all arithmetic types
(integer, floating and fixed-point) for which std::numeric_limits<T>
is specialized.
A typical test is
    if (std::numeric_limits<T>::is_specialized == false)
    {
      std::cout << "type " << typeid(T).name()
        << " is not specialized for std::numeric_limits!" << std::endl;
      // ...
    }
Typically numeric_limits<T>::is_specialized
is true
for all T
where the compile-time constant members
of numeric_limits
are indeed
known at compile time, and don't vary at runtime. For example floating-point
types with runtime-variable precision such as mpfr_float
have no numeric_limits
specialization as it would be impossible to define all the members at compile
time. In contrast the precision of a type such as mpfr_float_50
is known at compile time, and so it does have a numeric_limits
specialization.
Note that not all the std::numeric_limits
member constants and functions are meaningful for all user-defined types
(UDT), such as the decimal and binary multiprecision types provided here.
More information on this is given in the sections below.
For floating-point types, ∞ is defined wherever possible, but clearly infinity
is meaningless for arbitrary-precision arithmetic backends, and there
is one floating-point type (GMP's mpf_t, see gmp_float)
which has no notion of infinity or NaN at all.
A typical test whether infinity is implemented is
    if (std::numeric_limits<T>::has_infinity)
    {
      std::cout << std::numeric_limits<T>::infinity() << std::endl;
    }
and using tests like this is strongly recommended to improve portability.
Warning | |
---|---|
If the backend is switched to a type that does not support infinity (or similarly NaNs) then, without checks like this, there will be trouble. |
std::numeric_limits<T>::is_signed ==
true
if the type T
is signed.
For fundamental
(built-in) binary types, the sign is held in a single bit, but
for other types (cpp_dec_float and cpp_bin_float) it may
be a separate storage element, usually bool.
std::numeric_limits<T>::is_exact ==
true
if type T uses exact representations.
This is defined as true
for
all integer types and false
for floating-point types.
A usable definition has been discussed.
ISO/IEC 10967-1, Language independent arithmetic, noted by the C++ Standard, defines:
"A floating-point type F shall be a finite subset of ℜ."
The important practical distinction is that all integers (up to max())
can be stored exactly.
Rational types using two integer types are also exact.
Floating-point types cannot store all real values (those in the set of ℜ) exactly. For example, 0.5 can be stored exactly in a binary floating-point, but 0.1 cannot. What is stored is the nearest representable real value, that is, rounded to nearest.
Fixed-point types (usually decimal) are also defined as exact, in that they only store a fixed precision, so half cents or pennies (or less) cannot be stored. The results of computations are rounded up or down, just like the result of integer division stored as an integer result.
There are a number of proposals to add Decimal Floating-Point Support to C++,
and also C++ Binary Fixed-Point Arithmetic.
std::numeric_limits<T>::is_bounded ==
true
if the set of values represented
by the type T
is finite.
This is true
for all fundamental (built-in)
integer, fixed-point and floating-point types, and most multi-precision
types.
It is only false
for a few
arbitrary-precision types like cpp_int.
Rational and fixed-exponent representations are exact but not integer.
std::numeric_limits<T>::is_modulo
is defined as true
if adding two positive values of type
T can yield a result less than either value.
is_modulo ==
true
means that the type does not
overflow, but, for example, 'wraps around' to zero, when adding one to
the max()
value.
For most fundamental
(built-in) integer types, std::numeric_limits<>::is_modulo
is true
.
bool
is the only exception.
The modulo behaviour is sometimes useful, but also can be unexpected, and sometimes undesired, behaviour.
Overflow of signed integers can be especially unexpected, possibly causing change of sign.
Boost.Multiprecision integer type cpp_int
is not modulo because, as an arbitrary-precision type, it expands to
hold any value that the machine resources permit.
However fixed precision cpp_int's may be modulo if they are unchecked (i.e. they behave just like fundamental (built-in) integers), but not if they are checked (overflow causes an exception to be raised).
Fundamental (built-in) and multi-precision floating-point types are normally not modulo.
Where possible, overflow is to std::numeric_limits<>::infinity(),
provided std::numeric_limits<>::has_infinity == true.
Constant std::numeric_limits<T>::radix
returns either 2 (for fundamental
(built-in) and binary types) or 10 (for decimal types).
The number of radix
digits
that can be represented without change:
The values include any implicit bit, so for example, for the ubiquitous
double
using 64 bits (IEEE binary64), digits
== 53, even though there are only 52 actual bits of the significand stored
in the representation. The value of digits
reflects the fact that there is one implicit bit which is always set to
1.
The Boost.Multiprecision binary types do not use an implicit bit, so the
digits
member reflects
exactly how many bits of precision were requested:
    typedef number<cpp_bin_float<53, digit_base_2> > float64;
    typedef number<cpp_bin_float<113, digit_base_2> > float128;
    std::numeric_limits<float64>::digits == 53
    std::numeric_limits<float128>::digits == 113
For the most common case of radix
== 2
,
std::numeric_limits<T>::digits
is the number of bits in the representation,
not counting any sign bit.
For a decimal integer type, when radix
== 10
,
it is the number of decimal digits.
Constant std::numeric_limits<T>::digits10
returns the number of decimal
digits that can be represented without change or loss.
For example, numeric_limits<unsigned char>::digits10
is 2.
This somewhat inscrutable definition means that an unsigned
char
can hold decimal values 0..99
without loss of precision or accuracy, usually from truncation.
Had the definition been 3 then that would imply it could hold 0..999, but
as we all know, an 8-bit unsigned
char
can only hold 0..255, and an
attempt to store 256 or more will involve loss or change.
For bounded integers, it is thus one less
than the number of decimal digits you need to display the biggest integer
std::numeric_limits<T>::max().
This value can be used to predict the layout width required for
    std::cout
      << std::setw(std::numeric_limits<short>::digits10 +1 +1) // digits10+1, and +1 for sign.
      << std::showpos << (std::numeric_limits<short>::max)() // +32767
      << std::endl
      << std::setw(std::numeric_limits<short>::digits10 +1 +1)
      << (std::numeric_limits<short>::min)() << std::endl; // -32768 for a 16-bit two's complement short
For example, unsigned short
is often stored in 16 bits, so the maximum value is 0xFFFF or 65535.
    std::cout
      << std::setw(std::numeric_limits<unsigned short>::digits10 +1 +1) // digits10+1, and +1 for sign.
      << std::showpos << (std::numeric_limits<unsigned short>::max)() // 65535
      << std::endl
      << std::setw(std::numeric_limits<unsigned short>::digits10 +1 +1) // digits10+1, and +1 for sign.
      << (std::numeric_limits<unsigned short>::min)() << std::endl; // 0
For bounded floating-point types, if we create a double
with a value with digits10
(usually 15) decimal digits, 1e15
or 1000000000000000
:
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    double d = 1e15;
    double dp1 = d+1;
    std::cout << d << "\n" << dp1 << std::endl;
    // 1000000000000000
    // 1000000000000001
    std::cout << dp1 - d << std::endl; // 1
and we can increment this value to 1000000000000001
as expected and show the difference too.
But if we try to repeat this with more than digits10
digits,
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    double d = 1e16;
    double dp1 = d+1;
    std::cout << d << "\n" << dp1 << std::endl;
    // 10000000000000000
    // 10000000000000000
    std::cout << dp1 - d << std::endl; // 0 !!!
then we find that adding one has no effect, and the display shows that there is loss of precision. See Loss of significance or cancellation error.
So digits10
is the number
of decimal digits guaranteed to be correct.
For example, 'round-tripping' for double:
If a decimal string with at most digits10 (== 15) significant decimal digits is converted to double
and then converted back to the same number of significant decimal digits,
then the final string will match the original 15 decimal digit string.
If a double floating-point
number is converted to a decimal string with at least 17 decimal digits
and then converted back to double,
then the result will be binary identical to the original double
value.
For most purposes, you will much more likely want std::numeric_limits<>::max_digits10
,
the number of decimal digits that ensure that a change of one least significant
bit (Unit
in the last place (ULP)) produces a different decimal digits string.
For the most common double
floating-point type, max_digits10
is digits10+2, but you should use C++11 max_digits10
where possible (see below).
std::numeric_limits<T>::max_digits10
was added for floating-point
because digits10
decimal
digits are insufficient to show a least significant bit (ULP) change giving
puzzling displays like
0.666666666666667 != 0.666666666666667
from failure to 'round-trip', for example:
    double write = 2./3; // Any arbitrary value that cannot be represented exactly.
    double read = 0;
    std::stringstream s;
    s.precision(std::numeric_limits<double>::digits10); // or `float64_t` for 64-bit IEEE 754 double.
    s << write;
    s >> read;
    if(read != write)
    {
      std::cout << std::setprecision(std::numeric_limits<double>::digits10)
        << read << " != " << write << std::endl;
    }
If you wish to ensure that a change of one least significant bit (ULP)
produces a different decimal digits string, then max_digits10
is the precision to use.
For example:
    double pi = boost::math::double_constants::pi;
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << pi << std::endl; // 3.1415926535897931
will display π to the maximum possible precision using a double,
and similarly for a much higher precision type:
    using namespace boost::multiprecision;
    typedef number<cpp_dec_float<50> > cpp_dec_float_50; // 50 decimal digits.
    using boost::multiprecision::cpp_dec_float_50;
    cpp_dec_float_50 pi = boost::math::constants::pi<cpp_dec_float_50>();
    std::cout.precision(std::numeric_limits<cpp_dec_float_50>::max_digits10);
    std::cout << pi << std::endl;
    // 3.141592653589793238462643383279502884197169399375105820974944592307816406
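For integer types, max_digits10
is implementation-dependent, but is usually digits10 + 2.
This is the output field-width required for the maximum value of the type
T, std::numeric_limits<T>::max(),
including a sign and a space.
So this will produce neat columns.
    std::cout << std::setw(std::numeric_limits<int>::max_digits10) ...
Note that for the fundamental (built-in) integer types the C++ standard library defines max_digits10 as 0, so for those types use digits10 + 2 as the field width instead.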
The extra two or three least-significant digits are 'noisy' and may be
junk, but if you want to 'round-trip' (printing a value out as a decimal
digit string and reading it back in, most commonly during serialization
and de-serialization) you must use os.precision(std::numeric_limits<T>::max_digits10).
Note | |
---|---|
For Microsoft Visual Studio 2010, std::numeric_limits<float>::max_digits10 has the wrong value. |
Note | |
---|---|
For Microsoft Visual Studio before 2013 and the default floating-point format, a small range of double-precision floating-point values with a significand of approximately 0.0001 to 0.004 and exponent values of 1010 to 1014 do not round-trip exactly, being off by one least significant bit, for probably every third value of the significand.
A workaround is to use the scientific or exponential format. Other older compilers also fail to implement round-tripping entirely fault-free; for example, see Incorrectly Rounded Conversions in GCC and GLIBC. For more details see Incorrect Round-Trip Conversions in Visual C++, and references therein, and Easy Accurate Reading and Writing of Floating-Point Numbers, Aubrey Jaffer (August 2018). Microsoft VS2017 and other recent compilers now use the Ryu fast float-to-string conversion algorithm by Ulf Adams, claimed to be both exact and fast for 32 and 64-bit floating-point numbers. |
Note | |
---|---|
BOOST_NO_CXX11_NUMERIC_LIMITS is a suitable feature-test macro to determine
if max_digits10 is implemented. |
Note | |
---|---|
requires cxx11_numeric_limits is a suitable test
for use of |
If max_digits10
is not
available, you should use the Kahan
formula for floating-point type T.
In C++, the equations for what Kahan (on page 4) describes as 'at least' and 'at most' are:
    static long double const log10Two = 0.30102999566398119521373889472449L; // log10(2.)
    static_cast<int>(floor((significand_digits - 1) * log10Two)); // == digits10 - 'at least'.
    static_cast<int>(ceil(1 + significand_digits * log10Two)); // == max_digits10 - 'at most'.
Unfortunately, these cannot be evaluated (at least by C++03) at compile-time. So the following expression is often used instead.
    max_digits10 = 2 + std::numeric_limits<T>::digits * 3010U/10000U;
    // == 2 + std::numeric_limits<T>::digits10 for double and 64-bit long double.
    // == 3 + std::numeric_limits<T>::digits10 for float, 80-bit long double and __float128.
Often the actual values are computed for the C limits macros:
    #define FLT_MAXDIG10 (2 + (FLT_MANT_DIG * 3010U)/10000U) // 9
    #define DBL_MAXDIG10 (2 + (DBL_MANT_DIG * 3010U)/10000U) // 17
    #define LDBL_MAXDIG10 (2 + (LDBL_MANT_DIG * 3010U)/10000U) // 17 for MSVC, 21 for an 80-bit long double.
The factor 3010U/10000U approximates log10(2) = 0.3010 and
can be evaluated at compile time using only short unsigned ints,
so the result can be a desirable const or constexpr (and usually also static).
Boost macros allow this to be done portably, see BOOST_CONSTEXPR_OR_CONST or BOOST_STATIC_CONSTEXPR.
(See also Richard P. Brent and Paul Zimmermann, Modern Computer Arithmetic, Equation 3.8 on page 116).
For example, to be portable (including obsolete platforms) for type T
where T
may be: float, double, long double, 128-bit quad type, cpp_bin_float_50 ...
    typedef float T;

    #if defined BOOST_NO_CXX11_NUMERIC_LIMITS
      // No max_digits10 implemented.
      // max_digits10<T>() is a helper assumed to be defined elsewhere, using the formula above.
      std::cout.precision(max_digits10<T>());
    #else
    #if defined(_MSC_VER) && (_MSC_VER <= 1600)
      // The MSVC 2010 version had the wrong value for std::numeric_limits<float>::max_digits10.
      std::cout.precision(max_digits10<T>());
    #else // Use the C++11 max_digits10.
      std::cout.precision(std::numeric_limits<T>::max_digits10);
    #endif
    #endif
    std::cout.setf(std::ios_base::showpoint); // Append any trailing zeros,
    // or more memorably: std::cout << std::showpoint;

    std::cout << "std::cout.precision(max_digits10) = " << std::cout.precision() << std::endl; // 9
    double x = 1.2345678901234567889;
    std::cout << "x = " << x << std::endl;
which should output:
    std::cout.precision(max_digits10) = 9
    x = 1.23456789
The rounding style determines how the result of floating-point operations is treated when the result cannot be exactly represented in the significand. Various rounding modes may be provided:
For integer types, std::numeric_limits<T>::round_style
is always towards zero, so
    std::numeric_limits<T>::round_style == std::round_toward_zero;
A decimal type, cpp_dec_float
rounds in no particular direction, which is to say it doesn't round at
all. And since there are several guard digits, it's not really the same
as truncation (round toward zero) either.
For floating-point types, it is normal to round to nearest.
std::numeric_limits<T>::round_style == std::round_to_nearest;
See function std::numeric_limits<T>::round_error
for the maximum error (in
ULP) that rounding can cause.
std::numeric_limits<T>::has_denorm_loss is true
if a loss of precision
is detected as a denormalization
loss, rather than an inexact result.
Always false
for integer types.
false
for all types which
do not have has_denorm
== std::denorm_present
.
Denormalized
values are representations with a variable number of exponent bits
that can permit gradual underflow, so that, if type T is double:
    std::numeric_limits<T>::denorm_min() < std::numeric_limits<T>::min()
A type may have any of the following enum
float_denorm_style
values:
std::denorm_absent
, if it does not allow
denormalized values. (Always used for all integer and exact types).
std::denorm_present
, if the floating-point
type allows denormalized values.
std::denorm_indeterminate
, if indeterminate
at compile time.
bool std::numeric_limits<T>::tinyness_before
true
if a type can determine
that a value is too small to be represented as a normalized value before
rounding it.
Generally true for is_iec559
floating-point fundamental (built-in) types, but false for integer types.
Standard-compliant IEEE 754 floating-point implementations may detect the floating-point underflow at three predefined moments:
After computing a result whose absolute value is smaller than std::numeric_limits<T>::min(),
such an implementation detects tinyness before rounding
(e.g. UltraSparc).
After rounding the result to std::numeric_limits<T>::digits
bits, if the result is tiny, such an implementation detects tinyness
after rounding (e.g. SuperSparc).