    #include <boost/math/special_functions/rsqrt.hpp>

    namespace boost::math {

    template<class Real>
    Real rsqrt(Real const & x);

    } // namespaces
The function rsqrt computes the reciprocal square root 1/√x. Those in the
game programming community might suspect this is a fast, low-precision
wrapper around the rsqrtss instruction. It is not: we tried that instruction
but found no performance benefit from using it. The trick of computing a
low-precision reciprocal square root and then bootstrapping to higher
precision via Newton's method does work, but it yields a performance benefit
only for quad and higher precision. You can of course call rsqrt with float,
double, and long double, but be aware that there is no performance benefit to
doing so. For quad precision and higher, however, the savings are very
significant.
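The bootstrapping idea described above can be sketched with the standard library alone. This is a minimal illustration, not the Boost.Math implementation: the name `rsqrt_newton_sketch`, the double-precision seed, and the fixed iteration count are all assumptions made for the example, and `Real` is assumed to convert to and from `double`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not the Boost.Math code): seed the result in double,
// then refine it with Newton's method applied to f(y) = 1/y^2 - x.
template <class Real>
Real rsqrt_newton_sketch(Real const& x, int iterations = 2) {
    // Low-precision seed, accurate to roughly double precision.
    Real y = static_cast<Real>(1.0 / std::sqrt(static_cast<double>(x)));
    Real const half = Real(1) / Real(2);
    for (int i = 0; i < iterations; ++i) {
        // Newton step: y <- y + y * (1 - x * y^2) / 2.
        // Uses only multiplications and additions in the wide type.
        y += half * y * (Real(1) - x * y * y);
    }
    return y;
}
```

Each Newton step roughly doubles the number of correct digits, so a seed good to about 53 bits needs one step to fill a 64-bit significand and two to fill a 113-bit one; the default of two is a conservative choice for this sketch. The sketch does no special-value handling and is only meant for finite, positive x.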
Usage:

    #include <boost/multiprecision/float128.hpp>
    #include <boost/math/special_functions/rsqrt.hpp>

    using boost::multiprecision::float128;
    float128 x = 0.1Q;
    float128 y = boost::math::rsqrt(x);
The reciprocal square root of +∞ is zero, and the reciprocal square root of a NaN is a NaN.
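These two conventions agree with what the plain expression 1/sqrt(x) produces under IEEE 754 arithmetic. The sketch below checks that using only the standard library; `reference_rsqrt` and `special_values_hold` are hypothetical names introduced for illustration and are not part of Boost.Math.

```cpp
#include <cmath>
#include <limits>

// Hypothetical reference: the plain expression 1/sqrt(x), not boost::math::rsqrt.
template <class Real>
Real reference_rsqrt(Real x) {
    using std::sqrt;
    return Real(1) / sqrt(x);
}

// Returns true when the reference maps +infinity to zero and NaN to NaN.
inline bool special_values_hold() {
    double const inf = std::numeric_limits<double>::infinity();
    double const nan = std::numeric_limits<double>::quiet_NaN();
    return reference_rsqrt(inf) == 0.0 && std::isnan(reference_rsqrt(nan));
}
```

The check assumes the compiler is not run with flags that relax IEEE semantics, such as fast-math modes.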
Performance:
    Running ./reporting/performance/rsqrt_performance.x
    Run on (16 X 4300 MHz CPU s)
    CPU Caches:
      L1 Data 32 KiB (x8)
      L1 Instruction 32 KiB (x8)
      L2 Unified 1024 KiB (x8)
      L3 Unified 11264 KiB (x1)
    Load Average: 0.43, 0.49, 0.46
    ----------------------------------------------------------------------------------
    Benchmark                                        Time             CPU   Iterations
    ----------------------------------------------------------------------------------
    Rsqrt<float>                                  1.35 ns         1.35 ns    503364351
    Rsqrt<double>                                 2.25 ns         2.25 ns    309753242
    Rsqrt<long double>                            2.68 ns         2.68 ns    261382652
    Rsqrt<float128>                                182 ns          182 ns      3756956
    Rsqrt<number<mpfr_float_backend<100>>>         299 ns          299 ns      2494027
    Rsqrt<number<mpfr_float_backend<200>>>         412 ns          412 ns      1589284
    Rsqrt<number<mpfr_float_backend<300>>>         617 ns          617 ns      1067473
    Rsqrt<number<mpfr_float_backend<400>>>         812 ns          812 ns       830564
    Rsqrt<number<mpfr_float_backend<1000>>>       3183 ns         3183 ns       221079
    Rsqrt<cpp_bin_float_50>                       4321 ns         4321 ns       163243
    Rsqrt<cpp_bin_float_100>                      9393 ns         9393 ns        72967