Boost C++ Libraries of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards


Reciprocal square root

#include <boost/math/special_functions/rsqrt.hpp>

namespace boost::math {

template<class Real>
Real rsqrt(Real const & x);

} // namespaces

The function rsqrt computes the reciprocal square root 1/√x. Those in the game programming community might suspect this is a fast, low precision wrapper around the rsqrtss instruction. This is not correct: We tried this instruction, but found no performance benefit to using it. However, the trick of computing a low precision reciprocal square root and then bootstrapping to higher precision via Newton's method does work, but it only yields a performance benefit for quad and higher precision. We do of course allow you to use rsqrt for float, double, and long double, but be aware there is no performance benefit to doing so. However, the savings for quad precision and higher are very significant.

The use is

using boost::multiprecision::float128;
float128 x = 0.1Q;
float128 y = boost::math::rsqrt(x);

The reciprocal square root of +∞ is zero, and the reciprocal square root of a NaN is a NaN.


Running ./reporting/performance/rsqrt_performance.x
Run on (16 X 4300 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 1024 KiB (x8)
  L3 Unified 11264 KiB (x1)
Load Average: 0.43, 0.49, 0.46
Benchmark                                        Time             CPU   Iterations
Rsqrt<float>                                  1.35 ns         1.35 ns    503364351
Rsqrt<double>                                 2.25 ns         2.25 ns    309753242
Rsqrt<long double>                            2.68 ns         2.68 ns    261382652
Rsqrt<float128>                                182 ns          182 ns      3756956
Rsqrt<number<mpfr_float_backend<100>>>         299 ns          299 ns      2494027
Rsqrt<number<mpfr_float_backend<200>>>         412 ns          412 ns      1589284
Rsqrt<number<mpfr_float_backend<300>>>         617 ns          617 ns      1067473
Rsqrt<number<mpfr_float_backend<400>>>         812 ns          812 ns       830564
Rsqrt<number<mpfr_float_backend<1000>>>       3183 ns         3183 ns       221079
Rsqrt<cpp_bin_float_50>                       4321 ns         4321 ns       163243
Rsqrt<cpp_bin_float_100>                      9393 ns         9393 ns        72967