#include <boost/math/statistics/bivariate_statistics.hpp>

namespace boost{ namespace math{ namespace statistics {

template<typename ExecutionPolicy, typename Container>
auto covariance(ExecutionPolicy&& exec, Container const & u, Container const & v);

template<typename Container>
auto covariance(Container const & u, Container const & v);

template<typename ExecutionPolicy, typename Container>
auto means_and_covariance(ExecutionPolicy&& exec, Container const & u, Container const & v);

template<typename Container>
auto means_and_covariance(Container const & u, Container const & v);

template<typename ExecutionPolicy, typename Container>
auto correlation_coefficient(ExecutionPolicy&& exec, Container const & u, Container const & v);

template<typename Container>
auto correlation_coefficient(Container const & u, Container const & v);

}}}
This file provides functions for computing bivariate statistics. The functions are C++11 compatible, but the overloads that take an execution policy require C++17. If no execution policy is passed, the default is std::execution::seq.
Computes the population covariance of two datasets:
std::vector<double> u{1,2,3,4,5};
std::vector<double> v{1,2,3,4,5};
double cov_uv = boost::math::statistics::covariance(u, v);
The implementation follows Bennett et al. The parallel implementation follows Schubert et al. The data is not modified. Real-valued inputs are supported; complex-valued inputs are not.
Nota bene: If the input is an integer type, the output will be a double precision type.
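For reference, the usual definition of population covariance is written out below. The symbols n (the common length of u and v) and the barred means are notation introduced here for illustration; they do not appear in the library's interface.

```latex
\[
  \operatorname{cov}(u, v) = \frac{1}{n} \sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v}),
  \qquad
  \bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i,
  \quad
  \bar{v} = \frac{1}{n} \sum_{i=1}^{n} v_i .
\]
```

The divisor is n rather than n - 1, which is what distinguishes the population covariance from the sample covariance. With u = v = {1,2,3,4,5}, both means are 3 and the sum of products is 10, so the covariance is 10/5 = 2.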
The algorithm used herein simultaneously generates the mean values of the input data u and v. For certain applications, it might be useful to get them in a single pass through the data. As such, we provide means_and_covariance:
std::vector<double> u{1,2,3,4,5};
std::vector<double> v{1,2,3,4,5};
auto [mu_u, mu_v, cov_uv] = boost::math::statistics::means_and_covariance(u, v);
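To illustrate why the means come for free, here is a minimal sketch of a textbook single-pass update that maintains both running means alongside the co-moment sum. It is not the library's source: the function name means_and_covariance_sketch is hypothetical, it is restricted to std::vector<double>, and it is serial only.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper for illustration only; not part of Boost.Math.
// Returns (mean of u, mean of v, population covariance) from one pass.
std::tuple<double, double, double>
means_and_covariance_sketch(std::vector<double> const& u,
                            std::vector<double> const& v)
{
    assert(u.size() == v.size() && !u.empty());

    double mu_u = 0.0;      // running mean of u
    double mu_v = 0.0;      // running mean of v
    double comoment = 0.0;  // running sum of (u_i - mean_u) * (v_i - mean_v)

    for (std::size_t i = 0; i < u.size(); ++i)
    {
        double const n = static_cast<double>(i + 1);
        double const du = u[i] - mu_u;   // deviation from the previous mean of u
        mu_u += du / n;
        mu_v += (v[i] - mu_v) / n;
        // Previous-mean deviation of u times updated-mean deviation of v.
        comoment += du * (v[i] - mu_v);
    }

    // Population covariance: divide by n rather than n - 1.
    return std::make_tuple(mu_u, mu_v, comoment / static_cast<double>(u.size()));
}
```

Because the means are updated on every iteration anyway, returning them costs nothing beyond the covariance computation itself.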
Computes the Pearson correlation coefficient of two datasets u and v:
std::vector<double> u{1,2,3,4,5};
std::vector<double> v{1,2,3,4,5};
double rho_uv = boost::math::statistics::correlation_coefficient(u, v);
// rho_uv = 1.
Real-valued inputs are supported; complex-valued inputs are not.
Nota bene: If the input is an integer type, the output will be a double precision type.
If one or both of the datasets are constant, the correlation coefficient is an indeterminate form (0/0). In this case the returned value is a quiet_NaN().
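The degenerate case can be seen in a small standalone sketch of the Pearson coefficient. This is an illustration under stated assumptions, not the library's code: correlation_sketch is a hypothetical name, the function handles std::vector<double> only, and the way Boost.Math detects constant input may differ from the exact-zero check used here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for illustration only; not part of Boost.Math.
double correlation_sketch(std::vector<double> const& u,
                          std::vector<double> const& v)
{
    assert(u.size() == v.size() && !u.empty());

    double mu_u = 0.0, mu_v = 0.0;            // running means
    double s_uu = 0.0, s_vv = 0.0, s_uv = 0.0; // running sums of centered products

    for (std::size_t i = 0; i < u.size(); ++i)
    {
        double const n = static_cast<double>(i + 1);
        double const du = u[i] - mu_u;  // deviations from the previous means
        double const dv = v[i] - mu_v;
        mu_u += du / n;
        mu_v += dv / n;
        s_uu += du * (u[i] - mu_u);
        s_vv += dv * (v[i] - mu_v);
        s_uv += du * (v[i] - mu_v);
    }

    // A constant dataset has zero variance, so the ratio would be 0/0.
    if (s_uu == 0.0 || s_vv == 0.0)
    {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return s_uv / std::sqrt(s_uu * s_vv);
}
```

Callers that may receive constant data should test the result with std::isnan before using it, since a NaN propagates silently through later arithmetic.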