Boost C++ Libraries

This is the documentation for a snapshot of the develop branch, built from commit d4017097fb.

The Ljung-Box Test

Synopsis

#include <boost/math/statistics/ljung_box.hpp>

namespace boost::math::statistics {

// Returns {Q, p-value}; Real denotes the value type of RandomAccessIterator.
template<class RandomAccessIterator>
std::pair<Real, Real> ljung_box(RandomAccessIterator begin, RandomAccessIterator end, int64_t lags = -1, int64_t fit_dof = 0);


template<class RandomAccessContainer>
auto ljung_box(RandomAccessContainer const & v, int64_t lags = -1, int64_t fit_dof = 0);

}

Background

The Ljung-Box test is used to test if residuals from a fitted model have unwanted autocorrelation. If autocorrelation exists in the residuals, then presumably a model with more parameters can be fitted to the original data and explain more of the structure it contains.

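The test statistic is

$$ Q = n(n+2) \sum_{k=1}^{\ell} \frac{\hat{r}_k^2}{n-k} $$

where n is the length of v, ℓ is the number of lags, and \hat{r}_k is the sample autocorrelation of v at lag k.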

The variance of the statistic slightly exceeds that of the chi-squared distribution it is compared against, but the test is still fairly accurate and has a reasonable computational cost.

An example use is given below:

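#include <vector>
#include <random>
#include <iostream>
#include <boost/math/statistics/ljung_box.hpp>
using boost::math::statistics::ljung_box;

int main() {
    // Fill v with independent standard normal samples (white noise).
    std::random_device rd;
    std::normal_distribution<double> dis(0, 1);
    std::vector<double> v(8192);
    for (auto & x : v) { x = dis(rd); }
    // Q is the test statistic and p is the p-value.
    auto [Q, p] = ljung_box(v);
    std::cout << "Q = " << Q << ", p = " << p << "\n";
    // Possible output: Q = 5.94734, p = 0.819668
}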

If the data are clearly autocorrelated, the statistic becomes very large and the p-value drops to zero:

for (size_t i = 0; i < v.size(); ++i) { v[i] = i; }
auto [Q, p] = ljung_box(v);
// Possible output: Q = 81665.1, p = 0

By default, the number of lags is taken to be the logarithm of the number of samples, so that the default complexity is O(n ln n). If you want to calculate a given number of lags, use the second argument:

int64_t lags = 10;
auto [Q, p] = ljung_box(v, lags);
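To illustrate the default lag count, here is a small sketch of how it could be derived from the number of samples. The helper name `default_lags` is hypothetical, and rounding the logarithm up is an assumption: the text only says the logarithm is used.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of Boost): natural logarithm of the sample
// count, rounded up. The rounding direction is an assumption.
std::int64_t default_lags(std::int64_t n) {
    return static_cast<std::int64_t>(std::ceil(std::log(static_cast<double>(n))));
}
```

Under that assumption, the 8192 samples used in the example above would give 10 lags, since ln(8192) is about 9.01.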

Finally, it is sometimes relevant to specify how many degrees of freedom were used in creating the model from which the residuals were computed. This does not affect the test statistic Q, but only the p-value. If you need to specify the number of degrees of freedom, use

int64_t fit_dof = 2;
auto [Q, p] = ljung_box(v, -1, fit_dof);

For example, if you fit your data with an ARMA(p, q) model (or, after differencing, an ARIMA(p, d, q) model), then fit_dof = p + q.
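The effect of fit_dof on the p-value can be sketched without any library support. In the standard Ljung-Box test, the p-value is the upper tail of a chi-squared distribution with lags - fit_dof degrees of freedom, evaluated at Q. The sketch below assumes that definition and handles only an even number of degrees of freedom, where the tail has a closed form. Both function names are hypothetical and are not part of Boost.

```cpp
#include <cmath>
#include <stdexcept>

// Upper tail P(X > q) of a chi-squared variable with an even number of
// degrees of freedom dof = 2m, using the closed form
//   exp(-q/2) * sum_{j=0}^{m-1} (q/2)^j / j!.
// Hypothetical helper for illustration; an odd dof would need the
// incomplete gamma function instead.
double chi_squared_upper_tail_even(double q, int dof) {
    if (q < 0 || dof <= 0 || dof % 2 != 0) {
        throw std::domain_error("Require q >= 0 and a positive, even dof.");
    }
    double const x = q / 2;
    double term = 1.0;  // holds x^j / j!, starting at j = 0
    double sum = 0.0;
    for (int j = 0; j < dof / 2; ++j) {
        sum += term;
        term *= x / (j + 1);
    }
    return std::exp(-x) * sum;
}

// p-value for a statistic Q computed with `lags` lags on residuals from a
// model with `fit_dof` fitted parameters. Q itself does not depend on fit_dof.
double ljung_box_p_value(double Q, int lags, int fit_dof) {
    return chi_squared_upper_tail_even(Q, lags - fit_dof);
}
```

With Q = 5.94734 and 10 lags, this sketch gives p of about 0.8197 for fit_dof = 0, which agrees with the example output above if the default lag count for 8192 samples is 10. Setting fit_dof = 2 leaves Q unchanged and lowers p to about 0.653.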

