#include <boost/math/distributions/kolmogorov_smirnov.hpp>
namespace boost{ namespace math{

template <class RealType = double,
          class Policy   = policies::policy<> >
class kolmogorov_smirnov_distribution;

typedef kolmogorov_smirnov_distribution<> kolmogorov_smirnov;

template <class RealType, class Policy>
class kolmogorov_smirnov_distribution
{
public:
   typedef RealType  value_type;
   typedef Policy    policy_type;

   // Constructor:
   kolmogorov_smirnov_distribution(RealType n);

   // Accessor to parameter:
   RealType number_of_observations()const;
};

}} // namespaces
The Kolmogorov-Smirnov test in statistics compares two empirical distributions, or an empirical distribution against any theoretical distribution.^{[1]} It makes use of a specific distribution, informally known in the literature as the Kolmogorov-Smirnov distribution, which is implemented here.
Formally, if n observations are taken from a theoretical distribution G(x), and if G_{n}(x) represents the empirical CDF of those n observations, then the test statistic

$$D_{n} = \sup_{x} \left| G_{n}(x) - G(x) \right|$$

will be distributed according to a Kolmogorov-Smirnov distribution parameterized by n.
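As a concrete reading of the test statistic, the sketch below computes the largest absolute gap between the empirical CDF of a sample and a theoretical CDF. The names `ks_statistic` and `uniform01_cdf` are hypothetical helpers introduced for illustration; they are not part of Boost.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Boost function): returns
// sup_x |G_n(x) - G(x)| for a sample and a theoretical CDF G.
template <class Cdf>
double ks_statistic(std::vector<double> sample, Cdf G)
{
    assert(!sample.empty());
    std::sort(sample.begin(), sample.end());
    const double n = static_cast<double>(sample.size());
    double d = 0.0;
    for (std::size_t i = 0; i < sample.size(); ++i) {
        const double g = G(sample[i]);
        // The empirical CDF jumps from i/n to (i+1)/n at the i-th order
        // statistic, so the supremum is reached on one side of a jump.
        d = std::max(d, std::max(g - i / n, (i + 1) / n - g));
    }
    return d;
}

// Uniform(0, 1) CDF, used here only to exercise the helper.
inline double uniform01_cdf(double x)
{
    return std::min(1.0, std::max(0.0, x));
}
```

For the sample {0.25, 0.5, 0.75} tested against the uniform distribution on [0, 1], the gap is largest just before the first jump and just after the last one, giving a statistic of 0.25.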
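The exact form of a Kolmogorov-Smirnov distribution is the subject of a large, decades-old literature.^{[2]} In the interest of simplicity, Boost implements the first-order, limiting form of this distribution (the same form originally identified by Kolmogorov^{[3]}), namely

$$\lim_{n \to \infty} F_{n}(x / \sqrt{n}) = 1 + 2 \sum_{k=1}^{\infty} (-1)^{k} e^{-2 k^{2} x^{2}}$$

where F_{n} denotes the CDF of the test statistic for n observations.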
Note that while the exact distribution only has support over [0, 1], this limiting form has positive mass above unity, particularly for small n. The following graph illustrates how the distribution changes for different values of n:
kolmogorov_smirnov_distribution(RealType n);
Constructs a Kolmogorov-Smirnov distribution with n observations.
Requires n > 0, otherwise calls domain_error.
RealType number_of_observations()const;
Returns the parameter n from which this object was constructed.
All the usual non-member accessor functions that are generic to all distributions are supported: Cumulative Distribution Function, Probability Density Function, Quantile, Hazard Function, Cumulative Hazard Function, mean, median, mode, variance, standard deviation, skewness, kurtosis, kurtosis_excess, range and support.
The domain of the random variable is [0, +∞].
The CDF of the Kolmogorov-Smirnov distribution is implemented in terms of the fourth Jacobi Theta function; please refer to the accuracy ULP plots for that function.
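For intuition about what the theta function evaluates here, the limiting CDF can also be written as a plain alternating series in n*x*x. The sketch below sums that series directly. It is a naive illustration and not Boost's implementation: `limiting_cdf` is a hypothetical name, and the direct sum loses accuracy to cancellation for very small x, which is one reason a dedicated theta-function routine is preferable.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Boost's implementation): sums
//   1 + 2 * sum_{k>=1} (-1)^k * exp(-2 k^2 n x^2)
// until the terms drop below double precision.
inline double limiting_cdf(double n, double x)
{
    assert(n > 0 && x >= 0);
    const double t = 2.0 * n * x * x;
    if (t == 0.0) return 0.0;
    double sum = 1.0;
    double sign = -1.0;
    for (int k = 1; k <= 1000; ++k) {   // the cap keeps the loop finite for tiny t
        const double term = std::exp(-t * k * k);
        sum += 2.0 * sign * term;
        if (term < 1e-17) break;
        sign = -sign;
    }
    return sum;
}
```

With n = 1 and x = 1 the sum is roughly 0.73, so about 27% of the mass of the limiting form lies above unity for a single observation. The value depends on n and x only through n*x*x, so `limiting_cdf(100, 0.1)` gives the same result.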
The PDF is implemented separately, and the following ULP plot illustrates its accuracy:
Because changing n only rescales the PDF, with both the x axis and the density scaled by the square root of n, the above plot is representative for all values of n. Note that for present purposes, "accuracy" refers to deviations from the limiting approximation, rather than deviations from the exact distribution.
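The scaling by the square root of n can be checked with a term-by-term derivative of the limiting series. This is a sketch under that assumption only: `limiting_pdf` is a hypothetical name, and Boost's own PDF code is not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: term-by-term derivative of the limiting CDF,
//   pdf(x) = 8 n x * sum_{k>=1} (-1)^(k+1) k^2 exp(-2 k^2 n x^2).
inline double limiting_pdf(double n, double x)
{
    assert(n > 0 && x >= 0);
    const double t = 2.0 * n * x * x;
    if (t == 0.0) return 0.0;
    double sum = 0.0;
    double sign = 1.0;
    for (int k = 1; k <= 1000; ++k) {   // the cap keeps the loop finite for tiny t
        const double term = static_cast<double>(k) * k * std::exp(-t * k * k);
        sum += sign * term;
        if (term < 1e-17) break;
        sign = -sign;
    }
    return 8.0 * n * x * sum;
}
```

In this form the density for n observations at x equals sqrt(n) times the density for one observation at x*sqrt(n). For example, `limiting_pdf(4, 0.5)` is twice `limiting_pdf(1, 1.0)`.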
In the following table, n is the number of observations, x is the random variable, π is Archimedes' constant, and ζ(3) is Apéry's constant.
| Function | Implementation Notes |
|---|---|
| cdf | Using the relation: cdf = jacobi_theta4tau(0, 2*x*x*n/π) |
| pdf | Using a manual derivative of the CDF |
| cdf complement | When x*x*n == 0: 1. When 2*x*x*n <= π: 1 - jacobi_theta4tau(0, 2*x*x*n/π). When 2*x*x*n > π: -jacobi_theta4m1tau(0, 2*x*x*n/π) |
| quantile | Using a Newton-Raphson iteration |
| quantile from the complement | Using a Newton-Raphson iteration |
| mode | Using a run-time PDF maximizer |
| mean | sqrt(π/2) * ln(2) / sqrt(n) |
| variance | (π^{2}/12 - π/2*ln^{2}(2))/n |
| skewness | (9/16*sqrt(π/2)*ζ(3)/n^{3/2} - 3 * mean * variance - mean^{3}) / (variance^{3/2}) |
| kurtosis | (7/720*π^{4}/n^{2} - 4 * mean * skewness * variance^{3/2} - 6 * mean^{2} * variance - mean^{4}) / (variance^{2}) |
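The Newton-Raphson approach to the quantile can be sketched with nothing more than the limiting series and its derivative. Everything below is an illustrative assumption rather than Boost's code: the names `nr_cdf`, `nr_pdf` and `newton_quantile` are hypothetical, the starting guess is the mean from the table, and the naive series is only adequate for probabilities that are not extremely close to 0.

```cpp
#include <cassert>
#include <cmath>

// Limiting CDF: 1 + 2 * sum_{k>=1} (-1)^k exp(-2 k^2 n x^2).
inline double nr_cdf(double n, double x)
{
    const double t = 2.0 * n * x * x;
    if (t == 0.0) return 0.0;
    double sum = 1.0, sign = -1.0;
    for (int k = 1; k <= 1000; ++k) {
        const double term = std::exp(-t * k * k);
        sum += 2.0 * sign * term;
        if (term < 1e-17) break;
        sign = -sign;
    }
    return sum;
}

// Term-by-term derivative of the series above.
inline double nr_pdf(double n, double x)
{
    const double t = 2.0 * n * x * x;
    if (t == 0.0) return 0.0;
    double sum = 0.0, sign = 1.0;
    for (int k = 1; k <= 1000; ++k) {
        const double term = static_cast<double>(k) * k * std::exp(-t * k * k);
        sum += sign * term;
        if (term < 1e-17) break;
        sign = -sign;
    }
    return 8.0 * n * x * sum;
}

// Newton-Raphson: repeat x <- x - (cdf(x) - p) / pdf(x).
inline double newton_quantile(double n, double p)
{
    assert(n > 0 && p > 0 && p < 1);
    const double pi = std::acos(-1.0);
    double x = std::sqrt(pi / 2) * std::log(2.0) / std::sqrt(n); // mean as first guess
    for (int i = 0; i < 100; ++i) {
        double next = x - (nr_cdf(n, x) - p) / nr_pdf(n, x);
        if (!(next > 0)) next = x / 2;          // stay inside the support
        const bool done = std::fabs(next - x) <= 1e-13 * x;
        x = next;
        if (done) break;
    }
    return x;
}
```

For n = 1 and p = 0.95 the iteration settles near 1.358, the familiar 5% critical value of the limiting distribution. Dividing that by sqrt(n) gives the critical value for other sample sizes.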
^{[2]} Simard, R. and L'Ecuyer, P. (2011). "Computing the Two-Sided Kolmogorov-Smirnov Distribution". Journal of Statistical Software, vol. 39, no. 11.
^{[3]} Kolmogorov, A. (1933). "Sulla determinazione empirica di una legge di distribuzione". G. Ist. Ital. Attuari. 4: 83–91.