When calibrating or comparing a scientific instrument or measurement method of some kind, we want to answer the question "Does an observed sample mean differ from the "true" mean in any significant way?" If it does, then we have evidence of a systematic difference. This question can be answered with a Student's t-test; more information can be found on the NIST site.
Of course, the assignment of "true" to one mean may be quite arbitrary; often it is simply a "traditional" method of measurement.
The following example code is taken from the example program students_t_single_sample.cpp.
We'll begin by defining a procedure to determine which of the possible hypotheses are rejected or not-rejected at a given significance level:
Note | |
---|---|
Non-statisticians might say 'not-rejected' means 'accepted', (often of the null-hypothesis) implying, wrongly, that there really IS no difference, but statisticians eschew this to avoid implying that there is positive evidence of 'no difference'. 'Not-rejected' here means there is no evidence of difference, but there still might well be a difference. For example, see argument from ignorance and Absence of evidence does not constitute evidence of absence. |
    // Needed includes:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>
    #include <iomanip>
    // Bring everything into global namespace for ease of use:
    using namespace boost::math;
    using namespace std;

    void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
    {
       //
       // M = true mean.
       // Sm = Sample Mean.
       // Sd = Sample Standard Deviation.
       // Sn = Sample Size.
       // alpha = Significance Level.
Most of the procedure is pretty-printing, so let's just focus on the calculation. We begin by calculating the t-statistic:
    // Difference in means:
    double diff = Sm - M;
    // Degrees of freedom:
    unsigned v = Sn - 1;
    // t-statistic:
    double t_stat = diff * sqrt(double(Sn)) / Sd;
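As a quick check of the arithmetic above, the following minimal sketch recomputes the t-statistic from summary statistics using only the standard library. The function name `t_statistic` is a hypothetical helper introduced here for illustration; it is not part of the example program.

```cpp
#include <cmath>

// Hypothetical helper (not part of students_t_single_sample.cpp):
// t-statistic for a single sample, computed from summary statistics.
//   M  = true mean
//   Sm = sample mean
//   Sd = sample standard deviation
//   Sn = sample size
double t_statistic(double M, double Sm, double Sd, unsigned Sn)
{
   double diff = Sm - M; // difference in means
   return diff * std::sqrt(static_cast<double>(Sn)) / Sd;
}
```

With the mercury figures used later in this section, `t_statistic(38.9, 37.8, 0.96437, 3)` gives roughly -1.976, in line with the printed T Statistic. Small differences in the last digits are expected because the printed standard deviation is rounded.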
Finally, we calculate the probability from the t-statistic. If we're interested simply in whether there is a difference (either less or greater) or not, we don't care about the sign of the t-statistic, and we take the complement of the probability for comparison to the significance level:
    students_t dist(v);
    double q = cdf(complement(dist, fabs(t_stat)));
The procedure then prints out the results of the various tests that can be done. These can be summarised in the following table:
Hypothesis | Test |
---|---|
The Null-hypothesis: there is no difference in means | Reject if complement of CDF for \|t\| < significance level / 2 |
The Alternative-hypothesis: there is a difference in means | Reject if complement of CDF for \|t\| > significance level / 2 |
The Alternative-hypothesis: the sample mean is less than the true mean | Reject if CDF of t > 1 - significance level |
The Alternative-hypothesis: the sample mean is greater than the true mean | Reject if CDF of t < significance level |
Note | |
---|---|
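Notice that the comparisons are against significance level / 2 for a two-sided test and against the significance level itself for a one-sided test. |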
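The decision rules can be sketched as a small function that takes the two tail probabilities of the t-statistic and the significance level. The type and function names below are hypothetical, introduced only for illustration. The one-sided rules are written so that they reproduce the sample output printed later in this section: "Mean <" is rejected when the upper tail of t falls below alpha, and "Mean >" is rejected when the lower tail does.

```cpp
#include <algorithm>

// Hypothetical result type: which alternative hypotheses are rejected.
struct t_test_decisions
{
   bool reject_not_equal; // alternative "Mean != M"
   bool reject_less;      // alternative "Mean <  M"
   bool reject_greater;   // alternative "Mean >  M"
};

// Hypothetical helper.
//   cdf_t  = CDF of the t-statistic,               e.g. cdf(dist, t_stat)
//   ccdf_t = complement of the CDF of t-statistic, e.g. cdf(complement(dist, t_stat))
//   alpha  = significance level
t_test_decisions decide(double cdf_t, double ccdf_t, double alpha)
{
   // The distribution is symmetric, so the complement of the CDF
   // for |t| is the smaller of the two tails.
   double q = std::min(cdf_t, ccdf_t);

   t_test_decisions d;
   d.reject_not_equal = q > alpha / 2; // two-sided: compare against alpha / 2
   d.reject_less      = ccdf_t < alpha; // same as CDF of t > 1 - alpha
   d.reject_greater   = cdf_t < alpha;  // one-sided: compare against alpha
   return d;
}
```

Both tails are passed in separately, rather than computing one as 1 minus the other, so that a very small tail probability is not lost to rounding.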
Now that we have all the parts in place, let's take a look at some sample output, first using the Heat flow data from the NIST site. The data set was collected by Bob Zarr of NIST in January, 1990 from a heat flow meter calibration and stability analysis. The corresponding dataplot output for this test can be found in section 3.5.2 of the NIST/SEMATECH e-Handbook of Statistical Methods.
    __________________________________
    Student t test for a single sample
    __________________________________

    Number of Observations                        =  195
    Sample Mean                                   =  9.26146
    Sample Standard Deviation                     =  0.02279
    Expected True Mean                            =  5.00000

    Sample Mean - Expected Test Mean              =  4.26146
    Degrees of Freedom                            =  194
    T Statistic                                   =  2611.28380
    Probability that difference is due to chance  =  0.000e+000

    Results for Alternative Hypothesis and alpha  =  0.0500

    Alternative Hypothesis     Conclusion
    Mean != 5.000              NOT REJECTED
    Mean  < 5.000              REJECTED
    Mean  > 5.000              NOT REJECTED
You will note the line that says the probability that the difference is due to chance is zero. From a philosophical point of view, of course, the probability can never reach zero. However, in this case the calculated probability is smaller than the smallest representable double precision number, hence the appearance of a zero here. Whatever its "true" value is, we know it must be extraordinarily small, so the alternative hypothesis - that there is a difference in means - is not rejected.
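A rough order-of-magnitude check shows why a zero is printed. The sketch below is not how the library computes the probability: it only evaluates the dominant factor of the Student's t density, (1 + t^2/v)^(-(v+1)/2), in log10 space, on the assumption that this factor sets the scale of the tail probability. Both helper names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: log10 of the dominant factor of the Student's t
// density, (1 + t^2/v)^(-(v+1)/2). Used only as a rough scale for the
// tail probability, not as the probability itself.
double log10_t_density_factor(double t, double v)
{
   return -0.5 * (v + 1.0) * std::log10(1.0 + t * t / v);
}

// log10 of the smallest positive (subnormal) double.
double log10_smallest_double()
{
   return std::log10(std::numeric_limits<double>::denorm_min());
}
```

For t = 2611.2838 and v = 194 the first function returns about -443, far below the roughly -323 returned by the second, so a quantity of this size cannot be stored as a nonzero double.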
For comparison, the next example data output is taken from P. K. Hou, O. W. Lau & M. C. Wong, Analyst (1983) vol. 108, p 64, and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55, J. C. Miller and J. N. Miller, Ellis Horwood, ISBN 0 13 0309907. The values result from the determination of mercury by cold-vapour atomic absorption.
    __________________________________
    Student t test for a single sample
    __________________________________

    Number of Observations                        =  3
    Sample Mean                                   =  37.80000
    Sample Standard Deviation                     =  0.96437
    Expected True Mean                            =  38.90000

    Sample Mean - Expected Test Mean              =  -1.10000
    Degrees of Freedom                            =  2
    T Statistic                                   =  -1.97566
    Probability that difference is due to chance  =  1.869e-001

    Results for Alternative Hypothesis and alpha  =  0.0500

    Alternative Hypothesis     Conclusion
    Mean != 38.900             REJECTED
    Mean  < 38.900             NOT REJECTED
    Mean  > 38.900             NOT REJECTED
As you can see, the small number of measurements (3) has led to a large uncertainty in the location of the true mean. So even though there appears to be a difference between the sample mean and the expected true mean, we conclude that there is no significant difference, and are unable to reject the null hypothesis. However, if we were to lower the bar for acceptance to alpha = 0.1 (a 90% confidence level), we see a different output:
    __________________________________
    Student t test for a single sample
    __________________________________

    Number of Observations                        =  3
    Sample Mean                                   =  37.80000
    Sample Standard Deviation                     =  0.96437
    Expected True Mean                            =  38.90000

    Sample Mean - Expected Test Mean              =  -1.10000
    Degrees of Freedom                            =  2
    T Statistic                                   =  -1.97566
    Probability that difference is due to chance  =  1.869e-001

    Results for Alternative Hypothesis and alpha  =  0.1000

    Alternative Hypothesis     Conclusion
    Mean != 38.900             REJECTED
    Mean  < 38.900             NOT REJECTED
    Mean  > 38.900             REJECTED
In this case, we really have a borderline result, and more data (and/or more accurate data) are needed for a more convincing conclusion.
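The mercury figures can be cross-checked without the library, because for exactly two degrees of freedom the Student's t CDF has the closed form F(t) = 1/2 + t / (2 sqrt(t^2 + 2)). The sketch below uses that special case only; the function names are hypothetical, and it is not a substitute for `students_t` at other degrees of freedom.

```cpp
#include <cmath>

// Student's t CDF for exactly 2 degrees of freedom (closed form).
double students_t_cdf_2dof(double t)
{
   return 0.5 + t / (2.0 * std::sqrt(t * t + 2.0));
}

// Complement of the CDF for |t| with 2 degrees of freedom: the
// one-sided tail that the two-sided test compares against alpha / 2.
double upper_tail_abs_2dof(double t)
{
   return 1.0 - students_t_cdf_2dof(std::fabs(t));
}
```

For t = -1.97566, `upper_tail_abs_2dof` gives about 0.0934. Doubling it gives about 0.1869, which agrees with the printed 1.869e-001, so the printed figure appears to be the two-sided probability. The lower tail, also about 0.0934, lies between alpha = 0.05 and alpha = 0.1, which is consistent with only the "Mean >" line changing between the two runs.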