Imagine you have an event (let's call it a "failure", though
we could equally well call it a success if we felt it was a 'good'
event) that you know will occur, on average, once in every N trials. You may want to know
how many trials you need to conduct to be P% sure of observing at least
k such failures. If the failure events follow a negative binomial distribution
(each trial either succeeds or fails), then the static member function
`negative_binomial_distribution<>::find_minimum_number_of_trials`
can be used to estimate the minimum number of trials required to be
P% sure of observing the desired number of failures.

The example program neg_binomial_sample_sizes.cpp demonstrates its usage.

It centres around a routine that prints out a table of minimum sample sizes (number of trials) for various probability thresholds:

void find_number_of_trials(double failures, double p);

First define a table of significance levels: each value is the maximum
acceptable probability that `failures` or fewer failure events
will be observed.

double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

The confidence value as a percentage is (1 - alpha) * 100, so an alpha of 0.05 corresponds to 95% confidence that the desired number of failures will be observed. The values range from a very low 0.5 (50% confidence) up to an extremely high 0.00001 (99.999% confidence).

Much of the rest of the program is pretty-printing; the important part is the calculation of the minimum number of trials required for each value of alpha, using:

(int)ceil(negative_binomial::find_minimum_number_of_trials(failures, p, alpha[i]));

`find_minimum_number_of_trials` returns a double, so `ceil` rounds this up to ensure we have an integral minimum number of trials.

    void find_number_of_trials(double failures, double p)
    {
       // trials = number of trials
       // failures = number of failures before achieving required success(es).
       // p = success fraction (0 <= p <= 1.).
       //
       // Calculate how many trials we need to ensure the
       // required number of failures DOES exceed "failures".

       cout << "\n""Target number of failures = " << (int)failures;
       cout << ", Success fraction = " << fixed << setprecision(1) << 100 * p << "%" << endl;
       // Print table header:
       cout << "____________________________\n"
               "Confidence Min Number\n"
               " Value (%) Of Trials \n"
               "____________________________\n";
       // Now print out the data for the alpha table values.
       for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
       { // Confidence values %:
          cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]) << " "
             // find_minimum_number_of_trials
             << setw(6) << right
             << (int)ceil(negative_binomial::find_minimum_number_of_trials(failures, p, alpha[i]))
             << endl;
       }
       cout << endl;
    } // void find_number_of_trials(double failures, double p)

Finally, we can produce some tables of minimum trials for the chosen confidence levels:

    int main()
    {
       find_number_of_trials(5, 0.5);
       find_number_of_trials(50, 0.5);
       find_number_of_trials(500, 0.5);
       find_number_of_trials(50, 0.1);
       find_number_of_trials(500, 0.1);
       find_number_of_trials(5, 0.9);

       return 0;
    } // int main()

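Note: Since we're calculating the *minimum* number of trials required, we err on the safe side and take the ceiling of the result. Had we been calculating the *maximum* number of trials permitted to observe fewer than a certain number of *failures*, we would have taken the floor instead: `floor(negative_binomial::find_minimum_number_of_trials(failures, p, alpha[i]))`,
which would give us the largest number of trials we could conduct
and still be P% sure of observing *failures* or fewer failure events.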

We'll finish off by looking at some sample output. First, suppose we wish to observe at least 5 "failures" with a 50/50 (0.5) chance of success or failure:

    Target number of failures = 5,   Success fraction = 50%
    ____________________________
    Confidence        Min Number
     Value (%)        Of Trials
    ____________________________
        50.000          11
        75.000          14
        90.000          17
        95.000          18
        99.000          22
        99.900          27
        99.990          31
        99.999          36

So 18 trials or more would yield a 95% chance that at least our 5 required failures would be observed.

Compare that to what happens if the success ratio is 90%:

    Target number of failures = 5.000,   Success fraction = 90.000%
    ____________________________
    Confidence        Min Number
     Value (%)        Of Trials
    ____________________________
        50.000          57
        75.000          73
        90.000          91
        95.000         103
        99.000         127
        99.900         159
        99.990         189
        99.999         217

So now 103 trials are required to observe at least 5 failures with 95% certainty.