Here are some commented examples to copy and paste from; they should let you kick off a project with Boost.Histogram. If you prefer a traditional structured exposition, go to the user guide.
Boost.Histogram uses axis objects to convert input values into indices. The library comes with several built-in axis types, which can be configured via template parameters. This already gives you a lot of flexibility should you need it; otherwise, just use the defaults. Beyond that, you can write your own axis type.
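To make the idea of converting a value into an index concrete, here is a minimal sketch in plain C++ of what a regular axis with equidistant semi-open bins has to compute. The function `regular_index` is a hypothetical helper written for illustration; it is not part of Boost.Histogram and is not meant to reproduce the library's implementation.

```cpp
// Hypothetical helper, NOT part of Boost.Histogram: maps a value to a bin
// index on a regular axis with `n` equidistant semi-open bins on [lo, hi).
// Returns -1 for the underflow bin and n for the overflow bin.
int regular_index(double x, int n, double lo, double hi) {
  const double z = (x - lo) / (hi - lo); // normalized position; [0, 1) is inside
  if (z < 0.0) return -1;                // below the lower edge: underflow
  if (!(z < 1.0)) return n;              // at or above the upper edge (or NaN): overflow
  return static_cast<int>(z * n);        // truncation equals floor because z >= 0
}
```

For an axis with 6 bins on [-1.0, 2.0), `regular_index(-1.0, 6, -1.0, 2.0)` yields 0 and `regular_index(2.0, 6, -1.0, 2.0)` yields 6, because each bin includes its lower edge and excludes its upper edge.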
When the axis types for the histogram are known at compile-time, the library generates the fastest and most efficient code for you. Here is such an example.
```cpp
#include <algorithm>           // std::for_each
#include <boost/format.hpp>    // only needed for printing
#include <boost/histogram.hpp> // make_histogram, regular, weight, indexed
#include <cassert>             // assert (used to test this example for correctness)
#include <functional>          // std::ref
#include <iostream>            // std::cout, std::flush
#include <sstream>             // std::ostringstream

int main() {
  using namespace boost::histogram; // strip the boost::histogram prefix

  /*
    Create a 1d-histogram with a regular axis that has 6 equidistant bins on
    the real line from -1.0 to 2.0, and label it as "x". A family of overloaded
    factory functions called `make_histogram` makes creating histograms easy.

    A regular axis is a sequence of semi-open bins. Extra under- and overflow
    bins extend the axis by default (this can be turned off).

    index    :      -1  |  0  |  1  |  2  |  3  |  4  |  5  |  6
    bin edges:  -inf  -1.0  -0.5   0.0   0.5   1.0   1.5   2.0   inf
  */
  auto h = make_histogram(axis::regular<>(6, -1.0, 2.0, "x"));

  /*
    Let's fill a histogram with data, typically this happens in a loop.

    STL algorithms are supported. std::for_each is very convenient to fill a
    histogram from an iterator range. Use std::ref in the call, if you don't
    want std::for_each to make a copy of your histogram.
  */
  auto data = {-0.5, 1.1, 0.3, 1.7};
  std::for_each(data.begin(), data.end(), std::ref(h));
  // let's fill some more values manually
  h(-1.5); // is placed in underflow bin -1
  h(-1.0); // is placed in bin 0, bin interval is semi-open
  h(2.0);  // is placed in overflow bin 6, bin interval is semi-open
  h(20.0); // is placed in overflow bin 6

  /*
    This does a weighted fill using the `weight` function as an additional
    argument. It may appear at the beginning or end of the argument list. C++
    doesn't have keyword arguments like Python, this is the next-best thing.
  */
  h(0.1, weight(1.0));

  /*
    Iterate over bins with the `indexed` range generator, which provides a
    special accessor object, that can be used to obtain the current bin index,
    and the current bin value by dereferencing (it acts like a pointer to the
    value). Using `indexed` is convenient and gives you better performance than
    looping over the histogram cells with hand-written for loops. By default,
    under- and overflow bins are skipped. Passing `coverage::all` as the
    optional second argument iterates over all bins.

    - Access the value with the dereference operator.
    - Access the current index with `index(d)` method of the accessor.
    - Access the corresponding bin interval view with `bin(d)`.

    The return type of `bin(d)` depends on the axis type (see the axis reference
    for details). It usually is a class that represents a semi-open interval.
    Edges can be accessed with methods `lower()` and `upper()`.
  */
  std::ostringstream os;
  for (auto&& x : indexed(h, coverage::all)) {
    os << boost::format("bin %2i [%4.1f, %4.1f): %i\n")
          % x.index() % x.bin().lower() % x.bin().upper() % *x;
  }

  std::cout << os.str() << std::flush;

  assert(os.str() == "bin -1 [-inf, -1.0): 1\n"
                     "bin  0 [-1.0, -0.5): 1\n"
                     "bin  1 [-0.5, -0.0): 1\n"
                     "bin  2 [-0.0,  0.5): 2\n"
                     "bin  3 [ 0.5,  1.0): 0\n"
                     "bin  4 [ 1.0,  1.5): 1\n"
                     "bin  5 [ 1.5,  2.0): 1\n"
                     "bin  6 [ 2.0,  inf): 2\n");
}
```
We passed the `regular` axis type directly to the `make_histogram` function. The library then generates a specialized histogram type with just one regular axis from a generic template.
Sometimes you don't know the number or types of axes at compile-time, because they depend on run-time information. Perhaps you want to write a command-line tool that generates histograms from input data, or you use this library as a back-end for a product with a GUI. This is possible as well; here is an example.
```cpp
#include <boost/format.hpp>
#include <boost/histogram.hpp>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <utility> // std::move
#include <vector>  // std::vector

int main() {
  using namespace boost::histogram;

  /*
    Create a histogram which can be configured dynamically at run-time. The axis
    configuration is first collected in a vector of axis::variant type, which
    can hold different axis types (those in its template argument list). Here,
    we use a variant that can store a regular and a category axis.
  */
  using reg = axis::regular<>;
  using cat = axis::category<std::string>;
  using variant = axis::variant<axis::regular<>, axis::category<std::string>>;
  std::vector<variant> axes;
  axes.emplace_back(cat({"red", "blue"}));
  axes.emplace_back(reg(3, 0.0, 1.0, "x"));
  axes.emplace_back(reg(3, 0.0, 1.0, "y"));
  // passing an iterator range also works here
  auto h = make_histogram(std::move(axes));

  // fill histogram with data, usually this happens in a loop
  h("red", 0.1, 0.2);
  h("blue", 0.7, 0.3);
  h("red", 0.3, 0.7);
  h("red", 0.7, 0.7);

  /*
    Print histogram by iterating over bins.
    Since the bin type of the category axis cannot be converted into a double,
    it cannot be handled by the polymorphic interface of axis::variant. We use
    axis::get to "cast" the variant type to the actual category type.
  */

  // get reference to category axis, performs a run-time checked static cast
  const auto& cat_axis = axis::get<cat>(h.axis(0));
  std::ostringstream os;
  for (auto&& x : indexed(h)) {
    os << boost::format("(%i, %i, %i) %4s [%3.1f, %3.1f) [%3.1f, %3.1f) %3.0f\n")
          % x.index(0) % x.index(1) % x.index(2)
          % cat_axis.bin(x.index(0))
          % x.bin(1).lower() % x.bin(1).upper()
          % x.bin(2).lower() % x.bin(2).upper()
          % *x;
  }

  std::cout << os.str() << std::flush;
  assert(os.str() == "(0, 0, 0)  red [0.0, 0.3) [0.0, 0.3)   1\n"
                     "(1, 0, 0) blue [0.0, 0.3) [0.0, 0.3)   0\n"
                     "(0, 1, 0)  red [0.3, 0.7) [0.0, 0.3)   0\n"
                     "(1, 1, 0) blue [0.3, 0.7) [0.0, 0.3)   0\n"
                     "(0, 2, 0)  red [0.7, 1.0) [0.0, 0.3)   0\n"
                     "(1, 2, 0) blue [0.7, 1.0) [0.0, 0.3)   1\n"
                     "(0, 0, 1)  red [0.0, 0.3) [0.3, 0.7)   0\n"
                     "(1, 0, 1) blue [0.0, 0.3) [0.3, 0.7)   0\n"
                     "(0, 1, 1)  red [0.3, 0.7) [0.3, 0.7)   0\n"
                     "(1, 1, 1) blue [0.3, 0.7) [0.3, 0.7)   0\n"
                     "(0, 2, 1)  red [0.7, 1.0) [0.3, 0.7)   0\n"
                     "(1, 2, 1) blue [0.7, 1.0) [0.3, 0.7)   0\n"
                     "(0, 0, 2)  red [0.0, 0.3) [0.7, 1.0)   1\n"
                     "(1, 0, 2) blue [0.0, 0.3) [0.7, 1.0)   0\n"
                     "(0, 1, 2)  red [0.3, 0.7) [0.7, 1.0)   0\n"
                     "(1, 1, 2) blue [0.3, 0.7) [0.7, 1.0)   0\n"
                     "(0, 2, 2)  red [0.7, 1.0) [0.7, 1.0)   1\n"
                     "(1, 2, 2) blue [0.7, 1.0) [0.7, 1.0)   0\n");
}
```
The axis configuration is passed to `make_histogram` as a `std::vector<axis::variant<...>>`, which can hold arbitrary sequences of axis types from a predefined set.
Run-time configurable histograms are slower than their compile-time brethren, but still pretty fast.
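One plausible source of this overhead is that the concrete axis type has to be looked up on every fill. The sketch below illustrates that pattern with C++17 `std::variant` and `std::visit`. The types `RegularAxis`, `IntegerAxis`, `AnyAxis` and the function `index_of` are hypothetical stand-ins invented for this illustration; they are not the library's classes, and the library's actual dispatch mechanism may differ.

```cpp
#include <cmath>   // std::floor
#include <variant> // std::variant, std::visit (C++17)
#include <vector>

// Hypothetical stand-in for a regular axis: n equidistant bins on [lo, hi).
struct RegularAxis {
  int n;
  double lo, hi;
  int index(double x) const {
    const double z = (x - lo) / (hi - lo);
    if (z < 0.0) return -1;         // underflow
    if (!(z < 1.0)) return n;       // overflow (also catches NaN)
    return static_cast<int>(z * n);
  }
};

// Hypothetical stand-in for an integer axis covering lo, lo + 1, ..., hi - 1.
struct IntegerAxis {
  int lo, hi;
  int index(double x) const {
    if (x < lo) return -1;          // underflow
    if (!(x < hi)) return hi - lo;  // overflow (also catches NaN)
    return static_cast<int>(std::floor(x)) - lo;
  }
};

// A run-time configurable axis is one of a predefined set of types.
using AnyAxis = std::variant<RegularAxis, IntegerAxis>;

// The concrete type is resolved at run time on every call via std::visit.
int index_of(const AnyAxis& axis, double x) {
  return std::visit([x](const auto& a) { return a.index(x); }, axis);
}
```

A `std::vector<AnyAxis>` can then hold any sequence of these axes, chosen at run time, at the price of one `std::visit` per axis and per fill. When the axis types are fixed at compile-time, the compiler can call `index` directly and inline it.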
Note: If you know already at compile-time that you will only use one axis type, ...
Note: If you care about maximum performance: In this example, ...
The library was designed to be very flexible and modular. The modularity is used, for example, to also provide profiles. Profiles are generalized histograms. A histogram counts how often an input falls into a particular cell. A profile accepts pairs of input values and a sample value. The profile computes the mean of the samples that end up in each cell. Have a look at the example, which should clear up any confusion.
```cpp
#include <boost/format.hpp>
#include <boost/histogram.hpp>
#include <cassert>
#include <iostream>
#include <sstream>

int main() {
  using namespace boost::histogram;

  /*
    Create a profile. Profiles not only count entries in each cell, but also
    compute the mean of a sample value in each cell.
  */
  auto p = make_profile(axis::regular<>(5, 0.0, 1.0));

  /*
    Fill profile with data, usually this happens in a loop. You pass the sample
    with the `sample` helper function. The sample can be the first or last
    argument.
  */
  p(0.1, sample(1));
  p(0.15, sample(3));
  p(0.2, sample(4));
  p(0.9, sample(5));

  /*
    Iterate over bins and print profile.
  */
  std::ostringstream os;
  for (auto&& x : indexed(p)) {
    os << boost::format("bin %i [%3.1f, %3.1f) count %i mean %g\n")
          % x.index() % x.bin().lower() % x.bin().upper()
          % x->count() % x->value();
  }

  std::cout << os.str() << std::flush;

  assert(os.str() == "bin 0 [0.0, 0.2) count 2 mean 2\n"
                     "bin 1 [0.2, 0.4) count 1 mean 4\n"
                     "bin 2 [0.4, 0.6) count 0 mean 0\n"
                     "bin 3 [0.6, 0.8) count 0 mean 0\n"
                     "bin 4 [0.8, 1.0) count 1 mean 5\n");
}
```
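To show what "computing the mean of the samples in each cell" amounts to, here is a small stand-alone sketch in plain C++. `MeanCell` and `Profile1D` are hypothetical names made up for this illustration; they are not the library's accumulator or profile types, and this sketch simply drops out-of-range inputs instead of using under- and overflow cells.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-cell accumulator: keeps a count and a running mean.
struct MeanCell {
  std::size_t count = 0;
  double mean = 0.0;
  void add(double sample) {
    ++count;
    // incremental mean update, avoids storing all samples
    mean += (sample - mean) / static_cast<double>(count);
  }
};

// Hypothetical 1d profile with n equidistant semi-open bins on [lo, hi).
struct Profile1D {
  int n;
  double lo, hi;
  std::vector<MeanCell> cells;

  Profile1D(int n_, double lo_, double hi_)
      : n(n_), lo(lo_), hi(hi_), cells(static_cast<std::size_t>(n_)) {}

  // The input x selects the cell; the sample is averaged inside that cell.
  void fill(double x, double sample) {
    const double z = (x - lo) / (hi - lo);
    if (z >= 0.0 && z < 1.0)  // out-of-range inputs are ignored in this sketch
      cells[static_cast<std::size_t>(z * n)].add(sample);
  }
};
```

Filling a `Profile1D(5, 0.0, 1.0)` with the same four (input, sample) pairs as in the example above leaves cell 0 with count 2 and mean 2, which matches the first line of the printed profile.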
The library was designed to work well with the C++ standard library. Here is an example of how to get the most common color from an image, using a 3d histogram and `std::max_element`.
```cpp
#include <algorithm>           // std::max_element
#include <boost/format.hpp>    // only needed for printing
#include <boost/histogram.hpp> // make_histogram, integer, indexed
#include <cassert>             // assert
#include <iostream>            // std::cout, std::endl
#include <sstream>             // std::ostringstream

int main() {
  using namespace boost::histogram;
  using namespace boost::histogram::literals;

  /*
    We make a 3d histogram for color values (r, g, b) in an image. We assume the
    values are of type unsigned char. The value range then is [0, 256). The
    integer axis is perfect for color values.
  */
  auto h = make_histogram(
    axis::integer<>(0, 256, "r"),
    axis::integer<>(0, 256, "g"),
    axis::integer<>(0, 256, "b")
  );

  /*
    We don't have real image data, so fill some fake data.
  */
  h(1, 2, 3);
  h(1, 2, 3);
  h(0, 1, 0);

  /*
    Now let's say we want to know which color is most common. We can use
    std::max_element on an indexed range for that.
  */
  auto ind = indexed(h);
  auto max_it = std::max_element(ind.begin(), ind.end());

  /*
    max_it is a special iterator to the histogram cell with the highest count.
    This iterator allows one to access the cell value, bin indices, and bin
    values. You need to dereference it twice to get to the cell value.
  */
  std::ostringstream os;
  os << boost::format("count=%i rgb=(%i, %i, %i)")
        % **max_it % max_it->bin(0_c) % max_it->bin(1) % max_it->bin(2);

  std::cout << os.str() << std::endl;

  assert(os.str() == "count=2 rgb=(1, 2, 3)");
}
```
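For comparison, the same task can be sketched without the library, using a flat array of counters and `std::max_element`. This shows the bookkeeping that the indexed range takes off your hands, namely translating the position of the maximum back into one index per axis. `ColorCounts`, `MostCommon` and `most_common` are hypothetical names for this illustration, and the storage layout chosen here is an assumption of the sketch, not a statement about the library's internals.

```cpp
#include <algorithm> // std::max_element
#include <cstddef>
#include <iterator>  // std::distance
#include <vector>

// Hypothetical dense 3d counter with n integer bins per axis (values 0..n-1).
// No bounds checking is done in this sketch.
struct ColorCounts {
  std::size_t n;
  std::vector<unsigned> counts; // flat storage, r varies fastest

  explicit ColorCounts(std::size_t n_) : n(n_), counts(n_ * n_ * n_, 0u) {}

  void fill(std::size_t r, std::size_t g, std::size_t b) {
    ++counts[r + n * (g + n * b)];
  }
};

struct MostCommon {
  unsigned count;
  std::size_t r, g, b;
};

// Find the cell with the highest count and decode its flat position
// back into the three per-axis indices.
MostCommon most_common(const ColorCounts& c) {
  const auto it = std::max_element(c.counts.begin(), c.counts.end());
  const auto lin = static_cast<std::size_t>(std::distance(c.counts.begin(), it));
  return {*it, lin % c.n, (lin / c.n) % c.n, lin / (c.n * c.n)};
}
```

With the library, the accessor returned by the indexed range already carries the per-axis indices and bins, so the manual decoding in `most_common` is not needed.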
The histograms get their great flexibility and performance from being templated, but this can make the types a bit cumbersome to write. Often it is possible to use `auto` to let the compiler deduce the type, but when you want to store histograms in a class, you need to write the type explicitly. The next example shows how this works.
```cpp
//////////////// Begin: put this in header file ////////////////////

#include <boost/histogram.hpp> // histogram, make_histogram, axis types
#include <tuple>               // std::tuple
#include <vector>              // std::vector

// use this when axis configuration is fixed to get highest performance
struct HolderOfStaticHistogram {
  // put axis types here
  using axes_t = std::tuple<
    boost::histogram::axis::regular<>,
    boost::histogram::axis::integer<>
  >;
  using hist_t = boost::histogram::histogram<axes_t>;
  hist_t hist_;
};

// use this when axis configuration should be flexible
struct HolderOfDynamicHistogram {
  // put all axis types here that you are going to use
  using axis_t = boost::histogram::axis::variant<
    boost::histogram::axis::regular<>,
    boost::histogram::axis::variable<>,
    boost::histogram::axis::integer<>
  >;
  using axes_t = std::vector<axis_t>;
  using hist_t = boost::histogram::histogram<axes_t>;
  hist_t hist_;
};

//////////////// End: put this in header file ////////////////////

int main() {
  using namespace boost::histogram;

  HolderOfStaticHistogram hs;
  hs.hist_ = make_histogram(axis::regular<>(5, 0, 1), axis::integer<>(0, 3));
  // now assign a different histogram
  hs.hist_ = make_histogram(axis::regular<>(3, 1, 2), axis::integer<>(4, 6));
  // hs.hist_ = make_histogram(axis::regular<>(5, 0, 1)); does not work;
  // the static histogram cannot change the number or order of axis types

  HolderOfDynamicHistogram hd;
  hd.hist_ = make_histogram(axis::regular<>(5, 0, 1), axis::integer<>(0, 3));
  // now assign a different histogram
  hd.hist_ = make_histogram(axis::regular<>(3, -1, 2));
  // and assign another
  hd.hist_ = make_histogram(axis::integer<>(0, 5), axis::integer<>(3, 5));
}
```