Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for a snapshot of the develop branch, built from commit 0f79ae966a.

Getting started

1d-histogram with axis types known at compile-time
3d-histogram (axis configuration defined at run-time)
1d-profile
Standard library algorithms
Making classes that hold histograms

Here are some commented examples to copy-paste from; they should allow you to kick off a project with Boost.Histogram. If you prefer a traditional structured exposition, go to the user guide.

Boost.Histogram uses axis objects to convert input values into indices. The library comes with several built-in axis types, which can be configured via template parameters. This gives you a lot of flexibility should you need it; otherwise, just use the defaults. Beyond that, you can easily write your own axis type.
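To make the value-to-index conversion concrete, here is a minimal sketch in plain C++ of how an axis with equidistant bins might map a value to an index. The function regular_index and its parameters are hypothetical and only illustrate the idea; this is not the library's implementation. It follows the convention used in this guide, where -1 marks the underflow bin and n marks the overflow bin. Sending NaN to the overflow bin is an assumption of this sketch.

```cpp
#include <cassert> // assert (precondition check)

// Hypothetical helper, not part of Boost.Histogram: maps x to the index of one
// of n equidistant semi-open bins covering [lo, hi).
// Returns -1 for x < lo (underflow) and n for x >= hi (overflow).
inline int regular_index(double x, int n, double lo, double hi) {
  assert(n > 0 && lo < hi);
  const double z = (x - lo) / (hi - lo); // normalized position, [0, 1) if inside
  if (z < 1.0) {
    if (z >= 0.0) return static_cast<int>(z * n); // regular bin
    return -1;                                    // underflow
  }
  return n; // overflow; NaN also ends up here because both comparisons fail
}
```

With 6 bins on [-1.0, 2.0), regular_index(0.3, 6, -1.0, 2.0) yields 2 and regular_index(2.0, 6, -1.0, 2.0) yields 6, because the upper edge is excluded from the last regular bin.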

When the axis types for the histogram are known at compile-time, the library generates the fastest and most efficient code for you. Here is such an example.

#include <algorithm>           // std::for_each
#include <boost/format.hpp>    // only needed for printing
#include <boost/histogram.hpp> // make_histogram, regular, weight, indexed
#include <cassert>             // assert (used to test this example for correctness)
#include <functional>          // std::ref
#include <iostream>            // std::cout, std::flush
#include <sstream>             // std::ostringstream

int main() {
  using namespace boost::histogram; // strip the boost::histogram prefix

  /*
    Create a 1d-histogram with a regular axis that has 6 equidistant bins on
    the real line from -1.0 to 2.0, and label it as "x". A family of overloaded
    factory functions called `make_histogram` makes creating histograms easy.

    A regular axis is a sequence of semi-open bins. Extra under- and overflow
    bins extend the axis by default (this can be turned off).

    index    :      -1  |  0  |  1  |  2  |  3  |  4  |  5  |  6
    bin edges:  -inf  -1.0  -0.5   0.0   0.5   1.0   1.5   2.0   inf
  */
  auto h = make_histogram(axis::regular<>(6, -1.0, 2.0, "x"));

  /*
    Let's fill a histogram with data; typically this happens in a loop.

    STL algorithms are supported. std::for_each is very convenient to fill a
    histogram from an iterator range. Use std::ref in the call, if you don't
    want std::for_each to make a copy of your histogram.
  */
  auto data = {-0.5, 1.1, 0.3, 1.7};
  std::for_each(data.begin(), data.end(), std::ref(h));
  // let's fill some more values manually
  h(-1.5); // is placed in underflow bin -1
  h(-1.0); // is placed in bin 0, bin interval is semi-open
  h(2.0);  // is placed in overflow bin 6, bin interval is semi-open
  h(20.0); // is placed in overflow bin 6

  /*
    This does a weighted fill using the `weight` function as an additional
    argument. It may appear at the beginning or end of the argument list. C++
    doesn't have keyword arguments like Python, this is the next-best thing.
  */
  h(0.1, weight(1.0));

  /*
    Iterate over bins with the `indexed` range generator, which provides a
    special accessor object, that can be used to obtain the current bin index,
    and the current bin value by dereferencing (it acts like a pointer to the
    value). Using `indexed` is convenient and gives you better performance than
    looping over the histogram cells with hand-written for loops. By default,
    under- and overflow bins are skipped. Passing `coverage::all` as the
    optional second argument iterates over all bins.

    - Access the value with the dereference operator.
    - Access the current index with the `index(d)` method of the accessor.
    - Access the corresponding bin interval view with `bin(d)`.

    The return type of `bin(d)` depends on the axis type (see the axis reference
    for details). It usually is a class that represents a semi-open interval.
    Edges can be accessed with methods `lower()` and `upper()`.
  */

  std::ostringstream os;
  for (auto&& x : indexed(h, coverage::all)) {
    os << boost::format("bin %2i [%4.1f, %4.1f): %i\n")
          % x.index() % x.bin().lower() % x.bin().upper() % *x;
  }

  std::cout << os.str() << std::flush;

  assert(os.str() == "bin -1 [-inf, -1.0): 1\n"
                     "bin  0 [-1.0, -0.5): 1\n"
                     "bin  1 [-0.5, -0.0): 1\n"
                     "bin  2 [-0.0,  0.5): 2\n"
                     "bin  3 [ 0.5,  1.0): 0\n"
                     "bin  4 [ 1.0,  1.5): 1\n"
                     "bin  5 [ 1.5,  2.0): 1\n"
                     "bin  6 [ 2.0,  inf): 2\n");
}

We passed the regular axis type directly to the make_histogram function. The library then generates a specialized histogram type with just one regular axis from a generic template.

  • Pro: Many user errors are already caught at compile-time, not at run-time.
  • Con: You get template errors if you make a mistake, which may be hard to read. We try to give you useful error messages, but still.

Sometimes, you don't know the number or types of axes at compile-time, because they depend on run-time information. Perhaps you want to write a command-line tool that generates histograms from input data, or you use this library as a back-end for a product with a GUI. This is possible as well; here is an example.

#include <boost/format.hpp>
#include <boost/histogram.hpp>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
  using namespace boost::histogram;

  /*
    Create a histogram which can be configured dynamically at run-time. The axis
    configuration is first collected in a vector of axis::variant type, which
    can hold different axis types (those in its template argument list). Here,
    we use a variant that can store a regular and a category axis.
  */
  using reg = axis::regular<>;
  using cat = axis::category<std::string>;
  using variant = axis::variant<axis::regular<>, axis::category<std::string>>;
  std::vector<variant> axes;
  axes.emplace_back(cat({"red", "blue"}));
  axes.emplace_back(reg(3, 0.0, 1.0, "x"));
  axes.emplace_back(reg(3, 0.0, 1.0, "y"));
  // passing an iterator range also works here
  auto h = make_histogram(std::move(axes));

  // fill histogram with data, usually this happens in a loop
  h("red", 0.1, 0.2);
  h("blue", 0.7, 0.3);
  h("red", 0.3, 0.7);
  h("red", 0.7, 0.7);

  /*
    Print histogram by iterating over bins.
    Since the bin type of the category axis cannot be converted into a double,
    it cannot be handled by the polymorphic interface of axis::variant. We use
    axis::get to "cast" the variant type to the actual category type.
  */

  // get reference to category axis, performs a run-time checked static cast
  const auto& cat_axis = axis::get<cat>(h.axis(0));
  std::ostringstream os;
  for (auto&& x : indexed(h)) {
    os << boost::format("(%i, %i, %i) %4s [%3.1f, %3.1f) [%3.1f, %3.1f) %3.0f\n")
          % x.index(0) % x.index(1) % x.index(2)
          % cat_axis.bin(x.index(0))
          % x.bin(1).lower() % x.bin(1).upper()
          % x.bin(2).lower() % x.bin(2).upper()
          % *x;
  }

  std::cout << os.str() << std::flush;
  assert(os.str() == "(0, 0, 0)  red [0.0, 0.3) [0.0, 0.3)   1\n"
                     "(1, 0, 0) blue [0.0, 0.3) [0.0, 0.3)   0\n"
                     "(0, 1, 0)  red [0.3, 0.7) [0.0, 0.3)   0\n"
                     "(1, 1, 0) blue [0.3, 0.7) [0.0, 0.3)   0\n"
                     "(0, 2, 0)  red [0.7, 1.0) [0.0, 0.3)   0\n"
                     "(1, 2, 0) blue [0.7, 1.0) [0.0, 0.3)   1\n"
                     "(0, 0, 1)  red [0.0, 0.3) [0.3, 0.7)   0\n"
                     "(1, 0, 1) blue [0.0, 0.3) [0.3, 0.7)   0\n"
                     "(0, 1, 1)  red [0.3, 0.7) [0.3, 0.7)   0\n"
                     "(1, 1, 1) blue [0.3, 0.7) [0.3, 0.7)   0\n"
                     "(0, 2, 1)  red [0.7, 1.0) [0.3, 0.7)   0\n"
                     "(1, 2, 1) blue [0.7, 1.0) [0.3, 0.7)   0\n"
                     "(0, 0, 2)  red [0.0, 0.3) [0.7, 1.0)   1\n"
                     "(1, 0, 2) blue [0.0, 0.3) [0.7, 1.0)   0\n"
                     "(0, 1, 2)  red [0.3, 0.7) [0.7, 1.0)   0\n"
                     "(1, 1, 2) blue [0.3, 0.7) [0.7, 1.0)   0\n"
                     "(0, 2, 2)  red [0.7, 1.0) [0.7, 1.0)   1\n"
                     "(1, 2, 2) blue [0.7, 1.0) [0.7, 1.0)   0\n");
}

The axis configuration is passed to make_histogram as a std::vector<axis::variant<...>>, which can hold arbitrary sequences of axis types from a predefined set.

Run-time configurable histograms are slower than their compile-time brethren, but still pretty fast.

Note

If you already know at compile-time that you will only use one axis type, axis::regular<> for example, but not how many per histogram, then you can also pass a std::vector<axis::regular<>> to make_histogram. You get almost the same speed as in the very first case, where the axis configuration was fully known at compile-time.

Note

If you care about maximum performance: In this example, axis::category<std::string> is used with two string labels "red" and "blue". It is faster to use an enum, such as enum { red, blue };, together with an axis::category<> axis.

The library was designed to be very flexible and modular. The modularity is used, for example, to also provide profiles. Profiles are generalized histograms. A histogram counts how often an input falls into a particular cell. A profile accepts pairs of input values and a sample value. The profile computes the mean of the samples that end up in each cell. Have a look at the example, which should clear up any confusion.

#include <boost/format.hpp>
#include <boost/histogram.hpp>
#include <cassert>
#include <iostream>
#include <sstream>

int main() {
  using namespace boost::histogram;

  /*
    Create a profile. Profiles do not only count entries in each cell, but
    also compute the mean of a sample value in each cell.
  */
  auto p = make_profile(axis::regular<>(5, 0.0, 1.0));

  /*
    Fill profile with data, usually this happens in a loop. You pass the sample
    with the `sample` helper function. The sample can be the first or last
    argument.
  */
  p(0.1, sample(1));
  p(0.15, sample(3));
  p(0.2, sample(4));
  p(0.9, sample(5));

  /*
    Iterate over bins and print profile.
  */
  std::ostringstream os;
  for (auto&& x : indexed(p)) {
    os << boost::format("bin %i [%3.1f, %3.1f) count %i mean %g\n")
          % x.index() % x.bin().lower() % x.bin().upper()
          % x->count() % x->value();
  }

  std::cout << os.str() << std::flush;
  assert(os.str() == "bin 0 [0.0, 0.2) count 2 mean 2\n"
                     "bin 1 [0.2, 0.4) count 1 mean 4\n"
                     "bin 2 [0.4, 0.6) count 0 mean 0\n"
                     "bin 3 [0.6, 0.8) count 0 mean 0\n"
                     "bin 4 [0.8, 1.0) count 1 mean 5\n");
}
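The per-cell bookkeeping of a profile can be pictured with a small plain C++ accumulator that updates a count and a running mean for each sample. MeanCell is a hypothetical stand-in written for this illustration; it is not the accumulator the library uses, and it tracks nothing beyond count and mean.

```cpp
#include <cstddef> // std::size_t

// Hypothetical stand-in for the content of one profile cell.
struct MeanCell {
  std::size_t count = 0; // number of samples seen so far
  double mean = 0.0;     // running mean of those samples

  // Incremental update of the mean; no samples need to be stored.
  void add(double sample) {
    ++count;
    mean += (sample - mean) / static_cast<double>(count);
  }
};
```

Adding the samples 1 and 3 to one MeanCell gives a count of 2 and a mean of 2, which matches bin 0 in the profile example above.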

The library was designed to work well with the C++ standard library. Here is an example of how to get the most common color from an image, using a 3d histogram and std::max_element.

#include <algorithm>           // std::max_element
#include <boost/format.hpp>    // only needed for printing
#include <boost/histogram.hpp> // make_histogram, integer, indexed
#include <cassert>             // assert (used to test this example for correctness)
#include <iostream>            // std::cout, std::endl
#include <sstream>             // std::ostringstream

int main() {
  using namespace boost::histogram;
  using namespace boost::histogram::literals;

  /*
    We make a 3d histogram for color values (r, g, b) in an image. We assume the values
    are of type unsigned char. The value range then is [0, 256). The integer axis is perfect for
    color values.
  */
  auto h = make_histogram(
    axis::integer<>(0, 256, "r"),
    axis::integer<>(0, 256, "g"),
    axis::integer<>(0, 256, "b")
  );

  /*
    We don't have real image data, so fill some fake data.
  */
  h(1, 2, 3);
  h(1, 2, 3);
  h(0, 1, 0);

  /*
    Now let's say we want to know which color is most common. We can use std::max_element
    on an indexed range for that.
  */
  auto ind = indexed(h);
  auto max_it = std::max_element(ind.begin(), ind.end());

  /*
    max_it is a special iterator to the histogram cell with the highest count.
    This iterator allows one to access the cell value, bin indices, and bin values.
    You need to dereference it twice to get to the cell value.
  */
  std::ostringstream os;
  os << boost::format("count=%i rgb=(%i, %i, %i)")
        % **max_it % max_it->bin(0_c) % max_it->bin(1) % max_it->bin(2);

  std::cout << os.str() << std::endl;

  assert(os.str() == "count=2 rgb=(1, 2, 3)");
}

The histograms get their great flexibility and performance from being templated, but this can make the types a bit cumbersome to write. Often it is possible to use auto to let the compiler deduce the type, but when you want to store histograms in a class, you need to write the type explicitly. The next example shows how this works.

//////////////// Begin: put this in header file ////////////////////

#include <boost/histogram.hpp> // histogram, make_histogram, axis types
#include <tuple>               // std::tuple
#include <vector>              // std::vector

// use this when the axis configuration is fixed to get the highest performance
struct HolderOfStaticHistogram {
  // put axis types here
  using axes_t = std::tuple<
    boost::histogram::axis::regular<>,
    boost::histogram::axis::integer<>
  >;
  using hist_t = boost::histogram::histogram<axes_t>;
  hist_t hist_;
};

// use this when axis configuration should be flexible
struct HolderOfDynamicHistogram {
  // put all axis types here that you are going to use
  using axis_t = boost::histogram::axis::variant<
    boost::histogram::axis::regular<>,
    boost::histogram::axis::variable<>,
    boost::histogram::axis::integer<>
  >;
  using axes_t = std::vector<axis_t>;
  using hist_t = boost::histogram::histogram<axes_t>;
  hist_t hist_;
};

//////////////// End: put this in header file ////////////////////

int main() {
  using namespace boost::histogram;

  HolderOfStaticHistogram hs;
  hs.hist_ = make_histogram(axis::regular<>(5, 0, 1), axis::integer<>(0, 3));
  // now assign a different histogram
  hs.hist_ = make_histogram(axis::regular<>(3, 1, 2), axis::integer<>(4, 6));
  // hs.hist_ = make_histogram(axis::regular<>(5, 0, 1)); does not work;
  // the static histogram cannot change the number or order of axis types

  HolderOfDynamicHistogram hd;
  hd.hist_ = make_histogram(axis::regular<>(5, 0, 1), axis::integer<>(0, 3));
  // now assign a different histogram
  hd.hist_ = make_histogram(axis::regular<>(3, -1, 2));
  // and assign another
  hd.hist_ = make_histogram(axis::integer<>(0, 5), axis::integer<>(3, 5));
}
