Extended functionality

Default initialization for vector-like containers

STL and most other containers value initialize new elements in common operations like vector::resize(size_type n) or explicit vector::vector(size_type n).

In some performance-sensitive environments, where vectors replace variable-size buffers for file or network operations, value initialization has a non-negligible cost, because the new elements are going to be overwritten by an external source shortly after they are added to the container.

Boost.Container offers two new members for vector, static_vector and stable_vector: the constructor explicit container::container(size_type n, default_init_t) and the member function container::resize(size_type n, default_init_t), where new elements are constructed using default initialization.

Ordered range insertion for associative containers (ordered_unique_range, ordered_range)

When filling associative containers, big performance gains can be achieved if the user guarantees that the input range to be inserted is ordered according to the predicate. This can happen when inserting values from a set into a multiset or between different associative container families ([multi]set/map vs. flat_[multi]set/map).

Boost.Container has some overloads for constructors and insertions taking an ordered_unique_range_t or an ordered_range_t tag parameter as the first argument. When an ordered_unique_range_t overload is used, the user notifies the container that the input range is ordered according to the container predicate and has no duplicates. When an ordered_range_t overload is used, the user notifies the container that the input range is ordered according to the container predicate but might have duplicates. With this information, the container can avoid multiple predicate calls and improve insertion times.

Configurable tree-based associative ordered containers

set, multiset, map and multimap associative containers are implemented as binary search trees which offer the needed complexity and stability guarantees required by the C++ standard for associative containers.

Boost.Container offers the possibility to configure at compile time some parameters of the binary search tree implementation. This configuration is passed as the last template parameter and defined using the utility class tree_assoc_options.

The following parameters can be configured:

The underlying tree implementation type (tree_type). By default these containers use a red-black tree, but the user can select other tree types:
- Red-black tree
- AVL tree
- Scapegoat tree. In this case insertions and deletions are amortized O(log n) instead of worst-case O(log n).
- Splay tree. In this case searches, insertions and deletions are amortized O(log n) instead of worst-case O(log n).
Whether the size-saving mechanisms are used to implement the tree nodes (optimize_size). By default this option is activated, and it is only meaningful for red-black and AVL trees (in other cases it is ignored). This option tries to put rebalancing metadata inside the "parent" pointer of the node if the pointer type has enough alignment. Without it, due to alignment issues, the metadata usually occupies the size of a pointer, yielding an overhead of four pointers per node, whereas activating this option usually leads to an overhead of three pointers. Although some mask operations must be performed to extract data from this special "parent" pointer, on several systems this option also improves performance thanks to the better cache usage produced by the node size reduction.
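
To make the idea behind optimize_size more concrete, here is a minimal sketch of storing a red-black colour in the lowest bit of an aligned parent pointer. This is not Boost.Container's actual node layout: compact_node, plain_node and color_t are hypothetical names, and the sketch assumes that std::uintptr_t has the size of a pointer and that nodes are at least 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

enum color_t { red = 0, black = 1 };

// Hypothetical compact node: the colour lives in the lowest bit of the
// parent address, which is always zero for a suitably aligned node.
struct compact_node
{
   std::uintptr_t parent_and_color;  // parent address | colour bit
   compact_node*  left;
   compact_node*  right;

   compact_node() : parent_and_color(0), left(0), right(0) {}

   // Mask the colour bit out to recover the real pointer.
   compact_node* parent() const
   {  return reinterpret_cast<compact_node*>(parent_and_color & ~std::uintptr_t(1));  }

   color_t color() const
   {  return static_cast<color_t>(parent_and_color & 1u);  }

   void set_parent(compact_node* p)
   {
      const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
      assert((bits & 1u) == 0);   // needs at least 2-byte alignment
      parent_and_color = bits | (parent_and_color & 1u);
   }

   void set_color(color_t c)
   {  parent_and_color = (parent_and_color & ~std::uintptr_t(1)) | std::uintptr_t(c);  }
};

// Plain layout for comparison: the colour is a separate member and is
// typically padded up to the size of a pointer.
struct plain_node
{
   plain_node* parent;
   plain_node* left;
   plain_node* right;
   color_t     color;
};
```

Every access to the parent or the colour pays for one mask operation, which is the trade-off described above.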

The following example shows how tree_assoc_options can be used to customize these containers:

#include <boost/container/set.hpp>
#include <cassert>
#include <functional> //std::less
#include <memory>     //std::allocator

int main ()
{
   using namespace boost::container;

   //First define several options
   //

   //This option specifies an AVL tree based associative container
   typedef tree_assoc_options< tree_type<avl_tree> >::type AVLTree;

   //This option specifies an AVL tree based associative container
   //disabling node size optimization.
   typedef tree_assoc_options< tree_type<avl_tree>
                             , optimize_size<false> >::type AVLTreeNoSizeOpt;

   //This option specifies a splay tree based associative container
   typedef tree_assoc_options< tree_type<splay_tree> >::type SplayTree;

   //Now define new tree-based associative containers
   //

   //AVLTree based set container
   typedef set<int, std::less<int>, std::allocator<int>, AVLTree> AvlSet;

   //AVLTree based set container without size optimization
   typedef set<int, std::less<int>, std::allocator<int>, AVLTreeNoSizeOpt> AvlSetNoSizeOpt;

   //Splay tree based multiset container
   typedef multiset<int, std::less<int>, std::allocator<int>, SplayTree> SplayMultiset;

   //Use them
   //
   AvlSet avl_set;
   avl_set.insert(0);
   assert(avl_set.find(0) != avl_set.end());

   AvlSetNoSizeOpt avl_set_no_szopt;
   avl_set_no_szopt.insert(1);
   avl_set_no_szopt.insert(1);
   assert(avl_set_no_szopt.count(1) == 1);

   SplayMultiset splay_mset;
   splay_mset.insert(2);
   splay_mset.insert(2);
   assert(splay_mset.count(2) == 2);
   return 0;
}

Constant-time range splice for (s)list

In the first C++ standard list::size() was not required to be constant-time, and that caused some controversy in the C++ community. Quoting Howard Hinnant's On List Size paper:

There is a considerable debate on whether std::list<T>::size() should be O(1) or O(N). The usual argument notes that it is a tradeoff with:

splice(iterator position, list& x, iterator first, iterator last);

If size() is O(1) and this != &x, then this method must perform a linear operation so that it can adjust the size member in each list.

C++11 definitively required size() to be O(1), so range splice became O(N). However, Howard Hinnant's paper proposed a new splice overload so that even O(1) list::size() implementations could achieve O(1) range splice when the range size is known to the caller:

void splice(iterator position, list& x, iterator first, iterator last, size_type n);

Effects: Inserts elements in the range [first, last) before position and removes the elements from x.

Requires: [first, last) is a valid range in x. The result is undefined if position is an iterator in the range [first, last). Invalidates only the iterators and references to the spliced elements. n == distance(first, last).

Throws: Nothing.

Complexity: Constant time.

This new splice signature allows the client to pass the distance of the input range in. This information is often available at the call site. If it is passed in, then the operation is constant time, even with an O(1) size.

Boost.Container implements this overload for list and a modified version of it for slist (as slist::size() is also O(1)).

Extended allocators

Many C++ programmers have wondered where good old realloc fits in C++. And that's a good question. Could we improve vector performance using memory expansion mechanisms to avoid too many copies? But vector is not the only container that could benefit from an improved allocator interface: we could take advantage of the insertion of multiple elements in list using a burst allocation mechanism that could amortize costs (mutex locks, free memory searches...) that can't be amortized when using single node allocation strategies.

These improvements require extending the STL allocator interface and making use of a new general purpose allocator, since new and delete don't offer expansion and burst capabilities.

Boost.Container containers support an extended allocator interface based on an evolution of proposals N1953: Upgrading the Interface of Allocators using API Versioning, N2045: Improving STL allocators and the article Applying classic memory allocation strategies to C++ containers. The extended allocator interface is implemented by the allocator, adaptive_pool and node_allocator classes.

Extended allocators use a modified Doug Lea Malloc (DLMalloc) low-level allocator that offers a C API to implement memory expansion and burst allocations. DLMalloc is known to be very size and speed efficient, and it is the basis of many malloc implementations, including multithreaded allocators built on top of it (see ptmalloc2, ptmalloc3 or nedmalloc). This low-level allocator is implemented as a separately compiled library and the following extended allocators depend on the library:
- allocator: This extended allocator offers expansion, shrink-in-place and burst allocation capabilities implemented as a thin wrapper around the modified DLMalloc. It can be used with all containers and it should be the default choice when the programmer wants to use extended allocator capabilities.
- node_allocator: A Simple Segregated Storage allocator, similar to Boost.Pool, that takes advantage of the modified DLMalloc burst interface. It does not return memory to the DLMalloc allocator (and thus to the system) unless explicitly requested. It has a very small memory overhead, so it is suitable for node containers (list, slist, set...) that allocate very small value_types, and it offers improved allocation times for single node allocations with respect to allocator.
- adaptive_pool: A low-overhead node allocator that can return memory to the system. The overhead can be very low (< 5% for small nodes) and it is nearly as fast as node_allocator. It is also suitable for node containers.

To use them, simply specify the new allocator in the corresponding template argument of your favourite container:

#include <boost/container/vector.hpp>
#include <boost/container/flat_set.hpp>
#include <boost/container/list.hpp>
#include <boost/container/set.hpp>
#include <functional> //std::less

//"allocator" is a general purpose allocator that can reallocate
//memory, something useful for vector and flat associative containers
#include <boost/container/allocator.hpp>

//"adaptive_pool" is a node allocator, specially suited for
//node-based containers
#include <boost/container/adaptive_pool.hpp>

int main ()
{
   using namespace boost::container;

   //A vector that can reallocate memory to implement faster insertions
   vector<int, allocator<int> > extended_alloc_vector;

   //A flat set that can reallocate memory to implement faster insertions
   flat_set<int, std::less<int>, allocator<int> > extended_alloc_flat_set;

   //A list that can manage nodes to implement faster
   //range insertions and deletions
   list<int, adaptive_pool<int> > extended_alloc_list;

   //A set that can recycle nodes to implement faster
   //range insertions and deletions
   set<int, std::less<int>, adaptive_pool<int> > extended_alloc_set;

   //Now use them as always
   extended_alloc_vector.push_back(0);
   extended_alloc_flat_set.insert(0);
   extended_alloc_list.push_back(0);
   extended_alloc_set.insert(0);

   //...
   return 0;
}

Polymorphic Memory Resources

The document C++ Extensions for Library Fundamentals (Final draft: N4480) includes classes that provide allocator type erasure and runtime polymorphism. As Pablo Halpern, the author of the proposal, explains in the paper (N3916 Polymorphic Memory Resources (r2)):

“A significant impediment to effective memory management in C++ has been the inability to use allocators in non-generic contexts. In large software systems, most of the application program consists of non-generic procedural or object-oriented code that is compiled once and linked many times.”

“Allocators in C++, however, have historically relied solely on compile-time polymorphism, and therefore have not been suitable for use in vocabulary types, which are passed through interfaces between separately-compiled modules, because the allocator type necessarily affects the type of the object that uses it. This proposal builds upon the improvements made to allocators in C++11 and describes a set of facilities for runtime polymorphic memory resources that interoperate with the existing compile-time polymorphic allocators.”

Boost.Container implements nearly all classes of the proposal under the namespace boost::container::pmr. There are two groups:

Header only utilities (these don't require the separately compiled library):
- memory_resource.
- resource_adaptor.
Utilities that require the separately compiled library:
- polymorphic_allocator.
- monotonic_buffer_resource.
- unsynchronized_pool_resource.
- synchronized_pool_resource.
- Global resource functions: get_default_resource, set_default_resource, new_delete_resource and null_memory_resource.
- Aliases for boost containers using the polymorphic allocator (like pmr::vector, etc.)

Boost.Container's polymorphic resource library is usable from C++03 containers, and offers some alternative utilities if the required C++11 features of the Library Fundamentals specification are not available.

Let's review the usage example given in N3916 and see how it can be implemented using Boost.Container: Suppose we are processing a series of shopping lists, where a shopping list is a container of strings, and storing them in a collection (a list) of shopping lists. Each shopping list being processed uses a bounded amount of memory that is needed for a short period of time, while the collection of shopping lists uses an unbounded amount of memory and will exist for a longer period of time. For efficiency, we can use a more time-efficient memory allocator based on a finite buffer for the temporary shopping lists.

Let's see how ShoppingList can be defined to support a polymorphic memory resource that can allocate memory from different underlying mechanisms. The most important details are:

It should declare that it supports an allocator by defining an allocator_type typedef. This allocator_type will be of type memory_resource *, which is a base class for polymorphic resources.
It must define constructors that take the allocator as an argument. This can be implemented in two ways:
- ShoppingList has constructors taking memory_resource* as the last argument.
- ShoppingList has constructors taking allocator_arg_t as the first argument and memory_resource* as the second argument.

Note: In C++03 compilers, the programmer must specialize constructible_with_allocator_suffix or constructible_with_allocator_prefix as true, because in C++03 there is no way to automatically detect the chosen option at compile time. If no specialization is provided, Boost.Container assumes the suffix option.

//ShoppingList.hpp
#include <boost/container/pmr/vector.hpp>
#include <boost/container/pmr/string.hpp>

class ShoppingList
{
   // A vector of strings using polymorphic allocators. Every element
   // of the vector will use the same allocator as the vector itself.
   boost::container::pmr::vector_of
      <boost::container::pmr::string>::type m_strvec;
   //Alternatively in compilers that support template aliases:
   //    boost::container::pmr::vector<boost::container::pmr::string> m_strvec;
   public:

   // This makes uses_allocator<ShoppingList, memory_resource*>::value true
   typedef boost::container::pmr::memory_resource* allocator_type;

   // If the allocator is not specified, "m_strvec" uses pmr::get_default_resource().
   explicit ShoppingList(allocator_type alloc = 0)
      : m_strvec(alloc) {}

   // Copy constructor. As allocator is not specified,
   // "m_strvec" uses pmr::get_default_resource().
   ShoppingList(const ShoppingList& other)
      : m_strvec(other.m_strvec) {}

   // Copy construct using the given memory_resource.
   ShoppingList(const ShoppingList& other, allocator_type a)
      : m_strvec(other.m_strvec, a) {}

   allocator_type get_allocator() const
   { return m_strvec.get_allocator().resource(); }

   void add_item(const char *item)
   { m_strvec.emplace_back(item); }

   //...
};

However, this time-efficient allocator is not appropriate for the longer lived collection of shopping lists. This example shows how those temporary shopping lists, using a time-efficient allocator, can be used to populate the long lived collection of shopping lists, using a general purpose allocator, something that would be annoyingly difficult without the polymorphic allocators.

In Boost.Container for the time-efficient allocation we can use monotonic_buffer_resource, providing an external buffer that will be used until it's exhausted. In the default configuration, when the buffer is exhausted, the default memory resource will be used instead.

#include "ShoppingList.hpp"
#include <cassert>
#include <boost/container/pmr/list.hpp>
#include <boost/container/pmr/monotonic_buffer_resource.hpp>

void processShoppingList(const ShoppingList&)
{  /**/   }

int main()
{
   using namespace boost::container;
   //All memory needed by folder and its contained objects will
   //be allocated from the default memory resource (usually new/delete) 
   pmr::list_of<ShoppingList>::type folder; // Default allocator resource
   //Alternatively in compilers that support template aliases:
   //    boost::container::pmr::list<ShoppingList> folder;
   {
      char buffer[1024];
      pmr::monotonic_buffer_resource buf_rsrc(&buffer, 1024);

      //All memory needed by temporaryShoppingList will be allocated
      //from the local buffer (speeds up "processShoppingList")
      ShoppingList temporaryShoppingList(&buf_rsrc);
      assert(&buf_rsrc == temporaryShoppingList.get_allocator());

      //the vector storage and any memory the strings "salt" and "pepper"
      //allocate will come from the stack buffer thanks to
      //"monotonic_buffer_resource".
      temporaryShoppingList.add_item("salt");
      temporaryShoppingList.add_item("pepper");
      //...

      //All modifications and additions to "temporaryShoppingList"
      //will use memory from "buffer" until it's exhausted.
      processShoppingList(temporaryShoppingList);

      //Processing done, now insert it in "folder",
      //which uses the default memory resource
      folder.push_back(temporaryShoppingList);
      assert(pmr::get_default_resource() == folder.back().get_allocator());
      //temporaryShoppingList, buf_rsrc, and buffer go out of scope
   }
   return 0;
}

Notice that the shopping lists within folder use the default allocator resource whereas the shopping list temporaryShoppingList uses the short-lived but very fast buf_rsrc. Despite using different allocators, you can insert temporaryShoppingList into folder because they have the same ShoppingList type. Also, while ShoppingList uses memory_resource directly, pmr::list, pmr::vector and pmr::string all use polymorphic_allocator.

The resource passed to the ShoppingList constructor is propagated to the vector and each string within that ShoppingList. Similarly, the resource used to construct folder is propagated to the constructors of the ShoppingLists that are inserted into the list (and to the strings within those ShoppingLists). The polymorphic_allocator template is designed to be almost interchangeable with a pointer to memory_resource, thus producing a bridge between the template-policy style of allocator and the polymorphic-base-class style of allocator.

This example shows how easy it is to use Boost.Container to write type-erased, allocator-capable classes even in C++03 compilers.