Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for a snapshot of the master branch, built from commit 67c8db8f76.

Direct iostream formatting: vectorstream and bufferstream

Formatting directly in your character vector: vectorstream
Formatting directly in your character buffer: bufferstream

Shared memory, memory-mapped files and all Boost.Interprocess mechanisms are focused on efficiency. Shared memory is used because it is the fastest IPC mechanism available. When passing text-oriented messages through shared memory, the message needs to be formatted, and C++ offers the iostream framework for that work.

Some programmers appreciate the safety and design of iostreams for memory formatting but find the stringstream family inefficient, not when formatting, but when obtaining the formatted data as a string or when setting the string from which the stream will extract data. An example:

//Some formatting elements
std::string my_text = "...";
int number = 0;

//Data reader
std::istringstream input_processor;

//This makes a copy of the string. If not using a
//reference counted string, this is a serious overhead.
input_processor.str(my_text);

//Extract data
while(/*...*/){
   input_processor >> number;
}

//Data writer
std::ostringstream output_processor;

//Write data
while(/*...*/){
   output_processor << number;
}

//This returns a temporary string. Even with return-value
//optimization this is expensive.
my_text = output_processor.str();

The problem is even worse if the string is a shared-memory string: to extract data, we must first copy the data from shared memory to a std::string and then to a std::stringstream. To encode data in a shared-memory string, we must copy the data from a std::stringstream to a std::string and then to the shared-memory string.
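The copies made by the stringstream family can be observed with the standard library alone. The following is a minimal sketch, not part of Boost.Interprocess; the helper names stream_keeps_private_copy and str_returns_independent_copy are made up for this illustration. It relies only on the fact that str(s) and str() on an lvalue stream work on copies of the characters.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if istringstream::str(s) left the
// stream with its own private copy of the characters of s.
bool stream_keeps_private_copy()
{
   std::string my_text = "10 20 30";
   std::istringstream input_processor;
   input_processor.str(my_text);   // copies my_text into the stream

   my_text = "99 99 99";           // changing the source afterwards...

   int first = 0;
   input_processor >> first;       // ...does not change what the stream reads
   return first == 10;
}

// Hypothetical helper: returns true if ostringstream::str() handed back
// a string that is independent of the stream's internal buffer.
bool str_returns_independent_copy()
{
   std::ostringstream output_processor;
   output_processor << 1 << ' ' << 2;

   std::string snapshot = output_processor.str();  // copy of the buffer
   output_processor << ' ' << 3;                   // keep writing

   return snapshot == "1 2" && output_processor.str() == "1 2 3";
}
```

Both helpers return true precisely because the stream and the string never share storage, which is the overhead discussed above.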

Because of this overhead, Boost.Interprocess offers a way to format memory-strings (in shared memory, memory-mapped files or any other memory segment) that avoids all unneeded string copies and memory allocations/deallocations, while still using all iostream facilities. Boost.Interprocess vectorstream and bufferstream implement vector-based and fixed-size-buffer-based storage support for iostreams, and all the formatting/locale hard work is done by the standard std::basic_streambuf<> and std::basic_iostream<> classes.
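To make the division of labour concrete, here is a simplified stand-in written against the standard library only. It is not the Boost.Interprocess implementation; the names fixed_outbuf and format_pair are assumptions made for this sketch. It shows the general technique: a class derived from std::streambuf points the put area at caller-owned memory, and a plain std::ostream does all the formatting.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Simplified stand-in, NOT the Boost.Interprocess implementation:
// a write-only stream buffer whose put area is caller-owned memory.
class fixed_outbuf : public std::streambuf
{
   public:
   fixed_outbuf(char *buffer, std::size_t length)
   {
      // Point the put area at the caller's buffer; the base class
      // writes characters there without allocating.
      this->setp(buffer, buffer + length);
   }

   // Number of characters written so far.
   std::size_t written() const
   {  return static_cast<std::size_t>(this->pptr() - this->pbase());  }

   // overflow() is not overridden: the default implementation returns
   // eof, so the owning ostream reports a failure when the buffer is full.
};

// Hypothetical helper: formats two integers into `buffer` and returns the
// number of characters produced, or 0 if the text did not fit.
std::size_t format_pair(char *buffer, std::size_t length, int a, int b)
{
   fixed_outbuf buf(buffer, length);
   std::ostream os(&buf);      // the standard stream does the formatting
   os << a << ',' << b;
   return os ? buf.written() : 0;
}
```

Calling format_pair with a 16-character array and the values 12 and 345 leaves the text 12,345 in the array without any allocation; with a 2-character array the stream enters a failed state and the helper returns 0.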

The vectorstream class family (basic_vectorbuf, basic_ivectorstream, basic_ovectorstream and basic_vectorstream) is an efficient way to obtain formatted reading/writing directly in a character vector. This way, if a shared-memory vector is used, data is extracted from or written to the shared-memory vector directly, without additional copies or allocations. We can see the declaration of basic_vectorstream here:

//!A basic_iostream class that holds a character vector specified by CharVector
//!template parameter as its formatting buffer. The vector must have
//!contiguous storage, like std::vector, boost::interprocess::vector or
//!boost::interprocess::basic_string
template <class CharVector, class CharTraits =
         std::char_traits<typename CharVector::value_type> >
class basic_vectorstream
: public std::basic_iostream<typename CharVector::value_type, CharTraits>

{
   public:
   typedef CharVector                                                   vector_type;
   typedef typename std::basic_ios
      <typename CharVector::value_type, CharTraits>::char_type          char_type;
   typedef typename std::basic_ios<char_type, CharTraits>::int_type     int_type;
   typedef typename std::basic_ios<char_type, CharTraits>::pos_type     pos_type;
   typedef typename std::basic_ios<char_type, CharTraits>::off_type     off_type;
   typedef typename std::basic_ios<char_type, CharTraits>::traits_type  traits_type;

   //!Constructor. Throws if vector_type default constructor throws.
   basic_vectorstream(std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);

   //!Constructor. Throws if vector_type(const Parameter &param) throws.
   template<class Parameter>
   basic_vectorstream(const Parameter &param, std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);

   ~basic_vectorstream(){}

   //!Returns the address of the stored stream buffer.
   basic_vectorbuf<CharVector, CharTraits>* rdbuf() const;

   //!Swaps the underlying vector with the passed vector.
   //!This function resets the position in the stream.
   //!Does not throw.
   void swap_vector(vector_type &vect);

   //!Returns a const reference to the internal vector.
   //!Does not throw.
   const vector_type &vector() const;

   //!Preallocates memory from the internal vector.
   //!Resets the stream to the first position.
   //!Throws if the internal vector's memory allocation throws.
   void reserve(typename vector_type::size_type size);
};

The vector type is a template parameter, so we can use any type of vector: std::vector, boost::interprocess::vector... However, the storage must be contiguous, so we can't use a deque. We can even use boost::interprocess::basic_string, since it has a vector interface and contiguous storage. We can't use std::string because, although some std::string implementations are vector-based, others can have optimizations and reference-counted implementations.

The user can obtain a const reference to the internal vector using the const vector_type &vector() const function and can also swap the internal vector with an external one by calling void swap_vector(vector_type &vect). The swap function resets the stream position. These functions offer an efficient way to obtain the formatted data, avoiding all allocations and data copies.

Here is an example of how to use vectorstream:

#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/streams/vectorstream.hpp>
#include <algorithm>
#include <cassert>
#include <iterator>

using namespace boost::interprocess;

typedef allocator<int, managed_shared_memory::segment_manager>
   IntAllocator;
typedef allocator<char, managed_shared_memory::segment_manager>
   CharAllocator;
typedef vector<int, IntAllocator>   MyVector;
typedef basic_string
   <char, std::char_traits<char>, CharAllocator>   MyString;
typedef basic_vectorstream<MyString>               MyVectorStream;

int main ()
{
   //Remove shared memory on construction and destruction
   struct shm_remove
   {
      shm_remove() { shared_memory_object::remove("MySharedMemory"); }
      ~shm_remove(){ shared_memory_object::remove("MySharedMemory"); }
   } remover;

   managed_shared_memory segment(
      create_only,
      "MySharedMemory", //segment name
      65536);           //segment size in bytes

   //Construct shared memory vector
   MyVector *myvector =
      segment.construct<MyVector>("MyVector")
      (IntAllocator(segment.get_segment_manager()));

   //Fill vector
   myvector->reserve(100);
   for(int i = 0; i < 100; ++i){
      myvector->push_back(i);
   }

   //Create the vectorstream. To create the internal shared memory
   //basic_string we need to pass the shared memory allocator as
   //a constructor argument
   MyVectorStream myvectorstream(CharAllocator(segment.get_segment_manager()));

   //Reserve the internal string
   myvectorstream.reserve(100*5);

   //Write all vector elements as text in the internal string
   //Data will be directly written in shared memory, because
   //internal string's allocator is a shared memory allocator
   for(std::size_t i = 0, max = myvector->size(); i < max; ++i){
      myvectorstream << (*myvector)[i] << std::endl;
   }

   //Auxiliary vector to compare original data
   MyVector *myvector2 =
      segment.construct<MyVector>("MyVector2")
      (IntAllocator(segment.get_segment_manager()));

   //Avoid reallocations
   myvector2->reserve(100);

   //Extract all values from the internal
   //string directly to a shared memory vector.
   std::istream_iterator<int> it(myvectorstream), itend;
   std::copy(it, itend, std::back_inserter(*myvector2));

   //Compare vectors
   assert(std::equal(myvector->begin(), myvector->end(), myvector2->begin()));

   //Create a copy of the internal string
   MyString stringcopy (myvectorstream.vector());

   //Now we create a new empty shared memory string...
   MyString *mystring =
      segment.construct<MyString>("MyString")
      (CharAllocator(segment.get_segment_manager()));

   //...and we swap vectorstream's internal string
   //with the new one: after this statement mystring
   //will be the owner of the formatted data.
   //No reallocations, no data copies
   myvectorstream.swap_vector(*mystring);

   //Let's compare both strings
   assert(stringcopy == *mystring);

   //Done, destroy and delete vectors and string from the segment
   segment.destroy_ptr(myvector2);
   segment.destroy_ptr(myvector);
   segment.destroy_ptr(mystring);
   return 0;
}

As seen, vectorstream offers an easy and secure way to do efficient iostream formatting, but we often have to read or write formatted data from/to a fixed-size character buffer (a static buffer, a C string, or any other). Because of the overhead of stringstream, many developers (especially in embedded systems) choose the sprintf family instead. The bufferstream classes offer the iostream interface with direct formatting in a fixed-size memory buffer and with protection against buffer overflows. This is the interface:

//!A basic_iostream class that uses a fixed size character buffer
//!as its formatting buffer.
template <class CharT, class CharTraits = std::char_traits<CharT> >
class basic_bufferstream
   : public std::basic_iostream<CharT, CharTraits>

{
   public:                         // Typedefs
   typedef typename std::basic_ios
      <CharT, CharTraits>::char_type          char_type;
   typedef typename std::basic_ios<char_type, CharTraits>::int_type     int_type;
   typedef typename std::basic_ios<char_type, CharTraits>::pos_type     pos_type;
   typedef typename std::basic_ios<char_type, CharTraits>::off_type     off_type;
   typedef typename std::basic_ios<char_type, CharTraits>::traits_type  traits_type;

   //!Constructor. Does not throw.
   basic_bufferstream(std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);

   //!Constructor. Assigns formatting buffer. Does not throw.
   basic_bufferstream(CharT *buffer, std::size_t length,
                     std::ios_base::openmode mode
                        = std::ios_base::in | std::ios_base::out);

   //!Returns the address of the stored stream buffer.
   basic_bufferbuf<CharT, CharTraits>* rdbuf() const;

   //!Returns the pointer and size of the internal buffer.
   //!Does not throw.
   std::pair<CharT *, std::size_t> buffer() const;

   //!Sets the underlying buffer to a new value. Resets
   //!stream position. Does not throw.
   void buffer(CharT *buffer, std::size_t length);
};

//Some typedefs to simplify usage
typedef basic_bufferstream<char>     bufferstream;
typedef basic_bufferstream<wchar_t>  wbufferstream;
// ...

While reading from a fixed-size buffer, bufferstream activates the eofbit flag if we try to read beyond the end of the buffer. While writing to a fixed-size buffer, bufferstream activates the badbit flag if a buffer overflow is about to happen and disallows the write. This way, fixed-size buffer formatting through bufferstream is secure and efficient, and offers a good alternative to the sprintf/sscanf functions. Let's see an example:

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/streams/bufferstream.hpp>
#include <algorithm>
#include <cassert>
#include <vector>
#include <iterator>
#include <cstddef>

using namespace boost::interprocess;

int main ()
{
   //Remove shared memory on construction and destruction
   struct shm_remove
   {
      shm_remove() { shared_memory_object::remove("MySharedMemory"); }
      ~shm_remove(){ shared_memory_object::remove("MySharedMemory"); }
   } remover;

   //Create shared memory
   managed_shared_memory segment(create_only,
                                 "MySharedMemory",  //segment name
                                 65536);

   //Fill data
   std::vector<int> data;
   data.reserve(100);
   for(std::size_t i = 0; i < 100; ++i){
      data.push_back((int)i);
   }
   const std::size_t BufferSize = 100*5;

   //Allocate a buffer in shared memory to write data
   char *my_cstring =
      segment.construct<char>("MyCString")[BufferSize]('\0');
   bufferstream mybufstream(my_cstring, BufferSize);

   //Now write data to the buffer
   for(std::size_t i = 0; i < 100; ++i){
      mybufstream << data[i] << std::endl;
   }

   //Check there was no overflow attempt
   assert(mybufstream.good());

   //Extract all values from the shared memory string
   //directly to a vector.
   std::vector<int> data2;
   std::istream_iterator<int> it(mybufstream), itend;
   std::copy(it, itend, std::back_inserter(data2));

   //This extraction should have ended with a fail error, since
   //the numbers formatted in the buffer end before the end
   //of the buffer. (Otherwise it would trigger eofbit)
   assert(mybufstream.fail());

   //Compare data
   assert(std::equal(data.begin(), data.end(), data2.begin()));

   //Clear errors and rewind
   mybufstream.clear();
   mybufstream.seekp(0, std::ios::beg);

   //Now write again the data trying to do a buffer overflow
   for(std::size_t i = 0, m = data.size()*5; i < m; ++i){
      mybufstream << data[i%5u] << std::endl;
   }

   //Now make sure badbit is active
   //which means overflow attempt.
   assert(!mybufstream.good());
   assert(mybufstream.bad());
   segment.destroy_ptr(my_cstring);
   return 0;
}

As seen, bufferstream offers an efficient way to format data without any allocation or extra copies. This is very helpful in embedded systems, or when formatting inside time-critical loops, where the extra copies made by stringstream would be too expensive. Unlike sprintf/sscanf, it has protection against buffer overflows. According to the Technical Report on C++ Performance, it's possible to design efficient iostreams for embedded platforms, so this bufferstream class comes in handy to format data into stack, static or shared memory buffers.

