C++ Boost

Serialization

Archive Class Usage


Archive Classes
Library Interface
Saving
Loading
Details by Type
Primitive Types
Class Types
Base Classes
Pointers
References
Character Sets

Archive Classes

An archive is defined by two complementary classes. One is for saving data while the other is for loading it. The system includes a number of archive implementations "ready to go" for the most common requirements. These can be used "as is" or as a basis for developing one's own particular type of archive. To invoke serialization using one of these archives, the header file for each archive class used (for example, boost/archive/text_oarchive.hpp for text_oarchive) must be included in the code module containing the serialization code. The available archive classes and their constructors are:

// a portable text archive
boost::archive::text_oarchive(ostream &s) // saving
boost::archive::text_iarchive(istream &s) // loading

// a portable text archive using a wide character stream
boost::archive::text_woarchive(wostream &s) // saving
boost::archive::text_wiarchive(wistream &s) // loading

// a non-portable native binary archive
boost::archive::binary_oarchive(ostream &s) // saving
boost::archive::binary_iarchive(istream &s) // loading

// a portable XML archive
boost::archive::xml_oarchive(ostream &s) // saving
boost::archive::xml_iarchive(istream &s) // loading

// a portable XML archive which uses wide characters - use for utf-8 output
boost::archive::xml_woarchive(wostream &s) // saving
boost::archive::xml_wiarchive(wistream &s) // loading

Archive Library Interface

An archive contains a sequence of bytes created from an arbitrary nested set of C++ data structures. Archives are implemented as a hierarchy of classes. However, the interface to all archives included with the library can best be represented by the following public interface.

Saving


namespace boost {
namespace archive {

enum archive_flags {
    no_header = 1,          // suppress archive header info
    no_codecvt = 2,         // suppress alteration of codecvt facet
    no_xml_tag_checking = 4 // suppress checking of xml tags - ignored on saving
};

template<class OStream>
class oarchive : ...
{
    ...
public:
    // called to save objects
    template<class T>
    oarchive & operator<<(const T & t);

    template<class T>
    oarchive & operator&(T & t)
    {
        return *this << t;
    }

    void save_binary(const void *address, std::size_t count);

    template<class T>
    register_type(T * t = NULL);

    unsigned int library_version() const;

    struct is_saving {
        typedef mpl::bool_<true> type;
        BOOST_STATIC_CONSTANT(bool, value=true);
    };

    struct is_loading {
        typedef mpl::bool_<false> type;
        BOOST_STATIC_CONSTANT(bool, value=false);
    };

    oarchive(OStream & os, unsigned int flags = 0);

    ~oarchive();
};

template<class T> oarchive & operator<<(const T & t);
template<class T> oarchive & operator&(T & t);

Appends an object of type T to the archive. The object may be

void save_binary(const void *address, std::size_t count);

Appends to the archive count bytes found at address.

template<class T> register_type(T * t = NULL);

Appends a sequential integer to the archive. This integer becomes the "key" used to look up the class type when the archive is later loaded. This process is referred to as "class registration". It is only necessary to invoke this function for objects of derived classes which are serialized through a base class pointer. This is explained in detail in Special Considerations - Derived Pointers.

unsigned int library_version() const;

Returns the version number of the serialization library that created the archive. This number will be incremented each time the library is altered in such a way that serialization could be altered for some type. For example, suppose the type used for a count of collection members is changed. The code that loads collections might be conditioned on the library version to make sure that archives created by previous versions of the library can still be read.

is_saving::type = mpl::bool_<true>;
is_saving::value = true;
is_loading::type = mpl::bool_<false>;
is_loading::value = false;

These integral constants permit archive attributes to be queried at compile time or execution time. They can be used to generate code with Boost MPL. For an example showing how these can be used, see the implementation of split_free.hpp.
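As a rough illustration of how such constants can be queried, the sketch below uses two toy archive classes that copy only the shape of the interface above. The names toy_oarchive, toy_iarchive, temperature and direction are hypothetical, std::true_type and std::false_type stand in for mpl::bool_, and none of this is the library's own implementation.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical toy archives: they mimic only the interface shape shown above.
class toy_oarchive {
    std::ostream & os_;
public:
    struct is_saving  : std::true_type  {};
    struct is_loading : std::false_type {};

    explicit toy_oarchive(std::ostream & os) : os_(os) {}

    // Each value is written as a space followed by its text form.
    template<class T>
    toy_oarchive & operator<<(const T & t) { os_ << ' ' << t; return *this; }

    template<class T>
    toy_oarchive & operator&(T & t) { return *this << t; }
};

class toy_iarchive {
    std::istream & is_;
public:
    struct is_saving  : std::false_type {};
    struct is_loading : std::true_type  {};

    explicit toy_iarchive(std::istream & is) : is_(is) {}

    template<class T>
    toy_iarchive & operator>>(T & t) { is_ >> t; return *this; }

    template<class T>
    toy_iarchive & operator&(T & t) { return *this >> t; }
};

// One serialize function serves both directions; the constant selects
// work that applies to loading only.
struct temperature {
    double celsius;
    bool   loaded;   // set only when the value came from an archive

    template<class Archive>
    void serialize(Archive & ar) {
        ar & celsius;
        if (Archive::is_loading::value)
            loaded = true;
    }
};

// Returns a label chosen from the archive's constant.
template<class Archive>
std::string direction() {
    return Archive::is_saving::value ? "saving" : "loading";
}
```

Calling serialize with a toy_oarchive writes the value and leaves loaded untouched, while calling it with a toy_iarchive reads the value back and sets loaded.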

oarchive(OStream & os, unsigned int flags = 0);

Constructs an archive given an open stream as an argument and optional flags. For most applications there will be no need to use flags. Flags are taken from the archive_flags enumeration shown above and can be combined with the | operator.

By default, archives prepend output with initial data which helps identify them as archives produced by this system. This permits more graceful handling of the case where an attempt is made to load an archive from an invalid file format. In addition to this, each type of archive might have its own information. For example, native binary archives include information about sizes of native types and endianness to gracefully handle the case where it has been erroneously assumed that such an archive is portable across platforms. In cases where this extra overhead might be considered objectionable, it can be suppressed with the no_header flag.

In some cases, an archive may alter (and later restore) the codecvt facet of the stream locale. To suppress this action, include the no_codecvt flag.

XML archives contain nested tags signifying the start and end of data fields. These tags are normally checked for agreement with the object name when data is loaded. If a mismatch occurs, an exception is thrown. It is possible that this is not the desired behavior. To suppress this checking of XML tags, use the no_xml_tag_checking flag.
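The way flag values combine can be sketched without the library. The enumerator values below are copied from the archive_flags listing above; describe_flags is a hypothetical helper written only for this illustration.

```cpp
#include <cassert>
#include <string>

// Values copied from the archive_flags enum shown in this section.
enum archive_flags {
    no_header = 1,
    no_codecvt = 2,
    no_xml_tag_checking = 4
};

// Hypothetical helper: lists the flags that are set in a combined value.
std::string describe_flags(unsigned int flags) {
    std::string out;
    if (flags & no_header)           out += "no_header ";
    if (flags & no_codecvt)          out += "no_codecvt ";
    if (flags & no_xml_tag_checking) out += "no_xml_tag_checking ";
    return out.empty() ? "default" : out;
}
```

Because each flag occupies its own bit, a value such as no_header | no_codecvt carries both settings at once. A combined value of this kind is what would be passed as the second constructor argument of an archive.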

~oarchive();

Destructor for an archive. This should be called before the stream is closed. It restores any altered stream facets to their state before the archive was opened.

Loading


template<class IStream>
class iarchive : ...
{
    ...
public:
    // called to load objects
    template<class T>
    iarchive & operator>>(T & t);

    template<class T>
    iarchive & operator&(T & t)
    {
        return *this >> t;
    }

    void delete_created_pointers();

    void load_binary(void *address, std::size_t count);

    template<class T>
    register_type(T * t = NULL);

    unsigned int library_version() const;

    struct is_saving {
        typedef mpl::bool_<false> type;
        BOOST_STATIC_CONSTANT(bool, value=false);
    };

    struct is_loading {
        typedef mpl::bool_<true> type;
        BOOST_STATIC_CONSTANT(bool, value=true);
    };

    iarchive(IStream & is, unsigned int flags = 0);

    ~iarchive();
};

} //namespace archive
} //namespace boost

template<class T> iarchive & operator>>(T & t);
template<class T> iarchive & operator&(T & t);

Retrieves an object of type T from the archive. The object may be

void load_binary(void *address, std::size_t count);

Retrieves from the archive count bytes and stores them in memory starting at address.
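The byte-for-byte behavior of save_binary and load_binary can be pictured with two free functions over plain streams. This is a toy model under the assumption that the archive simply forwards the bytes to its stream; toy_save_binary and toy_load_binary are hypothetical names and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for save_binary: appends count bytes found at address.
void toy_save_binary(std::ostream & os, const void * address, std::size_t count) {
    os.write(static_cast<const char *>(address),
             static_cast<std::streamsize>(count));
}

// Hypothetical stand-in for load_binary: reads count bytes into memory
// starting at address.
void toy_load_binary(std::istream & is, void * address, std::size_t count) {
    is.read(static_cast<char *>(address),
            static_cast<std::streamsize>(count));
}
```

The caller is responsible for passing the same count on both sides, since nothing in the stream records how many bytes were written.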

void delete_created_pointers();

Deletes all objects created by the loading of pointers. This can be used to avoid memory leaks that might otherwise occur if pointers are being loaded and the archive load encounters an exception.

template<class T> register_type(T * t = NULL);

Retrieves the next integer from the archive and adds an entry to a table which relates the integer to the type T. When pointers are loaded, this integer is used to indicate which object type should be created. This process is referred to as "class registration". It is only necessary to invoke this function for objects of derived classes which are serialized through a base class pointer. If this function is called during the saving of data to the archive, it should be called during the loading of the data from the archive at the same point in the serialization process. This is explained in detail in Special Considerations - Derived Pointers.

unsigned int library_version() const;

Returns the version number of the serialization library that created the archive. This number will be incremented each time the library is altered in such a way that serialization could be altered for some type. For example, suppose the type used for a count of collection members is changed. The code that loads collections might be conditioned on the library version to make sure that archives created by previous versions of the library can still be read.
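The collection-count scenario can be sketched as follows. The threshold wide_count_version, the 32-bit and 64-bit widths, and the functions save_count and load_count are all assumptions made for this illustration; they do not describe the library's actual archive format or version history.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical: the library version at which counts widened from 32 to 64 bits.
const unsigned int wide_count_version = 6;

// Writes the count in the width used by the given library version.
void save_count(std::ostream & os, std::uint64_t count, unsigned int library_version) {
    if (library_version < wide_count_version) {
        const std::uint32_t narrow = static_cast<std::uint32_t>(count);
        os.write(reinterpret_cast<const char *>(&narrow), sizeof narrow);
    } else {
        os.write(reinterpret_cast<const char *>(&count), sizeof count);
    }
}

// Reads the count back, choosing the width from the version that
// created the archive.
std::uint64_t load_count(std::istream & is, unsigned int library_version) {
    if (library_version < wide_count_version) {
        std::uint32_t narrow = 0;
        is.read(reinterpret_cast<char *>(&narrow), sizeof narrow);
        return narrow;
    }
    std::uint64_t wide = 0;
    is.read(reinterpret_cast<char *>(&wide), sizeof wide);
    return wide;
}
```

In real loading code the version argument would come from the archive's library_version() member, so an older archive is read with the older layout.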

is_saving::type = mpl::bool_<false>;
is_saving::value = false;
is_loading::type = mpl::bool_<true>;
is_loading::value = true;

These integral constants permit archive attributes to be queried at compile time or execution time. They can be used to generate code with Boost MPL. For an example showing how these can be used, see the implementation of split_free.hpp.

iarchive(IStream & is, unsigned int flags = 0);

Constructs an archive given an open stream as an argument and optional flags. If flags are used, they should be the same as those used when the archive was created. The function and usage of flags are described above.

~iarchive();

Destructor for an archive. This should be called before the stream is closed. It restores any altered stream facets to their state before the archive was opened.

There are archives based on text, binary and XML file formats but all have the above interface. Given that all archives present the same public interface, specification of serialization is exactly the same for all archives. Archive classes have other members not mentioned here. However, they are related to the internal functioning of the library and are not meant to be called by users of an archive. Implementation of new archives is discussed in New Archives - Implementation.

The existence of the << and >> operators suggests a relationship between archives and C++ i/o streams. Archives are not C++ i/o streams. All the archives included with this system take a stream as an argument in the constructor and that stream is used for output or input. However, this is not a requirement of the serialization functions or the archive interface. It just turns out that the archives written so far have found it useful to base their implementation on streams.

Details by Type

Primitive Types

In this document, we use the term primitive type to mean types whose data is simply saved/loaded to/from an archive with no further processing. By default, arithmetic (including characters), bool, and enum types are primitive types. Using serialization traits, any user type can also be designated as "primitive" so that it is handled in this way.

The template operators &, <<, and >> of the archive classes described above will generate code to save/load all primitive types to/from an archive. This code will usually just add the data to the archive according to the archive format. For example, a four byte integer is appended to a binary archive as 4 binary bytes, while to a text archive it would be rendered as a space followed by a string representation.
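The difference between the two renderings of a four byte integer can be seen in a small toy model. The functions render_text and render_binary are hypothetical and only imitate the formats described above; they do not reproduce any header or other data a real archive might write.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Text rendering (toy model): a space followed by the decimal string.
std::string render_text(std::int32_t value) {
    std::ostringstream os;
    os << ' ' << value;
    return os.str();
}

// Binary rendering (toy model): the four bytes of the value in native order.
std::string render_binary(std::int32_t value) {
    return std::string(reinterpret_cast<const char *>(&value), sizeof value);
}
```

The binary form always occupies four bytes, while the length of the text form depends on the number of digits in the value.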

Class Types

For class/struct types, the template operators &, <<, and >> will generate code that invokes the programmer's serialization code for the particular data type. There is no default. An attempt to serialize a class/struct for which no serialization has been explicitly specified will result in a compile time error. Specification of serialization for user defined types is explained in detail in the next section of this manual.

Base Classes

The header file base_object.hpp includes the template:

template<class Base, class Derived>
Base & base_object(Derived &d);
which should be used to create a reference to the base class portion of an object, which can then be used as an argument to the archive save/load operators:

ar & boost::serialization::base_object<Base>(*this);
Resist the temptation to just cast *this to the base class. This might seem to work but may fail to invoke code necessary for proper serialization.

Pointers

A pointer to any class instance can be serialized with any of the archive save/load operators.

To properly save and restore an object through a pointer the following situations must be addressed:

  1. If the same object is saved multiple times through different pointers, only one copy of the object need be saved.
  2. If an object is loaded multiple times through different pointers, only one new object should be created and all returned pointers should point to it.
  3. The system must detect the case where an object is first saved through a pointer then the object itself is saved. Without taking extra precautions, loading would result in the creation of multiple copies of the original object. This system detects this case when saving and throws an exception - see below.
  4. An object of a derived class may be stored through a pointer to the base class. The true type of the object must be determined and saved. Upon restoration the correct type must be created and its address correctly cast to the base class. That is, polymorphic pointers have to be considered.
  5. NULL pointers must be detected when saved and restored to NULL when deserialized.

This serialization library addresses all of the above considerations. The process of saving and loading an object through a pointer is non-trivial. It can be summarized as follows:

Saving a pointer:

  1. determine the true type of the object being pointed to.
  2. write a special tag to the archive
  3. if the object pointed to has not already been written to the archive, do so now

Loading a pointer:

  1. read a tag from the archive.
  2. determine the type of object to be created
  3. if the object has already been loaded, return its address.
  4. otherwise, create a new instance of the object
  5. read the data back in using the operators described above
  6. return the address of the newly created object.

Class instances are saved/loaded to/from the archive only once, regardless of how many times they are serialized with the << and >> operators. Serialization of pointers of derived types through a pointer to the base class may require a little extra "help". Also, the programmer may wish to modify the process described above for their own reasons. For example, it might be desired to suppress the tracking of objects when it is known a priori that the application in question can never create duplicate objects. Serialization of pointers can be "fine tuned" via the specification of Class Serialization Traits as described in another section of this manual.
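The save and load steps listed above can be modelled in a few lines for the simplest case, pointers to int. This is a toy sketch, not the library's implementation: tracking_saver and tracking_loader are hypothetical names, the tag is assumed to be a small integer id with -1 standing for NULL, the step that determines the true type of a polymorphic object is left out, and the loader keeps ownership of the objects it creates.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Saving side (toy): remembers which addresses have already been written.
class tracking_saver {
    std::ostream & os_;
    std::map<const int *, int> ids_;
public:
    explicit tracking_saver(std::ostream & os) : os_(os) {}

    void save_pointer(const int * p) {
        if (p == nullptr) { os_ << " -1"; return; }   // tag for a NULL pointer
        auto it = ids_.find(p);
        if (it != ids_.end()) {                       // already written: tag only
            os_ << ' ' << it->second;
            return;
        }
        const int id = static_cast<int>(ids_.size());
        ids_[p] = id;
        os_ << ' ' << id << ' ' << *p;                // tag, then the object data
    }
};

// Loading side (toy): creates one object per tag and owns what it creates.
class tracking_loader {
    std::istream & is_;
    std::vector<std::unique_ptr<int>> objects_;
public:
    explicit tracking_loader(std::istream & is) : is_(is) {}

    int * load_pointer() {
        int id = -1;
        is_ >> id;                                    // read the tag
        if (id < 0) return nullptr;                   // NULL was saved
        if (static_cast<std::size_t>(id) < objects_.size())
            return objects_[id].get();                // already loaded: same address
        std::unique_ptr<int> obj(new int(0));         // otherwise create a new object
        is_ >> *obj;                                  // and read its data
        objects_.push_back(std::move(obj));
        return objects_.back().get();
    }
};
```

Saving the same address twice writes the object data once, and loading the two tags yields two equal pointers. The pointers returned by this toy loader remain valid only while the loader object itself is alive.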

References

In general, references are serialized just as any other objects are. However, references have the property that several references may refer to the same object, much like pointers. So there exists the opportunity to gain some storage efficiency by storing only one copy. This subject is addressed in the chapter Class Serialization Traits - Object Tracking.

Character Sets

This library includes two archive classes for XML. The wide character version (xml_w?archive) renders its output as UTF-8, which can handle any wide character without loss of information. std::string data is converted from multi-byte format to wide character format using the current locale. Hence this version should give a fair rendering of all C++ data for all cases. This could result in some unexpected behavior. Suppose a std::string is created with the locale character set to Hebrew characters. On output this is converted to wide characters. On input, however, there could be a problem if the locale is not set the same as when the archive was created.

The normal character version (xml_?archive) renders std::string output without any conversion. Though this may work fine for serialization, it may create difficulties if the XML archive is used for some other purpose.


© Copyright Robert Ramey 2002-2004. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)