Serialization

Class Serialization


Member Function
Free Function
Base Classes
Versioning
Splitting serialize into save/load
Member Functions
Free Functions
const Members
Non-Default Constructors
Reference Members
Templates
Class Serialization Traits
Serialization Wrappers
Serialization Implementations Included in the Library
The header file serialization.hpp contains the public interface to the serialization library. This entire interface consists of three overridable function templates.

Member Function

The first of these three templates is:

template<class Archive, class T>
inline void serialize(
    Archive & ar, 
    T & t, 
    const unsigned int file_version
){
    // invoke member function for class T
    t.serialize(ar, file_version);
}
It is invoked each time the data members of a class instance are to be saved to or loaded from an archive. The default definition of this template presumes the existence of a class member function template of the following signature:

template<class Archive>
void serialize(Archive &ar, const unsigned int version){
    ...
}
If this member function template is not declared, a compile-time error will occur. For the member function generated by this template to be callable when data is appended to an archive, it must either be public or the class must be made accessible to the serialization library by including:

friend class boost::serialization::access;
in the class definition. This latter method should be preferred over the option of making the member function public, because it prevents serialization functions from being called from outside the library. Calling them from outside the library is almost certainly an error. Unfortunately, such a call may appear to function but fail in a way that is very difficult to find.

It may not be immediately obvious how this one template serves for both saving data to an archive and loading data from the archive. The key is that the & operator is defined as << for output archives and as >> for input archives. The "polymorphic" behavior of the & operator permits the same template to be used for both save and load operations. This is very convenient in that it saves a lot of typing and guarantees that the saving and loading of class data members are always in sync. This is the key to the whole serialization system.
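The dual role of the & operator can be illustrated with a small stand-alone sketch. The classes toy_oarchive and toy_iarchive, the struct point and the helper round_trip are hypothetical names invented for this illustration; they are not the library's archive classes and only mimic the dispatch described above, using whitespace-separated text as the storage format.

```cpp
#include <cassert>
#include <sstream>

// Toy output archive (an assumption, not a library class):
// operator& forwards to operator<<.
class toy_oarchive {
    std::ostream & os_;
public:
    explicit toy_oarchive(std::ostream & os) : os_(os) {}
    template<class T>
    toy_oarchive & operator<<(const T & t) { os_ << t << ' '; return *this; }
    template<class T>
    toy_oarchive & operator&(const T & t) { return *this << t; }
};

// Toy input archive (an assumption, not a library class):
// operator& forwards to operator>>.
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}
    template<class T>
    toy_iarchive & operator>>(T & t) { is_ >> t; return *this; }
    template<class T>
    toy_iarchive & operator&(T & t) { return *this >> t; }
};

struct point {
    int x;
    int y;
    // One template serves both directions: the archive type decides
    // whether each & writes the member or reads it.
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* file_version */) {
        ar & x;
        ar & y;
    }
};

// Saves p through the toy output archive, then loads a fresh copy back.
inline point round_trip(point p) {
    std::stringstream ss;
    toy_oarchive oa(ss);
    p.serialize(oa, 0);
    point q = {0, 0};
    toy_iarchive ia(ss);
    q.serialize(ia, 0);
    assert(!ss.fail());   // both reads succeeded
    return q;
}
```

Because the members are listed once, in one order, the save and load paths cannot drift apart in this sketch; the real archives add tracking, versioning and type information on top of the same idea.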

Free Function

Of course we're not restricted to using the default implementation described above. We can override the default one with our own. Doing this will permit us to implement serialization of a class without altering the class definition itself. We call this non-intrusive serialization. Suppose our class is named my_class. The override would then be specified as:

// namespace selection

template<class Archive>
inline void serialize(
    Archive & ar, 
    my_class & t, 
    const unsigned int file_version
){
    ...
}
Note that we have called this override "non-intrusive". This is slightly inaccurate. It does not require that the class have special functions, that it be derived from some common base class or any other fundamental design changes. However, it will require access to the class members that are to be saved and loaded. If these members are private, it won't be possible to serialize them. So in some instances, minor modifications to the class to be serialized will be necessary even when using this "non-intrusive" method. In practice this may not be such a problem, as many libraries (e.g. the STL) expose enough information to permit implementation of non-intrusive serialization with absolutely no changes to the library.

Regardless of which method is used, the body of the serialize function will specify the data to be saved/loaded by sequential application of the archive operator & to all the data members of the class.


{
    // save/load class member variables
    ar & member1;
    ar & member2;
}

Namespaces for Free Function Overrides

The question arises as to which namespace free serialization functions should be part of.

The options depend on whether the compiler implements argument-dependent lookup (ADL), whether it implements two-phase lookup, and whether the type T is a dependent type, according to the following table:

ADL   Two-Phase Lookup   Dependent Type T?   Namespaces permitted
no    no                 -                   boost::serialization
no    yes                -                   (no compilers do this)
yes   no                 -                   boost::serialization, namespace of T, namespace of Archive
yes   yes                no                  namespace of T, namespace of Archive
yes   yes                yes                 boost::serialization, namespace of T, namespace of Archive

To deal with this while maintaining portability, the test programs use the following before specifying free function overloads:


// function specializations must be defined in the appropriate
// namespace - boost::serialization
#ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP
namespace boost { namespace serialization {
#endif
which works for all compilers.

From Vandevoorde and Josuttis's book "C++ Templates - A Complete Guide"[14] page 509:

dependent name
A name the meaning of which depends on a template parameter. For example, A<T>::x is a dependent name when A or T is a template parameter. The name of a function in a function call is also dependent if any of the arguments in the call has a type that depends on a template parameter. For example, f in f((T*)0) is dependent if T is a template parameter. The name of a template parameter is not considered dependent, however.
and page 515:
two-phase lookup
The name lookup mechanism used for names in templates. The "two phases" are (1) the phase during which a template definition is first encountered by a compiler, and (2) the instantiation of a template. Nondependent names are looked up only in the first phase, but during this first phase nondependent base classes are not considered. Dependent names with a scope qualifier (::) are looked up only in the second phase. Dependent names without a scope qualifier may be looked up in both places, but in the second phase only argument-dependent lookup is performed.
In this library, the file serialization.hpp, which calls the serialization override, is included by including any archive classes. This would suggest that all serialization overrides could be in any of the three possible namespaces if the serialization code is included before the archives. However, this is not always possible. Our implementation of "export" functionality requires just the opposite.

This is considered inelegant, to say the least. Hopefully, this may be improved in the future.
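The effect of argument-dependent lookup on an unqualified, dependent function name can be seen in a stand-alone sketch. All names here (lib, user, other, describe, call_describe, widget, gadget) are hypothetical and chosen only for illustration; the sketch assumes a compiler that implements both ADL and two-phase lookup, and it does not use the serialization library itself.

```cpp
#include <cassert>
#include <string>

namespace lib {
    // Generic fallback, playing a role similar to a library default.
    template<class T>
    std::string describe(const T &) { return "lib default"; }

    // The unqualified call makes describe a dependent name, so at
    // instantiation ADL also searches the namespace of T.
    template<class T>
    std::string call_describe(const T & t) { return describe(t); }
}

namespace user {
    struct widget {};
    // Overload placed in the namespace of T. It is declared after the
    // template above, yet it is found at instantiation through ADL.
    inline std::string describe(const widget &) { return "user override"; }
}

namespace other {
    struct gadget {};   // no overload in this namespace
}

inline void demo() {
    assert(lib::call_describe(user::widget()) == "user override");
    assert(lib::call_describe(other::gadget()) == "lib default");
}
```

On a compiler without ADL, the overload in namespace user would not be found by the call inside lib, which is why the test programs fall back to placing overrides in boost::serialization.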

Base Classes

If the class to be serialized is derived from another class, its data should be serialized with the following syntax:

{
    // invoke serialization of the base class 
    ar & boost::serialization::base_object<base_class_of_T>(*this);
    // save/load class member variables
    ar & member1;
    ar & member2;
}
Note that this is NOT the same as calling the serialize function of the base class. This might seem to work but will circumvent certain code used for tracking of objects, and registering base-derived relationships and other bookkeeping that is required for the serialization system to function as designed. For this reason, all serialize member functions should be private.

Versioning

It will eventually occur that class definitions change after archives have been created. When a class instance is saved, the current version is included in the class information stored in the archive. When the class instance is loaded from the archive, the original version number is passed as an argument to the loading function. This permits the load function to include logic to accommodate older definitions of the class and reconcile them with the latest version. Save functions always save the current version, so older format archives are automatically converted to the newest version when they are loaded and saved again. Version numbers are maintained independently for each class. This results in a simple system for permitting access to older files and conversion of same. The current version of the class is assigned as a Class Serialization Trait described later in this manual.

{
    // invoke serialization of the base class 
    ar & boost::serialization::base_object<base_class_of_T>(*this);
    // save/load class member variables
    ar & member1;
    ar & member2;
    // if it's a recent version of the class
    if(1 < file_version)
        // save/load recently added class members
        ar & member3;
}
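The version mechanism can be modelled without the library to show how an old archive is read by newer code. This is a minimal sketch under stated assumptions: the toy archives, the constant current_version, the struct record and the helpers save_record and load_record are all hypothetical, and the version number is simply written ahead of the data, which is not necessarily how the library lays out its archives.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy archives (assumptions, not library classes).
class toy_oarchive {
    std::ostream & os_;
public:
    explicit toy_oarchive(std::ostream & os) : os_(os) {}
    template<class T>
    toy_oarchive & operator&(const T & t) { os_ << t << ' '; return *this; }
};
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}
    template<class T>
    toy_iarchive & operator&(T & t) { is_ >> t; return *this; }
};

const unsigned int current_version = 1;   // hypothetical class version

struct record {
    int member1;
    int member2;
    int member3;   // added in version 1
    record() : member1(0), member2(0), member3(-1) {}
    template<class Archive>
    void serialize(Archive & ar, const unsigned int file_version) {
        ar & member1;
        ar & member2;
        if (file_version > 0)   // absent from version 0 archives
            ar & member3;
    }
};

// Saving always records the current version ahead of the data.
inline std::string save_record(record r) {
    std::ostringstream os;
    toy_oarchive oa(os);
    oa & current_version;
    r.serialize(oa, current_version);
    return os.str();
}

// Loading reads the stored version and passes it to serialize.
inline record load_record(const std::string & text) {
    std::istringstream is(text);
    toy_iarchive ia(is);
    unsigned int file_version = 0;
    ia & file_version;
    record r;
    r.serialize(ia, file_version);
    return r;
}
```

Loading the version 0 text "0 10 20" leaves member3 at its constructor default, and saving the result writes a version 1 record, which is the automatic conversion described above.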

Splitting serialize into Save/Load

There are times when it is inconvenient to use the same template for both save and load functions. For example, this might occur if versioning gets complex.

Splitting Member Functions

For member functions this can be addressed by including the header file boost/serialization/split_member.hpp and including code like this in the class:

template<class Archive>
void save(Archive & ar, const unsigned int version) const
{
    // invoke serialization of the base class 
    ar << boost::serialization::base_object<const base_class_of_T>(*this);
    ar << member1;
    ar << member2;
    ar << member3;
}

template<class Archive>
void load(Archive & ar, const unsigned int version)
{
    // invoke serialization of the base class 
    ar >> boost::serialization::base_object<base_class_of_T>(*this);
    ar >> member1;
    ar >> member2;
    if(version > 0)
        ar >> member3;
}

template<class Archive>
void serialize(
    Archive & ar,
    const unsigned int file_version 
){
    boost::serialization::split_member(ar, *this, file_version);
}
This splits the serialization into two separate functions, save and load. Since the new serialize template is always the same, it can be generated by invoking the macro BOOST_SERIALIZATION_SPLIT_MEMBER() defined in the header file boost/serialization/split_member.hpp. So the entire serialize function above can be replaced with:

BOOST_SERIALIZATION_SPLIT_MEMBER()

Splitting Free Functions

The situation is the same for non-intrusive serialization with the free serialize function template. To use save and load function templates rather than serialize:

namespace boost { namespace serialization {
template<class Archive>
void save(Archive & ar, const my_class & t, unsigned int version)
{
    ...
}
template<class Archive>
void load(Archive & ar, my_class & t, unsigned int version)
{
    ...
}
}}
include the header file boost/serialization/split_free.hpp and override the free serialize function template:

namespace boost { namespace serialization {
template<class Archive>
inline void serialize(
    Archive & ar,
    my_class & t,
    const unsigned int file_version
){
    split_free(ar, t, file_version); 
}
}}
To shorten typing, the above template can be replaced with the macro:

BOOST_SERIALIZATION_SPLIT_FREE(my_class)
Note that although the functionality to split the serialize function into save/load has been provided, the usage of the serialize function with the corresponding & operator is preferred. The key to the serialization implementation is that objects are saved and loaded in exactly the same sequence. Using the & operator and serialize function guarantees that this is always the case and will minimize the occurrence of hard-to-find errors related to synchronization of save and load functions.

const Members

Saving const members to an archive requires no special considerations. Loading const members can be addressed by using a const_cast:

    ar & const_cast<T &>(t);
Note that this violates the spirit and intention of the const keyword. const members are initialized when a class instance is constructed and not changed thereafter. However, this may be most appropriate in many cases. Ultimately, it comes down to the question about what const means in the context of serialization.

Non-Default Constructors

The general procedure used for serialization of objects through a pointer has been described in a previous section. This is implemented by code in the serialization library which is similar to the following:

// load data required for construction and invoke constructor in place
template<class Archive, class T>
inline void load_construct_data(
    Archive & ar, T * t, const unsigned int file_version
){
    // default just uses the default constructor to initialize
    // previously allocated memory. 
    ::new(t)T();
}
The default load_construct_data invokes the default constructor "in-place" to initialize the memory.
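The "in-place" construction used by the default can be sketched on its own, outside the library. The names counter, construct_in_place and demo are hypothetical; the sketch only shows the placement-new step on memory that has already been allocated, and it releases that memory itself, which is an assumption of the example rather than a statement about how the library manages storage.

```cpp
#include <cassert>
#include <new>

struct counter {
    int value;
    counter() : value(42) {}
};

// Same shape as the default described above: construct a T in memory
// the caller has already allocated. The name is hypothetical.
template<class T>
inline void construct_in_place(T * t) {
    ::new(t) T();
}

inline int demo() {
    // Raw storage only; no counter object exists here yet.
    void * raw = ::operator new(sizeof(counter));
    counter * c = static_cast<counter *>(raw);
    construct_in_place(c);      // runs counter::counter() on that storage
    const int result = c->value;
    assert(result == 42);
    c->~counter();              // destroy explicitly, then free the storage
    ::operator delete(raw);
    return result;
}
```

A class without a default constructor cannot be used with construct_in_place as written, which is the situation the overrides that follow are meant to handle.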

If there is no such default constructor, the function templates load_construct_data and perhaps save_construct_data will have to be overridden. Here is a simple example:


class my_class {
private:
    friend class boost::serialization::access;
    int member;
    template<class Archive>
    void serialize(Archive &ar, const unsigned int file_version){
        ar & member;
    }
public:
    my_class(int m) :
        member(m)
    {}
};
the overrides would be:

namespace boost { namespace serialization {
template<class Archive>
inline void save_construct_data(
    Archive & ar, const my_class * t, const unsigned int file_version
){
    // save data required to construct instance
    ar << t->member;
}

template<class Archive>
inline void load_construct_data(
    Archive & ar, my_class * t, const unsigned int file_version
){
    // retrieve data from archive required to construct new instance
    int m;
    ar >> m;
    // invoke inplace constructor to initialize instance of my_class
    ::new(t)my_class(m);
}
}} // namespace ...
In addition to the deserialization of pointers, these overrides are used in the deserialization of STL containers whose element type has no default constructor.

Reference Members

Classes that contain reference members will generally require non-default constructors, as references can only be set when an instance is constructed. The example of the previous section is slightly more complex if the class has reference members. This raises the question of how and where the objects being referred to are stored and how they are created. There is also the question of references to polymorphic base classes. Basically, these are the same questions that arise regarding pointers. This is no surprise, as references are really a special kind of pointer. We address these questions by serializing references as though they were pointers.

class object;
class my_class {
private:
    friend class boost::serialization::access;
    int member1;
    object & member2;
    template<class Archive>
    void serialize(Archive &ar, const unsigned int file_version);
public:
    my_class(int m, object & o) :
        member1(m), 
        member2(o)
    {}
};
the overrides would be:

namespace boost { namespace serialization {
template<class Archive>
inline void save_construct_data(
    Archive & ar, const my_class * t, const unsigned int file_version
){
    // save data required to construct instance
    ar << t->member1;
    // serialize reference to object as a pointer
    ar << &t->member2;
}

template<class Archive>
inline void load_construct_data(
    Archive & ar, my_class * t, const unsigned int file_version
){
    // retrieve data from archive required to construct new instance
    int m;
    ar >> m;
    // create and load data through pointer to object
    // tracking handles issues of duplicates.
    object * optr;
    ar >> optr;
    // invoke inplace constructor to initialize instance of my_class
    ::new(t)my_class(m, *optr);
}
}} // namespace ...

Templates

Implementing serialization for templates is exactly the same process as for normal classes and requires no additional considerations. Among other things, this implies that serialization of compositions of templates is automatically generated when required if serialization of the component templates is defined. For example, this library includes definitions of serialization for boost::shared_ptr<T> and for std::list<T>. If I have defined serialization for my own class my_t, then serialization for std::list< boost::shared_ptr< my_t> > is already available for use.

For an example that shows how this idea might be implemented for your own class templates, see demo_auto_ptr.cpp. This shows how non-intrusive serialization for the template auto_ptr from the standard library can be implemented.

A somewhat trickier addition of serialization to a standard template can be found in the example shared_ptr.hpp.

In the specification of serialization for templates, it's common to split serialize into a load/save pair. Note that the convenience macro described above isn't helpful in these cases, as the number and kind of template class arguments won't match those used when splitting serialize for a simple class. Use the override syntax instead.

Class Serialization Traits

Serialization Wrappers

Serialization Implementations Included in the Library

This library includes code to serialize C-style arrays of other serializable types. That is, if T is a serializable type, then the following is automatically available and will function as expected:

T t[4];
ar << t;
    ...
ar >> t;
The facilities described above are sufficient to implement serialization for all STL containers. In fact, this has been done and has been included in the library. For example, in order to use the included serialization code for std::list, use:

#include <boost/serialization/list.hpp>
rather than

#include <list>
Since the former includes the latter, this is all that is necessary. The same holds true for all STL collections as well as templates required to support them (e.g. std::pair).

© Copyright Robert Ramey 2002-2004. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)