Serialization

Archive Class Reference

Implementation
Usage
Testing
Polymorphic Archives

Implementation

The Archive concept specifies the functions that a class must implement in order to be used to serialize Serializable types. The library implements a family of archives appropriate for different purposes. This section describes how they have been implemented and how one can implement one's own archive class. Our discussion will focus on archives used for loading, as the hierarchy is exactly analogous for archives used for saving data.

Our archives have been factored into a tree of classes in order to minimize repetition of code. This is shown in the accompanying class diagram. Any class which fulfills the following requirements will function as a loading archive.

Minimum Requirements

The archive class is derived from:


template<class Archive>
detail::common_iarchive;

An instance of this template handles all the "bookkeeping" associated with serialization. In order to be a functional archive, only the following additional functions must be declared:

void load(T &t);: This function must be implemented for all primitive data types. This can be accomplished through the use of a member template or explicit declarations for all primitive types.
void load_binary(void *address, std::size_t size);: This function should read size bytes from the archive and copy them into memory starting at address.
friend class boost::archive::load_access;: In addition, such a class must provide this friend declaration to grant the core serialization library access to the functions of this class.

So, the most trivial implementation of an input archive would look like this:



#include <boost/archive/detail/common_iarchive.hpp>

/////////////////////////////////////////////////////////////////////////
// class trivial_iarchive - read serialized objects from an input text stream
class trivial_iarchive : 
    public boost::archive::detail::common_iarchive<trivial_iarchive>
{
    // permit serialization system privileged access to permit
    // implementation of inline templates for maximum speed.
    friend class boost::archive::load_access;

    // member template for loading primitive types.
    // Override for any types/templates that require special treatment
    template<class T>
    void load(T & t);

public:
    //////////////////////////////////////////////////////////
    // public interface used by programs that use the
    // serialization library

    // archives are expected to support this function
    void load_binary(void *address, std::size_t count);
};

The simplest possible output archive class is exactly analogous to the above. In the following discussion, only input archives will be addressed. Output archives are exactly symmetrical to input archives.

Given suitable definitions of load and load_binary, any program using serialization with a conforming C++ compiler should compile and run with this archive class.
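As a rough illustration of what such definitions might look like, the following sketch reads primitives with stream extraction and raw bytes with std::istream::read. It is a standalone stand-in, not the library's code: the class name sketch_trivial_iarchive, the stream member and the check function are assumptions made for illustration, and the common_iarchive base and the load_access friend declaration are left out so that the fragment compiles without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical stand-in for trivial_iarchive. A real archive would derive
// from common_iarchive and keep load() private behind the friend declaration.
class sketch_trivial_iarchive {
    std::istream & is_;  // assumed source of archive data
public:
    explicit sketch_trivial_iarchive(std::istream & is) : is_(is) {}

    // member template covering every type that supports operator>>
    template<class T>
    void load(T & t) {
        is_ >> t;
    }

    // copy count raw bytes from the stream to memory starting at address
    void load_binary(void *address, std::size_t count) {
        is_.read(static_cast<char *>(address),
                 static_cast<std::streamsize>(count));
    }
};

// Exercise both functions on a small in-memory stream.
inline void sketch_trivial_iarchive_check() {
    std::istringstream in("42wxyz");
    sketch_trivial_iarchive ar(in);

    int n = 0;
    ar.load(n);  // extraction stops at the first non-digit character
    char raw[4];
    ar.load_binary(raw, sizeof raw);

    assert(n == 42);
    assert(std::memcmp(raw, "wxyz", 4) == 0);
}
```

Calling sketch_trivial_iarchive_check() runs the sketch. A text format would normally separate items with whitespace; the unseparated input here only keeps the raw byte read easy to follow.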

Optional Overrides

The detail::common_iarchive class contains a number of functions that are used by various parts of the serialization library to help render the archive in a particular form.

void load_start(): Default: Does nothing.
Purpose: To inject/retrieve an object name into the archive. Used by XML archive to inject "<name " before data.
void load_end(): Default: Does nothing.
Purpose: To inject/retrieve an object name into the archive. Used by XML archive to inject "</name>" after data.
void end_preamble(): Default: Does nothing.
Purpose: Called each time user data is saved. It's not called when archive bookkeeping data is saved. This is used by XML archives to determine when to inject a ">" character at the end of an XML header. XML output archives keep their own internal flag indicating that the data being written is header data. This internal flag is set when an object start tag is written. When void end_preamble() is invoked and this internal flag is set, a ">" character is appended to the output and the internal flag is reset. The default implementation for void end_preamble() is a no-op, thereby permitting it to be optimised away for archive classes that don't use it.
template<class T> void load_override(T & t, int);: Default: Invokes archive::load(Archive & ar, t)
This is the main entry into the serialization library.
Purpose: This can be overridden in cases where the data is to be written to the archive in some special way. For example, XML archives implement special handling for name-value pairs by overriding this function template for name-value pairs. This replaces the default name-value pair handling, which is just to throw away the name, with one appropriate for XML which writes out the start of an XML tag with the correct object name.
The second argument must be part of the function signature even though it is not used. Its purpose is to be sure that code is portable to compilers which fail to correctly implement partial function template ordering. For more information see this.
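The shape of such an override can be sketched without the library. In the fragment below every name (named_value, sketch_common_iarchive, sketch_named_iarchive, names_seen) is hypothetical and stands in for the corresponding library component. The derived archive simply records the names it sees rather than parsing XML tags, and the sketch assumes a compiler that orders function templates correctly.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a name-value pair wrapper.
template<class T>
struct named_value {
    const char *name;
    T & value;
};

// Hypothetical base playing the role of detail::common_iarchive.
template<class Archive>
class sketch_common_iarchive {
protected:
    Archive & This() { return static_cast<Archive &>(*this); }
public:
    // default: hand the value to the derived class's load()
    template<class T>
    void load_override(T & t, int) {
        This().load(t);
    }

    // entry point: always dispatch through the most derived class
    template<class T>
    Archive & operator>>(T & t) {
        This().load_override(t, 0);
        return This();
    }
};

class sketch_named_iarchive :
    public sketch_common_iarchive<sketch_named_iarchive>
{
    typedef sketch_common_iarchive<sketch_named_iarchive> base;
    friend class sketch_common_iarchive<sketch_named_iarchive>;

    std::istream & is_;

    template<class T>
    void load(T & t) { is_ >> t; }

public:
    std::vector<std::string> names_seen;  // stands in for tag handling

    explicit sketch_named_iarchive(std::istream & is) : is_(is) {}

    // everything without special treatment uses the default
    template<class T>
    void load_override(T & t, int) {
        base::load_override(t, 0);
    }

    // special treatment for name-value pairs: use the name, then load the value
    template<class T>
    void load_override(named_value<T> & nv, int) {
        names_seen.push_back(nv.name);
        base::load_override(nv.value, 0);
    }
};

// Load one plain value and one named value.
inline void sketch_load_override_check() {
    std::istringstream in("7 9");
    sketch_named_iarchive ar(in);
    int a = 0, b = 0;
    named_value<int> nb = { "b", b };
    ar >> a >> nb;
    assert(a == 7 && b == 9);
    assert(ar.names_seen.size() == 1 && ar.names_seen[0] == "b");
}
```

The unused int parameter is carried through every call so that the signatures match the form described above.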

Types used by the serialization library

The serialization library injects bookkeeping data into the serialization archive. This data includes things like object ids, version numbers, class names etc. Each of these objects is included in a wrapper so that the archive class can override the implementation of void load_override(T & t, int);. For example, in the XML archive, the override for this type renders an object_id equal to 23 as "object_id=_23". The following table lists the types defined in the boost::archive namespace used internally by the serialization library:

type	default serialized as
`version_type`	`unsigned int`
`object_id_type`	`unsigned int`
`object_id_reference_type`	`unsigned int`
`class_id_type`	`int`
`class_id_optional_type`	`nothing`
`class_id_reference_type`	`int`
`tracking_type`	`bool`
`classname_type`	`string`

All of these are associated with a default serialization defined in terms of primitive types so it isn't a requirement to define load_override for these types.

These are defined in basic_archive.hpp. All of these types have been assigned an implementation level of primitive and are convertible to types such as int, unsigned int, etc. so that they have default implementations. This is illustrated by basic_text_iarchive.hpp, which relies upon the default. However, in some cases, overrides will have to be explicitly provided for these types. For an example see basic_xml_iarchive.hpp.
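The following sketch shows why a wrapper that converts to a primitive gets a default treatment for free, and how an archive can still single it out. It uses saving rather than loading because the rendering is easier to see on output. All names here (sketch_object_id, sketch_plain_oarchive, sketch_attr_oarchive) are hypothetical and are not the library's types.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a bookkeeping wrapper such as object_id_type.
class sketch_object_id {
    unsigned int value_;
public:
    explicit sketch_object_id(unsigned int v) : value_(v) {}
    operator unsigned int() const { return value_; }  // enables the default
};

// Archive that only knows primitives: the wrapper converts implicitly.
class sketch_plain_oarchive {
    std::ostringstream os_;
public:
    void save(unsigned int t) { os_ << t; }
    std::string str() const { return os_.str(); }
};

// Archive that overrides the handling of the wrapper type.
class sketch_attr_oarchive {
    std::ostringstream os_;
public:
    void save(unsigned int t) { os_ << t; }
    void save(const sketch_object_id & id) {
        os_ << "object_id=_" << static_cast<unsigned int>(id);
    }
    std::string str() const { return os_.str(); }
};

// Save the same id through both archives and compare the renderings.
inline void sketch_object_id_check() {
    sketch_plain_oarchive plain;
    plain.save(sketch_object_id(23));
    assert(plain.str() == "23");

    sketch_attr_oarchive attr;
    attr.save(sketch_object_id(23));
    assert(attr.str() == "object_id=_23");
}
```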

In real practice, we probably won't be quite done. One or more of the following issues may need to be addressed:

Many compilers fail to implement correct partial ordering of function templates. The archives included with this library work around this using argument overloading. This technique is described in another section of this manual.
Even if we are using a conforming compiler, we might want our new archive class to be portable to non-conforming compilers.
Our archive format might require extra information inserted into it. For example, XML archives need <name ... >...</name> surrounding all data objects.
Addressing any of the above may generate more issues to be addressed.
The archives included with the library are all templates which use a stream or streambuf as a template parameter rather than simple classes. Combined with the above, even more issues arise with non-conforming compilers.

The attached class diagram shows the relationships between classes used to implement the serialization library.

A close examination of the archives included with the library illustrates what it takes to make a portable archive that covers all data types.

Usage

The newly created archive will usually be stored in its own header module. All that is necessary is to include the header and construct an instance of the new archive, except in one special case:

Instances of a derived class are serialized through a base class pointer.
Such instances are not "registered" either implicitly or explicitly. That is, the macro BOOST_CLASS_EXPORT is used to instantiate the serialization code for the included archives.

To make this work, the following should be included after the archive class definition.


#define BOOST_SERIALIZATION_REGISTER_ARCHIVE(Archive)

Failure to do this will not inhibit the program from compiling, linking and executing properly - except in one case. If an instance of a derived class is serialized through a pointer to its base class, the program will throw an unregistered_class exception.

Testing

Exhaustive testing of the library requires testing the different aspects of object serialization with each archive. There are 46 different tests that can be run with any archive. There are 5 "standard archives" included with the system (3 on systems that don't support wide character I/O).

In addition, there are 28 other tests which aren't related to any particular archive class.

The default bjam testing setup will run all the above described tests. This will result in as many as 46 archive tests * 5 standard archives + 28 general tests = 258 tests. Note that a complete test of the library would include DLL vs static library, release vs debug so the actual total would be closer to 1032 tests.
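The arithmetic behind these totals can be checked at compile time. The constant names below are made up for illustration, and the factor of four assumes that the two build choices mentioned above (DLL vs static library, release vs debug) are simply multiplied together.

```cpp
// Counts taken from the text above.
constexpr int archive_tests = 46;
constexpr int standard_archives = 5;
constexpr int general_tests = 28;

// One run of the default setup.
constexpr int tests_per_configuration =
    archive_tests * standard_archives + general_tests;

// Assumed: 2 link variants (DLL, static) times 2 build variants (release, debug).
constexpr int build_configurations = 2 * 2;
constexpr int tests_all_configurations =
    tests_per_configuration * build_configurations;

static_assert(tests_per_configuration == 258, "46 * 5 + 28");
static_assert(tests_all_configurations == 1032, "258 * 4");
```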

For each archive there is a header file in the test directory similar to the one below. The name of this archive is passed to the test program by setting the environment variable BOOST_ARCHIVE_TEST to the name of the header. Here is the header file test_archive.hpp. Test header files for other archives are similar.


// text_archive test header
// stream types used by the typedefs below
#include <fstream>
// include output archive header
#include <boost/archive/text_oarchive.hpp>
// set name of test output archive
typedef boost::archive::text_oarchive test_oarchive;
// set name of test output stream
typedef std::ofstream test_ostream;

// repeat the above for input archive
#include <boost/archive/text_iarchive.hpp>
typedef boost::archive::text_iarchive test_iarchive;
typedef std::ifstream test_istream;

// define open mode for streams
//   binary archives should use std::ios_base::binary
#define TEST_STREAM_FLAGS (std::ios_base::openmode)0

To test a new archive, for example, portable binary archives, with the gcc compiler, make a header file portable_binary_archive.hpp and invoke bjam with

 
-sBOOST_ARCHIVE_LIST=portable_binary_archive.hpp

This process is encapsulated in the shell or cmd script library_test, whose command line is


library_test --toolset=gcc -sBOOST_ARCHIVE_LIST=portable_binary_archive.hpp

Polymorphic Archives

Motivation

All archives described so far are implemented as templates. Code to save and load data to archives is regenerated for each combination of archive class and data type. Under these circumstances, a good optimizing compiler that can expand inline functions to enough depth will generate fast code. However:

Much inline code may be replicated.
If there are several archive classes, code will be regenerated for each archive class.
If serialization code is placed in a library, that library must be rebuilt each time a new archive class is created.
If serialization code is placed in a DLL,
- The DLL will contain versions of code for each known archive type. This would result in loading of DLLs which contain much code that is not used - basically defeating one of the main motivations for choosing to use a DLL in the first place.
- If a new archive is created and an application shipped, all DLLs have to be rebuilt, and reshipped along with the application which uses the new archive. Thus the other main motivation for using a DLL is defeated.

Implementation

The solution is the pair polymorphic_oarchive and polymorphic_iarchive. They present a common interface of virtual functions - no templates - that is equivalent to the standard templated one. This is shown in the accompanying class diagram.

The accompanying demo program in files demo_polymorphic.cpp, demo_polymorphic_A.hpp, and demo_polymorphic_A.cpp shows how polymorphic archives are to be used. Note the following:

demo_polymorphic_A.hpp and demo_polymorphic_A.cpp contain no templates and no reference to any specific archive implementation. That is, they will only have to be compiled once for all archive implementations. This even applies to archive classes created in the future.
The main program demo_polymorphic.cpp specifies a specific archive implementation.

As can be seen in the class diagram and the header files, this implementation is just a composition of the polymorphic interface and the standard template driven implementation. This composition is accomplished by the templates polymorphic_iarchive_route.hpp and polymorphic_oarchive_route.hpp. As these contain no code specific to the particular implementation archive, they can be used to create a polymorphic archive implementation from any functioning templated archive implementation.

As a convenience, small header files have been included which contain a typedef for the polymorphic implementation of each corresponding templated one. For example, the headers polymorphic_text_iarchive.hpp and polymorphic_text_oarchive.hpp contain the typedefs for the polymorphic implementations of the standard text archive classes text_iarchive.hpp and text_oarchive.hpp respectively. All included polymorphic archives use the same naming scheme.
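The composition described above can be sketched in a few lines. This is a simplified stand-in under stated assumptions, not the library's headers: the names sketch_polymorphic_iarchive, sketch_text_iarchive, sketch_iarchive_route, point and load_point are hypothetical, and only three primitive types are covered.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical polymorphic interface: virtual functions only, no templates.
class sketch_polymorphic_iarchive {
public:
    virtual ~sketch_polymorphic_iarchive() {}
    virtual void load(int & t) = 0;
    virtual void load(double & t) = 0;
    virtual void load(std::string & t) = 0;
};

// Hypothetical template driven implementation.
class sketch_text_iarchive {
    std::istream & is_;
public:
    explicit sketch_text_iarchive(std::istream & is) : is_(is) {}
    template<class T>
    void load(T & t) { is_ >> t; }
};

// Hypothetical "route": forwards each virtual call to the templated
// implementation and contains nothing specific to that implementation.
template<class ArchiveImplementation>
class sketch_iarchive_route :
    public sketch_polymorphic_iarchive,
    private ArchiveImplementation
{
public:
    explicit sketch_iarchive_route(std::istream & is) :
        ArchiveImplementation(is) {}
    void load(int & t) override { ArchiveImplementation::load(t); }
    void load(double & t) override { ArchiveImplementation::load(t); }
    void load(std::string & t) override { ArchiveImplementation::load(t); }
};

// Convenience typedef in the style of the small headers mentioned above.
typedef sketch_iarchive_route<sketch_text_iarchive>
    sketch_polymorphic_text_iarchive;

// Client code sees only the interface, so it is compiled once for all
// archive implementations.
struct point { int x; double y; };

inline void load_point(sketch_polymorphic_iarchive & ar, point & p) {
    ar.load(p.x);
    ar.load(p.y);
}

// Only this function names a specific archive implementation.
inline void sketch_polymorphic_check() {
    std::istringstream in("3 4.5");
    sketch_polymorphic_text_iarchive ar(in);
    point p = { 0, 0.0 };
    load_point(ar, p);
    assert(p.x == 3);
    assert(p.y == 4.5);
}
```

In this sketch load_point plays the part of demo_polymorphic_A.cpp, and sketch_polymorphic_check plays the part of the main program that selects the archive.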

Usage

Polymorphic archives address the issues raised above regarding templated implementation. That is, there is no replicated code, and no recompilation for new archives. This will result in smaller executables for programs which use more than one type of archive, and smaller DLLs. There is a penalty for calling archive functions through a virtual function dispatch table and there is no possibility for a compiler to inline archive functions. This will result in a detectable degradation in performance for saving and loading archives.

The main utility of polymorphic archives will be to permit the building of class DLLs that will include serialization code for all present and future archives with no redundant code.
