Boost C++ Libraries

This is the documentation for an old version of Boost.

General Techniques

Creating Packages
Extending Wrapped Objects in Python
Reducing Compiling Time

This section presents some useful techniques that you can use while wrapping code with Boost.Python.

Creating Packages

A Python package is a collection of modules that provides certain functionality to the user. If you're not familiar with how to create packages, the Python Tutorial provides a good introduction to them.

But we are wrapping C++ code, using Boost.Python. How can we provide a nice package interface to our users? To better explain some concepts, let's work with an example.

We have a C++ library that works with sounds: reading and writing various formats, applying filters to the sound data, etc. It is named (conveniently) sounds. Our library already has a neat C++ namespace hierarchy, like so:

sounds::core
sounds::io
sounds::filters

We would like to present this same hierarchy to Python users, allowing them to write code like this:

import sounds.filters
sounds.filters.echo(...) # echo is a C++ function

The first step is to write the wrapping code. We have to export each module separately with Boost.Python, like this:

/* file core.cpp */
BOOST_PYTHON_MODULE(core)
{
    /* export everything in the sounds::core namespace */
    ...
}

/* file io.cpp */
BOOST_PYTHON_MODULE(io)
{
    /* export everything in the sounds::io namespace */
    ...
}

/* file filters.cpp */
BOOST_PYTHON_MODULE(filters)
{
    /* export everything in the sounds::filters namespace */
    ...
}

Compiling these files will generate the following Python extensions: core.pyd, io.pyd and filters.pyd.

[Note] Note

The extension .pyd is used for Python extension modules on Windows; they are just shared libraries under a different name. On Unix, use the system default, .so. Note that since Python 2.5, Windows no longer imports extension modules named .dll, so .pyd is required there.
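
Which suffixes are accepted depends on the platform and on the Python version. One way to check, sketched here with the standard importlib module (Python 3.3 or later is assumed), is to ask the running interpreter directly:

```python
import importlib.machinery

# Suffixes the running interpreter tries when it imports an extension module,
# for example '.pyd' on Windows or '.so' on Linux, often preceded by an ABI tag.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

for suffix in suffixes:
    print(suffix)
```

The compiled wrapper has to be named with one of the printed suffixes for the import to find it.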

Now, we create this directory structure for our Python package:

sounds/
    __init__.py
    core.pyd
    filters.pyd
    io.pyd

The file __init__.py is what tells Python that the directory sounds/ is actually a Python package. It can be an empty file, but it can also perform some magic, which will be shown later.

Now our package is ready. All users have to do is put the directory that contains sounds/ on their PYTHONPATH and fire up the interpreter:

>>> import sounds.io
>>> import sounds.filters
>>> sound = sounds.io.open('file.mp3')
>>> new_sound = sounds.filters.echo(sound, 1.0)

Nice heh?

This is the simplest way to create hierarchies of packages, but it is not very flexible. What if we want to add a pure Python function to the filters package, for instance, one that applies 3 filters to a sound object at once? Sure, you can do this in C++ and export it, but why not do so in Python? You don't have to recompile the extension modules, and the function will be easier to write.

If we want this flexibility, we will have to complicate our package hierarchy a little. First, we will have to change the name of the extension modules:

/* file core.cpp */
BOOST_PYTHON_MODULE(_core)
{
    ...
    /* export everything in the sounds::core namespace */
}

Note that we added an underscore to the module name. The filename will have to be changed to _core.pyd as well, and we do the same to the other extension modules. Now, we change our package hierarchy like so:

sounds/
    __init__.py
    core/
        __init__.py
        _core.pyd
    filters/
        __init__.py
        _filters.pyd
    io/
        __init__.py
        _io.pyd

Note that we created a directory for each extension module, and added an __init__.py to each one. But if we leave it that way, the user will have to access the functions in the core module with this syntax:

>>> import sounds.core._core
>>> sounds.core._core.foo(...)

which is not what we want. This is where the __init__.py magic comes in: every name that is brought into the __init__.py namespace can be accessed directly by the user. So all we have to do is bring the entire namespace of _core.pyd into core/__init__.py, by adding this line of code to sounds/core/__init__.py:

from ._core import *
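
The re-export can be tried out without compiling anything. The sketch below builds a throwaway package in a temporary directory; _core.py is a pure-Python placeholder standing in for the compiled _core extension, and foo is a hypothetical exported function. The import is written as an explicit relative import, which is the form Python 3 requires inside a package.

```python
import importlib
import os
import sys
import tempfile

# Throwaway package on disk. _core.py is a placeholder for the compiled
# extension module; foo is a hypothetical function it exports.
root = tempfile.mkdtemp()
core_dir = os.path.join(root, "sounds", "core")
os.makedirs(core_dir)

def write(path, text):
    with open(path, "w") as f:
        f.write(text)

write(os.path.join(root, "sounds", "__init__.py"), "")
write(os.path.join(core_dir, "_core.py"),
      "def foo():\n    return 'foo from _core'\n")
# Pull every public name of _core into the sounds.core namespace.
write(os.path.join(core_dir, "__init__.py"), "from ._core import *\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
import sounds.core

result = sounds.core.foo()  # no sounds.core._core prefix needed
print(result)
```

The function object reachable as sounds.core.foo is the very same one that lives in sounds.core._core; the package namespace only holds a second reference to it.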

We do the same for the other packages. Now the user accesses the functions and classes in the extension modules like before:

>>> import sounds.filters
>>> sounds.filters.echo(...)

with the additional benefit that we can easily add pure Python functions to any module, in a way that the user can't tell the difference between a C++ function and a Python function. Let's add a pure Python function, echo_noise, to the filters package. This function applies the echo and noise filters in sequence to the given sound object. We create a file named sounds/filters/echo_noise.py and code our function:

from . import _filters

def echo_noise(sound):
    # apply echo first, then apply noise to the echoed result
    s = _filters.echo(sound)
    s = _filters.noise(s)
    return s

Next, we add this line to sounds/filters/__init__.py:

from .echo_noise import echo_noise

And that's it. The user now accesses this function like any other function from the filters package:

>>> import sounds.filters
>>> sounds.filters.echo_noise(...)
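
The same pattern extends to any number of filters, such as the three-filter case mentioned earlier. This is a minimal sketch, assuming every filter takes a sound and returns a new one; apply_filters and the three placeholder filters are hypothetical names, not part of the sounds library, and a list of samples stands in for a sound object.

```python
def apply_filters(sound, *filters):
    """Apply each filter to the result of the previous one."""
    for f in filters:
        sound = f(sound)
    return sound

# Placeholder filters working on a list of samples. In the package they
# would be the wrapped C++ functions, for example _filters.echo.
def echo(samples):
    return samples + samples[-1:]      # repeat the last sample

def gain(samples):
    return [2 * s for s in samples]    # double every sample

def clip(samples):
    return [min(s, 5) for s in samples]  # cap every sample at 5

processed = apply_filters([1, 2, 3], echo, gain, clip)
print(processed)  # [2, 4, 5, 5]
```

Because the helper is pure Python, it can be added to sounds/filters/ and re-exported from __init__.py without recompiling any extension module.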

Extending Wrapped Objects in Python

Thanks to Python's flexibility, you can easily add new methods to a class, even after it has been created:

>>> class C(object): pass
>>>
>>> # a regular function
>>> def C_str(self): return 'A C instance!'
>>>
>>> # now we turn it into a member function
>>> C.__str__ = C_str
>>>
>>> c = C()
>>> print(c)
A C instance!
>>> C_str(c)
'A C instance!'

Yes, Python rox.

We can do the same with classes that were wrapped with Boost.Python. Suppose we have a class point in C++:

class point {...};

BOOST_PYTHON_MODULE(_geom)
{
    class_<point>("point")...;
}

If we are using the technique from the previous section, Creating Packages, we can code directly into geom/__init__.py:

from ._geom import *

# a regular function
def point_str(self):
    return str((self.x, self.y))

# now we turn it into a member function
point.__str__ = point_str

All point instances created from C++ will also have this member function! This technique has several advantages:

  • Cut down compile times to zero for these additional functions
  • Reduce the memory footprint to virtually zero
  • Minimize the need to recompile
  • Rapid prototyping (you can move the code to C++ if required without changing the interface)
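
The lookup behaviour behind the claim about existing instances can be checked without Boost.Python. In this sketch a plain Python class stands in for the wrapped point, and the instance is created before the method is attached, much as an object handed over from C++ would be:

```python
class point(object):
    """Plain-Python stand-in for the wrapped C++ class."""
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

# Instance that exists before the method is attached.
p = point(1, 2)

def point_str(self):
    return str((self.x, self.y))

# Methods are looked up on the class at call time, so p sees the new one.
point.__str__ = point_str

text = str(p)
print(text)  # (1, 2)
```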

You can even add a little syntactic sugar with the use of metaclasses. Let's create a special metaclass that "injects" methods into other classes.

# The metaclass Boost.Python uses for all wrapped classes.
# Any class exported by Boost.Python can be used here instead of "point"
BoostPythonMetaclass = point.__class__

class injector(object):
    class __metaclass__(BoostPythonMetaclass):
        def __init__(self, name, bases, dict):
            for b in bases:
                if type(b) not in (self, type):
                    for k,v in dict.items():
                        setattr(b,k,v)
            return type.__init__(self, name, bases, dict)

# inject some methods into the point class
class more_point(injector, point):
    def __repr__(self):
        return 'Point(x=%s, y=%s)' % (self.x, self.y)
    def foo(self):
        print('foo!')

Now let's see the result:

>>> print(point())
Point(x=10, y=10)
>>> point().foo()
foo!
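
The nested __metaclass__ class above is Python 2 syntax. Under Python 3 the metaclass is passed with the metaclass keyword instead. The following is a sketch of an equivalent injector, not the library's own code: WrappedMeta and point are plain-Python stand-ins for Boost.Python's metaclass and a wrapped class, InjectorMeta is a hypothetical name, and the check for which bases to skip is simplified to "anything that is itself an injector".

```python
class WrappedMeta(type):
    """Stand-in for the metaclass Boost.Python gives every wrapped class."""

class point(metaclass=WrappedMeta):
    """Stand-in for the wrapped C++ class."""
    def __init__(self, x=10, y=10):
        self.x = x
        self.y = y

# Names the class statement adds on its own; these must not be copied over.
_SKIP = {"__module__", "__qualname__", "__doc__", "__classcell__",
         "__firstlineno__", "__static_attributes__"}

class InjectorMeta(type(point)):
    def __init__(cls, name, bases, namespace):
        for base in bases:
            # Inject into every base except the injector helper itself.
            if not isinstance(base, InjectorMeta):
                for key, value in namespace.items():
                    if key not in _SKIP:
                        setattr(base, key, value)
        super().__init__(name, bases, namespace)

class injector(metaclass=InjectorMeta):
    pass

# Defining this class copies its methods into point.
class more_point(injector, point):
    def __repr__(self):
        return 'Point(x=%s, y=%s)' % (self.x, self.y)
    def foo(self):
        return 'foo!'

print(point())        # Point(x=10, y=10)
print(point().foo())  # foo!
```

Deriving InjectorMeta from type(point) keeps the metaclasses of injector and point compatible, which is what allows more_point to list both as bases.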

Another useful idea is to replace constructors with factory functions:

_point = point

def point(x=0, y=0):
    return _point(x, y)

In this simple case there is not much gained, but for constructors with many overloads and/or arguments this is often a great simplification, again with virtually zero memory footprint and zero compile-time overhead for the keyword support.

Reducing Compiling Time

If you have ever exported a lot of classes, you know that compiling the Boost.Python wrappers takes quite a long time, and the compiler's memory consumption can easily become too high. If this is causing you problems, you can split the class_ definitions into multiple files:

/* file point.cpp */
#include <point.h>
#include <boost/python.hpp>

using namespace boost::python;

void export_point()
{
    class_<point>("point")...;
}

/* file triangle.cpp */
#include <triangle.h>
#include <boost/python.hpp>

using namespace boost::python;

void export_triangle()
{
    class_<triangle>("triangle")...;
}

Now you create a file main.cpp, which contains the BOOST_PYTHON_MODULE macro, and call the various export functions inside it.

/* file main.cpp */
#include <boost/python.hpp>

/* defined in point.cpp and triangle.cpp */
void export_point();
void export_triangle();

BOOST_PYTHON_MODULE(_geom)
{
    export_point();
    export_triangle();
}

Compiling and linking together all these files produces the same result as the usual approach:

#include <boost/python.hpp>
#include <point.h>
#include <triangle.h>

using namespace boost::python;

BOOST_PYTHON_MODULE(_geom)
{
    class_<point>("point")...;
    class_<triangle>("triangle")...;
}

but the compiler's memory use is kept under control.

This method is also recommended if you are developing the C++ library and exporting it to Python at the same time: a change in a class will only require recompiling a single .cpp file, instead of the entire wrapper code.

[Note] Note

This method is useful too if you are getting the error message "fatal error C1204:Compiler limit:internal structure overflow" when compiling a large source file, as explained in the FAQ.

