...one of the most highly regarded and expertly designed C++ library projects in the world.
-- Herb Sutter and Andrei Alexandrescu, C++ Coding Standards
Here are some useful techniques that you can use while wrapping code with Boost.Python.
A Python package is a collection of modules that together provide some functionality to the user. If you're not familiar with how to create packages, the Python Tutorial provides a good introduction.
But we are wrapping C++ code, using Boost.Python. How can we provide a nice package interface to our users? To better explain some concepts, let's work with an example.
We have a C++ library that works with sounds: reading and writing various formats, applying filters to the sound data, etc. It is named (conveniently) sounds. Our library already has a neat C++ namespace hierarchy, like so:

    sounds::core
    sounds::io
    sounds::filters
We would like to present this same hierarchy to Python users, allowing them to write code like this:

    import sounds.filters
    sounds.filters.echo(...)  # echo is a C++ function
The first step is to write the wrapping code. We have to export each module separately with Boost.Python, like this:

    /* file core.cpp */
    BOOST_PYTHON_MODULE(core)
    {
        /* export everything in the sounds::core namespace */
        ...
    }

    /* file io.cpp */
    BOOST_PYTHON_MODULE(io)
    {
        /* export everything in the sounds::io namespace */
        ...
    }

    /* file filters.cpp */
    BOOST_PYTHON_MODULE(filters)
    {
        /* export everything in the sounds::filters namespace */
        ...
    }
Compiling these files will generate the following Python extensions: core.pyd, io.pyd and filters.pyd.
Note | |
---|---|
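The extension .pyd is the one used for Python extension modules on Windows; on Unix-like systems the equivalent shared library typically ends in .so. |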
Now, we create this directory structure for our Python package:

    sounds/
        __init__.py
        core.pyd
        filters.pyd
        io.pyd
The file __init__.py is what tells Python that the directory sounds/ is actually a Python package. It can be an empty file, but it can also perform some magic, which will be shown later.
Now our package is ready. All users have to do is put the directory that contains sounds/ on their PYTHONPATH and fire up the interpreter:

    >>> import sounds.io
    >>> import sounds.filters
    >>> sound = sounds.io.open('file.mp3')
    >>> new_sound = sounds.filters.echo(sound, 1.0)
Nice heh?
This is the simplest way to create hierarchies of packages, but it is not very flexible. What if we want to add a pure Python function to the filters package, for instance, one that applies three filters to a sound object at once? Sure, you can do this in C++ and export it, but why not do so in Python? You don't have to recompile the extension modules, and the function will be easier to write.
If we want this flexibility, we will have to complicate our package hierarchy a little. First, we will have to change the name of the extension modules:

    /* file core.cpp */
    BOOST_PYTHON_MODULE(_core)
    {
        ...
        /* export everything in the sounds::core namespace */
    }
Note that we added an underscore to the module name. The filename will have to be changed to _core.pyd as well, and we do the same to the other extension modules. Now, we change our package hierarchy like so:

    sounds/
        __init__.py
        core/
            __init__.py
            _core.pyd
        filters/
            __init__.py
            _filters.pyd
        io/
            __init__.py
            _io.pyd
Note that we created a directory for each extension module, and added an __init__.py to each one. But if we leave it that way, the user will have to access the functions in the core module with this syntax:

    >>> import sounds.core._core
    >>> sounds.core._core.foo(...)
which is not what we want. This is where the __init__.py magic comes in: everything that is brought into the __init__.py namespace can be accessed directly by the user. So all we have to do is bring the entire namespace from _core.pyd into core/__init__.py, by adding this line of code to sounds/core/__init__.py (the leading dot makes the import relative to the package, which Python 3 requires):

    from ._core import *
We do the same for the other packages. Now the user accesses the functions and classes in the extension modules like before:

    >>> import sounds.filters
    >>> sounds.filters.echo(...)
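The re-export step can be tried without compiling anything. The following is a minimal sketch, assuming Python 3: it builds a throwaway sounds/core package in a temporary directory, where _core.py is a hypothetical pure-Python stand-in for the compiled _core extension module and foo is a placeholder function.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk. The file _core.py is a hypothetical
# pure-Python stand-in for the compiled _core extension module.
root = tempfile.mkdtemp()
core_dir = os.path.join(root, "sounds", "core")
os.makedirs(core_dir)

def write(path, text):
    with open(path, "w") as f:
        f.write(text)

write(os.path.join(root, "sounds", "__init__.py"), "")
write(os.path.join(core_dir, "_core.py"),
      "def foo():\n    return 'foo from _core'\n")
# The relative import pulls every public name of _core into sounds.core.
write(os.path.join(core_dir, "__init__.py"), "from ._core import *\n")

# Make the directory that contains sounds/ importable, then use the package.
sys.path.insert(0, root)
importlib.invalidate_caches()
import sounds.core

result = sounds.core.foo()
print(result)  # prints: foo from _core
```

The name sounds.core.foo and the name sounds.core._core.foo refer to the same function object, so the user-facing name is only an alias and adds no wrapper layer.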
with the additional benefit that we can easily add pure Python functions to any module, in a way that the user can't tell the difference between a C++ function and a Python function. Let's add a pure Python function, echo_noise, to the filters package. This function applies the echo and noise filters in sequence to the given sound object. We create a file named sounds/filters/echo_noise.py and code our function:

    from . import _filters

    def echo_noise(sound):
        s = _filters.echo(sound)
        s = _filters.noise(s)  # filter the echoed sound, not the original
        return s
Next, we add this line to sounds/filters/__init__.py:

    from .echo_noise import echo_noise
And that's it. The user now accesses this function like any other function from the filters package:

    >>> import sounds.filters
    >>> sounds.filters.echo_noise(...)
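The chaining inside echo_noise is easy to get wrong: each filter must receive the output of the previous one. Here is a minimal sketch of that pattern. The echo and noise functions below are hypothetical stand-ins for the wrapped C++ filters, and a "sound" is assumed to be a plain list of samples so that the example runs without the extension module.

```python
# Hypothetical stand-ins for the wrapped C++ filters; here a "sound" is
# just a list of samples so the example runs without the extension module.
def echo(sound):
    # Mix each sample with half of the previous one.
    return [s + 0.5 * prev for prev, s in zip([0.0] + sound[:-1], sound)]

def noise(sound):
    # Add a fixed offset instead of real noise to keep the result predictable.
    return [s + 0.25 for s in sound]

def echo_noise(sound):
    # Feed the output of one filter into the next one.
    s = echo(sound)
    s = noise(s)
    return s

filtered = echo_noise([1.0, 2.0, 4.0])
print(filtered)  # prints: [1.25, 2.75, 5.25]
```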
Thanks to Python's flexibility, you can easily add new methods to a class, even after it was already created:

    >>> class C(object): pass
    >>>
    >>> # a regular function
    >>> def C_str(self): return 'A C instance!'
    >>>
    >>> # now we turn it into a member function
    >>> C.__str__ = C_str
    >>>
    >>> c = C()
    >>> print(c)
    A C instance!
    >>> C_str(c)
    'A C instance!'
Yes, Python rox.
We can do the same with classes that were wrapped with Boost.Python. Suppose we have a class point in C++:

    class point {...};

    BOOST_PYTHON_MODULE(_geom)
    {
        class_<point>("point")...;
    }
If we are using the technique from the previous section, Creating Packages, we can code directly into geom/__init__.py:

    from ._geom import *

    # a regular function
    def point_str(self):
        return str((self.x, self.y))

    # now we turn it into a member function
    point.__str__ = point_str
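Because Python looks methods up on the class at call time, a method attached this way is also visible on instances that were created earlier. A minimal sketch, using a hypothetical pure-Python stand-in for the wrapped point class (its x and y attributes are assumptions for illustration):

```python
# Hypothetical pure-Python stand-in for the wrapped C++ class.
class point(object):
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

# This instance exists before the method is attached.
p = point(1, 2)

def point_str(self):
    return str((self.x, self.y))

# Attach the function to the class, not to the instance.
point.__str__ = point_str

text = str(p)
print(text)  # prints: (1, 2)
```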
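All point instances created from C++ will also have this member function! This technique has several advantages: the added functions cost no C++ compile time, have virtually zero memory footprint, and can be changed without recompiling the extension module.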
Another useful idea is to replace constructors with factory functions:

    _point = point

    def point(x=0, y=0):
        return _point(x, y)
In this simple case there is not much gained, but for constructors with many overloads and/or arguments this is often a great simplification, again with virtually zero memory footprint and zero compile-time overhead for the keyword support.
If you have ever exported a lot of classes, you know that it takes quite a long time to compile the Boost.Python wrappers. The compiler's memory consumption can also easily become too high. If this is causing you problems, you can split the class_ definitions into multiple files:

    /* file point.cpp */
    #include <point.h>
    #include <boost/python.hpp>
    using namespace boost::python;

    void export_point()
    {
        class_<point>("point")...;
    }

    /* file triangle.cpp */
    #include <triangle.h>
    #include <boost/python.hpp>
    using namespace boost::python;

    void export_triangle()
    {
        class_<triangle>("triangle")...;
    }
Now you create a file main.cpp, which contains the BOOST_PYTHON_MODULE macro and calls the various export functions inside it:

    /* file main.cpp */
    #include <boost/python.hpp>

    void export_point();
    void export_triangle();

    BOOST_PYTHON_MODULE(_geom)
    {
        export_point();
        export_triangle();
    }
Compiling and linking together all these files produces the same result as the usual approach:

    #include <boost/python.hpp>
    #include <point.h>
    #include <triangle.h>
    using namespace boost::python;

    BOOST_PYTHON_MODULE(_geom)
    {
        class_<point>("point")...;
        class_<triangle>("triangle")...;
    }
but the memory is kept under control.
This method is also recommended if you are developing the C++ library and exporting it to Python at the same time: a change in a class will only require recompiling a single .cpp file, instead of the entire wrapper code.
Note | |
---|---|
This method is useful too if you are getting the error message "fatal error C1204:Compiler limit:internal structure overflow" when compiling a large source file, as explained in the FAQ. |