Actors may be combined in a multitude of ways to form composites. Composites are actors that are composed of zero or more actors. Composition is hierarchical. An element of the composite can be a primitive or again another composite. The flexibility to arbitrarily compose hierarchical structures allows us to form intricate constructions that model complex functions, statements and expressions.
A composite is-a tuple of 0..N actors, where N is the predefined maximum number of actors a composite can take.
Note | |
---|---|
You can set |
As mentioned, each of the actors A0..AN can, in turn, be another composite, since a composite is itself an actor. This makes the composite a recursive structure. The actual evaluation is handled by a composite-specific eval policy.
#include <boost/spirit/home/phoenix/function/function.hpp>
The function class template provides a mechanism for implementing lazily evaluated functions. Syntactically, a lazy function looks like an ordinary C/C++ function. The function call looks familiar and feels the same as ordinary C++ functions. However, unlike ordinary functions, the actual function execution is deferred.
Unlike ordinary function pointers or functor objects that need to be explicitly bound through the bind function (see Bind), the argument types of these functions are automatically lazily bound.
In order to create a lazy function, we need to implement a model of the FunctionEval concept. For a function that takes N arguments, a model of FunctionEval must provide:
- An operator() that takes N arguments and implements the function logic.
- A nested template result<A1, ... AN> that takes the types of the N arguments to the function and returns the result type of the function. (There is a special case for function objects that accept no arguments. Such nullary functors are only required to define a typedef result_type that reflects the return type of its operator().)
For example, the following type implements the FunctionEval concept, in order to provide a lazy factorial function:
struct factorial_impl
{
    template <typename Arg>
    struct result
    {
        typedef Arg type;
    };

    template <typename Arg>
    Arg operator()(Arg n) const
    {
        return (n <= 0) ? 1 : n * this->operator()(n-1);
    }
};
(See factorial.cpp)
Having implemented the factorial_impl type, we can declare and instantiate a lazy factorial function this way:
function<factorial_impl> factorial;
Invoking a lazy function such as factorial does not immediately execute the function object factorial_impl. Instead, an actor object is created and returned to the caller. Example:
factorial(arg1)
does nothing more than return an actor. A second function call will invoke the actual factorial function. Example:
int i = 4;
cout << factorial(arg1)(i);
will print out "24".
Take note that in certain cases (e.g. for function objects with state), an instance of the model of FunctionEval may be passed on to the constructor. Example:
function<factorial_impl> factorial(ftor);
where ftor is an instance of factorial_impl (this is not necessary in this case as factorial_impl does not require any state).
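The two-step behavior described above (first call builds a callable, second call runs it) can be imitated in plain standard C++. The following is only a sketch of the semantics, not Phoenix's implementation: factorial_calls, factorial_eager and lazy_factorial are hypothetical names, and std::function stands in for the actor.

```cpp
#include <functional>

// Counts how often the eager logic actually runs (illustration only).
int factorial_calls = 0;

// Same logic as factorial_impl::operator() above, restricted to int.
int factorial_eager(int n) {
    ++factorial_calls;
    return (n <= 0) ? 1 : n * factorial_eager(n - 1);
}

// Hypothetical stand-in for the expression factorial(arg1):
// it only packages the work into a callable and runs nothing yet.
std::function<int(int)> lazy_factorial() {
    return [](int n) { return factorial_eager(n); };
}
```

Calling lazy_factorial() leaves factorial_calls at zero; only invoking the returned object with an argument runs the recursion.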
This facility provides a mechanism for lazily evaluating operators. Syntactically, a lazy operator looks and feels like an ordinary C/C++ infix, prefix or postfix operator. The operator application looks the same. However, unlike ordinary operators, the actual operator execution is deferred. Samples:
arg1 + arg2
1 + arg1 * arg2
1 / -arg1
arg1 < 150
We have seen the lazy operators in action (see Quick Start). Let's go back and examine them a little bit further:
find_if(c.begin(), c.end(), arg1 % 2 == 1)
Through operator overloading, the expression arg1 % 2 == 1 actually generates an actor. This actor object is passed on to STL's find_if function. From the viewpoint of STL, the composite is simply a function object expecting a single argument of the container's value_type. For each element in c, the element is passed on as an argument arg1 to the actor (function object). The actor checks if this is an odd value based on the expression arg1 % 2 == 1 where arg1 is replaced by the container's element.
Like lazy functions (see function), lazy operators are not immediately executed when invoked. Instead, an actor (see actors) object is created and returned to the caller. Example:
(arg1 + arg2) * arg3
does nothing more than return an actor. A second function call will evaluate the actual operators. Example:
int i = 4, j = 5, k = 6;
cout << ((arg1 + arg2) * arg3)(i, j, k);
will print out "54".
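How can an ordinary-looking expression return a callable instead of a value? Below is a minimal sketch of the idea using plain operator overloading, limited to three int arguments. It is not Phoenix's implementation: the namespace mini and the types Arg, Add and Mul are hypothetical names invented for this illustration.

```cpp
namespace mini {

// Placeholder: when evaluated, picks one of the three call arguments.
struct Arg {
    int index;
    int operator()(int a, int b, int c) const {
        return index == 0 ? a : (index == 1 ? b : c);
    }
};

// Composite nodes: they store their operands and evaluate them only when called.
template <typename L, typename R>
struct Add {
    L lhs;
    R rhs;
    int operator()(int a, int b, int c) const { return lhs(a, b, c) + rhs(a, b, c); }
};

template <typename L, typename R>
struct Mul {
    L lhs;
    R rhs;
    int operator()(int a, int b, int c) const { return lhs(a, b, c) * rhs(a, b, c); }
};

// The overloaded operators compute nothing; they just build a composite.
template <typename L, typename R>
Add<L, R> operator+(L lhs, R rhs) { return Add<L, R>{lhs, rhs}; }

template <typename L, typename R>
Mul<L, R> operator*(L lhs, R rhs) { return Mul<L, R>{lhs, rhs}; }

const Arg arg1 = {0};
const Arg arg2 = {1};
const Arg arg3 = {2};

} // namespace mini
```

With these definitions, (mini::arg1 + mini::arg2) * mini::arg3 is an object of type Mul<Add<Arg, Arg>, Arg>; calling it with (4, 5, 6) evaluates (4 + 5) * 6.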
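Operator expressions are lazily evaluated following four simple rules:
1. A binary operator, except ->*, will be lazily evaluated when at least one of its operands is an actor object (see actors).
2. Unary operators are lazily evaluated if their argument is an actor object.
3. Operator ->* is lazily evaluated if the left hand argument is an actor object.
4. The result of a lazy operator is an actor object that can in turn allow the application of rules 1, 2 and 3.
For example, to check that the following expression is lazily evaluated:
-(arg1 + 3 + 6)
1. Following rule 1, arg1 + 3 is lazily evaluated since arg1 is an actor (see primitives).
2. The result of this arg1 + 3 expression is an actor object, following rule 4.
3. Continuing, arg1 + 3 + 6 is again lazily evaluated. Rule 1.
4. By rule 4 again, the result of arg1 + 3 + 6 is an actor object.
5. As arg1 + 3 + 6 is an actor, -(arg1 + 3 + 6) is lazily evaluated. Rule 2.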
Lazy-operator application is highly contagious. In most cases, a single argN actor infects all its immediate neighbors within a group (first level or parenthesized expression).
Note that at least one operand of any operator must be a valid actor for lazy evaluation to take effect. To force lazy evaluation of an ordinary expression, we can use ref(x), val(x) or cref(x) to transform an operand into a valid actor object (see primitives). For example:
1 << 3; // Immediately evaluated
val(1) << 3; // Lazily evaluated
prefix: ~, !, -, +, ++, --, & (reference), * (dereference)
postfix: ++, --
=, [], +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=
+, -, *, /, %, &, |, ^, <<, >>
==, !=, <, >, <=, >=
&&, ||, ->*
if_else(c, a, b)
The ternary operator deserves special mention. Since C++ does not allow us to overload the conditional expression: c ? a : b, the if_else pseudo function is provided for this purpose. The behavior is identical, albeit in a lazy manner.
a->*member_object_pointer
a->*member_function_pointer
The left hand side of the member pointer operator must be an actor returning a pointer type. The right hand side of the member pointer operator may be either a pointer to member object or pointer to member function.
If the right hand side is a member object pointer, the result is an actor which, when evaluated, returns a reference to that member. For example:
struct A
{
    int member;
};

A* a = new A;
...

(arg1->*&A::member)(a); // returns member a->member
If the right hand side is a member function pointer, the result is an actor which, when invoked, calls the specified member function. For example:
struct A
{
    int func(int);
};

A* a = new A;
int i = 0;

(arg1->*&A::func)(arg2)(a, i); // returns a->func(i)
Table 1.4. Include Files
Operators |
File |
---|---|
Lazy statements...
The primitives and composite building blocks presented so far are sufficiently powerful to construct quite elaborate structures. We have presented lazy-functions and lazy-operators. How about lazy-statements? First, an appetizer:
Print all odd-numbered contents of an STL container using std::for_each (all_odds.cpp):
for_each(c.begin(), c.end(),
    if_(arg1 % 2 == 1)
    [
        cout << arg1 << ' '
    ]
);
Huh? Is that valid C++? Read on...
Yes, it is valid C++. The sample code above is as close as you can get to the syntax of C++. This stylized C++ syntax differs from actual C++ code. First, the if has a trailing underscore. Second, the block uses square brackets instead of the familiar curly braces {}.
Note | |
---|---|
C++ in C++? In as much as Spirit attempts to mimic EBNF in C++, Phoenix attempts to mimic C++ in C++!!! |
Here are more examples with annotations. The code almost speaks for itself.
#include <boost/spirit/home/phoenix/statement/sequence.hpp>
Syntax:
statement, statement, .... statement
Basically, these are comma separated statements. Take note that unlike the C/C++ semicolon, the comma is a separator put in-between statements. This is like Pascal's semicolon separator, rather than C/C++'s semicolon terminator. For example:
statement,
statement,
statement, // ERROR!
This is an error: the last statement should not be followed by a comma. Block statements can be grouped using parentheses. Again, the last statement in a group should not have a trailing comma.
statement,
statement,
(
    statement,
    statement
),
statement
Outside the square brackets, block statements should be grouped. For example:
for_each(c.begin(), c.end(),
    (
        do_this(arg1),
        do_that(arg1)
    )
);
Wrapping a comma operator chain in a pair of parentheses prevents the commas from being interpreted as argument separators. The square bracket operator is the exception because it always takes exactly one argument, so it "transforms" any attempt at multiple arguments into a comma operator chain (and spits out an error for zero arguments).
#include <boost/spirit/home/phoenix/statement/if.hpp>
We have seen the if_ statement. The syntax is:
if_(conditional_expression)
[
    sequenced_statements
]
#include <boost/spirit/home/phoenix/statement/if.hpp>
The syntax is:
if_(conditional_expression)
[
    sequenced_statements
]
.else_
[
    sequenced_statements
]
Take note that else has a leading dot and a trailing underscore: .else_
Example: This code prints out all the elements and appends " > 5", " == 5" or " < 5" depending on the element's actual value:
for_each(c.begin(), c.end(),
    if_(arg1 > 5)
    [
        cout << arg1 << " > 5\n"
    ]
    .else_
    [
        if_(arg1 == 5)
        [
            cout << arg1 << " == 5\n"
        ]
        .else_
        [
            cout << arg1 << " < 5\n"
        ]
    ]
);
Notice how the if_else_ statement is nested.
#include <boost/spirit/home/phoenix/statement/switch.hpp>
The syntax is:
switch_(integral_expression)
[
    case_<integral_value>(sequenced_statements),
    ...
    default_(sequenced_statements)
]
A comma separated list of cases and an optional default can be provided. Note that, unlike a normal switch statement, cases do not fall through.
Example: This code prints out "one", "two" or "other value" depending on the element's actual value:
for_each(c.begin(), c.end(),
    switch_(arg1)
    [
        case_<1>(cout << val("one") << '\n'),
        case_<2>(cout << val("two") << '\n'),
        default_(cout << val("other value") << '\n')
    ]
);
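Because lazy cases do not fall through, the eager equivalent of the example above needs an explicit break after every case. The sketch below is plain C++, not Phoenix code; describe is a hypothetical helper that returns the text instead of printing it so that the result can be inspected.

```cpp
#include <string>

// Eager counterpart of the switch_ example for a single element.
std::string describe(int x) {
    std::string out;
    switch (x) {
        case 1:
            out = "one\n";
            break;              // switch_ needs no break: cases never fall through
        case 2:
            out = "two\n";
            break;
        default:
            out = "other value\n";
            break;
    }
    return out;
}
```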
#include <boost/spirit/home/phoenix/statement/while.hpp>
The syntax is:
while_(conditional_expression)
[
    sequenced_statements
]
Example: This code decrements each element until it reaches zero and prints out the number at each step. A newline terminates the printout of each value.
for_each(c.begin(), c.end(),
    (
        while_(arg1--)
        [
            cout << arg1 << ", "
        ],
        cout << val("\n")
    )
);
#include <boost/spirit/home/phoenix/statement/do_while.hpp>
The syntax is:
do_
[
    sequenced_statements
]
.while_(conditional_expression)
Again, take note that while has a leading dot and a trailing underscore: .while_
Example: This code is almost the same as the previous example, with a slight twist in logic.
for_each(c.begin(), c.end(),
    (
        do_
        [
            cout << arg1 << ", "
        ]
        .while_(arg1--),
        cout << val("\n")
    )
);
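The "slight twist" is easiest to see by tracing both loops eagerly for one element. The sketch below is plain C++ rather than Phoenix code; while_trace and do_while_trace are hypothetical helpers that collect the output in a string, and they assume a non-negative starting value.

```cpp
#include <sstream>
#include <string>

// Eager counterpart of: while_(arg1--)[ cout << arg1 << ", " ]
// The test runs first, so the value printed is already decremented.
std::string while_trace(int x) {
    std::ostringstream out;
    while (x--) {
        out << x << ", ";
    }
    return out.str();
}

// Eager counterpart of: do_[ cout << arg1 << ", " ].while_(arg1--)
// The body runs first, so the original value is printed as well.
std::string do_while_trace(int x) {
    std::ostringstream out;
    do {
        out << x << ", ";
    } while (x--);
    return out.str();
}
```

For an element equal to 3, the first form yields "2, 1, 0, " while the second yields "3, 2, 1, 0, ".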
#include <boost/spirit/home/phoenix/statement/for.hpp>
The syntax is:
for_(init_statement, conditional_expression, step_statement)
[
    sequenced_statements
]
It is again very similar to the C++ for statement. Take note that the init_statement, conditional_expression and step_statement are separated by commas instead of semicolons and each must be present (i.e. for_(,,) is invalid). This is a case where the nothing actor can be useful.
Example: This code prints each element N times where N is the element's value. A newline terminates the printout of each value.
int iii;
for_each(c.begin(), c.end(),
    (
        for_(ref(iii) = 0, ref(iii) < arg1, ++ref(iii))
        [
            cout << arg1 << ", "
        ],
        cout << val("\n")
    )
);
As before, all these are lazily evaluated. The results of such statements are in fact composites that are passed on to STL's for_each function. From the viewpoint of for_each, what was passed is just a functor, no more, no less.
Note | |
---|---|
Unlike lazy functions and lazy operators, lazy statements always return void. |
#include <boost/spirit/home/phoenix/statement/try_catch.hpp>
The syntax is:
try_
[
    sequenced_statements
]
.catch_<exception_type>()
[
    sequenced_statements
]
...
.catch_all
[
    sequenced_statement
]
Note the usual underscore after try and catch, and the extra parentheses required after the catch.
Example: The following code calls the (lazy) function f for each element, and prints messages about different exception types it catches.
try_
[
    f(arg1)
]
.catch_<runtime_error>()
[
    cout << val("caught runtime error or derived\n")
]
.catch_<exception>()
[
    cout << val("caught exception or derived\n")
]
.catch_all
[
    cout << val("caught some other type of exception\n")
]
#include <boost/spirit/home/phoenix/statement/throw.hpp>
As a natural companion to the try/catch support, the statement module provides lazy throwing and rethrowing of exceptions.
The syntax to throw an exception is:
throw_(exception_expression)
The syntax to rethrow an exception is:
throw_()
Example: This code extends the try/catch example, rethrowing exceptions derived from runtime_error or exception, and translating other exception types to runtime_errors.
try_
[
    f(arg1)
]
.catch_<runtime_error>()
[
    cout << val("caught runtime error or derived\n"),
    throw_()
]
.catch_<exception>()
[
    cout << val("caught exception or derived\n"),
    throw_()
]
.catch_all
[
    cout << val("caught some other type of exception\n"),
    throw_(runtime_error("translated exception"))
]
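For comparison, here is what the lazy statement above corresponds to when written eagerly. This is a plain C++ sketch, not Phoenix code: guarded_call is a hypothetical helper, and it appends to a string instead of writing to cout so the messages can be checked.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Eager counterpart of the try_/catch_/throw_ example:
// log the category, then rethrow or translate the exception.
void guarded_call(const std::function<void()>& f, std::string& log) {
    try {
        f();
    } catch (const std::runtime_error&) {
        log += "caught runtime error or derived\n";
        throw;                                          // counterpart of throw_()
    } catch (const std::exception&) {
        log += "caught exception or derived\n";
        throw;                                          // counterpart of throw_()
    } catch (...) {
        log += "caught some other type of exception\n";
        throw std::runtime_error("translated exception");  // counterpart of throw_(expr)
    }
}
```

A caller that passes a function throwing, say, an int sees a std::runtime_error carrying the message "translated exception", while a std::runtime_error thrown by f propagates unchanged.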
The Object module deals with object construction, destruction and conversion. The module provides "lazy" versions of C++'s object constructor, new, delete, static_cast, dynamic_cast, const_cast and reinterpret_cast.
Lazy constructors...
#include <boost/spirit/home/phoenix/object/construct.hpp>
Lazily construct an object from an arbitrary set of arguments:
construct<T>(ctor_arg1, ctor_arg2, ..., ctor_argN);
where the given parameters are the parameters to the constructor of the object of type T. (This implies that type T is expected to have a constructor with a corresponding set of parameter types.)
Example:
construct<std::string>(arg1, arg2)
Constructs a std::string from arg1 and arg2.
Note | |
---|---|
The maximum number of actual parameters is limited by the preprocessor
constant PHOENIX_COMPOSITE_LIMIT. Note though, that this limit should not
be greater than PHOENIX_LIMIT. By default, |
Lazy new...
#include <boost/spirit/home/phoenix/object/new.hpp>
Lazily construct an object, on the heap, from an arbitrary set of arguments:
new_<T>(ctor_arg1, ctor_arg2, ..., ctor_argN);
where the given parameters are the parameters to the constructor of the object of type T. (This implies that type T is expected to have a constructor with a corresponding set of parameter types.)
Example:
new_<std::string>(arg1, arg2) // note the spelling of new_ (with trailing underscore)
Creates a std::string from arg1 and arg2 on the heap.
Note | |
---|---|
Again, the maximum number of actual parameters is limited by the preprocessor constant PHOENIX_COMPOSITE_LIMIT. See the note above. |
Lazy delete...
#include <boost/spirit/home/phoenix/object/delete.hpp>
Lazily delete an object, from the heap:
delete_(arg);
where arg is assumed to be a pointer to an object.
Example:
delete_(arg1) // note the spelling of delete_ (with trailing underscore)
Lazy casts...
#include <boost/spirit/home/phoenix/object/static_cast.hpp>
#include <boost/spirit/home/phoenix/object/dynamic_cast.hpp>
#include <boost/spirit/home/phoenix/object/const_cast.hpp>
#include <boost/spirit/home/phoenix/object/reinterpret_cast.hpp>
The set of lazy C++ cast template functions provides a way of lazily casting an object of a certain type to another type. The syntax resembles the well known C++ casts. Take note however that the lazy versions have a trailing underscore.
static_cast_<T>(lambda_expression)
dynamic_cast_<T>(lambda_expression)
const_cast_<T>(lambda_expression)
reinterpret_cast_<T>(lambda_expression)
Example:
static_cast_<Base*>(&arg1)
Static-casts the address of arg1 to a Base*.
Up until now, the most basic ingredient is missing: creation of and access to local variables on the stack. When recursion comes into play, you will soon realize the need to have true local variables. It may seem that we do not need this at all since an unnamed lambda function cannot call itself anyway; at least not directly. With some sort of arrangement, situations will arise where a lambda function becomes recursive. A typical situation occurs when we store a lambda function in a Boost.Function, essentially naming the unnamed lambda.
There will also be situations where a lambda function gets passed as an argument to another function. This is a more common situation. In this case, the lambda function assumes a new scope; new arguments and possibly new local variables.
This section deals with local variables and nested lambda scopes.
#include <boost/spirit/home/phoenix/scope/local_variable.hpp>
We use an instance of:
actor<local_variable<Key> >
to represent a local variable. The local variable acts as an imaginary data-bin where local, stack-based data will be placed. Key is an arbitrary type that is used to identify the local variable. Example:
struct size_key;
actor<local_variable<size_key> > size;
There are a few predefined instances of actor<local_variable<Key> > named _a.._z that you can already use. To make use of them, simply use the namespace boost::phoenix::local_names:
using namespace boost::phoenix::local_names;
#include <boost/spirit/home/phoenix/scope/let.hpp>
You declare local variables using the syntax:
let(local-declarations)
[
    let-body
]
let allows 1..N local variable declarations (where N == PHOENIX_LOCAL_LIMIT). Each declaration follows the form:
local-id = lambda-expression
Note | |
---|---|
You can set |
Example:
let(_a = 123, _b = 456)
[
    _a + _b
]
The type of the local variable assumes the type of the lambda-expression. Type deduction is reference preserving. For example:
let(_a = arg1, _b = 456)
_a assumes the type of arg1: a reference to an argument, while _b has type int.
Consider this:
int i = 1;
let(_a = arg1)
[
    cout << --_a << ' '
]
(i);
cout << i << endl;
The output of the above is: 0 0
While with this:
int i = 1;
let(_a = val(arg1))
[
    cout << --_a << ' '
]
(i);
cout << i << endl;
the output is: 0 1
Reference preservation is necessary because we need to have L-value access to outer lambda-scopes (especially the arguments). args and refs are L-values. vals are R-values.
The scope and lifetime of the local variables are limited to the let-body. let blocks can be nested.
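The two outputs shown for the let examples (0 0 versus 0 1) come down to reference versus copy semantics. The sketch below reproduces them in plain C++, without Phoenix; decrement_via_reference and decrement_via_copy are hypothetical helpers that return the pair (value printed inside the body, value of i afterwards).

```cpp
#include <utility>

// Mirrors let(_a = arg1)[ --_a ]: the local is a reference to the argument.
std::pair<int, int> decrement_via_reference(int i) {
    int& a = i;            // reference preserved: a aliases i
    int printed = --a;
    return std::make_pair(printed, i);
}

// Mirrors let(_a = val(arg1))[ --_a ]: the local is an independent copy.
std::pair<int, int> decrement_via_copy(int i) {
    int a = i;             // val(...) forces a copy, so i is untouched
    int printed = --a;
    return std::make_pair(printed, i);
}
```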
A local variable may hide an outer local variable. For example:
let(_x = 1, _y = ", World")
[
    // _x here is an int: 1

    let(_x = "Hello") // hides the outer _x
    [
        cout << _x << _y // prints "Hello, World"
    ]
]
The RHS (right hand side lambda-expression) of each local-declaration cannot refer to any LHS local-id. At this point, the local-ids are not in scope yet; they will only be in scope in the let-body. The code below is in error:
let(
    _a = 1
  , _b = _a // Error: _a is not in scope yet
)
[
    // _a and _b's scope starts here
    /*. body .*/
]
However, if an outer let scope is available, this will be searched. Since the scope of the RHS of a local-declaration is the outer scope enclosing the let, the RHS of a local-declaration can refer to a local variable of an outer scope:
let(_a = 1)
[
    let(
        _a = 1
      , _b = _a // Ok. _a refers to the outer _a
    )
    [
        /*. body .*/
    ]
]
#include <boost/spirit/home/phoenix/scope/lambda.hpp>
A lot of times, you'd want to write a lazy function that accepts one or more functions (higher order functions). STL algorithms come to mind, for example. Consider a lazy version of std::for_each:
struct for_each_impl
{
    template <typename C, typename F>
    struct result
    {
        typedef void type;
    };

    template <typename C, typename F>
    void operator()(C& c, F f) const
    {
        std::for_each(c.begin(), c.end(), f);
    }
};

function<for_each_impl> const for_each = for_each_impl();
Notice that the function accepts another function, f, as an argument. The scope of this function, f, is limited to the operator(). When f is called inside std::for_each, it exists in a new scope, along with new arguments and, possibly, local variables. This new scope is not at all related to the outer scopes beyond the operator().
Simple syntax:
lambda
[
    lambda-body
]
Like let, local variables may be declared, allowing 1..N local variable declarations (where N == PHOENIX_LOCAL_LIMIT):
lambda(local-declarations)
[
    lambda-body
]
The same restrictions apply with regard to scope and visibility. The RHS (right hand side lambda-expression) of each local-declaration cannot refer to any LHS local-id. The local-ids are not in scope yet; they will be in scope only in the lambda-body:
lambda(
    _a = 1
  , _b = _a // Error: _a is not in scope yet
)
See let Visibility above for more information.
Example: Using our lazy for_each, let's print all the elements in a container:
for_each(arg1, lambda[cout << arg1])
As far as the arguments are concerned (arg1..argN), the scope in which the lambda-body exists is totally new. The left arg1 refers to the argument passed to for_each (a container). The right arg1 refers to the argument passed by std::for_each when we finally get to call operator() in our for_each_impl above (a container element).
Yet, we may wish to get information from outer scopes. While we do not have access to arguments in outer scopes, we still have access to local variables from outer scopes. Argument-related information from outer lambda scopes can therefore only be passed in through local variables.
Note | |
---|---|
This is a crucial difference between |
Another example: Using our lazy for_each, and a lazy push_back:
struct push_back_impl
{
    template <typename C, typename T>
    struct result
    {
        typedef void type;
    };

    template <typename C, typename T>
    void operator()(C& c, T& x) const
    {
        c.push_back(x);
    }
};

function<push_back_impl> const push_back = push_back_impl();
write a lambda expression that accepts:
1. a container of containers (e.g. vector<vector<int> >)
2. an element (e.g. int)
and pushes-back the element to each of the vector<int>.
Solution:
for_each(arg1,
    lambda(_a = arg2)
    [
        push_back(arg1, _a)
    ]
)
Since we do not have access to the arguments of the outer scopes beyond the lambda-body, we introduce a local variable _a that captures the second outer argument: arg2. Hence: _a = arg2. This local variable is visible inside the lambda scope.
(See lambda.cpp)
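Readers who know C++11 lambdas may find it helpful to compare the solution with its closest standard-library counterpart. This is only an analogy, not Phoenix code: push_back_to_each is a hypothetical helper, and the capture [x] plays the role that the local declaration _a = arg2 plays in the lazy version.

```cpp
#include <algorithm>
#include <vector>

// Push x onto every inner vector of rows.
void push_back_to_each(std::vector<std::vector<int> >& rows, int x) {
    std::for_each(rows.begin(), rows.end(),
        // The inner callable gets its own argument (one row); the outer
        // argument x is visible only because it was captured explicitly.
        [x](std::vector<int>& row) { row.push_back(x); });
}
```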
Binding is the act of tying together a function to some arguments for deferred (lazy) evaluation. Named lazy functions require a bit of typing. Unlike (unnamed) lambda expressions, we need to write a functor somewhere off-line, detached from the call site. If you wish to transform a plain function, member function or member variable to a lambda expression, bind is your friend.
Note | |
---|---|
Take note that binders are monomorphic. Rather than binding functions, the preferred way is to write true generic and polymorphic lazy-functions. However, since most of the time we are dealing with adaptation of existing code, binders get the job done faster. |
There is a set of overloaded bind template functions. Each bind(x) function generates a suitable binder object, a composite.
#include <boost/spirit/home/phoenix/bind/bind_function.hpp>
Example, given a function foo:
void foo(int n)
{
    std::cout << n << std::endl;
}
Here's how the function foo may be bound:
bind(&foo, arg1)
This is now a full-fledged composite that can finally be evaluated by another function call invocation. A second function call will invoke the actual foo function. Example:
int i = 4;
bind(&foo, arg1)(i);
will print out "4".
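The standard library's std::bind follows the same bind-now, call-later shape, which makes it a convenient way to try the idea without Phoenix. The sketch below is an analogy only: sink and make_bound_foo are hypothetical names, and foo writes to a string stream instead of std::cout so that the effect can be observed.

```cpp
#include <functional>
#include <sstream>
#include <string>

std::ostringstream sink;   // hypothetical stand-in for std::cout

void foo(int n) { sink << n << '\n'; }

// Step 1: bind only. foo is not called here.
std::function<void(int)> make_bound_foo() {
    return std::bind(&foo, std::placeholders::_1);
}
```

Nothing is written to sink until the object returned by make_bound_foo() is itself called with an argument.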
#include <boost/spirit/home/phoenix/bind/bind_member_function.hpp>
Binding member functions can be done similarly. A bound member function takes in a pointer or reference to an object as the first argument. For instance, given:
struct xyz
{
    void foo(int) const;
};
xyz's foo member function can be bound as:
bind(&xyz::foo, obj, arg1) // obj is an xyz object
Take note that a lazy member function expects the first argument to be a pointer or reference to an object. Both the object (reference or pointer) and the arguments can be lazily bound. Examples:
xyz obj;
bind(&xyz::foo, arg1, arg2) // arg1.foo(arg2)
bind(&xyz::foo, obj, arg1) // obj.foo(arg1)
bind(&xyz::foo, obj, 100) // obj.foo(100)
#include <boost/spirit/home/phoenix/bind/bind_member_variable.hpp>
Member variables can also be bound much like member functions. Member variables are not functions. Yet, like the ref(x) that acts like a nullary function returning a reference to the data, member variables, when bound, act like a unary function, taking in a pointer or reference to an object as its argument and returning a reference to the bound member variable. For instance, given:
struct xyz
{
    int v;
};
xyz::v can be bound as:
bind(&xyz::v, obj) // obj is an xyz object
As noted, just like the bound member function, a bound member variable also expects the first (and only) argument to be a pointer or reference to an object. The object (reference or pointer) can be lazily bound. Examples:
xyz obj;
bind(&xyz::v, arg1) // arg1.v
bind(&xyz::v, obj) // obj.v
bind(&xyz::v, arg1)(obj) = 4 // obj.v = 4
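The same patterns can be tried with the standard library's std::bind and std::mem_fn, which treat member pointers in a comparable way. This is an analogy rather than Phoenix code: the member function add and the helpers call_add and set_v are hypothetical, and add returns a value so the call can be checked.

```cpp
#include <functional>

struct xyz {
    int v;
    int add(int n) const { return v + n; }   // hypothetical member for the demo
};

// Counterpart of bind(&xyz::foo, arg1, arg2): object and argument supplied later.
int call_add(const xyz& obj, int n) {
    using namespace std::placeholders;
    return std::bind(&xyz::add, _1, _2)(obj, n);
}

// Counterpart of bind(&xyz::v, arg1)(obj) = value:
// a bound member variable yields a reference to the member.
void set_v(xyz& obj, int value) {
    std::mem_fn(&xyz::v)(obj) = value;
}
```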