Boost C++ Libraries

This is the documentation for an old version of Boost.

Qi subrules

Description

The Spirit.Qi subrule is a component that lets you create a named parser and refer to it by name, much like rules and grammars. It is in fact a fully static version of the rule.

The strength of subrules is performance. Replacing some rules with subrules can make a parser slightly faster (see Performance below for measurements). The reason is that subrules allow aggressive inlining by the C++ compiler, whereas rules are implemented with a virtual function call which, depending on the compiler, can add run-time overhead and prevent inlining.
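The difference between static and type-erased composition can be sketched without Spirit at all. The following is only an illustration under stated assumptions, not Spirit's implementation: Digit, Plus, ErasedParser, parse_static and parse_erased are hypothetical names, and std::function stands in for the indirect call that a rule introduces.

#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// In this sketch a parser is any callable taking (first, last),
// advancing first on success and returning whether it matched.
using Iter = const char*;

// Matches exactly one decimal digit.
struct Digit {
    bool operator()(Iter& first, Iter last) const {
        if (first != last && std::isdigit(static_cast<unsigned char>(*first))) {
            ++first;
            return true;
        }
        return false;
    }
};

// Matches one or more repetitions of the parser P.
// When P is a concrete type, the nested call is resolved at compile time
// and the compiler is free to inline it (the "subrule-like" case).
template <typename P>
struct Plus {
    P p;
    bool operator()(Iter& first, Iter last) const {
        if (!p(first, last)) return false;
        while (p(first, last)) {}
        return true;
    }
};

// Type-erased parser: every call goes through an indirection,
// loosely comparable to the virtual call behind a rule (an assumption
// made for illustration only).
using ErasedParser = std::function<bool(Iter&, Iter)>;

// Fully static composition: the whole parser type is Plus<Digit>.
bool parse_static(const std::string& s) {
    Plus<Digit> number{Digit{}};
    Iter first = s.data();
    Iter last = first + s.size();
    return number(first, last) && first == last;
}

// Same grammar, but each component is hidden behind ErasedParser.
bool parse_erased(const std::string& s) {
    ErasedParser digit = Digit{};
    ErasedParser number = Plus<ErasedParser>{digit};
    Iter first = s.data();
    Iter last = first + s.size();
    return number(first, last) && first == last;
}

Both functions accept the same inputs; only the way the components are bound together differs. Whether the static version is measurably faster depends on the compiler and the grammar, as the Performance table suggests.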

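The weaknesses of subrules are that they can only be defined and used within the same group (see Groups below), and that they put a heavy strain on the C++ compiler, increasing compile time and memory usage (see Performance and Notes below).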

entry = (
    expression =
        term
        >> *(   ('+' >> term)
            |   ('-' >> term)
            )

  , term =
        factor
        >> *(   ('*' >> factor)
            |   ('/' >> factor)
            )

  , factor =
        uint_
        |   '(' >> expression >> ')'
        |   ('-' >> factor)
        |   ('+' >> factor)
);

The example above can be found here: ../../example/qi/calc1_sr.cpp

As shown in this code snippet (an extract from the calc1_sr example), subrules can be freely mixed with rules and grammars. Here, a group of 3 subrules (expression, term, factor) is assigned to a rule (named entry). This means that parts of a parser can use subrules (typically the innermost, most performance-critical parts), whereas the rest can use rules and grammars.
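For context, the snippet might be embedded in a complete grammar along the following lines. This is a sketch, not a copy of the calc1_sr example: the calculator struct, the subrule IDs 0 to 2, the choice of ascii::space as skipper and the helper is_valid_expression are assumptions made for illustration.

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_subrule.hpp>
#include <cassert>
#include <string>

namespace qi = boost::spirit::qi;
namespace repo = boost::spirit::repository;
namespace ascii = boost::spirit::ascii;

template <typename Iterator>
struct calculator : qi::grammar<Iterator, ascii::space_type>
{
    calculator() : calculator::base_type(entry)
    {
        using qi::uint_;

        // One rule (entry) holding a group of three subrules.
        entry = (
            expression =
                term
                >> *(   ('+' >> term)
                    |   ('-' >> term)
                    )

          , term =
                factor
                >> *(   ('*' >> factor)
                    |   ('/' >> factor)
                    )

          , factor =
                uint_
                |   '(' >> expression >> ')'
                |   ('-' >> factor)
                |   ('+' >> factor)
        );
    }

    // The rule needs the skipper type; the subrules do not.
    qi::rule<Iterator, ascii::space_type> entry;

    // Each subrule in the group has a distinct ID (values assumed here).
    repo::qi::subrule<0> expression;
    repo::qi::subrule<1> term;
    repo::qi::subrule<2> factor;
};

// Hypothetical helper: true if the whole input is a valid expression.
bool is_valid_expression(std::string const& input)
{
    typedef std::string::const_iterator iterator;
    calculator<iterator> calc;
    iterator first = input.begin();
    iterator last = input.end();
    bool ok = qi::phrase_parse(first, last, calc, ascii::space);
    return ok && first == last;   // require a full match
}

Callers only ever see the grammar and its entry rule; the subrules stay an implementation detail of the group assigned to entry.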

Header

// forwards to <boost/spirit/repository/home/qi/nonterminal/subrule.hpp>
#include <boost/spirit/repository/include/qi_subrule.hpp>

Synopsis (declaration)

subrule<ID, A1, A2> sr(name);

Parameters (declaration)

Parameter   Description
ID          Required numeric argument. Gives the subrule a unique 'identification tag'.
A1, A2      Optional types, can be specified in any order. Each can be one of: 1. signature, 2. locals (see the rules reference for more information on those parameters).
            Note that the skipper type need not be specified in the parameters, unlike with grammars and rules. Subrules automatically use the skipper type that is in effect when they are invoked.
name        Optional string. Gives the subrule a name, useful for debugging and error handling.

Synopsis (usage)

Subrules are defined and used within groups, typically (and by convention) enclosed inside parentheses.

// Group containing N subrules
(
    sr1 = expr1
  , sr2 = expr2
  , ... // Any number of subrules
)

The IDs of all subrules defined within the same group must be different. It is an error to define several subrules with the same ID (or to define the same subrule multiple times) in the same group.

// Auto-subrules and inherited attributes
(
    srA %= exprA >> srB >> srC(c1, c2, ...) // Arguments to subrule srC
  , srB %= exprB
  , srC  = exprC
  , ...
)(a1, a2, ...)         // Arguments to group, i.e. to start subrule srA

Parameters (usage)

Parameter            Description
sr1, sr2             Subrules with different IDs.
expr1, expr2         Parser expressions. Can include sr1 and sr2, as well as any other valid parser expressions.
srA                  Subrule with a synthesized attribute and inherited attributes.
srB                  Subrule with a synthesized attribute.
srC                  Subrule with inherited attributes.
exprA, exprB, exprC  Parser expressions.
a1, a2               Arguments passed to the subrule group. They are passed as inherited attributes to the group's start subrule, srA.
c1, c2               Arguments passed as inherited attributes to subrule srC.

Groups

A subrule group (a set of subrule definitions) is a parser, which can be used anywhere in a parser expression (in assignments to rules, as well as directly in arguments to functions such as parse). In a group, parsing proceeds from the start subrule, which is the first (topmost) subrule defined in that group. In the two groups in the synopsis above, sr1 and srA are the respective start subrules. For example, when the first group is invoked, parsing begins with sr1.

A subrule can only be used in a group which defines it. Groups can be viewed as scopes: a definition of a subrule is limited to its enclosing group.

rule<char const*> r1, r2, r3;
subrule<1> sr1;
subrule<2> sr2;

r1 =
        ( sr1 = 'a' >> int_ )       // First group in r1.
    >>  ( sr2 = +sr1 )              // Second group in r1.
    //           ^^^
    // DOES NOT COMPILE: sr1 is not defined in this
    // second group, it cannot be used here (its
    // previous definition is out of scope).
;

r2 =
        ( sr1 = 'a' >> int_ )       // Only group in r2.
    >>  sr1
    //  ^^^
    // DOES NOT COMPILE: not in a subrule group,
    // sr1 cannot be used here (here too, its
    // previous definition is out of scope).
;

r3 =
        ( sr1 = 'x' >> double_ )    // Another group. The same subrule `sr1`
                                    // can have another, independent
                                    // definition in this group.
;
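
Attributes

A subrule has the same behavior as a rule with respect to attributes. In particular, its signature (if any) determines its synthesized and inherited attributes, a subrule assigned with %= is an auto-subrule, and Phoenix placeholders such as _r1 refer to its inherited attributes, as the synopsis and the example in this section show.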

Locals

A subrule has the same behavior as a rule with respect to locals. In particular, the Phoenix placeholders _a, _b, ... are available to refer to the subrule's locals, if present.

Example

Some includes:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_subrule.hpp>
#include <boost/phoenix/core.hpp>
#include <boost/phoenix/operator.hpp>
#include <string>

Some using declarations:

namespace qi = boost::spirit::qi;
namespace repo = boost::spirit::repository;
namespace ascii = boost::spirit::ascii;

A grammar containing only one rule, defined with a group of 5 subrules:

template <typename Iterator>
struct mini_xml_grammar
  : qi::grammar<Iterator, mini_xml(), ascii::space_type>
{
    mini_xml_grammar()
      : mini_xml_grammar::base_type(entry)
    {
        using qi::lit;
        using qi::lexeme;
        using ascii::char_;
        using ascii::string;
        using namespace qi::labels;

        entry %= (
            xml %=
                    start_tag[_a = _1]
                >>  *node
                >>  end_tag(_a)

          , node %= xml | text

          , text %= lexeme[+(char_ - '<')]

          , start_tag %=
                    '<'
                >>  !lit('/')
                >>  lexeme[+(char_ - '>')]
                >>  '>'

          , end_tag %=
                    "</"
                >>  lit(_r1)
                >>  '>'
        );
    }

    qi::rule<Iterator, mini_xml(), ascii::space_type> entry;

    repo::qi::subrule<0, mini_xml(), qi::locals<std::string> > xml;
    repo::qi::subrule<1, mini_xml_node()> node;
    repo::qi::subrule<2, std::string()> text;
    repo::qi::subrule<3, std::string()> start_tag;
    repo::qi::subrule<4, void(std::string)> end_tag;
};

The definitions of the mini_xml and mini_xml_node data structures are not shown here. The full example above can be found here: ../../example/qi/mini_xml2_sr.cpp

Performance

This table compares run-time and compile-time performance when converting examples to subrules, with various compilers.

Table 2. Subrules performance

Example       Compiler                   Speed (run-time)  Time (compile-time)  Memory (compile-time)
calc1_sr      gcc 4.4.1                  +6%               n/a                  n/a
calc1_sr      Visual C++ 2008 (VC9)      +5%               n/a                  n/a
mini_xml2_sr  gcc 3.4.6                  -1%               +54%                 +32%
mini_xml2_sr  gcc 4.1.2                  +5%               +58%                 +25%
mini_xml2_sr  gcc 4.4.1                  +8%               +20%                 +14%
mini_xml2_sr  Visual C++ 2005 (VC8) SP1  +1%               +33%                 +27%
mini_xml2_sr  Visual C++ 2008 (VC9)      +9%               +52%                 +40%


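The columns are: Speed (run-time), the speed-up of the parser resulting from the use of subrules (higher is better); Time (compile-time), the increase in compile time (lower is better); and Memory (compile-time), the increase in compiler memory usage (lower is better).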

Notes

Subrules push the C++ compiler hard. A group of subrules is a single C++ expression. Current C++ compilers cannot handle very complex expressions very well. One restricting factor is the typical compiler's limit on template recursion depth. Some, but not all, compilers allow this limit to be configured.

g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this appropriately if you use relatively complex subrules.

