Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

Click here to view the latest version of this page.
PrevUpHomeNext

Qi advance Parser

Description

The Spirit.Qi advance is a primitive parser component allowing the parser to skip (advance) through a specified number of iterations without performing unnecessary work:

advance(distance)

The most obvious existing alternative to this, the repeat directive, will cause the parser to advance one iterator at a time while usually performing operations at each step. In some cases that work is unnecessary, as in the case where large binary objects are being parsed. Take, for example, the following binary data:

00 00 00 01 77 fc b4 51 0a b3 b7 ... 1e 60 70 b6 00 00 01 00

If the first 4 bytes are a little-endian 32-bit integer describing the length of the subsequent data, but the data itself is not relevant to parsing, then the repeat directive would cause all of the subsequent 16 MB of data to be consumed one byte at a time while generating page faults or other superfluous I/O. If the value is large, as it is in this case, the parser becomes very slow.

little_dword[_a = _1] >> repeat(_a)[byte_] >> little_dword...

The advance parser component solves this problem by performing as little work as possible to advance the parser's iterator, and will optimize for the case of random-access iterators by advancing directly to the desired relative iterator position.

little_dword[_a = _1] >> advance(_a) >> little_dword...
Header
// forwards to <boost/spirit/repository/home/qi/primitive/advance.hpp>
#include <boost/spirit/repository/include/qi_advance.hpp>
Synopsis
advance(distance)
Parameters

Parameter

Description

'distance'

The distance that the iterator shall be advanced

Attribute

The advance component exposes no attribute (the exposed attribute type is unused_type):

advance --> unused
Example

The following example shows simple use cases of the advance component. We will illustrate its usage by generating parsers for some binary data (for the full example code see advance.cpp)

Prerequisites

In addition to the main header file needed to include the core components implemented in Spirit.Qi we add the header file needed for the new advance component.

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/repository/include/qi_advance.hpp>

In order to make the examples below more readable we introduce or use the following namespaces.

namespace qi = boost::spirit::qi;
using boost::spirit::repository::qi::advance;

Setting up the Grammar

This is a very simple grammar that recognizes several fields of a binary stream of data. There are two fields explicitly delimited by a field indicating the number of bytes that are spanned. They are separated by a literal string.

template <typename Iterator>
struct advance_grammar : qi::grammar<Iterator, qi::locals<int> >
{
    advance_grammar() : advance_grammar::base_type(start)
    {
        using qi::byte_;
        using qi::eoi;
        using namespace qi::labels;

        start
            =  byte_  [_a = _1]
            >> advance(_a)
            >> "boost"
            >> byte_  [_a = _1]
            >> (advance(_a) | "qi") // note alternative when advance fails
            >> eoi
            ;
    }

    qi::rule<Iterator, qi::locals<int> > start;
};

Note that the second binary field may either contain the number of specified bytes, or the word "qi". If the advance parser component fails to advance the specified number of bytes before reaching the end of input, it will fail and the parser will attempt to descend into alternatives.

Parsing a Correctly-delimited String of Data

The data below is correctly delimited and will thus result in a valid parse. Note that both random-access and bidirectional iterators are used here.

unsigned char const alt1[] =
{
    5,                         // number of bytes to advance
    1, 2, 3, 4, 5,             // data to advance through
    'b', 'o', 'o', 's', 't',   // word to parse
    2,                         // number of bytes to advance
    11, 12                     // more data to advance through
    // eoi
};

std::string const alt1_string(alt1, alt1 + sizeof alt1);
std::list<unsigned char> const alt1_list(alt1, alt1 + sizeof alt1);

result =
    qi::parse(alt1_string.begin(), alt1_string.end()
        , client::advance_grammar<std::string::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt1 using random access iterator " << result << std::endl;

result =
    qi::parse(alt1_list.begin(), alt1_list.end()
        , client::advance_grammar<std::list<unsigned char>::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt1 using bidirectional iterator " << result << std::endl;

Parsing the Alternative Representation

The data below is not correctly delimited, but will correctly parse because the alternative word "qi" is available.

unsigned char const alt2[] =
{
    2,                         // number of bytes to advance
    1, 2,                      // data to advance through
    'b', 'o', 'o', 's', 't',   // word to parse
    4,                         // number of bytes to advance
    'q', 'i'                   // alternative (advance won't work)
    // eoi
};

std::string const alt2_string(alt2, alt2 + sizeof alt2);
std::list<unsigned char> const alt2_list(alt2, alt2 + sizeof alt2);

result =
    qi::parse(alt2_string.begin(), alt2_string.end()
        , client::advance_grammar<std::string::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt2 using random access iterator " << result << std::endl;

result =
    qi::parse(alt2_list.begin(), alt2_list.end()
        , client::advance_grammar<std::list<unsigned char>::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt2 using bidirectional iterator " << result << std::endl;

Notes

The advance parser component will fail unconditionally on negative values. It will never attempt to advance the iterator in the reverse direction.


PrevUpHomeNext