The Spirit.Qi advance parser
is a primitive parser component that lets the parser skip (advance) over
a specified number of iterator positions without performing unnecessary work:
advance(distance)
The most obvious existing alternative to this, the repeat
directive, causes the parser to advance one iterator position at a time,
usually performing some operation at each step. In some cases that work is
unnecessary, for example when large binary objects are being parsed.
Take, for example, the following binary data:
00 00 00 01 77 fc b4 51 0a b3 b7 ... 1e 60 70 b6 00 00 01 00
If the first 4 bytes are a little-endian 32-bit integer describing the length of the subsequent data, but the data itself is not relevant to parsing, then the repeat directive would cause all of the subsequent 16 MB of data to be consumed one byte at a time while generating page faults or other superfluous I/O. If the value is large, as it is in this case, the parser becomes very slow.
little_dword[_a = _1] >> repeat(_a)[byte_] >> little_dword...
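The 16 MB figure follows directly from the byte order of the length field. As a quick check, the sketch below decodes four bytes as a little-endian 32-bit value by hand. The helper name decode_little_dword is hypothetical and is not part of Spirit; inside a grammar, little_dword performs the equivalent conversion.

```cpp
#include <cstdint>

// Hypothetical helper (not a Spirit API): decode a 32-bit little-endian
// unsigned integer. The least significant byte is stored first.
inline std::uint32_t decode_little_dword(unsigned char const* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For the leading bytes 00 00 00 01 this yields 0x01000000, that is 16,777,216 bytes, which is the number of single-byte steps repeat(_a)[byte_] would have to take.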
The advance
parser component
solves this problem by performing as little work as possible to advance
the parser's iterator, and will optimize for the case of random-access
iterators by advancing directly to the desired relative iterator position.
little_dword[_a = _1] >> advance(_a) >> little_dword...
// forwards to <boost/spirit/repository/home/qi/primitive/advance.hpp>
#include <boost/spirit/repository/include/qi_advance.hpp>
advance(distance)
Parameter | Description |
---|---|
'distance' | The distance that the iterator shall be advanced |
The advance component exposes no attribute (the exposed attribute type is
unused_type):
advance --> unused
The following example shows simple use cases of the advance
component. We will illustrate its usage by generating parsers for some
binary data (for the full example code see advance.cpp).
In addition to the main header file needed to include the core components
implemented in Spirit.Qi we add the header file needed
for the new advance
component.
#include <boost/spirit/include/qi.hpp>
#include <boost/phoenix/operator.hpp>
#include <boost/spirit/repository/include/qi_advance.hpp>
In order to make the examples below more readable we introduce or use the following namespaces.
namespace qi = boost::spirit::qi;
using boost::spirit::repository::qi::advance;
This is a very simple grammar that recognizes several fields of a binary stream of data. There are two fields explicitly delimited by a field indicating the number of bytes that are spanned. They are separated by a literal string.
template <typename Iterator>
struct advance_grammar : qi::grammar<Iterator, qi::locals<int> >
{
    advance_grammar() : advance_grammar::base_type(start)
    {
        using qi::byte_;
        using qi::eoi;
        using namespace qi::labels;

        start
            =  byte_  [_a = _1]
            >> advance(_a)
            >> "boost"
            >> byte_  [_a = _1]
            >> (advance(_a) | "qi") // note alternative when advance fails
            >> eoi
            ;
    }

    qi::rule<Iterator, qi::locals<int> > start;
};
Note that the second binary field may either contain the number of specified
bytes, or the word "qi". If the advance
parser component fails to advance the specified number of bytes before
reaching the end of input, it will fail and the parser will attempt to
descend into alternatives.
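The control flow of this grammar, including the fallback to "qi", can be spelled out by hand. The following sketch is a hypothetical hand-written equivalent over a plain byte range, using only the standard library. It is not how Spirit evaluates the grammar internally, and the names skip, literal and parse_fields are invented for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Skip n bytes; fail and leave p untouched if fewer than n remain.
inline bool skip(unsigned char const*& p, unsigned char const* end, std::size_t n)
{
    if (static_cast<std::size_t>(end - p) < n)
        return false;
    p += n;
    return true;
}

// Match a literal word; fail and leave p untouched on a mismatch.
inline bool literal(unsigned char const*& p, unsigned char const* end,
                    char const* word)
{
    std::size_t const n = std::strlen(word);
    if (static_cast<std::size_t>(end - p) < n || std::memcmp(p, word, n) != 0)
        return false;
    p += n;
    return true;
}

// Hand-written counterpart of the start rule shown above.
inline bool parse_fields(unsigned char const* p, unsigned char const* end)
{
    if (p == end) return false;
    std::size_t n = *p++;                           // byte_[_a = _1]
    if (!skip(p, end, n)) return false;             // advance(_a)
    if (!literal(p, end, "boost")) return false;    // "boost"
    if (p == end) return false;
    n = *p++;                                       // byte_[_a = _1]
    if (!skip(p, end, n) && !literal(p, end, "qi")) // advance(_a) | "qi"
        return false;
    return p == end;                                // eoi
}
```

Because a failed skip leaves the position unchanged, the "qi" alternative is tried from the byte immediately after the second length field, which is the same position at which advance was attempted.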
The data below is correctly delimited and will thus result in a valid parse. Note that both random-access and bidirectional iterators are used here.
unsigned char const alt1[] =
{
    5,                         // number of bytes to advance
    1, 2, 3, 4, 5,             // data to advance through
    'b', 'o', 'o', 's', 't',   // word to parse
    2,                         // number of bytes to advance
    11, 12                     // more data to advance through
    // eoi
};

std::string const alt1_string(alt1, alt1 + sizeof alt1);
std::list<unsigned char> const alt1_list(alt1, alt1 + sizeof alt1);

result =
    qi::parse(alt1_string.begin(), alt1_string.end()
        , client::advance_grammar<std::string::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt1 using random access iterator " << result << std::endl;

result =
    qi::parse(alt1_list.begin(), alt1_list.end()
        , client::advance_grammar<std::list<unsigned char>::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt1 using bidirectional iterator " << result << std::endl;
The data below is not correctly delimited, but will correctly parse because the alternative word "qi" is available.
unsigned char const alt2[] =
{
    2,                         // number of bytes to advance
    1, 2,                      // data to advance through
    'b', 'o', 'o', 's', 't',   // word to parse
    4,                         // number of bytes to advance
    'q', 'i'                   // alternative (advance won't work)
    // eoi
};

std::string const alt2_string(alt2, alt2 + sizeof alt2);
std::list<unsigned char> const alt2_list(alt2, alt2 + sizeof alt2);

result =
    qi::parse(alt2_string.begin(), alt2_string.end()
        , client::advance_grammar<std::string::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt2 using random access iterator " << result << std::endl;

result =
    qi::parse(alt2_list.begin(), alt2_list.end()
        , client::advance_grammar<std::list<unsigned char>::const_iterator>())
    ? "succeeded" : "failed";
std::cout << "Parsing alt2 using bidirectional iterator " << result << std::endl;
The advance
parser component
will fail unconditionally on negative values. It will never attempt to
advance the iterator in the reverse direction.