The Spirit.Qi distinct
parser is a directive component that prevents partial matches when
parsing with a skipper. A simple example is the common task of matching
a C keyword. Consider:
"description" >> -lit(":") >> *(char_ - eol)
which is intended to match a line in a configuration file. Let's further
assume that this rule is used with a space
skipper and that the input contains the following strings:
"description: ident\n"
"description ident\n"
"descriptionident\n"
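It might seem unexpected, but the parser above matches all three inputs just fine, even though the third input should not match at all. The literal "description" matches only its own characters and places no constraint on what follows, so the trailing "ident" is simply consumed by *(char_ - eol). To avoid the unwanted match we are forced to make our rule more complicated: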
lexeme["description" >> !char_("a-zA-Z_0-9")] >> -lit(":") >> *(char_ - eol)
(the rule reads as: match "description"
as long as it's not directly followed by a valid identifier character).
The distinct[]
directive is meant to simplify the rule above:
distinct(char_("a-zA-Z_0-9"))["description"] >> -lit(":") >> *(char_ - eol)
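The condition that both forms of the rule express, namely that the keyword must not be directly followed by an identifier character, can be sketched in plain C++ without Spirit. The helper names below (is_identifier_char, starts_with_keyword, starts_with_distinct_keyword) are hypothetical and serve only to illustrate the lookahead. This is not how Spirit implements distinct[], and skipping and the remainder of the rule are deliberately left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Spirit): true for characters that may
// continue an identifier, mirroring the character set "a-zA-Z_0-9".
inline bool is_identifier_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Naive match: succeeds whenever the input starts with the keyword.
// This corresponds to the bare "description" literal.
inline bool starts_with_keyword(std::string const& input,
                                std::string const& keyword)
{
    return input.compare(0, keyword.size(), keyword) == 0;
}

// Distinct match: the keyword must additionally not be directly followed
// by an identifier character (end of input is acceptable).
inline bool starts_with_distinct_keyword(std::string const& input,
                                         std::string const& keyword)
{
    if (!starts_with_keyword(input, keyword))
        return false;
    std::size_t const next = keyword.size();
    return next == input.size() || !is_identifier_char(input[next]);
}
```

Applied to the three inputs above, starts_with_keyword accepts all of them, while starts_with_distinct_keyword rejects only "descriptionident\n".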
Using the distinct[]
component instead of the explicit sequence has the advantage of being able
to encapsulate the tail (i.e. the char_("a-zA-Z_0-9"))
as a separate parser construct. The following
code snippet illustrates the idea (for the full code of this example please
see distinct.cpp):
namespace spirit = boost::spirit;
namespace ascii = boost::spirit::ascii;
namespace repo = boost::spirit::repository;

// Define metafunctions allowing to compute the type of the distinct()
// and ascii::char_() constructs
namespace traits
{
    // Metafunction allowing to get the type of any repository::distinct(...)
    // construct
    template <typename Tail>
    struct distinct_spec
      : spirit::result_of::terminal<repo::tag::distinct(Tail)>
    {};

    // Metafunction allowing to get the type of any ascii::char_(...) construct
    template <typename String>
    struct char_spec
      : spirit::result_of::terminal<spirit::tag::ascii::char_(String)>
    {};
}

// Define a helper function allowing to create a distinct() construct from
// an arbitrary tail parser
template <typename Tail>
inline typename traits::distinct_spec<Tail>::type
distinct_spec(Tail const& tail)
{
    return repo::distinct(tail);
}

// Define a helper function allowing to create a ascii::char_() construct
// from an arbitrary string representation
template <typename String>
inline typename traits::char_spec<String>::type
char_spec(String const& str)
{
    return ascii::char_(str);
}

// the following constructs the type of a distinct_spec holding a
// charset("0-9a-zA-Z_") as its tail parser
typedef traits::char_spec<std::string>::type charset_tag_type;
typedef traits::distinct_spec<charset_tag_type>::type keyword_tag_type;

// Define a new Qi 'keyword' directive usable as a shortcut for a
// repository::distinct(char_(std::string("0-9a-zA-Z_")))
std::string const keyword_spec("0-9a-zA-Z_");
keyword_tag_type const keyword = distinct_spec(char_spec(keyword_spec));
These definitions introduce a new Qi directive for recognizing keywords. This allows us to rewrite our declaration parser expression as:
keyword["description"] >> -lit(":") >> *(char_ - eol)
which is much more readable and concise than the original parser
expression. In addition, the new keyword[]
directive can wrap any parser expression,
not only strings as in the example
above.
// forwards to <boost/spirit/repository/home/qi/directive/distinct.hpp>
#include <boost/spirit/repository/include/qi_distinct.hpp>
distinct(tail)[subject]
Parameter | Description |
---|---|
tail | The parser construct specifying what should not follow the subject in order to match the overall expression. |
subject | The parser construct to use to match the current input. The distinct directive makes sure that no unexpected partial matches occur. |
Both parameters can be arbitrarily complex parsers themselves.
The distinct component exposes the attribute type of its subject as its own
attribute type. If the subject does not expose any attribute (the type is
unused_type), then distinct does not expose any attribute either.
a: A, b: B --> distinct(b)[a]: A
The following example shows simple use cases of the distinct
parser (for the full code of this example please see distinct.cpp).
In addition to the main header file needed to include the core components
implemented in Spirit.Qi, we add the header file needed
for the new distinct
directive.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_distinct.hpp>
To make the code below more readable, we bring the following namespaces and names into scope.
using namespace boost::spirit;
using namespace boost::spirit::ascii;
using boost::spirit::repository::distinct;
We show several examples of how the distinct[]
directive can be used to force correct
behavior while matching keywords. The first two code snippets show the
correct matching of the description
keyword (in this hypothetical example we allow keywords to be directly
followed by an optional "--"
):
{
    std::string str("description ident");
    std::string::iterator first(str.begin());
    bool r = qi::phrase_parse(first, str.end()
      , distinct(alnum | '_')["description"] >> -lit("--") >> +(alnum | '_')
      , space);
    BOOST_ASSERT(r && first == str.end());
}
{
    std::string str("description--ident");
    std::string::iterator first(str.begin());
    bool r = qi::phrase_parse(first, str.end()
      , distinct(alnum | '_')["description"] >> -lit("--") >> +(alnum | '_')
      , space);
    BOOST_ASSERT(r && first == str.end());
}
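The last example shows that the parser correctly refuses to
match "description-ident". The keyword itself is recognized, but a single "-"
is neither the optional "--" nor an identifier character, so the overall
expression fails and the iterator is left at the start of the input: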
{
    std::string str("description-ident");
    std::string::iterator first(str.begin());
    bool r = qi::phrase_parse(first, str.end()
      , distinct(alnum | '_')["description"] >> -lit("--") >> +(alnum | '_')
      , space);
    BOOST_ASSERT(!r && first == str.begin());
}