Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for an old version of Boost. Click here to view this page for the latest version.
PrevUpHomeNext

Mini XML - Error Handling

A parser will not be complete without error handling. Spirit2 provides some facilities to make it easy to adapt a grammar for error handling. We'll wrap up the Qi tutorial with another version of the mini xml parser, this time, with error handling.

The full cpp file for this example can be found here: ../../example/qi/mini_xml3.cpp

Here's the grammar:

template <typename Iterator>
struct mini_xml_grammar
  : qi::grammar<Iterator, mini_xml(), qi::locals<std::string>, ascii::space_type>
{
    mini_xml_grammar()
      : mini_xml_grammar::base_type(xml, "xml")
    {
        using qi::lit;
        using qi::lexeme;
        using qi::on_error;
        using qi::fail;
        using ascii::char_;
        using ascii::string;
        using namespace qi::labels;

        using phoenix::construct;
        using phoenix::val;

        text %= lexeme[+(char_ - '<')];
        node %= xml | text;

        start_tag %=
                '<'
            >>  !lit('/')
            >   lexeme[+(char_ - '>')]
            >   '>'
        ;

        end_tag =
                "</"
            >   string(_r1)
            >   '>'
        ;

        xml %=
                start_tag[_a = _1]
            >   *node
            >   end_tag(_a)
        ;

        xml.name("xml");
        node.name("node");
        text.name("text");
        start_tag.name("start_tag");
        end_tag.name("end_tag");

        on_error<fail>
        (
            xml
          , std::cout
                << val("Error! Expecting ")
                << _4                               // what failed?
                << val(" here: \"")
                << construct<std::string>(_3, _2)   // iterators to error-pos, end
                << val("\"")
                << std::endl
        );
    }

    qi::rule<Iterator, mini_xml(), qi::locals<std::string>, ascii::space_type> xml;
    qi::rule<Iterator, mini_xml_node(), ascii::space_type> node;
    qi::rule<Iterator, std::string(), ascii::space_type> text;
    qi::rule<Iterator, std::string(), ascii::space_type> start_tag;
    qi::rule<Iterator, void(std::string), ascii::space_type> end_tag;
};

What's new?

Readable Names

First, when we call the base class, we give the grammar a name:

: mini_xml_grammar::base_type(xml, "xml")

Then, we name all our rules:

xml.name("xml");
node.name("node");
text.name("text");
start_tag.name("start_tag");
end_tag.name("end_tag");
On Error

on_error declares our error handler:

on_error<Action>(rule, handler)

This will specify what we will do when we get an error. We will print out an error message using phoenix:

on_error<fail>
(
    xml
  , std::cout
        << val("Error! Expecting ")
        << _4                               // what failed?
        << val(" here: \"")
        << construct<std::string>(_3, _2)   // iterators to error-pos, end
        << val("\"")
        << std::endl
);

we choose to fail in our example for the Action: Quit and fail. Return a no_match (false). It can be one of:

Action

Description

fail

Quit and fail. Return a no_match.

retry

Attempt error recovery, possibly moving the iterator position.

accept

Force success, moving the iterator position appropriately.

rethrow

Rethrows the error.

rule is the rule to which the handler is attached. In our case, we are attaching to the xml rule.

handler is the actual error handling function. It expects 4 arguments:

Arg

Description

first

The position of the iterator when the rule with the handler was entered.

last

The end of input.

error-pos

The actual position of the iterator where the error occurred.

what

What failed: a string describing the failure.

Expectation Points

You might not have noticed it, but some of our expressions changed from using the >> to >. Look, for example:

end_tag =
        "</"
    >   lit(_r1)
    >   '>'
;

What is it? It's the expectation operator. You will have some "deterministic points" in the grammar. Those are the places where backtracking cannot occur. For our example above, when you get a "</", you definitely must see a valid end-tag label next. It should be the one you got from the start-tag. After that, you definitely must have a '>' next. Otherwise, there is no point in proceeding and trying other branches, regardless where they are. The input is definitely erroneous. When this happens, an expectation_failure exception is thrown. Somewhere outward, the error handler will catch the exception.

Try building the parser: ../../example/qi/mini_xml3.cpp. You can find some examples in: ../../example/qi/mini_xml_samples for testing purposes. "4.toyxml" has an error in it:

<foo><bar></foo></bar>

Running the example with this gives you:

Error! Expecting "bar" here: "foo></bar>"
Error! Expecting end_tag here: "<bar></foo></bar>"
-------------------------
Parsing failed
-------------------------

PrevUpHomeNext