Boost C++ Libraries

Character Generators (char_, lit)
Description

This section describes the character generators char_ and lit.

The char_ generator emits single characters. The char_ generator has an associated Character Encoding Namespace. This is needed when doing basic operations such as forcing lower or upper case and dealing with character ranges.

There are various forms of char_.

char_

The no argument form of char_ emits any character in the associated Character Encoding Namespace.

char_               // emits any character as supplied by the attribute

char_(ch)

The single argument form of char_ (with a character argument) emits the supplied character.

char_('x')          // emits 'x'
char_(L'x')         // emits L'x'
char_(x)            // emits x (a char)

char_(first, last)

The two-argument form of char_ emits any character from a range of characters, as supplied by the attribute.

char_('a','z')      // lower-case alphabetic characters
char_(L'0',L'9')    // digits

A range of characters is created from a low-high character pair. Such a generator emits a single character that is in the range, including both endpoints. Note that the first character must come before the second, according to the underlying Character Encoding Namespace.

Character mapping is inherently platform dependent. The standard does not guarantee, for example, that 'A' < 'Z'. This is why Spirit2 purposely attaches a specific Character Encoding Namespace (such as ASCII or ISO-8859-1) to the char_ generator: doing so eliminates such ambiguities.

Note

Sparse bit vectors

To accommodate 16-, 32-, and 64-bit characters, the char-set statically switches from a std::bitset implementation, used when the character type is not wider than 8 bits, to a sparse bit/boolean set which uses a sorted vector of disjoint ranges (range_run). The set is constructed from ranges such that adjacent or overlapping ranges are coalesced.

range_runs are very space-economical in situations where there are lots of ranges and a few individual disjoint values. Searching is O(log n) where n is the number of ranges.
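
The idea of a sorted vector of disjoint ranges can be sketched with the standard library alone. The names char_range and sparse_char_set below are hypothetical and this is not Spirit's range_run implementation; it only models the two properties described above: coalescing on insertion and O(log n) lookup.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical types for illustration; not Spirit's actual range_run.
struct char_range { std::uint32_t first, last; };   // inclusive bounds

class sparse_char_set
{
public:
    // Insert [first, last], merging it with overlapping or adjacent ranges.
    void add(std::uint32_t first, std::uint32_t last)
    {
        // First stored range that overlaps, touches, or follows the new one.
        auto it = std::lower_bound(ranges_.begin(), ranges_.end(), first,
            [](char_range const& r, std::uint32_t f) {
                return f > 0 && r.last < f - 1;
            });
        // Absorb every range that overlaps or is adjacent to [first, last].
        auto merge_end = it;
        std::uint32_t const max = std::numeric_limits<std::uint32_t>::max();
        while (merge_end != ranges_.end() &&
               (last == max || merge_end->first <= last + 1)) {
            first = std::min(first, merge_end->first);
            last = std::max(last, merge_end->last);
            ++merge_end;
        }
        it = ranges_.erase(it, merge_end);
        ranges_.insert(it, char_range{first, last});
    }

    // Binary search over the ranges: O(log n) in the number of ranges.
    bool contains(std::uint32_t ch) const
    {
        auto it = std::lower_bound(ranges_.begin(), ranges_.end(), ch,
            [](char_range const& r, std::uint32_t c) { return r.last < c; });
        return it != ranges_.end() && it->first <= ch;
    }

    std::size_t size() const { return ranges_.size(); }

private:
    std::vector<char_range> ranges_;   // sorted, disjoint, non-adjacent
};
```

In this sketch, adding 'a'-'f' and then 'g'-'z' leaves a single stored range, since the two are adjacent and are coalesced on insertion.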

char_(def)

Lastly, when given a string (a plain C string, a std::basic_string, etc.), char_ treats it as a char-set definition string. Its syntax resembles POSIX-style regular expression character sets, except that double quotes delimit the set elements instead of square brackets and there is no special negation character (^). Examples:

char_("a-zA-Z")     // alphabetic characters
char_("0-9a-fA-F")  // hexadecimal characters
char_("actgACTG")   // DNA identifiers
char_("\x7f\x7e")   // Hexadecimal 0x7F and 0x7E

These generators emit any character that belongs to the specified set, as supplied by the attribute.

lit(ch)

lit, when passed a single character, behaves like the single argument char_ except that lit does not consume an attribute. A plain char or wchar_t is equivalent to a lit.

Note

lit is reused by the String Generators, the char generators, and the Numeric Generators (see signed integer, unsigned integer, and real number generators). In general, a char generator is created when you pass in a character, a string generator is created when you pass in a string, and a numeric generator is created when you use a numeric literal. The exception is a single-element literal string, e.g. lit("x"). In this case, the library optimizes by creating a char generator instead of a string generator.

Examples:

'x'
lit('x')
lit(L'x')
lit(c)      // c is a char

Header
// forwards to <boost/spirit/home/karma/char/char.hpp>
#include <boost/spirit/include/karma_char_.hpp>

Also, see Include Structure.

Namespace

Name

boost::spirit::lit // alias: boost::spirit::karma::lit

ns::char_

In the table above, ns represents a Character Encoding Namespace.

Model of

PrimitiveGenerator

Notation

ch, ch1, ch2

Character-class specific character (See Character Class Types), or a Lazy Argument that evaluates to a character-class specific character value

cs

Character-set specifier string (See Character Class Types), or a Lazy Argument that evaluates to a character-set specifier string, or a pointer/reference to a null-terminated array of characters. This string is a char-set definition string following a syntax that resembles POSIX-style regular expression character sets (without the square brackets and without the negation character ^).

ns

A Character Encoding Namespace.

cg

A char generator, a char range generator, or a char set generator.

Expression Semantics

The semantics of an expression are defined only where they differ from, or are not defined in, PrimitiveGenerator.

Expression

Description

ch

Generate the character literal ch. This generator never fails (unless the underlying output stream reports an error).

lit(ch)

Generate the character literal ch. This generator never fails (unless the underlying output stream reports an error).

ns::char_

Generate the character provided by a mandatory attribute interpreted in the character set defined by ns. This generator never fails (unless the underlying output stream reports an error).

ns::char_(ch)

Generate the character ch as provided by the immediate literal value the generator is initialized from. If this generator has an associated attribute it succeeds only as long as the attribute is equal to the immediate literal (unless the underlying output stream reports an error). Otherwise this generator fails and does not generate any output.

ns::char_("c")

Generate the character c as provided by the immediate literal value the generator is initialized from. If this generator has an associated attribute it succeeds only as long as the attribute is equal to the immediate literal (unless the underlying output stream reports an error). Otherwise this generator fails and does not generate any output.

ns::char_(ch1, ch2)

Generate the character provided by a mandatory attribute interpreted in the character set defined by ns. The generator succeeds as long as the attribute belongs to the character range [ch1, ch2] (unless the underlying output stream reports an error). Otherwise this generator fails and does not generate any output.

ns::char_(cs)

Generate the character provided by a mandatory attribute interpreted in the character set defined by ns. The generator succeeds as long as the attribute belongs to the character set cs (unless the underlying output stream reports an error). Otherwise this generator fails and does not generate any output.

~cg

Negate cg. The result is a negated char generator that inverts the test condition of the character generator it is attached to.

A character ch is assumed to belong to the character range defined by ns::char_(ch1, ch2) if its character value (binary representation), interpreted in the character set defined by ns, is not smaller than the character value of ch1 and not larger than the character value of ch2 (i.e. ch1 <= ch <= ch2).

The charset parameter passed to ns::char_(charset) must be a string containing more than one character. Every character in this string belongs to the character set defined by the expression. The exception is the '-' character, which has a special meaning when it is neither the first nor the last character in charset: placed between two characters, it denotes the character range spanned by them. A character ch belongs to the character set charset if it matches one of the characters or ranges specified by the string. For example:

Example

Description

char_("abc")

'a', 'b', and 'c'

char_("a-z")

all characters from 'a' to 'z', inclusive

char_("a-zA-Z")

all characters from 'a' to 'z' and from 'A' to 'Z', inclusive

char_("-1-9")

'-' and all characters from '1' to '9', inclusive
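
The membership rules above can be modelled in a few lines of standard C++. This is only an illustrative sketch of the documented semantics; the function name in_charset and its implementation are assumptions, not Spirit's own code.

```cpp
#include <cstddef>
#include <string>

// Illustrative model of the documented definition-string rules
// (hypothetical function, not part of Spirit).
inline bool in_charset(std::string const& def, char ch)
{
    unsigned char const c = static_cast<unsigned char>(ch);
    for (std::size_t i = 0; i < def.size(); ++i) {
        unsigned char const lo = static_cast<unsigned char>(def[i]);
        // A '-' with a character on both sides spans the range lo-hi.
        if (i + 2 < def.size() && def[i + 1] == '-') {
            unsigned char const hi = static_cast<unsigned char>(def[i + 2]);
            if (lo <= c && c <= hi)
                return true;
            i += 2;   // skip the '-' and the upper bound
        } else if (c == lo) {
            return true;   // single character, e.g. a leading or trailing '-'
        }
    }
    return false;
}
```

With this model, in_charset("-1-9", '-') and in_charset("-1-9", '5') are both true, while in_charset("-1-9", '0') is false.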

Attributes

Expression

Attribute

ch

unused

lit(ch)

unused

ns::char_

Ch. The attribute is mandatory (otherwise compilation will fail). Ch is the character type of the Character Encoding Namespace, ns.

ns::char_(ch)

Ch. The attribute is optional. If it is supplied, the generator compares the attribute with ch and succeeds only if both are equal, failing otherwise. Ch is the character type of the Character Encoding Namespace, ns.

ns::char_("c")

Ch. The attribute is optional. If it is supplied, the generator compares the attribute with c and succeeds only if both are equal, failing otherwise. Ch is the character type of the Character Encoding Namespace, ns.

ns::char_(ch1, ch2)

Ch. The attribute is mandatory (otherwise compilation will fail). The generator succeeds if the attribute belongs to the character range [ch1, ch2], interpreted in the character set defined by ns. Ch is the character type of the Character Encoding Namespace, ns.

ns::char_(cs)

Ch. The attribute is mandatory (otherwise compilation will fail). The generator succeeds if the attribute belongs to the character set cs, interpreted in the character set defined by ns. Ch is the character type of the Character Encoding Namespace, ns.

~cg

Attribute of cg

Note

In addition to their usual attribute of type Ch, all listed generators also accept an instance of boost::optional<Ch>. If the boost::optional<> is initialized (holds a value), the generators behave as if their attribute were an instance of Ch and emit the value stored in the boost::optional<>. Otherwise the generators fail.

Complexity

O(1)

The complexity of ch, lit(ch), ns::char_, ns::char_(ch), and ns::char_("c") is constant as all generators emit exactly one character per invocation.

The character range generator (ns::char_(ch1, ch2)) additionally requires constant lookup time to verify whether the attribute belongs to the character range.

The character set generator (ns::char_(cs)) additionally requires O(log N) lookup time to verify whether the attribute belongs to the character set, where N is the number of characters in the character set.

Example

Note

The test harness for the example(s) below is presented in the Basics Examples section.

Some includes:

#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/support_utree.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <iostream>
#include <string>

Some using declarations:

using boost::spirit::karma::lit;
using boost::spirit::ascii::char_;

Basic usage of char_ generators:

test_generator("A", 'A');
test_generator("A", lit('A'));

test_generator_attr("a", char_, 'a');
test_generator("A", char_('A'));
test_generator_attr("A", char_('A'), 'A');
test_generator_attr("", char_('A'), 'B');         // fails (as 'A' != 'B')

test_generator_attr("A", char_('A', 'Z'), 'A');
test_generator_attr("", char_('A', 'Z'), 'a');    // fails (as 'a' does not belong to 'A'...'Z')

test_generator_attr("k", char_("a-z0-9"), 'k');
test_generator_attr("", char_("a-z0-9"), 'A');    // fails (as 'A' does not belong to "a-z0-9")

