Boost.Regex: syntax_option_type
Type syntax_option_type is an implementation-defined bitmask type that controls how a regular expression string is to be interpreted. For convenience, all the constants listed here are also duplicated within the scope of class template basic_regex.
namespace std{ namespace regex_constants{

typedef bitmask_type syntax_option_type;

// these flags are standardized:
static const syntax_option_type normal;
static const syntax_option_type icase;
static const syntax_option_type nosubs;
static const syntax_option_type optimize;
static const syntax_option_type collate;
static const syntax_option_type ECMAScript = normal;
static const syntax_option_type JavaScript = normal;
static const syntax_option_type JScript = normal;
static const syntax_option_type basic;
static const syntax_option_type extended;
static const syntax_option_type awk;
static const syntax_option_type grep;
static const syntax_option_type egrep;
static const syntax_option_type sed = basic;
static const syntax_option_type perl;

// these are boost.regex specific:
static const syntax_option_type escape_in_lists;
static const syntax_option_type char_classes;
static const syntax_option_type intervals;
static const syntax_option_type limited_ops;
static const syntax_option_type newline_alt;
static const syntax_option_type bk_plus_qm;
static const syntax_option_type bk_braces;
static const syntax_option_type bk_parens;
static const syntax_option_type bk_refs;
static const syntax_option_type bk_vbar;
static const syntax_option_type use_except;
static const syntax_option_type failbit;
static const syntax_option_type literal;
static const syntax_option_type nocollate;
static const syntax_option_type perlex;
static const syntax_option_type emacs;

} // namespace regex_constants
} // namespace std
The type syntax_option_type is an implementation-defined bitmask type (17.3.2.1.2). Setting its elements has the effects listed in the table below. A valid value of type syntax_option_type will always have exactly one of the elements normal, basic, extended, awk, grep, egrep, sed or perl set.
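As a minimal sketch of how such a value is built, the example below combines exactly one grammar element with one modifier using the bitwise-or operator. It is written against std::regex from the C++11 standard library, which adopted the standardized flag names; that substitution is an assumption made so the example needs only the standard library. The function name matches_ignoring_case and the pattern "hello" are hypothetical.

```cpp
#include <regex>
#include <string>

// Assumption: std::regex stands in for boost::regex here; only the
// standardized flags are used.
// Exactly one grammar element (ECMAScript) is combined with the
// modifier icase to form a valid syntax_option_type value.
bool matches_ignoring_case(const std::string& text)
{
    const std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript | std::regex_constants::icase;
    const std::regex re("hello", flags);   // hypothetical pattern
    return std::regex_match(text, re);
}
```

With these flags, matches_ignoring_case("HELLO") and matches_ignoring_case("hello") both succeed, while matches_ignoring_case("help") does not.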
Note that for convenience all the constants listed here are duplicated within the scope of class template basic_regex, so you can use any of:
boost::regex_constants::constant_name
or
boost::regex::constant_name
or
boost::wregex::constant_name
in an interchangeable manner.
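The interchangeability of the scoped names can be sketched as follows. The example assumes std::regex and std::wregex from the C++11 standard library as stand-ins for boost::regex and boost::wregex, since the standard library duplicates the standardized constants inside basic_regex in the same way. The function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Assumption: std::regex / std::wregex stand in for boost::regex /
// boost::wregex. The same constant is reachable through the namespace
// and through either basic_regex specialization.
bool scoped_names_agree()
{
    return std::regex::icase == std::regex_constants::icase
        && std::wregex::icase == std::regex_constants::icase;
}

// Builds the same expression with the flag spelled two ways and checks
// that both objects give the same answer for the given text.
bool same_result_either_scope(const std::string& text)
{
    const std::regex a("abc", std::regex_constants::icase);
    const std::regex b("abc", std::regex::icase);
    return std::regex_match(text, a) == std::regex_match(text, b);
}
```

Which spelling to use is a matter of taste; the class-scoped form is shorter when a basic_regex typedef is already in scope.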
Element | Effect if set |
normal | Specifies that the grammar recognized by the regular expression engine uses its normal semantics: that is the same as that given in the ECMA-262, ECMAScript Language Specification, Chapter 15 part 10, RegExp (Regular Expression) Objects (FWD.1). boost.regex also recognizes most perl-compatible extensions in this mode. |
icase | Specifies that matching of regular expressions against a character container sequence shall be performed without regard to case. |
nosubs | Specifies that when a regular expression is matched against a character container sequence, then no sub-expression matches are to be stored in the supplied match_results structure. |
optimize | Specifies that the regular expression engine should pay more attention to the speed with which regular expressions are matched, and less to the speed with which regular expression objects are constructed. Otherwise it has no detectable effect on the program output. This currently has no effect for boost.regex. |
collate | Specifies that character ranges of the form "[a-b]" should be locale sensitive. |
ECMAScript | The same as normal. |
JavaScript | The same as normal. |
JScript | The same as normal. |
basic | Specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX basic regular expressions in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Base Definitions and Headers, Section 9, Regular Expressions (FWD.1). |
extended | Specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX extended regular expressions in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Base Definitions and Headers, Section 9, Regular Expressions (FWD.1). |
awk | Specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX utility awk in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Shells and Utilities, Section 4, awk (FWD.1). That is to say: the same as POSIX extended syntax, but with escape sequences in character classes permitted. |
grep | Specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX utility grep in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Shells and Utilities, Section 4, Utilities, grep (FWD.1). That is to say, the same as POSIX basic syntax, but with the newline character acting as an alternation character in addition to "|". |
egrep | Specifies that the grammar recognized by the regular expression engine is the same as that used by POSIX utility grep when given the -E option in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX), Shells and Utilities, Section 4, Utilities, grep (FWD.1). That is to say, the same as POSIX extended syntax, but with the newline character acting as an alternation character in addition to "|". |
sed | The same as basic. |
perl | The same as normal. |
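The effect of choosing a different grammar element can be illustrated with a small sketch that compiles the same pattern text under two grammars. It assumes std::regex from the C++11 standard library as a stand-in for boost::regex, restricted to the standardized flags; the helper name match_with and the pattern "a+" are hypothetical. Under the POSIX extended grammar "+" is a repetition operator, while under the POSIX basic grammar it is an ordinary character.

```cpp
#include <regex>
#include <string>

// Assumption: std::regex stands in for boost::regex.
// Compiles the hypothetical pattern "a+" under the requested grammar
// and reports whether it matches the whole of text.
bool match_with(const std::string& text,
                std::regex_constants::syntax_option_type grammar)
{
    const std::regex re("a+", grammar);
    return std::regex_match(text, re);
}
```

Calling match_with("aa", std::regex_constants::extended) treats the pattern as "one or more a", whereas match_with("a+", std::regex_constants::basic) matches the two literal characters "a+".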
The following constants are specific to this particular regular expression implementation and do not appear in the regular expression standardization proposal:
regbase::escape_in_lists | Allows the use of the escape "\" character in sets of characters, for example [\]] represents the set of characters containing only "]". If this flag is not set then "\" is an ordinary character inside sets. |
regbase::char_classes | When this bit is set, character classes [:classname:] are allowed inside character set declarations, for example "[[:word:]]" represents the set of all characters that belong to the character class "word". |
regbase::intervals | When this bit is set, repetition intervals are allowed, for example "a{2,4}" represents a repeat of between 2 and 4 letter a's. |
regbase::limited_ops | When this bit is set, all of "+", "?" and "|" are ordinary characters in all situations. |
regbase::newline_alt | When this bit is set, the newline character "\n" has the same effect as the alternation operator "|". |
regbase::bk_plus_qm | When this bit is set, "\+" represents the one or more repetition operator and "\?" represents the zero or one repetition operator. When this bit is not set, "+" and "?" are used instead. |
regbase::bk_braces | When this bit is set, "\{" and "\}" are used for bounded repetitions and "{" and "}" are normal characters. This is the opposite of default behavior. |
regbase::bk_parens | When this bit is set, "\(" and "\)" are used to group sub-expressions and "(" and ")" are ordinary characters. This is the opposite of default behavior. |
regbase::bk_refs | When this bit is set, back references are allowed. |
regbase::bk_vbar | When this bit is set, "\|" represents the alternation operator and "|" is an ordinary character. This is the opposite of default behavior. |
regbase::use_except | When this bit is set, a bad_expression exception will be thrown on error. Use of this flag is deprecated: basic_regex will always throw on error. |
regbase::failbit | This bit is set on error. If regbase::use_except is not set, this bit should be checked to see whether a regular expression is valid before usage. |
regbase::literal | All characters in the string are treated as literals; there are no special characters or escape sequences. |
regbase::emacs | Provides compatibility with the emacs editor; equivalent to: bk_braces | bk_parens | bk_refs | bk_vbar. |
Revised 24 Oct 2003
© Copyright John Maddock 1998-2003
Use, modification and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)