Table 11. Regular expressions support
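Expression | Meaning |
---|---|
`x` | Match any character `x` |
`.` | Match any character except newline (or optionally any character) |
`"..."` | All characters between the double quotes are taken as literals, except escape sequences |
`[xyz]` | A character class; in this case matches `x`, `y` or `z` |
`[abj-oZ]` | A character class with a range in it; matches `a`, `b`, any letter from `j` through `o`, or `Z` |
`[^A-Z]` | A negated character class, i.e. any character but those in the class. In this case, any character except an uppercase letter |
`r*` | Zero or more r's (greedy), where r is any regular expression |
`r*?` | Zero or more r's (abstemious), where r is any regular expression |
`r+` | One or more r's (greedy) |
`r+?` | One or more r's (abstemious) |
`r?` | Zero or one r's (greedy), i.e. optional |
`r??` | Zero or one r's (abstemious), i.e. optional |
`r{2,5}` | Anywhere between two and five r's (greedy) |
`r{2,5}?` | Anywhere between two and five r's (abstemious) |
`r{2,}` | Two or more r's (greedy) |
`r{2,}?` | Two or more r's (abstemious) |
`r{4}` | Exactly four r's |
`{NAME}` | The macro `NAME` (see below) |
`"[xyz]\"foo"` | The literal string `[xyz]\"foo` |
`\X` | If X is `a`, `b`, `e`, `n`, `r`, `f`, `t` or `v`, the ANSI-C interpretation of `\X`. Otherwise a literal `X` (used to escape operators such as `*`) |
`\0` | A NUL character (ASCII code 0) |
`\123` | The character with octal value 123 |
`\x2a` | The character with hexadecimal value 2a |
`\cX` | A named control character `X` |
`\a` | A shortcut for Alert (bell) |
`\b` | A shortcut for Backspace |
`\e` | A shortcut for ESC (escape character `0x1b`) |
`\n` | A shortcut for newline |
`\r` | A shortcut for carriage return |
`\f` | A shortcut for form feed |
`\t` | A shortcut for horizontal tab |
`\v` | A shortcut for vertical tab |
`\d` | A shortcut for `[0-9]` |
`\D` | A shortcut for `[^0-9]` |
`\s` | A shortcut for `[\x20\t\n\r\f\v]` |
`\S` | A shortcut for `[^\x20\t\n\r\f\v]` |
`\w` | A shortcut for `[a-zA-Z0-9_]` |
`\W` | A shortcut for `[^a-zA-Z0-9_]` |
`(r)` | Match an `r`; parentheses are used to override precedence (see below) |
`(?r-s:pattern)` | Apply option 'r' and omit option 's' while interpreting pattern. Options may be zero or more of the characters 'i' or 's'. 'i' means case-insensitive. '-i' means case-sensitive. 's' alters the meaning of the '.' syntax to match any single character whatsoever. '-s' alters the meaning of '.' to match any character except '\n' |
`rs` | The regular expression `r` followed by the regular expression `s` (a sequence) |
`r\|s` | Either an `r` or an `s` |
`^r` | An `r`, but only at the beginning of a line (i.e. when just starting to scan, or right after a newline has been scanned) |
`r$` | An `r`, but only at the end of a line (i.e. just before a newline) |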
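The difference between greedy and abstemious repetition is easiest to see by comparing what each form matches on the same input. The following is a minimal sketch under one loud assumption: it uses `std::regex` (ECMAScript grammar) as a stand-in engine, which is not the engine used by the lexer but offers the same `*`/`*?`, `+`/`+?` and `{n,m}`/`{n,m}?` pairs. The helper name `first_match` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of `pattern`
// in `input`, or an empty string if nothing matches.
// Assumption: std::regex (ECMAScript grammar) stands in for the lexer's
// own regular expression engine.
std::string first_match(const std::string& pattern, const std::string& input)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(input, m, re) ? m.str() : std::string();
}
```

With this helper, `first_match("<.*>", "<b>bold</b>")` yields the whole input, while `first_match("<.*?>", "<b>bold</b>")` stops at `<b>`. Likewise `a{2,5}` takes five characters from `aaaaaaa`, whereas `a{2,5}?` takes only two.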
Note: POSIX character classes are not currently supported, due to performance issues when creating them in wide character mode.
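Tip: If you want to build tokens for syntaxes that recognize items like quotes (`'`, `"`) and backslashes (`\`), the definitions below are a starting point. Keep in mind that both C++ string literals and regular expressions use `\` as an escape character, so a special character may need escaping at both levels.

```cpp
quote1 = "'";                    // match a single '
quote2 = "\\\"";                 // match a single "
literal_quote1 = "\\\\'";        // match a backslash followed by a single '
literal_quote2 = "\\\\\\\"";     // match a backslash followed by a single "
literal_backslash = "\\\\\\\\";  // match two backslashes
```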
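To see the two escaping layers separately, it can help to look at the text that actually reaches the regular expression engine once the C++ compiler has processed a string literal. The sketch below compares conventionally escaped literals with C++11 raw string literals for two of the patterns from the tip. The function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>

// Each function returns the pattern text as the regex engine receives it,
// i.e. after the C++ compiler has resolved the string literal's escapes.
// All names here are hypothetical, chosen only for this illustration.

// Escaped literal: the engine receives 4 characters:  \ \ \ "
// (an escaped backslash followed by an escaped double quote).
std::string backslash_then_quote_escaped() { return "\\\\\\\""; }

// The same pattern as a raw string literal: only regex-level escapes remain.
std::string backslash_then_quote_raw() { return R"(\\\")"; }

// Escaped literal: the engine receives 4 backslashes, i.e. two escaped ones.
std::string two_backslashes_escaped() { return "\\\\\\\\"; }

// The same pattern as a raw string literal.
std::string two_backslashes_raw() { return R"(\\\\)"; }
```

Each escaped/raw pair produces an identical string of four characters. Raw string literals remove the C++ layer of escaping, which leaves only the escapes the regular expression itself requires.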
- `r*` has the highest precedence (`+`, `?` and `{n,m}` have the same precedence as `*`)
- `rs` (concatenation) has the next highest precedence
- `r|s` has the lowest precedence
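Operator grouping can be confirmed by checking which inputs a pattern accepts as a whole. The following sketch does that with `std::regex` (ECMAScript grammar). This is an assumption made for the sake of a self-contained example: `std::regex` is not the engine used by the lexer, and the helper name `matches_whole` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches the whole of `input`.
// Assumption: std::regex (ECMAScript grammar) is used as a stand-in engine.
bool matches_whole(const std::string& pattern, const std::string& input)
{
    return std::regex_match(input, std::regex(pattern));
}
```

Under `std::regex`, `ab*` accepts `abbb` but rejects `abab`, so the repetition applies to `b` alone, and parentheses are needed to repeat the sequence, as in `(ab)*`. Similarly, `a|bc` accepts `a` and `bc` but rejects `ac`, whereas `(a|b)c` accepts `ac`.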
Regular expressions can be given a name and referred to in rules using the syntax `{NAME}`, where `NAME` is the name you have given to the macro. A macro name can be at most 30 characters long and must start with a `_` or a letter. Subsequent characters can be `_`, `-`, a letter or a decimal digit.
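The naming rules above can be expressed as a small validation routine. The following is a sketch only: `is_valid_macro_name` is a hypothetical helper and not part of the library, and it assumes that "letter" means an ASCII letter as classified by `std::isalpha` in the default "C" locale.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the library: checks a candidate macro
// name against the stated rules (at most 30 characters, first character
// '_' or a letter, remaining characters '_', '-', a letter or a digit).
// Assumption: "letter" means what std::isalpha accepts in the "C" locale.
bool is_valid_macro_name(const std::string& name)
{
    const std::size_t max_length = 30;
    if (name.empty() || name.size() > max_length)
        return false;

    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (first != '_' && !std::isalpha(first))
        return false;

    for (std::size_t i = 1; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (c != '_' && c != '-' && !std::isalpha(c) && !std::isdigit(c))
            return false;
    }
    return true;
}
```

For example, `IDENT` and `_id-2` pass the check, while `2abc`, `-abc`, an empty name and any name longer than 30 characters are rejected.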