There are two ways to use Boost.Regex with Unicode strings:
If your platform's wchar_t type can hold Unicode strings, and your platform's C/C++ runtime correctly handles wide character constants (when passed to std::iswlower and similar functions), then you can use boost::wregex to process Unicode. However, there are several disadvantages to this approach:
It is not portable: there is no guarantee on the width of wchar_t, or even on whether the runtime treats wide characters as Unicode at all. Most Windows compilers do so, but many Unix systems do not.
If you have the ICU library, then Boost.Regex provides a distinct regular expression type (boost::u32regex) that supports both Unicode-specific character properties and the searching of text encoded in UTF-8, UTF-16, or UTF-32. See: ICU string class support.