This is an older version of Boost and was released in 2013. The current version is 1.89.0.
There are two ways to use Boost.Regex with Unicode strings:
If your platform's wchar_t type can hold Unicode strings, and your platform's C/C++ runtime correctly handles wide character constants (when passed to std::iswspace, std::iswlower, etc.), then you can use boost::wregex to process Unicode. However, there are several disadvantages to this approach:
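It is not portable: there is no guarantee on the width of wchar_t, or even that the runtime treats wide characters as Unicode at all. Most Windows compilers do so, but many Unix systems do not.
There is no support for Unicode-specific character classes such as [[:Nd:]], [[:Po:]], etc.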
If you have the ICU library, Boost.Regex can be configured to use it and to provide a distinct regular expression type, boost::u32regex, that supports both Unicode-specific character properties and the searching of text encoded in UTF-8, UTF-16, or UTF-32. See: ICU string class support.