Boost.Locale
Localized Text Formatting

The iostream manipulators are very useful, but when we create messages for the user, we sometimes need something like good old printf or boost::format.

Unfortunately, boost::format has several limitations in the context of localization:

  1. It renders all parameters using the global locale rather than the target ostream locale. For example:
        std::locale::global(std::locale("en_US.UTF-8"));
        output.imbue(std::locale("de_DE.UTF-8"));
        output << boost::format("%1%") % 1234.345;

    This would write "1,234.345" to output, instead of the "1.234,345" that is expected for the "de_DE" locale.
  2. It knows nothing about the new Boost.Locale manipulators.
  3. The printf-like syntax is very limited for formatting complex localized data; it does not allow the formatting of dates, times, or currencies.
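
The first limitation can be illustrated from the stream side with a minimal sketch that uses only the standard library. It assumes a hypothetical facet, german_like_punct, that mimics German-style punctuation so the example does not depend on which system locales are installed; render_with_stream_locale is likewise a name chosen for illustration. It shows that ordinary stream output honors the locale imbued in the stream, which is the behavior one would also expect from a formatter:

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet mimicking German-style number punctuation,
// so the sketch does not rely on an installed "de_DE.UTF-8" locale.
struct german_like_punct : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return '.'; }
    char do_decimal_point() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }  // groups of 3 digits
};

// Formats the value through a stream imbued with the custom facet.
// The global locale is never changed.
std::string render_with_stream_locale(double value) {
    std::ostringstream output;
    output.imbue(std::locale(std::locale::classic(), new german_like_punct));
    output << std::fixed << std::setprecision(3) << value;
    return output.str();
}
```

With these definitions, render_with_stream_locale(1234.345) yields "1.234,345" although the global locale is left untouched.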

Thus a new class, boost::locale::format, was introduced. For example:

    wcout << wformat(L"Today {1,date} I would meet {2} at home") % time(0) % name <<endl;

Each format specifier is enclosed within {} brackets. Its parameters are separated with a comma ",", and each parameter may have an additional option after an equals symbol '='. This option may be simple ASCII text or single-quoted localized text. If a single quote should be inserted within the text, it may be represented with a pair of single-quote characters.

Here is an example of a format string:

    "Ms. {1} had arrived at {2,ftime='%I o''clock'} at home. The exact time is {2,time=full}"

The syntax is described by the following grammar:

    format : '{' parameters '}' ;
    parameters: parameter | parameter ',' parameters;
    parameter : key ["=" value] ;
    key : [0-9a-zA-Z<>]+ ;
    value : ascii-string-excluding-"}"-and-"," | local-string ;
    local-string : quoted-text | quoted-text local-string;
    quoted-text : '[^']*' ;
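
One way to read this grammar is as a small hand-written splitter for the text between '{' and '}'. The sketch below is illustrative only and is not Boost.Locale's actual parser; the names parameter and parse_placeholder are hypothetical, and malformed input is not diagnosed. It shows how a doubled single quote inside a local-string becomes one literal quote:

```cpp
#include <string>
#include <utility>
#include <vector>

using parameter = std::pair<std::string, std::string>;  // key, value

// Splits the text between '{' and '}' into key/value parameters.
std::vector<parameter> parse_placeholder(const std::string& body) {
    std::vector<parameter> result;
    std::size_t i = 0;
    while (i < body.size()) {
        parameter p;
        // key: everything up to '=' or ','
        while (i < body.size() && body[i] != '=' && body[i] != ',')
            p.first += body[i++];
        if (i < body.size() && body[i] == '=') {
            ++i;  // skip '='
            if (i < body.size() && body[i] == '\'') {
                // local-string: consecutive quoted-text runs; each run
                // after the first stands for one literal single quote.
                bool first = true;
                while (i < body.size() && body[i] == '\'') {
                    if (!first) p.second += '\'';
                    first = false;
                    ++i;  // opening quote
                    while (i < body.size() && body[i] != '\'')
                        p.second += body[i++];
                    if (i < body.size()) ++i;  // closing quote
                }
            } else {
                // plain ASCII value: runs up to the next ','
                while (i < body.size() && body[i] != ',')
                    p.second += body[i++];
            }
        }
        result.push_back(p);
        if (i < body.size() && body[i] == ',') ++i;  // skip separator
    }
    return result;
}
```

For the placeholder body 2,ftime='%I o''clock' this sketch yields the pairs ("2", "") and ("ftime", "%I o'clock").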

You can include literal '{' and '}' characters by doubling them as "{{" or "}}" in the text. For example:

    cout << format(translate("Unexpected `{{' in line {1} in file {2}")) % pos % file;

would display something like:

    Unexpected `{' in line 5 in file source.cpp

The following format key-value pairs are supported:

  • [0-9]+ -- digits, the index of the formatted parameter -- required.
  • num or number -- format a number. Options are:
    • hex -- display in hexadecimal format
    • oct -- display in octal format
    • sci or scientific -- display in scientific format
    • fix or fixed -- display in fixed format
      For example, number=sci
  • cur or currency -- format currency. Options are:
    • iso -- display using ISO currency symbol.
    • nat or national -- display using national currency symbol.
  • per or percent -- format a percentage value.
  • date, time, datetime or dt -- format a date, a time, or a date and time. Options are:
    • s or short -- display in short format.
    • m or medium -- display in medium format.
    • l or long -- display in long format.
    • f or full -- display in full format.
  • ftime with string (quoted) parameter -- display as with strftime. See the as::ftime manipulator.
  • spell or spellout -- spell the number.
  • ord or ordinal -- format an ordinal number (1st, 2nd, etc.).
  • left or < -- align-left.
  • right or > -- align-right.
  • width or w -- set field width (requires parameter).
  • precision or p -- set precision (requires parameter).
  • locale -- with parameter -- switch locales for the current operation. This command generates a locale with formatting facets, giving more fine-grained control of formatting. For example:
        cout << format("This article was published at {1,date=l} (Gregorian) {1,locale=he_IL@calendar=hebrew,date=l} (Hebrew)") % date;
    
  • timezone or tz -- the name of the timezone to display the time in. For example:
        cout << format("Time is: Local {1,time}, ({1,time,tz=EET} Eastern European Time)") % date;
    
  • local -- display the time in local time.
  • gmt -- display the time in the UTC time scale. For example:
        cout << format("Local time is: {1,time,local}, universal time is {1,time,gmt}") % time;
    

The constructor for the format class can take an object of type message, simplifying integration with message translation code.

For example:

    cout<< format(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b) << endl;

A formatted string can be fetched directly by using the str(std::locale const &loc=std::locale()) member function. For example:

    std::wstring de = (wformat(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b)).str(de_locale);
    std::wstring fr = (wformat(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b)).str(fr_locale);

Note:
There is one significant difference between boost::format and boost::locale::format: Boost.Locale's format converts its parameters only when it is written to an ostream or when the str() member function is called. Until then, it only stores references to the parameter objects, which must be writable to a stream.

This is generally not a problem when all operations are done in one statement, such as:

    cout << format("Adding {1} to {2}, we get {3}") % a % b % (a+b);

This works because the temporary value of (a+b) exists until the formatted data is actually written to the stream. But the following code is wrong:

    format fmt("Adding {1} to {2}, we get {3}");
    fmt % a;
    fmt % b;
    fmt % (a+b);
    cout << fmt;

This fails because the temporary value of (a+b) no longer exists when fmt is written to the stream. A correct solution would be:

    format fmt("Adding {1} to {2}, we get {3}");
    fmt % a;
    fmt % b;
    int a_plus_b = a+b;
    fmt % a_plus_b;
    cout << fmt;
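
The lifetime rule can also be illustrated without Boost.Locale by a small stand-in. The class deferred_ints below is hypothetical and is not how boost::locale::format is implemented; it only assumes, as described in the note, that arguments are stored by reference and converted when the string is requested. The helper bind_then_modify is likewise an illustrative name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a formatter that keeps references
// (here: pointers) to its arguments and converts them lazily.
class deferred_ints {
public:
    deferred_ints& operator%(const int& value) {
        // Stores the address, not a copy. Passing a temporary here
        // would leave a dangling pointer after the statement ends.
        args_.push_back(&value);
        return *this;
    }
    // Conversion happens here, so every bound object must still be alive.
    std::string str() const {
        std::ostringstream out;
        for (std::size_t i = 0; i < args_.size(); ++i) {
            if (i != 0) out << ' ';
            out << *args_[i];
        }
        return out.str();
    }
private:
    std::vector<const int*> args_;
};

// Binds named variables only, then changes one before rendering.
std::string bind_then_modify() {
    int a = 1, b = 2;
    int a_plus_b = a + b;  // named object that outlives the rendering
    deferred_ints fmt;
    fmt % a;
    fmt % b;
    fmt % a_plus_b;
    a = 10;                // visible in the output: conversion is deferred
    return fmt.str();
}
```

In this sketch bind_then_modify() returns "10 2 3": the change to a after binding shows up in the result, which is the same deferred conversion that makes a bound temporary unsafe.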