Boost C++ Libraries



Character sets

Character set refresher

MySQL defines a character set as "a set of symbols and their respective encodings". ascii, latin1, utf8 and utf16 are character sets supported by MySQL.
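To illustrate the "symbols and their respective encodings" idea, here is a minimal sketch that re-encodes a latin1 string as UTF-8. The function name latin1_to_utf8 is hypothetical and not part of MySQL or this library. It assumes ISO-8859-1 semantics (byte value N maps to code point U+00NN); MySQL's latin1 may differ from this in the 0x80-0x9F range, so treat it as an approximation.

```cpp
#include <string>

// Hypothetical helper: re-encodes a latin1 string as UTF-8.
// Assumption: ISO-8859-1 semantics, where byte value N is code point U+00NN.
std::string latin1_to_utf8(const std::string& input)
{
    std::string output;
    output.reserve(input.size());
    for (unsigned char ch : input)
    {
        if (ch < 0x80)
        {
            // ASCII range: both character sets use the same single byte
            output.push_back(static_cast<char>(ch));
        }
        else
        {
            // U+0080..U+00FF: UTF-8 needs two bytes, 110xxxxx 10xxxxxx
            output.push_back(static_cast<char>(0xC0 | (ch >> 6)));
            output.push_back(static_cast<char>(0x80 | (ch & 0x3F)));
        }
    }
    return output;
}
```

Under these assumptions, the symbol ñ is the single byte 0xF1 in latin1 and the two bytes 0xC3 0xB1 in UTF-8: the same symbol, two different encodings.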

A collation is a set of rules for comparing characters in a character set. For example, a case-insensitive collation makes strings that differ only in case compare equal. Every collation is associated with exactly one character set. For example, utf8_spanish_ci is a case-insensitive collation associated with the utf8 character set. Every character set has a default collation, which is used when a character set is specified without a collation. For example, latin1_swedish_ci is the default collation for the latin1 character set.
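The effect of a case-insensitive collation can be approximated with a small comparison function. This is only a sketch: equal_ascii_ci is a hypothetical name, it folds case for ASCII letters only, and real MySQL collations also define ordering, accent handling and rules for multi-byte characters.

```cpp
#include <cstddef>
#include <string>

// Lowercases a single ASCII letter; any other byte is returned unchanged
char ascii_lower(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: compares two strings, ignoring ASCII case differences.
// Non-ASCII bytes are compared exactly, unlike a real collation.
bool equal_ascii_ci(const std::string& lhs, const std::string& rhs)
{
    if (lhs.size() != rhs.size())
        return false;
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        if (ascii_lower(lhs[i]) != ascii_lower(rhs[i]))
            return false;
    }
    return true;
}
```

With this function, "Hello" and "hELLO" compare equal, which is the behavior a _ci collation gives to string comparisons on the server.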

You can find more information about these concepts in the official MySQL docs on character sets.

The connection character set and collation

Every connection has an associated character set and collation. The connection's character set determines the encoding for character strings sent to and retrieved from the server. This includes SQL query strings, string fields and column names in metadata. The connection's collation is used for string literal comparison.

Every session you establish can have its own different character set and collation. You can specify this in two ways:

results result;
conn.execute("SET NAMES utf8mb4", result);
// Further operations can assume utf8mb4 as conn's charset
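If the character set name comes from configuration rather than a string literal, the statement text can be built with a small validating helper. The sketch below is an assumption-laden illustration: make_set_names and is_safe_identifier are hypothetical names, not part of this library, and the check simply restricts names to ASCII letters, digits and underscores so that untrusted input cannot alter the statement.

```cpp
#include <stdexcept>
#include <string>

// Returns true if name is non-empty and has only ASCII letters, digits and underscores
bool is_safe_identifier(const std::string& name)
{
    if (name.empty())
        return false;
    for (char c : name)
    {
        const bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                        (c >= '0' && c <= '9') || c == '_';
        if (!ok)
            return false;
    }
    return true;
}

// Hypothetical helper: builds "SET NAMES <charset>", optionally followed by
// "COLLATE <collation>". Throws std::invalid_argument on suspicious names.
std::string make_set_names(const std::string& charset, const std::string& collation = "")
{
    if (!is_safe_identifier(charset))
        throw std::invalid_argument("invalid character set name");
    std::string stmt = "SET NAMES " + charset;
    if (!collation.empty())
    {
        if (!is_safe_identifier(collation))
            throw std::invalid_argument("invalid collation name");
        stmt += " COLLATE " + collation;
    }
    return stmt;
}
```

The returned string could then be passed to conn.execute in place of the literal used in the snippet above.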

character_set_results and character_set_client

Both of the above methods are shortcuts to set several session-level variables. The ones that impact this library's behavior are character_set_client and character_set_results.

The table below summarizes the encoding used by each piece of functionality in this library:

| Functionality | Encoding given by... |
|---|---|
| SQL query strings passed to connection::execute and connection::prepare_statement | character_set_client |
| String values passed as parameters to statement::bind | character_set_client |
| String fields retrieved by connection::execute or connection::read_some_rows: field_view::as_string, field_view::get_string | character_set_results |
| Metadata strings: metadata::database, metadata::table, metadata::original_table, metadata::column_name, metadata::original_column_name | character_set_results |
| Server-generated error messages: diagnostics::server_message | character_set_results |
| Informational messages: results::info, execution_state::info | ASCII. These can only contain ASCII characters and are always ASCII encoded. More info in this section. |
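
Because informational messages are always ASCII, code that handles them can treat each byte as one character. The following is a minimal sketch of a check that could back a debug assertion on such strings; is_ascii is a hypothetical helper, not a function provided by this library.

```cpp
#include <string>

// Hypothetical helper: returns true if every byte in text is in the ASCII range (0x00..0x7F)
bool is_ascii(const std::string& text)
{
    for (unsigned char ch : text)
    {
        if (ch > 0x7F)
            return false;
    }
    return true;
}
```

Strings governed by character_set_results, such as field values or metadata, carry no such guarantee and may fail this check whenever they contain non-ASCII characters.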

