...one of the most highly
regarded and expertly designed C++ library projects in the
world.
— Herb Sutter and Andrei
Alexandrescu, C++
Coding Standards
This section briefly explains the library main classes and functions, and how to use them.
connection
is the most important class in the library. A connection is an I/O object,
templated on a Stream
type. A connection
contains
an instance of that Stream
type and additional state required by the protocol. connection
's
constructor takes the same arguments as the underlying Stream
constructor.
The library defines some typedefs to make things less verbose. The most common
one is tcp_ssl_connection
.
In this case, Stream
is
boost::asio::ssl::stream<boost::asio::ip::tcp::socket>
, which can be constructed from a boost::asio::any_io_executor
and a boost::asio::ssl::context
:
// The execution context, required to run I/O operations. boost::asio::io_context ctx; // The SSL context, required to establish TLS connections. // The default SSL options are good enough for us at this point. boost::asio::ssl::context ssl_ctx(boost::asio::ssl::context::tls_client); // Represents a connection to the MySQL server. boost::mysql::tcp_ssl_connection conn(ctx.get_executor(), ssl_ctx);
Typedefs for other transports are also available. See this section for more info.
The MySQL client/server protocol is session-oriented. Before anything else,
you must perform session establishment, usually by calling connection::connect
.
This function performs two actions:
connect()
on the underlying Stream
object. For a tcp_ssl_connection
,
this establishes the TCP connection.
connection::connect
takes two parameters, one for each of the above actions:
boost::asio::ip::tcp::endpoint
. For UNIX sockets, it's a
boost::asio::local::stream_protocol::endpoint
. For TCP, we can resolve a
string hostname and port into an endpoint using a resolver
object.
handshake_params
to use for the handshake operation. This parameter doesn't depend on
the Stream
type. See
this section for more info.
// Resolve the hostname to get a collection of endpoints boost::asio::ip::tcp::resolver resolver(ctx.get_executor()); auto endpoints = resolver.resolve(argv[3], boost::mysql::default_port_string); // The username and password to use boost::mysql::handshake_params params( argv[1], // username argv[2], // password "boost_mysql_examples" // database ); // Connect to the server using the first endpoint returned by the resolver conn.connect(*endpoints.begin(), params);
Note that connection::connect
can only be used with socket-like streams. If your stream is not a socket,
you must use the lower-level connection::handshake
function. Please read this
section for more info.
The two main ways to use a connection are text queries and prepared statements:
Feature |
Used for... |
Code |
---|---|---|
Text queries: |
Simple queries, without parameters:
|
results result; conn.query("START TRANSACTION", result); |
Prepared statements: |
Queries with parameters unknown at compile-time. |
statement stmt = conn.prepare_statement( "SELECT first_name FROM employee WHERE company_id = ? AND salary > ?" ); results result; conn.execute_statement(stmt, std::make_tuple("HGS", 30000), result); |
When you execute a text query or a prepared statement, you get a results
object, which will be the subject
of the next section. We will delve deeper into prepared statements later.
In MySQL, a resultset referes to the results generated
by a SQL query. The results
class is an in-memory reprentation of a MySQL resultset. The following diagram
shows an approximate representation of what a resultset looks like:
We can see that a resultset is composed of three pieces of information:
results::rows
.
We'll expand on this later.
results::meta
.
See this section for more info.
results::affected_rows
)
or the number of warnings generated by the query (results::warning_count
).
You can obtain a results
object by executing a text query (connection::query
)
or a prepared statement (connection::execute_statement
).
All SQL statements generate resultsets. Statements that generate no rows,
like INSERT
s, generate empty
resultsets (i.e. result.rows().empty() == true
).
The interface to execute SELECT
s
and INSERT
s is the same.
We saw that results::rows
returns a matrix-like data structure containing the rows retrieved by SQL
query. This library defines six data structures to represent MySQL data:
field
The smallest unit of data. A single "cell" in a MySQL table. This is an owning, variant-like type.
field_view
Like field
, but non-owning.
row
An owning, vector
-like
collection of fields.
row_view
Like row
, but non-owning.
rows
An owning, matrix-like collection of fields. Represents several rows of the same size in an optimized way.
rows_view
Like rows
, but non-owning.
results::rows
returns a rows_view
object. The memory for the rows is owned by the results
object. Indexing the returned view also returns view objects:
// Populate a results object results result; conn.query("SELECT 'Hello world'", result); // results::rows() returns a rows_view. The underlying memory is owned by the results object rows_view all_rows = result.rows(); // Indexing a rows_view yields a row_view. The underlying memory is owned by the results object row_view first_row = all_rows.at(0); // Indexing a row_view yields a field_view. The underlying memory is owned by the results object field_view first_field = first_row.at(0); // Contains the string "Hello world"
Views behave similarly to std::string_view
.
You must make sure that you don't use a view after the storage it points
to has gone out of scope. In this case, you must not use any of the views
after the results
object
has gone out of scope.
As it happens with std::string_view
, you can take ownership of
a view using its owning counterpart:
// You may use all_rows_owning after result has gone out of scope rows all_rows_owning{all_rows}; // You may use first_row_owning after result has gone out of scope row first_row_owning{first_row}; // You may use first_field_owning after result has gone out of scope field first_field_owning{first_field};
field
and field_view
are specialized variant-like types that can hold any type you may find in
a MySQL table. Once you obtain a field, you can access its contents using
the following functions:
kind
,
which returns a field_kind
enum.
field::is_xxx
.
field::as_xxx
and field::get_xxx
. The as_xxx
functions are checked (they will throw an exception if the actual type
doesn't match), while the get_xxx
are unchecked (they result in undefined behavior on type mismatch).
For example:
results result; conn.query("SELECT 'abc', 42", result); // Obtain a field's underlying value using the is_xxx and get_xxx accessors field_view f = result.rows().at(0).at(0); // f points to the string "abc" if (f.is_string()) { // we know it's a string, unchecked access string_view s = f.get_string(); std::cout << s << std::endl; // Use the string as required } else { // Oops, something went wrong - schema msimatch? } // Alternative: use the as_xxx accessor f = result.rows().at(0).at(1); std::int64_t value = f.as_int64(); // Checked access. Throws if f doesn't contain an int std::cout << value << std::endl; // Use the int as required
NULL
values are represented
as field objects having kind() == field_kind::null
.
You can check whether a value is NULL
or not using is_null
.
This is how NULL
s are typically
handled:
results result; // Create some test data conn.query( R"%( CREATE TEMPORARY TABLE products ( id VARCHAR(50) PRIMARY KEY, description VARCHAR(256) ) )%", result ); conn.query("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', NULL)", result); // Retrieve the data. Note that some fields are NULL conn.query("SELECT id, description FROM products", result); for (row_view r : result.rows()) { field_view description_fv = r.at(1); if (description_fv.is_null()) { // Handle the NULL value // Note: description_fv.is_string() will return false here; NULL is represented as a separate // type std::cout << "No description for product_id " << r.at(0) << std::endl; } else { // Handle the non-NULL case. Get the underlying value and use it as you want // If there is any schema mismatch (and description was not defined as VARCHAR), this will // throw string_view description = description_fv.as_string(); // Use description as required std::cout << "product_id " << r.at(0) << ": " << description << std::endl; } }
Every MySQL type is mapped to a single C++ type. The following table shows these mappings:
|
C++ type |
MySQL types |
|
|
|
---|---|---|---|---|---|
|
|
||||
|
|
Unsigned |
|||
|
|
|
|||
|
|
||||
|
|
||||
|
|
||||
|
|||||
|
|||||
|
|||||
|
Any of the above, when they're |
This section contains more information about fields.
Until now, we've used simple text queries that did not contain any user-provided input. In the real world, most queries will contain some piece of user-provided input.
One approach could be to use string concatenation to construct a SQL query
from user input, and then execute it using query()
. Avoid this approach as much as possible,
as it can lead to SQL injection vulnerabilities.
Instead, use prepared statements.
Prepared statements are server-side objects that represent a parameterized
query. A statement is represented using the statement
class, which is a lightweight object holding a handle to the server-side
prepared statement.
Let's say you've got an inventory table, and you're writing a command-line program to get products by ID. You've got the following table definition:
results result; conn.query( R"%( CREATE TEMPORARY TABLE products ( id VARCHAR(50) PRIMARY KEY, description VARCHAR(256) ) )%", result ); conn.query("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', 'Carrots')", result);
You can prepare a statement to retrieve products by ID using:
statement stmt = conn.prepare_statement("SELECT description FROM products WHERE id = ?");
You can execute the statement using connection::execute_statement
:
// Obtain the product_id from the user. product_id is untrusted input const char* product_id = argv[2]; // Execute the statement results result; conn.execute_statement(stmt, std::make_tuple(product_id), result); // Use result as required
The statement
object is passed
as first parameter, which tells the server which statement it should execute.
Actual parameters are passed as the second argument, as a std::tuple
.
You must pass as many parameters as ?
placeholders the statement has.
To learn more about prepared statements, please refer to this section.
Until now, we've been using connection::query
and connection::execute_statement
,
which execute some SQL and read all generated data into an in-memory results
object.
Some use cases may not fit in this simple pattern. For example:
TEXT
or BLOB
fields, it may
not be adequate to copy these values from the network buffer into the
results
object. A view-based
approach may be better.
For these cases, we can break the query()
or execute_statement()
operation into several steps, using a
multi-function operation (the term is coined by this
library). This example reads an entire table in batches, which can be the
case in an ETL process:
// Create the table and some sample data // In a real system, body may be megabaytes long. results result; conn.query( R"%( CREATE TEMPORARY TABLE posts ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR (256), body TEXT ) )%", result ); conn.query( R"%( INSERT INTO posts (title, body) VALUES ('Post 1', 'A very long post body'), ('Post 2', 'An even longer post body') )%", result ); // execution_state stores state about our operation, and must be passed to all functions execution_state st; // Writes the query request and reads the server response, but not the rows conn.start_query("SELECT title, body FROM posts", st); // Reads all the returned rows, in batches. // st.complete() returns true once there are no more rows to read while (!st.complete()) { // row_batch will be valid until conn performs the next network operation rows_view row_batch = conn.read_some_rows(st); for (row_view post : row_batch) { // Process post as required std::cout << "Title:" << post.at(0) << std::endl; } }
![]() |
Warning |
---|---|
Once you start a multi-function operation with |
Multi-function operations are powerful but complex. Only use them when there is a strong reason to do so. Please refer to this section for more information on these operations.
The functions we've been using communicate errors throwing exceptions. There are also non-throwing overloads that use error codes.
If the server fails to fulfill a request (for example, because the provided
SQL was invalid or a referenced table didn't exist), the operation is considered
failed and will return an error. The server provides an error message that
can be accessed using the diagnostics
class. For example:
error_code ec; diagnostics diag; results result; // The provided SQL is invalid. The server will return an error. // ec will be set to a non-zero value conn.query("this is not SQL!", result, ec, diag); if (ec) { // The error code will likely report a syntax error std::cout << "Operation failed with error code: " << ec << '\n'; // diag.server_message() will contain the classic phrase // "You have an error in your SQL syntax; check the manual..." // Bear in mind that server_message() may contain user input, so treat it with caution std::cout << "Server diagnostics: " << diag.server_message() << std::endl; }
With exceptions, this would be:
try { // The provided SQL is invalid. This function will throw an exception. results result; conn.query("this is not SQL!", result); } catch (const error_with_diagnostics& err) { // error_with_diagnostics contains an error_code and a diagnostics object. // It inherits from boost::system::system_error. std::cout << "Operation failed with error code: " << err.code() << '\n' << "Server diagnostics: " << err.get_diagnostics().server_message() << std::endl; }
As with Boost.Asio, every sync operation has an async counterpart. This library
follows Asio's async model, so you may use async operations with any valid
Asio CompletionToken
, including
callbacks and coroutines.
For example, if you can use C++20, you can write:
// Using this CompletionToken, you get C++20 coroutines that communicate // errors with error_codes. This way, you can access the diagnostics object. constexpr auto token = boost::asio::as_tuple(boost::asio::use_awaitable); // Run our query as a coroutine diagnostics diag; results result; auto [ec] = co_await conn.async_query("SELECT 'Hello world!'", result, diag, token); // This will throw an error_with_diagnostics in case of failure boost::mysql::throw_on_error(ec, diag);
The examples section contains material that can help you. This section also provides more info on this topic.
At any given point in time, a connection
may only have a single async operation outstanding. This is because the connection
uses the underlying Stream
object directly, without any locking or queueing. If you perform several
async operations concurrently on a single connection without any serialization,
the stream may interleave reads and writes from different operations, leading
to undefined behavior.
For example, connection::async_query
performs both reads and writes. Doing this is illegal and should be avoided:
// Coroutine body // DO NOT DO THIS!!!! results result1, result2; co_await ( conn.async_query("SELECT 1", result1, boost::asio::use_awaitable) && conn.async_query("SELECT 2", result2, boost::asio::use_awaitable) );
If you need to perform queries in parallel, open more connections to the server.