...one of the most highly
regarded and expertly designed C++ library projects in the
world.
— Herb Sutter and Andrei
Alexandrescu, C++
Coding Standards
This section briefly explains the library main classes and functions, and how to use them.
connection
is the most important class in the library. A connection is an I/O object,
templated on a Stream
type. A connection
contains
an instance of that Stream
type and additional state required by the protocol. connection
's
constructor takes the same arguments as the underlying Stream
constructor.
The library defines some typedefs to make things less verbose. The most common
one is tcp_ssl_connection
.
In this case, Stream
is
boost::asio::ssl::stream<boost::asio::ip::tcp::socket>
, which can be constructed from a boost::asio::any_io_executor
and a boost::asio::ssl::context
:
// The execution context, required to run I/O operations. boost::asio::io_context ctx; // The SSL context, required to establish TLS connections. // The default SSL options are good enough for us at this point. boost::asio::ssl::context ssl_ctx(boost::asio::ssl::context::tls_client); // Represents a connection to the MySQL server. boost::mysql::tcp_ssl_connection conn(ctx.get_executor(), ssl_ctx);
Typedefs for other transports are also available. See this section for more info.
The MySQL client/server protocol is session-oriented. Before anything else,
you must perform session establishment, usually by calling connection::connect
.
This function performs two actions:
connect()
on the underlying Stream
object. For a tcp_ssl_connection
,
this establishes the TCP connection.
connection::connect
takes two parameters, one for each of the above actions:
boost::asio::ip::tcp::endpoint
. For UNIX sockets, it's a
boost::asio::local::stream_protocol::endpoint
. For TCP, we can resolve a
string hostname and port into an endpoint using a resolver
object.
handshake_params
to use for the handshake operation. This parameter doesn't depend on
the Stream
type. See
this section for more info.
// Obtain the hostname to connect to - replace get_hostname by your code std::string server_hostname = get_hostname(); // Resolve the hostname to get a collection of endpoints boost::asio::ip::tcp::resolver resolver(ctx.get_executor()); auto endpoints = resolver.resolve(server_hostname, boost::mysql::default_port_string); // The username and password to use boost::mysql::handshake_params params( mysql_username, // username, as a string mysql_password, // password, as a string - don't hardcode this into your code! "boost_mysql_examples" // database to use ); // Connect to the server using the first endpoint returned by the resolver conn.connect(*endpoints.begin(), params);
Note that connection::connect
can only be used with socket-like streams. If your stream is not a socket,
you must use the lower-level connection::handshake
function. Please read this
section for more info.
The two main ways to use a connection are text queries and prepared statements.
You can access both using connection::execute
:
Feature |
Used for... |
Code |
---|---|---|
Text queries |
Simple queries, without parameters:
|
results result; conn.execute("START TRANSACTION", result); |
Prepared statements |
Queries with parameters unknown at compile-time. |
statement stmt = conn.prepare_statement( "SELECT first_name FROM employee WHERE company_id = ? AND salary > ?" ); results result; conn.execute(stmt.bind("HGS", 30000), result); |
There are two different interfaces to access the rows generated by a query
or statement. You can use the results
class to access rows using a dynamically-typed interface, using variant-like
objects to represent values retrieved from the server. On ther other hand,
static_results
is statically-typed. You specify the shape of your rows at compile-time,
and the library will parse the retrieved values for you into the types you
provide.
You can use almost every feature in this library (including text queries and prepared statements) with both interfaces.
For example, given the following table:
const char* table_definition = R"%( CREATE TEMPORARY TABLE posts ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR (256) NOT NULL, body TEXT NOT NULL ) )%";
This is how you would access its contents using either of the interfaces:
Interface |
Description |
Example |
---|---|---|
Dynamic interface: |
Variant based |
// Passing a results object to connection::execute selects the dynamic interface results result; conn.execute("SELECT id, title, body FROM posts", result); // Every row is a collection of fields, which are variant-like objects // that represent data. We use as_string() to cast them to the appropriate type for (row_view post : result.rows()) { std::cout << "Title: " << post.at(1).as_string() << "Body: " << post.at(2).as_string() << std::endl; } |
Static interface: |
Parses rows into your own types |
// We can use a plain struct with ints and strings to describe our rows. // This must be placed at the namespace level struct post { int id; std::string title; std::string body; }; // We use BOOST_DESCRIBE_STRUCT to add reflection capabilities to post. // We must list all the fields that should be populated by Boost.MySQL BOOST_DESCRIBE_STRUCT(post, (), (id, title, body))
// // This must be placed inside your function or method: // // Passing a static_results to execute() selects the static interface static_results<post> result; conn.execute("SELECT id, title, body FROM posts", result); // Query results are parsed directly into your own type for (const post& p : result.rows()) { std::cout << "Title: " << p.title << "Body: " << p.body << std::endl; } |
In MySQL, a resultset referes to the results generated
by a SQL query. When you execute a text query or a prepared statement, you
get back a resultset. We've already been using resultsets: the results
and static_results
classes are in-memory representations of a resultset.
A resultset is composed of three pieces of data:
The actual rows generated by the SQL query: results::rows
and static_results::rows
.
Information about the columns retrieved by the query: results::meta
and static_results::meta
.
There is one object per retrieved column. It provides information about
column names, types, uniqueness contraints...
Extra info on the execution of the operation, like the number of affected
rows (results::affected_rows
and static_results::affected_rows
)
or the last auto-generated ID for INSERT
statements (results::last_insert_id
and static_results::last_insert_id
).
All SQL statements generate resultsets. Statements that generate no rows,
like INSERT
s, generate empty
resultsets (i.e. result.rows().empty() == true
).
The interface to execute SELECT
s
and INSERT
s is the same.
Until now, we've used simple text queries that did not contain any user-provided input. In the real world, most queries will contain some piece of user-provided input.
One approach could be to use string concatenation to construct a SQL query
from user input, and then pass it to execute()
. Avoid this approach as much as possible,
as it can lead to SQL injection vulnerabilities.
Instead, use prepared statements.
Prepared statements are server-side objects that represent a parameterized
query. A statement is represented using the statement
class, which is a lightweight object holding a handle to the server-side
prepared statement.
Let's say you've got an inventory table, and you're writing a command-line program to get products by ID. You've got the following table definition:
results result; conn.execute( R"%( CREATE TEMPORARY TABLE products ( id VARCHAR(50) PRIMARY KEY, description VARCHAR(256) ) )%", result ); conn.execute("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', 'Carrots')", result);
You can prepare a statement to retrieve products by ID using connection::prepare_statement
:
statement stmt = conn.prepare_statement("SELECT description FROM products WHERE id = ?");
You can execute the statement using connection::execute
:
// Obtain the product_id from the user. product_id is untrusted input const char* product_id = get_value_from_user(); // Execute the statement results result; conn.execute(stmt.bind(product_id), result); // Use result as required
We used statement::bind
to provide actual parameters to the statement. You must pass as many parameters
to bind
as ?
placeholders the statement has.
To learn more about prepared statements, please refer to this section.
The functions we've been using communicate errors throwing exceptions. There are also non-throwing overloads that use error codes.
If the server fails to fulfill a request (for example, because the provided
SQL was invalid or a referenced table didn't exist), the operation is considered
failed and will return an error. The server provides an error message that
can be accessed using the diagnostics
class. For example:
error_code ec; diagnostics diag; results result; // The provided SQL is invalid. The server will return an error. // ec will be set to a non-zero value conn.execute("this is not SQL!", result, ec, diag); if (ec) { // The error code will likely report a syntax error std::cout << "Operation failed with error code: " << ec << '\n'; // diag.server_message() will contain the classic phrase // "You have an error in your SQL syntax; check the manual..." // Bear in mind that server_message() may contain user input, so treat it with caution std::cout << "Server diagnostics: " << diag.server_message() << std::endl; }
With exceptions, this would be:
try { // The provided SQL is invalid. This function will throw an exception. results result; conn.execute("this is not SQL!", result); } catch (const error_with_diagnostics& err) { // error_with_diagnostics contains an error_code and a diagnostics object. // It inherits from boost::system::system_error. std::cout << "Operation failed with error code: " << err.code() << '\n' << "Server diagnostics: " << err.get_diagnostics().server_message() << std::endl; }
As with Boost.Asio, every sync operation has an async counterpart. This library
follows Asio's async model, so you may use async operations with any valid
Asio CompletionToken
, including
callbacks and coroutines.
For example, if you can use C++20, you can write:
// Using this CompletionToken, you get C++20 coroutines that communicate // errors with error_codes. This way, you can access the diagnostics object. constexpr auto token = boost::asio::as_tuple(boost::asio::use_awaitable); // Run our query as a coroutine diagnostics diag; results result; auto [ec] = co_await conn.async_execute("SELECT 'Hello world!'", result, diag, token); // This will throw an error_with_diagnostics in case of failure boost::mysql::throw_on_error(ec, diag);
The examples section contains material that can help you. This section also provides more info on this topic.
At any given point in time, a connection
may only have a single async operation outstanding. This is because the connection
uses the underlying Stream
object directly, without any locking or queueing. If you perform several
async operations concurrently on a single connection without any serialization,
the stream may interleave reads and writes from different operations, leading
to undefined behavior.
For example, doing the following is illegal and should be avoided:
// Coroutine body // DO NOT DO THIS!!!! results result1, result2; co_await ( conn.async_execute("SELECT 1", result1, boost::asio::use_awaitable) && conn.async_execute("SELECT 2", result2, boost::asio::use_awaitable) );
If you need to perform queries in parallel, open more connections to the server.
Until now, we've been using connection::execute
,
which executes some SQL and reads all generated data into an in-memory object.
Some use cases may not fit in this simple pattern. For example:
TEXT
or BLOB
fields, it may
not be adequate to copy these values from the network buffer into the
results
object. A view-based
approach may be better.
For these cases, we can break the execute()
operation into several steps, using a
multi-function operation (the term is coined by this
library). This example reads an entire table in batches, which can be the
case in an ETL process:
// Create the table and some sample data // In a real system, body may be megabytes long. results result; conn.execute( R"%( CREATE TEMPORARY TABLE posts ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR (256), body TEXT ) )%", result ); conn.execute( R"%( INSERT INTO posts (title, body) VALUES ('Post 1', 'A very long post body'), ('Post 2', 'An even longer post body') )%", result ); // execution_state stores state about our operation, and must be passed to all functions execution_state st; // Writes the query request and reads the server response, but not the rows conn.start_execution("SELECT title, body FROM posts", st); // Reads all the returned rows, in batches. // st.complete() returns true once there are no more rows to read while (!st.complete()) { // row_batch will be valid until conn performs the next network operation rows_view row_batch = conn.read_some_rows(st); for (row_view post : row_batch) { // Process post as required std::cout << "Title:" << post.at(0) << std::endl; } }
Warning | |
---|---|
Once you start a multi-function operation with |
Multi-function operations are powerful but complex. Only use them when there is a strong reason to do so. Multi-function operations also work with the static interface. Please refer to this section for more information on these operations.