Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

PrevUpHomeNext

Overview

This section briefly explains the library main classes and functions, and how to use them.

connection is the most important class in the library. A connection is an I/O object, templated on a Stream type. A connection contains an instance of that Stream type and additional state required by the protocol. connection's constructor takes the same arguments as the underlying Stream constructor.

The library defines some typedefs to make things less verbose. The most common one is tcp_ssl_connection. In this case, Stream is boost::asio::ssl::stream<boost::asio::ip::tcp::socket>, which can be constructed from a boost::asio::any_io_executor and a boost::asio::ssl::context:

// The execution context, required to run I/O operations.
boost::asio::io_context ctx;

// The SSL context, required to establish TLS connections.
// The default SSL options are good enough for us at this point.
boost::asio::ssl::context ssl_ctx(boost::asio::ssl::context::tls_client);

// Represents a connection to the MySQL server.
boost::mysql::tcp_ssl_connection conn(ctx.get_executor(), ssl_ctx);

Typedefs for other transports are also available. See this section for more info.

The MySQL client/server protocol is session-oriented. Before anything else, you must perform session establishment, usually by calling connection::connect. This function performs two actions:

  • It establishes the "physical" connection, by calling connect() on the underlying Stream object. For a tcp_ssl_connection, this establishes the TCP connection.
  • It performs the handshake with the MySQL server. This is part of the MySQL client/server protocol. It performs authentication, sets session parameters like the default database to use, and performs the TLS handshake, if required.

connection::connect takes two parameters, one for each of the above actions:

  • The physical endpoint where the server is listening. For TCP streams, this is a boost::asio::ip::tcp::endpoint. For UNIX sockets, it's a boost::asio::local::stream_protocol::endpoint. For TCP, we can resolve a string hostname and port into an endpoint using a resolver object.
  • handshake_params to use for the handshake operation. This parameter doesn't depend on the Stream type. See this section for more info.
// Obtain the hostname to connect to - replace get_hostname by your code
std::string server_hostname = get_hostname();

// Resolve the hostname to get a collection of endpoints
boost::asio::ip::tcp::resolver resolver(ctx.get_executor());
auto endpoints = resolver.resolve(server_hostname, boost::mysql::default_port_string);

// The username and password to use
boost::mysql::handshake_params params(
    mysql_username,         // username, as a string
    mysql_password,         // password, as a string - don't hardcode this into your code!
    "boost_mysql_examples"  // database to use
);

// Connect to the server using the first endpoint returned by the resolver
conn.connect(*endpoints.begin(), params);

Note that connection::connect can only be used with socket-like streams. If your stream is not a socket, you must use the lower-level connection::handshake function. Please read this section for more info.

The two main ways to use a connection are text queries and prepared statements. You can access both using connection::execute:

Feature

Used for...

Code

Text queries

Simple queries, without parameters:

  • "START TRANSACTION"
  • "SET NAMES utf8"
  • "SHOW TABLES"
results result;
conn.execute("START TRANSACTION", result);

Prepared statements

Queries with parameters unknown at compile-time.

statement stmt = conn.prepare_statement(
    "SELECT first_name FROM employee WHERE company_id = ? AND salary > ?"
);

results result;
conn.execute(stmt.bind("HGS", 30000), result);

There are two different interfaces to access the rows generated by a query or statement. You can use the results class to access rows using a dynamically-typed interface, using variant-like objects to represent values retrieved from the server. On ther other hand, static_results is statically-typed. You specify the shape of your rows at compile-time, and the library will parse the retrieved values for you into the types you provide.

You can use almost every feature in this library (including text queries and prepared statements) with both interfaces.

For example, given the following table:

const char* table_definition = R"%(
    CREATE TEMPORARY TABLE posts (
        id INT PRIMARY KEY AUTO_INCREMENT,
        title VARCHAR (256) NOT NULL,
        body TEXT NOT NULL
    )
)%";

This is how you would access its contents using either of the interfaces:

Interface

Description

Example

Dynamic interface: results

Variant based
Available in C++11
Learn more
Example code

// Passing a results object to connection::execute selects the dynamic interface
results result;
conn.execute("SELECT id, title, body FROM posts", result);

// Every row is a collection of fields, which are variant-like objects
// that represent data. We use as_string() to cast them to the appropriate type
for (row_view post : result.rows())
{
    std::cout << "Title: " << post.at(1).as_string() << "Body: " << post.at(2).as_string()
              << std::endl;
}

Static interface: static_results

Parses rows into your own types
Requires C++14
Learn more
Example code

// We can use a plain struct with ints and strings to describe our rows.
// This must be placed at the namespace level
struct post
{
    int id;
    std::string title;
    std::string body;
};

// We use BOOST_DESCRIBE_STRUCT to add reflection capabilities to post.
// We must list all the fields that should be populated by Boost.MySQL
BOOST_DESCRIBE_STRUCT(post, (), (id, title, body))


//
// This must be placed inside your function or method:
//

// Passing a static_results to execute() selects the static interface
static_results<post> result;
conn.execute("SELECT id, title, body FROM posts", result);

// Query results are parsed directly into your own type
for (const post& p : result.rows())
{
    std::cout << "Title: " << p.title << "Body: " << p.body << std::endl;
}

In MySQL, a resultset referes to the results generated by a SQL query. When you execute a text query or a prepared statement, you get back a resultset. We've already been using resultsets: the results and static_results classes are in-memory representations of a resultset.

A resultset is composed of three pieces of data:

Rows

The actual rows generated by the SQL query: results::rows and static_results::rows.

Metadata

Information about the columns retrieved by the query: results::meta and static_results::meta. There is one object per retrieved column. It provides information about column names, types, uniqueness contraints...

Additional execution information

Extra info on the execution of the operation, like the number of affected rows (results::affected_rows and static_results::affected_rows) or the last auto-generated ID for INSERT statements (results::last_insert_id and static_results::last_insert_id).

All SQL statements generate resultsets. Statements that generate no rows, like INSERTs, generate empty resultsets (i.e. result.rows().empty() == true). The interface to execute SELECTs and INSERTs is the same.

Until now, we've used simple text queries that did not contain any user-provided input. In the real world, most queries will contain some piece of user-provided input.

One approach could be to use string concatenation to construct a SQL query from user input, and then pass it to execute(). Avoid this approach as much as possible, as it can lead to SQL injection vulnerabilities. Instead, use prepared statements.

Prepared statements are server-side objects that represent a parameterized query. A statement is represented using the statement class, which is a lightweight object holding a handle to the server-side prepared statement.

Let's say you've got an inventory table, and you're writing a command-line program to get products by ID. You've got the following table definition:

results result;
conn.execute(
    R"%(
        CREATE TEMPORARY TABLE products (
            id VARCHAR(50) PRIMARY KEY,
            description VARCHAR(256)
        )
    )%",
    result
);
conn.execute("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', 'Carrots')", result);

You can prepare a statement to retrieve products by ID using connection::prepare_statement:

statement stmt = conn.prepare_statement("SELECT description FROM products WHERE id = ?");

You can execute the statement using connection::execute:

// Obtain the product_id from the user. product_id is untrusted input
const char* product_id = get_value_from_user();

// Execute the statement
results result;
conn.execute(stmt.bind(product_id), result);

// Use result as required

We used statement::bind to provide actual parameters to the statement. You must pass as many parameters to bind as ? placeholders the statement has.

To learn more about prepared statements, please refer to this section.

The functions we've been using communicate errors throwing exceptions. There are also non-throwing overloads that use error codes.

If the server fails to fulfill a request (for example, because the provided SQL was invalid or a referenced table didn't exist), the operation is considered failed and will return an error. The server provides an error message that can be accessed using the diagnostics class. For example:

error_code ec;
diagnostics diag;
results result;

// The provided SQL is invalid. The server will return an error.
// ec will be set to a non-zero value
conn.execute("this is not SQL!", result, ec, diag);

if (ec)
{
    // The error code will likely report a syntax error
    std::cout << "Operation failed with error code: " << ec << '\n';

    // diag.server_message() will contain the classic phrase
    // "You have an error in your SQL syntax; check the manual..."
    // Bear in mind that server_message() may contain user input, so treat it with caution
    std::cout << "Server diagnostics: " << diag.server_message() << std::endl;
}

With exceptions, this would be:

try
{
    // The provided SQL is invalid. This function will throw an exception.
    results result;
    conn.execute("this is not SQL!", result);
}
catch (const error_with_diagnostics& err)
{
    // error_with_diagnostics contains an error_code and a diagnostics object.
    // It inherits from boost::system::system_error.
    std::cout << "Operation failed with error code: " << err.code() << '\n'
              << "Server diagnostics: " << err.get_diagnostics().server_message() << std::endl;
}

As with Boost.Asio, every sync operation has an async counterpart. This library follows Asio's async model, so you may use async operations with any valid Asio CompletionToken, including callbacks and coroutines.

For example, if you can use C++20, you can write:

// Using this CompletionToken, you get C++20 coroutines that communicate
// errors with error_codes. This way, you can access the diagnostics object.
constexpr auto token = boost::asio::as_tuple(boost::asio::use_awaitable);

// Run our query as a coroutine
diagnostics diag;
results result;
auto [ec] = co_await conn.async_execute("SELECT 'Hello world!'", result, diag, token);

// This will throw an error_with_diagnostics in case of failure
boost::mysql::throw_on_error(ec, diag);

The examples section contains material that can help you. This section also provides more info on this topic.

Single outstanding operation per connection

At any given point in time, a connection may only have a single async operation outstanding. This is because the connection uses the underlying Stream object directly, without any locking or queueing. If you perform several async operations concurrently on a single connection without any serialization, the stream may interleave reads and writes from different operations, leading to undefined behavior.

For example, doing the following is illegal and should be avoided:

// Coroutine body
// DO NOT DO THIS!!!!
results result1, result2;
co_await (
    conn.async_execute("SELECT 1", result1, boost::asio::use_awaitable) &&
    conn.async_execute("SELECT 2", result2, boost::asio::use_awaitable)
);

If you need to perform queries in parallel, open more connections to the server.

Until now, we've been using connection::execute, which executes some SQL and reads all generated data into an in-memory object.

Some use cases may not fit in this simple pattern. For example:

  • When reading a very big resultset, it may not be efficient (or even possible) to completely load it in memory. Reading rows in batches may be more adequate.
  • If rows contain very long TEXT or BLOB fields, it may not be adequate to copy these values from the network buffer into the results object. A view-based approach may be better.

For these cases, we can break the execute() operation into several steps, using a multi-function operation (the term is coined by this library). This example reads an entire table in batches, which can be the case in an ETL process:

// Create the table and some sample data
// In a real system, body may be megabytes long.
results result;
conn.execute(
    R"%(
        CREATE TEMPORARY TABLE posts (
            id INT PRIMARY KEY AUTO_INCREMENT,
            title VARCHAR (256),
            body TEXT
        )
    )%",
    result
);
conn.execute(
    R"%(
        INSERT INTO posts (title, body) VALUES
            ('Post 1', 'A very long post body'),
            ('Post 2', 'An even longer post body')
    )%",
    result
);

// execution_state stores state about our operation, and must be passed to all functions
execution_state st;

// Writes the query request and reads the server response, but not the rows
conn.start_execution("SELECT title, body FROM posts", st);

// Reads all the returned rows, in batches.
// st.complete() returns true once there are no more rows to read
while (!st.complete())
{
    // row_batch will be valid until conn performs the next network operation
    rows_view row_batch = conn.read_some_rows(st);

    for (row_view post : row_batch)
    {
        // Process post as required
        std::cout << "Title:" << post.at(0) << std::endl;
    }
}
[Warning] Warning

Once you start a multi-function operation with connection::start_execution, the server immediately sends all rows to the client. You must read all rows before engaging in further operations. Otherwise, you will encounter packet mismatches, which can lead to bugs and vulnerabilities!

Multi-function operations are powerful but complex. Only use them when there is a strong reason to do so. Multi-function operations also work with the static interface. Please refer to this section for more information on these operations.


PrevUpHomeNext