Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for a snapshot of the develop branch, built from commit de00f503ab.
PrevUpHomeNext

Multi-function operations

Multi-function operations allow running operations as a set of separate steps, which gives you better control over execution. They work by splitting some of the reads and writes into several function calls.

You can use multi-function operations to execute text queries and prepared statements, and through the dynamic or the static interface.

Protocol primer

To make a good use of multi-function operations, you should have a basic understanding of the underlying protocol.

The protocol uses messages to communicate. These are delimited by headers containing the message length. All operations are initiated by the client, by sending a single request message, to which the server responds with a set of response messages.

The diagram below shows the message exchange between client and server for text queries and statement executions. Each arrow represents a message.

There are two separate cases:

connection::execute handles the full message exchange. In contrast, connection::start_execution will not read the rows, if any.

Some takeaways:

Using multi-function operations through the dynamic interface

execution_state is the main class for the dynamic interface in multi-function operations. An execution state holds information required to progress the execution operation, like metadata (required to parse the rows) and protocol state. Contrary to results, it doesn't contain the rows.

Given the following table definition:

const char* table_definition = R"%(
    CREATE TEMPORARY TABLE posts (
        id INT PRIMARY KEY AUTO_INCREMENT,
        title VARCHAR (256) NOT NULL,
        body TEXT NOT NULL
    )
)%";

You can start a multi-function operation using connection::start_execution:

// st will hold information about the operation being executed.
// It must be passed to any successive operations for this execution
execution_state st;

// Sends the query and reads response and meta, but not the rows
conn.start_execution("SELECT title, body FROM posts", st);

We now must read all the generated rows by calling connection::read_some_rows, which will return a batch of an unspecified size:

// st.complete() returns true once the OK packet is received
while (!st.complete())
{
    // row_batch will be valid until conn performs the next network operation
    rows_view row_batch = conn.read_some_rows(st);

    for (row_view post : row_batch)
    {
        // Process post as required
        std::cout << "Title:" << post.at(0) << std::endl;
    }
}

Some remarks:

read_some_rows returns a rows_view object pointing into the connection's internal buffer. This view is valid until the connection performs any other operation involving a network transfer.

Note that there is no need to distinguish between case 1 and case 2 in the diagram above in our code, as reading rows for a complete operation is well defined.

Using multi-function operations through the static interface

The mechanics are similar to what's been exposed above. The static interface uses static_execution_state to carry state. As with static_results, we must define and pass a type describing our rows:

// We can use a plain struct with ints and strings to describe our rows.
// This must be placed at the namespace level
struct post
{
    int id;
    std::string title;
    std::string body;
};

// We use BOOST_DESCRIBE_STRUCT to add reflection capabilities to post.
// We must list all the fields that should be populated by Boost.MySQL
BOOST_DESCRIBE_STRUCT(post, (), (id, title, body))

We can now start our operation using the same connection::start_execution:

// st will hold information about the operation being executed.
// It must be passed to any successive operations for this execution
static_execution_state<post> st;

// Sends the query and reads response and meta, but not the rows.
// If there is any schema mismatch between the declared row type and
// what the server returned, start_execution will detect it and fail
conn.start_execution("SELECT id, title, body FROM posts", st);

We now must read all the generated rows by calling connection::read_some_rows:

// storage will be filled with the read rows. You can use any other contiguous range.
std::array<post, 20> posts;

// st.complete() returns true once the OK packet is received
while (!st.complete())
{
    std::size_t read_rows = conn.read_some_rows(st, boost::span<post>(posts));
    for (const post& p : boost::span<post>(posts.data(), read_rows))
    {
        // Process post as required
        std::cout << "Title " << p.title << std::endl;
    }
}

Some remarks:

Accessing metadata and OK packet data

You can access metadata at any point, using execution_state::meta or static_execution_state::meta. This function returns a collection of metadata objects. For more information, plase refer to this section.

You can access OK packet data using functions like last_insert_id and affected_rows in both execution_state and static_execution_state. As this information is contained in the OK packet, it can only be accessed once the complete function returns true.

Using multi-function operations with stored procedures and multi-queries

When using operations that return more than one resultset (e.g. when calling stored procedures), the protocol is slightly more complex:

The message exchange is as follows:

For example, given the following stored procedure:

CREATE PROCEDURE get_company(IN pin_company_id CHAR(10))
BEGIN
    START TRANSACTION READ ONLY;
    SELECT id, name, tax_id FROM company WHERE id = pin_company_id;
    SELECT first_name, last_name, salary FROM employee WHERE company_id = pin_company_id;
    COMMIT;
END

We can write:

Dynamic interface

Static interface

// Get the company ID to retrieve, possibly from the user
std::string company_id = get_company_id();

// Call the procedure
execution_state st;
statement stmt = conn.prepare_statement("CALL get_employees(?)");
conn.start_execution(stmt.bind(company_id), st);

// The above code will generate 3 resultsets
// Read the 1st one, which contains the matched companies
while (st.should_read_rows())
{
    rows_view company_batch = conn.read_some_rows(st);

    // Use the retrieved companies as required
    for (row_view company : company_batch)
    {
        std::cout << "Company: " << company.at(1).as_string() << "\n";
    }
}

// Move on to the 2nd one, containing the employees for these companies
conn.read_resultset_head(st);
while (st.should_read_rows())
{
    rows_view employee_batch = conn.read_some_rows(st);

    // Use the retrieved employees as required
    for (row_view employee : employee_batch)
    {
        std::cout << "Employee " << employee.at(0).as_string() << " " << employee.at(1).as_string()
                  << "\n";
    }
}

// The last one is an empty resultset containing information about the
// CALL statement itself. We're not interested in this
conn.read_resultset_head(st);
assert(st.complete());
// Get the company ID to retrieve, possibly from the user
std::string company_id = get_company_id();

// Our procedure generates three resultsets. We must pass each row type
// to static_execution_state as template parameters
using empty = std::tuple<>;
static_execution_state<company, employee, empty> st;

// Call the procedure
statement stmt = conn.prepare_statement("CALL get_employees(?)");
conn.start_execution(stmt.bind(company_id), st);

// Read the 1st one, which contains the matched companies
std::array<company, 5> companies;
while (st.should_read_rows())
{
    std::size_t read_rows = conn.read_some_rows(st, boost::span<company>(companies));

    // Use the retrieved companies as required
    for (const company& c : boost::span<company>(companies.data(), read_rows))
    {
        std::cout << "Company: " << c.name << "\n";
    }
}

// Move on to the 2nd one, containing the employees for these companies
conn.read_resultset_head(st);
std::array<employee, 20> employees;
while (st.should_read_rows())
{
    std::size_t read_rows = conn.read_some_rows(st, boost::span<employee>(employees));

    // Use the retrieved companies as required
    for (const employee& emp : boost::span<employee>(employees.data(), read_rows))
    {
        std::cout << "Employee " << emp.first_name << " " << emp.last_name << "\n";
    }
}

// The last one is an empty resultset containing information about the
// CALL statement itself. We're not interested in this
conn.read_resultset_head(st);
assert(st.complete());

Note that we're using should_read_rows instead of complete as our loop termination condition. complete() returns true when all the resultsets have been read, while should_read_rows() will return false once an individual result has been fully read.

When using the static interface with multi-function operations, not all schema mismatches can be found by the start_execution function, since not all the information is available at this point. Errors may be reported by read_some_rows and read_resultset_head, too. Overall, the same checks are performed as when using connection::execute, but at different points in time.

Multi-resultset in the general case

execution_state and static_execution_state can be seen as state machines with four states. Each state describes which reading function should be invoked next:

For multi-function operations, you may also access OK packet data ever time a resultset has completely been read, i.e. when should_read_head returns true.

This example shows this general case. It uses multi-queries to run an arbitrary SQL script file.

More on read_some_rows

To properly understand read_some_rows, we need to know that every connection owns an internal buffer, where packets sent by the server are stored. It is a single, flat buffer, and you can configure its initial size when creating a connection, passing a buffer_params object as the first argument to connection's constructor. The read buffer may be grown under certain circumstances to accommodate large messages.

read_some_rows gets the maximum number of rows that fit in the internal buffer (without growing it) performing a single read_some operation on the stream (or using cached data). If there are rows to read, read_some_rows guarantees to read at least one. This means that, if doing what we described yields no rows (e.g. because of a large row that doesn't fit into the buffer), read_some_rows will grow the buffer or perform more reads until at least one row has been read. If you're using the static interface, the number of read rows is limited by the size of span you passed, too.

If you want to get the most of read_some_rows, customize the initial buffer size to maximize the number of rows that each batch retrieves.


PrevUpHomeNext