Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for a snapshot of the develop branch, built from commit 0f79ae966a.
PrevUpHomeNext

Tutorial: iterator_interface

[Note] Note

All the member functions provided by iterator_interface are in your iterator's base class — iterator_interface — and can therefore be hidden if you define a member function with the same name in your derived iterator. If you don't like the semantics of any iterator_interface-provided member function, feel free to replace it.

The iterator_interface Template

Though a given iterator may have a large number of operations associated with it, there are only a few basis operations that the iterator needs to define; the full set of operations it supports can be defined in terms of that much smaller basis.

It is possible to define any iterator Iter in terms of a subset of user-defined operations. By deriving Iter from iterator_interface using CRTP, we can generate the full set of operations. Here is the declaration of iterator_interface:

template<
    typename Derived,
    typename IteratorConcept,
    typename ValueType,
    typename Reference = ValueType &,
    typename Pointer = ValueType *,
    typename DifferenceType = std::ptrdiff_t>
struct iterator_interface;

Let's break that down.

Derived is the type that you're deriving iterator_interface from.

IteratorConcept defines the iterator category/concept. This must be one of the C++ standard iterator tag types, like std::forward_iterator_tag. In C++20 and later, std::contiguous_iterator_tag is a valid tag to use.

ValueType is used to define the iterator's value_type typedef. Likewise, Reference and Pointer are used to define the iterator's reference and pointer typedefs, respectively.

[Tip] Tip

Reference does not need to be a reference type, and Pointer does not need to be a pointer type. This fact is very useful when making proxy iterators.

DifferenceType is used to define the iterator's difference_type. Don't be a weirdo; just leave this as the default type, std::ptrdiff_t.

Proxy Iterators

Sometimes you need to create an iterator I such that I::reference_type is not a (possibly const) reference to I::value_type. For instance, let's say you want to make a zip-iterator that produces pairs of values from two separate underlying sequences. For sequences A and B, with respective value_types T and U, one possible reference_type for a zip iterator would be std::pair<T &, U &> (this is distinct from a reference to a pair, such as std::pair<T, U> &). Each such pair would contain a reference to one element from A and a reference to the corresponding element from B.

As another example, if you wanted an iterator I that represents the infinite sequence 0, 1, 2, ..., you'd be unable to form a reference to most or all of those values; you'd instead produce a temporary for each value as it is needed. This implies that I::value_type does not involve references at all; it may instead by int or double.

When defining a proxy iterator, you can use a template alias that provides reasonable defaults for iterator_interface's parameters:

template<
    typename Derived,
    typename IteratorConcept,
    typename ValueType,
    typename Reference = ValueType,
    typename DifferenceType = std::ptrdiff_t>
using proxy_iterator_interface = iterator_interface<
    Derived,
    IteratorConcept,
    ValueType,
    Reference,
    proxy_arrow_result<Reference>,
    DifferenceType>;
[Note] Note

As shown above, proxy_iterator_interface uses a template called proxy_arrow_result as its pointer-type. This template makes a copy of whatever value is obtained by operator*, and then returns a pointer to the copy in its operator->. You may want to use something else if this is a performance concern.

User-Defined Iterator Operations

Now, let's get back to the user-defined basis operations.

In the table below, Iter is a user-defined type derived from iterator_interface; i and i2 are objects of type Iter; reference is the type passed as the Reference template parameter to iterator_interface; pointer is the type passed as the Pointer template parameter to iterator_interface; and n is a value of type difference_type.

Table 39.1. User-Defined Operations

Expression

Return Type

Semantics

Assertion/note/pre-/post-condition

*i

Convertible to reference.

Dereferences i and returns the result.

Expects: i is dereferenceable.

i == i2

Contextually convertible to bool.

Returns true if and only if i and i2 refer to the same value.

Expects: (i, i2) is in the domain of ==.

i2 - i

Convertible to difference_type.

Returns n.

Expects: there exists a value n of type difference_type such that i + n == i2. i2 == i + (i2 - i).

++i

Iter &

Increments i.

--i

Iter &

Decrements i.

i += n

Iter &

difference_type m = n;
if (m >= 0)
  while (m--) ++i;
else
  while (m++) --i;


[Tip] Tip

The table above leaves a lot of implementation freedom. In operator+=(), you could take n as a value or as a reference; operator-() can return a difference_type or just something convertible to one; etc. In particular, your operations can be constexpr or noexcept as you see fit.

Not all the iterator concepts require all the operations above. Here are the operations used with each iterator concept:

Table 39.2. Operations Required for Each Concept

Concept

Operations

input_iterator

*i
i == i2
++i

output_iterator

*i
++i

forward_iterator

*i
i == i2
++i

bidirectional_iterator

*i
i == i2
++i
--i

random_access_iterator/continguous_iterator

*i
i - i2
i += n


[Note] Note

For random_access_iterators, the operation i - i2 is used to provide all the relational operators, including operator==() and operator!=(). If you are defining an iterator over a discontiguous sequence (e.g. std::deque), this implementation of operator==() may not be optimal. In this case, provide your own operator==(). operator!=() will be provided if operator== is available.

An Important Note About operator++() and operator--()

There's a wrinkle in this way of doing things. When you define operator++() in your iterator type Derived, iterator_interface defines post-increment, operator++(int). But since Derived has an operator++ and so does its base class iterator_interface, the one in Derived hides the one in iterator_interface.

So, you need to add a using declaration that makes the operator++ from the base class visible in the derived class. For instance, in the node_iterator example there are these lines:

using base_type = boost::stl_interfaces::
    iterator_interface<node_iterator<T>, std::forward_iterator_tag, T>;
using base_type::operator++;

[Important] Important

All of the above applies to operator--. So, for bidirectional iterators, you need to add a line like using base_type::operator--; as well.

[Note] Note

These using declarations are not necessary for a random access iterator, because Derived does not have an operator++() in that case.

Putting it All Together

Ok, let's actually define a simple iterator. Let's say you need to interact with some legacy code that has a hand-written linked list:

template<typename T>
struct node
{
    T value_;
    node * next_; // == nullptr in the tail node
};

We can't change this code to use std::list, but it would be nice to be able to reuse all of the standard algorithms with this type. Defining an iterator will get us there.

template<typename T>
struct node_iterator
    : boost::stl_interfaces::
          iterator_interface<node_iterator<T>, std::forward_iterator_tag, T>

We are deriving node_iterator from iterator_interface, and because we're using CRTP, we first have to pass node_iterator for the Derived template parameter, so that iterator_interface knows what derived type to cast to in order to get at the user-defined operations. Then, we pass std::forward_iterator_tag for IteratorConcept, since that's appropriate for a singly-linked list. Finally, we pass T to let iterator_interface know what the value_type is for our iterator.

We leave the rest of the template parameters at their defaults: T & for Reference, T * for Pointer, and std::ptrdiff_t for DifferenceType. This is what you will do for almost all iterators. The most common exceptions to this are usually some kind of proxy iterator. Another exception is when for better code generation you want to return builtin values instead of references for constant iterators. To see an example of the latter, see the repeated_chars_iterator in the introduction; it's Reference template parameter is char for this reason.

constexpr node_iterator() noexcept : it_(nullptr) {}
constexpr node_iterator(node<T> * it) noexcept : it_(it) {}

Next, we define two constructors: a default constructor, and one that takes a node pointer. A default constructor is required by the forward_iterator concept, but iterator_interface cannot supply this, since constructors are not visible in derived types without user intervention.

[Important] Important

A default constructor is required for every iterator concept.

constexpr T & operator*() const noexcept { return it_->value_; }
constexpr node_iterator & operator++() noexcept
{
    it_ = it_->next_;
    return *this;
}
friend constexpr bool
operator==(node_iterator lhs, node_iterator rhs) noexcept
{
    return lhs.it_ == rhs.it_;
}

Next, we define the user-defined operations that iterator_interface requires to do its work. As you might expect, the three required operations are very straightforward.

[Note] Note

Here, I implement operator==() as a hidden friend function. it would have worked just as well if I had instead implemented it as a member function, like this:

constexpr bool operator==(node_iterator rhs) const noexcept
{
    return it_ == rhs.it_;
}

Either of these forms works, since iterator_interface is concept-based — the appropriate expressions need to be well-formed for the iterator_interface tempalte to do its work.

Finally, we need a using declaration to make iterator_interface::operator++(int) visible:

using base_type = boost::stl_interfaces::
    iterator_interface<node_iterator<T>, std::forward_iterator_tag, T>;
using base_type::operator++;

Here's how we might use the forward iterator we just defined:

node_iterator<int> const first(&nodes[0]);
node_iterator<int> const last;
for (auto it = first; it != last; it++) {
    std::cout << *it << " "; // Prints 0 1 2 3 4
}
std::cout << "\n";

What About Adapting an Existing Iterator?

So glad you asked. If you want to make something like a filtering iterator, or say a UTF-8 to UTF-32 transcoding iterator, you are starting with an existing iterator and adapting it. There's a way to avoid having to write all of the user-defined basis functions, as long as there's a base iterator that already has the right operations with the right semantics.

For example, consider an iterator that contains a pointer to an array of int, and predicate of type Pred. It filters out integers that do not meet the predicate. Since we are using an existing iterator (the pointer to int), we already have all the operations we need for a bidirectional iterator (and more), except that operator++ on an int * does not skip over elements as we'd like. Here's the code:

template<typename Pred>
struct filtered_int_iterator : boost::stl_interfaces::iterator_interface<
                                   filtered_int_iterator<Pred>,
                                   std::bidirectional_iterator_tag,
                                   int>
{
    filtered_int_iterator() : it_(nullptr) {}
    filtered_int_iterator(int * it, int * last, Pred pred) :
        it_(it),
        last_(last),
        pred_(std::move(pred))
    {
        // We need to do this in the constructor so that operator== works
        // properly on two filtered_int_iterators, when they bound a sequence
        // in which none of the ints meets the predicate.
        it_ = std::find_if(it_, last_, pred_);
    }

    // A bidirectional iterator based on iterator_interface usually required
    // four user-defined operations.  since we are adapting an existing
    // iterator (an int *), we only need to define this one.  The others are
    // implemented by iterator_interface, using the underlying int *.
    filtered_int_iterator & operator++()
    {
        it_ = std::find_if(std::next(it_), last_, pred_);
        return *this;
    }

    // It is really common for iterator adaptors to have a base() member
    // function that returns the adapted iterator.
    int * base() const { return it_; }

private:
    // Provide access to these private members.
    friend boost::stl_interfaces::access;

    // These functions are picked up by iterator_interface, and used to
    // implement any operations that you don't define above.  They're not
    // called base() so that they do not collide with the base() member above.
    //
    // Note that the const overload does not strictly speaking need to be a
    // reference, as demonstrated here.
    constexpr int *& base_reference() noexcept { return it_; }
    constexpr int * base_reference() const noexcept { return it_; }

    int * it_;
    int * last_;
    Pred pred_;
};

// A make-function makes it easier to deal with the Pred parameter.
template<typename Pred>
auto make_filtered_int_iterator(int * it, int * last, Pred pred)
{
    return filtered_int_iterator<Pred>(it, last, std::move(pred));
}

So, all we had to do was let iterator_interface know that there was an underlying iterator it could use — by implementing base_reference() — and the operations that we did not define got defined for us by iterator_interface.

Here is the iterator in action:

std::array<int, 8> ints = {{0, 1, 2, 3, 4, 5, 6, 7}};
int * const ints_first = ints.data();
int * const ints_last = ints.data() + ints.size();

auto even = [](int x) { return (x % 2) == 0; };
auto first = make_filtered_int_iterator(ints_first, ints_last, even);
auto last = make_filtered_int_iterator(ints_last, ints_last, even);

// This is an example only.  Obviously, we could have called
// std::copy_if() here.
std::vector<int> ints_copy;
std::copy(first, last, std::back_inserter(ints_copy));
assert(ints_copy == (std::vector<int>{0, 2, 4, 6}));

Checking Your Work

Boost.STLInterfaces is able to check that some of the code that you write is compatible with the concept for the iterator you're writing. It cannot check everything. For instance, Boost.STLInterfaces does not know if your derived type includes a default constructor, which is required by all the iterators. In particular, iterator_interface cannot static_assert on the wellformedness of Derived(), since Derived is an incomplete type within the body of iterator_interfaceiterator_interface is the base class for Derived, not the other way round.

Since you can easily static_assert that a type models a given concept, a good practice is to put such a static_assert after you define your iterator type.

For instance, after node_iterator you'll find this code:

// static_assert(std::forward_iterator<node_iterator>, ""), or nothing in
// C++17 and earlier.
BOOST_STL_INTERFACES_STATIC_ASSERT_CONCEPT(node_iterator, std::forward_iterator)

Consider this good code hygiene. Without this simple check, you'll probably eventually find yourself looking at an error message with a very long template instantiation stack.

There's also a macro that can help you check that std::iterator_traits is well-formed and provides the correct types. See BOOST_STL_INTERFACES_STATIC_ASSERT_ITERATOR_TRAITS.


PrevUpHomeNext