Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for a snapshot of the master branch, built from commit 5002c2d6a2.

Authority

The authority determines how a resource can be accessed. It contains two parts: the userinfo that holds identity credentials, and the host and port which identify a communication endpoint having dominion over the resource described in the remainder of the URL.

Some observations:

The function authority can be used to obtain the authority_view from a url_view:

url_view u( "https://www.boost.org/users/download/" );
assert(u.has_authority());
authority_view a = u.authority();

Notice that authority does not return a decode_view. The reason is that a decoded / character could make the authority ambiguous with the path component. Instead, the authority is represented by an authority_view, a read-only, non-owning view of a character buffer that contains a valid authority.

An authority_view has functions for obtaining its subcomponents:

assert(a.host() == "www.boost.org");

These functions do not throw. If the URL has no authority, authority returns an empty authority_view. Because an authority can also be present but empty, use the function has_authority to tell these two cases apart.

In contexts where an authority can appear by itself, an authority_view can be constructed directly from a string. For instance, the grammar for the request-target of an HTTP/1 CONNECT request uses authority-form. This is what such a request looks like:

CONNECT www.example.com:80 HTTP/1.1

In that case, we have an authority that cannot be parsed directly with parse_uri as a URL. Instead, we can use the analogous function parse_authority to obtain an authority_view.

authority_view a = parse_authority( "www.example.com:80" ).value();
assert(!a.has_userinfo());
assert(a.host() == "www.example.com");
assert(a.port() == "80");

The authority view provides the subset of observer member functions found in url_view which are relevant to the authority. However, when an authority is parsed on its own, the leading double slashes ("//") are not present.

Authority strings with userinfo are also valid for parse_authority:

authority_view a = parse_authority( "user:pass@www.example.com:443" ).value();
assert(a.userinfo() == "user:pass");
assert(a.user() == "user");
assert(a.password() == "pass");
assert(a.host() == "www.example.com");
assert(a.port() == "443");

Host

The host subcomponent represents where resources are located. The functions host and port can be used to obtain the host and port from a url_view or authority_view:

The host might be a registered name

url_view u( "https://john.doe@www.example.com:123/forum/questions/" );
assert(u.host() == "www.example.com");
assert(u.port() == "123");

or an IP address

url_view u( "https://john.doe@192.168.2.1:123/forum/questions/" );
assert(u.host() == "192.168.2.1");
assert(u.port() == "123");

Although this is not guaranteed, note that the decoded host is rarely different from its encoded counterpart.

url_view u( "https://www.example.com" );
assert(u.host() == "www.example.com");
assert(u.host() == u.encoded_host());

Registered names usually need to be handled differently from IP addresses. The function host_type can be used to identify which type of host is described in the URL.

url_view u( "https://www.boost.org/users/download/" );
switch (u.host_type())
{
case host_type::name:
    write_request(resolve(u.host()));
    break;
case host_type::ipv4:
    write_request(u.host_ipv4_address());
    break;
case host_type::ipv6:
    write_request(u.host_ipv6_address());
    break;
default:
    break;
}

When the host_type matches an IP address, the functions host_ipv4_address and host_ipv6_address can be used to obtain the decoded addresses as ipv4_address and ipv6_address objects, which can in turn be converted to their numeric form.

Note

Note that if an authority is present, the host is always defined even if it is the empty string (corresponding to a zero-length reg-name in the BNF).

url_view u( "https:///path/to_resource" );
assert( u.has_authority() );
assert( u.authority().buffer().empty() );
assert( u.path() == "/path/to_resource" );

The authority component also influences how we should interpret the URL path. If the authority is present, the path component must either be empty or begin with a slash. This is a common pattern where the path is empty:

url_view u( "https://www.boost.org" );
assert( u.host() == "www.boost.org" );
assert( u.path().empty() );

When both the authority and path exist, the path must begin with a slash:

url_view u( "https://www.boost.org/users/download/" );
assert( u.host() == "www.boost.org" );
assert( u.path() == "/users/download/" );

This rule also affects the path "/":

url_view u( "https://www.boost.org/" );
assert( u.host() == "www.boost.org" );
assert( u.path() == "/" );

When there is no authority component, the path cannot begin with an empty segment. This means the path cannot begin with two slashes // to avoid these characters being interpreted as the beginning of the authority component. For instance, consider the following valid URL:

url_view u( "mailto:John.Doe@example.com" );
assert( !u.has_authority() );
assert( u.path() == "John.Doe@example.com" );

Note that including a double slash would make the path be interpreted as the authority:

url_view u( "mailto://John.Doe@example.com" );
assert( u.authority().buffer() == "John.Doe@example.com" );
assert( u.path().empty() );

Userinfo

In complete authority components, we can also extract the userinfo and port subcomponents.

url_view u( "https://john.doe@www.example.com:123/forum/questions/" );
assert(u.userinfo() == "john.doe");
assert(u.port() == "123");

When not treated as an opaque field, the optional userinfo subcomponent consists of a user name and an optional password. Analogous functions are provided for the userinfo subcomponents.

url_view u( "https://john.doe:123456@www.somehost.com/forum/questions/" );
assert(u.userinfo() == "john.doe:123456");
assert(u.user() == "john.doe");
assert(u.password() == "123456");

Analogous to other observers, the functions has_userinfo and has_password are provided to differentiate empty components from absent components. Note that there is no function has_user. The user component is available whenever userinfo exists.

Note

Although the specification allows the format username:password, the password component should be used with care.

It is not recommended to transfer password data through URLs unless this is an empty string indicating no password.

