Boost C++ Libraries


Segments

Hierarchical schemes often interpret the path as a slash-delimited sequence of percent-encoded strings called segments. In this library the segments may be accessed using these separate, bidirectional view types which reference the underlying URL:

Table 1.26. Segments Types

Type                    Accessor           Description
segments_view           segments           A read-only range of decoded segments.
segments_ref            segments           A modifiable range of decoded segments.
segments_encoded_view   encoded_segments   A read-only range of segments.
segments_encoded_ref    encoded_segments   A modifiable range of segments.


First we observe these invariants about paths and segments:

The following URL contains a path with three segments: "path", "to", and "file.txt":

http://www.example.com/path/to/file.txt

To understand the relationship between the path and segments, we define the function segs, which returns a list of the decoded strings corresponding to the elements in a container of segments:

auto segs( core::string_view s ) -> std::list< std::string >
{
    url_view u( s );
    std::list< std::string > seq;
    for( auto seg : u.encoded_segments() )
        seq.push_back( seg.decode() );
    return seq;
}

In this table we show the result of invoking segs with different paths. This demonstrates how the library achieves the invariants described above for various interesting cases:

Table 1.27. Segments

s                        segs( s )                      absolute
""                       { }
"/"                      { }                            yes
"./"                     { "" }
"usr"                    { "usr" }
"./usr"                  { "usr" }
"/index.htm"             { "index.htm" }                yes
"/images/cat-pic.gif"    { "images", "cat-pic.gif" }    yes
"images/cat-pic.gif"     { "images", "cat-pic.gif" }
"/fast//query"           { "fast", "", "query" }        yes
"fast//"                 { "fast", "", "" }
"/./"                    { "" }                         yes
".//"                    { "", "" }
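The mapping in the table above can be reproduced with a small standalone sketch. This is not the library's implementation: the helpers prefix_length, split_path, and is_absolute are hypothetical names, the rule is inferred from the rows of the table, and no percent-decoding is performed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost.URL: the number of leading
// characters that form the path prefix ("/./", "./", or "/") and
// therefore do not belong to any segment.
std::size_t prefix_length( std::string const& path )
{
    if( path.rfind( "/./", 0 ) == 0 ) return 3;
    if( path.rfind( "./", 0 ) == 0 )  return 2;
    if( path.rfind( "/", 0 ) == 0 )   return 1;
    return 0;
}

// Hypothetical helper: split a path into its (still encoded) segments.
std::vector< std::string > split_path( std::string const& path )
{
    std::vector< std::string > segs;
    if( path.empty() || path == "/" )
        return segs;                        // these two paths have no segments

    // Drop the prefix, then split what remains on every '/'.
    std::string rest = path.substr( prefix_length( path ) );
    std::size_t start = 0;
    for(;;)
    {
        std::size_t slash = rest.find( '/', start );
        if( slash == std::string::npos )
        {
            segs.push_back( rest.substr( start ) );   // last segment, possibly empty
            return segs;
        }
        segs.push_back( rest.substr( start, slash - start ) );
        start = slash + 1;
    }
}

// Hypothetical helper: a path is absolute when it starts with '/'.
bool is_absolute( std::string const& path )
{
    return ! path.empty() && path[0] == '/';
}
```

Under these assumptions, split_path( "usr" ) and split_path( "./usr" ) yield the same one-element sequence, while split_path( "./" ) yields one empty segment and split_path( "/" ) yields none.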


This implies that two paths may map to the same sequence of segments. In the paths "usr" and "./usr", the "./" is a prefix that may be needed to maintain the invariant that instances of url_view_base always refer to valid URLs, so both paths map to { "usr" }. On the other hand, each sequence determines a unique path for a given URL. For instance, setting the segments to { "a" } always produces either "./a" or "a", depending on whether the "." prefix is needed to keep the URL valid.

Sequences do not iterate a leading "." that is present only to keep the URL valid. Thus, after we assign { "x", "y", "z" } to segments, the sequence always contains exactly { "x", "y", "z" }. It never contains { ".", "x", "y", "z" }, even when the "." had to be added to the path string. In other words, the contents of the segment container are authoritative and the path string is a function of them, not the other way around.
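The idea that the path string is a function of the segments can be illustrated with a standalone sketch. It is not the library's implementation: join_segments is a hypothetical name, the URL is assumed to have no authority, and the segments are assumed to be already percent-encoded.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost.URL: build the path string of a
// URL without an authority from its segments and its absolute flag.
std::string join_segments(
    std::vector< std::string > const& segs, bool absolute )
{
    std::string path;
    if( absolute )
        path += '/';

    // An empty first segment needs a "." in front of it. Without it the
    // segment would be lost or misread: "/a" looks absolute and "//a"
    // looks like the start of an authority.
    if( ! segs.empty() && segs.front().empty() )
        path += "./";

    for( std::size_t i = 0; i < segs.size(); ++i )
    {
        if( i != 0 )
            path += '/';
        path += segs[i];
    }
    return path;
}
```

In this sketch the "." appears only in the returned string. The vector passed in is never modified, so a caller who supplies { "", "" } still holds two segments even though the resulting path is ".//".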

Library algorithms which modify individual segments of the path, or set the entire path, attempt to behave as if the operation were performed on the equivalent sequence. If a path maps, say, to the three-element sequence { "a", "b", "c" }, then erasing the middle segment should result in the sequence { "a", "c" }. The library always strives to do exactly what the caller requests; however, in some cases this would result in either an invalid URL or a dramatic and unwanted change in the URL's semantics.

For example, consider the following URL:

url u = url().set_path( "kyle:xy" );

The library produces the URL string "kyle%3Axy" and not "kyle:xy", because the latter would have an unintended scheme. The table below demonstrates the results achieved by performing various modifications to a URL containing a path:

Table 1.28. Path Operations

URL                                   Operation                          Result
"info:kyle:xy"                        remove_scheme()                    "kyle%3Axy"
"kyle%3Axy"                           set_scheme( "gopher" )             "gopher:kyle:xy"
"http://www.example.com//kyle:xy"     remove_authority()                 "http:/.//kyle:xy"
"//www.example.com//kyle:xy"          remove_authority()                 "/.//kyle:xy"
"http://www.example.com//kyle:xy"     remove_origin()                    "/.//kyle:xy"
"info:kyle:xy"                        remove_origin()                    "kyle%3Axy"
"/kyle:xy"                            set_path_absolute( false )         "kyle%3Axy"
"kyle%3Axy"                           set_path_absolute( true )          "/kyle:xy"
""                                    set_path( "kyle:xy" )              "kyle%3Axy"
""                                    set_path( "//foo/fighters.txt" )   "/.//foo/fighters.txt"
"my%3Asharona/billa%3Abong"           normalize()                        "my%3Asharona/billa:bong"
"./my:sharona"                        normalize()                        "my%3Asharona"


For the full set of containers and functions for operating on paths and segments, please consult the reference.
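The "kyle%3Axy" results in the Path Operations table follow from a rule in RFC 3986: in a relative reference without a scheme, the first path segment may not contain a colon. A standalone sketch of that one adjustment is shown here. It is not the library's implementation: protect_first_segment is a hypothetical name, and the input is assumed to be a decoded path for a URL that has neither a scheme nor an authority.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Boost.URL: percent-encode every ':'
// in the first segment of a relative path so the text before the colon
// cannot be parsed as a scheme.
std::string protect_first_segment( std::string const& path )
{
    if( ! path.empty() && path[0] == '/' )
        return path;                        // absolute paths cannot start with a scheme

    std::size_t end = path.find( '/' );     // one past the end of the first segment
    if( end == std::string::npos )
        end = path.size();

    std::string out;
    for( std::size_t i = 0; i < path.size(); ++i )
    {
        if( i < end && path[i] == ':' )
            out += "%3A";                   // colon inside the first segment
        else
            out += path[i];                 // everything else is copied unchanged
    }
    return out;
}
```

Only the first segment is touched, which matches the table row where "billa:bong" keeps its colon while "my%3Asharona" does not.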

