Hierarchical schemes often interpret the path as a slash-delimited sequence of percent-encoded strings called segments. In this library the segments may be accessed using these separate, bidirectional view types which reference the underlying URL:
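Table 1.26. Segments Types
| Type | Accessor | Description |
|---|---|---|
| segments_view | segments | A read-only range of decoded segments. |
| segments_ref | segments | A modifiable range of decoded segments. |
| segments_encoded_view | encoded_segments | A read-only range of segments. |
| segments_encoded_ref | encoded_segments | A modifiable range of segments. |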
First we observe these invariants about paths and segments:
Every path maps to a unique sequence of segments, plus a bool indicating if the path is absolute.
Every sequence of segments, plus a bool indicating if the path is absolute, maps to a unique path.
The following URL contains a path with three segments: "path", "to", and "file.txt":
http://www.example.com/path/to/file.txt
To understand the relationship between the path and segments, we define this function segs which returns a list of strings corresponding to the elements in a container of segments:
auto segs( string_view s ) -> std::list< std::string >
{
    url_view u( s );
    std::list< std::string > seq;
    for( auto seg : u.encoded_segments() )
        seq.push_back( seg.decode() );
    return seq;
}
In this table we show the result of invoking segs with different paths. This demonstrates how the library achieves the invariants described above for various interesting cases:
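Table 1.27. Segments
| s | segs( s ) | absolute |
|---|---|---|
| "" | { } | |
| "/" | { } | yes |
| "./" | { "" } | |
| "usr" | { "usr" } | |
| "./usr" | { "usr" } | |
| "/index.htm" | { "index.htm" } | yes |
| "/images/cat-pic.gif" | { "images", "cat-pic.gif" } | yes |
| "images/cat-pic.gif" | { "images", "cat-pic.gif" } | |
| "/fast//query" | { "fast", "", "query" } | yes |
| "fast//" | { "fast", "", "" } | |
| "/./" | { "" } | yes |
| ".//" | { "", "" } | |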
This implies that two paths may map to the same sequence of segments. In the paths "usr" and "./usr", the "./" is a prefix that might be necessary to maintain the invariant that instances of url_view_base always refer to valid URLs. Thus, both paths map to { "usr" }. On the other hand, each sequence determines a unique path for a given URL. For instance, setting the segments to { "a" } always maps to either "./a" or "a", depending on whether the "." prefix is necessary to keep the URL valid.
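The many-to-one mapping from paths to segments can be sketched with the standard library alone. The helper below is hypothetical (split_path and split_result are names invented for this illustration) and is not the library's parser; it assumes that a leading "/" only marks the path as absolute and that a leading "./" is a prefix rather than a segment:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch: the segments of a path
// plus a flag telling whether the path is absolute.
struct split_result
{
    std::vector<std::string> segments;
    bool absolute = false;
};

// Hypothetical helper, not part of the library. Segments are returned
// exactly as written in the path (no percent-decoding is performed).
split_result split_path(std::string s)
{
    split_result r;
    bool dot_prefix = false;

    // A leading "/" marks the path as absolute and is not a segment.
    if (!s.empty() && s[0] == '/')
    {
        r.absolute = true;
        s.erase(0, 1);
    }

    // Assumption: a leading "./" is a prefix and never becomes a segment.
    if (s.compare(0, 2, "./") == 0)
    {
        dot_prefix = true;
        s.erase(0, 2);
    }

    // "" and "/" contain no segments at all.
    if (s.empty() && !dot_prefix)
        return r;

    // Split the remainder on "/", keeping empty segments.
    std::size_t pos = 0;
    for (;;)
    {
        std::size_t const next = s.find('/', pos);
        if (next == std::string::npos)
        {
            r.segments.push_back(s.substr(pos));
            break;
        }
        r.segments.push_back(s.substr(pos, next - pos));
        pos = next + 1;
    }
    return r;
}
```

Under these assumptions, split_path("usr") and split_path("./usr") yield the same single segment, which is the situation the paragraph above describes.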
Sequences don't iterate the leading "." when it's necessary to keep the URL valid. Thus, when we assign { "x", "y", "z" } to segments, the sequence always contains { "x", "y", "z" } after that. It never contains { ".", "x", "y", "z" }, even when the "." had to be included in the path string. In other words, the contents of the segment container are authoritative, and the path string is a function of them, not vice versa.
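To make "the path string is a function of the segments" concrete, here is a minimal standard-library sketch of the reverse direction. The name join_segments is hypothetical and this is not the library's implementation; it assumes the "." prefix is required only when the first segment is empty (otherwise the joined string would begin with an extra "/" and read as a different path), and it ignores other cases such as a colon in the first segment:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the library: builds a path string
// from a sequence of segments and an "absolute" flag.
std::string join_segments(std::vector<std::string> const& segs, bool absolute)
{
    // Join the segments with "/" between them.
    std::string body;
    for (std::size_t i = 0; i < segs.size(); ++i)
    {
        if (i != 0)
            body += '/';
        body += segs[i];
    }

    // Assumption of this sketch: the "." prefix is needed only when the
    // first segment is empty. Without it, the relative sequence
    // { "", "a" } would produce "/a", which looks like an absolute path.
    bool const need_dot = !segs.empty() && segs.front().empty();

    std::string path;
    if (absolute)
        path += '/';
    if (need_dot)
        path += "./";
    return path + body;
}
```

The "." appears only in the returned string. The segs container passed in is never altered, so the sequence stays { "x", "y", "z" } or { "", "a" } whether or not the prefix was emitted.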
Library algorithms which modify individual segments of the path or set the entire path attempt to behave as if the operation were performed on the equivalent sequence. If a path maps, say, to the three-element sequence { "a", "b", "c" }, then erasing the middle segment should result in the sequence { "a", "c" }. The library always strives to do exactly what the caller requests; however, in some cases this would result in either an invalid URL, or a dramatic and unwanted change in the URL's semantics.
For example, consider the following URL:
url u = url().set_path( "kyle:xy" );
The library produces the URL string "kyle%3Axy" and not "kyle:xy", because in the latter "kyle" would be interpreted as a scheme, which was not intended.

The table below demonstrates the results achieved by performing various modifications to a URL containing a path:
Table 1.28. Path Operations
| URL | Operation | Result |
|---|---|---|
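The "kyle:xy" case discussed earlier can be illustrated with a small standard-library sketch. The function protect_first_segment is hypothetical and is not how the library is implemented; it assumes the URL has no scheme and no authority, and it percent-encodes a colon only when it appears in the first segment of a relative path, since that is the only position where it could be mistaken for a scheme delimiter:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the library. Assumes the URL has
// neither a scheme nor an authority, so the path is the start of the URL.
std::string protect_first_segment(std::string const& path)
{
    // An absolute path starts with "/", so nothing before a ':' can be
    // read as a scheme. Leave it unchanged.
    if (!path.empty() && path[0] == '/')
        return path;

    // Find where the first segment ends.
    std::size_t const slash = path.find('/');
    std::size_t const first_len =
        (slash == std::string::npos) ? path.size() : slash;

    // Percent-encode ':' inside the first segment only.
    std::string out;
    for (std::size_t i = 0; i < path.size(); ++i)
    {
        if (i < first_len && path[i] == ':')
            out += "%3A";
        else
            out += path[i];
    }
    return out;
}
```

With this sketch, "kyle:xy" becomes "kyle%3Axy", while "/kyle:xy" and "a/b:c" are returned unchanged because their colons cannot be confused with the end of a scheme.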
For the full set of containers and functions for operating on paths and segments, please consult the reference.