Hierarchical schemes often interpret the path as a slash-delimited sequence of percent-encoded strings called segments. In this library the segments may be accessed using these separate, bidirectional view types which reference the underlying URL:
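Table 1.26. Segments Types
Type | Accessor | Description |
---|---|---|
segments_view | segments | A read-only range of decoded segments. |
segments_ref | segments | A modifiable range of decoded segments. |
segments_encoded_view | encoded_segments | A read-only range of segments. |
segments_encoded_ref | encoded_segments | A modifiable range of segments. |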
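The difference between the encoded and decoded ranges comes down to whether percent-escapes have been resolved. The following stdlib-only sketch illustrates that step for a single segment. The names `hex_value` and `pct_decode` are hypothetical helpers introduced here for illustration; they are not part of the library, which performs its own decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a library function): numeric value of one hex digit.
inline int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("bad hex digit in percent-escape");
}

// Hypothetical helper (not a library function): resolve every "%XY"
// escape in one encoded segment and return the decoded string.
inline std::string pct_decode(const std::string& encoded)
{
    std::string out;
    out.reserve(encoded.size());
    for (std::size_t i = 0; i < encoded.size(); ++i)
    {
        if (encoded[i] != '%')
        {
            out.push_back(encoded[i]);
            continue;
        }
        // An escape needs two more characters after the '%'.
        if (i + 2 >= encoded.size())
            throw std::invalid_argument("truncated percent-escape");
        const int hi = hex_value(encoded[i + 1]);
        const int lo = hex_value(encoded[i + 2]);
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2; // skip the two hex digits just consumed
    }
    return out;
}
```

Note that a segment such as `a%2Fb` decodes to `a/b`: the slash belongs to the segment's content rather than acting as a delimiter, which is one reason to keep encoded and decoded views separate.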
First we observe these invariants about paths and segments:
- Every path maps to a sequence of segments, plus a bool indicating if the path is absolute.
- Every sequence of segments, plus a bool indicating if the path is absolute, maps to a unique path.
The following URL contains a path with three segments: "path", "to", and "file.txt":
http://www.example.com/path/to/file.txt
To understand the relationship between the path and segments, we define this function segs which returns a list of strings corresponding to the elements in a container of segments:

    auto segs( core::string_view s ) -> std::list< std::string >
    {
        url_view u( s );
        std::list< std::string > seq;
        for( auto seg : u.encoded_segments() )
            seq.push_back( seg.decode() );
        return seq;
    }
In this table we show the result of invoking segs with different paths. This demonstrates how the library achieves the invariants described above for various interesting cases:
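Table 1.27. Segments
s | segs( s ) | absolute |
---|---|---|
"" | { } | |
"/" | { } | yes |
"./" | { "" } | |
"usr" | { "usr" } | |
"./usr" | { "usr" } | |
"/index.htm" | { "index.htm" } | yes |
"/images/cat-pic.gif" | { "images", "cat-pic.gif" } | yes |
"images/cat-pic.gif" | { "images", "cat-pic.gif" } | |
"/fast//query" | { "fast", "", "query" } | yes |
"fast//" | { "fast", "", "" } | |
"/./" | { "" } | yes |
".//" | { "", "" } | |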
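The mapping from a path string to a segment sequence plus an "absolute" flag can be modeled without the library. The sketch below is a simplified, stdlib-only model under stated assumptions: a leading "/" only sets the flag, a "./" prefix is treated as a validity-preserving prefix rather than a segment, and no percent-decoding is done. The names `path_parts` and `split_path` are hypothetical and are not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model type: the two pieces of information a path carries.
struct path_parts
{
    std::vector<std::string> segments; // encoded segments, in order
    bool absolute = false;             // true if the path starts with "/"
};

// Hypothetical model function: split a path into segments and a flag.
inline path_parts split_path(const std::string& path)
{
    path_parts result;
    std::size_t pos = 0;

    // A leading "/" marks the path as absolute and belongs to no segment.
    if (pos < path.size() && path[pos] == '/')
    {
        result.absolute = true;
        ++pos;
    }

    // Assumption: a "./" prefix exists only to keep the URL valid,
    // so it is skipped instead of being reported as a "." segment.
    bool dot_prefix = false;
    if (path.compare(pos, 2, "./") == 0)
    {
        dot_prefix = true;
        pos += 2;
    }

    // Nothing left and no prefix: the path has zero segments.
    if (pos == path.size() && !dot_prefix)
        return result;

    // Otherwise every "/" in the remainder separates two segments.
    std::size_t start = pos;
    for (;;)
    {
        const std::size_t slash = path.find('/', start);
        if (slash == std::string::npos)
        {
            result.segments.push_back(path.substr(start));
            break;
        }
        result.segments.push_back(path.substr(start, slash - start));
        start = slash + 1;
    }
    return result;
}
```

Under this model, `split_path("usr")` and `split_path("./usr")` yield the same single segment, and `split_path("/path/to/file.txt")` yields three segments with the absolute flag set.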
This implies that two paths may map to the same sequence of segments. In the paths "usr" and "./usr", the "./" is a prefix that might be necessary to maintain the invariant that instances of url_view_base always refer to valid URLs. Thus, both paths map to { "usr" }.
On the other hand, each sequence determines a unique path for a given URL. For instance, setting the segments to { "a" } would always map to either "./a" or "a", depending on whether the "." prefix is necessary to keep the URL valid.
Sequences don't iterate the leading "." when it's necessary to keep the URL valid. Thus, when we assign { "x", "y", "z" } to segments, the sequence always contains { "x", "y", "z" } after that. It never contains { ".", "x", "y", "z" } merely because the "." needed to be included in the path. In other words, the contents of the segment container are authoritative, and the path string is a function of them, not vice versa.
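The idea that the path string is a function of the segment sequence (plus the absolute flag and the rest of the URL) can be sketched as follows. This is a stdlib-only model, not the library's algorithm: `url_context` and `join_segments` are hypothetical names, and the rule for when a "./" prefix is added is an assumption based on the URL grammar (a relative path must not start with an empty segment or look like a scheme, and a path must not start with "//" when there is no authority). The library may choose a different strategy in some cases, such as percent-encoding a colon.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of the rest of the URL; only what the sketch needs.
struct url_context
{
    bool has_scheme = false;
    bool has_authority = false;
};

// Hypothetical model function: build the path string from its segments.
inline std::string join_segments(
    const std::vector<std::string>& segments,
    bool absolute,
    const url_context& ctx)
{
    std::string path;
    if (absolute)
        path += '/';

    if (!segments.empty())
    {
        const std::string& first = segments.front();
        bool need_dot = false;
        if (absolute)
        {
            // Without an authority, a path starting with "//" would be
            // misread as the start of an authority.
            need_dot = first.empty() && !ctx.has_authority;
        }
        else
        {
            // An empty first segment would make the path start with "/"
            // (or vanish); a colon in the first segment of a scheme-less
            // URL would be misread as a scheme delimiter.
            need_dot = first.empty() ||
                (!ctx.has_scheme && first.find(':') != std::string::npos);
        }
        if (need_dot)
            path += "./"; // prefix only; never reported as a segment
    }

    for (std::size_t i = 0; i < segments.size(); ++i)
    {
        if (i != 0)
            path += '/';
        path += segments[i];
    }
    return path;
}
```

In this model the same sequence { "kyle:xy" } produces "./kyle:xy" when the URL has no scheme and "kyle:xy" when it has one, while the sequence itself is unchanged in both cases.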
Library algorithms which modify individual segments of the path or set the entire path attempt to behave as if the operation were performed on the equivalent sequence. If a path maps, say, to the three-element sequence { "a", "b", "c" }, then erasing the middle segment should result in the sequence { "a", "c" }. The library always strives to do exactly what the caller requests; however, in some cases this would result in either an invalid URL or a dramatic and unwanted change in the URL's semantics. For example, consider the following URL:
url u = url().set_path( "kyle:xy" );
The library produces the URL string "kyle%3Axy" and not "kyle:xy", because the latter would have an unintended scheme. The table below demonstrates the results achieved by performing various modifications to a URL containing a path:
Table 1.28. Path Operations
URL | Operation | Result |
---|---|---|
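The colon escaping seen in the "kyle:xy" example can be sketched with the standard library alone. The function below is a hypothetical illustration, not the library's code: it assumes that only a colon in the first segment of a relative path is ambiguous, and only when the URL has no scheme. The name `escape_scheme_like_prefix` and the `has_scheme` parameter are introduced here for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a library function): percent-encode ':' in the
// first segment of a relative path so it cannot be mistaken for a scheme.
inline std::string escape_scheme_like_prefix(
    const std::string& path,
    bool has_scheme)
{
    // Absolute paths and URLs that already have a scheme are unambiguous.
    if (has_scheme || path.empty() || path[0] == '/')
        return path;

    // The first segment ends at the first "/" or at the end of the path.
    const std::size_t slash = path.find('/');
    const std::size_t first_end =
        slash == std::string::npos ? path.size() : slash;

    std::string out;
    out.reserve(path.size() + 2);
    for (std::size_t i = 0; i < path.size(); ++i)
    {
        if (i < first_end && path[i] == ':')
            out += "%3A"; // encoded form of ':'
        else
            out.push_back(path[i]);
    }
    return out;
}
```

With no scheme present, `escape_scheme_like_prefix("kyle:xy", false)` yields "kyle%3Axy", matching the result described above; colons in later segments, or in any segment when a scheme is present, are left untouched.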
For the full set of containers and functions for operating on paths and segments, please consult the reference.