Filesystem Library
The Boost.Filesystem library provides portable facilities to query and manipulate paths, files, and directories.
The motivation for the library is the need to perform portable script-like operations from within C++ programs. The intent is not to compete with Python, Perl, or shell languages, but rather to provide portable filesystem operations when C++ is already the language of choice. The design encourages, but does not require, safe and portable usage.
Programs using the library are portable, both in the sense that the syntax of program code is portable, and the sense that the semantics or behavior of code is portable. The generic path grammar is another important aid to portability.
Usage is safe in the sense that errors cannot be ignored since most functions throw C++ exceptions when errors are detected. This is also convenient for users because it alleviates the need to explicitly check error return codes.
A proposal, N1975, to include Boost.Filesystem in Technical Report 2 has been accepted by the C++ Standards Committee. The Boost.Filesystem library will stay in alignment with the TR2 Filesystem proposal as it works its way through the TR2 process. Note, however, that namespaces and header granularity differ between Boost.Filesystem and the TR2 proposal.
The Boost.Filesystem library provides several headers:
Boost.Filesystem is implemented as a separately compiled library, so before using it you must install it in a location that can be found by your linker. See Building the object-library.
The library's example directory contains very simple scripts for building the example programs on various platforms. You can use these scripts to see what's needed to compile and link your own programs.
(A more elaborate tutorial is also available from Tabrez Iqbal.)
First some preliminaries:
    #include "boost/filesystem.hpp"    // includes all needed Boost.Filesystem declarations
    #include <iostream>                // for std::cout
    using namespace boost::filesystem; // for ease of tutorial presentation;
                                       // a namespace alias is preferred practice in real code
A class path object can be created:
path my_path( "some_dir/file.txt" );
The string passed to the path constructor may be in a portable generic path format or an implementation-defined native operating system format. Access functions make my_path contents available to the underlying operating system API in an operating system dependent format, such as "some_dir:file.txt", "[some_dir]file.txt", "some_dir/file.txt", or whatever is appropriate for the operating system. If class wpath is used instead of class path, translation between wide and narrow character paths is performed automatically if necessary for the operating system.
Class path has conversion constructors from const char* and const std::string&, so that even though the Filesystem Library functions used in the following code snippet have const path& formal parameters, the user can just code C-style strings as actual arguments:
    remove_all( "foobar" );
    create_directory( "foobar" );
    ofstream file( "foobar/cheeze" );
    file << "tastes good!\n";
    file.close();
    if ( !exists( "foobar/cheeze" ) )
      std::cout << "Something is rotten in foobar\n";
To make class path objects easy to use in expressions, operator/ appends paths:
    ifstream file1( arg_path / "foo/bar" );
    ifstream file2( arg_path / "foo" / "bar" );
The expressions arg_path / "foo/bar" and arg_path / "foo" / "bar" yield identical results.
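The equivalence of the two forms can be checked with a small sketch. It is written against C++17 std::filesystem, the standardized descendant of this library, rather than the Boost.Filesystem version described here, so that it builds with only a standard toolchain. The helper names append_in_one_step and append_in_two_steps are hypothetical and exist only for illustration.

```cpp
#include <filesystem>

// Assumption: std::filesystem stands in for boost::filesystem in this sketch.
namespace fs = std::filesystem;

// Appends a multi-element relative path in a single operator/ call.
fs::path append_in_one_step(const fs::path& base) {
    return base / "foo/bar";
}

// Appends the same elements one at a time.
fs::path append_in_two_steps(const fs::path& base) {
    return base / "foo" / "bar";
}
```

Path comparison is done element by element, so the two results compare equal even where the native directory separator is not a forward slash.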
Paths can include references to the current directory, using "." notation, and the parent directory, using ".." notation.
Class basic_directory_iterator is an important component of the library. It provides an input iterator over the contents of a directory, with the value type being class basic_path. Typedefs directory_iterator and wdirectory_iterator are provided to cover the most common use cases.
The following function, given a directory path and a file name, recursively searches the directory and its sub-directories for the file name, returning a bool, and if successful, the path to the file that was found. The code below is extracted from a real program, slightly modified for clarity:
    bool find_file( const path & dir_path,         // in this directory,
                    const std::string & file_name, // search for this name,
                    path & path_found )            // placing path here if found
    {
      if ( !exists( dir_path ) ) return false;
      directory_iterator end_itr; // default construction yields past-the-end
      for ( directory_iterator itr( dir_path );
            itr != end_itr;
            ++itr )
      {
        if ( is_directory(itr->status()) )
        {
          if ( find_file( itr->path(), file_name, path_found ) ) return true;
        }
        else if ( itr->path().leaf() == file_name ) // see below
        {
          path_found = itr->path();
          return true;
        }
      }
      return false;
    }
The expression itr->path().leaf() == file_name, in the line commented // see below, calls the leaf() function on the path returned by calling the path() function of the directory_entry object pointed to by the iterator. leaf() returns a string which is a copy of the last (closest to the leaf, farthest from the root) file or directory name in the path object.
In addition to leaf(), several other function names use the tree/root/branch/leaf metaphor.
Notice that find_file() does not do explicit error checking, such as verifying that the dir_path argument really represents a directory. Boost.Filesystem functions throw exceptions if they do not complete successfully, so there is enough implicit error checking that this application doesn't need to supply additional error checking code unless desired. Several Boost.Filesystem functions have non-throwing versions, to ease use cases where exceptions would not be appropriate.
Note: Recursive directory iteration was added as a convenience function after the above tutorial code was written, so nowadays you don't have to actually code the recursion yourself.
After reading the tutorial you can dive right into simple, script-like programs using the Filesystem Library! Before doing any serious work, however, there are a few cautions to be aware of:
Filesystem function specifications follow the C++ Standard Library form, specifying behavior in terms of effects and postconditions. If a race-condition exists, a function's postconditions may no longer be true by the time the function returns to the caller.
Explanation: The state of files and directories is often globally shared, and thus may be changed unexpectedly by other threads, processes, or even other computers having network access to the filesystem. As an example of the difficulties this can cause, note that the following asserts may fail:
    assert( exists( "foo" ) == exists( "foo" ) ); // (1)

    remove_all( "foo" );
    assert( !exists( "foo" ) ); // (2)

    assert( is_directory( "foo" ) == is_directory( "foo" ) ); // (3)

(1) will fail if a non-existent "foo" comes into existence, or an existent "foo" is removed, between the first and second call to exists(). This could happen if, during the execution of the example code, another thread, process, or computer is also performing operations in the same directory.
(2) will fail if between the call to remove_all() and the call to exists() a new file or directory named "foo" is created by another thread, process, or computer.
(3) will fail if another thread, process, or computer removes an existing file "foo" and then creates a directory named "foo", between the example code's two calls to is_directory().
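A real race is timing dependent and hard to reproduce on demand, so the sketch below simulates the interleaving behind assertion (1) deterministically: a caller-supplied callback runs between the two exists() calls and stands in for the other thread, process, or computer. The function name exists_twice_agrees and its interfere parameter are hypothetical, and the code uses C++17 std::filesystem as a stand-in for the Boost.Filesystem version described here.

```cpp
#include <filesystem>
#include <functional>

// Assumption: std::filesystem stands in for boost::filesystem in this sketch.
namespace fs = std::filesystem;

// Evaluates exists(p) twice and reports whether both calls agreed.
// `interfere` runs between the two calls and models whatever another thread,
// process, or computer might do to the filesystem at that moment.
bool exists_twice_agrees(const fs::path& p,
                         const std::function<void()>& interfere) {
    const bool first = fs::exists(p);
    interfere();
    const bool second = fs::exists(p);
    return first == second;
}
```

With a callback that does nothing, the two calls agree. With a callback that creates or removes p, they disagree, which is the situation in which assertion (1) fails.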
Unless otherwise specified, Boost.Filesystem functions throw basic_filesystem_error exceptions if they cannot successfully complete their operational specifications. Also, implementations may use C++ Standard Library functions, which may throw std::bad_alloc. These exceptions may be thrown even though the error condition leading to the exception is not explicitly specified in the function's "Throws" paragraph.
All exceptions thrown by the Filesystem Library are implemented by calling boost::throw_exception(). Thus exact behavior may differ depending on BOOST_NO_EXCEPTIONS at the time the filesystem source files are compiled.
Non-throwing versions are provided of several functions that are often used in contexts where error codes may be the preferred way to report an error.
The example program simple_ls.cpp is given a path as a command line argument. Since the command line argument may be a relative path, the complete path is determined so that messages displayed can be more precise.
The program checks whether the path exists; if it does not, a message is printed.
If the path identifies a directory, the directory is iterated through, printing the name of the entries found, and an indication if they are directories. A count of directories and files is updated, and then printed after the iteration is complete.
If the path is for a file, a message indicating that is printed.
Try compiling and executing simple_ls.cpp to see how it works on your system. Try various path arguments to see what happens.
This example program prints the file's size if it is a regular file.
The programs used to generate the Boost regression test status tables use the Filesystem Library extensively. See:
Test programs are sometimes useful in understanding a library, as they illustrate what the developer expected to work and not work. See:
The current implementation supports operating systems which provide either the POSIX or Windows API.
The library is in regular use on Apple OS X, HP-UX, IBM AIX, Linux, Microsoft Windows, SGI IRIX, and Sun Solaris operating systems using a variety of compilers.
Compilers or standard libraries which do not support wide characters (wchar_t) or wide character strings (std::wstring) are detected automatically, and cause the library to compile code that is restricted to narrow character paths (boost::filesystem::path). Users can force this restriction by defining the macro BOOST_FILESYSTEM_NARROW_ONLY. That may be useful for dealing with legacy compilers or operating systems.
The object-library will be built automatically if you are using the Boost build system. See Getting Started. It can also be built manually using a Jamfile supplied in directory libs/filesystem/build, or the user can construct an IDE project or make file which includes the object-library source files.
The object-library source files are supplied in directory libs/filesystem/src. These source files implement the library for POSIX or Windows compatible operating systems; no implementation is supplied for other operating systems. Note that many operating systems not normally thought of as POSIX systems, such as mainframe legacy operating systems or embedded operating systems, support POSIX compatible file systems which will work with the Filesystem Library.
The object-library can be built for static or dynamic (shared/dll) linking. This is controlled by the BOOST_ALL_DYN_LINK or BOOST_FILESYSTEM_DYN_LINK macros. See the Separate Compilation page for a description of the techniques used.
The library's implementation code automatically detects the current platform, and compiles the POSIX or Windows implementation code accordingly. Automatic platform detection during object-library compilation can be overridden by defining either the BOOST_POSIX_API or the BOOST_WINDOWS_API macro. With the exception of the Cygwin environment, there is usually no reason to define these macros, as the software development kits supplied with most compilers only support a single platform.
The Cygwin package of tools supports traditional Windows usage, but also provides an emulation layer and other tools which can be used to make Windows act as Linux (and thus POSIX), and provide the Linux look-and-feel. GCC is usually the compiler of choice in this environment, as it can be installed via the Cygwin install process. Other compilers can also use the Cygwin emulation of POSIX, at least in theory.
Those wishing to use the Cygwin POSIX emulation layer should define the BOOST_POSIX_API macro when compiling both user programs and the Boost.Filesystem object-library.
The Filesystem Library was designed and implemented by Beman Dawes. The original directory_iterator and filesystem_error classes were based on prior work from Dietmar Kuehl, as modified by Jan Langer. Thomas Witt was a particular help in later stages of initial development. Peter Dimov and Rob Stewart made many useful suggestions and comments over a long period of time. Howard Hinnant helped with internationalization issues.
Key design requirements and design realities were developed during extensive discussions on the Boost mailing list, followed by comments on the initial implementation. Numerous helpful comments were then received during the Formal Review.
Participants included Aaron Brashears, Alan Bellingham, Aleksey Gurtovoy, Alex Rosenberg, Alisdair Meredith, Andy Glew, Anthony Williams, Baptiste Lepilleur, Beman Dawes, Bill Kempf, Bill Seymour, Carl Daniel, Chris Little, Chuck Allison, Craig Henderson, Dan Nuffer, Dan'l Miller, Daniel Frey, Darin Adler, David Abrahams, David Held, Davlet Panech, Dietmar Kuehl, Douglas Gregor, Dylan Nicholson, Ed Brey, Eric Jensen, Eric Woodruff, Fedder Skovgaard, Gary Powell, Gennaro Prota, Geoff Leyland, George Heintzelman, Giovanni Bajo, Glen Knowles, Hillel Sims, Howard Hinnant, Jaap Suter, James Dennett, Jan Langer, Jani Kajala, Jason Stewart, Jeff Garland, Jens Maurer, Jesse Jones, Jim Hyslop, Joel de Guzman, Joel Young, John Levon, John Maddock, John Williston, Jonathan Caves, Jonathan Biggar, Jurko, Justus Schwartz, Keith Burton, Ken Hagen, Kostya Altukhov, Mark Rodgers, Martin Schuerch, Matt Austern, Matthias Troyer, Mattias Flodin, Michiel Salters, Mickael Pointier, Misha Bergal, Neal Becker, Noel Yap, Parksie, Patrick Hartling, Pavel Vozenilek, Pete Becker, Peter Dimov, Rainer Deyke, Rene Rivera, Rob Lievaart, Rob Stewart, Ron Garcia, Ross Smith, Sashan, Steve Robbins, Thomas Witt, Tom Harris, Toon Knapen, Victor Wagner, Vincent Finn, Vladimir Prus, and Yitzhak Sapir
A lengthy discussion on the C++ committee's library reflector illuminated the "illusion of portability" problem, particularly in postings by PJ Plauger and Pete Becker.
Walter Landry provided much help illuminating symbolic link use cases for version 1.31.0.
So many people have contributed comments and bug reports that it is no longer possible to acknowledge them individually. That said, Peter Dimov and Rob Stewart need to be specially thanked for their many constructive criticisms and suggestions. Terence Wilson and Chris Frey contributed timing programs which helped illuminate performance issues.
Revised 18 March, 2008
© Copyright Beman Dawes, 2002-2005
Use, modification, and distribution are subject to the Boost Software License, Version 1.0. See www.boost.org/LICENSE_1_0.txt