Samples

The Wave library contains several samples illustrating how to use the different features. This section describes these samples and their main characteristics.

The quick_start sample

The quick_start sample shows a minimal way to use the Wave preprocessor library. It opens the file given as the first command line argument, preprocesses it without any additional include paths or predefined macros, and outputs the textual representation of the tokens generated from that file. This sample is a good introduction to Wave because it avoids the additional complexity of the other samples.

The lexed_tokens sample

The lexed_tokens sample shows a minimal way to use the C++ lexing component of Wave without using the preprocessor. It opens the file specified as the first command line argument and prints out the contents of the tokens returned from the lexer.

The cpp_tokens sample

The cpp_tokens sample dumps out the information contained within the tokens returned from the iterator supplied by the Wave library. It shows how to use the Wave library in conjunction with a custom lexer and a custom token type. The lexer used in this sample is based on SLex [5], i.e. it is fed with the token definitions (regular expressions) at runtime (at startup) and generates a DFA table from them. This table is used for token identification and is afterwards saved to disk, so that it does not have to be regenerated at the next program startup. The name of the file to which the DFA table is saved is wave_slex_lexer.dfa.

The main advantage of this SLex based lexer over the default Re2C [3] generated lexer is that it provides not only the line on which a particular token was recognized, but also the corresponding column position. Otherwise the SLex based lexer is functionally fully compatible with the Re2C based one, i.e. you can always switch your application to it if you additionally need column information from the preprocessing.

Since this sample supports no additional command line parameters, it won't work well with include files that aren't located in the same directory as the inspected input file. The command line syntax is straightforward:

    cpp_tokens input_file

The list_includes sample

The list_includes sample shows how the Wave library may be used to generate an include file dependency list for a particular input file. It relies entirely on the default library configuration. The command line syntax for this sample is given below:

    Usage: list_includes [options] file ...:
        -h [ --help ]        : print out program usage (this message)
        -v [ --version ]     : print the version number
        -I [ --path ] dir    : specify additional include directory
        -S [ --syspath ] dir : specify additional system include directory

Please note that this sample outputs only the names of those include files that are visible to the preprocessor. Given the following code snippet, only one of the two include directives is processed during preprocessing, so only the corresponding file name is reported by the list_includes sample:

    #if defined(INCLUDE_FILE_A)
    #  include "file_a.h" 
    #else
    #  include "file_b.h"
    #endif
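
The effect can be illustrated without Wave by a toy model. The following is only a sketch under strong assumptions: the function name visible_includes is hypothetical, it is not how list_includes is implemented, and it understands nothing but `#if defined(NAME)`, `#else`, `#endif` and `#include "file"`. It collects the include file names found in the branches that are active for a given set of defined macros.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy model, NOT Wave: returns the quoted include file names that sit in
// active conditional branches, given the set of macros assumed to be defined.
std::vector<std::string> visible_includes(const std::string& source,
                                          const std::set<std::string>& defined)
{
    const std::string if_prefix = "#ifdefined(";
    const std::string inc_prefix = "#include\"";
    std::vector<std::string> found;
    std::vector<bool> branch_taken;  // one flag per currently open #if
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        // Drop all whitespace so "#  include" and "#include" look the same
        // (a simplification: file names containing blanks are not supported).
        line.erase(std::remove_if(line.begin(), line.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; }),
                   line.end());
        if (line.compare(0, if_prefix.size(), if_prefix) == 0) {
            // Extract NAME from "#ifdefined(NAME)" and open a new level.
            std::string::size_type close = line.find(')', if_prefix.size());
            std::string name =
                line.substr(if_prefix.size(), close - if_prefix.size());
            branch_taken.push_back(defined.count(name) != 0);
        } else if (line == "#else") {
            // Switch to the other branch of the innermost #if.
            if (!branch_taken.empty())
                branch_taken.back() = !branch_taken.back();
        } else if (line == "#endif") {
            if (!branch_taken.empty())
                branch_taken.pop_back();
        } else if (line.compare(0, inc_prefix.size(), inc_prefix) == 0) {
            // An include is visible only if every enclosing branch is taken.
            bool live = std::all_of(branch_taken.begin(), branch_taken.end(),
                                    [](bool taken) { return taken; });
            std::string::size_type close = line.find('"', inc_prefix.size());
            if (live)
                found.push_back(
                    line.substr(inc_prefix.size(), close - inc_prefix.size()));
        }
    }
    return found;
}
```

Called on the snippet above with an empty set of defined macros, this sketch reports only file_b.h; with INCLUDE_FILE_A in the set it reports only file_a.h. The real sample gets the same behaviour from the preprocessor itself, which never opens the include file named in the skipped branch.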

The advanced_hooks sample

The advanced_hooks sample is based on the quick_start sample mentioned above. It shows how to use the advanced preprocessing hooks of the Wave library so that the output contains not only the preprocessed tokens from the evaluated conditional blocks, but also the tokens recognized inside the non-evaluated conditional blocks. To keep the generated token stream useful for further processing, the tokens from the non-evaluated conditional blocks are commented out.

Here is a small example of what the advanced_hooks sample does. Consider the following input:

    #define TEST 1
    #if defined(TEST)
    "TEST was defined: " TEST
    #else
    "TEST was not defined!"
    #endif

This produces the following output:

    //"#if defined(TEST)
    "TEST was defined: " 1
    //"#else
    //"TEST was not defined!"
    //"#endif

As you can see, the sample application also prints the conditional directives themselves, commented out.

The wave sample

Because of its general usefulness, the wave sample is not located in the sample directory of the library, but inside the tools directory of Boost. The wave sample is usable as a full-fledged preprocessor executable on top of any other C++ compiler. It outputs the textual representation of the preprocessed tokens generated from a given input file. It is described in more detail in the documentation of the Wave driver.

The waveidl sample

The main point of the waveidl sample is to show how a completely independent lexer type may be used in conjunction with the default token type of the Wave library. The lexer used in this sample is intended for a preprocessor for IDL languages. It is also based on the Re2C tool, but recognizes a different set of tokens than the default C++ lexer contained in the Wave library. In particular, this lexer does not recognize any keywords (except true and false, which are needed by the preprocessor itself). This is necessary because there are different IDL languages, and identifiers of one language may be keywords of another. As a consequence, keyword identification has to be postponed until after preprocessing, but this allows Wave to be used for all of the IDL derivatives.

It is possible to use the Wave library to write an IDL preprocessor only because the token sets of both languages are very similar. The tokens to be recognized by the waveidl IDL preprocessor are nearly a complete subset of the full C++ token set.

The command line syntax usable for this sample is shown below:

    Usage: waveidl [options] [@config-file(s)] file:

      Options allowed on the command line only:
        -h [ --help ]                      : print out program usage (this message)
        -v [ --version ]                   : print the version number
        -c [ --copyright ]                 : print out the copyright statement
        --config-file filepath             : specify a config file (alternatively: @filepath)

      Options allowed additionally in a config file:
        -o [ --output ] path               : specify a file to use for output instead of stdout
        -I [ --include ] path              : specify an additional include directory
        -S [ --sysinclude ] syspath        : specify an additional system include directory
        -D [ --define ] macro[=[value]]    : specify a macro to define
        -P [ --predefine ] macro[=[value]] : specify a macro to predefine
        -U [ --undefine ] macro            : specify a macro to undefine

The hannibal sample

The hannibal sample shows how to base a Spirit grammar on the Wave library. It was initially written and contributed to the Wave library by Danny Havenith (see his related web page). The grammar of this example uses Wave as its preprocessor. It implements around 120 of the approximately 250 grammar rules found in The C++ Programming Language, Third Edition. These 120 rules allow a C++ source file to be parsed for all type information and declarations. In fact, this grammar parses C++ declarations, including class and template definitions, but skips function bodies. If so configured, the program will output an XML dump of the generated parse tree.

It may be a good starting point for a grammar used for tasks such as reverse engineering, as some UML modelling tools do, or for any other purpose that needs a list of all templates and classes in a file together with their members.

The check_macro_naming sample

The check_macro_naming sample demonstrates the use of context hooks to understand how macros are defined within a codebase. Some projects (such as Boost itself) have conventions on the names of macros. This sample will recursively search a directory, looking for header files and searching each for macro definitions. Any that do not match the supplied convention (which defaults to ^BOOST_.*) are reported, along with an annotation if they are used as an include guard. The user can also specify any number of directories to ignore in the process.

Command line syntax:

    Usage: check_macro_naming [options] directory:
      -h [ --help ]            print out options
      --match arg (=^BOOST_.*) pattern defined macros must match
      --exclude arg            subdirectory to skip
Example usage:

    $ check_macro_naming --exclude ./test/testwave/testfiles include
    CPP_CONTEXT_HPP_907485E2_6649_4A87_911B_7F7225F3E5B8_INCLUDED include/boost/wave/cpp_context.hpp (guard)
    WHITESPACE_HANDLING_HPP_INCLUDED include/boost/wave/whitespace_handling.hpp (guard)
    ...
    TRACE_CPP_TIME_CONVERSION include/boost/wave/util/time_conversion_helper.hpp
    spirit_append_actor include/boost/wave/util/time_conversion_helper.hpp
    spirit_assign_actor include/boost/wave/util/time_conversion_helper.hpp
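
The naming check itself boils down to testing each macro name against a regular expression. The sketch below shows that step in isolation with the standard library; it is not taken from the sample. The helper name nonconforming_macros is hypothetical, and the choice of std::regex_search (so that the `^` in the pattern does the anchoring) is an assumption about how the pattern is meant to be applied.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of the sample: returns the macro names that
// do NOT match the naming convention. The default pattern mirrors the
// documented default of the --match option.
std::vector<std::string> nonconforming_macros(
    const std::vector<std::string>& names,
    const std::string& pattern = "^BOOST_.*")
{
    const std::regex convention(pattern);
    std::vector<std::string> offenders;
    for (const std::string& name : names) {
        // regex_search plus the '^' anchor accepts only names that START
        // with BOOST_; a name merely containing BOOST_ is still reported.
        if (!std::regex_search(name, convention))
            offenders.push_back(name);
    }
    return offenders;
}
```

With the default pattern, names such as WHITESPACE_HANDLING_HPP_INCLUDED or spirit_append_actor from the output above would be returned as offenders, while any name beginning with BOOST_ would pass. In the sample, the names come from the context hooks that fire on macro definitions rather than from a prepared list.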

The custom_directives sample

The custom_directives sample shows how to define your own directives by overriding the found_unknown_directive hook to add #version and #extension directives (from GLSL).

The emit_custom_line_directives sample

The emit_custom_line_directives sample shows how to use the emit_line_directive preprocessing hook to customize the format of any generated #line directive. The sample emits #line directives in a format compatible with those generated by gcc:

    # <lineno> <rel_file_name>
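
To make the target format concrete, here is a small standalone sketch of the text such a directive consists of. It does not use the Wave hook interface, and the function name gcc_style_line_marker is hypothetical. It also assumes that the file name is written as a quoted string with backslashes and double quotes escaped, as gcc does in its own line markers; the format line above does not spell that out.

```cpp
#include <string>

// Hypothetical helper, not part of the sample: builds the text of one
// gcc-style line marker, e.g.  # 42 "include/foo.hpp"
std::string gcc_style_line_marker(unsigned line,
                                  const std::string& rel_file_name)
{
    // Assumption: the file name is emitted as a string literal, so embedded
    // backslashes and double quotes must be escaped.
    std::string quoted = "\"";
    for (char c : rel_file_name) {
        if (c == '"' || c == '\\')
            quoted += '\\';
        quoted += c;
    }
    quoted += '"';
    return "# " + std::to_string(line) + " " + quoted + "\n";
}
```

Inside the real hook, the same pieces are not concatenated into a string but pushed as individual tokens into the container of pending tokens that Wave passes to emit_line_directive.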

The preprocess_pragma_output sample

The preprocess_pragma_output sample demonstrates using the interpret_pragma hook to install custom pragma handling code. Specifically it defines a new #pragma wave pp() that interprets a string in the current preprocessor context. The sample input uses this mechanism to define new macros from within a quoted string passed to pp.

The real_positions sample

The real_positions sample demonstrates the use of the generated_token hook and how to use Wave with a new token type without changing the lexer. It does so by showing how to correct the positions inside the returned tokens so that they appear to be consecutive (ignoring positions from macro definitions).

The token_statistics sample

The token_statistics sample counts the tokens of each type in the preprocessed input and reports the totals. It demonstrates using the Xpressive lexer and setting custom local and system include search paths.

Last updated: Wednesday, June 21, 2006 22:12