Boost C++ Libraries

This is the documentation for a development version of boost.

Overview

Abstract

Boost.Endian provides facilities to manipulate the endianness of integers and user-defined types.

  • Three approaches to endianness are supported. Each has a long history of successful use, and each approach has use cases where it is preferred over the other approaches.

  • Primary uses:

    • Data portability. The Endian library supports binary data exchange, via either external media or network transmission, regardless of platform endianness.

    • Program portability. POSIX-based and Windows-based operating systems traditionally supply libraries with non-portable functions to perform endian conversion. There are at least four incompatible sets of functions in common use. The Endian library is portable across all C++ platforms.

  • Secondary use: Minimizing data size via sizes and/or alignments not supported by the standard C++ integer types.

Introduction to endianness

Consider the following code:

int16_t i = 0x0102;
FILE * file = fopen("test.bin", "wb"); // binary file!
fwrite(&i, sizeof(int16_t), 1, file);
fclose(file);

On OS X, Linux, or Windows systems with an Intel CPU, a hex dump of the "test.bin" output file produces:

0201

On OS X systems with a PowerPC CPU, or Solaris systems with a SPARC CPU, a hex dump of the "test.bin" output file produces:

0102

What’s happening here is that Intel CPUs order the bytes of an integer with the least-significant byte first, while SPARC CPUs place the most-significant byte first. Some CPUs, such as the PowerPC, allow the operating system to choose which ordering applies.
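The byte ordering described above can also be observed without writing a file. The following is a minimal sketch in standard C++; the names two_bytes, memory_layout, and stores_least_significant_byte_first are hypothetical and are not part of Boost.Endian. It assumes 8-bit bytes and only recognizes the two orderings discussed here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Holds the two bytes of a 16-bit value in the order they appear in memory.
struct two_bytes
{
    unsigned char first;
    unsigned char second;
};

// Copies the object representation of a 16-bit value into a byte array so
// the in-memory byte order can be inspected directly.
inline two_bytes memory_layout(std::uint16_t value)
{
    unsigned char raw[sizeof value];
    std::memcpy(raw, &value, sizeof value);
    two_bytes out = { raw[0], raw[1] };
    return out;
}

// True when the least-significant byte is stored first (little endian).
inline bool stores_least_significant_byte_first()
{
    const two_bytes b = memory_layout(0x0102);
    // Only the two orderings discussed in the text are expected here.
    assert((b.first == 0x02 && b.second == 0x01) ||
           (b.first == 0x01 && b.second == 0x02));
    return b.first == 0x02;
}
```

On the Intel systems mentioned above, memory_layout(0x0102) would be expected to yield 02 then 01, matching the hex dump of the file.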

Most-significant-byte-first ordering is traditionally called "big-endian" ordering and least-significant-byte-first ordering is traditionally called "little-endian" ordering. The names are derived from Jonathan Swift's satirical novel Gulliver’s Travels, where rival kingdoms opened their soft-boiled eggs at different ends.

See Wikipedia’s Endianness article for an extensive discussion of endianness.

Programmers can usually ignore endianness, except when reading a core dump on little-endian systems. But programmers have to deal with endianness when exchanging binary integers and binary floating point values between computer systems with differing endianness, whether by physical file transfer or over a network. And programmers may also want to use the library when minimizing either internal or external data sizes is advantageous.

Introduction to the Boost.Endian library

Boost.Endian provides three different approaches to dealing with endianness. All three approaches support integers and user-defined types (UDTs).

Each approach has a long history of successful use, and each approach has use cases where it is preferred to the other approaches.

Endian conversion functions

The application uses the built-in integer types to hold values, and calls the provided conversion functions to convert byte ordering as needed. Both mutating and non-mutating conversions are supplied, and each comes in unconditional and conditional variants.

Endian buffer types

The application uses the provided endian buffer types to hold values, and explicitly converts to and from the built-in integer types. Buffer sizes of 8, 16, 24, 32, 40, 48, 56, and 64 bits (i.e. 1, 2, 3, 4, 5, 6, 7, and 8 bytes) are provided. Unaligned integer buffer types are provided for all sizes, and aligned buffer types are provided for 16, 32, and 64-bit sizes. The provided specific types are typedefs for a generic class template that may be used directly for less common use cases.
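To illustrate the idea of a buffer that holds a size the built-in integer types do not offer, here is a rough sketch of a three-byte big-endian buffer written in standard C++. The class big_u24_buffer_sketch and its members are hypothetical; this is not the library's class template, only an illustration of explicit conversion to and from a built-in integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an unaligned 24-bit big-endian buffer type.
// It stores exactly three bytes and converts only when asked to.
class big_u24_buffer_sketch
{
public:
    // Explicit conversion from a native value: most-significant byte first.
    void store(std::uint32_t v)
    {
        assert(v <= 0xFFFFFFu);  // the value must fit in 24 bits
        bytes_[0] = static_cast<unsigned char>(v >> 16);
        bytes_[1] = static_cast<unsigned char>(v >> 8);
        bytes_[2] = static_cast<unsigned char>(v);
    }

    // Explicit conversion back to a native value.
    std::uint32_t load() const
    {
        return (static_cast<std::uint32_t>(bytes_[0]) << 16) |
               (static_cast<std::uint32_t>(bytes_[1]) << 8) |
                static_cast<std::uint32_t>(bytes_[2]);
    }

    // Read-only access to the stored bytes, for example for writing to a file.
    const unsigned char* data() const { return bytes_; }

private:
    unsigned char bytes_[3];
};

static_assert(sizeof(big_u24_buffer_sketch) == 3, "three bytes, no padding");
```

Because the object is three bytes with no alignment requirement beyond that of unsigned char, an array of such objects occupies three bytes per element rather than four.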

Endian arithmetic types

The application uses the provided endian arithmetic types, which supply the same operations as the built-in C++ arithmetic types. All conversions are implicit. Arithmetic sizes of 8, 16, 24, 32, 40, 48, 56, and 64 bits (i.e. 1, 2, 3, 4, 5, 6, 7, and 8 bytes) are provided. Unaligned integer types are provided for all sizes and aligned arithmetic types are provided for 16, 32, and 64-bit sizes. The provided specific types are typedefs for a generic class template that may be used directly in generic code or for less common use cases.

Boost.Endian is a header-only library. C++11 features affecting interfaces, such as noexcept, are used only if available. See C++03 support for C++11 features for details.

Choosing between conversion functions, buffer types, and arithmetic types

This section has been moved to its own Choosing the Approach page.

Built-in support for Intrinsics

Most compilers, including GCC, Clang, and Visual C++, supply built-in support for byte swapping intrinsics. The Endian library uses these intrinsics when available since they may result in smaller and faster generated code, particularly for optimized builds.

Defining the macro BOOST_ENDIAN_NO_INTRINSICS will suppress use of the intrinsics. This is useful when a compiler has no intrinsic support or fails to locate the appropriate header, perhaps because it is an older release or has very limited supporting libraries.

The macro BOOST_ENDIAN_INTRINSIC_MSG is defined as either "no byte swap intrinsics" or a string describing the particular set of intrinsics being used. This is useful for eliminating missing intrinsics as a source of performance issues.
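The general pattern of preferring a compiler intrinsic and falling back to portable code can be sketched as follows. This is not the library's implementation: SKETCH_NO_INTRINSICS, SKETCH_INTRINSIC_MSG, swap32, and swap32_portable are hypothetical names chosen for this example, while __builtin_bswap32 (GCC, Clang) and _byteswap_ulong (Visual C++) are the compiler-provided intrinsics.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>   // declares _byteswap_ulong
#endif

// Portable fallback: reverses the four bytes with shifts and masks.
inline std::uint32_t swap32_portable(std::uint32_t x)
{
    return (x << 24) | ((x << 8) & 0x00FF0000u) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// SKETCH_NO_INTRINSICS is a hypothetical opt-out macro for this example.
#if !defined(SKETCH_NO_INTRINSICS) && (defined(__GNUC__) || defined(__clang__))
#  define SKETCH_INTRINSIC_MSG "__builtin_bswap32"
inline std::uint32_t swap32(std::uint32_t x) { return __builtin_bswap32(x); }
#elif !defined(SKETCH_NO_INTRINSICS) && defined(_MSC_VER)
#  define SKETCH_INTRINSIC_MSG "_byteswap_ulong"
inline std::uint32_t swap32(std::uint32_t x) { return _byteswap_ulong(x); }
#else
#  define SKETCH_INTRINSIC_MSG "no byte swap intrinsics"
inline std::uint32_t swap32(std::uint32_t x) { return swap32_portable(x); }
#endif
```

Printing a message macro of this kind in a benchmark log is one way to confirm which code path a given build actually selected.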

Performance

Consider this problem:

Example 1

Add 100 to a big endian value in a file, then write the result to a file

Endian arithmetic type approach

big_int32_at x;

... read into x from a file ...

x += 100;

... write x to a file ...

Endian conversion function approach

int32_t x;

... read into x from a file ...

big_to_native_inplace(x);
x += 100;
native_to_big_inplace(x);

... write x to a file ...

There will be no performance difference between the two approaches in optimized builds, regardless of the native endianness of the machine. That’s because optimizing compilers will generate exactly the same code for each. That conclusion was confirmed by studying the generated assembly code for GCC and Visual C++. Furthermore, time spent doing I/O will determine the speed of this application.

Now consider a slightly different problem:

Example 2

Add a million values to a big endian value in a file, then write the result to a file

Endian arithmetic type approach

big_int32_at x;

... read into x from a file ...

for (int32_t i = 0; i < 1000000; ++i)
  x += i;

... write x to a file ...

Endian conversion function approach

int32_t x;

... read into x from a file ...

big_to_native_inplace(x);

for (int32_t i = 0; i < 1000000; ++i)
  x += i;

native_to_big_inplace(x);

... write x to a file ...

With the Endian arithmetic approach, on little endian platforms an implicit conversion from and then back to big endian is done inside the loop. With the Endian conversion function approach, the user has ensured the conversions are done outside the loop, so the code may run more quickly on little endian platforms.
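The difference in the number of conversions can be made visible with a small counting sketch. Everything here is hypothetical and written in standard C++: reversed_u32_sketch imitates an arithmetic-style type whose stored bytes are always in the non-native order, and conversion_count simply counts calls to reverse32. It models the source-level behaviour only; an optimizer may transform real code differently.

```cpp
#include <cassert>
#include <cstdint>

// Counts byte-order conversions so the two styles can be compared.
static unsigned long conversion_count = 0;

// Reverses the four bytes of x and records that a conversion took place.
inline std::uint32_t reverse32(std::uint32_t x)
{
    ++conversion_count;
    return (x << 24) | ((x << 8) & 0x00FF0000u) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Hypothetical arithmetic-style wrapper: the stored bytes are always in the
// non-native order, so every += converts out and back again.
struct reversed_u32_sketch
{
    std::uint32_t stored;  // value held in reversed byte order

    reversed_u32_sketch& operator+=(std::uint32_t rhs)
    {
        stored = reverse32(reverse32(stored) + rhs);
        return *this;
    }
};

// Arithmetic-type style: conversions happen inside the loop.
inline std::uint32_t sum_with_wrapper(std::uint32_t stored, std::uint32_t n)
{
    reversed_u32_sketch x = { stored };
    for (std::uint32_t i = 0; i < n; ++i)
        x += i;
    return x.stored;
}

// Conversion-function style: convert once before and once after the loop.
inline std::uint32_t sum_with_explicit_conversion(std::uint32_t stored,
                                                  std::uint32_t n)
{
    std::uint32_t x = reverse32(stored);
    for (std::uint32_t i = 0; i < n; ++i)
        x += i;
    return reverse32(x);
}
```

For n iterations the wrapper style performs 2n conversions in this model, while the explicit style performs two, and both produce the same stored result.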

Timings

These tests were run against release builds on a circa 2012 4-core little endian X64 Intel Core i5-3570K CPU @ 3.40GHz under Windows 7.

Caution
The Windows CPU timer has very coarse granularity. Repeated runs of the same tests often yield considerably different results.

See test/loop_time_test.cpp for the actual code and benchmark/Jamfile.v2 for the build setup.

GNU C++ version 4.8.2 on Linux virtual machine

Iterations: 10'000'000'000, Intrinsics: __builtin_bswap16, etc.

Test Case                      Endian arithmetic type   Endian conversion function
16-bit aligned big endian      8.46 s                   5.28 s
16-bit aligned little endian   5.28 s                   5.22 s
32-bit aligned big endian      8.40 s                   2.11 s
32-bit aligned little endian   2.11 s                   2.10 s
64-bit aligned big endian      14.02 s                  3.10 s
64-bit aligned little endian   3.00 s                   3.03 s

Microsoft Visual C++ version 14.0

Iterations: 10'000'000'000, Intrinsics: <cstdlib> _byteswap_ushort, etc.

Test Case                      Endian arithmetic type   Endian conversion function
16-bit aligned big endian      8.27 s                   5.26 s
16-bit aligned little endian   5.29 s                   5.32 s
32-bit aligned big endian      8.36 s                   5.24 s
32-bit aligned little endian   5.24 s                   5.24 s
64-bit aligned big endian      13.65 s                  3.34 s
64-bit aligned little endian   3.35 s                   2.73 s

Overall FAQ

Is the implementation header only?

Yes.

Are C++03 compilers supported?

Yes.

Does the implementation use compiler intrinsic built-in byte swapping?

Yes, if available. See Built-in support for Intrinsics.

Why bother with endianness?

Binary data portability is the primary use case.

Does endianness have any uses outside of portable binary file or network I/O formats?

Using the unaligned integer types with a size tailored to the application’s needs is a minor secondary use that saves internal or external memory space. For example, using big_int40_buf_t or big_int40_t in a large array saves a lot of space compared to one of the 64-bit types.

Why bother with binary I/O? Why not just use C++ Standard Library stream inserters and extractors?

  • Data interchange formats often specify binary integer data. Binary integer data is smaller and therefore I/O is faster and file sizes are smaller. Transfer between systems is less expensive.

  • Furthermore, binary integer data is of fixed size, and so fixed-size disk records are possible without padding, easing sorting and allowing random access.

  • Disadvantages, such as the inability to use text utilities on the resulting files, limit usefulness to applications where the binary I/O advantages are paramount.

Which is better, big-endian or little-endian?

Big-endian tends to be preferred in a networking environment and is a bit more of an industry standard, but little-endian may be preferred for applications that run primarily on x86, x86-64, and other little-endian CPUs. The Wikipedia article gives more pros and cons.

Why are only big and little native endianness supported?

These are the only endian schemes that have any practical value today. PDP-11 and the other middle endian approaches are interesting curiosities but have no relevance for today’s C++ developers. The same is true for architectures that allow runtime endianness switching. The specification for native ordering has been carefully crafted to allow support for such orderings in the future, should the need arise. Thanks to Howard Hinnant for suggesting this.

Why do both the buffer and arithmetic types exist?

Conversions in the buffer types are explicit. Conversions in the arithmetic types are implicit. This fundamental difference is a deliberate design feature that would be lost if the inheritance hierarchy were collapsed. The original design provided only arithmetic types. Buffer types were requested during formal review by those wishing total control over when conversion occurs. They also felt that buffer types would be less likely to be misused by maintenance programmers not familiar with the implications of performing a lot of integer operations on the endian arithmetic integer types.

What is gained by using the buffer types rather than always just using the arithmetic types?

Assurance that hidden conversions are not performed. This is of overriding importance to users concerned about achieving the ultimate in terms of speed. "Always just using the arithmetic types" is fine for other users. When the ultimate in speed needs to be ensured, the arithmetic types can be used in the same design patterns or idioms that would be used for buffer types, resulting in the same code being generated for either type.

What are the limitations of integer support?

Tests have only been performed on machines that use two’s complement arithmetic. The Endian conversion functions only support 16, 32, and 64-bit aligned integers. The endian types only support 8, 16, 24, 32, 40, 48, 56, and 64-bit unaligned integers, and 8, 16, 32, and 64-bit aligned integers.

Why is there no floating point support?

An attempt was made to support four-byte floats and eight-byte doubles, limited to IEEE 754 (also known as ISO/IEC/IEEE 60559) floating point and further limited to systems where floating point endianness does not differ from integer endianness. Even with those limitations, support for floating point types was not reliable and was removed. For example, simply reversing the endianness of a floating point number can result in a signaling NaN. For all practical purposes, binary serialization and endianness for integers are one and the same problem. That is not true for floating point numbers, so binary serialization interfaces and formats for floating point do not fit well in an endian-based library.
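The hazard with byte-reversed floating point values can be shown purely on bit patterns, without loading anything into a floating point register. The sketch below is standard C++ with hypothetical helper names. It assumes the IEEE 754 binary32 layout and the common hardware convention (for example on x86 and ARM) that a NaN whose most significant fraction bit is clear is a signaling NaN.

```cpp
#include <cassert>
#include <cstdint>

// Reverses the four bytes of a 32-bit pattern.
inline std::uint32_t reverse32(std::uint32_t x)
{
    return (x << 24) | ((x << 8) & 0x00FF0000u) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// IEEE 754 binary32: all exponent bits set and a non-zero fraction means NaN.
inline bool is_nan_pattern(std::uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0;
}

// Assumption: a NaN whose most significant fraction bit is clear is signaling.
inline bool is_signaling_nan_pattern(std::uint32_t bits)
{
    return is_nan_pattern(bits) && (bits & 0x00400000u) == 0;
}
```

Under those assumptions, the pattern 0x0000A07F is an ordinary (subnormal) finite value, yet its byte-reversed form 0x7FA00000 has the shape of a signaling NaN.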

History

Changes requested by formal review

The library was reworked from top to bottom to accommodate changes requested during the formal review. See Mini-Review page for details.

Other changes since formal review

  • Header boost/endian/endian.hpp has been renamed to boost/endian/arithmetic.hpp. Headers boost/endian/conversion.hpp and boost/endian/buffers.hpp have been added. Infrastructure file names were changed accordingly.

  • The endian arithmetic type aliases have been renamed, using a naming pattern that is consistent for both integer and floating point, and a consistent set of aliases supplied for the endian buffer types.

  • The unaligned-type alias names still have the _t suffix, but the aligned-type alias names now have an _at suffix.

  • endian_reverse() overloads for int8_t and uint8_t have been added for improved generality. (Pierre Talbot)

  • Overloads of endian_reverse_inplace() have been replaced with a single endian_reverse_inplace() template. (Pierre Talbot)

  • For X86 and X64 architectures, which permit unaligned loads and stores, unaligned little endian buffer and arithmetic types use regular loads and stores when the size is exact. This makes unaligned little endian buffer and arithmetic types significantly more efficient on these architectures. (Jeremy Maitin-Shepard)

  • C++11 features affecting interfaces, such as noexcept, are now used. C++03 compilers are still supported.

  • Acknowledgements have been updated.

Compatibility with interim releases

Prior to the official Boost release, class template endian_arithmetic had been used for a decade or more with the same functionality but under the name endian. Other names also changed in the official release. If the macro BOOST_ENDIAN_DEPRECATED_NAMES is defined, those old, now-deprecated names are still supported. However, the class template endian name is only provided for compilers supporting C++11 template aliases. For C++03 compilers, the name will have to be changed to endian_arithmetic.

To support backward header compatibility, deprecated header boost/endian/endian.hpp forwards to boost/endian/arithmetic.hpp. It requires BOOST_ENDIAN_DEPRECATED_NAMES to be defined. It should only be used while transitioning to the official Boost release of the library, as it will be removed in some future release.

C++03 support for C++11 features

C++11 Feature        Action with C++03 Compilers
Scoped enums         Uses header boost/core/scoped_enum.hpp to emulate C++11 scoped enums.
noexcept             Uses BOOST_NOEXCEPT macro, which is defined as null for compilers not supporting this C++11 feature.
C++11 PODs (N2342)   Takes advantage of C++03 compilers that relax C++03 POD rules, but see Limitations here and here. Also see macros for explicit POD control here and here.

Future directions

Standardization.

The plan is to submit Boost.Endian to the C++ standards committee for possible inclusion in a Technical Specification or the C++ standard itself.

Specializations for numeric_limits.

Roger Leigh requested that all boost::endian types provide numeric_limits specializations. See GitHub issue 4.

Character buffer support.

Peter Dimov pointed out during the mini-review that getting and setting basic arithmetic types (or <cstdint> equivalents) from/to an offset into an array of unsigned char is a common need. See Boost.Endian mini-review posting.

Out-of-range detection.

Peter Dimov suggested during the mini-review that throwing an exception on buffer values being out-of-range might be desirable. See the end of this posting and subsequent replies.

Acknowledgements

Comments and suggestions were received from Adder, Benaka Moorthi, Christopher Kohlhoff, Cliff Green, Daniel James, Dave Handley, Gennaro Proto, Giovanni Piero Deretta, Gordon Woodhull, dizzy, Hartmut Kaiser, Howard Hinnant, Jason Newton, Jeff Flinn, Jeremy Maitin-Shepard, John Filo, John Maddock, Kim Barrett, Marsh Ray, Martin Bonner, Mathias Gaunard, Matias Capeletto, Neil Mayhew, Nevin Liber, Olaf van der Spek, Paul Bristow, Peter Dimov, Pierre Talbot, Phil Endecott, Philip Bennefall, Pyry Jahkola, Rene Rivera, Robert Stewart, Roger Leigh, Roland Schwarz, Scott McMurray, Sebastian Redl, Tim Blechmann, Tim Moore, tymofey, Tomas Puverle, Vincente Botet, Yuval Ronen and Vitaly Budovsk. Apologies if anyone has been missed.

The documentation was converted into Asciidoc format by Glen Fernandes.

Revision History

Changes in 1.72.0

  • Made endian_reverse, conditional_reverse and *_to_* constexpr on GCC and Clang

  • Added convenience load and store functions

  • Added floating point convenience typedefs

  • Added a non-const overload of data(); changed its return type to unsigned char*

  • Added __int128 support to endian_reverse when available

  • Added a convenience header boost/endian.hpp

Changes in 1.71.0

  • Clarified requirements on the value type template parameter

  • Added support for float and double

  • Added endian_load, endian_store

  • Updated endian_reverse to correctly support all non-bool integral types

  • Moved deprecated names to the deprecated header endian.hpp

Endian Conversion Functions

Introduction

Header boost/endian/conversion.hpp provides byte order reversal and conversion functions that convert objects of the built-in integer types between native, big, or little endian byte ordering. User defined types are also supported.

Reference

Functions are implemented inline if appropriate. For C++03 compilers, noexcept is elided. Boost scoped enum emulation is used so that the library still works for compilers that do not support scoped enums.

Definitions

Endianness refers to the ordering of bytes within internal or external integers and other arithmetic data. Most-significant byte first is called big endian ordering. Least-significant byte first is called little endian ordering. Other orderings are possible and some CPU architectures support both big and little ordering.

Note
The names are derived from Jonathan Swift's satirical novel Gulliver’s Travels, where rival kingdoms opened their soft-boiled eggs at different ends. Wikipedia has an extensive description of Endianness.

The standard integral types (C++std 3.9.1) except bool are collectively called the endian types.

Header <boost/endian/conversion.hpp> Synopsis

#define BOOST_ENDIAN_INTRINSIC_MSG \
   "message describing presence or absence of intrinsics"

namespace boost
{
namespace endian
{
  enum class order
  {
    native = see below,
    big    = see below,
    little = see below,
  };

  // Byte reversal functions

  template <class Endian>
    Endian endian_reverse(Endian x) noexcept;

  template <class EndianReversible>
    EndianReversible big_to_native(EndianReversible x) noexcept;
  template <class EndianReversible>
    EndianReversible native_to_big(EndianReversible x) noexcept;
  template <class EndianReversible>
    EndianReversible little_to_native(EndianReversible x) noexcept;
  template <class EndianReversible>
    EndianReversible native_to_little(EndianReversible x) noexcept;

  template <order O1, order O2, class EndianReversible>
    EndianReversible conditional_reverse(EndianReversible x) noexcept;
  template <class EndianReversible>
    EndianReversible conditional_reverse(EndianReversible x,
      order order1, order order2) noexcept;

  // In-place byte reversal functions

  template <class EndianReversible>
    void endian_reverse_inplace(EndianReversible& x) noexcept;

  template <class EndianReversibleInplace>
    void big_to_native_inplace(EndianReversibleInplace& x) noexcept;
  template <class EndianReversibleInplace>
    void native_to_big_inplace(EndianReversibleInplace& x) noexcept;
  template <class EndianReversibleInplace>
    void little_to_native_inplace(EndianReversibleInplace& x) noexcept;
  template <class EndianReversibleInplace>
    void native_to_little_inplace(EndianReversibleInplace& x) noexcept;

  template <order O1, order O2, class EndianReversibleInplace>
    void conditional_reverse_inplace(EndianReversibleInplace& x) noexcept;
  template <class EndianReversibleInplace>
    void conditional_reverse_inplace(EndianReversibleInplace& x,
      order order1, order order2) noexcept;

  // Generic load and store functions

  template<class T, std::size_t N, order Order>
    T endian_load( unsigned char const * p ) noexcept;

  template<class T, std::size_t N, order Order>
    void endian_store( unsigned char * p, T const & v ) noexcept;

  // Convenience load functions

  boost::int16_t load_little_s16( unsigned char const * p ) noexcept;
  boost::uint16_t load_little_u16( unsigned char const * p ) noexcept;
  boost::int16_t load_big_s16( unsigned char const * p ) noexcept;
  boost::uint16_t load_big_u16( unsigned char const * p ) noexcept;

  boost::int32_t load_little_s24( unsigned char const * p ) noexcept;
  boost::uint32_t load_little_u24( unsigned char const * p ) noexcept;
  boost::int32_t load_big_s24( unsigned char const * p ) noexcept;
  boost::uint32_t load_big_u24( unsigned char const * p ) noexcept;

  boost::int32_t load_little_s32( unsigned char const * p ) noexcept;
  boost::uint32_t load_little_u32( unsigned char const * p ) noexcept;
  boost::int32_t load_big_s32( unsigned char const * p ) noexcept;
  boost::uint32_t load_big_u32( unsigned char const * p ) noexcept;

  boost::int64_t load_little_s40( unsigned char const * p ) noexcept;
  boost::uint64_t load_little_u40( unsigned char const * p ) noexcept;
  boost::int64_t load_big_s40( unsigned char const * p ) noexcept;
  boost::uint64_t load_big_u40( unsigned char const * p ) noexcept;

  boost::int64_t load_little_s48( unsigned char const * p ) noexcept;
  boost::uint64_t load_little_u48( unsigned char const * p ) noexcept;
  boost::int64_t load_big_s48( unsigned char const * p ) noexcept;
  boost::uint64_t load_big_u48( unsigned char const * p ) noexcept;

  boost::int64_t load_little_s56( unsigned char const * p ) noexcept;
  boost::uint64_t load_little_u56( unsigned char const * p ) noexcept;
  boost::int64_t load_big_s56( unsigned char const * p ) noexcept;
  boost::uint64_t load_big_u56( unsigned char const * p ) noexcept;

  boost::int64_t load_little_s64( unsigned char const * p ) noexcept;
  boost::uint64_t load_little_u64( unsigned char const * p ) noexcept;
  boost::int64_t load_big_s64( unsigned char const * p ) noexcept;
  boost::uint64_t load_big_u64( unsigned char const * p ) noexcept;

  // Convenience store functions

  void store_little_s16( unsigned char * p, boost::int16_t v ) noexcept;
  void store_little_u16( unsigned char * p, boost::uint16_t v ) noexcept;
  void store_big_s16( unsigned char * p, boost::int16_t v ) noexcept;
  void store_big_u16( unsigned char * p, boost::uint16_t v ) noexcept;

  void store_little_s24( unsigned char * p, boost::int32_t v ) noexcept;
  void store_little_u24( unsigned char * p, boost::uint32_t v ) noexcept;
  void store_big_s24( unsigned char * p, boost::int32_t v ) noexcept;
  void store_big_u24( unsigned char * p, boost::uint32_t v ) noexcept;

  void store_little_s32( unsigned char * p, boost::int32_t v ) noexcept;
  void store_little_u32( unsigned char * p, boost::uint32_t v ) noexcept;
  void store_big_s32( unsigned char * p, boost::int32_t v ) noexcept;
  void store_big_u32( unsigned char * p, boost::uint32_t v ) noexcept;

  void store_little_s40( unsigned char * p, boost::int64_t v ) noexcept;
  void store_little_u40( unsigned char * p, boost::uint64_t v ) noexcept;
  void store_big_s40( unsigned char * p, boost::int64_t v ) noexcept;
  void store_big_u40( unsigned char * p, boost::uint64_t v ) noexcept;

  void store_little_s48( unsigned char * p, boost::int64_t v ) noexcept;
  void store_little_u48( unsigned char * p, boost::uint64_t v ) noexcept;
  void store_big_s48( unsigned char * p, boost::int64_t v ) noexcept;
  void store_big_u48( unsigned char * p, boost::uint64_t v ) noexcept;

  void store_little_s56( unsigned char * p, boost::int64_t v ) noexcept;
  void store_little_u56( unsigned char * p, boost::uint64_t v ) noexcept;
  void store_big_s56( unsigned char * p, boost::int64_t v ) noexcept;
  void store_big_u56( unsigned char * p, boost::uint64_t v ) noexcept;

  void store_little_s64( unsigned char * p, boost::int64_t v ) noexcept;
  void store_little_u64( unsigned char * p, boost::uint64_t v ) noexcept;
  void store_big_s64( unsigned char * p, boost::int64_t v ) noexcept;
  void store_big_u64( unsigned char * p, boost::uint64_t v ) noexcept;

} // namespace endian
} // namespace boost

The values of order::little and order::big shall not be equal to one another.

The value of order::native shall be:

  • equal to order::big if the execution environment is big endian, otherwise

  • equal to order::little if the execution environment is little endian, otherwise

  • unequal to both order::little and order::big.
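One way such an enumeration could be arranged is sketched below in standard C++. The name order_sketch and the enumerator values are hypothetical and are not the library's. The sketch assumes the GCC/Clang predefined macros __BYTE_ORDER__, __ORDER_BIG_ENDIAN__, and __ORDER_LITTLE_ENDIAN__, assumes Windows targets are little endian, and on any other compiler treats the byte order as undetectable, so native takes a third value there.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical enum following the rules above; values are arbitrary.
enum class order_sketch
{
    big,
    little,
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    native = big
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    native = little
#elif defined(_WIN32)
    native = little   // assumption: Windows targets are little endian
#else
    native            // not detectable in this sketch: a distinct third value
#endif
};

// Compares the in-memory bytes of 0x01020304 with the expected sequence.
inline bool probe_matches(unsigned char b0, unsigned char b1,
                          unsigned char b2, unsigned char b3)
{
    const std::uint32_t probe = 0x01020304u;
    unsigned char raw[sizeof probe];
    std::memcpy(raw, &probe, sizeof probe);
    return raw[0] == b0 && raw[1] == b1 && raw[2] == b2 && raw[3] == b3;
}

// Runtime checks that can be used to cross-check the compile-time choice.
inline bool memory_is_big_endian()    { return probe_matches(0x01, 0x02, 0x03, 0x04); }
inline bool memory_is_little_endian() { return probe_matches(0x04, 0x03, 0x02, 0x01); }
```

Because native is an ordinary enumerator, comparisons such as order_sketch::native == order_sketch::big are constant expressions and can drive compile-time selection.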

Requirements

Template argument requirements

The template definitions in the boost/endian/conversion.hpp header refer to various named requirements whose details are set out in the tables in this subsection. In these tables, T is an object or reference type to be supplied by a C++ program instantiating a template; x is a value of type (possibly const) T; mlx is a modifiable lvalue of type T.

EndianReversible requirements (in addition to CopyConstructible)

Expression: endian_reverse(x)

Return: T

Requirements: T is an endian type or a class type.

If T is an endian type, returns the value of x with the order of bytes reversed.

If T is a class type, the function:

  • Returns the value of x with the order of bytes reversed for all data members of types or arrays of types that meet the EndianReversible requirements, and

  • Is a non-member function in the same namespace as T that can be found by argument dependent lookup (ADL).

EndianReversibleInplace requirements (in addition to CopyConstructible)

Expression: endian_reverse_inplace(mlx)

Requirements: T is an endian type or a class type.

If T is an endian type, reverses the order of bytes in mlx.

If T is a class type, the function:

  • Reverses the order of bytes of all data members of mlx that have types or arrays of types that meet the EndianReversible or EndianReversibleInplace requirements, and

  • Is a non-member function in the same namespace as T that can be found by argument dependent lookup (ADL).

Note
Because there is a function template for endian_reverse_inplace that calls endian_reverse, only endian_reverse is required for a user-defined type to meet the EndianReversibleInplace requirements. Although user-defined types are not required to supply an endian_reverse_inplace function, doing so may improve efficiency.

Customization points for user-defined types (UDTs)

This subsection describes requirements on the Endian library’s implementation.

The library’s function templates requiring EndianReversible are required to perform reversal of endianness if needed by making an unqualified call to endian_reverse().

The library’s function templates requiring EndianReversibleInplace are required to perform reversal of endianness if needed by making an unqualified call to endian_reverse_inplace().

See example/udt_conversion_example.cpp for an example user-defined type.
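The mechanism of an unqualified call plus argument dependent lookup can be illustrated without the library. In the sketch below, namespace sketch stands in for the library and namespace app for user code; all names other than endian_reverse are hypothetical, and the bodies are not the library's implementation.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
    // Stand-ins for overloads covering built-in integer types.
    inline std::uint16_t endian_reverse(std::uint16_t x)
    {
        return static_cast<std::uint16_t>((x << 8) | (x >> 8));
    }

    inline std::uint32_t endian_reverse(std::uint32_t x)
    {
        return (x << 24) | ((x << 8) & 0x00FF0000u) |
               ((x >> 8) & 0x0000FF00u) | (x >> 24);
    }

    // Stand-in for a library function template. The unqualified call lets
    // argument dependent lookup find overloads for user-defined types.
    template <class EndianReversible>
    EndianReversible reverse_via_customization_point(EndianReversible x)
    {
        return endian_reverse(x);
    }
}

namespace app
{
    // Hypothetical user-defined type.
    struct record_header
    {
        std::uint32_t id;
        std::uint16_t length;
    };

    // Non-member function in the same namespace as the type, found by ADL.
    inline record_header endian_reverse(record_header h)
    {
        h.id = sketch::endian_reverse(h.id);
        h.length = sketch::endian_reverse(h.length);
        return h;
    }
}
```

The function template never names app, yet a call with an app::record_header argument reaches app::endian_reverse because lookup considers the namespace of the argument's type.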

Byte Reversal Functions

template <class Endian>
Endian endian_reverse(Endian x) noexcept;
  • Requires

    Endian must be a standard integral type that is not bool.

    Returns

    x, with the order of its constituent bytes reversed.
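What "the order of its constituent bytes reversed" means can be sketched with a portable loop. The function reverse_bytes_sketch below is a hypothetical illustration in standard C++, restricted to unsigned types for simplicity; it is not the library's implementation, which may use intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverses the constituent bytes of an unsigned integer one byte at a time.
template <class UInt>
UInt reverse_bytes_sketch(UInt x)
{
    static_assert(std::is_unsigned<UInt>::value &&
                  !std::is_same<UInt, bool>::value,
                  "UInt must be an unsigned integer type other than bool");
    UInt result = 0;
    for (std::size_t i = 0; i < sizeof(UInt); ++i)
    {
        // Move the lowest remaining byte of x onto the low end of result.
        result = static_cast<UInt>((result << 8) | (x & 0xFFu));
        x = static_cast<UInt>(x >> 8);
    }
    return result;
}
```

For a one-byte type the loop runs once and returns the value unchanged, and applying the function twice to any value yields the original.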

template <class EndianReversible>
EndianReversible big_to_native(EndianReversible x) noexcept;
  • Returns

    conditional_reverse<order::big, order::native>(x).

template <class EndianReversible>
EndianReversible native_to_big(EndianReversible x) noexcept;
  • Returns

    conditional_reverse<order::native, order::big>(x).

template <class EndianReversible>
EndianReversible little_to_native(EndianReversible x) noexcept;
  • Returns

    conditional_reverse<order::little, order::native>(x).

template <class EndianReversible>
EndianReversible native_to_little(EndianReversible x) noexcept;
  • Returns

    conditional_reverse<order::native, order::little>(x).

template <order O1, order O2, class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x) noexcept;
  • Returns

    x if O1 == O2, otherwise endian_reverse(x).

    Remarks

    Whether x or endian_reverse(x) is to be returned shall be determined at compile time.

template <class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x,
     order order1, order order2) noexcept;
  • Returns

    order1 == order2? x: endian_reverse(x).
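The distinction between the compile-time and runtime forms can be sketched as follows. This is a hypothetical standard C++ illustration limited to 32-bit values and a two-value enum; the names order_sketch, conditional_reverse_sketch, and detail_sketch are not the library's, and tag dispatch is just one possible way to make the choice at compile time.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

enum class order_sketch { big, little };   // hypothetical two-value enum

// Reverses the four bytes of x.
inline std::uint32_t reverse32(std::uint32_t x)
{
    return (x << 24) | ((x << 8) & 0x00FF0000u) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

namespace detail_sketch
{
    // Selected when the two orders are equal: nothing to do.
    inline std::uint32_t apply(std::uint32_t x, std::true_type) { return x; }

    // Selected when the two orders differ: reverse the bytes.
    inline std::uint32_t apply(std::uint32_t x, std::false_type) { return reverse32(x); }
}

// Compile-time form: the overload is chosen during compilation.
template <order_sketch O1, order_sketch O2>
std::uint32_t conditional_reverse_sketch(std::uint32_t x)
{
    return detail_sketch::apply(x, std::integral_constant<bool, O1 == O2>());
}

// Runtime form: the orders are ordinary function arguments.
inline std::uint32_t conditional_reverse_sketch(std::uint32_t x,
                                                order_sketch o1,
                                                order_sketch o2)
{
    return o1 == o2 ? x : reverse32(x);
}
```

The runtime form is useful when the byte order of the data is only known while the program runs, for instance when it is read from a file header.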

In-place Byte Reversal Functions

template <class EndianReversible>
void endian_reverse_inplace(EndianReversible& x) noexcept;
  • Effects

    x = endian_reverse(x).

template <class EndianReversibleInplace>
void big_to_native_inplace(EndianReversibleInplace& x) noexcept;
  • Effects

    conditional_reverse_inplace<order::big, order::native>(x).

template <class EndianReversibleInplace>
void native_to_big_inplace(EndianReversibleInplace& x) noexcept;
  • Effects

    conditional_reverse_inplace<order::native, order::big>(x).

template <class EndianReversibleInplace>
void little_to_native_inplace(EndianReversibleInplace& x) noexcept;
  • Effects

    conditional_reverse_inplace<order::little, order::native>(x).

template <class EndianReversibleInplace>
void native_to_little_inplace(EndianReversibleInplace& x) noexcept;
  • Effects

    conditional_reverse_inplace<order::native, order::little>(x).

template <order O1, order O2, class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x) noexcept;
  • Effects

    None if O1 == O2, otherwise endian_reverse_inplace(x).

    Remarks

    Which effect applies shall be determined at compile time.

template <class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x,
     order order1, order order2) noexcept;
  • Effects

    None if order1 == order2, otherwise endian_reverse_inplace(x).

Generic Load and Store Functions

template<class T, std::size_t N, order Order>
T endian_load( unsigned char const * p ) noexcept;
  • Requires

    sizeof(T) must be 1, 2, 4, or 8. N must be between 1 and sizeof(T), inclusive. T must be trivially copyable. If N is not equal to sizeof(T), T must be integral or enum.

    Effects

    Reads N bytes starting from p, in forward or reverse order depending on whether Order matches the native endianness or not, interprets the resulting bit pattern as a value of type T, and returns it. If sizeof(T) is bigger than N, zero-extends when T is unsigned, sign-extends otherwise.
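The zero-extension and sign-extension behaviour described above can be sketched with byte-by-byte assembly in standard C++. The two functions below are hypothetical illustrations limited to results of at most 32 bits; they are not the library's endian_load, and they assume 8-bit bytes and two's complement integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads n big-endian bytes (1 <= n <= 4) and zero-extends to 32 bits.
inline std::uint32_t load_big_unsigned_sketch(const unsigned char* p,
                                              std::size_t n)
{
    assert(p != 0 && n >= 1 && n <= 4);
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v = (v << 8) | p[i];   // the first byte is the most significant
    return v;
}

// Reads n little-endian bytes (1 <= n <= 4) and sign-extends to 32 bits.
inline std::int32_t load_little_signed_sketch(const unsigned char* p,
                                              std::size_t n)
{
    assert(p != 0 && n >= 1 && n <= 4);
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);   // first byte is least significant
    if (n < 4 && (v & (1u << (8 * n - 1))) != 0)
        v |= ~((1u << (8 * n)) - 1u);   // copy the sign bit into the upper bytes
    std::int32_t result;
    std::memcpy(&result, &v, sizeof result);   // reinterpret the bit pattern
    return result;
}
```

Because the bytes are combined with shifts rather than copied wholesale, the same code gives the same answer regardless of the native byte order of the machine.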

template<class T, std::size_t N, order Order>
void endian_store( unsigned char * p, T const & v ) noexcept;
  • Requires

    sizeof(T) must be 1, 2, 4, or 8. N must be between 1 and sizeof(T), inclusive. T must be trivially copyable. If N is not equal to sizeof(T), T must be integral or enum.

    Effects

    Writes to p the N least significant bytes from the object representation of v, in forward or reverse order depending on whether Order matches the native endianness or not.
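Storing only the N least significant bytes can be sketched in the same byte-by-byte style. The functions store_big_sketch and store_little_sketch are hypothetical standard C++ illustrations, not the library's endian_store; they assume 8-bit bytes and take the byte count as a runtime argument for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes the n least significant bytes of v (1 <= n <= 8) to p,
// most significant of those bytes first.
inline void store_big_sketch(unsigned char* p, std::uint64_t v, std::size_t n)
{
    assert(p != 0 && n >= 1 && n <= 8);
    for (std::size_t i = 0; i < n; ++i)
        p[i] = static_cast<unsigned char>(v >> (8 * (n - 1 - i)));
}

// Writes the n least significant bytes of v (1 <= n <= 8) to p,
// least significant byte first.
inline void store_little_sketch(unsigned char* p, std::uint64_t v, std::size_t n)
{
    assert(p != 0 && n >= 1 && n <= 8);
    for (std::size_t i = 0; i < n; ++i)
        p[i] = static_cast<unsigned char>(v >> (8 * i));
}
```

Any bytes of v above the lowest n are simply not written, which is why a value that does not fit in n bytes loses its upper part.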

Convenience Load Functions

inline boost::intM_t load_little_sN( unsigned char const * p ) noexcept;
  • Reads an N-bit signed little-endian integer from p.

    Returns

    endian_load<boost::intM_t, N/8, order::little>( p ).

inline boost::uintM_t load_little_uN( unsigned char const * p ) noexcept;
  • Reads an N-bit unsigned little-endian integer from p.

    Returns

    endian_load<boost::uintM_t, N/8, order::little>( p ).

inline boost::intM_t load_big_sN( unsigned char const * p ) noexcept;
  • Reads an N-bit signed big-endian integer from p.

    Returns

    endian_load<boost::intM_t, N/8, order::big>( p ).

inline boost::uintM_t load_big_uN( unsigned char const * p ) noexcept;
  • Reads an N-bit unsigned big-endian integer from p.

    Returns

    endian_load<boost::uintM_t, N/8, order::big>( p ).

Convenience Store Functions

inline void store_little_sN( unsigned char * p, boost::intM_t v ) noexcept;
  • Writes an N-bit signed little-endian integer to p.

    Effects

    endian_store<boost::intM_t, N/8, order::little>( p, v ).

inline void store_little_uN( unsigned char * p, boost::uintM_t v ) noexcept;
  • Writes an N-bit unsigned little-endian integer to p.

    Effects

    endian_store<boost::uintM_t, N/8, order::little>( p, v ).

inline void store_big_sN( unsigned char * p, boost::intM_t v ) noexcept;
  • Writes an N-bit signed big-endian integer to p.

    Effects

    endian_store<boost::intM_t, N/8, order::big>( p, v ).

inline void store_big_uN( unsigned char * p, boost::uintM_t v ) noexcept;
  • Writes an N-bit unsigned big-endian integer to p.

    Effects

    endian_store<boost::uintM_t, N/8, order::big>( p, v ).

FAQ

See the Overview FAQ for a library-wide FAQ.

Why are both value returning and modify-in-place functions provided?

  • Returning the result by value is the standard C and C++ idiom for functions that compute a value from an argument. Modify-in-place functions allow cleaner code in many real-world endian use cases and are more efficient for user-defined types that have members such as string data that do not need to be reversed. Thus both forms are provided.
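
The efficiency argument is easiest to see with a user-defined type that carries a string. The sketch below uses a hypothetical record type and hypothetical helper names; it does not use the library's customization points and is meant only to contrast the two call shapes.

```cpp
#include <cstdint>
#include <string>

// Portable byte reversal of a 32-bit value using shifts and masks.
inline std::uint32_t reverse32(std::uint32_t x) noexcept
{
    return (x << 24) | ((x & 0x0000FF00u) << 8)
         | ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Hypothetical user-defined type: only id has a byte order to reverse.
struct record
{
    std::uint32_t id;
    std::string   name;  // character data, never reversed
};

// Value-returning form: the argument is copied (string included),
// the copy is adjusted, and the copy is returned.
inline record reversed(record r)
{
    r.id = reverse32(r.id);
    return r;
}

// Modify-in-place form: only the integer member is touched.
inline void reverse_inplace(record& r) noexcept
{
    r.id = reverse32(r.id);
}
```

The value-returning form composes well in expressions, while the in-place form avoids copying name when the caller no longer needs the original.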

Why not use the Linux names (htobe16, htole16, be16toh, le16toh, etc.)?

  • Those names are non-standard and vary even between POSIX-like operating systems. A C++ library TS was going to use those names, but found they were sometimes implemented as macros. Since macros do not respect scoping and namespace rules, to use them would be very error prone.

Acknowledgements

Tomas Puverle was instrumental in identifying and articulating the need to support endian conversion as separate from endian integer types. Phil Endecott suggested the form of the value returning signatures. Vicente Botet and other reviewers suggested supporting user defined types. General reverse template implementation approach using std::reverse suggested by Mathias Gaunard. Portable implementation approach for 16, 32, and 64-bit integers suggested by tymofey, with avoidance of undefined behavior as suggested by Giovanni Piero Deretta, and a further refinement suggested by Pyry Jahkola. Intrinsic builtins implementation approach for 16, 32, and 64-bit integers suggested by several reviewers, and by David Stone, who provided his Boost licensed macro implementation that became the starting point for boost/endian/detail/intrinsic.hpp. Pierre Talbot provided the int8_t endian_reverse() and templated endian_reverse_inplace() implementations.

Endian Buffer Types

Introduction

The internal byte order of arithmetic types is traditionally called endianness. See Wikipedia for a full exploration of endianness, including definitions of big endian and little endian.

Header boost/endian/buffers.hpp provides endian_buffer, a portable endian integer binary buffer class template with control over byte order, value type, size, and alignment independent of the platform’s native endianness. Typedefs provide easy-to-use names for common configurations.

Use cases primarily involve data portability, either via files or network connections, but these byte-holders may also be used to reduce memory use, file size, or network activity since they provide binary numeric sizes not otherwise available.

Class endian_buffer is aimed at users who wish explicit control over when endianness conversions occur. It also serves as the base class for the endian_arithmetic class template, which is aimed at users who wish fully automatic endianness conversion and direct support for all normal arithmetic operations.

Example

The example/endian_example.cpp program writes a binary file containing four-byte, big-endian and little-endian integers:

#include <iostream>
#include <cstdio>
#include <boost/endian/buffers.hpp>  // see Synopsis below
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  //  This is an extract from a very widely used GIS file format.
  //  Why the designer decided to mix big and little endians in
  //  the same file is not known. But this is a real-world format
  //  and users wishing to write low level code manipulating these
  //  files have to deal with the mixed endianness.

  struct header
  {
    big_int32_buf_t     file_code;
    big_int32_buf_t     file_length;
    little_int32_buf_t  version;
    little_int32_buf_t  shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

  h.file_code   = 0x01020304;
  h.file_length = sizeof(header);
  h.version     = 1;
  h.shape_type  = 0x01020304;

  //  Low-level I/O such as POSIX read/write or <cstdio>
  //  fread/fwrite is sometimes used for binary file operations
  //  when ultimate efficiency is important. Such I/O is often
  //  performed in some C++ wrapper class, but to drive home the
  //  point that endian integers are often used in fairly
  //  low-level code that does bulk I/O operations, <cstdio>
  //  fopen/fwrite is used for I/O in this example.

  std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}

After compiling and executing example/endian_example.cpp, a hex dump of test.dat shows:

01020304 00000010 01000000 04030201

Notice that the first two 32-bit integers are big endian while the second two are little endian, even though the machine this was compiled and run on was little endian.
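
The dump can be cross-checked by writing the same four fields byte by byte. The sketch below does that with plain shifts instead of the buffer types; put_big32, put_little32, and make_header_bytes are hypothetical helpers, and the field values are the ones assigned in the example above.

```cpp
#include <array>
#include <cstdint>

// Writes v to p[0..3], most significant byte first.
inline void put_big32(unsigned char* p, std::uint32_t v) noexcept
{
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFFu);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[3] = static_cast<unsigned char>(v & 0xFFu);
}

// Writes v to p[0..3], least significant byte first.
inline void put_little32(unsigned char* p, std::uint32_t v) noexcept
{
    p[0] = static_cast<unsigned char>(v & 0xFFu);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[2] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    p[3] = static_cast<unsigned char>((v >> 24) & 0xFFu);
}

// Builds the 16 bytes that the example program writes to test.dat.
inline std::array<unsigned char, 16> make_header_bytes() noexcept
{
    std::array<unsigned char, 16> b{};
    put_big32(b.data() + 0, 0x01020304u);     // file_code
    put_big32(b.data() + 4, 16u);             // file_length == sizeof(header)
    put_little32(b.data() + 8, 1u);           // version
    put_little32(b.data() + 12, 0x01020304u); // shape_type
    return b;
}
```

The resulting bytes are 01 02 03 04, 00 00 00 10, 01 00 00 00, 04 03 02 01, which matches the hex dump regardless of the host's native byte order.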

Limitations

Requires <climits>, CHAR_BIT == 8. If CHAR_BIT is some other value, compilation will result in an #error. This restriction is in place because the design, implementation, testing, and documentation have only considered issues related to 8-bit bytes, and there have been no real-world use cases presented for other sizes.

In C++03, endian_buffer does not meet the requirements for POD types because it has constructors and a private data member. This means that common use cases are relying on unspecified behavior in that the C++ Standard does not guarantee memory layout for non-POD types. This has not been a problem in practice since all known C++ compilers lay out memory as if endian_buffer were a POD type. In C++11, it is possible to specify the default constructor as trivial, and private data members and base classes no longer disqualify a type from being a POD type. Thus under C++11, endian_buffer will no longer be relying on unspecified behavior.

Feature set

  • Big endian | little endian | native endian byte ordering

  • Signed | unsigned

  • Unaligned | aligned

  • 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)

  • Choice of value type

Enums and typedefs

Two scoped enums are provided:

enum class order { big, little, native };

enum class align { no, yes };

One class template is provided:

template <order Order, typename T, std::size_t Nbits,
  align Align = align::no>
class endian_buffer;

Typedefs, such as big_int32_buf_t, provide convenient naming conventions for common use cases:

Name                 Alignment  Endianness  Sign      Sizes in bits (N)
big_intN_buf_t       no         big         signed    8,16,24,32,40,48,56,64
big_uintN_buf_t      no         big         unsigned  8,16,24,32,40,48,56,64
little_intN_buf_t    no         little      signed    8,16,24,32,40,48,56,64
little_uintN_buf_t   no         little      unsigned  8,16,24,32,40,48,56,64
native_intN_buf_t    no         native      signed    8,16,24,32,40,48,56,64
native_uintN_buf_t   no         native      unsigned  8,16,24,32,40,48,56,64
big_intN_buf_at      yes        big         signed    8,16,32,64
big_uintN_buf_at     yes        big         unsigned  8,16,32,64
little_intN_buf_at   yes        little      signed    8,16,32,64
little_uintN_buf_at  yes        little      unsigned  8,16,32,64

The unaligned types do not cause compilers to insert padding bytes in classes and structs. This is an important characteristic that can be exploited to minimize wasted space in memory, files, and network transmissions.

Caution
Code that uses aligned types is possibly non-portable because alignment requirements vary between hardware architectures and because alignment may be affected by compiler switches or pragmas. For example, alignment of a 64-bit integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary on a 64-bit machine. Furthermore, aligned types are only available on architectures with 8, 16, 32, and 64-bit integer types.
Tip
Prefer unaligned buffer types.
Tip
Protect yourself against alignment ills. For example:
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");

Note: One-byte big and little buffer types have identical layout on all platforms, so they never actually reverse endianness. They are provided to enable generic code, and to improve code readability and searchability.
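
The padding behaviour behind these tips can be demonstrated with plain structs. In the sketch below, byte arrays stand in for unaligned buffer types and fixed-width integers stand in for aligned ones; both struct names are hypothetical, and the expectation of no padding between byte-array members is an assumption that holds on mainstream compilers rather than a guarantee of the C++ Standard.

```cpp
#include <cstdint>

// Byte-array members have alignment 1, so no padding is expected
// between them or at the end of the struct.
struct unaligned_layout
{
    unsigned char flag[1];
    unsigned char length[4];
    unsigned char code[3];
};

// Natively aligned members: the compiler may insert padding after
// flag and after code, so the size depends on the platform.
struct aligned_layout
{
    std::uint8_t  flag;
    std::uint32_t length;
    std::uint8_t  code;
};

// Guard of the kind recommended in the tip: fail the build, not the data.
static_assert(sizeof(unaligned_layout) == 8, "unexpected padding in unaligned_layout");
static_assert(sizeof(aligned_layout) >= 6, "aligned_layout cannot be smaller than its members");
```

On many 32-bit and 64-bit platforms sizeof(aligned_layout) is 12 even though its members occupy only 6 bytes, which is the kind of surprise the static_assert guard is meant to catch.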

Class template endian_buffer

An endian_buffer is a byte-holder for arithmetic types with user-specified endianness, value type, size, and alignment.

Synopsis

namespace boost
{
  namespace endian
  {
    //  C++11 features emulated if not available

    enum class align { no, yes };

    template <order Order, class T, std::size_t Nbits,
      align Align = align::no>
    class endian_buffer
    {
    public:

      typedef T value_type;

      endian_buffer() noexcept = default;
      explicit endian_buffer(T v) noexcept;

      endian_buffer& operator=(T v) noexcept;
      value_type value() const noexcept;
      unsigned char* data() noexcept;
      unsigned char const* data() const noexcept;

    private:

      unsigned char value_[Nbits / CHAR_BIT]; // exposition only
    };

    //  stream inserter
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_ostream<charT, traits>&
      operator<<(std::basic_ostream<charT, traits>& os,
        const endian_buffer<Order, T, n_bits, Align>& x);

    //  stream extractor
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_istream<charT, traits>&
      operator>>(std::basic_istream<charT, traits>& is,
        endian_buffer<Order, T, n_bits, Align>& x);

    // typedefs

    // unaligned big endian signed integer buffers
    typedef endian_buffer<order::big, int_least8_t, 8>        big_int8_buf_t;
    typedef endian_buffer<order::big, int_least16_t, 16>      big_int16_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 24>      big_int24_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 32>      big_int32_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 40>      big_int40_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 48>      big_int48_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 56>      big_int56_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 64>      big_int64_buf_t;

    // unaligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint_least8_t, 8>       big_uint8_buf_t;
    typedef endian_buffer<order::big, uint_least16_t, 16>     big_uint16_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 24>     big_uint24_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 32>     big_uint32_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 40>     big_uint40_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 48>     big_uint48_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 56>     big_uint56_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 64>     big_uint64_buf_t;

    // unaligned big endian floating point buffers
    typedef endian_buffer<order::big, float, 32>              big_float32_buf_t;
    typedef endian_buffer<order::big, double, 64>             big_float64_buf_t;

    // unaligned little endian signed integer buffers
    typedef endian_buffer<order::little, int_least8_t, 8>     little_int8_buf_t;
    typedef endian_buffer<order::little, int_least16_t, 16>   little_int16_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 24>   little_int24_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 32>   little_int32_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 40>   little_int40_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 48>   little_int48_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 56>   little_int56_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 64>   little_int64_buf_t;

    // unaligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint_least8_t, 8>    little_uint8_buf_t;
    typedef endian_buffer<order::little, uint_least16_t, 16>  little_uint16_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 24>  little_uint24_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 32>  little_uint32_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 40>  little_uint40_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 48>  little_uint48_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 56>  little_uint56_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 64>  little_uint64_buf_t;

    // unaligned little endian floating point buffers
    typedef endian_buffer<order::little, float, 32>           little_float32_buf_t;
    typedef endian_buffer<order::little, double, 64>          little_float64_buf_t;

    // unaligned native endian signed integer types
    typedef implementation-defined_int8_buf_t   native_int8_buf_t;
    typedef implementation-defined_int16_buf_t  native_int16_buf_t;
    typedef implementation-defined_int24_buf_t  native_int24_buf_t;
    typedef implementation-defined_int32_buf_t  native_int32_buf_t;
    typedef implementation-defined_int40_buf_t  native_int40_buf_t;
    typedef implementation-defined_int48_buf_t  native_int48_buf_t;
    typedef implementation-defined_int56_buf_t  native_int56_buf_t;
    typedef implementation-defined_int64_buf_t  native_int64_buf_t;

    // unaligned native endian unsigned integer types
    typedef implementation-defined_uint8_buf_t   native_uint8_buf_t;
    typedef implementation-defined_uint16_buf_t  native_uint16_buf_t;
    typedef implementation-defined_uint24_buf_t  native_uint24_buf_t;
    typedef implementation-defined_uint32_buf_t  native_uint32_buf_t;
    typedef implementation-defined_uint40_buf_t  native_uint40_buf_t;
    typedef implementation-defined_uint48_buf_t  native_uint48_buf_t;
    typedef implementation-defined_uint56_buf_t  native_uint56_buf_t;
    typedef implementation-defined_uint64_buf_t  native_uint64_buf_t;

    // unaligned native endian floating point types
    typedef implementation-defined_float32_buf_t  native_float32_buf_t;
    typedef implementation-defined_float64_buf_t  native_float64_buf_t;

    // aligned big endian signed integer buffers
    typedef endian_buffer<order::big, int8_t, 8, align::yes>       big_int8_buf_at;
    typedef endian_buffer<order::big, int16_t, 16, align::yes>     big_int16_buf_at;
    typedef endian_buffer<order::big, int32_t, 32, align::yes>     big_int32_buf_at;
    typedef endian_buffer<order::big, int64_t, 64, align::yes>     big_int64_buf_at;

    // aligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint8_t, 8, align::yes>      big_uint8_buf_at;
    typedef endian_buffer<order::big, uint16_t, 16, align::yes>    big_uint16_buf_at;
    typedef endian_buffer<order::big, uint32_t, 32, align::yes>    big_uint32_buf_at;
    typedef endian_buffer<order::big, uint64_t, 64, align::yes>    big_uint64_buf_at;

    // aligned big endian floating point buffers
    typedef endian_buffer<order::big, float, 32, align::yes>       big_float32_buf_at;
    typedef endian_buffer<order::big, double, 64, align::yes>      big_float64_buf_at;

    // aligned little endian signed integer buffers
    typedef endian_buffer<order::little, int8_t, 8, align::yes>    little_int8_buf_at;
    typedef endian_buffer<order::little, int16_t, 16, align::yes>  little_int16_buf_at;
    typedef endian_buffer<order::little, int32_t, 32, align::yes>  little_int32_buf_at;
    typedef endian_buffer<order::little, int64_t, 64, align::yes>  little_int64_buf_at;

    // aligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint8_t, 8, align::yes>   little_uint8_buf_at;
    typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at;
    typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at;
    typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at;

    // aligned little endian floating point buffers
    typedef endian_buffer<order::little, float, 32, align::yes>    little_float32_buf_at;
    typedef endian_buffer<order::little, double, 64, align::yes>   little_float64_buf_at;

    // aligned native endian typedefs are not provided because
    // <cstdint> types are superior for this use case

  } // namespace endian
} // namespace boost

The implementation-defined text in typedefs above is either big or little according to the native endianness of the platform.

The expository data member value_ stores the current value of the endian_buffer object as a sequence of bytes ordered as specified by the Order template parameter. The CHAR_BIT macro is defined in <climits>. The only supported value of CHAR_BIT is 8.

The valid values of Nbits are as follows:

  • When sizeof(T) is 1, Nbits shall be 8;

  • When sizeof(T) is 2, Nbits shall be 16;

  • When sizeof(T) is 4, Nbits shall be 24 or 32;

  • When sizeof(T) is 8, Nbits shall be 40, 48, 56, or 64.

Other values of sizeof(T) are not supported.

When Nbits is equal to sizeof(T)*8, T must be a trivially copyable type (such as float) that is assumed to have the same endianness as uintNbits_t.

When Nbits is less than sizeof(T)*8, T must be either a standard integral type (C++std, [basic.fundamental]) or an enum.
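
The size rules above can be restated as a compile-time predicate. This is a hypothetical helper written for illustration (valid_nbits is not part of the library); it encodes only the sizeof(T) and Nbits pairs listed above and does not check the additional requirements on T.

```cpp
#include <cstddef>

// Returns true when nbits is one of the values listed for the given sizeof(T).
constexpr bool valid_nbits(std::size_t size_of_t, std::size_t nbits) noexcept
{
    return size_of_t == 1 ? nbits == 8
         : size_of_t == 2 ? nbits == 16
         : size_of_t == 4 ? (nbits == 24 || nbits == 32)
         : size_of_t == 8 ? (nbits == 40 || nbits == 48 || nbits == 56 || nbits == 64)
         : false;  // other values of sizeof(T) are not supported
}

static_assert(valid_nbits(4, 24), "a 4-byte value type may hold 24 bits");
static_assert(!valid_nbits(4, 40), "a 4-byte value type cannot hold 40 bits");
```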

Members

endian_buffer() noexcept = default;
  • Effects

    Constructs an uninitialized object.

explicit endian_buffer(T v) noexcept;
  • Effects

    endian_store<T, Nbits/8, Order>( value_, v ).

endian_buffer& operator=(T v) noexcept;
  • Effects

    endian_store<T, Nbits/8, Order>( value_, v ).

    Returns

    *this.

value_type value() const noexcept;
  • Returns

    endian_load<T, Nbits/8, Order>( value_ ).

unsigned char* data() noexcept;
unsigned char const* data() const noexcept;
  • Returns

    A pointer to the first byte of value_.
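
To show how these members fit together, here is a deliberately reduced sketch of a byte-holder: big-endian only, unsigned only, and always using a 64-bit value type. The class name big_uint_buffer_sketch is hypothetical, and the code illustrates the member semantics described above rather than reproducing endian_buffer.

```cpp
#include <cstddef>
#include <cstdint>

template <std::size_t Nbits>
class big_uint_buffer_sketch
{
public:
    typedef std::uint64_t value_type;

    big_uint_buffer_sketch() noexcept = default;  // bytes left uninitialized
    explicit big_uint_buffer_sketch(value_type v) noexcept { store(v); }

    big_uint_buffer_sketch& operator=(value_type v) noexcept
    {
        store(v);
        return *this;
    }

    // Reassembles the stored bytes, most significant first.
    value_type value() const noexcept
    {
        value_type v = 0;
        for (std::size_t i = 0; i < Nbits / 8; ++i)
            v = (v << 8) | bytes_[i];
        return v;
    }

    unsigned char* data() noexcept { return bytes_; }
    unsigned char const* data() const noexcept { return bytes_; }

private:
    // Writes the Nbits/8 least significant bytes of v in big-endian order.
    void store(value_type v) noexcept
    {
        for (std::size_t i = 0; i < Nbits / 8; ++i)
            bytes_[Nbits / 8 - 1 - i] =
                static_cast<unsigned char>((v >> (8 * i)) & 0xFFu);
    }

    unsigned char bytes_[Nbits / 8];
};
```

A big_uint_buffer_sketch<24> occupies three bytes on mainstream compilers; after assigning 0x0A0B0C, data() points at the bytes 0A 0B 0C and value() returns 0x0A0B0C.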

Non-member functions

template <class charT, class traits, order Order, class T,
  std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os,
  const endian_buffer<Order, T, n_bits, Align>& x);
  • Returns

    os << x.value().

template <class charT, class traits, order Order, class T,
  std::size_t n_bits, align Align>
std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is,
  endian_buffer<Order, T, n_bits, Align>& x);
  • Effects

    As if:

    T i;
    if (is >> i)
      x = i;
    Returns

    is.
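
One consequence of the "as if" wording is that a failed extraction leaves the target unchanged. The sketch below reproduces that pattern with a hypothetical int_holder type and a hypothetical extract_into function; neither is part of the library.

```cpp
#include <istream>
#include <sstream>

// Hypothetical stand-in for a buffer type: assignable from its value type.
struct int_holder
{
    typedef int value_type;
    int stored;
    int_holder& operator=(int v) noexcept { stored = v; return *this; }
};

// Follows the "as if" pattern: extract into a temporary of the value type
// and assign to x only when the extraction succeeded.
template <class Holder>
std::istream& extract_into(std::istream& is, Holder& x)
{
    typename Holder::value_type i;
    if (is >> i)
        x = i;
    return is;
}
```

Extracting from the text "42" assigns 42; extracting from "xyz" sets the stream's failbit and leaves the previous value in place.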

FAQ

See the Overview FAQ for a library-wide FAQ.

Why not just use Boost.Serialization?

Serialization involves a conversion for every object involved in I/O. Endian integers require no conversion or copying. They are already in the desired format for binary I/O. Thus they can be read or written in bulk.

Are endian types PODs?

Yes for C++11. No for C++03, although several macros are available to force PODness in all cases.

What are the implications of endian integer types not being PODs with C++03 compilers?

They can’t be used in unions. Also, compilers aren’t required to align or lay out storage in portable ways, although this potential problem hasn’t prevented use of Boost.Endian with real compilers.

What good is native endianness?

It provides alignment and size guarantees not available from the built-in types. It eases generic programming.

Why bother with the aligned endian types?

Aligned integer operations may be faster (as much as 10 to 20 times faster) if the endianness and alignment of the type matches the endianness and alignment requirements of the machine. The code, however, is likely to be somewhat less portable than with the unaligned types.

Design considerations for Boost.Endian buffers

  • Must be suitable for I/O - in other words, must be memcpyable.

  • Must provide exactly the size and internal byte ordering specified.

  • Must work correctly when the internal integer representation has more bits than the sum of the bits in the external byte representation. Sign extension must work correctly when the internal integer representation type has more bits than the sum of the bits in the external bytes. For example, using a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for both positive and negative values.

  • Must work correctly (including using the same defined external representation) regardless of whether a compiler treats char as signed or unsigned.

  • Unaligned types must not cause compilers to insert padding bytes.

  • The implementation should supply optimizations with great care. Experience has shown that optimizations of endian integers often become pessimizations when changing machines or compilers. Pessimizations can also happen when changing compiler switches, compiler versions, or CPU models of the same architecture.

C++11

The availability of the C++11 Defaulted Functions feature is detected automatically, and will be used if present to ensure that objects of class endian_buffer are trivial, and thus PODs.

Compilation

Boost.Endian is implemented entirely within headers, with no need to link to any Boost object libraries.

Several macros allow user control over features:

  • BOOST_ENDIAN_NO_CTORS causes class endian_buffer to have no constructors. The intended use is for compiling user code that must be portable between compilers regardless of C++11 Defaulted Functions support. Use of constructors will always fail.

  • BOOST_ENDIAN_FORCE_PODNESS causes BOOST_ENDIAN_NO_CTORS to be defined if the compiler does not support C++11 Defaulted Functions. This ensures that objects of class endian_buffer are PODs, and so can be used in C++03 unions. In C++11, class endian_buffer objects are PODs, even though they have constructors, so can always be used in unions.

Endian Arithmetic Types

Introduction

Header boost/endian/arithmetic.hpp provides integer binary types with control over byte order, value type, size, and alignment. Typedefs provide easy-to-use names for common configurations.

These types provide portable byte-holders for integer data, independent of particular computer architectures. Use cases almost always involve I/O, either via files or network connections. Although data portability is the primary motivation, these integer byte-holders may also be used to reduce memory use, file size, or network activity since they provide binary integer sizes not otherwise available.

Such integer byte-holder types are traditionally called endian types. See Wikipedia for a full exploration of endianness, including definitions of big endian and little endian.

Boost endian integers provide the same full set of C++ assignment, arithmetic, and relational operators as C++ standard integral types, with the standard semantics.

Unary arithmetic operators are +, -, ~, !, plus both prefix and postfix -- and ++. Binary arithmetic operators are +, +=, -, -=, *, *=, /, /=, %, %=, &, &=, |, |=, ^, ^=, <<, <<=, >>, and >>=. Binary relational operators are ==, !=, <, <=, >, and >=.

Implicit conversion to the underlying value type is provided. An implicit constructor converting from the underlying value type is provided.
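
The combination of an implicit converting constructor and an implicit conversion operator is what lets such a type take part in ordinary integer expressions. The following reduced sketch shows the mechanism for a single 16-bit big-endian type; big_uint16_sketch is a hypothetical class, not the library's endian_arithmetic, and it implements only a few of the operators listed above.

```cpp
#include <cstdint>

class big_uint16_sketch
{
public:
    big_uint16_sketch() noexcept = default;

    // Implicit constructor from the value type.
    big_uint16_sketch(std::uint16_t v) noexcept { store(v); }

    // Implicit conversion to the value type; built-in operators such as
    // + and == then apply to the converted value.
    operator std::uint16_t() const noexcept
    {
        return static_cast<std::uint16_t>((bytes_[0] << 8) | bytes_[1]);
    }

    // Compound assignment: load, compute natively, store back big-endian.
    big_uint16_sketch& operator+=(std::uint16_t y) noexcept
    {
        store(static_cast<std::uint16_t>(*this + y));
        return *this;
    }

    big_uint16_sketch& operator++() noexcept { return *this += 1; }

    unsigned char const* data() const noexcept { return bytes_; }

private:
    void store(std::uint16_t v) noexcept
    {
        bytes_[0] = static_cast<unsigned char>((v >> 8) & 0xFFu);
        bytes_[1] = static_cast<unsigned char>(v & 0xFFu);
    }

    unsigned char bytes_[2];  // always most significant byte first
};
```

With this in place, big_uint16_sketch x = 0x0102; x += 1; leaves the bytes 01 03 in memory, and x + 1 evaluates as ordinary int arithmetic on the converted value.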

Example

The endian_example.cpp program writes a binary file containing four-byte, big-endian and little-endian integers:

#include <iostream>
#include <cstdio>
#include <boost/endian/arithmetic.hpp>
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  //  This is an extract from a very widely used GIS file format.
  //  Why the designer decided to mix big and little endians in
  //  the same file is not known. But this is a real-world format
  //  and users wishing to write low level code manipulating these
  //  files have to deal with the mixed endianness.

  struct header
  {
    big_int32_t     file_code;
    big_int32_t     file_length;
    little_int32_t  version;
    little_int32_t  shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

  h.file_code   = 0x01020304;
  h.file_length = sizeof(header);
  h.version     = 1;
  h.shape_type  = 0x01020304;

  //  Low-level I/O such as POSIX read/write or <cstdio>
  //  fread/fwrite is sometimes used for binary file operations
  //  when ultimate efficiency is important. Such I/O is often
  //  performed in some C++ wrapper class, but to drive home the
  //  point that endian integers are often used in fairly
  //  low-level code that does bulk I/O operations, <cstdio>
  //  fopen/fwrite is used for I/O in this example.

  std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}

After compiling and executing endian_example.cpp, a hex dump of test.dat shows:

01020304 00000010 01000000 04030201

Notice that the first two 32-bit integers are big endian while the second two are little endian, even though the machine this was compiled and run on was little endian.

Limitations

Requires <climits>, CHAR_BIT == 8. If CHAR_BIT is some other value, compilation will result in an #error. This restriction is in place because the design, implementation, testing, and documentation have only considered issues related to 8-bit bytes, and there have been no real-world use cases presented for other sizes.

In C++03, endian_arithmetic does not meet the requirements for POD types because it has constructors, private data members, and a base class. This means that common use cases are relying on unspecified behavior in that the C++ Standard does not guarantee memory layout for non-POD types. This has not been a problem in practice since all known C++ compilers lay out memory as if endian_arithmetic were a POD type. In C++11, it is possible to specify the default constructor as trivial, and private data members and base classes no longer disqualify a type from being a POD type. Thus under C++11, endian_arithmetic will no longer be relying on unspecified behavior.

Feature set

  • Big endian | little endian | native endian byte ordering

  • Signed | unsigned

  • Unaligned | aligned

  • 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)

  • Choice of value type

Enums and typedefs

Two scoped enums are provided:

enum class order { big, little, native };

enum class align { no, yes };

One class template is provided:

template <order Order, typename T, std::size_t n_bits,
  align Align = align::no>
class endian_arithmetic;

Typedefs, such as big_int32_t, provide convenient naming conventions for common use cases:

Name             Alignment  Endianness  Sign      Sizes in bits (N)
big_intN_t       no         big         signed    8,16,24,32,40,48,56,64
big_uintN_t      no         big         unsigned  8,16,24,32,40,48,56,64
little_intN_t    no         little      signed    8,16,24,32,40,48,56,64
little_uintN_t   no         little      unsigned  8,16,24,32,40,48,56,64
native_intN_t    no         native      signed    8,16,24,32,40,48,56,64
native_uintN_t   no         native      unsigned  8,16,24,32,40,48,56,64
big_intN_at      yes        big         signed    8,16,32,64
big_uintN_at     yes        big         unsigned  8,16,32,64
little_intN_at   yes        little      signed    8,16,32,64
little_uintN_at  yes        little      unsigned  8,16,32,64

The unaligned types do not cause compilers to insert padding bytes in classes and structs. This is an important characteristic that can be exploited to minimize wasted space in memory, files, and network transmissions.

Caution
Code that uses aligned types is possibly non-portable because alignment requirements vary between hardware architectures and because alignment may be affected by compiler switches or pragmas. For example, alignment of a 64-bit integer may be to a 32-bit boundary on a 32-bit machine. Furthermore, aligned types are only available on architectures with 8, 16, 32, and 64-bit integer types.
Tip
Prefer unaligned arithmetic types.
Tip
Protect yourself against alignment ills. For example:
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
Note
One-byte arithmetic types have identical layout on all platforms, so they never actually reverse endianness. They are provided to enable generic code, and to improve code readability and searchability.

Class template endian_arithmetic

An endian_arithmetic is an integer byte-holder with user-specified endianness, value type, size, and alignment. The usual operations on arithmetic types are supplied.

Synopsis

#include <boost/endian/buffers.hpp>

namespace boost
{
  namespace endian
  {
    //  C++11 features emulated if not available

    enum class align { no, yes };

    template <order Order, class T, std::size_t n_bits,
      align Align = align::no>
    class endian_arithmetic
      : public endian_buffer<Order, T, n_bits, Align>
    {
    public:

      typedef T value_type;

      // if BOOST_ENDIAN_FORCE_PODNESS is defined && C++11 PODs are not
      // available then these two constructors will not be present
      endian_arithmetic() noexcept = default;
      endian_arithmetic(T v) noexcept;

      endian_arithmetic& operator=(T v) noexcept;
      operator value_type() const noexcept;
      value_type value() const noexcept; // for exposition; see endian_buffer
      unsigned char* data() noexcept; // for exposition; see endian_buffer
      unsigned char const* data() const noexcept; // for exposition; see endian_buffer

      // arithmetic operations
      //   note that additional operations are provided by the value_type
      value_type operator+() const noexcept;
      endian_arithmetic& operator+=(value_type y) noexcept;
      endian_arithmetic& operator-=(value_type y) noexcept;
      endian_arithmetic& operator*=(value_type y) noexcept;
      endian_arithmetic& operator/=(value_type y) noexcept;
      endian_arithmetic& operator%=(value_type y) noexcept;
      endian_arithmetic& operator&=(value_type y) noexcept;
      endian_arithmetic& operator|=(value_type y) noexcept;
      endian_arithmetic& operator^=(value_type y) noexcept;
      endian_arithmetic& operator<<=(value_type y) noexcept;
      endian_arithmetic& operator>>=(value_type y) noexcept;
      endian_arithmetic& operator++() noexcept;
      endian_arithmetic& operator--() noexcept;
      endian_arithmetic operator++(int) noexcept;
      endian_arithmetic operator--(int) noexcept;

      // Stream inserter
      template <class charT, class traits>
      friend std::basic_ostream<charT, traits>&
        operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);

      // Stream extractor
      template <class charT, class traits>
      friend std::basic_istream<charT, traits>&
        operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
    };

    // typedefs

    // unaligned big endian signed integer types
    typedef endian_arithmetic<order::big, int_least8_t, 8>        big_int8_t;
    typedef endian_arithmetic<order::big, int_least16_t, 16>      big_int16_t;
    typedef endian_arithmetic<order::big, int_least32_t, 24>      big_int24_t;
    typedef endian_arithmetic<order::big, int_least32_t, 32>      big_int32_t;
    typedef endian_arithmetic<order::big, int_least64_t, 40>      big_int40_t;
    typedef endian_arithmetic<order::big, int_least64_t, 48>      big_int48_t;
    typedef endian_arithmetic<order::big, int_least64_t, 56>      big_int56_t;
    typedef endian_arithmetic<order::big, int_least64_t, 64>      big_int64_t;

    // unaligned big endian unsigned integer types
    typedef endian_arithmetic<order::big, uint_least8_t, 8>       big_uint8_t;
    typedef endian_arithmetic<order::big, uint_least16_t, 16>     big_uint16_t;
    typedef endian_arithmetic<order::big, uint_least32_t, 24>     big_uint24_t;
    typedef endian_arithmetic<order::big, uint_least32_t, 32>     big_uint32_t;
    typedef endian_arithmetic<order::big, uint_least64_t, 40>     big_uint40_t;
    typedef endian_arithmetic<order::big, uint_least64_t, 48>     big_uint48_t;
    typedef endian_arithmetic<order::big, uint_least64_t, 56>     big_uint56_t;
    typedef endian_arithmetic<order::big, uint_least64_t, 64>     big_uint64_t;

    // unaligned big endian floating point types
    typedef endian_arithmetic<order::big, float, 32>              big_float32_t;
    typedef endian_arithmetic<order::big, double, 64>             big_float64_t;

    // unaligned little endian signed integer types
    typedef endian_arithmetic<order::little, int_least8_t, 8>     little_int8_t;
    typedef endian_arithmetic<order::little, int_least16_t, 16>   little_int16_t;
    typedef endian_arithmetic<order::little, int_least32_t, 24>   little_int24_t;
    typedef endian_arithmetic<order::little, int_least32_t, 32>   little_int32_t;
    typedef endian_arithmetic<order::little, int_least64_t, 40>   little_int40_t;
    typedef endian_arithmetic<order::little, int_least64_t, 48>   little_int48_t;
    typedef endian_arithmetic<order::little, int_least64_t, 56>   little_int56_t;
    typedef endian_arithmetic<order::little, int_least64_t, 64>   little_int64_t;

    // unaligned little endian unsigned integer types
    typedef endian_arithmetic<order::little, uint_least8_t, 8>    little_uint8_t;
    typedef endian_arithmetic<order::little, uint_least16_t, 16>  little_uint16_t;
    typedef endian_arithmetic<order::little, uint_least32_t, 24>  little_uint24_t;
    typedef endian_arithmetic<order::little, uint_least32_t, 32>  little_uint32_t;
    typedef endian_arithmetic<order::little, uint_least64_t, 40>  little_uint40_t;
    typedef endian_arithmetic<order::little, uint_least64_t, 48>  little_uint48_t;
    typedef endian_arithmetic<order::little, uint_least64_t, 56>  little_uint56_t;
    typedef endian_arithmetic<order::little, uint_least64_t, 64>  little_uint64_t;

    // unaligned little endian floating point types
    typedef endian_arithmetic<order::little, float, 32>           little_float32_t;
    typedef endian_arithmetic<order::little, double, 64>          little_float64_t;

    // unaligned native endian signed integer types
    typedef implementation-defined_int8_t   native_int8_t;
    typedef implementation-defined_int16_t  native_int16_t;
    typedef implementation-defined_int24_t  native_int24_t;
    typedef implementation-defined_int32_t  native_int32_t;
    typedef implementation-defined_int40_t  native_int40_t;
    typedef implementation-defined_int48_t  native_int48_t;
    typedef implementation-defined_int56_t  native_int56_t;
    typedef implementation-defined_int64_t  native_int64_t;

    // unaligned native endian unsigned integer types
    typedef implementation-defined_uint8_t   native_uint8_t;
    typedef implementation-defined_uint16_t  native_uint16_t;
    typedef implementation-defined_uint24_t  native_uint24_t;
    typedef implementation-defined_uint32_t  native_uint32_t;
    typedef implementation-defined_uint40_t  native_uint40_t;
    typedef implementation-defined_uint48_t  native_uint48_t;
    typedef implementation-defined_uint56_t  native_uint56_t;
    typedef implementation-defined_uint64_t  native_uint64_t;

    // unaligned native endian floating point types
    typedef implementation-defined_float32_t  native_float32_t;
    typedef implementation-defined_float64_t  native_float64_t;

    // aligned big endian signed integer types
    typedef endian_arithmetic<order::big, int8_t, 8, align::yes>       big_int8_at;
    typedef endian_arithmetic<order::big, int16_t, 16, align::yes>     big_int16_at;
    typedef endian_arithmetic<order::big, int32_t, 32, align::yes>     big_int32_at;
    typedef endian_arithmetic<order::big, int64_t, 64, align::yes>     big_int64_at;

    // aligned big endian unsigned integer types
    typedef endian_arithmetic<order::big, uint8_t, 8, align::yes>      big_uint8_at;
    typedef endian_arithmetic<order::big, uint16_t, 16, align::yes>    big_uint16_at;
    typedef endian_arithmetic<order::big, uint32_t, 32, align::yes>    big_uint32_at;
    typedef endian_arithmetic<order::big, uint64_t, 64, align::yes>    big_uint64_at;

    // aligned big endian floating point types
    typedef endian_arithmetic<order::big, float, 32, align::yes>       big_float32_at;
    typedef endian_arithmetic<order::big, double, 64, align::yes>      big_float64_at;

    // aligned little endian signed integer types
    typedef endian_arithmetic<order::little, int8_t, 8, align::yes>    little_int8_at;
    typedef endian_arithmetic<order::little, int16_t, 16, align::yes>  little_int16_at;
    typedef endian_arithmetic<order::little, int32_t, 32, align::yes>  little_int32_at;
    typedef endian_arithmetic<order::little, int64_t, 64, align::yes>  little_int64_at;

    // aligned little endian unsigned integer types
    typedef endian_arithmetic<order::little, uint8_t, 8, align::yes>   little_uint8_at;
    typedef endian_arithmetic<order::little, uint16_t, 16, align::yes> little_uint16_at;
    typedef endian_arithmetic<order::little, uint32_t, 32, align::yes> little_uint32_at;
    typedef endian_arithmetic<order::little, uint64_t, 64, align::yes> little_uint64_at;

    // aligned little endian floating point types
    typedef endian_arithmetic<order::little, float, 32, align::yes>    little_float32_at;
    typedef endian_arithmetic<order::little, double, 64, align::yes>   little_float64_at;

    // aligned native endian typedefs are not provided because
    // <cstdint> types are superior for that use case

  } // namespace endian
} // namespace boost

The implementation-defined text above is either big or little according to the endianness of the platform.

The only supported value of CHAR_BIT is 8.

The valid values of Nbits (the n_bits template parameter in the synopsis) are as follows:

  • When sizeof(T) is 1, Nbits shall be 8;

  • When sizeof(T) is 2, Nbits shall be 16;

  • When sizeof(T) is 4, Nbits shall be 24 or 32;

  • When sizeof(T) is 8, Nbits shall be 40, 48, 56, or 64.

Other values of sizeof(T) are not supported.

When Nbits is equal to sizeof(T)*8, T must be a standard arithmetic type.

When Nbits is less than sizeof(T)*8, T must be a standard integral type (C++std, [basic.fundamental]) that is not bool.

Members

endian_arithmetic() noexcept = default;  // C++03: endian_arithmetic(){}
  • Effects

    Constructs an uninitialized object.

endian_arithmetic(T v) noexcept;
  • Effects

    See endian_buffer::endian_buffer(T).

endian_arithmetic& operator=(T v) noexcept;
  • Effects

    See endian_buffer::operator=(T).

    Returns

    *this.

operator T() const noexcept;
  • Returns

    value().

Other operators

Other operators on endian objects are forwarded to the equivalent operator on value_type.

Stream inserter

template <class charT, class traits>
friend std::basic_ostream<charT, traits>&
  operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);
  • Returns

    os << +x.

Stream extractor

template <class charT, class traits>
friend std::basic_istream<charT, traits>&
  operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
  • Effects

    As if:

    T i;
    if (is >> i)
      x = i;
    Returns

    is.

FAQ

See the Overview FAQ for a library-wide FAQ.

Why not just use Boost.Serialization?

Serialization involves a conversion for every object involved in I/O. Endian integers require no conversion or copying. They are already in the desired format for binary I/O. Thus they can be read or written in bulk.

Are endian types PODs?

Yes for C++11. No for C++03, although several macros are available to force PODness in all cases.

What are the implications of endian integer types not being PODs with C++03 compilers?

They can’t be used in unions. Also, compilers aren’t required to align or lay out storage in portable ways, although this potential problem hasn’t prevented use of Boost.Endian with real compilers.

What good is native endianness?

It provides alignment and size guarantees not available from the built-in types. It eases generic programming.

Why bother with the aligned endian types?

Aligned integer operations may be faster (as much as 10 to 20 times faster) if the endianness and alignment of the type matches the endianness and alignment requirements of the machine. The code, however, will be somewhat less portable than with the unaligned types.

Why provide the arithmetic operations?

Providing a full set of operations reduces program clutter and makes code both easier to write and to read. Consider incrementing a variable in a record. It is very convenient to write:

++record.foo;

Rather than:

int temp(record.foo);
++temp;
record.foo = temp;

Design considerations for Boost.Endian types

  • Must be suitable for I/O - in other words, must be memcpyable.

  • Must provide exactly the size and internal byte ordering specified.

  • Must work correctly when the internal integer representation has more bits than the sum of the bits in the external byte representation. In particular, sign extension must work correctly in that case. For example, using a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for both positive and negative values.

  • Must work correctly (including using the same defined external representation) regardless of whether a compiler treats char as signed or unsigned.

  • Unaligned types must not cause compilers to insert padding bytes.

  • The implementation should supply optimizations with great care. Experience has shown that optimizations of endian integers often become pessimizations when changing machines or compilers. Pessimizations can also happen when changing compiler switches, compiler versions, or CPU models of the same architecture.
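The sign extension requirement can be sketched in standard C++ alone. The function below is an illustration of the requirement, not the library's implementation, and the name load_big_int40 is hypothetical. It works on unsigned char so the result does not depend on whether plain char is signed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: loads a 40-bit big endian two's-complement value
// from 5 bytes into a 64-bit integer.
inline std::int64_t load_big_int40(const unsigned char* p)
{
  std::uint64_t u = 0;
  for (int i = 0; i < 5; ++i)
    u = (u << 8) | p[i];                   // most significant byte first

  if (u & (std::uint64_t(1) << 39))        // sign bit of the 40-bit value set?
    u |= ~((std::uint64_t(1) << 40) - 1);  // fill the upper 24 bits with ones

  // Conversion to signed is modular in C++20 and behaves the same way on
  // two's-complement implementations before that.
  return static_cast<std::int64_t>(u);
}
```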

Experience

Classes with similar functionality have been independently developed by several Boost programmers and used very successfully in high-value, high-use applications for many years. These independently developed endian libraries often evolved from C libraries that were also widely used. Endian types have proven useful across a wide range of computer architectures and applications.

Motivating use cases

Neil Mayhew writes: "I can also provide a meaningful use-case for this library: reading TrueType font files from disk and processing the contents. The data format has fixed endianness (big) and has unaligned values in various places. Using Boost.Endian simplifies and cleans the code wonderfully."

C++11

The availability of the C++11 Defaulted Functions feature is detected automatically, and will be used if present to ensure that objects of class endian_arithmetic are trivial, and thus PODs.

Compilation

Boost.Endian is implemented entirely within headers, with no need to link to any Boost object libraries.

Several macros allow user control over features:

  • BOOST_ENDIAN_NO_CTORS causes class endian_arithmetic to have no constructors. The intended use is for compiling user code that must be portable between compilers regardless of C++11 Defaulted Functions support. Use of constructors will always fail.

  • BOOST_ENDIAN_FORCE_PODNESS causes BOOST_ENDIAN_NO_CTORS to be defined if the compiler does not support C++11 Defaulted Functions. This ensures that objects of class endian_arithmetic are PODs, and so can be used in C++03 unions. In C++11, class endian_arithmetic objects are PODs, even though they have constructors, so they can always be used in unions.

Acknowledgements

Original design developed by Darin Adler based on classes developed by Mark Borgerding. Four original class templates combined into a single endian_arithmetic class template by Beman Dawes, who put the library together, provided documentation, added the typedefs, and also added the unrolled_byte_loops sign partial specialization to correctly extend the sign when cover integer size differs from endian representation size.

Choosing Approach

Introduction

Deciding which is the best endianness approach (conversion functions, buffer types, or arithmetic types) for a particular application involves complex engineering trade-offs. It is hard to assess those trade-offs without some understanding of the different interfaces, so you might want to read the conversion functions, buffer types, and arithmetic types pages before diving into this page.

Choosing between conversion functions, buffer types, and arithmetic types

The best approach to endianness for a particular application depends on the interaction between the application’s needs and the characteristics of each of the three approaches.

Recommendation: If you are new to endianness, uncertain, or don’t want to invest the time to study engineering trade-offs, use endian arithmetic types. They are safe, easy to use, and easy to maintain. If needed, apply the "convert in anticipation of need" design pattern locally around performance hot spots such as lengthy loops.

Background

Dealing with endianness usually implies a program portability or a data portability requirement, and often both. That means real programs dealing with endianness are usually complex, so the examples shown here would really be written as multiple functions spread across multiple translation units. They would involve interfaces that cannot be altered because they are supplied by third parties or the standard library.

Characteristics

The characteristics that differentiate the three approaches to endianness are the endianness invariants, conversion explicitness, arithmetic operations, sizes available, and alignment requirements.

Endianness invariants

Endian conversion functions use objects of the ordinary C++ arithmetic types like int or unsigned short to hold values. That breaks the implicit invariant that the C++ language rules apply. The usual language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can make it very hard to reason about logic flow, and can result in hard-to-find bugs.

For example:

struct data_t  // big endian
{
  int32_t   v1;  // description ...
  int32_t   v2;  // description ...
  ... additional character data members (i.e. non-endian)
  int32_t   v3;  // description ...
};

data_t data;

read(data);
big_to_native_inplace(data.v1);
big_to_native_inplace(data.v2);

...

++data.v1;
third_party::func(data.v2);

...

native_to_big_inplace(data.v1);
native_to_big_inplace(data.v2);
write(data);

The programmer didn’t bother to convert data.v3 to native endianness because that member isn’t used. A later maintainer needs to pass data.v3 to the third-party function, so adds third_party::func(data.v3); somewhere deep in the code. This causes a silent failure because the usual invariant that an object of type int32_t holds a value as described by the C++ core language does not apply.

Endian buffer and arithmetic types hold values internally as arrays of characters with an invariant that the endianness of the array never changes. That makes these types easier to use and programs easier to maintain.

Here is the same example, using an endian arithmetic type:

struct data_t
{
  big_int32_t   v1;  // description ...
  big_int32_t   v2;  // description ...
  ... additional character data members (i.e. non-endian)
  big_int32_t   v3;  // description ...
};

data_t data;

read(data);

...

++data.v1;
third_party::func(data.v2);

...

write(data);

A later maintainer can add third_party::func(data.v3) and it will just work.

Conversion explicitness

Endian conversion functions and buffer types never perform implicit conversions. This gives users explicit control of when conversion occurs, and may help avoid unnecessary conversions.

Endian arithmetic types perform conversion implicitly. That makes these types very easy to use, but can result in unnecessary conversions. Failure to hoist conversions out of inner loops can bring a performance penalty.

Arithmetic operations

Endian conversion functions do not supply arithmetic operations, but this is not a concern since this approach uses ordinary C++ arithmetic types to hold values.

Endian buffer types do not supply arithmetic operations. Although this approach avoids unnecessary conversions, it can result in the introduction of additional variables and confuse maintenance programmers.

Endian arithmetic types do supply arithmetic operations. They are very easy to use if lots of arithmetic is involved.

Sizes

Endianness conversion functions only support 1, 2, 4, and 8 byte integers. That’s sufficient for many applications.

Endian buffer and arithmetic types support 1, 2, 3, 4, 5, 6, 7, and 8 byte integers. For an application where memory use or I/O speed is the limiting factor, using sizes tailored to application needs can be useful.

Alignments

Endianness conversion functions only support aligned integer and floating-point types. That’s sufficient for most applications.

Endian buffer and arithmetic types support both aligned and unaligned integer and floating-point types. Unaligned types are rarely needed, but when needed they are often very useful and workarounds are painful. For example:

Non-portable code like this:

struct S {
  uint16_t a; // big endian
  uint32_t b; // big endian
} __attribute__ ((packed));

Can be replaced with portable code like this:

struct S {
  big_uint16_t a;
  big_uint32_t b;
};

Design patterns

Applications often traffic in endian data as records or packets containing multiple endian data elements. For simplicity, we will just call them records.

If desired endianness differs from native endianness, a conversion has to be performed. When should that conversion occur? Three design patterns have evolved.

Convert only as needed (i.e. lazy)

This pattern defers conversion to the point in the code where the data element is actually used.

This pattern is appropriate when which endian element is actually used varies greatly according to record content or other circumstances.

Convert in anticipation of need

This pattern performs conversion to native endianness in anticipation of use, such as immediately after reading records. If needed, conversion to the output endianness is performed after all possible needs have passed, such as just before writing records.

One implementation of this pattern is to create a proxy record with endianness converted to native in a read function, and expose only that proxy to the rest of the implementation. A write function, if needed, handles the conversion from native to the desired output endianness.

This pattern is appropriate when all endian elements in a record are typically used regardless of record content or other circumstances.

Convert only as needed, except locally in anticipation of need

This pattern in general defers conversion but for specific local needs does anticipatory conversion. Although particularly appropriate when coupled with the endian buffer or arithmetic types, it also works well with the conversion functions.

Example:

struct data_t
{
  big_int32_t   v1;
  big_int32_t   v2;
  big_int32_t   v3;
};

data_t data;

read(data);

...
++data.v1;
...

int32_t v3_temp = data.v3;  // hoist conversion out of loop

for (int32_t i = 0; i < large-number; ++i)
{
  ... lengthy computation that accesses v3_temp ...
}
data.v3 = v3_temp;

write(data);

In general the above pseudo-code leaves conversion up to the endian arithmetic type big_int32_t. But to avoid conversion inside the loop, a temporary is created before the loop is entered, and then used to set the new value of data.v3 after the loop is complete.

Question: Won’t the compiler’s optimizer hoist the conversion out of the loop anyhow?

Answer: VC++ 2015 Preview, and probably others, does not, even for a toy test program. Although the savings is small (two register bswap instructions), the cost might be significant if the loop is repeated enough times. On the other hand, the program may be so dominated by I/O time that even a lengthy loop will be immaterial.

Use case examples

Porting endian unaware codebase

An existing codebase runs on big endian systems. It does not currently deal with endianness. The codebase needs to be modified so it can run on little endian systems under various operating systems. To ease the transition and protect the value of existing files, external data will continue to be maintained as big endian.

The endian arithmetic approach is recommended to meet these needs. A relatively small number of header files dealing with binary I/O layouts need to change types. For example, short or int16_t would change to big_int16_t. No changes are required for .cpp files.

Porting endian aware codebase

An existing codebase runs on little-endian Linux systems. It already deals with endianness via Linux-provided functions. Because of a business merger, the codebase has to be quickly modified for Windows and possibly other operating systems, while still supporting Linux. The codebase is reliable and the programmers are all well aware of endian issues.

These factors all argue for an endian conversion approach that just mechanically changes the calls to htobe32, etc. to boost::endian::native_to_big, etc. and replaces <endian.h> with <boost/endian/conversion.hpp>.

Reliability and arithmetic-speed

A new, complex, multi-threaded application is to be developed that must run on little endian machines, but do big endian network I/O. The developers believe computational speed for endian variables is critical but have seen numerous bugs result from inability to reason about endian conversion state. They are also worried that future maintenance changes could inadvertently introduce a lot of slow conversions if full-blown endian arithmetic types are used.

The endian buffers approach is made-to-order for this use case.

Reliability and ease-of-use

A new, complex, multi-threaded application is to be developed that must run on little endian machines, but do big endian network I/O. The developers believe computational speed for endian variables is not critical but have seen numerous bugs result from inability to reason about endian conversion state. They are also concerned about ease-of-use both during development and long-term maintenance.

Removing concern about conversion speed and adding concern about ease-of-use tips the balance strongly in favor of the endian arithmetic approach.

Appendix A: Endian Mini-Review

The results of the Boost.Endian formal review included a list of issues to be resolved before a mini-review.

The issues are shown in bold below, with the resolution indicated.

Common use case scenarios should be developed.

Done. The documentation has been refactored. A page is now devoted to Choosing the Approach to endianness. See Use cases for use case scenarios.

Example programs should be developed for the common use case scenarios.

Done. See Choosing the Approach. Example code has been added throughout.

Documentation should illuminate the differences between endian integer/float type and endian conversion approaches to the common use case scenarios, and provide guidelines for choosing the most appropriate approach in user’s applications.

Done. See Choosing the Approach.

Conversion functions supplying results via return should be provided.

Done. See Conversion Functions.

Platform specific performance enhancements such as use of compiler intrinsics or relaxed alignment requirements should be supported.

Done. Compiler (Clang, GCC, VisualC++, etc.) intrinsics and built-in functions are used in the implementation where appropriate, as requested. See Built-in support for Intrinsics. See Timings for Example 2 to gauge the impact of intrinsics.

Endian integer (and floating) types should be implemented via the conversion functions. If that can’t be done efficiently, consideration should be given to expanding the conversion function signatures to resolve the inefficiencies.

Done. For the endian types, the implementation uses the endian conversion functions, and thus the intrinsics, as requested.

Benchmarks that measure performance should be provided. It should be possible to compare platform specific performance enhancements against portable base implementations, and to compare endian integer approaches against endian conversion approaches for the common use case scenarios.

Done. See Timings for Example 2. The endian/test directory also contains several additional benchmark and speed test programs.

Float (32-bits) and double (64-bits) should be supported. IEEE 754 is the primary use case.

Done. The endian buffer types, endian arithmetic types and endian conversion functions now support 32-bit (float) and 64-bit (double) floating point, as requested.

Support for user defined types (UDTs) is desirable, and should be provided where there would be no conflict with the other concerns.

Done. See Customization points for user-defined types (UDTs).

There is some concern that endian integer/float arithmetic operations might be used inadvertently or inappropriately. The impact of adding an endian_buffer class without arithmetic operations should be investigated.

Done. The endian types have been decomposed into class template endian_buffer and class template endian_arithmetic. Class endian_buffer is a public base class for endian_arithmetic, and can also be used by users as a stand-alone class.

Stream insertion and extraction of the endian integer/float types should be documented and included in the test coverage.

Done. See Stream inserter and Stream extractor.

Binary I/O support that was investigated during development of the Endian library should be put up for mini-review for inclusion in the Boost I/O library.

Not done yet. Will be handled as a separate mini-review soon after the Endian mini-review.

Other requested changes.

In addition to the named-endianness conversion functions, functions that perform compile-time (via template) and run-time (via function argument) dispatch are now provided. order::native is now a synonym for order::big or order::little according to the endianness of the platform. This reduces the number of template specializations required. Headers have been reorganized to make them easier to read, with a synopsis at the front and implementation following.

This documentation is

  • Copyright 2011-2016 Beman Dawes

  • Copyright 2019 Peter Dimov

and is distributed under the Boost Software License, Version 1.0.