The implementations are implementations of well-known data structures. The queue is based on Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms by Michael Scott and Maged Michael, the stack is based on Systems programming: coping with parallelism by R. K. Treiber and the spsc_queue is considered as 'folklore' and is implemented in several open-source projects including the linux kernel. All data structures are discussed in detail in "The Art of Multiprocessor Programming" by Herlihy & Shavit.
classes are node-based data structures, based on a linked list. Memory management
of lock-free data structures is a non-trivial problem, because we need to
avoid that one thread frees an internal node, while another thread still
boost.lockfree uses a simple approach not returning
any memory to the operating system. Instead they maintain a free-list
in order to reuse them later. This is done for two reasons: first, depending
on the implementation of the memory allocator freeing the memory may block
(so the implementation would not be lock-free anymore), and second, most
memory reclamation algorithms are patented.
The ABA problem is a common problem when implementing lock-free data structures.
The problem occurs when updating an atomic variable using a
operation: if the value A was read, thread 1 changes it to say C and tries
to update the variable, it uses
compare_exchange to write
C, only if the current value is A. This might be a problem if in the meanwhile
thread 2 changes the value from A to B and back to A, because thread 1 does
not observe the change of the state. The common way to avoid the ABA problem
is to associate a version counter with the value and change both atomically.
boost.lockfree uses a
class which associates a pointer with an integer tag. This usually requires
compare_exchange, which is not available
on all platforms. IA32 did not provide the
before the pentium processor and it is also lacking on many RISC architectures
like PPC. Early X86-64 processors also did not provide a
instruction. On 64bit platforms one can work around this issue, because often
not the full 64bit address space is used. On X86_64 for example, only 48bit
are used for the address, so we can use the remaining 16bit for the ABA prevention
tag. For details please consult the implementation of the
For lock-free operations on 32bit platforms without double-width
we support a third approach: by using a fixed-sized array to store the internal
nodes we can avoid the use of 32bit pointers, but instead 16bit indices into
the array are sufficient. However this is only possible for fixed-sized data
structures, that have an upper bound of internal nodes.