PrevUpHomeNext

Design

In recent years, C++ developers have become very interested in heterogenous containers -- containers that can contain an value which might be of several different types.

One way to approach the problem is to use a union type. union is a language-level feature inherited from C. [1]

Prior to C++11, a union could only contain POD types. Since C++11, this restriction was relaxed, but the compiler still does not generate useful special member functions, nor can it, really. A union is ultimately a very low level construct -- usually in C++, it is there for compatibility with C, or it is used as a building block in creating a more user-friendly type, rather than as a solution itself.

For purposes of C++, if we want to have a union type which is extended to act more like a standard container, a natural approach is to use a discriminated union. This means, a union, together with an int, or an enum, which keeps track of which type is currently engaged within the union. When we have this extra information, we can access the union in a type-safe way, and give it a copy constructor and so on. This idea was originally described by Andrei Alexandrescu [2].

boost::variant is a well-known implementation of these ideas, and it became a very influential library. Besides providing a generic, type-safe, discriminated union, it also provided a concept of static visitor. A static visitor is function object which the programmer creates in order to "visit" the variant type in a systematic fashion, rather than using a sequence of if ... else or similar .

The function object's implementation may use templates, and so take advantage of C++'s pattern matching abilities when visiting the variant type, which is very powerful and convenient. This is perhaps the closest thing in current versions of C++ to the pattern matching features found in languages like ML, Rust, and Haskell.

However, one of boost::variant's constraints is that it doesn't require C++11 -- its interface and design work in a way that accomodates C++ sans move semantics and noexcept annotations, which gives up certain optimization opportunities. Some of these gains can be realized by using the most recent versions of boost::variant when it enables C++11 features, but some cannot, as we will see.

Never-Empty Guarantee

The core issue in implementing a variant is how to handle a throwing, type-changing assignment [3].

Fundamentally, there is a chicken-and-egg problem: When the variant is assigned to a value different from the currently contained type, we must first destroy the current value in order to vacate the storage, then we can construct the new value in that storage. Clearly, we can't construct the new value without destroying the old value first. But on the other hand, if constructing the new value fails by throwing an exception, we would like to retain the old value, so that we can have roll-back semantics and strong exception safety. Once it's destroyed, we can't exactly do that anymore. So somehow, neither of these steps should happen first.

There are a number of possible ways to resolve this issue.

Double-storage approach

One simple solution is to allocate double the storage for the variant. Then, the new object need not be allocated in the same memory as the old object, so it can always be constructed first and the issue is resolved. [4]

But in some applications, the overhead of double memory usage isn't tolerable.

boost::variant approach

boost::variant resolves this tension as follows. First, it assumes that the contained object is copyable, and then, before any type-changing assignment, makes a backup copy of your object on the heap. Then the object in storage is destroyed and the new one is created. If construction fails, the variant holds onto the heap pointer, and subsequent accesses go via that pointer, until a new assignment takes place. [5]

In many scenarios, this is a great solution -- the storage overhead is at most the size of a pointer. However, if your object is expensive or impossible to copy, then this won't work so well for you.

std::variant approach

After much discussion, the std::variant of C++17 took the following approach: relax the "never-empty" guarantee to a "rarely-empty" guarantee, by introducing an empty state which occurs when construction of the new object fails. This empty state can be queried, and if a variant is visited which was made empty by an earlier exception, then a bad_visit is thrown. This is the so-called "Kona Compromise". [6]

This solution has several obvious advantages -- we don't have to assume that the user type is copyable, and we don't have to make a backup copy at all, which is significantly more efficient in some cases. In terms of performance, we basically get ideal behavior, for assignments and for the storage space used by the variant.

However, it also complicates the interface to the variant, as now the programmer needs to think about and sometimes query the empty state.

In many projects, the empty state isn't a meaningful concern and this solution works wonderfully.

Indeed, many other modern C++ variant types targetting C++11 and C++14 have followed suit and adopted the std::variant interface and behavior [7] [8].

However, in some projects, when the exception caused by object is thrown, we want to recover and proceed with what we were doing. Depending on the lifetimes of the containers in the project, a variant with only a basic exception safety guarantee may be very inconvenient, as now we may have to go find that variant and put it in a good state before attempting to proceed, or risk it throwing a new exception when we try to use it. In these projects, it would be much nicer if the exception only invalidates the object that threw it, and not also the surrounding container.

strict_variant approach

Since most standard and boost containers provide strong-exception safety, it seems desirable to have at our disposal an efficient, modern C++ variant type which also provides this guarantee. And indeed, there have been some standardization proposals like this [9] [10].

strict_variant achieves this by making a different implementation decision with a different trade-off. Instead of compromising on strong-exception safety, we compromise somewhat on the stack-based goal -- we simply say that any type that is not no-throw move-constructible needs to be allocated on the heap.

Specifically, objects with throwing moves get put in a wrapper which allocates them on the heap. The wrapper only holds a pointer, and moving the pointer can't fail, even if the object has a throwing move.

For example, this solves the copy-assignment problem as follows:

When using the easy_variant template alias, the throwing move is detected and the wrapper is handled transparently for the programmer, giving a general, easy-to-use, and performant variant.

From the user's point of view, this design results in a very familiar feel:

Of course, this implementation pays a cost when we try to access one of the types with a throwing move, since it is on the heap. However, this also is somewhat consistent with the rest of the standard library -- since C++11, most programmers have learned that they should try hard to make their objects no-throw moveable if possible, since many standard library containers such as vector will be able to use a "fast path" in that case and be much more efficient.

While it's been well-established that throwing moves are sometimes hard to avoid and can't reasonably be banned from general-purpose containers, [11], it seems that supporting them by putting them on the heap is a reasonable solution to the problem. If such objects are large and complex enough, it may be favorable to allocate them on the heap anyways.

Easy control of abstraction penalty

Another nice aspect of our design is that we make it easy for the programmer to control the overhead that they pay to support the throwing move situation, to the degree that they want to.

strict_variant

The "real" implementation of the variant type, strict_variant, only fully supports types which are no-throw move constructible.

If it is constructed with a type that is not no-throw move constructible, then the variant cannot generate assignment operators -- a compiler error results.

In some projects, this is more desirable than a container which silently begins to make extra heap allocations. When a strict_variant cannot be assigned and needs to be, the programmer resolves the situation either by factoring out the throwing move, or putting the type in recursive_wrapper within the variant declaration. (Or using the emplace member function, rather than assignment.)

[Note] Note

Recall that boost::recursive_wrapper is a component from boost::variant.

A recursive_wrapper<T> represents a heap-allocated instance of T.

In boost::variant, this is used to allow declaring a variant which contains an incomplete type in a simple and ergonomic way -- there is special support code within variant's interface so that accessing an object in a recursive_wrapper can be done using the same syntax.

Since our recursive_wrapper is no-throw move constructible, even if T is not, this effectively works around the issue, and the rest of your code using the variant can compile without changes after substituting recursive_wrapper<T> for T.

When using strict_variant, you thus control directly exactly what extra heap allocations are made in support of strong exception-safety, and it happens in a transparent way -- you don't pay for anything you don't use.

If you can manage to make all your types no-throw move constructible, then you get ideal variant performance -- no storage overhead, no extra copies or dynamic allocations, and no empty state to worry about.

easy_variant

We provide a template alias easy_variant which takes care of these details if you don't care to be bothered by the compiler about a throwing move / dynamic allocation.

(Some programmers would prefer that the compiler not start making dynamic allocations without a warning, just because some noexcept annotation was not deduced the way they expected. But programmer convenience is a good thing too.)

Specifically, any type that you put in the easy_variant which has a throwing move will be wrapped in recursive_wrapper implicitly.

namespace strict_variant {
  template <typename ... Ts>
  using easy_variant = variant<wrap_if_thowing_move_t<Ts>...>;
}

where wrap_if_throwing_move_t is

namespace strict_variant {
  struct <typename T, typename = mpl::enable_if_t<std::is_nothrow_destructible<T>::value && !std::is_reference<T>::value>>
  struct wrap_if_throwing_move {
    using type = typename std::conditional<std::is_nothrow_move_constructible<T>::value,
                                           T,
                                           recursive_wrapper<T>>::type;
  };

  template <typename T>
  using wrap_if_throwing_move_t = typename wrap_if_throwing_move<T>::type;
}

So, easy_variant means that if something starts throwing in the future, you won't have to go back and change your variant declaration. But, if you can manage to make all the types involved no-throw move constructible, then you still don't pay anything for what you don't use.

Implicit Conversions

A secondary motivation of strict_variant is to reduce the errors caused by implicit conversions when using a variant, and help achieve consistent cross-platform behavior.

For instance, some programmers consider it very undesirable that the following code compiles without a warning or error.

boost::variant<std::string, int> v;

v = true;

When an assignment like this takes place, boost::variant determines what type will be used via overload resolution. That is, a function object f is created, which has overloads for std::string and int arguments. If overload resolution succeeds, then it determines which type will result inside of the variant.

A drawback of this is that it means that any C++ implicit conversions are permitted to take place when assigning a variant type.

Overload resolution is a core C++ language feature, and in 95% of cases it works very well and does the right thing.

However, in the 5% of cases where overload resolution does the wrong thing, it can be quite difficult to work around it. This includes the scenarios in which overload resolution is ambiguous, and those in an overload is selected which the user did not intend due to a subtle conversion.

Because integral types have so many permitted conversions, these problems are particularly obvious when you have a variant with several integral types.

This happens commonly when using variant types to interface with some scripting language for instance. The typical dynamically-typed scripting language will permit a variety of primitive values, so when binding to it, you may naturally end up with something like

boost::variant<bool, int, float, std::string, ...>

To make it easier to get good results when manipulating a variant like this, strict_variant prevents many implicit conversions from occurring when two arithmetic or pointer types types are involved.

Safe Conversions

When assigning or constructing a variant, some conversions which may be "surprising" or which are not totally portable are not permitted.

So for instance,

strict_variant<std::string, int> v;

v = true;

is an error because bool cannot be used to construct int within the variant, and

strict_variant<long, float> v;

v = 100;

is not ambiguous -- int to float is not allowed by our rules, so it is eliminated from the overload resolution using SFINAE, and long is unambiguously selected.

With boost::variant, code like this is okay:

boost::variant<const void *, bool> v;

const char * str = "foo";
v = str;

With strict_variant, it isn't -- neither conversion const char * to const void * or to bool is permitted to occur implicitly here.

[Note] Note

strict_variant conversion rules can only affect the result when the type contained in the variant is an arithmetic type or a pointer type.

When the type contained is a user-defined type, all the normal conversions take effect -- if your type is constructible from unsigned long long, we can't prevent the compiler from selecting it when assigning from an int using SFINAE so far as I know. (And it's not clear that that would be desirable anyways.)

The actual test is performed by a type trait, strict_variant::safely_constructible, which refines std::is_constructible. You can make tests against that if you want to e.g. make static assertions that certain conversions are or aren't permitted.

Rank-based Priority

A second aspect of strict_variant is that we adopt a rank-based priority policy to resolve some overload ambiguities involving arithmetic types.

Consider the following overloaded function:

void f(long);
void f(long long);

f(100);

Under standard C++ rules, this function call is ambiguous: The argument f is an int, and both long and long long can be obtained from int by conversion. Since neither has a worse conversion, there is no best conversion.

When assigning to a variant however, by far the most common situation is that the user wants to use the "smallest" type that will fit, since "widening" represents a loss of information in some sense. (After the widening conversion, our knowledge of the bounds of the value are worse than they were before.) If later when they are visiting that value, they want to convert it to a larger type, that is easy to arrange.

Therefore, when an integral value is assigned to strict_variant, if there are multiple integral value types which permit a safe conversion, it prioritizes them by rank and sign. If for one of the candidates, there is another candidate which has strictly lower rank / sign, it is "dominated" by the lower rank candidate, and does not participate in overload resolution.

For instance, int is safely convertible to both long and long long. But long dominates long long when making arithmetic conversions, so long long is eliminated, and long is selected unambigously when assigning an int to strict_variant<long, long long>.

Emplace

There is a second way to achieve a type-changing assignment: the emplace member function.

v.emplace<long long>(100);
v.emplace<unsigned long long>(100);

emplace takes a single template parameter, the type of the desired object to emplace. All other parameters are forwarded to its constructor. This can be used to put objects in the variant that are neither copyable nor moveable, so long as they are somehow no-throw constructible.

This can be used to specify explicitly what value type shall result, regardless of overload resolution or the strict_variant rules.



[2] Andrei Alexandrescu. "Generic<Programming>: Discriminated Unions" series: Part 1, Part 2, Part 3. C/C++ Users Journal. 2002.

[3] C.f. Axel Naumann "The Variant Saga: A happy ending?" https://isocpp.org/blog/2015/11/the-variant-saga-a-happy-ending

[4] C.f. Anthony Williams. Double-Storage Proposal. 2002.

[5] C.f. Eric Friedman, Itay Maman. Online boost::variant documentation boost:doc/html/variant/design.html#variant.design.never-empty

[6] C.f. Axel Naumann. "Variant: a type-safe union (v6)" Proposal P0088r1 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0088r1.html

[7] C.f. Agustín Bergé. eggs::variant online documentation. https://github.com/eggs-cpp/variant

[8] C.f. Artem Pavlenko. mapbox::variant online documentation. https://github.com/mapbox/variant

[9] C.f. David Sanke "Simply a Strong Variant". P0093R0 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0093r0.html

[10] C.f. Anthony Williams "P0110R0: Implementing the strong guarantee for variant<> assignment" P0110R0 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0110r0.html


PrevUpHomeNext