For which T is optional regular?

As we have already seen, optional_unsigned_int is a regular type. optional<T> is regular as well, at least if T is regular, too. In this case the compiler can generate all the required constructors and operators for us, because a regular T has them defined as well. A so-called special member function can be generated by the compiler for a given class or struct if all of its members have this special member function defined. This implies that we can compose regular types out of other regular types, because in this case the members have all the required special member functions defined.
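A minimal sketch of this composition rule follows. The type Point and the helper copy_is_equal are hypothetical and only serve as an illustration; the type traits assume C++11 or later and are used purely as compile-time checks. Note that the equality operator is written by hand here, since it is not generated implicitly.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical example type: every member has all special member functions,
// so the compiler can generate them for Point as well.
struct Point {
  int x;
  std::string label;
};

static_assert(std::is_default_constructible<Point>::value,
              "default constructor is generated");
static_assert(std::is_copy_constructible<Point>::value,
              "copy constructor is generated");
static_assert(std::is_copy_assignable<Point>::value,
              "copy assignment operator is generated");

// Equality is not generated implicitly, so it is provided by hand.
inline bool operator==(const Point& a, const Point& b) {
  return a.x == b.x && a.label == b.label;
}

// Copies p with the generated copy constructor and compares the copy to p.
inline bool copy_is_equal(const Point& p) {
  Point copy(p);
  assert(copy == p);
  return copy == p;
}
```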

This is nice, but a bit restricting. Let’s have a look at our optional’s default constructor:

template<typename T>
class optional {
 public:
  optional()
      : mHasValue(false) {}
  //...
};

A default constructed optional<T> has no value. But in our current implementation, T’s default constructor (if it exists) is called during the execution of optional<T>’s default constructor. This is unfortunate, because our optional’s class invariant does not allow a user to access the value of an optional which has no value. Accessing the value is only allowed after an optional with a value has been assigned to it. This means that the default constructed T exists only to be either overwritten or destructed.
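The effect can be made visible with a small experiment. This is only a sketch: naive_optional is an assumed reduction of the implementation so far (the value stored as a plain member), and CountingDefault as well as default_constructions_for_empty_optional are hypothetical helpers.

```cpp
#include <cassert>

// Hypothetical helper type that counts how often its default constructor runs.
struct CountingDefault {
  static int& count() {
    static int c = 0;
    return c;
  }
  CountingDefault() { ++count(); }
};

// Assumed shape of the implementation so far: the value is a plain member.
template <typename T>
class naive_optional {
 public:
  // mValue is not mentioned here, so it is default constructed implicitly.
  naive_optional() : mHasValue(false) {}

  bool has_value() const { return mHasValue; }
  const T& value() const { return mValue; }

 private:
  bool mHasValue;
  T mValue;
};

// Returns how many CountingDefault objects one empty optional constructs.
inline int default_constructions_for_empty_optional() {
  const int before = CountingDefault::count();
  naive_optional<CountingDefault> x;
  assert(!x.has_value());
  return CountingDefault::count() - before;
}
```

Although the optional is empty, one CountingDefault object is constructed for it.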

This is unfortunate. We need a mechanism to avoid this useless construction of a T.

Implementing a test for this issue

How do we test such behavior? We want to make sure that optional never uses its value type’s default constructor, so it is sufficient to use a value type which is not default constructable. This can be achieved by simply defining another constructor.

struct NonDefaultConstructable {
  NonDefaultConstructable(int) {}
};

TEST_CASE(
    "An optional of a non default constructable type can be default "
    "constructed.") {
  const optional<NonDefaultConstructable> x;
  REQUIRE(x.has_value() == false);
}

NonDefaultConstructable’s default constructor is not generated, because another constructor is defined. The test defined above will not compile, because optional’s default constructor needs to call NonDefaultConstructable’s default constructor, which does not exist. This is exactly what we wanted to achieve: the test fails as long as optional depends on its value type’s default constructor.
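If a C++11 compiler is at hand, the missing default constructor can also be checked directly. The following sketch uses the standard type traits only as a compile-time check; it is not part of the test suite.

```cpp
#include <type_traits>

struct NonDefaultConstructable {
  NonDefaultConstructable(int) {}
};

// Declaring any constructor suppresses the implicit default constructor.
static_assert(
    !std::is_default_constructible<NonDefaultConstructable>::value,
    "NonDefaultConstructable must not be default constructable");

// Construction from an int is still possible.
static_assert(std::is_constructible<NonDefaultConstructable, int>::value,
              "NonDefaultConstructable can be constructed from an int");
```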

Using a sledgehammer

What we need is a lazy or delayed initialization of the optional’s value (note that the term “lazy initialization” usually means something else). A pretty common way of implementing such a delayed initialization of a class member is using the heap.

template<typename T>
class optional {
 public:
  optional()
      : mHasValue(false)
      , mValue(NULL) {}

  optional(const T& value)
      : mHasValue(true)
      , mValue(new T(value)) {}
  //...
 private:
  bool mHasValue;
  T* mValue;
};

Please note that this implementation is not complete; it is just meant to illustrate the idea:

  • Store a pointer to T instead of T in the optional.
  • Initialize mValue with a null pointer if the optional is empty.
  • Allocate a copy of the value on the heap if the optional is not empty.
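To make the idea more tangible, here is one way the elided parts could be filled in. This is only a sketch under assumptions: the name heap_optional, the accessors has_value() and value(), the destructor and the copy operations are not taken from the implementation above, they merely show what the pointer-based approach requires.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class heap_optional {
 public:
  heap_optional() : mHasValue(false), mValue(NULL) {}
  heap_optional(const T& value) : mHasValue(true), mValue(new T(value)) {}

  // Deep copy: a new T is allocated only if the source holds a value.
  heap_optional(const heap_optional& other)
      : mHasValue(other.mHasValue), mValue(NULL) {
    if (other.mHasValue) mValue = new T(*other.mValue);
  }

  heap_optional& operator=(const heap_optional& other) {
    if (this != &other) {
      T* copy = NULL;
      if (other.mHasValue) copy = new T(*other.mValue);  // may throw
      delete mValue;  // the old value is only released after the copy succeeded
      mValue = copy;
      mHasValue = other.mHasValue;
    }
    return *this;
  }

  ~heap_optional() { delete mValue; }  // deleting a null pointer does nothing

  bool has_value() const { return mHasValue; }

  const T& value() const {
    assert(mHasValue);
    return *mValue;
  }

 private:
  bool mHasValue;
  T* mValue;
};

// Value type without a default constructor, as in the test above,
// extended by a member so that the stored value can be inspected.
struct NonDefaultConstructable {
  NonDefaultConstructable(int v) : x(v) {}
  int x;
};
```

With this sketch, heap_optional<NonDefaultConstructable> can be default constructed, because no T is created until a value is stored.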

This implementation works and solves our problem: T’s constructors are only called if needed. But this implementation has a major drawback: it requires us to use heap memory. The allocation of heap memory has many problems:

  • Allocating memory on the heap is rather slow (in the old implementation, the memory for the value was already allocated together with the enclosing optional).
  • Allocating memory on the heap may fail: we would have to deal with this special case.
  • Accessing objects on the heap may be slow due to cache misses.

These issues should convince us to drop this implementation and choose another one.

Using a union

It turns out that we can use a language feature which C++ inherited from C: unions.

template<typename T>
class optional {
 public:
  optional()
      : mHasValue(false)
      , mNoValue() {}

  optional(const T& value)
      : mHasValue(true)
      , mValue(value) {}
  //...
 private:
  bool mHasValue;

  struct NoValue {};
  union {
    NoValue mNoValue;
    T mValue;
  };
};

A union looks, at first glance, like a struct. But unlike a struct, which contains all of its members at the same time, a union contains only one of its members at any given time. In our case this means that, if mNoValue is initialized, the union contains an instance of NoValue, or, if mValue is initialized, the same union contains an instance of T.
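The difference between a struct and a union can be sketched with two built-in types. The types Both and Either and the function last_written are hypothetical and not part of our optional.

```cpp
#include <cassert>

// Holds an int and a double at the same time.
struct Both {
  int i;
  double d;
};

// Holds an int or a double, but never both.
union Either {
  int i;
  double d;
};

// Writes one member after the other; only the member written last may be read.
inline double last_written(Either& e) {
  e.i = 42;  // i is now the active member
  assert(e.i == 42);
  e.d = 1.5;  // d becomes the active member, i must not be read anymore
  return e.d;
}
```

Because the members of Either share the same storage, the union needs less space than the struct Both on common platforms.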

NoValue is a dummy type. Its only purpose is to be the other member of this union. Its most important properties are

  • it has a minimal size (in most cases it is one byte),
  • it is default constructable and
  • it is clearly intended to only be used in this context.

Its minimal size is important because a union is always at least as big as its biggest member. The size of an optional should under no circumstances be determined by the size of this dummy type NoValue, but only by the size of T (and maybe its alignment requirements). NoValue needs to be default constructable so that mNoValue can be default initialized in optional’s default constructor. This implies that T’s default constructor is not needed anymore, which is the goal we wanted to achieve in this post.
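Put together, a reduced version of the union-based optional could look like the following sketch. It assumes C++11 and a value type with a trivial copy constructor and destructor; the name union_optional and the accessors has_value() and value() are assumptions made for this illustration.

```cpp
#include <cassert>

// Value type without a default constructor, extended by a member
// so that the stored value can be inspected.
struct NonDefaultConstructable {
  NonDefaultConstructable(int v) : x(v) {}
  int x;
};

template <typename T>
class union_optional {
 public:
  // Only the dummy member is initialized, no T is constructed.
  union_optional() : mHasValue(false), mNoValue() {}

  // T's copy constructor is used to initialize the value member.
  union_optional(const T& value) : mHasValue(true), mValue(value) {}

  bool has_value() const { return mHasValue; }

  const T& value() const {
    assert(mHasValue);
    return mValue;
  }

 private:
  bool mHasValue;

  struct NoValue {};
  union {
    NoValue mNoValue;
    T mValue;
  };
};
```

An empty union_optional<NonDefaultConstructable> can now be created, although NonDefaultConstructable has no default constructor.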

Note that the union which we are using here is a so-called anonymous union. cppreference.com states that

Members of an anonymous union are injected in the enclosing scope.

This is quite nice, because mValue can be used like a usual member of optional, so we don’t need to change the rest of our implementation. C.182 of the C++ Core Guidelines recommends the use of an anonymous union for tagged unions. We can consider optional<T> a tagged union with two “types”, T and “has no value”, so this rule applies quite well to our case here.
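As an illustration of both points, here is a small tagged union built with an anonymous union. The type Number and the functions make_integer, make_floating and as_double are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Hypothetical tagged union: kind is the tag, the anonymous union holds the data.
struct Number {
  enum Kind { Integer, Floating };
  Kind kind;
  union {  // anonymous: i and d are used like direct members of Number
    int i;
    double d;
  };
};

inline Number make_integer(int v) {
  Number n;
  n.kind = Number::Integer;
  n.i = v;  // no union name is needed to reach the member
  return n;
}

inline Number make_floating(double v) {
  Number n;
  n.kind = Number::Floating;
  n.d = v;
  return n;
}

// Reads whichever member the tag declares to be active.
inline double as_double(const Number& n) {
  assert(n.kind == Number::Integer || n.kind == Number::Floating);
  return n.kind == Number::Integer ? static_cast<double>(n.i) : n.d;
}
```

In our optional, mHasValue plays the role of the tag.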

This approach (using a union) is nice but comes at a cost: it does not work with C++03. If we compile this version of our optional with our current test suite, the compiler (GCC) will raise this error:

error: member ‘CopyCounting optional<CopyCounting>::<unnamed union>::mValue’ with copy assignment operator not allowed in union
note: unrestricted unions only available with ‘-std=c++11’ or ‘-std=gnu++11’

Thankfully, GCC tells us right away which issue we are facing here: “unrestricted unions only available with ‘-std=c++11’”. It turns out that C++98 and C++03 are rather picky about the types of union members. Roughly speaking, only plain old data (POD) types are allowed to be members of unions in these older C++ standards, meaning that basically only types which could be defined in C as well may be used.

In C++11, these restrictions have been lifted, which gave us unrestricted unions. In order to be able to use them, we are forced to upgrade our C++ version to C++11.
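To get a feeling for what an unrestricted union allows, consider a union member of type std::string, which C++03 rejects because of its non-trivial special member functions. The class maybe_string below is a hypothetical sketch that assumes C++11; it is not our optional, and it leaves out copying on purpose.

```cpp
#include <cassert>
#include <string>

class maybe_string {
 public:
  maybe_string() : mHasValue(false), mNoValue() {}
  explicit maybe_string(const std::string& value)
      : mHasValue(true), mValue(value) {}

  // The union does not track which member is active, so the string's
  // destructor has to be called by hand if a string has been constructed.
  ~maybe_string() {
    if (mHasValue) mValue.~basic_string();
  }

  // Copying is left out of this sketch.
  maybe_string(const maybe_string&) = delete;
  maybe_string& operator=(const maybe_string&) = delete;

  bool has_value() const { return mHasValue; }

  const std::string& value() const {
    assert(mHasValue);
    return mValue;
  }

 private:
  bool mHasValue;

  struct NoValue {};
  union {
    NoValue mNoValue;
    std::string mValue;  // not allowed as a union member before C++11
  };
};
```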

set(CMAKE_CXX_STANDARD 11)

After applying the change above to our CMakeLists.txt, our test suite compiles and runs without any errors.

Please note that it would actually be possible to achieve something similar with the facilities provided by C++98. But these are rather tricky. Maybe we will explore them at a different time. If you want to get an idea of a C++98 implementation of such an optional, you may have a look at optional-like.

Conclusion

Until this post, our optional needed its value type T to be default constructable to be usable. This was unfortunate, but we have been able to overcome this limitation by using a union.

But in order to be able to use a union for this purpose, we were required to upgrade to C++11. Nevertheless, “unrestricted unions” will be the only C++11 feature which we will use in the near future.

However, with this adoption of unions in our implementation, we introduced other limitations which had not been there before this change. These limitations do not stand out currently, because we don’t have the right tests yet. But we will introduce them in the upcoming post.