The copy constructor strikes back

The next step towards obeying the rule of three

In the last post we already suspected that, since we had to define a custom destructor, we should also define a custom copy constructor, as the rule of three demands. To do so, we first need a test that requires this custom copy constructor. It makes sense to extend our existing test case, which is concerned with copying our optional, with a section dedicated to value types with a non-trivial copy constructor, such as std::string.

TEST_CASE("An optional and its copy are equal.") {
  // ...
  SECTION("copy constructor with a value (non trivial copy constructor)") {
    std::string anyValueX = "value";
    optional<std::string> x(anyValueX);
    optional<std::string> y(x);
    REQUIRE(x == y);
  }
  // ...
}

This test won’t even compile: the compiler cannot generate a copy constructor for a union with a member whose copy constructor is non-trivial, so the implicitly declared copy constructor of our optional is deleted.
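The compiler’s refusal can be checked at compile time. The following is a minimal sketch, not the optional from this series: Holder is a hypothetical stand-in which, like our optional, is assumed to keep a std::string inside an anonymous union.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for the storage of optional<std::string>.
struct Holder {
  Holder() : hasValue(false) {}  // leaves `value` unconstructed
  ~Holder() {}                   // this sketch never constructs `value`, so nothing to destroy
  bool hasValue;
  union {
    std::string value;  // member with a non-trivial copy constructor
  };
};

// std::string itself is copyable ...
static_assert(std::is_copy_constructible<std::string>::value,
              "std::string can be copied");
// ... but the implicitly declared copy constructor of Holder is deleted.
static_assert(!std::is_copy_constructible<Holder>::value,
              "no copy constructor is generated for Holder");
```

Both assertions hold with a C++11 compiler, which matches the compile error we get from the new test section.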

The naive approach

So we need to define a copy constructor for our optional. A first, quite naive implementation could look like this.

template <typename T>
class optional {
 public:
  // ...
  optional(const optional& other)
      : mHasValue(other.mHasValue)
      , mValue(other.mValue) {}
  // ...
};

This implementation will compile, and our new test will even pass. But this implementation could have been generated by the compiler with ease: a compiler-generated copy constructor usually looks (at least conceptually) just like that. This should make us suspicious. It seems that we need another test: we should test the empty case for non-trivial value types, too!

TEST_CASE("An optional and its copy are equal.") {
  // ...
  SECTION("copy constructor without a value (non trivial copy constructor)") {
    optional<std::string> x;
    optional<std::string> y(x);
    REQUIRE(x == y);
  }
  // ...
}

This new section compiles as well, but running it will most likely end in a segmentation fault. The reason is that our naive copy constructor tries to copy a std::string which isn’t there. More precisely, std::string’s copy constructor is invoked on whatever bytes happen to be there and interprets them as a std::string, which is undefined behavior. As a std::string contains at least one pointer, this will probably lead to an out-of-bounds memory access, which causes the segmentation fault.

Making a conditional copy - attempt I

In order to fix this issue, we must make sure that std::string’s copy constructor is only invoked if other actually holds a std::string, that is, only if other.mHasValue is true. This cannot be done in the constructor’s initializer list, because C++ offers no mechanism to invoke a constructor conditionally there, so we cannot initialize mValue in the initializer list. Moving the copy into the constructor body instead, a next attempt could be this one.

template <typename T>
class optional {
 public:
  // ...
  optional(const optional& other)
      : mHasValue(other.mHasValue) {
    if (other.mHasValue) {
      mValue = other.mValue;
    }
  }
  // ...
};

This will actually make our test pass: instead of relying on the value type’s copy constructor, we are using its assignment operator. But this approach has two major issues:

It will only work for value types that are assignable, but not for value types that are copyable but not assignable. For example, a struct or class containing at least one const member may be copyable (because even the const member can be copied), but not assignable (because we cannot assign to a constant). Likewise, an instantiation like optional<const int> would not be copyable, although a const int can easily be copied. This would be quite unfortunate.
The assignment operator needs to be called on a valid object, but in this situation mValue is not a valid object: it has never been initialized. So we are invoking the assignment operator on uninitialized memory here, which is undefined behavior. This might work in some cases, but only by coincidence.

Given these two issues, we must revise our implementation and come up with another solution.
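The first issue, types that can be copied but not assigned, can be made concrete with the standard type traits. This is only an illustrative sketch; the struct Tagged is a hypothetical example type and not part of our optional.

```cpp
#include <type_traits>

// Hypothetical value type with a const member.
struct Tagged {
  const int id;
};

// Copying is fine: the const member is simply initialized from the source.
static_assert(std::is_copy_constructible<Tagged>::value,
              "Tagged can be copy constructed");
// Assigning is not: it would have to overwrite the const member.
static_assert(!std::is_copy_assignable<Tagged>::value,
              "Tagged cannot be copy assigned");

// The same holds for a const-qualified built-in type.
static_assert(std::is_copy_constructible<const int>::value,
              "const int can be copy constructed");
static_assert(!std::is_copy_assignable<const int>::value,
              "const int cannot be copy assigned");
```

A copy constructor that relies on T’s assignment operator would therefore fail to compile for such types as soon as it is instantiated.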

Interlude: How to call a constructor?

So, how can we call a constructor? C++ basically has two ways of calling a class’s constructor, and hence of constructing a new object (object meaning anything that takes up memory, such as values of built-in types or instances of classes and structs): with or without the new operator.

The group of mechanisms for constructing an object without the new operator is the predominant way of creating objects in modern C++. We have used it exclusively so far. It is used to create objects everywhere except on the heap: on the stack, as a global/static variable or constant, or within a class’s initializer list.

Constructing an object using the new operator is different. We use this syntax if we want to create an object on the heap; like in this example, in which we create an integer with the value of 5 on the heap.

int * x = new int(5);

The new operator is actually doing two different things here:

It allocates the memory needed to store the int on the heap. This can be roughly compared with calling malloc in C.
It constructs the int in the newly allocated piece of memory.

As the new operator is obviously already known for constructing objects (hence calling constructors), and as there are situations in which one needs to create an object which resides at a given address, it made sense to add this capability to the new operator. It is in fact possible to pass a pointer to the new operator, which points to the target location in memory, at which the new object shall be created. We could rewrite the example from above like this:

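int * x = static_cast<int*>(malloc(sizeof(int)));  // allocate only, nothing is constructed yet
new(x) int(5);                                     // construct an int with the value 5 at that address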

The new operator will not allocate memory on its own now, but will use the memory location pointed to by x. This variant of the new operator is called placement new; it is declared in the standard header <new>.
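For a type with a non-trivial constructor and destructor, the full life cycle with placement new looks a bit more involved, because we also have to undo both steps ourselves. The following is a minimal sketch under the assumption that the memory comes from malloc; the function name placedStringLength is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

std::size_t placedStringLength() {
  // Step 1: obtain raw memory. No std::string exists yet.
  void* raw = std::malloc(sizeof(std::string));
  assert(raw != nullptr);  // sketch only: real code should handle allocation failure

  // Step 2: construct a std::string at exactly that address.
  std::string* s = new (raw) std::string("hello");
  const std::size_t length = s->size();

  // Cleanup mirrors the two steps: first destroy the object explicitly ...
  s->~basic_string();
  // ... then release the raw memory.
  std::free(raw);
  return length;
}
```

Note that delete must not be used here: the memory was not allocated by new, so the destructor call and the deallocation have to be performed separately.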

Making a conditional copy - attempt II

Now we know how to construct an object at a given memory location or – putting it differently – how to call a (copy) constructor. Using this new knowledge we can now fix our implementation.

template <typename T>
class optional {
 public:
  // ...
  optional(const optional& other)
      : mHasValue(other.mHasValue) {
    if (other.mHasValue) {
      new (&mValue) T(other.mValue);
    }
  }
  // ...
};

With this implementation, all the issues and limitations of the other approaches are circumvented:

A copy is performed if and only if there is something to copy.
The new object is initialized properly; no member functions (like the assignment operator) are called on uninitialized memory.
Only the existence of the copy constructor is required; the assignment operator is not needed.
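To see these properties together, here is a self-contained sketch. It is not the optional from this series: the anonymous union, the destructor, the accessors has_value and value, the struct Label and the function copyDemo are all assumptions made only to get a compilable example around the copy constructor shown above.

```cpp
#include <cassert>
#include <new>
#include <string>

namespace sketch {

template <typename T>
class optional {
 public:
  optional() : mHasValue(false) {}
  explicit optional(const T& value) : mHasValue(true) {
    new (&mValue) T(value);
  }
  optional(const optional& other) : mHasValue(other.mHasValue) {
    if (other.mHasValue) {
      new (&mValue) T(other.mValue);  // copy construct in place, only if needed
    }
  }
  ~optional() {
    if (mHasValue) {
      mValue.~T();  // destroy the value only if one was constructed
    }
  }
  bool has_value() const { return mHasValue; }
  const T& value() const {
    assert(mHasValue);  // reading an empty optional is a bug in this sketch
    return mValue;
  }

 private:
  bool mHasValue;
  union {
    T mValue;  // assumed storage layout: not constructed automatically
  };
};

// Hypothetical type that is copyable but not assignable.
struct Label {
  const int id;
};

bool copyDemo() {
  const std::string text = "value";
  optional<std::string> full(text);
  optional<std::string> fullCopy(full);    // copies the contained string
  optional<std::string> empty;
  optional<std::string> emptyCopy(empty);  // copies nothing
  const Label seven = {7};
  optional<Label> label(seven);
  optional<Label> labelCopy(label);        // needs no assignment operator
  return fullCopy.has_value() && fullCopy.value() == "value" &&
         !emptyCopy.has_value() &&
         labelCopy.has_value() && labelCopy.value().id == 7;
}

}  // namespace sketch
```

Calling sketch::copyDemo() exercises all three points: the filled copy, the empty copy, and a value type without an assignment operator.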

Conclusion

This one was quite tough: in order to properly copy our optional, we needed to get to know a mechanism for invoking constructors and constructing objects that was new to us, called placement new. But as soon as we knew that, the actual implementation of the copy constructor was not that hard after all.

But our optional still does not comply with the rule of three: the copy assignment operator has not been defined yet. This will be our next step.