Understanding Move Semantics and Perfect Forwarding: Part 2


Rvalue References and Move Semantics

In the previous article I discussed lvalues and rvalues, what they mean, and how they can affect a developer’s ability to write C++ applications. In this article I will take it a step further and finally introduce rvalue references, and show why an understanding of rvalues is required for implementing move semantics.

We looked at how passing arguments by reference was important for performance, yet this imposed limitations. Only lvalues could be passed by non-const reference; rvalues could be passed only if the reference was const, meaning that the value couldn’t be modified.

As C++ 11 introduces the concept of rvalue references, the references we looked at previously are now commonly referred to as lvalue references. A non-const lvalue reference can bind only to a modifiable lvalue. A const lvalue reference can bind to any lvalue, modifiable or not, and it can also bind to an rvalue.
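The binding rules for lvalue references can be sketched as follows. The function names (increment, readOnly, bindingDemo) are hypothetical and only serve this illustration.

```cpp
// Takes a non-const lvalue reference: callers must pass a modifiable lvalue.
void increment(int& value) { ++value; }

// Takes a const lvalue reference: accepts lvalues and rvalues, read-only.
int readOnly(const int& value) { return value; }

int bindingDemo()
{
    int x = 1;
    increment(x);                      // OK: x is a modifiable lvalue, x becomes 2
    // increment(x + 1);               // error: an rvalue cannot bind to int&
    int fromLvalue = readOnly(x);      // OK: an lvalue binds to const int&
    int fromRvalue = readOnly(x + 1);  // OK: an rvalue binds to const int&
    return fromLvalue + fromRvalue;    // 2 + 3
}
```

Uncommenting the increment(x + 1) line should produce a compile error, which is the limitation that rvalue references were introduced to address.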

An rvalue reference is a reference that can bind only to an rvalue, meaning you cannot pass an lvalue to a function that takes an rvalue reference parameter. Below is an example of an rvalue reference, denoted with a double ampersand ‘&&’.

int &&rvalueRef = (x+1); // rvalueRef is an rvalue reference
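One way to see this rule in action is to overload a function on both reference types and check which overload the compiler picks. This is a small sketch; the kind overloads and the two wrapper functions are hypothetical names, not part of any library.

```cpp
#include <string>

// Hypothetical overloads used only to report which reference type was chosen.
std::string kind(int&)  { return "lvalue"; }
std::string kind(int&&) { return "rvalue"; }

std::string kindOfLvalue()
{
    int x = 1;
    return kind(x);      // x has a name and an address, so int& is chosen
}

std::string kindOfRvalue()
{
    int x = 1;
    return kind(x + 1);  // x + 1 is a temporary, so int&& is chosen
}
```

If only the int&& overload existed, the call kind(x) would fail to compile, because an lvalue cannot bind to an rvalue reference.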

Rvalue references in C++ 11 let us use rvalues in ways that were not previously possible, as shown in the following code:

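std::string Hello()
{
    return "Hello World";
}
std::string &&rvalRef = Hello();   // binds to the temporary returned by Hello()
rvalRef = "Hello";                 // the temporary can be modified through the reference
std::cout << rvalRef << std::endl; // Outputs "Hello"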

The above code highlights how rvalue references change the way we interact with rvalues. Firstly, a temporary is no longer restricted to living only within the expression that creates it. Its lifetime becomes linked to the rvalue reference bound to it, so it exists until that reference goes out of scope.

Prior to C++ 11 we could extend an rvalue’s lifetime by binding it to a const lvalue reference; however, in that case the rvalue was unmodifiable. With rvalue references, the rvalue can now be modified.
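The difference between the two kinds of lifetime extension can be sketched side by side. The helper makeGreeting and the function lifetimeDemo are hypothetical names introduced for this example.

```cpp
#include <string>

// Hypothetical helper that returns a temporary std::string.
std::string makeGreeting() { return "Hello World"; }

std::string lifetimeDemo()
{
    // Pre-C++11 style: the temporary lives as long as constRef, but is read-only.
    const std::string& constRef = makeGreeting();
    // constRef += "!";   // error: cannot modify through a const reference

    // C++11: the temporary lives as long as rvalRef and can be modified.
    std::string&& rvalRef = makeGreeting();
    rvalRef += "!";

    return constRef + " / " + rvalRef;
}
```

Both temporaries stay alive until the end of lifetimeDemo, but only the one bound to the rvalue reference can be changed.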

Now we are finally here: we understand rvalues, lvalues, lvalue references, and rvalue references. We are almost at the point of learning about move semantics, but first we need to make sure we understand temporary objects, otherwise move semantics are not going to make much sense.

Temporary Objects

A temporary object is an object created during the evaluation of an expression, and its purpose is to provide a temporary location for the result of that expression. Without temporary objects we would have to define variables to store these results. This was covered in more detail in the previous article.

It is worth remembering that an expression in C++ is any piece of code that produces a value, so there are many times within a C++ application that temporary objects are created. Some examples include:

  • Returning a value from a function
  • Literals
  • Implicit conversions
  • Passing by value to a function

The reason we are talking about temporary objects is that they are rvalues. Whenever a temporary object is created, storage is set aside for it and the result of the expression is copied into it. This copying of data into and out of temporary objects can cause seemingly hidden performance issues within a C++ application, and it is what move semantics aim to address.
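To make the hidden copy visible, the sketch below counts copy constructor calls for a type that has no move constructor, which is how every class behaved before C++ 11. The Tracked type and countCopiesFromTemporary function are hypothetical names used only for this illustration.

```cpp
#include <vector>

// Hypothetical type that counts how often it is copied.
// Declaring a copy constructor suppresses the implicit move constructor,
// so this type behaves like a pre-C++11 class.
struct Tracked
{
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

int countCopiesFromTemporary()
{
    Tracked::copies = 0;
    std::vector<Tracked> items;
    items.reserve(1);            // avoid reallocation, which could add copies
    items.push_back(Tracked());  // the temporary is copied into the vector
    return Tracked::copies;
}
```

The temporary created by Tracked() is destroyed straight after push_back returns, so the copy was made only to throw the original away. For a type that owns a large buffer, that copy is the cost move semantics try to remove.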

Move Semantics

Move semantics aim to avoid copying data out of temporary objects by instead taking over (stealing) the memory that the temporary already owns. This behaviour is implemented through a move constructor and a move assignment operator, both of which take an rvalue reference parameter.

MyClass(MyClass&& other)
{
    _buffer = other._buffer;
    _length = other._length;

    other._buffer = nullptr;
    other._length = 0;
}

In the above move constructor the following things occur. The parameter is an rvalue reference, so the constructor only accepts rvalues such as temporary objects. Given that _buffer is a raw pointer to an array, our _buffer pointer is simply assigned to point at the array owned by the temporary object, and _length is set to other._length.

Finally, other is reset to an empty state. Assigning nullptr to other._buffer ensures that our temporary object no longer points to this data. This is important because the temporary object is destroyed shortly after the move constructor returns. Like any other object, its destructor will then be called, and that destructor would free all the memory the object still owns, i.e. the _buffer array we have just taken over.

The result of this move constructor is that we have constructed our new object without having to copy any data; we have essentially just stolen the array from the temporary object instead. For reference, the code below shows how a move assignment operator would look.

MyClass& MyClass::operator=(MyClass&& other)
{
    if (&other != this) // guard against self-assignment
    {
        delete[] _buffer; // release the array we currently own
        _buffer = other._buffer;
        _length = other._length;

        other._buffer = nullptr;
        other._length = 0;
    }
    return *this;
}
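Putting the two operations together, a minimal sketch of the whole class might look like the following. The constructor taking a length, the destructor, the data() and length() accessors, and the three check functions are assumptions added so the example is self-contained; the article only shows the two move operations. The sketch uses std::move, which is introduced in the next section, to turn a named object into an rvalue.

```cpp
#include <cstddef>
#include <utility>

class MyClass
{
public:
    explicit MyClass(std::size_t length)
        : _buffer(new int[length]()), _length(length) {}

    ~MyClass() { delete[] _buffer; }  // deleting nullptr is a harmless no-op

    // Move constructor: take over other's array instead of copying it.
    MyClass(MyClass&& other) noexcept
        : _buffer(other._buffer), _length(other._length)
    {
        other._buffer = nullptr;  // other's destructor must not free our array
        other._length = 0;
    }

    // Move assignment: free our old array, then take over other's.
    MyClass& operator=(MyClass&& other) noexcept
    {
        if (&other != this)
        {
            delete[] _buffer;
            _buffer = other._buffer;
            _length = other._length;
            other._buffer = nullptr;
            other._length = 0;
        }
        return *this;
    }

    // Copying is disabled to keep the sketch short.
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;

    const int* data() const { return _buffer; }
    std::size_t length() const { return _length; }

private:
    int* _buffer;
    std::size_t _length;
};

// True if move construction handed over the same array without reallocating.
bool moveConstructionStealsBuffer()
{
    MyClass source(4);
    const int* original = source.data();
    MyClass target(std::move(source));
    return target.data() == original && target.length() == 4
        && source.data() == nullptr && source.length() == 0;
}

// True if move assignment handed over the same array and emptied the source.
bool moveAssignmentStealsBuffer()
{
    MyClass source(4);
    MyClass target(2);
    const int* original = source.data();
    target = std::move(source);
    return target.data() == original && target.length() == 4
        && source.data() == nullptr && source.length() == 0;
}

// True if assigning an object to itself leaves it untouched.
bool selfMoveAssignmentIsSafe()
{
    MyClass value(3);
    const int* original = value.data();
    MyClass& alias = value;
    value = std::move(alias);
    return value.data() == original && value.length() == 3;
}
```

Comparing the pointer before and after the move is what shows that no array was copied: the target ends up holding the very same allocation the source started with.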

A move constructor and move assignment operator can only be passed rvalues. Whenever an lvalue is used, the copy constructor or copy assignment operator is called instead, which means the data is copied to the new instance of the object. If we want to treat an lvalue like an rvalue so that we can make use of move semantics, we need to cast the lvalue to an rvalue.

std::move

In addition to the things already discussed, C++ 11 introduces a function, std::move (declared in the <utility> header), that casts an lvalue to an rvalue. The function does not itself move any data; that is done by the move constructor. By converting the lvalue to an rvalue, it simply allows the move constructor to be called.

std::string hello = "Hello World"; // lvalue

std::string copied(hello); // hello is an lvalue, so the copy constructor is called

std::string moved(std::move(hello)); // hello is cast to an rvalue, so the move constructor is called

Caution must be taken when using std::move. When we pass a temporary object to a move constructor, we know that the temporary is going to cease to exist soon after the constructor has finished executing. An lvalue’s lifetime might not be so short, and by using std::move and essentially treating it like a temporary object we could end up emptying an object whose data is still used somewhere else in the program.

Therefore, it is wise to only use std::move on lvalues that we know aren’t going to get used anywhere else in the code, such as if they are local to the function in which a move constructor is being called.
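The sketch below shows one cautious way to work with a moved-from lvalue. The function name moveDemo and the variable names are hypothetical. It relies on the fact that a moved-from std::string is left in a valid but unspecified state, so the example never reads the old contents and assigns a fresh value before using the variable again.

```cpp
#include <string>
#include <utility>

std::string moveDemo()
{
    std::string hello = "Hello World";

    std::string copied(hello);            // hello is an lvalue: copy constructor
    std::string moved(std::move(hello));  // cast to an rvalue: move constructor

    // hello is still a valid object, but its contents are now unspecified.
    // Assigning a new value makes it safe to read again.
    hello = "reused";

    return copied + "|" + moved + "|" + hello;
}
```

Note that std::move on its own changed nothing; it was the move constructor of std::string, selected because of the cast, that took the contents of hello.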

Rvalue References are Lvalues

We now know that if we want our move constructor or move assignment operator to be called, we must ensure we are passing an rvalue or first casting an lvalue to an rvalue with std::move. It might not be immediately obvious, but we can still end up making a copy if we are not careful.

MyClass(MyClass&& other)
{
    _buffer = other._buffer;
    _length = other._length;
    _name   = other._name;

    other._buffer = nullptr;
    other._length = 0;
    other._name = "";
}

In the above code we have modified the class MyClass to store a string object and updated our move constructor to accommodate this change. At first it might seem that when we execute _name = other._name we are moving the string, when in fact we are calling its copy assignment operator and copying it.

The copy happens because, although other is declared as an rvalue reference, the expression other (and therefore other._name) is in fact an lvalue. This makes sense given the definition of an lvalue: a value that has an identifiable location in memory, which you can take the address of. Because other._name is an lvalue, the string’s copy operation is selected. To ensure the move operation is selected instead we must use std::move. This is safe because the original object being passed in should be a temporary object or an lvalue we know has a limited lifetime.

MyClass(MyClass&& other)
{
    _buffer = other._buffer;
    _length = other._length;

    // Non-primitive members are copied unless we first cast them to an rvalue.
    _name = std::move(other._name);

    other._buffer = nullptr;
    other._length = 0;
    other._name = "";
}
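The same effect can be observed in isolation with a pair of overloads. In this sketch the sink overloads and the passNamed and passMoved functions are hypothetical names; they only report which overload was selected.

```cpp
#include <string>
#include <utility>

// Hypothetical overloads that report which one overload resolution picked.
std::string sink(const std::string&) { return "copy"; }
std::string sink(std::string&&)      { return "move"; }

// name is declared as an rvalue reference, but the expression `name`
// is an lvalue, so the const std::string& overload is chosen.
std::string passNamed(std::string&& name) { return sink(name); }

// std::move casts name back to an rvalue, so the && overload is chosen.
std::string passMoved(std::string&& name) { return sink(std::move(name)); }
```

Both functions must be called with an rvalue, for example passNamed(std::string("a")), yet only passMoved passes that rvalue-ness on to sink.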

Summary

Hopefully you now have a good understanding of rvalue references, move semantics, and how move semantics are useful for improving the performance of C++ applications by avoiding the need to copy data when temporary objects are involved.

Fortunately for us, move semantics are supported throughout modern C++. The standard library has been updated so that types which own resources, such as std::string and std::vector, have move constructors and move assignment operators, and its containers can move user-defined objects that provide their own.

In addition, a move constructor and move assignment operator will be implicitly generated for a class unless the user declares a copy constructor, copy assignment operator, destructor, or the other move operation. So as long as we are not handling resources with raw pointers, or requiring deep copies of objects that rely on a custom copy constructor, we can happily reap the benefits of move semantics without making changes to our code.
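The rule about implicit generation can be checked with a small experiment. In this sketch the Probe, Implicit and WithDestructor types and the movingUsesMemberMove function are hypothetical names; Probe simply counts whether it was copied or moved when the enclosing object is constructed from an rvalue.

```cpp
#include <utility>

// Hypothetical member type that records whether it was copied or moved.
struct Probe
{
    static int copies;
    static int moves;
    Probe() {}
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// No special members declared: the compiler generates a move constructor.
struct Implicit
{
    Probe probe;
};

// A user-declared destructor suppresses the implicit move constructor,
// so "moving" this type falls back to the copy constructor.
struct WithDestructor
{
    Probe probe;
    ~WithDestructor() {}
};

// True if constructing a T from an rvalue moved (rather than copied) its member.
template <typename T>
bool movingUsesMemberMove()
{
    Probe::copies = 0;
    Probe::moves = 0;
    T source;
    T target(std::move(source));
    (void)target;  // silence unused-variable warnings
    return Probe::moves == 1 && Probe::copies == 0;
}
```

Calling movingUsesMemberMove<Implicit>() should report a move, while movingUsesMemberMove<WithDestructor>() should report a copy, even though both calls use std::move. This is why adding an empty destructor to a class can quietly disable move semantics for it.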

 
