COMS W4995 C++ Deep Dive for C Programmers

Move Semantics

In this chapter, we’re going to keep building out our IntArray implementation from the previous chapter. Recall that we deleted the copy constructor and copy assignment in the class definition:

class IntArray {
public:
    ...
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;
    ...
};

As a result, any code that attempts to copy an IntArray will fail to compile. That’s reasonable because we don’t want to wastefully copy an IntArray, but we’d still like to be able to pass it around. We’re now going to implement move operations for IntArray to facilitate that.

Motivation

Consider intarray2.cpp, which redefines IntArray with all of the improvements we discussed at the end of the last chapter. We also changed the main() function to call createIntArray() as follows:

IntArray createIntArray() {
    IntArray tmp;
    for (int i = 0; i < 20; i++) {
        tmp.push_back(i);
        std::cout << tmp << std::endl;
    }
    return tmp;
}

int main() {
    using namespace std;

    IntArray ia { createIntArray() };

    cout << "ia: " << ia << endl;
}

The createIntArray() function creates an IntArray object tmp on its stack and invokes push_back() 20 times. It then returns tmp by value. At this point, we know that this will cause a copy construction of the stack variable to create a temporary return value (in the absence of copy elision). Back in main(), we copy construct ia out of the return value of createIntArray() (again, in the absence of copy elision). Since we deleted copy for IntArray, we’d see the following error message if we tried to compile it:

g++ -g -Wall -std=c++14 -fno-elide-constructors   -c -o intarray2.o intarray2.cpp
intarray2.cpp: In function ‘IntArray createIntArray()’:
intarray2.cpp: error: use of deleted function ‘IntArray::IntArray(const IntArray&)’
      |     return tmp;
      |            ^~~
intarray2.cpp: note: declared here
      |     IntArray(const IntArray&) = delete;
      |     ^~~~~~~~
intarray2.cpp: In function ‘int main()’:
intarray2.cpp: error: use of deleted function ‘IntArray::IntArray(const IntArray&)’
      |     IntArray ia { createIntArray() };
      |                                    ^
intarray2.cpp: note: declared here
      |     IntArray(const IntArray&) = delete;
      |     ^~~~~~~~

Note that we’re compiling with -std=c++14 -fno-elide-constructors so that the compiler doesn’t perform copy elision. As expected, we see it complain about the two copies that we explained above. By the way, under C++14 the code would still fail to compile even if we allowed copy elision. Elision is only an optimization there, so the copy constructor must still be usable: even though its copy construction is elided, ia is still semantically being copied from the return value of createIntArray().

Last chapter, we explained that avoiding copy here is reasonable because IntArray could grow to be large and copying it would be expensive. A common pattern in C and old C++ to avoid expensive copies is to define createIntArray() with an out-parameter as follows:

void createIntArray(IntArray* ia_ptr) {
    for (int i = 0; i < 20; i++) {
        ia_ptr->push_back(i);
        std::cout << *ia_ptr << std::endl;
    }
}

int main() {
    using namespace std;

    IntArray ia;
    createIntArray(&ia);

    cout << "ia: " << ia << endl;
}

While this coding style does the job, it’d be nice to be able to return a large object without incurring a copy, as we can in other high-level languages such as Java and Python.

Move Constructor

C++11 introduced move semantics to avoid expensive copies. We know that tmp is destroyed once createIntArray() returns; instead of making a new copy of the underlying array and deleting the old one, why don’t we just move the pointer to the underlying array from the old object to the new object?

The IntArray move constructor, shown below, simply steals the underlying array from tmp to construct a new IntArray object:

IntArray(IntArray&& tmp) : sz{tmp.sz}, cap{tmp.cap}, a{tmp.a} {
    tmp.sz = tmp.cap = 0;
    tmp.a = nullptr;
    std::cout << "move ctor" << std::endl;
}

We’ll explain what the syntax IntArray&& means in the next section; let’s just focus on the high-level picture here.

First, we initialize the new IntArray by copying all the fields from the old IntArray that we’re stealing from, tmp. We use the member initializer list syntax we introduced last chapter to do that. Second, we transfer ownership of the underlying array to the new IntArray by zeroing out tmp’s size and capacity and setting its pointer to nullptr. Note that we aren’t destructing tmp here. Its destructor will be called at some point after the move constructor returns. We need to make sure we leave tmp in a destructible state. Here, we set tmp.a = nullptr since it’s safe to invoke delete[] on nullptr.
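To see the steal-and-reset pattern in isolation, here is a minimal sketch. It assumes a hypothetical Buffer class standing in for IntArray (the name, its members, and the helper move_ctor_steals_array() are ours, not part of intarray2.cpp), and it uses a static_cast to Buffer&& only to force the move constructor to be selected:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for IntArray: owns a heap array of ints.
class Buffer {
public:
    explicit Buffer(std::size_t n) : sz{n}, a{new int[n]()} {}
    ~Buffer() { delete[] a; }   // delete[] on nullptr is a no-op

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Move constructor: copy the fields, then reset the source.
    Buffer(Buffer&& tmp) : sz{tmp.sz}, a{tmp.a} {
        tmp.sz = 0;
        tmp.a = nullptr;
    }

    std::size_t size() const { return sz; }
    const int* data() const { return a; }

private:
    std::size_t sz;
    int* a;
};

// Returns true if moving transfers the array and empties the source.
bool move_ctor_steals_array() {
    Buffer src{4};
    const int* original = src.data();
    assert(original != nullptr);

    Buffer dst{static_cast<Buffer&&>(src)};  // treat src as an r-value

    return dst.data() == original && dst.size() == 4
        && src.data() == nullptr && src.size() == 0;
}
```

No array is allocated or copied during the move: dst ends up holding the very pointer that src started with, and src’s destructor later runs delete[] on nullptr, which does nothing.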

[Figure: 06-int-array-move-ctor]

With our brand new move constructor installed, intarray2 compiles and runs!

$ ./intarray2
0 (cap=1)
0 1 (cap=2)
0 1 2 (cap=4)
0 1 2 3 (cap=4)
0 1 2 3 4 (cap=8)
0 1 2 3 4 5 (cap=8)
0 1 2 3 4 5 6 (cap=8)
0 1 2 3 4 5 6 7 (cap=8)
0 1 2 3 4 5 6 7 8 (cap=16)
0 1 2 3 4 5 6 7 8 9 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)
move ctor
move ctor
ia: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)

We see that the move constructor is invoked twice. First, the stack variable tmp inside of createIntArray() is moved into a returned temporary object. Second, that temporary object is moved into ia. Note that we’re still compiling with -std=c++14 -fno-elide-constructors because the compiler would elide both of the move constructions otherwise.

L-value vs. R-value

The signature of the IntArray move constructor takes tmp as an r-value reference:

IntArray(IntArray&& tmp);

To better understand what that means and how it’s different from an l-value reference (IntArray&), let’s take a look at 06/rvalue-test.cpp:

struct X {
    X() : d{100} {}

    double d;
};

void f1(X& t)       { t.d *= 2;  cout << t.d << endl; }

void f2(const X& t) { cout << "can't change t" << endl; }

void f3(X&& t)      { t.d *= 3;  cout << t.d << endl; }

int main() {
    X x;           // x is an lvalue
                  
    f1(x);         // passing an lvalue to X&       --> ok
    f2(x);         // passing an lvalue to const X& --> ok
    // f3(x);      // passing an lvalue to X&&      --> not ok

    // f1( X{} );  // passing an rvalue to X&       --> not ok
    f2( X{} );     // passing an rvalue to const X& --> ok
    f3( X{} );     // passing an rvalue to X&&      --> ok
}

The invocations f1(x) and f2(x) are both valid. The variable x can be bound to both X& and const X&. That’s nothing new.

If we try to compile f3(x), however, we get a compilation error:

rvalue-test.cpp: error: cannot bind rvalue reference of type ‘X&&’ to lvalue of type ‘X’
      |     f3(x);      // passing an lvalue to X&&      --> not ok
      |        ^
rvalue-test.cpp: note:   initializing argument 1 of ‘void f3(X&&)’
      | void f3(X&& t)      { t.d *= 3;  cout << t.d << endl; }
      |         ~~~~^

It says that we can’t bind an r-value reference to x, which is an l-value.

Generally speaking, we can think of r-values as values that can only appear on the right-hand side of an assignment expression. In something like a = 5, the constant integer 5 is an r-value. l-values, on the other hand, can appear on the left-hand side of an assignment expression, i.e., they are something that you can assign to. In that example, the variable a is an l-value. While l-values can appear on the right-hand side of an assignment expression, like in b = a, r-values cannot appear on the left-hand side of an assignment expression, like in 5 = a. We’ll study l-values and r-values in greater detail in several chapters from now.
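One way to check which category an expression falls into is to let overload resolution decide. The sketch below is only an illustration; the functions kind() and demo_value_categories() are hypothetical names that do not appear in rvalue-test.cpp:

```cpp
#include <cassert>
#include <string>

// Overload resolution picks the first for l-values and the second for r-values.
std::string kind(int&)  { return "l-value"; }
std::string kind(int&&) { return "r-value"; }

std::string demo_value_categories() {
    int a = 5;   // a is an l-value; the literal 5 is an r-value
    int b = a;   // an l-value may appear on the right-hand side
    // 5 = a;    // would not compile: an r-value cannot be assigned to
    assert(b == 5);

    // a is a named variable; 5 and a + b are unnamed values.
    return kind(a) + ", " + kind(5) + ", " + kind(a + b);
}
```

Calling demo_value_categories() yields "l-value, r-value, r-value": the named variable binds to int&, while the literal and the result of a + b can only bind to int&&.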

Aside from constants, another good example of an r-value is a temporary unnamed object. Consider the following code snippet:

MyString s {"hi"};
s = MyString{"hey"};

The MyString{"hey"} is an explicit call to the MyString constructor that yields a temporary unnamed object, which in turn gets passed into the copy assignment operator for s.

An r-value reference is a reference type that can only bind to an r-value. In rvalue-test, f3(x) fails to compile because it takes an r-value reference X&& but we passed it the variable x, which is an l-value.

The last three lines of main() call the three functions again, but with a temporary unnamed object X{} instead of the variable x. Let’s see how the functions behave when we pass an r-value instead of an l-value.

Invoking f1( X{} ) will not compile:

rvalue-test.cpp: In function ‘int main()’:
rvalue-test.cpp: error: cannot bind non-const lvalue reference of type ‘X&’ to an rvalue of type ‘X’
      |     f1( X{} );  // passing an rvalue to X&       --> not ok
      |         ^~~
rvalue-test.cpp: note:   initializing argument 1 of ‘void f1(X&)’
      | void f1(X& t)       { t.d *= 2;  cout << t.d << endl; }
      |     

This makes sense because f1() takes a non-const l-value reference and we’re trying to pass an r-value. As we’ve seen, an r-value is a constant or a temporary object. Binding a non-const l-value reference to a constant clearly makes no sense because such references are meant to mutate the objects they refer to. Binding one to a temporary object with the intention to modify it is likely futile because the object will soon go away. Even worse, this may indicate a flaw in the programming logic. For these reasons, C++ does not allow an r-value to be bound to a non-const l-value reference.

C++ does allow, however, binding a const l-value reference to an r-value, as shown by the invocation f2( X{} ). The const reference makes it explicit that there is no intention to modify the object. This is also why the assignment operator works for the MyString example above; MyString{"hey"} is an r-value, and MyString::operator=() takes it as a const MyString&.

The invocation f3( X{} ) binds an r-value reference to an r-value. Note that f3() is able to modify the object using an r-value reference. In fact, r-value references were introduced so that you could modify r-values, which is necessary to implement move semantics.

R-value references complete our discussion of the IntArray move constructor:

IntArray(IntArray&& tmp);

The move constructor takes an r-value reference to the old IntArray so that it can steal the underlying array and leave the old IntArray in a destructible state.

Move Assignment

Going back to our intarray2 program, let’s extend the main() function to declare an empty IntArray ia2 and assign to it the return value of createIntArray():

int main() {
    using namespace std;

    IntArray ia { createIntArray() };
    cout << "ia: " << ia << endl;

    IntArray ia2;
    ia2 = createIntArray();
    cout << "ia2: " << ia2 << endl;
}

The code fails to compile because it looks like we are trying to invoke the deleted copy assignment operator:

intarray2.cpp: In function ‘int main()’:
intarray2.cpp: error: use of deleted function ‘IntArray& IntArray::operator=(const IntArray&)’
      |     ia2 = createIntArray();
      |                          ^
intarray2.cpp: note: declared here
      |     IntArray& operator=(const IntArray&) = delete;
      |               ^~~~~~~~

Recall that the invocation createIntArray() returns an r-value IntArray object. We can enable this kind of assignment by implementing the move assignment operator, shown below:

IntArray& operator=(IntArray&& tmp) {
    if (this != &tmp) {
        delete[] a;

        sz = tmp.sz;
        cap = tmp.cap;
        a = tmp.a;

        tmp.sz = tmp.cap = 0;
        tmp.a = nullptr;
    }
    std::cout << "move assignment" << std::endl;
    return *this;
}

We first perform a self-assignment check like we’ve seen in MyString::operator=(). If that passes, we proceed to delete the underlying array of the left-hand side object, this->a, and steal the underlying array of the right-hand side object, tmp.a. We then leave tmp in a destructible state by zeroing out its members and setting tmp.a to nullptr, like we did for the move constructor.
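The following sketch exercises those steps on their own, including the self-assignment case. It assumes a hypothetical Buffer class in place of IntArray (Buffer, its members, and move_assign_steals_array() are illustrative names), and uses a static_cast to Buffer&& to produce an r-value from a named object:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for IntArray with only the members needed here.
class Buffer {
public:
    explicit Buffer(std::size_t n) : sz{n}, a{new int[n]()} {}
    ~Buffer() { delete[] a; }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    Buffer& operator=(Buffer&& tmp) {
        if (this != &tmp) {      // self-assignment check
            delete[] a;          // release our own array first
            sz = tmp.sz;         // steal the source's fields
            a = tmp.a;
            tmp.sz = 0;          // leave the source destructible
            tmp.a = nullptr;
        }
        return *this;
    }

    std::size_t size() const { return sz; }
    const int* data() const { return a; }

private:
    std::size_t sz;
    int* a;
};

// Returns true if move assignment steals the array and survives a self-move.
bool move_assign_steals_array() {
    Buffer src{8};
    Buffer dst{2};
    const int* original = src.data();
    assert(original != nullptr);

    dst = static_cast<Buffer&&>(src);
    bool stolen = dst.data() == original && dst.size() == 8
               && src.data() == nullptr && src.size() == 0;

    Buffer& alias = dst;
    dst = static_cast<Buffer&&>(alias);  // self-move: the check skips the body
    return stolen && dst.data() == original && dst.size() == 8;
}
```

Without the this != &tmp check, the self-move at the end would delete dst’s array and then reset dst’s own fields, silently discarding the data.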

The move assignment performed in the code above looks like this in memory:

[Figure: 06-int-array-move-asn]

We see the following output when we run intarray2:

...
0 (cap=1)
0 1 (cap=2)
0 1 2 (cap=4)
0 1 2 3 (cap=4)
0 1 2 3 4 (cap=8)
0 1 2 3 4 5 (cap=8)
0 1 2 3 4 5 6 (cap=8)
0 1 2 3 4 5 6 7 (cap=8)
0 1 2 3 4 5 6 7 8 (cap=16)
0 1 2 3 4 5 6 7 8 9 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 (cap=16)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 (cap=32)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)
move ctor
move assignment
ia2: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)

The “move ctor” is from constructing the temporary return object out of tmp. The “move assignment” is from the assignment of that temporary object to ia2.

Let’s say we now wanted to move assign ia into ia2, like this:

int main() {
    using namespace std;

    IntArray ia { createIntArray() };
    cout << "ia: " << ia << endl;

    IntArray ia2;

    ia2 = ia;

    cout << "ia2: " << ia2 << endl;
    cout << "ia: " << ia << endl;
}

Even with the move assignment operator installed, the code fails to compile. It looks like we’re trying to invoke the deleted copy assignment again. The difference between this assignment and the previous assignment is the right-hand side of the assignment operator. In ia2 = createIntArray(), the right-hand side is an r-value (the temporary object returned by createIntArray()). In ia2 = ia, the right-hand side is an l-value (the variable ia). It makes sense that the statement ia2 = ia doesn’t invoke the move assignment operator; we can’t bind an r-value reference to an l-value.
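The overload selection described here can be observed directly with a type that defines both assignment operators. The sketch below uses a hypothetical Tracker struct (along with makeTracker() and demo_overload_selection(), none of which are part of intarray2.cpp) whose operators merely record which one ran:

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records which assignment operator ran last.
struct Tracker {
    std::string last = "none";

    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker(Tracker&&) = default;

    Tracker& operator=(const Tracker&) { last = "copy"; return *this; }
    Tracker& operator=(Tracker&&)      { last = "move"; return *this; }
};

Tracker makeTracker() { return Tracker{}; }

std::string demo_overload_selection() {
    Tracker src;
    Tracker dst;
    assert(dst.last == "none");

    dst = src;                 // right-hand side is an l-value
    std::string first = dst.last;

    dst = makeTracker();       // right-hand side is an r-value
    std::string second = dst.last;

    return first + " then " + second;
}
```

demo_overload_selection() returns "copy then move". In IntArray the copy assignment is deleted rather than absent, and a deleted function still takes part in overload resolution, which is why ia2 = ia selects it and then fails to compile.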

So how do we indicate to the compiler that we want to move the contents of an l-value? There are valid cases where you’d want to transfer the ownership of some underlying resources from one object to another.

We can just cast ia to an r-value reference, as shown below:

int main() {
    using namespace std;

    IntArray ia { createIntArray() };
    cout << "ia: " << ia << endl;

    IntArray ia2;

    // ia2 = ia;
    ia2 = (IntArray&&) ia;

    cout << "ia2: " << ia2 << endl;
    cout << "ia: " << ia << endl;
}

We now see the move assignment operator being invoked with ia, stealing its contents and leaving it in an empty state:

...
ia: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)
move assignment
ia2: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 (cap=32)
ia: (cap=0)

Generally speaking, C++ discourages C-style casts. To emphasize the intention of moving the underlying resources out of an object, C++ introduced std::move() to perform the cast for you:

ia2 = std::move(ia);

The new syntax is less error-prone and easier to read.
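As a small sketch of what std::move() does and does not do, the example below uses std::vector<int> in place of IntArray (an assumption made so the block is self-contained; demo_std_move() is a hypothetical helper). The static_assert checks that std::move() only produces an r-value reference; the actual transfer happens in the move assignment operator that the cast selects:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>   // std::move, std::declval
#include <vector>

// std::move does not move anything itself; it only yields an r-value reference.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::vector<int>&>())),
                 std::vector<int>&&>::value,
    "std::move applied to an l-value vector has type std::vector<int>&&");

std::size_t demo_std_move() {
    std::vector<int> ia{0, 1, 2, 3};
    std::vector<int> ia2;
    assert(ia2.empty());

    ia2 = std::move(ia);   // selects vector's move assignment operator
    // ia is now in a valid but unspecified state: reassign or destroy it,
    // but do not rely on its contents.
    return ia2.size();
}
```

Note that std::move() lives in the <utility> header. After the assignment, ia2 holds the four elements, so demo_std_move() returns 4.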

The Essential 6, Rule of 5, and Rule of 0

We now extend the Basic 4 into the Essential 6: constructor, destructor, copy constructor, copy assignment, move constructor, and move assignment! For any class we write, we must make sure that its essential 6 are correct.

In addition to the rules for compiler-generated basic 4 that we discussed a few chapters ago, here are a few more rules for move operations:
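Two rules that the C++ standard specifies for implicitly generated move operations can be checked at compile time with type traits. Declaring a move operation causes the implicit copy operations to be deleted, and declaring a destructor (or a copy operation) suppresses the implicit move operations, so r-values fall back to the copy operations. The sketch below uses two hypothetical structs, MoveOnly and HasDtor, which are not part of the chapter’s code:

```cpp
#include <type_traits>

// Hypothetical types used only to probe the compiler's rules.

// Declares a move constructor, so the implicit copy operations are deleted.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(MoveOnly&&) {}
};

// Declares a destructor, so no move operations are generated;
// the implicit copy operations still exist.
struct HasDtor {
    ~HasDtor() {}
};

static_assert(!std::is_copy_constructible<MoveOnly>::value,
              "copy constructor is implicitly deleted");
static_assert(!std::is_copy_assignable<MoveOnly>::value,
              "copy assignment is implicitly deleted");
static_assert(std::is_copy_constructible<HasDtor>::value,
              "copy constructor is still generated");
// An r-value HasDtor binds to const HasDtor&, so "moving" it actually copies.
static_assert(std::is_move_constructible<HasDtor>::value,
              "construction from an r-value falls back to the copy constructor");
```

The last assertion is the subtle one: a class without move operations can still be constructed from an r-value, but it silently copies instead of moving.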

Recall that the Rule of 3 states that if you have to implement a destructor, copy constructor, or copy assignment, you most likely need to implement all three.

We now extend this rule to include move constructor and assignment since all five of these special member functions are closely related. If you intend to declare any of them (including = default or = delete), you most likely need to declare all of them. This is known as the Rule of 5.
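As a rough sketch of what following the Rule of 5 looks like, here is a hypothetical Buffer class that declares all five special member functions. Unlike the chapter’s IntArray, which deletes its copy operations, this sketch implements them so that copy and move can be compared; the class and demo_rule_of_five() are illustrative assumptions:

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>
#include <utility>    // std::move

// Hypothetical resource-owning class that declares all five special members.
class Buffer {
public:
    explicit Buffer(std::size_t n) : sz{n}, a{new int[n]()} {}

    ~Buffer() { delete[] a; }                         // 1. destructor

    Buffer(const Buffer& other)                       // 2. copy constructor
        : sz{other.sz}, a{new int[other.sz]} {
        std::copy(other.a, other.a + other.sz, a);
    }

    Buffer& operator=(const Buffer& other) {          // 3. copy assignment
        if (this != &other) {
            int* fresh = new int[other.sz];
            std::copy(other.a, other.a + other.sz, fresh);
            delete[] a;
            a = fresh;
            sz = other.sz;
        }
        return *this;
    }

    Buffer(Buffer&& tmp) : sz{tmp.sz}, a{tmp.a} {     // 4. move constructor
        tmp.sz = 0;
        tmp.a = nullptr;
    }

    Buffer& operator=(Buffer&& tmp) {                 // 5. move assignment
        if (this != &tmp) {
            delete[] a;
            sz = tmp.sz;
            a = tmp.a;
            tmp.sz = 0;
            tmp.a = nullptr;
        }
        return *this;
    }

    std::size_t size() const { return sz; }
    int& operator[](std::size_t i) { return a[i]; }

private:
    std::size_t sz;
    int* a;
};

int demo_rule_of_five() {
    Buffer x{3};
    x[0] = 7;
    Buffer y{x};              // copy: y gets its own array
    y[0] = 9;                 // does not affect x
    Buffer z{std::move(y)};   // move: z takes over y's array
    assert(y.size() == 0);    // y was left empty by the move constructor
    return x[0] * 10 + z[0];
}
```

demo_rule_of_five() returns 79: x keeps its own 7 because the copy was deep, and z sees the 9 that was written through y before the move.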

There’s also the Rule of 0. If all of your class members already have their five special member functions defined correctly, and your class has no other special resource management, then there’s no need for your class to explicitly define any of the five special member functions. The compiler-generated special member functions for your class would just call the corresponding special member functions of your class members. There’s no additional work you have to do! Some C++ experts prefer declaring the five special member functions with = default instead of omitting the declarations in this case. This makes your intention to rely on the compiler-generated versions clear.
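A minimal sketch of the Rule of 0, assuming a hypothetical Roster struct whose members are standard library types that already manage their own memory (Roster and demo_rule_of_zero() are illustrative names):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>   // std::move
#include <vector>

// Hypothetical class following the Rule of 0: no special members declared.
struct Roster {
    std::string name;
    std::vector<int> scores;
};

std::size_t demo_rule_of_zero() {
    Roster a{"section 1", {90, 85, 77}};

    Roster b = a;               // compiler-generated copy: copies each member
    assert(b.scores.size() == 3);

    Roster c = std::move(a);    // compiler-generated move: moves each member
    // a is now valid but unspecified; we no longer read from it.
    return c.scores.size();
}
```

Roster never mentions new, delete, or any of the five special member functions, yet copying and moving both work because std::string and std::vector supply them. demo_rule_of_zero() returns 3.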


Last updated: 2025-09-14