COMS W4995 C++ for C Programmers

Basic-4 in C++

Member Functions

Member Function

C++ allows you to add member functions inside a struct like this:

#include <iostream>

struct Pt {
    double x;
    double y;

    void print() {
        std::cout << "(" << this->x << "," << y << ")" << std::endl;
    }
};

int main()
{
    struct Pt p1 = { 1.0, 2.0 };

    p1.print();
}

A member function can be invoked on a struct instance with the following syntax:

object.mem_func();

pointer_to_obj->mem_func();

When we say instance or object, it refers to something that takes up some bytes of memory, like the struct variable p1 above. A member function can refer to the other members of the object directly (like y above), or using the implicitly defined this pointer (like this->x above).

Now that structs can have member functions, a struct in C++ looks a lot like a class in Java. In fact, in C++ struct and class are pretty much the same thing. The only difference is that all members in a class have private access by default, whereas in a struct they have public access by default, as they did in C. C++ tries to preserve the existing C semantics where possible.

We replace struct with class. We’ll make all members public for now for convenience.

class Pt {
public:

    double x;
    double y;

    void print() {
        std::cout << "(" << this->x << "," << y << ")" << std::endl;
    }
};

int main()
{
    Pt p1 = { 1.0, 2.0 };
    ...
}

Also note that we dropped the keyword struct from the declaration of p1. Writing the type as struct Pt or class Pt is still valid, but no longer necessary in C++.

Member Function vs. Global Function

When Bjarne Stroustrup created C++, he called it “C with Classes”, and the original compiler called Cfront translated C++ code to equivalent C code.

There are no member functions in C. C only has global functions. How would such a translator translate a member function to a normal global function in C?

It can change the function name to avoid conflicts with other classes’ member functions, and add a parameter for the object on which the function should be invoked:

void Pt_print(struct Pt *this) {
    // change x to this->x
    ...
}

The member function invocation, p1.print(), would be translated to a regular function call like this:

Pt_print(&p1);

In fact, that’s essentially what a member function is in C++: a normal function that you are familiar with from C, with an implicitly defined this pointer to the object on which the function is being invoked.

Note that a member function like this will not behave polymorphically like Java methods. C++ offers the virtual keyword to make member functions polymorphic. We will explore this when we study inheritance.

Stack Allocation of Class Object

Process Memory Address Space (background material)

Under a 64-bit Linux operating system, every single process (i.e., a running instance of a program) gets a vast memory space. Each byte of that memory space is linearly addressed from 0 to a large number like 512G. An over-simplified illustration of a process’s memory space looks like this:

         +--------------------------------------------------+
         | operating system code & data                     |
    512G +--------------------------------------------------+
         | stack        (for automatic variables)           |
         +--------------------------------------------------+
         |        |                                         |
         |        |                                         |
         |        v     (stack grows down)                  |
         |                                                  |
         |                                                  |
         |                                                  |
         |        ^     (heap grows up)                     |
         |        |                                         |
         |        |                                         |
         +--------------------------------------------------+
         | heap         (for memory allocated by malloc())  |
         +--------------------------------------------------+
         | data section (for static variables)              |
         +--------------------------------------------------+
         | code section (for the program code)              |
       0 +--------------------------------------------------+

Most computers do not have 512G of physical memory (RAM), let alone enough memory to give 512G to each of the dozens or even hundreds of processes running at any given time. The operating system provides an illusion that each process owns the entire memory space, and one that is much bigger than what’s actually there. This is called virtual memory.

Constructor

We can add a constructor to initialize the data members:

class Pt {
public:

    double x;
    double y;

    Pt() {        // constructor is a member function
        x = 4.0;  // with the same name as the class
        y = 5.0;
    }

    ...
};

int main()
{
    Pt p1;  // object allocation on the stack
            // will automatically invoke its constructor
    ...
}

In C, allocating an object on the stack simply meant reserving the size of the object on the stack. In C, struct Pt p1; would have simply reserved 16 bytes on the stack. (Each double data member takes up 8 bytes.)

C++ added one more step to object construction. C++ will automatically invoke the class’s constructor. That is, the Cfront compiler would translate the following object declaration in C++,

Pt p1;

to the following sequence of C code:

struct Pt p1;
Pt_Pt(&p1);

C++ allows function overloading, which means you can define multiple functions with the same name as long as they take different sets of parameters. So we can have multiple constructors that take different parameters:

    Pt() {
        x = 4.0;
        y = 5.0;
    }

    Pt(double _x) {
        x = _x;
        y = 5.0;
    }

    Pt(double _x, double _y) {
        x = _x;
        y = _y;
    }

Or combine them into a single constructor using default arguments:

    Pt(double _x = 4.0, double _y = 5.0) {
        x = _x;
        y = _y;
    }

C++ allows several variations for object construction syntax:

   Pt p1(6,7);     // old C++ style

   Pt p1 = {6,7};  // C struct initialization syntax preserved

   Pt p1 {6,7};    // modern C++ style

   Pt p1 {6};      // x will be 6, y will be defaulted to 5

   Pt p1 {};       // explicitly pass no argument

   Pt p1;          // or just omit them altogether

   // Pt p1();     // this is no good because it is taken as a prototype

Destructor

C++ also lets you define destructors. A destructor is the opposite of a constructor: it is invoked automatically when the object goes away, for example when a stack object goes out of scope. We add a destructor to our Pt class:

class Pt {
public:

    double x;
    double y;

    Pt(double _x = 4.0, double _y = 5.0) {
        x = _x;
        y = _y;
        std::cout << "hi" << std::endl;
    }

    // ~class_name is the name of the destructor

    ~Pt() {
        std::cout << "bye" << std::endl;
    }

    ...
};

int main()
{
    using namespace std;

    cout << "*** p1: stack allocation" << endl;

    {
        Pt p1;

        p1.print(); 
        
    }

    cout << "That's all folks!" << endl;
}

We limited the scope of p1 using a pair of curly braces to show exactly where the constructor and the destructor get invoked. Here is the output:

*** p1: stack allocation
hi
(4,5)
bye
That's all folks!

Heap Allocation of Class Objects

Heap Allocation in C vs. C++

Recall that we allocated struct instances on the heap using malloc() in C:

struct Pt *p2 = malloc(sizeof(struct Pt));

...

free(p2);

That is, we called malloc() to reserve heap memory big enough for us to lay out our struct on it, and simply set a pointer of the right type to the first byte of that memory. malloc() was a generic function that simply allocates heap memory of the requested number of bytes, and it didn’t care how we were going to use it.

malloc() returns a void *, which was freely assignable to any typed pointer in C, but C++ changed the rule so that void * can no longer be assigned to a typed pointer without explicit casting. (The other way still works fine in C++ – i.e., any pointer can be assigned to a void *.) Other than that minor change, malloc() and free() still work in C++. We can try to allocate a Pt object on the heap like this:

class Pt {
public:

    double x;
    double y;

    Pt(double _x = 4.0, double _y = 5.0) {
        x = _x;
        y = _y;
        std::cout << "hi" << std::endl;
    }
    ~Pt() {
        std::cout << "bye" << std::endl;
    }
    ...
};

int main() {
    cout << "*** p2: heap allocation" << endl;

    Pt *p2 = (Pt *)malloc(sizeof(Pt));  // malloc() does not quite work
    p2->print();                        // for object construction in C++
    free(p2);

    cout << "That's all folks!" << endl;
}

It produces the following output:

*** p2: heap allocation
(0,0)
That's all folks!

As you can see, Pt’s constructor and destructor were not invoked. (The values printed are simply whatever happened to be in the uninitialized heap memory; they are not guaranteed to be zeros.) Recall that, for stack allocation of an object, C++ changed the rule to add a constructor call at allocation and a destructor call at deallocation. That was okay because when we write Pt p1;, we are explicitly constructing an object of type Pt. That’s not the case when we call malloc() and free() to allocate and deallocate an object on the heap. C++ cannot just change what malloc() and free() do. Even if it could, what would they do? malloc() has no idea you are going to use the memory for a Pt object!

That’s why C++ had to add a couple of new keywords, new and delete, for heap allocation and deallocation. If we replace the malloc and free calls above with the following,

    Pt *p2 = new Pt {6,7};
    p2->print();
    delete p2;

we will see the following output showing that the constructor and destructor got called:

*** p2: heap allocation
hi
(6,7)
bye
That's all folks!

Heap Allocation in Java

Compare C++ heap allocation syntax to that of Java:

    // C++

    Pt *p2 = new Pt(6, 7);  // p2 is a pointer
    p2->print();
    delete p2;

    // Java

    Pt p2 = new Pt(6, 7);  // p2 is an object reference, which is
    p2.print();            // just a pointer under the hood
    
    // There is no delete in Java; Java does garbage collection,
    // which means that the object will be automatically deallocated 
    // at some point after it becomes unreachable.

The similarity is not surprising since Java took C++ and made a simpler and safer language by removing many features that were deemed unnecessary for Java’s design goals.

Note that in C++, Pt p1; will allocate an actual object on the stack. But in Java, Pt p1; simply declares an object reference on the stack, which is just a pointer under the hood. In other words, Java’s Pt p1; is the same as Pt *p1; in C++.

If that’s the case, how does one create a class object on the stack in Java? The answer is that you cannot, at least not explicitly. There is no Java syntax to allocate a class object on the stack. All objects are created on the heap using new. (The Java Virtual Machine (JVM) may create short-lived objects on the stack as an optimization though.) All class variables in Java are object references, i.e., pointers. This is how Java was able to get rid of * from the language. If everything is a pointer, you don’t have to call it a pointer!

Heap Allocation of Arrays in C++

Here is how you can allocate an array of objects on the heap instead of one object:

    cout << "*** p3: heap allocation of array of objects" << endl;

    Pt *p3 = new Pt[5] { {6,7}, {8,9}, {10} };

    p3[0].print(); p3[1].print(); p3[2].print(); p3[3].print(); p3[4].print();
    
    delete[] p3;

    cout << "That's all folks!" << endl;

Here is the output:

*** p3: heap allocation of array of objects
hi
hi
hi
hi
hi
(6,7)
(8,9)
(10,5)
(4,5)
(4,5)
bye
bye
bye
bye
bye
That's all folks!

We see that the constructor and the destructor are being called on each element of the array as they are allocated and deallocated. We passed initial values for the first three elements and let the last two elements be initialized with default values.

delete vs. delete[]

Note that we deleted a single object with delete p2; but an array of objects with delete[] p3;. The brackets are indeed required for deleting an array of objects in C++. delete is for a single heap-allocated object and delete[] is for an array of heap-allocated objects. You cannot mix them. If you do, the behavior is undefined and your program is likely to crash.

Let’s explore this a little further. Comment out delete[] p3 and run the program under Valgrind to see how many bytes get leaked. We may expect to see 5 * 16 = 80 bytes leaked, but we will actually see 88 bytes leaked. The current versions of both g++ and clang++ allocate 8 more bytes on the heap for an array of heap-allocated objects, and store the length of the array there. The pointer returned by new is 8 bytes into the allocated region. When delete[] is called, it will go back 8 bytes, read the number to determine how many elements have been allocated, and call the destructor on each element. We can see this number by inserting the following code:

    Pt *p3 = new Pt[5] { {6,7}, {8,9}, {10} };

    ...
    
    // g++ stores the # of objects in the 8 bytes before p3
    int64_t *i = ((int64_t *)p3) - 1;
    cout << *i << endl;

    delete[] p3;

This number is not there when you allocate a single object using new. It is the responsibility of the programmer to make sure delete and delete[] don’t get mixed up.

Why is C++ designed this way? Why not store the number 1 in the single object case as well? This is certainly possible, and would have made a C++ programmer’s life a tiny bit easier because there would be no need for a separate delete[].

The reason is performance. C++ chooses performance over convenience in general, and definitely for a common case like single-object heap allocation. Single-object heap allocation should be made as fast as possible; it should not have to store and check a number in front of the pointer every time.

Passing by Value vs. Passing by Reference

Passing by Value

Let’s add a function transpose() and modify main() to test it. (We keep the existing definition of class Pt that has a constructor, a destructor, and the print() member function.)

void transpose(Pt p)
{
    double t = p.x;
    p.x = p.y;
    p.y = t;
    p.print();
}

int main() {
    cout << "*** p4: passing by value vs. passing by reference" << endl;

    Pt p4;
    p4.print();
    transpose(p4);
    p4.print();

    cout << "That's all folks!" << endl;
}

It produces the following output, with each line annotated with explanation:

*** p4: passing by value vs. passing by reference
hi                   // p4's constructor
(4,5)                // p4.print()
(5,4)                // p.print() showing that p.x and p.y have been swapped
bye                  // p's destructor
(4,5)                // p4.print() showing that p4 has NOT been transposed
That's all folks!
bye                  // p4's destructor

When you pass a class object by value, the function parameter is not the object from the calling function. It is a new object which is a copy of the original object. In the example above, p is a copy of p4, which gets transposed inside the function transpose(). But p is a local variable in transpose() – all function parameters are local to the function – so it will go out of scope at the end of the function, triggering the invocation of its destructor. When the execution returns to main() we see that transposing p had no effect on p4.

Note that we do not see p’s constructor being invoked. We see “bye” from p’s destructor, but there is no “hi”. Why is this? If p was a new local object that was put on the stack frame of transpose(), shouldn’t its constructor have been called? A constructor was indeed invoked, but it was not the constructor we wrote before. When an object is passed by value, C++ will call a special constructor called the “copy constructor”. Since we did not provide our own copy constructor, the compiler generated one that simply copies each member. We will write our own copy constructor shortly.

Passing by Reference in C

In C, all function arguments are passed by value. There is no such thing as passing by reference in C. In C, we had to simulate passing by reference using pointers like this:

void transpose(Pt *p)
{
    double t = p->x;
    p->x = p->y;
    p->y = t;
}

int main() {
    ...
    transpose(&p4);  // pass the address of p4 instead of p4 itself
    ...
}

Note that this is still passing by value – we are passing something else by value, the address of the object, instead of the object itself.

Passing by Reference in C++

In C++, we can actually pass an object by reference. All we have to do is to change the function parameter type from Pt to Pt&:

void transpose(Pt& p)
{
    double t = p.x;
    p.x = p.y;
    p.y = t;
    p.print();
}

int main() {
    cout << "*** p4: passing by value vs. passing by reference" << endl;

    Pt p4;
    p4.print();
    transpose(p4);
    p4.print();

    cout << "That's all folks!" << endl;
}

which will now produce the following output:

*** p4: passing by value vs. passing by reference
hi                 // p4's constructor
(4,5)              // p4.print()
(5,4)              // p.print() showing that p.x and p.y have been swapped
(5,4)              // p4.print() showing that p was actually referring to p4
That's all folks!
bye                // p4's destructor

In this case, p is not a separate object, but simply another name for the existing p4 object. We can think of p as an alias or synonym for p4. Such a variable is called a “reference” in C++. p is a variable of type Pt&, a reference to a Pt object. It is often read as “Pt reference” or “Pt ref”. A reference is similar to a pointer in that it refers to another object, but different in that you use a reference exactly as you would use the underlying object, whereas you have to dereference the pointer before you use it as the object that it points to.

Here are some examples that illustrate the difference:

    Pt p1 {100}, p2 {200};  // we create 2 Pt objects, p1 and p2

    Pt *p;  // we declare a pointer variable, uninitialized

    p = &p1;     // p points to p1
    p->print();  // this will print "(100,5)"

    p = &p2;     // p now points to p2
    p->print();  // this will print "(200,5)"

    Pt& r = p1;  // we create a reference to p1 named r

    // Pt& r;  // error: references must be initialized when declared

    r = p2;  // this does NOT change r to refer to p2
             // instead, it is equivalent to p1 = p2;
             // a reference is stuck with the object forever

    p1.print();  // will print "(200,5)"
    p2.print();  // will print "(200,5)"

Copy Constructor

We are now ready to write a copy constructor. We will change transpose() back to pass-by-value version, and add a copy constructor to class Pt:

class Pt {
public:
    ...

    Pt(Pt& orig) {   // we will improve the parameter type with "const" later
        x = orig.x;
        y = orig.y;
        std::cout << "copy" << std::endl;
    }

    ...
};

void transpose(Pt p)
{
    double t = p.x;
    p.x = p.y;
    p.y = t;
    p.print();
}

int main() {
    cout << "*** p4: passing by value vs. passing by reference" << endl;

    Pt p4;
    p4.print();
    transpose(p4);
    p4.print();

    cout << "That's all folks!" << endl;
}

The following output shows that our copy constructor is indeed getting invoked when p4 is passed by value to transpose() and p is copy-constructed on the stack of the transpose() function.

*** p4: passing by value vs. passing by reference
hi
(4,5)
copy
(5,4)
bye
(4,5)
That's all folks!
bye

Note that the copy constructor takes a reference to the original Pt as the parameter. This is indeed one of the reasons why the reference type had to be added to C++. Can you see why you cannot write a copy constructor if you didn’t have the reference type?

There is a small improvement we can make to our copy constructor: we change Pt& to const Pt&:

    Pt(const Pt& orig) {  // orig is a const reference, which indicates that 
        x = orig.x;       // the object being passed in will not be modified
        y = orig.y;
        std::cout << "copy" << std::endl;
    }

The difference is analogous to char *p and const char *q. You can change the underlying characters pointed to by p. For example:

    *p   = 'A';
    p[3] = 'B';

But attempting to do that through q will result in a compiler error.

Recall the venerable strcpy() function in C:

char *strcpy(char *target, const char *source);

The source string is declared as const char * because it is simply being read, but the target array is declared as char * because it is being written to.

By the same token, a copy constructor will simply read the object that it is copying from, so it should be passed in as a const reference.

const member function

Let’s change the declaration of p4 to a const Pt object:

class Pt {
public:
    ...
    void print() {
        std::cout << "(" << this->x << "," << y << ")" << std::endl;
    }
    ...
};

...

int main() {
    cout << "*** p4: passing by value vs. passing by reference" << endl;

    const Pt p4;  // we can declare a variable as "const" type
    p4.print();   // to indicate that it will not be modified
    ...
}

clang++ emits the following error message. Other compilers will produce similar messages as well.

basic4.cpp:48:5: error: 'this' argument to member function 'print' has type 'const Pt', but function is not marked const

Apparently, in order for a member function to be invoked on a const object, the member function must be marked const. How do we do that? Like this:

    void print() const {
        std::cout << "(" << this->x << "," << y << ")" << std::endl;
    }

Recall that a member function of class Pt will implicitly declare the following pointer that points to the object on which the member function is being invoked:

    Pt *this;

When you mark the member function const, the type of the this pointer changes to this:

    const Pt *this;

The compiler will refuse to let a regular pointer point to a const object. This was the case in C and the same holds in C++. This is why you can only invoke a const member function on a const object (or a const reference to an object). The same applies to functions that take regular references or pointers as parameters. You cannot pass a const object to a function that takes a regular reference/pointer as the parameter.

In general, once you have tacked on const-ness, there is no going back. You have to stay in the land of const-ness. This is why we must mark member functions as const whenever possible because the member function can then be invoked on any object. (Note that there is no problem binding a non-const object to a const reference or pointer – i.e., tacking on const-ness is always ok; you just can’t go back, at least not without explicit casting.)

Copy Constructor Revisited

Direct Invocation of Copy Constructor

Our previous discussion of copy constructor was motivated by passing objects by value. We can also invoke copy constructor directly like this:

int main() {
    const Pt p4;

    cout << "*** p5: direct invocation of copy ctor" << endl;

    {
        Pt p5 {p4};
        p5.print();
    }

    cout << "That's all folks!" << endl;
}

Here is the output (with annotation):

hi                                       // p4's constructor
*** p5: direct invocation of copy ctor
copy                                     // p5 being copy-constructed
(4,5)
bye                                      // p5's destructor
That's all folks!
bye                                      // p4's destructor

Shown below are some alternative syntaxes for direct copy construction:

    // All 3 lines below are equivalent to Pt p5 {p4};
    // i.e., they are all direct copy construction of p5 using p4
    
    Pt p5 = {p4};
    Pt p5(p4);
    Pt p5 = p4;

    // But the following is different:
    // 1) p5 is default-constructed (i.e., no-argument constructor is called)
    // 2) and then we assign the existing p4 onto the existing p5

    Pt p5;
    p5 = p4;

Returning Object by Value

Another case of copy construction is when an object is returned by value. Consider the following code:

class Pt {
public:
    // same as before: constructor, destructor, copy constructor, print()
    ...
};

Pt expand(Pt p)
{
    p.x *= 2;
    p.y *= 2;
    return p;
}

int main() {
    const Pt p4;

    cout << "*** p6: returning object by value" << endl;

    {
        Pt p6 { expand(p4) };
    }

    cout << "That's all folks!" << endl;
}

Here is the output (with annotations):

// we compiled the code with:
// g++ -fno-elide-constructors -std=c++14 

hi                                 // p4's constructor
*** p6: returning object by value
copy
copy
copy
bye
bye
bye
That's all folks!
bye                                // p4's destructor

We see three copy constructor calls:

  1. Passing p4 by value to expand(), copy-constructing p
  2. expand() function returning p by value, thereby creating a temporary unnamed object which we will call tmp
  3. Copy construction of p6 where tmp is being passed in as the parameter

Note that the unnamed temporary object tmp returned by expand() is a distinct object that is copy-constructed from p because p is a local variable in expand() and therefore not available when expand() has returned. tmp gets created when expand() returns and it exists during the construction of p6 to serve as the object from which p6 is copy-constructed.

The three destructor calls following the copy calls are:

  1. p going out of scope in expand()
  2. tmp going away when p6’s copy constructor returns
  3. p6 going out of scope

Copy Elision

If we recompile the same code above without the -fno-elide-constructors compiler flag, the output changes to this:

hi
*** p6: returning object by value
copy
copy
bye
bye
That's all folks!
bye

Now there are only two Pt objects being copy-constructed. p and p6 are named objects, so the two copy constructions must be for those objects. Removing the -fno-elide-constructors flag seems to have enabled the compiler to do away with creating the temporary object we referred to as tmp. The g++ manual page gives us a clear explanation on this:

-fno-elide-constructors

    The C++ standard allows an implementation to omit creating a
    temporary that is only used to initialize another object of
    the same type.  Specifying this option disables that
    optimization, and forces G++ to call the copy constructor in
    all cases. [ ... ]

    In C++17, the compiler is required to omit these temporaries [ ... ]

Omitting such temporaries, aka copy elision, became a requirement of the C++ language starting from C++17, so -fno-elide-constructors does not bring back tmp for us in C++17, which is why we also had to specify -std=c++14.

Summary: Three Cases of Copy Construction

  1. Passing class objects by value
  2. Direct invocation of copy constructor
  3. Returning class objects by value

Copy Assignment

The last piece of what we call the “basic 4” of C++ classes is the copy assignment operator. Before we write a copy assignment operator, let’s see how assignment works without it:

class Pt {
public:
    // same as before: constructor, destructor, copy constructor, print()
    ...
};

int main() {
    const Pt p4;

    cout << "*** p7: copy assignment" << endl;

    {
        Pt p7 {8,9};
        p7.print();
        p7 = p4;
        p7.print();
    }

    cout << "That's all folks!" << endl;
}

Here is the output (with annotations):

hi                       // p4's constructor
*** p7: copy assignment
hi                       // p7's constructor                 
(8,9)                    // p7.print() before the assignment
(4,5)                    // p7.print() after the assignment
bye                      // p7's destructor
That's all folks!
bye                      // p4's destructor

This is just plain old assignment. It’s not that different from the following:

    int x = 100; 
    int y = 200;
    x = y;  // x is now 200

In C, assigning one struct instance to another simply assigns all the members. The same holds true in C++, unless we redefine what assignment means for our class by providing a copy assignment operator, aka operator=(), as follows:

class Pt {
public:
    // same constructor, destructor, copy constructor, print()
    ...

    void operator=(const Pt& rhs) {  // we will improve the return type later
        x = rhs.x;
        y = rhs.y;
        std::cout << "op=()" << std::endl;
    }
    ...
};

int main() {
    // same main() as before
    ...
}

The output changes as shown below, indicating that our copy assignment operator was invoked:

hi
*** p7: copy assignment
hi
(8,9)
op=()  // printed by our copy assignment operator
(4,5)
bye
That's all folks!
bye

C++ lets you redefine most operators for your own classes. For example, if you were writing a matrix class, you could define operator+() so that you can write m1 + m2 instead of add_matrices(m1, m2).

Note that operator=() is a member function of class Pt. The implicit this pointer points to the object on the left-hand side of the assignment (p7 in our example). The right-hand side object (p4 in our example) gets passed in as the sole const Pt& parameter to the member function. This makes sense because the copy assignment operator is meant to change the state of the object on the left-hand side, whereas it only reads from the object on the right-hand side. In other words, there is a clear distinction between the left and the right. If we were to write an operator+() for class Pt, for example, making it a member function would not be a good choice because there should not be any distinction between the left and the right of +. We would make it a global function that takes two parameters – we will revisit this point again later.

In C, an assignment expression like x = y has a value, which is the value of the left-hand side after the assignment has taken place. This makes it possible to write x = y = z or equivalently, x = (y = z).

Such a statement would not be possible with the copy assignment operator we wrote above because we set void as the return type. We fix it below. We also changed main() to test the return value of the copy assignment. The output remains the same as before.

class Pt {
public:
    // same constructor, destructor, copy constructor, print()
    ...

    Pt& operator=(const Pt& rhs) {
        x = rhs.x;
        y = rhs.y;
        std::cout << "op=()" << std::endl;
        return *this;
    }
    ...
};

int main() {
    const Pt p4;

    cout << "*** p7: copy assignment" << endl;

    {
        Pt p7 {8,9};
        p7.print();
        (p7 = p4).print();  // same as (p7.operator=(p4)).print();
    }

    cout << "That's all folks!" << endl;
}

We changed the second print() call in main() to (p7 = p4).print(); in order to showcase that the expression p7 = p4 indeed evaluates to p7. Note that we return *this from the copy assignment operator. We are supposed to return a reference to the left-hand side object. The type of the this pointer is Pt *, so we have to dereference the pointer to get a Pt&.

Could we have made the return type of the copy assignment Pt rather than Pt& like this?

    Pt operator=(const Pt& rhs) {  // wrong return type
        x = rhs.x;
        y = rhs.y;
        std::cout << "op=()" << std::endl;
        return *this;
    }

The code will still work, but that’s not the right thing to do. Returning a Pt object by value will trigger a copy construction of the returned object. That is, the value of p7 = p4 will not be the p7 object itself, but a new object which is a copy of p7. The assignment expression must evaluate to the actual object on the left-hand side, not a copy.

Compiler-generated Basic 4

We studied how to write the following special member functions, which we call the Basic 4 of C++ classes:

  1. Constructor
  2. Destructor
  3. Copy constructor
  4. Copy assignment

For any class we write, we must make sure that its basic 4 are correct. (Actually we will learn two more essential member functions that we must consider when writing classes – move constructor and move assignment – but we will postpone that discussion until later.)

Now, making sure that the basic 4 are correct does not mean we have to write them ourselves. If we do not write them, the compiler will generate them for us. If the compiler-generated versions are all that we want, there is no reason to write them ourselves.

Here are the rules for compiler-generated basic 4:

  1. Constructor

    • If we write any constructor (that is, if we write a copy constructor, a constructor that takes one integer, a constructor that takes two char *, and so on), then the compiler will not generate a default constructor.

    • If we write no constructor, the compiler will generate a default constructor (i.e., a constructor that takes no argument).

    • A generated default constructor will not initialize any members of built-in types like int or char *, but will invoke the default constructors of any class members.

  2. Destructor

    • If we write no destructor, the compiler will generate one.

    • A generated destructor will do nothing for members of built-in types, but will invoke the destructors of any class members.

  3. Copy constructor

    • If we write no copy constructor, the compiler will generate one.

    • A generated copy constructor will perform member-wise copy of each data member of the class. For a built-in type member, it simply means copying the value. For a class member, it means copy-constructing the member object by invoking the member’s copy constructor.

  4. Copy assignment

    • If we write no copy assignment, the compiler will generate one.

    • A generated copy assignment will perform member-wise assignment of each data member. For built-in types, it simply means assigning the value. For a class member, it means assigning to the member by invoking the member’s copy assignment operator.


Last updated: 2025-02-02