COMS W4995 C++ Deep Dive for C Programmers

Inheritance

Fundamentals

Let’s start our study of inheritance in C++ by covering the fundamentals. The 08/inherit1 program defines two simple classes B and D. D will inherit, or derive, from B. In Java, we’d say that B is a superclass and D is a subclass. In C++, we say that B is a base class and D is a derived class. The class definitions are shown below:

class B {
public:
    uint64_t sum() const { return b; }

private:
    uint64_t b = 100;
};

class D : public B {
public:
    uint64_t sum() const { return B::sum() + d; }

private:
    uint64_t d = 200;
};

B and D are simple wrappers over their respective uint64_t members b and d. uint64_t is an unsigned 8-byte integer. For convenience, we’ve defined b and d with default member initializers. The default member initializers are used for members that are not in the constructor’s member initializer list.

In the main() function, we instantiate a D object and invoke its sum() function:

int main() {
    using namespace std;

    D x;
    w( x.sum() );
}

w() is a convenience macro that prints the given expression verbatim followed by its value. It is defined as follows:

#define  w(expr)  std::cout << #expr << ": " << (expr) << std::endl

The preprocessor # operator is known as the stringify operator; it converts the macro argument into a string literal, allowing us to print it out.

The output of the 08/inherit1 program so far is as follows:

x.sum(): 300

D’s implementation of sum() adds the return value of the base class’s sum() function and its own member d. B::sum() uses the scope resolution operator to invoke the base class’s sum() function. Since D derives from B, you may think that D could access the b member from the base class directly, like this:

uint64_t sum() const { return b + d; } // Compiler error

Doing so results in a compilation error. The b member is declared as private in the B class. That is, it can only be accessed by members (and friends) of B itself. If we wanted to allow derived classes of B to access b, we should have specified protected instead of private. Derived classes of B can access protected members of B, but protected members are still inaccessible to code outside the class hierarchy.
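As a minimal sketch of the protected alternative, the classes below are hypothetical variants of B and D (renamed Bp and Dp here so they do not clash with the originals). The demo() function is only there to show usage:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical variants of B and D, renamed to avoid clashing with the originals.
class Bp {
public:
    uint64_t sum() const { return b; }

protected:                 // visible to Bp and to classes derived from Bp
    uint64_t b = 100;
};

class Dp : public Bp {
public:
    uint64_t sum() const { return b + d; }  // OK: b is protected, not private

private:
    uint64_t d = 200;
};

// Code outside the hierarchy still cannot touch b:
// uint64_t peek(const Dp& x) { return x.b; }  // Compiler error

void demo() {
    Dp x;
    assert(x.sum() == 300);
}
```

The trade-off is that every derived class can now depend on b, so changing B’s representation later becomes harder. Calling B::sum() as in the original keeps that detail private.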

Single Inheritance Memory Layout

Let’s expand the main() function to see how the D object is laid out in memory:

int main() {
    using namespace std;

    D x;
    w( x.sum() );

    cout << "*** Object layout" << endl;
    w( sizeof(x) );
    uint64_t* p = (uint64_t*)&x;
    w( p[0] );
    w( p[1] );
}

We reinterpret the memory occupied by D as an array of uint64_ts by casting a pointer to D to a uint64_t*. C++ formalizes this C-style cast as reinterpret_cast, which is a cast that allows you to convert between types by reinterpreting the underlying bit pattern. Unlike the const_cast that we saw earlier, reinterpret_cast doesn’t offer any additional typechecking, but it makes your intent to treat the memory as another type more explicit. The C-style cast could’ve been written as follows:

uint64_t* p = reinterpret_cast<uint64_t*>(&x);

When we run the program, we see the following output:

x.sum(): 300
*** Object layout
sizeof(x): 16
p[0]: 100
p[1]: 200

The size of the D object is 16 bytes as it contains two uint64_ts: 100 from B and 200 from D. Its object layout looks like this:

08-obj-1
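The same observation can be made without a pointer cast. The sketch below copies the object’s bytes into an array with std::memcpy instead of reading a D through a uint64_t*. The helper name words_of is made up for this example. The layout it observes (base portion first, no padding) is what mainstream 64-bit compilers produce for these two classes; the C++ standard does not guarantee it.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

class B {
public:
    uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class D : public B {
public:
    uint64_t sum() const { return B::sum() + d; }
private:
    uint64_t d = 200;
};

// Assumption: two 8-byte members and no padding, as on typical 64-bit platforms.
static_assert(sizeof(D) == 2 * sizeof(uint64_t), "unexpected layout");

// Hypothetical helper: copy the raw bytes of a D into two 64-bit words.
std::array<uint64_t, 2> words_of(const D& x) {
    std::array<uint64_t, 2> words{};
    std::memcpy(words.data(), &x, sizeof(x));
    return words;
}

void demo() {
    D x;
    const std::array<uint64_t, 2> words = words_of(x);
    assert(words[0] == 100);  // B portion comes first (ABI assumption)
    assert(words[1] == 200);  // D's own member follows
}
```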

Static Binding vs. Dynamic Binding

Static Binding

The address of a D object can be assigned to a base class pointer B* as well as a derived class pointer D*. Let’s extend the main() function to compare the pointers and to invoke the sum() function through them:

int main() {
    using namespace std;

    D x;
    w( x.sum() );

    cout << "*** Object layout" << endl;
    w( sizeof(x) );
    uint64_t* p = reinterpret_cast<uint64_t*>(&x);
    w( p[0] );
    w( p[1] );

    cout << "*** Pointer/reference to B can bind to a D object" << endl;
    B *pb = &x;
    D *pd = &x;
    w( pb );
    w( pd );
    w( pb->sum() );
    w( pd->sum() );
}

When we run the program again, we see the following new output:

x.sum(): 300
*** Object layout
sizeof(x): 16
p[0]: 100
p[1]: 200
*** Pointer/reference to B can bind to a D object
pb: 0x7ffde8c5cdf0
pd: 0x7ffde8c5cdf0
pb->sum(): 100
pd->sum(): 300

pb and pd hold the same address. This should come as no surprise because we previously established that the base class portion is laid out at the beginning of D.

What may come as a surprise is that the invocation of sum() results in different values depending on whether you invoke it from the B pointer or the D pointer. You may have thought that they both should’ve invoked the same D::sum() function given that they refer to the same object, but that’s not the case here. The compiler chose which function to call based on the type of the pointer. Calling sum() through B* invokes B::sum() whereas calling it through D* invokes D::sum(). This is known as static binding and it is resolved at compile time.
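References follow the same rule as pointers. The sketch below repeats the non-virtual B and D and adds a made-up helper, sum_via_base_ref, whose parameter type alone decides which sum() is called:

```cpp
#include <cassert>
#include <cstdint>

class B {
public:
    uint64_t sum() const { return b; }             // not virtual
private:
    uint64_t b = 100;
};

class D : public B {
public:
    uint64_t sum() const { return B::sum() + d; }  // hides B::sum()
private:
    uint64_t d = 200;
};

// The parameter's static type is const B&, so r.sum() is bound to B::sum()
// at compile time, whatever object r actually refers to.
uint64_t sum_via_base_ref(const B& r) { return r.sum(); }

void demo() {
    D x;
    assert(x.sum() == 300);              // static type D -> D::sum()
    assert(sum_via_base_ref(x) == 100);  // static type B -> B::sum()
    assert(x.B::sum() == 100);           // explicit qualification also picks B::sum()
}
```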

Dynamic Binding

Had we written the inherit1 program in Java, we’d observe different behavior. If a Java subclass overrides a superclass’s method, the subclass’s version will always be invoked, regardless of whether you’re using a superclass or subclass reference to the subclass object. This is called dynamic binding, and it is resolved at run time. Dynamic binding enables polymorphism, where a class object accessed through a base pointer exhibits different behaviors depending on what derived type the object is.

C++ defaults to static binding as we saw before, but we can enable dynamic binding by marking a base class member function as virtual. We change the B class definition to mark sum() as a virtual function:

class B {
public:
    virtual uint64_t sum() const { return b; }

private:
    uint64_t b = 100;
};

class D : public B {
public:
    uint64_t sum() const override { return B::sum() + d; }

private:
    uint64_t d = 200;
};

We also marked D::sum() with override, which was optional. The compiler can already tell that you’re overriding the base class B::sum(). It’s good practice to specify override to catch errors like misspelling the name of the virtual function or trying to override a function that isn’t virtual.
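A mismatched signature is another mistake of this kind. In the sketch below (class names Oops and Good are made up, and B is reduced to a bare virtual function), dropping const silently declares a new function instead of overriding the base one:

```cpp
#include <cassert>
#include <cstdint>

class B {
public:
    virtual ~B() = default;
    virtual uint64_t sum() const { return 100; }
};

// Mistake: sum() is missing const, so this declares a new function that
// hides B::sum() instead of overriding it. Without override, it compiles.
class Oops : public B {
public:
    uint64_t sum() { return 300; }
    // uint64_t sum() override { return 300; }  // Compiler error: overrides nothing
};

class Good : public B {
public:
    uint64_t sum() const override { return 300; }
};

uint64_t sum_via_base(const B& r) { return r.sum(); }

void demo() {
    Oops o;
    Good g;
    assert(sum_via_base(o) == 100);  // B::sum() runs; Oops::sum() is not an override
    assert(sum_via_base(g) == 300);  // Good::sum() runs
}
```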

When we rerun the inherit1 program with these changes, we see that both invocations of sum() now dynamically bind to D::sum():

x.sum(): 300
*** Object layout
sizeof(x): 24
p[0]: 97478962171232
p[1]: 100
*** Pointer/reference to B can bind to a D object
pb: 0x7ffca0aba030
pd: 0x7ffca0aba030
pb->sum(): 300
pd->sum(): 300

It looks like the D object layout also changed, however. It’s 8 bytes larger!

Virtual Table

To enable dynamic binding, the compiler has inserted additional information into the object so that the correct version of the function can be invoked at run time.

Let’s first fix up our object layout printing logic to account for the new 8-byte value at the beginning of D. We’re going to print it as a memory address by reinterpreting it as a void*:

cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( reinterpret_cast<void*>(p[0]) );
w( p[1] );
w( p[2] );

Here’s the new output:

...
*** Object layout
sizeof(x): 24
reinterpret_cast<void*>(p[0]): 0x559d7b1ddd60
p[1]: 100
p[2]: 200
...

p[0] is known as the virtual table pointer, or vtable pointer, or just vptr. It points to the virtual table for the class, which is basically an array of function pointers for each virtual function in the class. In the case of D, its vtable will have an entry for D::sum(), as shown below:

08-vtable-1

Each class with virtual functions will have its own vtable. Every object instance of such classes will have a vptr at the beginning of the object memory layout pointing to the per-class vtable. Here’s what the memory layout looks like for a B object called y. Its vtable will have an entry for B::sum():

08-vtable-2

Using the vptr and vtable, the compiler will translate the member function invocation pb->sum() to something like pb->vptr[0](pb). The virtual member function sum() is invoked using the function pointer stored in the vtable. Recall our discussion about how a C++ to C translator would translate a member function to a global function that takes an extra this parameter. In order to invoke that global function, we have to pass pb.

Note that the indirection through the vtable ensures that B::sum() will be called if pb points to a B object and D::sum() will be called if pb points to a D object. This is how C++ implements dynamic binding.
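To make the mechanism concrete, here is a hand-written emulation of a vptr and vtable using plain structs and function pointers, roughly what a C programmer would write to get the same effect. All names (B_emul, D_emul, VTable, call_sum, and so on) are invented for this sketch; real compilers generate the equivalent automatically and their tables hold more than one slot of information.

```cpp
#include <cassert>
#include <cstdint>

struct B_emul;                        // forward declaration

struct VTable {                       // one table per "class"
    uint64_t (*sum)(const B_emul*);   // slot 0: sum()
};

struct B_emul {
    const VTable* vptr;   // first member, like the compiler-inserted vptr
    uint64_t b;
};

struct D_emul {
    B_emul base;          // base portion comes first
    uint64_t d;
};

uint64_t B_sum(const B_emul* self) { return self->b; }

uint64_t D_sum(const B_emul* self) {
    // self points at the base portion, which is the first member of a D_emul,
    // so the address can be converted back to the enclosing object.
    const D_emul* whole = reinterpret_cast<const D_emul*>(self);
    return B_sum(self) + whole->d;
}

const VTable B_vtable{ &B_sum };
const VTable D_vtable{ &D_sum };

B_emul make_B() { return B_emul{ &B_vtable, 100 }; }
D_emul make_D() { return D_emul{ B_emul{ &D_vtable, 100 }, 200 }; }

// The emulated equivalent of pb->sum(): look up the slot, pass pb as "this".
uint64_t call_sum(const B_emul* pb) { return pb->vptr->sum(pb); }

void demo() {
    B_emul y = make_B();
    D_emul x = make_D();
    assert(call_sum(&y) == 100);        // vptr -> B_vtable -> B_sum
    assert(call_sum(&x.base) == 300);   // vptr -> D_vtable -> D_sum
}
```

The caller in call_sum never names B_sum or D_sum. Which function runs depends only on the table that the object’s vptr was set to when the object was created.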

Let’s illustrate the vtable structure by following the vptr and invoking D::sum() manually. We’ve updated the memory layout diagram for D to highlight three pointers that we’re going to follow. We’ve marked each of them with the type that we’re going to cast it to:

08-vtable-cast

Starting from the right, the pointer marked with (1) refers to the vtable entry for the single virtual function sum(). Its type is uint64_t (*)(const D*). The vptr at the beginning of D is marked as (2). Since we’ll have to dereference vptr to get to (1), we’re going to cast vptr’s type to uint64_t (**)(const D*). By the same token, pb, which we marked as (3), will have to be cast to uint64_t (***)(const D*).

Putting it all together, we can manually call D::sum() like this:

cout << "*** Calling a virtual function manually" << endl;
cout << ((uint64_t (***)(const D*))pb)[0][0](&x) << endl;

Here is the output, as expected:

...
*** Calling a virtual function manually
300

Multiple Inheritance

Unlike in Java, a C++ class can derive from multiple base classes. Use of multiple inheritance is controversial, with many C++ experts discouraging it. Java, in fact, made the conscious decision to disallow multiple inheritance (but it allows use of multiple interfaces instead). Nonetheless, we should understand the implications that multiple inheritance has on the object memory layout and how it interacts with virtual functions.

Multiple inheritance with no virtual functions

Let’s start with a simple case of multiple inheritance where no classes have virtual functions (i.e., there are no vtables). The 08/inherit2 program defines two base classes B and C, and one derived class D that inherits from both. The main() function displays the object layout for D:

class B {
public:
    uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C {
public:
    uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

int main() {
    using namespace std;

    // 1.  Multiple inheritance object layout

    D x;
    w( x.sum() );

    cout << "*** Object layout" << endl;
    w( sizeof(x) );
    uint64_t* p = reinterpret_cast<uint64_t*>(&x);
    w( p[0] );
    w( p[1] );
    w( p[2] );
}

The program output is as follows:

x.sum(): 450
*** Object layout
sizeof(x): 24
p[0]: 100
p[1]: 150
p[2]: 200

The program output reveals that the D class object includes the B and C portions at the beginning in the order in which D inherits from them.

Let’s extend the main() function to study the behavior of accessing the D object through B*, C* and D*:

int main() {
    using namespace std;

    // 1.  Multiple inheritance object layout
    ...
    // 2.  Base pointers have different addresses

    cout << "*** Base pointers have different addresses" << endl;
    B* pb = &x;
    C* pc = &x;
    D* pd = &x;
    w( pb );
    w( pc );
    w( pd );
    w( pb->sum() );
    w( pc->sum() );
    w( pd->sum() );
}

The new program output is as follows:

x.sum(): 450
*** Object layout
sizeof(x): 24
p[0]: 100
p[1]: 150
p[2]: 200
*** Base pointers have different addresses
pb: 0x7ffed4c4a520
pc: 0x7ffed4c4a528
pd: 0x7ffed4c4a520
pb->sum(): 100
pc->sum(): 150
pd->sum(): 450

As in the single inheritance case, we observe static binding when invoking the non-virtual sum() member function; calling it through B*, C*, and D* invokes B::sum(), C::sum(), and D::sum() respectively.

What you may find surprising is the fact that the pointer pc has a different address than pb and pd, even though they were all assigned to the same object address, &x!

The following diagram shows the D object layout and illustrates the pointer adjustment observed above:

08-multi-1

It makes sense that C* pc was adjusted by the compiler to point to the C portion of D when it was initialized to &x. pc must point to a valid C object. pb and pd have the same address because the B portion is laid out at the beginning of D.

static_cast

C++ offers static_cast, which performs some additional type-checking compared to C-style casts.

We can use static_cast to downcast from a C* to a D*:

D* pd2 = static_cast<D*>(pc);

The cast is valid because the compiler verifies that D inherits from C. We previously saw that the compiler adjusts the memory address of a pointer when we upcast from D* to C*. Likewise, the compiler will adjust the memory address from pc to the beginning of the containing D object.

In the following example, we attempt a bogus cast from std::string* to D* using both static_cast and a C-style cast. The static_cast fails to compile because std::string and D are unrelated classes. The C-style cast compiles because it is treated like a reinterpret_cast.

std::string s("hi");
// pd2 = static_cast<D*>(&s); // Compiler error
pd2 = (D*)&s; // No compiler error

static_cast isn’t perfect, however, because it can only use information known at compile time. In the example below, we create a standalone C object and attempt to downcast a pointer to it to a D*:

C y;
pd2 = static_cast<D*>(&y);
// w( pd2->sum() ); // Unpredictable result

The code compiles because static_cast only knows that a C* could point to the C portion of a D object; it cannot tell that y is just a standalone C object, not a D object. Using pd2 as a D* here is undefined behavior.
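The pointer adjustment in the valid case can be checked directly. The sketch below uses stripped-down versions of B, C, and D with public members, and two made-up helpers, c_offset and round_trip_ok. The exact offset of the C portion is ABI-specific, so the code only checks that it is nonzero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct B { uint64_t b = 100; };
struct C { uint64_t c = 150; };
struct D : B, C { uint64_t d = 200; };

// Byte distance from the start of the D object to its C portion.
std::ptrdiff_t c_offset(const D& x) {
    const C* pc = &x;   // implicit upcast: the compiler adjusts the address
    return reinterpret_cast<const char*>(pc) -
           reinterpret_cast<const char*>(&x);
}

// Upcast to C*, then downcast back with static_cast.
bool round_trip_ok(D& x) {
    C* pc = &x;
    D* pd2 = static_cast<D*>(pc);   // undoes the adjustment
    return pd2 == &x;
}

void demo() {
    D x;
    assert(c_offset(x) != 0);   // typically 8 here, but that is ABI-specific
    assert(round_trip_ok(x));
}
```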

Multiple inheritance with virtual functions

Let’s now mark sum() as virtual in B and C and override it in D:

class B {
public:
    virtual uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C {
public:
    virtual uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

We’ll also update the main() function’s memory layout printing logic to account for two vtable pointers inside of D:

int main() {
    using namespace std;

    // 1.  Multiple inheritance object layout

    D x;
    w( x.sum() );

    cout << "*** Object layout" << endl;
    w( sizeof(x) );
    uint64_t* p = reinterpret_cast<uint64_t*>(&x);
    w( reinterpret_cast<void*>(p[0]) );
    w( p[1] );
    w( reinterpret_cast<void*>(p[2]) );
    w( p[3] );
    w( p[4] );
    
    // 2.  Base pointers have different addresses

    cout << "*** Base pointers have different addresses" << endl;
    B* pb = &x;
    C* pc = &x;
    D* pd = &x;
    w( pb );
    w( pc );
    w( pd );
    w( pb->sum() );
    w( pc->sum() );
    w( pd->sum() );
}

The new program output is shown below:

x.sum(): 450
*** Object layout
sizeof(x): 40
reinterpret_cast<void*>(p[0]): 0x621bcabfdca0
p[1]: 100
reinterpret_cast<void*>(p[2]): 0x621bcabfdcb8
p[3]: 150
p[4]: 200
*** Base pointers have different addresses
pb: 0x7ffca1502c90
pc: 0x7ffca1502ca0
pd: 0x7ffca1502c90
pb->sum(): 450
pc->sum(): 450
pd->sum(): 450

As expected, invoking the sum() virtual member function through any pointer B*, C*, or D* will invoke D::sum() polymorphically.

Now let’s turn our attention to the fact that there are two vptrs in D. We established that any class object with virtual functions will have a vptr laid out at the beginning of the object. The vptr at the beginning of the D object is used when the D object is accessed through a B* or D*. However, we saw that when the D object is accessed through a C*, the pointer points to the C portion in the middle of the D object. Thus, another vptr is laid out at the beginning of the C portion.

Why then, are the addresses printed for the two vptrs different? Shouldn’t they both refer to the same vtable entry for the D::sum() function? See the memory layout for D below:

08-multi-d

The second vptr points to the vtable entry labeled as “Thunk of D::sum()”. It refers to a wrapper function around D::sum(). Consider the following code:

D x;
C* pc = &x;
pc->sum();

If the second vptr pointed to the vtable entry for the real D::sum(), the this pointer passed to the D::sum() function would be incorrect because, as we’ve seen, pc points to the middle of the D object. The thunk wrapper first adjusts the pointer back to the beginning of the object before it calls the real D::sum() function. In other words, the thunk wrapper would do something like this:

uint64_t thunk_of_D_sum(const C* this_ptr) {
    const char* adjusted_this_ptr = (const char*)this_ptr - sizeof(B);
    return D::sum( (const D*)adjusted_this_ptr );
}

dynamic_cast

In addition to function pointers, the vtable also holds type information that can be accessed at run time. This is used by dynamic_cast to perform crosscasts, and validate downcasts that static_cast cannot validate at compile time.

For example, the following two dynamic_casts are valid because pc actually refers to a D object:

D x;
B* pb = &x;
C* pc = &x;
D* pd = &x;
assert(dynamic_cast<B*>(pc) == pb);
assert(dynamic_cast<D*>(pc) == pd);

On the other hand, the following two casts fail because pc2 refers to a standalone C object that can’t be crosscast to B or downcast to D:

C y;
C* pc2 = &y;
assert(dynamic_cast<B*>(pc2) == nullptr);
assert(dynamic_cast<D*>(pc2) == nullptr);

A failed pointer dynamic_cast returns nullptr and a failed reference dynamic_cast throws std::bad_cast.
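The two failure modes lead to two usage patterns, sketched below. The classes are reduced versions of B, C, and D that return fixed values, and the helpers sum_if_D and refers_to_D are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <typeinfo>   // std::bad_cast

class B {
public:
    virtual ~B() = default;
    virtual uint64_t sum() const { return 100; }
};

class C {
public:
    virtual ~C() = default;
    virtual uint64_t sum() const { return 150; }
};

class D : public B, public C {
public:
    uint64_t sum() const override { return 450; }
};

// Pointer form: test the result against nullptr before using it.
uint64_t sum_if_D(C* pc) {
    if (D* pd = dynamic_cast<D*>(pc)) {
        return pd->sum();
    }
    return 0;  // not a D; 0 is an arbitrary fallback for this sketch
}

// Reference form: there is no null reference, so failure is an exception.
bool refers_to_D(C& rc) {
    try {
        D& rd = dynamic_cast<D&>(rc);
        (void)rd;
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}

void demo() {
    D x;
    C y;
    assert(sum_if_D(&x) == 450);
    assert(sum_if_D(&y) == 0);
    assert(refers_to_D(x));
    assert(!refers_to_D(y));
}
```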

(Optional) vtable layout

The following code shows the complete vtable layout for D:

D x;
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
void** vtbl = reinterpret_cast<void**>(p[0]);

w( reinterpret_cast<int64_t>(vtbl[-2]) );
w( vtbl[-1] );
w( vtbl[0] );
w( reinterpret_cast<int64_t>(vtbl[1]) );
w( vtbl[2] );
w( vtbl[3] );

// First vptr points to vtbl[0]
assert(reinterpret_cast<void**>(p[0]) == &vtbl[0]);
// Second vptr points to vtbl[3]
assert(reinterpret_cast<void**>(p[2]) == &vtbl[3]);

Running this code reveals more details about D’s vtable:

reinterpret_cast<int64_t>(vtbl[-2]): 0
vtbl[-1]: 0x617072817cc0
vtbl[0]: 0x6170728150e2
reinterpret_cast<int64_t>(vtbl[1]): -16
vtbl[2]: 0x617072817cc0
vtbl[3]: 0x617072815127

In addition to the two function pointers we explained before (D::sum() in vtbl[0] and the thunk wrapper in vtbl[3]), there are four more pieces of information: two offsets (vtbl[-2] and vtbl[1]) and two pointers (vtbl[-1] and vtbl[2]). We’ve updated the memory layout for D and its vtable to show the full detail:

08-multi-d-2

The two offsets are labeled offset_to_top. They are the byte offsets to the top of the D object from the B portion and C portion, respectively. Each offset is followed by a pointer to the singleton std::type_info object for the class that the vtable belongs to, which is the D class in this case. These pieces of information are used by dynamic_cast at run time.
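Ordinary code does not need to read these vtable slots to get at the type information. The typeid operator is the portable way to obtain the std::type_info for an object’s dynamic type. The sketch below uses a simplified single-inheritance pair of classes, redefined here for brevity, and a made-up helper is_exactly_D:

```cpp
#include <cassert>
#include <typeinfo>

class C {
public:
    virtual ~C() = default;   // at least one virtual function makes C polymorphic
};

class D : public C {};

// typeid applied to a reference to a polymorphic type reports the dynamic type.
bool is_exactly_D(const C& r) { return typeid(r) == typeid(D); }

void demo() {
    C y;
    D x;
    assert(!is_exactly_D(y));   // dynamic type is C
    assert(is_exactly_D(x));    // dynamic type is D, although r is declared const C&
}
```

Unlike dynamic_cast, this comparison matches only the exact type and says nothing about classes derived from D.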

The Diamond Problem

What if our base classes B and C themselves derive from a common base class A? Consider 08/inherit3 shown below:

class A {
public:
    virtual uint64_t sum() const { return a; }
private:
    uint64_t a = 5;
};

class B : public A {
public:
    virtual uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C : public A {
public:
    virtual uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return B::sum() + C::sum() + d; }

    // A::sum() is ambiguous and does not compile:
    // uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }

private:
    uint64_t d = 200;
};

int main() {
    using namespace std;

    // 1.  The diamond problem: D object contains two A portions

    D x;

    cout << "*** Object layout" << endl;
    w( sizeof(x) );
    uint64_t *p = (uint64_t *)&x;
    for (size_t i = 0; i < sizeof(x) / 8; ++i) {
        cout << "p[" << i << "]: " << p[i] << endl;
    }

    // 2.  Base classes B & C still behave the same

    cout << "*** Base pointers have different addresses" << endl;
    B* pb = &x;
    C* pc = &x;
    D* pd = &x;
    w( pb );
    w( pc );
    w( pd );
    w( pb->sum() );
    w( pc->sum() );
    w( pd->sum() );

    // 3.  But referring to the base class A is ambiguous
    //
    // A* pa = &x;
}

The program produces the following output:

*** Object layout
reinterpret_cast<void*>(p[0]): 0x5dfb9ddedcf8
p[1]: 5
p[2]: 100
reinterpret_cast<void*>(p[3]): 0x5dfb9ddedd10
p[4]: 5
p[5]: 150
p[6]: 200
*** Base pointers have different addresses
pb: 0x7ffe57c6a530
pc: 0x7ffe57c6a548
pd: 0x7ffe57c6a530
pb->sum(): 450
pc->sum(): 450
pd->sum(): 450

The output reveals that the B portion and the C portion each contain its own A object! This means that the D object, which derives from B and C, will have two A objects in it. The memory layout diagram for D is shown below:

08-diamond-problem

The main() function shows that a D object still behaves polymorphically when referred to through B* and C*. Any reference to the A shared base class, however, results in a compiler error because there are two A portions and the compiler cannot determine which one to use. This is known as the “Diamond Problem”.
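The ambiguity can be resolved by hand by naming the path to take. The sketch below uses stripped-down classes with public members and no virtual functions, plus two made-up helpers, a_via_B and a_via_C. It shows that the two paths reach two different A portions:

```cpp
#include <cassert>
#include <cstdint>

struct A { uint64_t a = 5; };
struct B : A { uint64_t b = 100; };
struct C : A { uint64_t c = 150; };
struct D : B, C { uint64_t d = 200; };

// Going through B or C first tells the compiler which A portion is meant.
A* a_via_B(D& x) { return static_cast<B*>(&x); }
A* a_via_C(D& x) { return static_cast<C*>(&x); }

void demo() {
    D x;
    // A* pa = &x;                     // Compiler error: A is an ambiguous base of D
    assert(a_via_B(x) != a_via_C(x));  // two distinct A portions
    a_via_B(x)->a = 7;                 // changing one copy...
    assert(a_via_C(x)->a == 5);        // ...leaves the other untouched
}
```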

Virtual Inheritance

Virtual inheritance solves the Diamond Problem we described earlier. In 08/inherit4 shown below, we have B and C virtually inherit from the base class A using the virtual keyword:

class A {
public:
    virtual uint64_t sum() const { return a; }
private:
    uint64_t a = 5;
};

class B : virtual public A {
public:
    virtual uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C : virtual public A {
public:
    virtual uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    // There is now only one copy of A, so A::sum() is no longer ambiguous
    uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

int main() {
    using namespace std;

    // 1.  Virtual inheritance fixes the diamond problem

    D x;

    cout << "*** Object layout" << endl;
    static_assert(sizeof(D) == 7 * sizeof(uint64_t));
    uint64_t* p = reinterpret_cast<uint64_t*>(&x);
    w( reinterpret_cast<void*>(p[0]) );
    w( p[1] );
    w( reinterpret_cast<void*>(p[2]) );
    w( p[3] );
    w( p[4] );
    w( reinterpret_cast<void*>(p[5]) );
    w( p[6] );

    // 2.  sum() behaves polymorphically via any base pointer

    cout << "*** Base pointers have different addresses" << endl;
    A* pa = &x;
    B* pb = &x;
    C* pc = &x;
    D* pd = &x;
    w( pa );
    w( pb );
    w( pc );
    w( pd );
    w( pa->sum() );
    w( pb->sum() );
    w( pc->sum() );
    w( pd->sum() );
}

The program output is shown below:

*** Object layout
reinterpret_cast<void*>(p[0]): 0x635a2bae9be0
p[1]: 100
reinterpret_cast<void*>(p[2]): 0x635a2bae9c00
p[3]: 150
p[4]: 200
reinterpret_cast<void*>(p[5]): 0x635a2bae9c20
p[6]: 5
*** Base pointers have different addresses
pa: 0x7fff65600de8
pb: 0x7fff65600dc0
pc: 0x7fff65600dd0
pd: 0x7fff65600dc0
pa->sum(): 455
pb->sum(): 455
pc->sum(): 455
pd->sum(): 455

We can now polymorphically invoke the sum() virtual member function through any base pointer, including A*. By having B and C virtually inherit from A, the most derived class in the inheritance hierarchy, D, only contains one A portion, even though A appeared multiple times in the inheritance hierarchy. A is now laid out at the end of the D object and has its own vptr. The memory layout for D is shown below:

08-virtual-d
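A quick way to confirm that only one A portion exists is to reach it through both paths and compare addresses. The sketch below uses stripped-down classes with public members and no virtual functions; the helper same_A is made up for this example:

```cpp
#include <cassert>
#include <cstdint>

struct A { uint64_t a = 5; };
struct B : virtual A { uint64_t b = 100; };
struct C : virtual A { uint64_t c = 150; };
struct D : B, C { uint64_t d = 200; };

// True if the A reached through B is the same object as the A reached through C.
bool same_A(D& x) {
    A* via_b = static_cast<B*>(&x);
    A* via_c = static_cast<C*>(&x);
    return via_b == via_c;
}

void demo() {
    D x;
    A* pa = &x;            // compiles: the conversion is no longer ambiguous
    assert(same_A(x));

    x.a = 7;               // unqualified access to a is unambiguous too
    assert(pa->a == 7);
}
```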

(Optional) vtable layout

The following code shows the complete vtable layout for D:

D x;
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
void** vtbl = reinterpret_cast<void**>(p[0]);

w( reinterpret_cast<int64_t>(vtbl[-3]) );
w( reinterpret_cast<int64_t>(vtbl[-2]) );
w( vtbl[-1] );
w( vtbl[0] );
w( reinterpret_cast<int64_t>(vtbl[1]) );
w( reinterpret_cast<int64_t>(vtbl[2]) );
w( vtbl[3] );
w( vtbl[4] );
w( reinterpret_cast<int64_t>(vtbl[5]) );
w( reinterpret_cast<int64_t>(vtbl[6]) );
w( vtbl[7] );
w( vtbl[8] );

// First vptr points to vtbl[0]
assert(reinterpret_cast<void**>(p[0]) == &vtbl[0]);
// Second vptr points to vtbl[4]
assert(reinterpret_cast<void**>(p[2]) == &vtbl[4]);
// Third vptr points to vtbl[8]
assert(reinterpret_cast<void**>(p[5]) == &vtbl[8]);

Running this code reveals more details about D’s vtable:

reinterpret_cast<int64_t>(vtbl[-3]): 40
reinterpret_cast<int64_t>(vtbl[-2]): 0
vtbl[-1]: 0x5767fce2cce8
vtbl[0]: 0x5767fce2ad5a
reinterpret_cast<int64_t>(vtbl[1]): 24
reinterpret_cast<int64_t>(vtbl[2]): -16
vtbl[3]: 0x5767fce2cce8
vtbl[4]: 0x5767fce2adcf
reinterpret_cast<int64_t>(vtbl[5]): -40
reinterpret_cast<int64_t>(vtbl[6]): -40
vtbl[7]: 0x5767fce2cce8
vtbl[8]: 0x5767fce2adc2

We’ve updated the memory layout for D and its vtable to show the full detail:

08-virtual-d-2

In addition to the offset_to_top values, std::type_info pointers, and function/thunk pointers, there are two new kinds of offsets in the vtable. vtbl[-3] is the vbase_offset that is used to adjust a B* or D* to the A portion of the D object. vtbl[1] is the vbase_offset that is used to adjust a C* to the A portion of the D object. vtbl[5] is the vcall_offset that is used to adjust the A* pointer to point to the entire D object when polymorphically invoking sum() through an A*.

Inheritance Odds & Ends

In this chapter, we focused on core topics of inheritance that require a deeper understanding of C++ internals. There are many more rules and features in the space of inheritance that we won’t cover here, but there are three more things that we’d like to cover before concluding this chapter: order of construction/destruction, pure virtual functions & abstract classes, and virtual destructors.

Construction & destruction order

We present 08/inherit5 below to illustrate the order in which the components of a derived class are constructed and destructed:

struct M {
    M(uint64_t x) : m{x} { cout << "M::M(uint64_t)" << endl; }
    ~M() { cout << "M::~M()" << endl; }
    uint64_t m;
};

struct B {
    B(uint64_t x) : b{x} { cout << "B::B(uint64_t)" << endl; }
    ~B() { cout << "B::~B()" << endl; }
    uint64_t b;
};

struct D : public B {
    D() : B{100}, m_obj{200}, d{300} { cout << "D::D()" << endl; }
    ~D() { cout << "D::~D()" << endl; }
    M m_obj;
    uint64_t d;
};

int main() {
    cout << "*** Order of construction and destruction" << endl;
    {
        D x;
    }
}

The program produces the following output:

*** Order of construction and destruction
B::B(uint64_t)
M::M(uint64_t)
D::D()
D::~D()
M::~M()
B::~B()

We see that the derived class object D x first constructs its base class B, then its members, and finally executes the body of its constructor. Note that we initialized the B portion by including B{100} in the member initializer list in the D constructor. The destruction happens in the opposite order: the body of the destructor of D, then its members, and finally its base class B.

Pure virtual functions & abstract classes

We’ve now changed 08/inherit5 to add a virtual function, sum(), to the base class B and override it in the derived class D:

...
struct B {
    ...
    virtual uint64_t sum() const = 0;
    ...
};

struct D : public B {
    ...
    uint64_t sum() const override { return b + m_obj.m + d; }
    ...
};

int main() {
    cout << "*** Order of construction and destruction" << endl;
    {
        D x;
        // B y;  // Compilation error
        B* bp = &x;
        w( bp->sum() );
    }
}

Declaring a virtual function with = 0 makes it a pure virtual function. Any class that contains a pure virtual function is called an abstract class. An abstract class is meant to specify an interface that should be implemented in a derived class. You cannot instantiate an abstract class, but you can refer to a derived concrete class using an abstract pointer/reference, as shown in the example above.
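The interface idea can be sketched with a class that has no data at all. The names Summable, One, Pair, and twice below are invented for this example. std::is_abstract from the type_traits header lets the compiler confirm which classes are abstract:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical interface: no data, only a pure virtual function.
struct Summable {
    virtual ~Summable() = default;
    virtual uint64_t sum() const = 0;
};

struct One : Summable {
    uint64_t sum() const override { return 1; }
};

struct Pair : Summable {
    uint64_t x = 2, y = 3;
    uint64_t sum() const override { return x + y; }
};

static_assert(std::is_abstract<Summable>::value, "Summable cannot be instantiated");
static_assert(!std::is_abstract<Pair>::value, "Pair overrides every pure virtual");

// Works with any concrete class that implements the interface.
uint64_t twice(const Summable& s) { return 2 * s.sum(); }

void demo() {
    One a;
    Pair b;
    assert(twice(a) == 2);
    assert(twice(b) == 10);
}
```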

Virtual destructor

Consider the following snippet of code added at the end of the main() function:

int main() {
    cout << "*** Order of construction and destruction" << endl;
    ...

    cout << "*** Virtual destructor" << endl;
    B* pb = new D;
    delete pb;
}

The code produces the following output:

*** Order of construction and destruction
...
*** Virtual destructor
B::B(uint64_t)
M::M(uint64_t)
D::D()
B::~B()

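Recall our discussion of static binding vs. dynamic binding. Unless a member function is marked virtual, C++ will resolve invocations using static binding. Here, when we delete a D object through a B*, B::~B() is chosen statically because the destructor was not marked virtual. Formally, deleting a derived object through a base pointer whose destructor is not virtual is undefined behavior; the output above is what this particular implementation happened to do.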

Let’s change B::~B() to be virtual and mark D::~D() with override as follows:

...
struct B {
    ...
    virtual ~B() { cout << "B::~B()" << endl; }
    ...
};

struct D : public B {
    ...
    ~D() override { cout << "D::~D()" << endl; }
    ...
};
...

The output now shows that D::~D() gets called polymorphically, and it in turn properly destructs all components of D.

*** Order of construction and destruction
...
*** Virtual destructor
B::B(uint64_t)
M::M(uint64_t)
D::D()
D::~D()
M::~M()
B::~B()

In general, when you write a class that is meant to be derived from, you should mark its destructor as virtual so that the derived class can be polymorphically deleted. There are, however, some situations where a non-virtual destructor makes sense for a base class. In those cases, the base class destructor should be made protected instead of public to prevent deletion through a base pointer.
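Both guidelines are sketched below with made-up classes. Base has a public virtual destructor and is deleted through a std::unique_ptr<Base>. Mixin has a protected non-virtual destructor, so an attempt to delete through a Mixin* is rejected at compile time. The global counter exists only so that the demo can check that the derived destructor ran.

```cpp
#include <cassert>
#include <memory>

int derived_dtor_calls = 0;   // counts Derived::~Derived() for the demo

struct Base {
    virtual ~Base() = default;          // virtual: safe to delete via Base*
};

struct Derived : Base {
    ~Derived() override { ++derived_dtor_calls; }
};

// A base class that is not meant to be deleted polymorphically.
struct Mixin {
protected:
    ~Mixin() = default;                 // non-virtual, and not callable from outside
};

struct UsesMixin : Mixin {};

void demo() {
    const int before = derived_dtor_calls;
    {
        std::unique_ptr<Base> p(new Derived);
    }                                   // unique_ptr deletes through a Base*
    assert(derived_dtor_calls == before + 1);   // Derived::~Derived() ran

    UsesMixin u;                        // fine: destroyed as a UsesMixin
    Mixin* pm = &u;
    (void)pm;
    // delete pm;                       // Compiler error: ~Mixin() is protected
}
```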

Last updated: 2025-09-25