Let’s start our study of inheritance in C++ by covering the fundamentals. The
08/inherit1
program defines two simple classes B
and D
. D
will inherit,
or derive, from B
. In Java, we’d say that B
is a superclass and D
is a
subclass. In C++, we say that B
is a base class and D
is a derived class.
The class definitions are shown below:
class B {
public:
uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class D : public B {
public:
uint64_t sum() const { return B::sum() + d; }
private:
uint64_t d = 200;
};
B
and D
are simple wrappers over their respective uint64_t
members b
and
d
. uint64_t
is an unsigned 8-byte integer. For convenience, we’ve defined
b
and d
with default member initializers. The default member
initializers are used for members that are not in the constructor’s member
initializer list.
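To make the rule concrete, here is a minimal sketch using a hypothetical Counter class (it is not part of 08/inherit1). A member named in a constructor’s member initializer list takes its value from there; any member left out of the list falls back to its default member initializer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class for illustration only.
class Counter {
public:
    Counter() = default;                          // both members use their defaults
    explicit Counter(uint64_t s) : start{s} {}    // start comes from the list, step from its default
    uint64_t first() const { return start; }
    uint64_t stride() const { return step; }
private:
    uint64_t start = 100;  // default member initializer
    uint64_t step = 1;     // default member initializer
};

// Exercises both constructors.
void check_counter() {
    Counter a;
    assert(a.first() == 100 && a.stride() == 1);
    Counter b{7};
    assert(b.first() == 7 && b.stride() == 1);
}
```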
In the main()
function, we instantiate a D
object and invoke its sum()
function:
int main() {
using namespace std;
D x;
w( x.sum() );
}
w()
is a convenience macro that prints the given expression verbatim followed
by its value. It is defined as follows:
#define w(expr) std::cout << #expr << ": " << (expr) << std::endl
The preprocessor #
operator is known as the
stringify operator; it converts the macro argument into a string literal,
allowing us to print it out.
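The effect of the stringify operator is easier to check when the result is a value rather than console output. The sketch below uses a hypothetical W_STR macro (our own name, not part of the program) that builds the same text as a std::string. It assumes the expression has an arithmetic type, since it relies on std::to_string.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of w() that returns the text instead of printing it.
// #expr turns the argument's tokens into a string literal; (expr) is evaluated normally.
#define W_STR(expr) (std::string(#expr) + ": " + std::to_string(expr))

// The spelling of the argument is preserved; surrounding whitespace is dropped.
void check_stringify() {
    int n = 5;
    assert(W_STR( n * 2 ) == "n * 2: 10");
    assert(W_STR(1 + 2) == "1 + 2: 3");
}
```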
The output of the 08/inherit1
program so far is as follows:
x.sum(): 300
D
’s implementation of sum()
adds the return value of the base class’s
sum()
function and its own member d
. B::sum()
uses the
scope resolution operator to invoke the base class’s sum()
function.
Since D
derives from B
, you may think that D
could access the b
member from the base class directly, like this:
uint64_t sum() const { return b + d; } // Compiler error
Doing so results in a compilation error. The b
member is declared as private
in the B
class. That is, it can only be accessed by the B
class. If we
wanted to allow derived classes of B
to access b
, we should have
specified protected
instead of private
. Derived classes of B
can access
protected members of B
, but protected members are still inaccessible to code
outside B and its derived classes.
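As a minimal sketch of that alternative, here is a variant of B and D in which b is protected. This is our own variation for illustration; the 08/inherit1 program keeps b private.

```cpp
#include <cassert>
#include <cstdint>

// Variant of B where the member is protected instead of private.
class B {
public:
    uint64_t sum() const { return b; }
protected:
    uint64_t b = 100;   // accessible in B and in classes derived from B
};

class D : public B {
public:
    uint64_t sum() const { return b + d; }   // OK now: b is protected
private:
    uint64_t d = 200;
};

// Code outside the hierarchy still cannot touch the member:
//   D x;
//   x.b = 1;   // error: b is protected

void check_protected() {
    D x;
    assert(x.sum() == 300);
}
```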
Let’s expand the main()
function to see how the D
object is laid out in
memory:
int main() {
using namespace std;
D x;
w( x.sum() );
cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = (uint64_t*)&x;
w( p[0] );
w( p[1] );
}
We reinterpret the memory occupied by D
as an array of uint64_t
s by casting a
pointer to D
to a uint64_t*
. C++ formalizes this C-style cast as
reinterpret_cast
, which is a cast that allows you to convert between types by
reinterpreting the underlying bit pattern. Unlike the const_cast
that we saw
earlier, reinterpret_cast
doesn’t offer any additional typechecking, but it
makes your intent to treat the memory as another type more explicit. The C-style
cast could’ve been written as follows:
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
When we run the program, we see the following output:
x.sum(): 300
*** Object layout
sizeof(x): 16
p[0]: 100
p[1]: 200
The size of the D
object is 16 bytes as it contains two uint64_t
s: 100 from
B
and 200 from D
. Its object layout looks like this:
The address of a D
object can be assigned to a base class pointer B*
as well
as a derived class pointer D*
. Let’s extend the main()
function to compare
the pointers and to invoke the sum()
function through them:
int main() {
using namespace std;
D x;
w( x.sum() );
cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( p[0] );
w( p[1] );
cout << "*** Pointer/reference to B can bind to a D object" << endl;
B *pb = &x;
D *pd = &x;
w( pb );
w( pd );
w( pb->sum() );
w( pd->sum() );
}
When we run the program again, we see the following new output:
x.sum(): 300
*** Object layout
sizeof(x): 16
p[0]: 100
p[1]: 200
*** Pointer/reference to B can bind to a D object
pb: 0x7ffde8c5cdf0
pd: 0x7ffde8c5cdf0
pb->sum(): 100
pd->sum(): 300
pb
and pd
hold the same address. This should come as no surprise because
we previously established that the base class portion is laid out at the
beginning of D
.
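Strictly speaking, reading a D object through a uint64_t* is not something the C++ aliasing rules guarantee, even though it behaves as shown on the compilers we used. A more conservative way to inspect the same bytes is sketched below, assuming the non-virtual B and D from above. words_of() and same_address() are hypothetical helpers, and the order of the words is a property of the platform ABI rather than of the language.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

class B {
public:
    uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class D : public B {
public:
    uint64_t sum() const { return B::sum() + d; }
private:
    uint64_t d = 200;
};

constexpr std::size_t kWords = sizeof(D) / sizeof(uint64_t);

// Hypothetical helper: copies the bytes of a D into an array of 8-byte words.
std::array<uint64_t, kWords> words_of(const D& x) {
    static_assert(sizeof(D) % sizeof(uint64_t) == 0, "D must be a whole number of words");
    std::array<uint64_t, kWords> words{};
    std::memcpy(words.data(), &x, sizeof(D));
    return words;
}

// Hypothetical helper: true if the B portion starts at the address of the D object.
bool same_address(const D& x) {
    const B* pb = &x;
    return static_cast<const void*>(pb) == static_cast<const void*>(&x);
}
```

On the platform that produced the output above, words_of(x) would hold 100 followed by 200, and same_address(x) would be true.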
What may come as a surprise is that the invocation of sum()
results in
different values depending on whether you invoke it from the B
pointer or the
D
pointer. You may have thought that they both should’ve invoked the same
D::sum()
function given that they refer to the same object, but that’s not the
case here. The compiler chose which function to call based on the type of the
pointer. Calling sum()
through B*
invokes B::sum()
whereas calling it
through D*
invokes D::sum()
. This is known as static binding and it is
resolved at compile time.
Had we written the inherit1
program in Java, we’d observe different behavior.
If a Java subclass overrides a superclass’s method, the overriding method will
always be invoked, regardless of whether you’re using a superclass or subclass reference
to the subclass object. This is called dynamic binding, and it is resolved
at run time. Dynamic binding enables polymorphism, where a class object
accessed through a base pointer exhibits different behaviors depending on what
derived type the object is.
C++ defaults to static binding as we saw before, but we can enable
dynamic binding by marking a base class member function as virtual.
We change the B
class definition to mark sum()
as a virtual function:
class B {
public:
virtual uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class D : public B {
public:
uint64_t sum() const override { return B::sum() + d; }
private:
uint64_t d = 200;
};
We also marked D::sum()
with override
, which was optional. The compiler can
already tell that you’re overriding the base class B::sum()
. It’s good
practice to specify override
to catch errors like misspelling the name of the
virtual function or trying to override a function that isn’t virtual.
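The sketch below shows the kinds of mistakes that override turns into compiler errors. Shape and Triangle are hypothetical classes used only for illustration; the rejected declarations are left as comments so that the block still compiles.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classes for illustration only.
struct Shape {
    virtual uint64_t sides() const { return 0; }
    uint64_t id() const { return 1; }               // not virtual
};

struct Triangle : Shape {
    uint64_t sides() const override { return 3; }   // OK: matches Shape::sides() const

    // Each of the following is rejected because of override:
    // uint64_t side() const override { return 3; }   // misspelled name, nothing to override
    // uint64_t sides() override { return 3; }        // missing const, so a different function
    // uint64_t id() const override { return 2; }     // Shape::id() is not virtual
};

void check_override() {
    Triangle t;
    Shape* s = &t;
    assert(s->sides() == 3);   // dynamic binding picks Triangle::sides()
    assert(s->id() == 1);      // static binding picks Shape::id()
}
```

Without override, the first two commented-out declarations would compile silently as new, unrelated functions, and calls through a Shape* would keep reaching Shape::sides().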
When we rerun the inherit1
program with these changes, we see that both
invocations of sum()
now dynamically bind to D::sum()
:
x.sum(): 300
*** Object layout
sizeof(x): 24
p[0]: 97478962171232
p[1]: 100
*** Pointer/reference to B can bind to a D object
pb: 0x7ffca0aba030
pd: 0x7ffca0aba030
pb->sum(): 300
pd->sum(): 300
It looks like the D
object layout also changed, however. It’s 8 bytes larger!
To enable dynamic binding, the compiler has inserted additional information into the object so that the correct version of the function can be invoked at run time.
Let’s first fix up our object layout printing logic to account for the new
8-byte value at the beginning of D
. We’re going to print it as a memory
address by reinterpreting it as a void*
:
cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( reinterpret_cast<void*>(p[0]) );
w( p[1] );
w( p[2] );
Here’s the new output:
...
*** Object layout
sizeof(x): 24
reinterpret_cast<void*>(p[0]): 0x559d7b1ddd60
p[1]: 100
p[2]: 200
...
p[0]
is known as the virtual table pointer, or vtable pointer, or just
vptr. It points to the virtual table for the class, which is basically an
array of function pointers for each virtual function in the class. In the case
of D
, its vtable will have an entry for D::sum()
, as shown below:
Each class with virtual functions will have its own vtable. Every object
instance of such classes will have a vptr at the beginning of the object memory
layout pointing to the per-class vtable. Here’s what the memory layout looks
like for a B
object called y
. Its vtable will have an entry for B::sum()
:
Using the vptr and vtable, the compiler will translate the member function
invocation pb->sum()
to something like pb->vptr[0](pb)
. The virtual member
function sum()
is invoked using the function pointer stored in the vtable.
Recall our discussion about how a C++ to C translator would translate a member
function to a global function that takes an extra this
parameter. In order to
invoke that global function, we have to pass pb
.
Note that the indirection through vtable ensures that B::sum()
will be called
if pb
points to a B
object and D::sum()
will be called if pb
points to a
D
object. This is how C++ implements dynamic binding.
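The translation can be sketched by hand in C-style C++. Everything below is a hypothetical illustration: BaseObj, DerivedObj, SumFn, and call_sum() are names we made up, and a real compiler generates the tables and the indirect call itself, with a layout dictated by the platform ABI.

```cpp
#include <cassert>
#include <cstdint>

struct BaseObj;
using SumFn = uint64_t (*)(const BaseObj*);

struct BaseObj {
    const SumFn* vptr;   // points to the table shared by all objects of the same "class"
    uint64_t b;
};

struct DerivedObj {
    BaseObj base;        // base portion first, so &obj.base is also the object's address
    uint64_t d;
};

// "Member functions" written as free functions with an explicit this parameter.
uint64_t base_sum(const BaseObj* self) { return self->b; }

uint64_t derived_sum(const BaseObj* self) {
    // Only ever installed in derived_vtable, so self is the first member of a DerivedObj.
    const DerivedObj* full = reinterpret_cast<const DerivedObj*>(self);
    return base_sum(self) + full->d;
}

// One table per "class".
const SumFn base_vtable[] = { base_sum };
const SumFn derived_vtable[] = { derived_sum };

BaseObj make_base() { return BaseObj{base_vtable, 100}; }
DerivedObj make_derived() { return DerivedObj{{derived_vtable, 100}, 200}; }

// What pb->sum() turns into: load the vptr, index the table, pass pb as this.
uint64_t call_sum(const BaseObj* pb) { return pb->vptr[0](pb); }

void check_manual_dispatch() {
    BaseObj y = make_base();
    DerivedObj x = make_derived();
    assert(call_sum(&y) == 100);        // table entry is base_sum
    assert(call_sum(&x.base) == 300);   // same call site, table entry is derived_sum
}
```

The call site in call_sum() never names base_sum or derived_sum; which one runs depends only on the table that the object’s vptr refers to.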
Let’s illustrate the vtable structure by following the vptr and invoking
D::sum()
manually. We’ve updated the memory layout diagram for D
to
highlight three pointers that we’re going to follow. We’ve marked each of them
with the type that we’re going to cast it to:
Starting from the right, the pointer marked with (1) refers to the vtable entry
for the single virtual function sum()
. Its type is uint64_t (*)(const D*)
.
The vptr at the beginning of D
is marked as (2). Since we’ll have to
dereference vptr to get to (1), we’re going to cast vptr’s type to
uint64_t (**)(const D*)
. By the same token, pb
, which we marked as (3), will
have to be cast to uint64_t (***)(const D*)
.
Putting it all together, we can manually call D::sum()
like this:
cout << "*** Calling a virtual function manually" << endl;
cout << ((uint64_t (***)(const D*))pb)[0][0](&x) << endl;
Here is the output, as expected:
...
*** Calling a virtual function manually
300
Unlike in Java, a C++ class can derive from multiple base classes. Use of multiple inheritance is controversial, with many C++ experts discouraging it. Java, in fact, made the conscious decision to disallow multiple inheritance (but it allows use of multiple interfaces instead). Nonetheless, we should understand the implications that multiple inheritance has on the object memory layout and how it interacts with virtual functions.
Let’s start with a simple case of multiple inheritance where no classes have
virtual functions (i.e., there are no vtables). The 08/inherit2
program
defines two base classes B
and C
, and one derived class D
that inherits
from both. The main()
function displays the object layout for D
:
class B {
public:
uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class C {
public:
uint64_t sum() const { return c; }
private:
uint64_t c = 150;
};
class D : public B, public C {
public:
uint64_t sum() const { return B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
int main() {
using namespace std;
// 1. Multiple inheritance object layout
D x;
w( x.sum() );
cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( p[0] );
w( p[1] );
w( p[2] );
}
The program output is as follows:
x.sum(): 450
*** Object layout
sizeof(x): 24
p[0]: 100
p[1]: 150
p[2]: 200
The program output reveals that the D
class object includes the B
and C
portions at the beginning in the order in which D
inherits from them.
Let’s extend the main()
function to study the behavior of accessing the D
object through B*
, C*
and D*
:
int main() {
using namespace std;
// 1. Multiple inheritance object layout
...
// 2. Base pointers have different addresses
cout << "*** Base pointers have different addresses" << endl;
B* pb = &x;
C* pc = &x;
D* pd = &x;
w( pb );
w( pc );
w( pd );
w( pb->sum() );
w( pc->sum() );
w( pd->sum() );
}
The new program output is as follows:
x.sum(): 450
*** Object layout
sizeof(x): 24
p[0]: 100
p[1]: 150
p[2]: 200
*** Base pointers have different addresses
pb: 0x7ffed4c4a520
pc: 0x7ffed4c4a528
pd: 0x7ffed4c4a520
pb->sum(): 100
pc->sum(): 150
pd->sum(): 450
As in the single inheritance case, we observe static binding when invoking the
non-virtual sum()
member function; calling it through B*
, C*
, and D*
invokes B::sum()
, C::sum()
, and D::sum()
respectively.
What you may find surprising is the fact that the pointer pc
has a different
address than pb
and pd
, even though they were all assigned to the same
object address, &x
!
The following diagram shows the D
object layout and illustrates the pointer
adjustment observed above:
It makes sense that C* pc
was adjusted by the compiler to point to the C
portion of D
when it was initialized to &x
. pc
must point to a valid C
object. pb
and pd
have the same address because the B
portion is laid out
at the beginning of D
.
static_cast
C++ offers static_cast
, which performs some additional type-checking compared
to C-style casts.
We can use static_cast
to downcast from a C*
to a D*
:
D* pd2 = static_cast<D*>(pc);
The cast is valid because the compiler verifies that D
inherits from C
.
We previously saw that the compiler adjusts the memory address of a pointer when
we upcast from D*
to C*
. Likewise, the compiler will adjust the memory
address from pc
to the beginning of the containing D
object.
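The round trip can be checked with a short sketch, assuming the non-virtual B, C, and D from 08/inherit2. round_trip_ok() and c_offset() are hypothetical helpers; the exact offset of the C portion is an ABI detail, although on the platform used here it equals sizeof(B).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class B {
public:
    uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C {
public:
    uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

// Hypothetical helper: true if D* -> C* -> D* lands back on the original address.
bool round_trip_ok(D& x) {
    C* pc = &x;                      // upcast: adjusted forward to the C portion
    D* pd2 = static_cast<D*>(pc);    // downcast: adjusted back to the start of the D object
    return pd2 == &x;
}

// Hypothetical helper: distance in bytes from the start of x to its C portion.
std::ptrdiff_t c_offset(D& x) {
    C* pc = &x;
    return reinterpret_cast<const char*>(pc) - reinterpret_cast<const char*>(&x);
}
```

A reinterpret_cast<D*>(pc) would skip the adjustment and leave the pointer aimed at the middle of the object, which is why static_cast is the right tool for this downcast.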
In the following example, we attempt a bogus cast from std::string*
to D*
using both static_cast
and a C-style cast. The static_cast
fails to compile
because std::string
and D
are unrelated classes. The C-style cast compiles
because it is treated like a reinterpret_cast
.
std::string s("hi");
// pd2 = static_cast<D*>(&s); // Compiler error
pd2 = (D*)&s; // No compiler error
static_cast
isn’t perfect, however, because it can only use information known
at compile time. In the example below, we create a standalone C
object and
attempt to downcast a pointer to it to a D*
:
C y;
pd2 = static_cast<D*>(&y);
// w( pd2->sum() ); // Unpredictable result
The code compiles because static_cast
knows that a C*
could be a D*
, but
in this case, y
is just a C
object, not a D
object.
Let’s now mark sum()
as virtual
in B
and C
and override
it in D
:
class B {
public:
virtual uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class C {
public:
virtual uint64_t sum() const { return c; }
private:
uint64_t c = 150;
};
class D : public B, public C {
public:
uint64_t sum() const override { return B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
We’ll also update the main()
function’s memory layout printing logic to account
for two vtable pointers inside of D
:
int main() {
using namespace std;
// 1. Multiple inheritance object layout
D x;
w( x.sum() );
cout << "*** Object layout" << endl;
w( sizeof(x) );
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( reinterpret_cast<void*>(p[0]) );
w( p[1] );
w( reinterpret_cast<void*>(p[2]) );
w( p[3] );
w( p[4] );
// 2. Base pointers have different addresses
cout << "*** Base pointers have different addresses" << endl;
B* pb = &x;
C* pc = &x;
D* pd = &x;
w( pb );
w( pc );
w( pd );
w( pb->sum() );
w( pc->sum() );
w( pd->sum() );
}
The new program output is shown below:
x.sum(): 450
*** Object layout
sizeof(x): 40
reinterpret_cast<void*>(p[0]): 0x621bcabfdca0
p[1]: 100
reinterpret_cast<void*>(p[2]): 0x621bcabfdcb8
p[3]: 150
p[4]: 200
*** Base pointers have different addresses
pb: 0x7ffca1502c90
pc: 0x7ffca1502ca0
pd: 0x7ffca1502c90
pb->sum(): 450
pc->sum(): 450
pd->sum(): 450
As expected, invoking the sum()
virtual member function through any pointer
B*
, C*
, or D*
will invoke D::sum()
polymorphically.
Now let’s turn our attention to the fact that there are two vptrs in D
. We
established that any class object with virtual functions will have a vptr laid
out at the beginning of the object. The vptr at the beginning of the D
object
is used when the D
object is accessed through a B*
or D*
. However, we saw
that when the D
object is accessed through a C*
, the pointer points to the
C
portion in the middle of the D
object. Thus, another vptr is laid out
at the beginning of the C
portion.
Why then, are the addresses printed for the two vptrs different? Shouldn’t they
both refer to the same vtable entry for the D::sum()
function? See the memory
layout for D
below:
The second vptr points to the vtable entry labeled as “Thunk of D::sum()”. It
refers to a wrapper function around D::sum()
. Consider the following code:
D x;
C* pc = &x;
pc->sum();
If the second vptr pointed to the vtable entry for the real D::sum()
, the
this
pointer passed to the D::sum()
function would be incorrect because, as
we’ve seen, pc
points to the middle of the D
object. The thunk wrapper first
adjusts the pointer back to the beginning of the object before it calls the real
D::sum()
function. In other words, the thunk wrapper would do something like this:
// Pseudocode: D::sum here stands for the compiled function that takes this explicitly
uint64_t thunk_of_D_sum(const C* this_ptr) {
const char* adjusted_this_ptr = (const char*)this_ptr - sizeof(B);
return D::sum( (const D*)adjusted_this_ptr );
}
dynamic_cast
In addition to function pointers, the vtable also holds type information that
can be accessed at run time. This is used by dynamic_cast
to perform
crosscasts, and validate downcasts that static_cast
cannot validate at
compile time.
For example, the following two dynamic_cast expressions
are valid because pc
actually
refers to a D
object:
D x;
B* pb = &x;
C* pc = &x;
D* pd = &x;
assert(dynamic_cast<B*>(pc) == pb);
assert(dynamic_cast<D*>(pc) == pd);
On the other hand, the following two casts fail because pc2
refers to a
standalone C
object that can’t be crosscast to B
or downcast to D
:
C y;
C* pc2 = &y;
assert(dynamic_cast<B*>(pc2) == nullptr);
assert(dynamic_cast<D*>(pc2) == nullptr);
A failed pointer dynamic_cast
returns nullptr
and a failed reference
dynamic_cast
throws std::bad_cast
.
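The difference between the two failure modes is sketched below, assuming the virtual versions of B, C, and D from this section. is_really_d() and sum_as_d() are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <typeinfo>   // std::bad_cast

class B {
public:
    virtual uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C {
public:
    virtual uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

// Pointer form: failure is reported as nullptr.
bool is_really_d(C* pc) { return dynamic_cast<D*>(pc) != nullptr; }

// Reference form: there is no null reference, so failure is reported by throwing.
bool sum_as_d(C& rc, uint64_t& out) {
    try {
        D& rd = dynamic_cast<D&>(rc);
        out = rd.sum();
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

The pointer form suits code that expects the cast to fail sometimes; the reference form suits code where a failure indicates a logic error worth propagating.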
The following code shows the complete vtable layout for D
:
D x;
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
void** vtbl = reinterpret_cast<void**>(p[0]);
w( reinterpret_cast<int64_t>(vtbl[-2]) );
w( vtbl[-1] );
w( vtbl[0] );
w( reinterpret_cast<int64_t>(vtbl[1]) );
w( vtbl[2] );
w( vtbl[3] );
// First vptr points to vtbl[0]
assert(reinterpret_cast<void**>(p[0]) == &vtbl[0]);
// Second vptr points to vtbl[3]
assert(reinterpret_cast<void**>(p[2]) == &vtbl[3]);
Running this code reveals more details about D
’s vtable:
reinterpret_cast<int64_t>(vtbl[-2]): 0
vtbl[-1]: 0x617072817cc0
vtbl[0]: 0x6170728150e2
reinterpret_cast<int64_t>(vtbl[1]): -16
vtbl[2]: 0x617072817cc0
vtbl[3]: 0x617072815127
In addition to the two function pointers we explained before (D::sum()
in
vtbl[0]
and the thunk wrapper in vtbl[3]
), there are four more pieces of
information: two offsets (vtbl[-2]
and vtbl[1]
) and two pointers (vtbl[-1]
and vtbl[2]
). We’ve updated the memory layout for D
and its vtable to show
the full detail:
The two offsets are labeled offset_to_top
. They are the byte offsets to the
top of the D
object from the B
portion and C
portion, respectively. Each
offset is followed by a pointer to the singleton std::type_info
object for the
class that the vtable belongs to, which is the D
class in this case. These
pieces of information are used by dynamic_cast
at run time.
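The same run-time type information is what the typeid operator reports for polymorphic objects. The sketch below assumes the virtual versions of B, C, and D from this section; points_to_d() is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <typeinfo>

class B {
public:
    virtual uint64_t sum() const { return b; }
private:
    uint64_t b = 100;
};

class C {
public:
    virtual uint64_t sum() const { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

// Hypothetical helper: for a polymorphic class, typeid(*pc) reports the dynamic type
// of the object, not the static type of the pointer.
bool points_to_d(const C* pc) { return typeid(*pc) == typeid(D); }

void check_typeid() {
    D x;
    C y;
    assert(points_to_d(&x));    // a C* into the middle of a D still reports D
    assert(!points_to_d(&y));   // a standalone C reports C
}
```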
What if our base classes B
and C
themselves derive from a common base class
A
? Consider 08/inherit3
shown below:
class A {
public:
virtual uint64_t sum() const { return a; }
private:
uint64_t a = 5;
};
class B : public A {
public:
virtual uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class C : public A {
public:
virtual uint64_t sum() const { return c; }
private:
uint64_t c = 150;
};
class D : public B, public C {
public:
uint64_t sum() const override { return B::sum() + C::sum() + d; }
// A::sum() is ambiguous and does not compile:
// uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
int main() {
using namespace std;
// 1. The diamond problem: D object contains two A portions
D x;
cout << "*** Object layout" << endl;
static_assert(sizeof(D) == 7 * sizeof(uint64_t));
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( reinterpret_cast<void*>(p[0]) );
w( p[1] );
w( p[2] );
w( reinterpret_cast<void*>(p[3]) );
w( p[4] );
w( p[5] );
w( p[6] );
// 2. Base classes B & C still behave the same
cout << "*** Base pointers have different addresses" << endl;
B* pb = &x;
C* pc = &x;
D* pd = &x;
w( pb );
w( pc );
w( pd );
w( pb->sum() );
w( pc->sum() );
w( pd->sum() );
// 3. But referring to the base class A is ambiguous
//
// A* pa = &x;
}
The program produces the following output:
*** Object layout
reinterpret_cast<void*>(p[0]): 0x5dfb9ddedcf8
p[1]: 5
p[2]: 100
reinterpret_cast<void*>(p[3]): 0x5dfb9ddedd10
p[4]: 5
p[5]: 150
p[6]: 200
*** Base pointers have different addresses
pb: 0x7ffe57c6a530
pc: 0x7ffe57c6a548
pd: 0x7ffe57c6a530
pb->sum(): 450
pc->sum(): 450
pd->sum(): 450
The output reveals that the B
portion and the C
portion each contain their own
A
object! This means that the D
object, which derives from B
and C
, will
have two A
objects in it. The memory layout diagram for D
is shown below:
The main()
function shows that a D
object still behaves polymorphically when
referred to through B*
and C*
. Any reference to the A
shared base class,
however, results in a compiler error because there are two A
portions and the
compiler cannot determine which one to use. This is known as the “Diamond
Problem”.
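The ambiguity can be sidestepped by naming the path explicitly, which also makes the two A portions visible. The sketch below assumes the 08/inherit3 classes; a_via_b() and a_via_c() are hypothetical helpers that first convert to B* or C* so that the remaining conversion to A* is unambiguous.

```cpp
#include <cassert>
#include <cstdint>

class A {
public:
    virtual uint64_t sum() const { return a; }
private:
    uint64_t a = 5;
};

class B : public A {
public:
    uint64_t sum() const override { return b; }
private:
    uint64_t b = 100;
};

class C : public A {
public:
    uint64_t sum() const override { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

// Selects the A portion that lives inside the B portion of x.
A* a_via_b(D& x) { return static_cast<B*>(&x); }

// Selects the A portion that lives inside the C portion of x.
A* a_via_c(D& x) { return static_cast<C*>(&x); }

void check_two_a_portions() {
    D x;
    assert(a_via_b(x) != a_via_c(x));     // two distinct A subobjects
    assert(a_via_b(x)->sum() == 450);     // both still dispatch to D::sum()
    assert(a_via_c(x)->sum() == 450);
}
```

This works, but it forces every caller to pick a path, and any state stored in A is duplicated.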
Virtual inheritance solves the Diamond Problem we described earlier. In
08/inherit4
shown below, we have B
and C
virtually inherit from the base
class A
using the virtual
keyword:
class A {
public:
virtual uint64_t sum() const { return a; }
private:
uint64_t a = 5;
};
class B : virtual public A {
public:
virtual uint64_t sum() const { return b; }
private:
uint64_t b = 100;
};
class C : virtual public A {
public:
virtual uint64_t sum() const { return c; }
private:
uint64_t c = 150;
};
class D : public B, public C {
public:
// There is now only one copy of A, so A::sum() is no longer ambiguous
uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }
private:
uint64_t d = 200;
};
int main() {
using namespace std;
// 1. Virtual inheritance fixes the diamond problem
D x;
cout << "*** Object layout" << endl;
static_assert(sizeof(D) == 7 * sizeof(uint64_t));
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
w( reinterpret_cast<void*>(p[0]) );
w( p[1] );
w( reinterpret_cast<void*>(p[2]) );
w( p[3] );
w( p[4] );
w( reinterpret_cast<void*>(p[5]) );
w( p[6] );
// 2. sum() behaves polymorphically via any base pointer
cout << "*** Base pointers have different addresses" << endl;
A* pa = &x;
B* pb = &x;
C* pc = &x;
D* pd = &x;
w( pa );
w( pb );
w( pc );
w( pd );
w( pa->sum() );
w( pb->sum() );
w( pc->sum() );
w( pd->sum() );
}
The program output is shown below:
*** Object layout
reinterpret_cast<void*>(p[0]): 0x635a2bae9be0
p[1]: 100
reinterpret_cast<void*>(p[2]): 0x635a2bae9c00
p[3]: 150
p[4]: 200
reinterpret_cast<void*>(p[5]): 0x635a2bae9c20
p[6]: 5
*** Base pointers have different addresses
pa: 0x7fff65600de8
pb: 0x7fff65600dc0
pc: 0x7fff65600dd0
pd: 0x7fff65600dc0
pa->sum(): 455
pb->sum(): 455
pc->sum(): 455
pd->sum(): 455
We can now polymorphically invoke the sum()
virtual member function through
any base pointer, including A*
. By having B
and C
virtually inherit from
A
, the most derived class in the inheritance hierarchy, D
, only contains one
A
portion, even though A
appeared multiple times in the inheritance
hierarchy. A
is now laid out at the end of the D
object and has its own
vptr. The memory layout for D
is shown below:
The following code shows the complete vtable layout for D
:
D x;
uint64_t* p = reinterpret_cast<uint64_t*>(&x);
void** vtbl = reinterpret_cast<void**>(p[0]);
w( reinterpret_cast<int64_t>(vtbl[-3]) );
w( reinterpret_cast<int64_t>(vtbl[-2]) );
w( vtbl[-1] );
w( vtbl[0] );
w( reinterpret_cast<int64_t>(vtbl[1]) );
w( reinterpret_cast<int64_t>(vtbl[2]) );
w( vtbl[3] );
w( vtbl[4] );
w( reinterpret_cast<int64_t>(vtbl[5]) );
w( reinterpret_cast<int64_t>(vtbl[6]) );
w( vtbl[7] );
w( vtbl[8] );
// First vptr points to vtbl[0]
assert(reinterpret_cast<void**>(p[0]) == &vtbl[0]);
// Second vptr points to vtbl[4]
assert(reinterpret_cast<void**>(p[2]) == &vtbl[4]);
// Third vptr points to vtbl[8]
assert(reinterpret_cast<void**>(p[5]) == &vtbl[8]);
Running this code reveals more details about D
’s vtable:
reinterpret_cast<int64_t>(vtbl[-3]): 40
reinterpret_cast<int64_t>(vtbl[-2]): 0
vtbl[-1]: 0x5767fce2cce8
vtbl[0]: 0x5767fce2ad5a
reinterpret_cast<int64_t>(vtbl[1]): 24
reinterpret_cast<int64_t>(vtbl[2]): -16
vtbl[3]: 0x5767fce2cce8
vtbl[4]: 0x5767fce2adcf
reinterpret_cast<int64_t>(vtbl[5]): -40
reinterpret_cast<int64_t>(vtbl[6]): -40
vtbl[7]: 0x5767fce2cce8
vtbl[8]: 0x5767fce2adc2
We’ve updated the memory layout for D
and its vtable to show the full detail:
In addition to the offset_to_top
values, std::type_info
pointers, and
function/thunk pointers, there are two new kinds of offsets in the vtable.
vtbl[-3]
is the vbase_offset
that is used to adjust a B*
or D*
to the A
portion of the D
object. vtbl[1]
is the vbase_offset
that is used to
adjust a C*
to the A
portion of the D
object. vtbl[5]
is the
vcall_offset
that is used to adjust the A*
pointer to point to the entire
D
object when polymorphically invoking sum()
through an A*
.
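The observable consequence of these offsets is that every path to A now arrives at the same subobject. The sketch below assumes the 08/inherit4 classes; a_via_b(), a_via_c(), and back_to_d() are hypothetical helpers. Note that going back down from the virtual base needs dynamic_cast, because static_cast cannot downcast from a virtual base class.

```cpp
#include <cassert>
#include <cstdint>

class A {
public:
    virtual uint64_t sum() const { return a; }
private:
    uint64_t a = 5;
};

class B : virtual public A {
public:
    uint64_t sum() const override { return b; }
private:
    uint64_t b = 100;
};

class C : virtual public A {
public:
    uint64_t sum() const override { return c; }
private:
    uint64_t c = 150;
};

class D : public B, public C {
public:
    uint64_t sum() const override { return A::sum() + B::sum() + C::sum() + d; }
private:
    uint64_t d = 200;
};

// Both paths are adjusted at run time to the single shared A portion.
A* a_via_b(D& x) { return static_cast<B*>(&x); }
A* a_via_c(D& x) { return static_cast<C*>(&x); }

// The location of D relative to A is only known at run time, hence dynamic_cast.
D* back_to_d(A* pa) { return dynamic_cast<D*>(pa); }

void check_single_a_portion() {
    D x;
    A* pa = &x;                        // no longer ambiguous
    assert(a_via_b(x) == pa);
    assert(a_via_c(x) == pa);
    assert(back_to_d(pa) == &x);
    assert(pa->sum() == 455);
}
```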
In this chapter, we focused on core topics of inheritance that require a deeper understanding of C++ internals. There are many more rules and features in the space of inheritance that we won’t cover here, but there are three more things that we’d like to cover before concluding this chapter: order of construction/destruction, pure virtual functions & abstract classes, and virtual destructors.
We present 08/inherit5
below to illustrate the order in which the components
of a derived class are constructed and destructed:
struct M {
M(uint64_t x) : m{x} { cout << "M::M(uint64_t)" << endl; }
~M() { cout << "M::~M()" << endl; }
uint64_t m;
};
struct B {
B(uint64_t x) : b{x} { cout << "B::B(uint64_t)" << endl; }
~B() { cout << "B::~B()" << endl; }
uint64_t b;
};
struct D : public B {
D() : B{100}, m_obj{200}, d{300} { cout << "D::D()" << endl; }
~D() { cout << "D::~D()" << endl; }
M m_obj;
uint64_t d;
};
int main() {
cout << "*** Order of construction and destruction" << endl;
{
D x;
}
}
The program produces the following output:
*** Order of construction and destruction
B::B(uint64_t)
M::M(uint64_t)
D::D()
D::~D()
M::~M()
B::~B()
We see that the derived class object D x
first constructs its base class B
,
then its members, and finally executes the body of its constructor. Note that we
initialized the B
portion by including B{100}
in the member initializer list
in the D
constructor. The destruction happens in the opposite order: the body
of the destructor of D
, then its members, and finally its base class B
.
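To verify the order programmatically rather than by reading the console, the sketch below records each step in a vector. It mirrors the M, B, and D structs of 08/inherit5; event_log() and run_scope() are hypothetical helpers added for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical log shared by all constructors and destructors.
std::vector<std::string>& event_log() {
    static std::vector<std::string> entries;
    return entries;
}

struct M {
    explicit M(uint64_t x) : m{x} { event_log().push_back("M ctor"); }
    ~M() { event_log().push_back("M dtor"); }
    uint64_t m;
};

struct B {
    explicit B(uint64_t x) : b{x} { event_log().push_back("B ctor"); }
    ~B() { event_log().push_back("B dtor"); }
    uint64_t b;
};

struct D : public B {
    D() : B{100}, m_obj{200}, d{300} { event_log().push_back("D ctor"); }
    ~D() { event_log().push_back("D dtor"); }
    M m_obj;
    uint64_t d;
};

// Constructs and destroys one D, then returns the recorded sequence of events.
std::vector<std::string> run_scope() {
    event_log().clear();
    {
        D x;
    }
    return event_log();
}
```

The returned sequence is B ctor, M ctor, D ctor, D dtor, M dtor, B dtor, matching the program output above.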
We’ve now changed 08/inherit5
to add a virtual function, sum()
, to the base
class B
and override it in the derived class D
:
...
struct B {
...
virtual uint64_t sum() const = 0;
...
};
struct D : public B {
...
uint64_t sum() const override { return b + m_obj.m + d; }
...
};
int main() {
cout << "*** Order of construction and destruction" << endl;
{
D x;
// B y; // Compilation error
B* bp = &x;
w( bp->sum() );
}
}
Declaring a virtual function with = 0
makes it a pure virtual function.
Any class that contains a pure virtual function is called an
abstract class. An abstract class is meant to specify an interface that
should be implemented in a derived class. You cannot instantiate an abstract
class, but you can refer to a derived concrete class using an abstract
pointer/reference, as shown in the example above.
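The interface idea can be sketched in isolation. Summable, Pair, and twice() below are hypothetical names; the std::is_abstract trait from the standard library is used only to confirm which class can be instantiated.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical interface: it declares what derived classes must provide.
struct Summable {
    virtual uint64_t sum() const = 0;   // pure virtual
};

// Concrete class: it overrides every pure virtual function.
struct Pair : Summable {
    Pair(uint64_t x, uint64_t y) : a{x}, b{y} {}
    uint64_t sum() const override { return a + b; }
    uint64_t a;
    uint64_t b;
};

static_assert(std::is_abstract<Summable>::value, "Summable cannot be instantiated");
static_assert(!std::is_abstract<Pair>::value, "Pair can be instantiated");

// Written against the interface, so it accepts any concrete class derived from Summable.
uint64_t twice(const Summable& s) { return 2 * s.sum(); }

void check_abstract() {
    Pair p{100, 200};
    assert(twice(p) == 600);
    // Summable s;   // error: cannot declare a variable of abstract type
}
```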
Consider the following snippet of code added at the end of the main()
function:
int main() {
cout << "*** Order of construction and destruction" << endl;
...
cout << "*** Virtual destructor" << endl;
B* pb = new D;
delete pb;
}
The code produces the following output:
*** Order of construction and destruction
...
*** Virtual destructor
B::B(uint64_t)
M::M(uint64_t)
D::D()
B::~B()
Recall our discussion of static binding vs. dynamic binding. Unless a member
function is marked virtual
, C++ will resolve invocations using static binding.
Here, when we delete a D
object through a B*
, B::~B()
is chosen statically
because the destructor was not marked virtual
.
Let’s change B::~B()
to be virtual
and mark D::~D()
with override
as follows:
...
struct B {
...
virtual ~B() { cout << "B::~B()" << endl; }
...
};
struct D : public B {
...
~D() override { cout << "D::~D()" << endl; }
...
};
...
The output now shows that D::~D()
gets called polymorphically, and it in turn
properly destructs all components of D
.
*** Order of construction and destruction
...
*** Virtual destructor
B::B(uint64_t)
M::M(uint64_t)
D::D()
D::~D()
M::~M()
B::~B()
In general, when you write a class that is meant to be derived from, you should
mark its destructor as virtual
so that the derived class can be
polymorphically deleted. There are, however, some situations where a non-virtual
destructor makes sense for a base class. In those cases, the base class
destructor should be made protected
instead of public
to prevent deletion
through a base pointer.
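A minimal sketch of the protected, non-virtual alternative follows. Countable and Widget are hypothetical classes; the standard type traits are used only to confirm what outside code is allowed to do.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical base class that is never meant to be deleted through a base pointer.
class Countable {
public:
    uint64_t count() const { return n; }
    void bump() { ++n; }
protected:
    ~Countable() = default;   // non-virtual and protected: only derived classes can run it
private:
    uint64_t n = 0;
};

class Widget : public Countable {};

// Outside code cannot destroy a Countable directly, so this would not compile:
//   Countable* p = new Widget;
//   delete p;   // error: the destructor is protected
static_assert(!std::is_destructible<Countable>::value, "not destructible from outside");
static_assert(std::is_destructible<Widget>::value, "the derived class is destructible");
static_assert(!std::has_virtual_destructor<Countable>::value, "the destructor is not virtual");

void check_protected_destructor() {
    Widget wd;       // destroyed through its own type at the end of the scope
    wd.bump();
    wd.bump();
    assert(wd.count() == 2);
}
```

The mistake that produced the truncated destructor output earlier becomes a compile-time error here instead of a run-time surprise.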
Last updated: 2025-09-25