COMS W4995 C++ Deep Dive for C Programmers

Smart Pointers

In this chapter, we will study smart pointers. They are class objects that wrap raw pointers to heap-allocated objects and manage their lifetime on your behalf. Smart pointers implement operator*() and operator->() so that they can be drop-in replacements for raw pointers in most situations. The benefit is that you don’t have to worry about deleting the heap-allocated object. This is in fact a prime example of the RAII paradigm that we studied earlier – smart pointers allow you to dynamically allocate any object and pass it around between functions without having to worry about the object’s lifetime.

Many smart pointer implementations existed before they were formally introduced in C++11. The two primary smart pointer class templates that STL provides are shared_ptr<T> and unique_ptr<T>. The former allows many pointers to refer to the same heap-allocated object without copying the underlying object. The latter represents unique ownership of a heap-allocated object.

SharedPtr<T>

Let’s begin by studying SharedPtr<T>, our simplified implementation of std::shared_ptr<T>. The program 17/sharedptr1 demonstrates its usage:

auto create_vec1() {

    // SharedPtr<T> replaces T* for heap-allocated object.
    // That is, instead of doing
    //
    //     string* p { new string{"hello"} };
    //
    // We do the following:

    SharedPtr<string> p1 { new string{"hello"} };
    SharedPtr<string> p2 { new string{"c"} };
    SharedPtr<string> p3 { new string{"students"} };

    vector v { p1, p2, p3 };
    return v;
}

int main() {
    vector v1 = create_vec1();

    for (auto p : v1) { cout << *p + " "; } cout << '\n';
    // for (auto& p : v1) { cout << *p + " "; } cout << '\n';
    // for (const auto& p : v1) { cout << *p + " "; } cout << '\n';
    // for (auto&& p : v1) { cout << *p + " "; } cout << '\n';

    vector v2 = v1;
    *v2[1] = "c2cpp";
    *v2[2] = "hackers"; 

    // print v1 again
    for (auto&& p : v1) { cout << *p + " "; } cout << '\n';
}

The create_vec1() function creates three stack-allocated shared pointers, p1, p2, and p3, that wrap raw pointers to heap-allocated string objects. It then creates a vector that holds copies of the three shared pointers and returns the vector by value. The stack-allocated shared pointers go out of scope when the function returns, but their copies in the vector still point to the same heap-allocated string objects. We’ll study the copying semantics in more detail shortly.

To be succinct, we omit the type parameter for the vector declarations in the code – i.e., we simply write vector instead of vector<SharedPtr<string>>. The compiler can deduce the type parameter from the initializer, a feature known as class template argument deduction (CTAD) that is available since C++17. By the same token, we write auto for the return type of create_vec1() because the compiler can deduce the type from the return statement.
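As a small illustration of both deductions, here is a sketch that uses std::shared_ptr in place of our SharedPtr so that it stands on its own. The function name make_words is made up for this example. The static_assert spells out the types that the compiler deduced for us:

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: both the vector's element type and the function's
// return type are deduced by the compiler.
auto make_words() {
    auto p1 = std::make_shared<std::string>("hello");
    auto p2 = std::make_shared<std::string>("c");

    std::vector v{p1, p2};  // deduced as vector<shared_ptr<string>>
    return v;               // return type deduced from this statement
}

// The deduced return type, written out in full.
static_assert(std::is_same_v<decltype(make_words()),
                             std::vector<std::shared_ptr<std::string>>>);
```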

The main() function initializes v1 with the vector that create_vec1() returns and prints out the three strings that the shared pointers in the vector point to. We dereference the shared pointer using operator*(), as if it were the underlying raw pointer.

The main() function makes a copy of the vector v1 into the vector v2, changes two of the strings, and then prints out the original vector v1. The output shown below reveals that shared pointers in v1 and v2 point to the same set of strings:

hello c students 
hello c2cpp hackers 
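To make the contrast with ordinary value semantics concrete, the following sketch compares copying a vector of shared pointers with copying a vector of plain strings. It uses std::shared_ptr as a stand-in for SharedPtr, and the two function names are hypothetical:

```cpp
#include <memory>
#include <string>
#include <vector>

// Copying a vector of shared pointers copies the pointers, not the strings,
// so a write through the copy is visible through the original.
bool pointer_copies_share_strings() {
    std::vector<std::shared_ptr<std::string>> v1{
        std::make_shared<std::string>("hello"),
        std::make_shared<std::string>("c")};
    auto v2 = v1;
    *v2[1] = "c2cpp";
    return *v1[1] == "c2cpp" && v1[1].use_count() == 2;
}

// Copying a vector of strings copies the strings themselves,
// so the original is unaffected by a write to the copy.
bool value_copies_are_independent() {
    std::vector<std::string> v1{"hello", "c"};
    auto v2 = v1;
    v2[1] = "c2cpp";
    return v1[1] == "c";
}
```

Both functions should return true: the first because v1 and v2 refer to the same strings, the second because each vector owns its own strings.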

We also took this opportunity to demonstrate four ways of using auto in range-for loops. Since writing the full type SharedPtr<string> is cumbersome, we opted to simply write auto p instead. As we’ve seen before, we can also write auto& and const auto& to avoid making copies of the elements. The fourth alternative is new to us; it uses auto&&, which is another way to declare a forwarding reference, which we studied in the previous chapter. This means p can be either an l-value reference or an r-value reference, depending on the value category of *v1.begin(). auto&& is useful when writing a range-for loop in a template where it’s unknown what dereferencing the iterator returns.

vector<bool> is one such case where dereferencing an iterator returns an r-value. It is a specialization of the vector class template that is likely to efficiently store the booleans as bits packed into integers. Since the vector doesn’t actually store an array of bool types, dereferencing vector<bool>::iterator returns by value an intermediary object that represents the particular boolean value.
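The sketch below shows a generic loop that relies on auto&&. The function template fill_all is a name invented for this example. Had we written auto& instead, the template would fail to compile for vector<bool>, because a non-const l-value reference cannot bind to the temporary proxy object:

```cpp
#include <vector>

// Hypothetical helper: assigns `value` to every element of a container.
// auto&& binds to whatever dereferencing the iterator yields: a real
// reference for vector<int>, or a temporary proxy object for vector<bool>.
template <typename Container, typename Value>
void fill_all(Container& c, const Value& value) {
    for (auto&& elem : c) {
        elem = value;  // for vector<bool>, this writes through the proxy
    }
}
```

A call such as fill_all(vb, true) on a vector<bool> vb and fill_all(vi, 7) on a vector<int> vi would both compile and modify the elements in place.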

Implementation

We present the implementation of SharedPtr<T> along with a simple test driver below:

template <typename T>
class SharedPtr {
    T*   ptr;    // the underlying pointer
    int* count;  // the reference count
public:
    explicit SharedPtr(T* p = nullptr) : ptr{p}, count{new int{1}} {}
    ~SharedPtr() {
        if (--*count == 0) {
            delete count;
            delete ptr;
        }
    }

    SharedPtr(const SharedPtr<T>& sp) : ptr(sp.ptr), count(sp.count) {
        ++*count; 
    }
    SharedPtr<T>& operator=(const SharedPtr<T>& sp) {
        if (this != &sp) {
            // first, detach.
            if (--*count == 0) {
                delete count;
                delete ptr;
            }
            // attach to the new object.
            ptr = sp.ptr;
            count = sp.count;
            ++*count;
        }
        return *this;
    }

    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }

    operator void*() const { return ptr; } // enables "if (sp) ..."
    T* get_ptr() const { return ptr; } // access to underlying ptr
    int get_count() const { return *count; } // access to reference count
};

int main() {
    SharedPtr<string> p1 { new string{"hello"} };
    cout << "before foo(), reference count: " << p1.get_count() << '\n';

    auto foo = [](SharedPtr<string> p) {
        cout << "inside foo(), reference count: " << p.get_count() << '\n';
    };
    foo(p1);

    cout << "after foo(), reference count: " << p1.get_count() << '\n';
}

The test driver declares a SharedPtr<string> p1 that wraps a pointer to the heap-allocated string{"hello"}. The SharedPtr constructor initializes its two data members: ptr is a copy of the given pointer and count is a pointer to a heap-allocated integer initialized to 1. The heap-allocated integer is a reference count that tracks how many SharedPtr objects refer to the same underlying heap-allocated object. The following diagram depicts the memory layout after p1 is declared:

17-sharedptr1

The main() function then passes p1 to the lambda foo by value, causing a copy construction of p1. The SharedPtr copy constructor is essentially a shallow copy – it simply copies the ptr and count pointers. But it does perform one extra step: it increments the shared reference count. The following diagram depicts the memory layout when the lambda’s parameter p gets copy constructed:

17-sharedptr2

The diagram shows that both SharedPtr objects point to the same underlying string, which is reflected by the shared reference count of 2.

When the lambda foo returns, its parameter p is destroyed. The SharedPtr destructor will decrement the shared reference count back to 1, as shown below:

17-sharedptr3

The SharedPtr destructor will only delete the underlying string and the reference count once the count hits zero. This will be the case once p1 is destructed at the end of the main() function.

The SharedPtr copy assignment operator is similar to the copy constructor except that it must first detach itself from the underlying object it was previously pointing to by decrementing its reference count. If the reference count hits zero, meaning that the SharedPtr held the last reference, the previous object and the reference count are deleted. After successfully detaching itself, the code is the same as the copy constructor.
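The detach-then-attach sequence can be observed through the reference counts. The sketch below uses std::shared_ptr as a stand-in, with use_count() playing the role of our get_count(); the function name is hypothetical:

```cpp
#include <memory>
#include <string>
#include <vector>

// Returns the counts seen through a2, a, and b after the assignment a = b.
std::vector<long> counts_after_assignment() {
    std::shared_ptr<std::string> a{new std::string{"old"}};
    std::shared_ptr<std::string> a2{a};                      // "old": 2 owners
    std::shared_ptr<std::string> b{new std::string{"new"}};  // "new": 1 owner

    a = b;  // detach: "old" drops from 2 to 1; attach: "new" rises from 1 to 2

    return {a2.use_count(), a.use_count(), b.use_count()};
}
```

The returned counts are 1, 2, and 2. The string "old" survives the assignment because a2 still refers to it; had a been its only owner, the assignment would have deleted it.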

SharedPtr’s operator*() and operator->() allow it to be a drop-in replacement for the underlying pointer in most cases. The compiler treats operator->() overloads specially: it keeps applying operator->() to the value returned by the previous call until a raw pointer is returned, at which point the built-in -> operator is finally applied.
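The repeated application of operator->() is easiest to see with two layers of wrappers. The types Inner and Outer below are made up for this sketch and do not manage any memory:

```cpp
#include <cstddef>
#include <string>

// Hypothetical inner wrapper: its operator->() returns a raw pointer.
struct Inner {
    std::string* p;
    std::string* operator->() const { return p; }
};

// Hypothetical outer wrapper: its operator->() returns another class
// object, so the compiler applies operator->() again to the result.
struct Outer {
    Inner in;
    Inner operator->() const { return in; }
};

std::size_t length_via_chain(std::string& s) {
    Outer o{Inner{&s}};
    // Equivalent to: o.operator->().operator->()->size()
    return o->size();
}
```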

operator void*() allows a SharedPtr to be converted to a void* like other pointers, enabling the SharedPtr to be used in boolean expressions. However, we do not provide a conversion operator to T* to prevent unintended exposure to the underlying pointer.

Instead, we provide the get_ptr() accessor for cases when we need the underlying pointer. We also provide the get_count() accessor, which returns the current reference count.
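For comparison, std::shared_ptr supports boolean tests through an explicit operator bool() rather than a conversion to void*. The sketch below applies that approach to a minimal, non-owning class named Handle, which is invented for this example and is not the chapter's SharedPtr:

```cpp
#include <string>

// Hypothetical non-owning handle used only to show the conversion.
class Handle {
    std::string* ptr;
public:
    explicit Handle(std::string* p = nullptr) : ptr{p} {}

    // "explicit" still permits "if (h) ...", but blocks implicit
    // conversions such as "void* v = h;" or "int n = h;".
    explicit operator bool() const { return ptr != nullptr; }
};

// Returns 1 if the handle refers to a string, 0 if it is empty.
int classify(const Handle& h) {
    if (h) {  // contextual conversion to bool is allowed
        return 1;
    }
    return 0;
}
```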

Revisiting vector<SharedPtr<string>>

With the SharedPtr implementation in mind, let’s now revisit the 17/sharedptr1 program from earlier. The diagram below depicts the memory layout right before create_vec1() returns, where the three SharedPtrs – p1, p2, and p3 – and their copies in the vector v share the three heap-allocated strings:

17-vec-1

When create_vec1() returns, p1, p2, and p3 are destroyed, dropping the reference counts to 1. The vector v is returned by value and is used to move-construct the vector v1. The result is shown below:

17-vec-2

The vector v2 is copy-constructed from v1, and we change two of the underlying strings through the SharedPtrs in v2. The result is shown below:

17-vec-3

MakeSharedPtr<T>()

Using std::forward<T>() and the variadic templates we studied in the previous chapter, we can combine the construction of a SharedPtr object and the heap allocation of the underlying object as follows:

template <typename T, typename... Args>
SharedPtr<T> MakeSharedPtr(Args&&... args) {
    return SharedPtr<T>{new T{forward<Args>(args)...}};
}

auto create_vec2() {
    auto p1 { MakeSharedPtr<string>("hello") };
    auto p2 { MakeSharedPtr<string>("c") };
    auto p3 { MakeSharedPtr<string>("students") };

    vector v { p1, p2, p3 };
    return v;
}

The MakeSharedPtr<T>() function template takes any number of arguments of any types to construct a T object. It perfect-forwards the arguments to the new T{} expression, preserving the value categories of the arguments. It returns a SharedPtr object initialized with the pointer to the newly heap-allocated T object.
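To see what preserving the value category buys us, consider the sketch below. It mirrors MakeSharedPtr<T>() but returns a std::shared_ptr so that it is self-contained. The names make_like and Probe are hypothetical; Probe simply records which of its constructors was selected:

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical type that remembers how its argument arrived.
struct Probe {
    std::string how;
    std::string payload;
    explicit Probe(const std::string& s) : how{"copied"}, payload{s} {}
    explicit Probe(std::string&& s) : how{"moved"}, payload{std::move(s)} {}
};

// Same shape as MakeSharedPtr<T>(), with std::shared_ptr as a stand-in.
template <typename T, typename... Args>
std::shared_ptr<T> make_like(Args&&... args) {
    return std::shared_ptr<T>{new T{std::forward<Args>(args)...}};
}
```

Given a string s, make_like<Probe>(s) selects the copying constructor because s is an l-value, whereas make_like<Probe>(std::move(s)) selects the moving constructor. Without std::forward, args would always be passed along as l-values and the string would be copied in both cases.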

Note that the constructions of p1, p2, and p3 in create_vec2() don’t have explicit new expressions anymore because the heap allocation is now embedded inside MakeSharedPtr<T>(). This is an improvement as the user code no longer has to contain explicit new expressions. Once we study the semantics of the real std::shared_ptr<T> and std::make_shared<T>(), we’ll see that this also has performance implications.

Before we delve into the semantics of the real STL implementations, we show create_vec3() which replaces our MakeSharedPtr<T>() with std::make_shared<T>():

auto create_vec3() {
    auto p1 { make_shared<string>("hello") };
    auto p2 { make_shared<string>("c") };
    auto p3 { make_shared<string>("students") };
    static_assert(is_same_v<decltype(p3), shared_ptr<string>>);

    vector v { p1, p2, p3 };
    return v;
}

std::shared_ptr<T>

Combined heap allocations

Consider the following code snippet in 17/sharedptr2:

string*            p0 { new string(50, 'A') };
shared_ptr<string> p1 { new string(50, 'B') };
shared_ptr<string> p2 { make_shared<string>(50, 'C') };

We used Valgrind to examine how many heap allocations resulted from each declaration. The declaration of p0 results in two heap allocations, as expected: one for the string object and another for the 51-byte char array, "AAA...A". We chose a long string to ensure the underlying char array is separately heap-allocated – most implementations of std::string perform short string optimization (SSO) and store a short string inside the object itself.

The declaration of p1 results in three heap allocations: one for the string object, one for "BBB...B", and one for the control block of the std::shared_ptr. The memory layout for p1 is shown below:

17-std-sharedptr-1

In our simple implementation of SharedPtr, we only stored the shared reference count on the heap. The std::shared_ptr implementation heap-allocates a control block that stores the shared reference count as well as other metadata.

Note that both the shared_ptr and its control block contain pointers to the underlying string object in the diagram above. shared_ptr::get() returns its “stored ptr” member. The control block’s “managed ptr” is used to delete the underlying object later. We’ll see why the implementation makes this distinction shortly.

The declaration of p2 using make_shared() results in the memory layout shown below:

17-std-sharedptr-2

The diagram reveals that the underlying string object is embedded inside the control block. We can imagine that the make_shared() function template calculates the total size of the control block and the underlying object, allocates the raw memory on the heap, and uses placement new to construct the objects. This combined allocation obviates the need to store a “managed ptr” member in the control block, and more importantly, saves us one heap allocation. Since make_shared() not only hides the raw new expression in user code but also has a performance benefit, it is the preferred method of constructing shared_ptr.
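If Valgrind is not at hand, one rough way to reproduce the comparison is to replace the global allocation functions with versions that count calls. This is only a sketch under that assumption: the counter g_allocs and the helper functions are names invented here, and the exact numbers depend on the standard library implementation:

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <string>

// Assumption of this sketch: every heap allocation in the program goes
// through this replaced global operator new, which bumps a counter.
static std::size_t g_allocs = 0;

void* operator new(std::size_t n) {
    ++g_allocs;
    if (void* p = std::malloc(n != 0 ? n : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Runs f and returns how many allocations happened while it ran.
template <typename F>
std::size_t count_allocations(F f) {
    const std::size_t before = g_allocs;
    f();
    return g_allocs - before;
}

std::size_t allocs_with_new() {
    return count_allocations([] {
        std::shared_ptr<std::string> p{new std::string(50, 'B')};
    });
}

std::size_t allocs_with_make_shared() {
    return count_allocations([] {
        auto p = std::make_shared<std::string>(50, 'C');
    });
}
```

On an implementation that behaves as described above, we would expect allocs_with_new() to report three allocations and allocs_with_make_shared() to report one fewer.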

Custom deleters

Consider the code snippet below that attempts to wrap a heap-allocated string array using our SharedPtr and std::shared_ptr:

SharedPtr<string> p0 { new string[2] {"hello", "c2cpp"} };
cout << *p0 << ' ' << p0.get_ptr()[1] << '\n';

shared_ptr<string> p1 { new string[2] {"hello", "c2cpp"} };
cout << *p1 << ' ' << p1.get()[1] << '\n';

The code compiles fine, and we’re able to print the second element of each array by using the raw pointers returned by get_ptr() and get(). However, the destructions of p0 and p1 both invoke undefined behavior, which in practice typically crashes the program, because the destructor incorrectly uses delete instead of delete[]. Since new string and new string[] will both return string*, there’s no way for p0/p1’s constructors to distinguish between the two without additional machinery.

One way we can use shared_ptr with arrays is by passing an optional custom deleter argument to the constructor, as follows:

shared_ptr<string> p2 { new string[2] {"hello", "c2cpp"},
    [](string* s) { delete[] s; }
};
cout << *p2 << ' ' << p2.get()[1] << '\n';

When it’s time to delete the underlying object, the implementation will invoke our custom deleter instead of invoking delete by default. A shared_ptr with a custom deleter can be used in other resource management scenarios as well. Here’s one example where shared_ptr is used to wrap FILE*:

FILE* fp = fopen(...);
// shared_ptr invokes the deleter even when fp is null, so guard fclose().
shared_ptr<FILE> sp { fp, [](FILE* fp) { if (fp) fclose(fp); } };
...
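The custom deleter is stored in the shared control block, so it runs exactly once, when the last owner goes away, no matter which copy that happens to be. The sketch below checks this with a deleter that counts its own invocations; the function name is hypothetical:

```cpp
#include <memory>
#include <string>

// Returns how many times the deleter ran, or -1 if it ran too early.
int deleter_calls_after_last_owner() {
    int calls = 0;
    {
        std::shared_ptr<std::string> a{
            new std::string[2]{"hello", "c2cpp"},
            [&calls](std::string* s) { ++calls; delete[] s; }};

        auto b = a;  // b shares the control block, including the deleter
        a.reset();   // b still owns the array, so nothing is deleted yet
        if (calls != 0) return -1;
    }                // b is destroyed here and the deleter runs
    return calls;
}
```

The function returns 1: the array is released with delete[] once, at the point where b goes out of scope.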

shared_ptr to array

Although we motivated our discussion of custom deleters with arrays, shared_ptr (since C++17) handles arrays correctly if we simply specify an array type as the type parameter.

shared_ptr<string[]> p3 { new string[2] {"hello", "c2cpp"} };
cout << p3[0] << " " << p3[1] << '\n';

Note that shared_ptr with an array type now defines operator[]().

Aliasing

Recall that shared_ptr makes a distinction between the “stored ptr” returned by get() and the “managed ptr” in the control block that it will eventually delete. In fact, the “stored ptr” can be arbitrary, and even unrelated to the object that “managed ptr” refers to.

shared_ptr provides an aliasing constructor that allows a new shared_ptr to participate in an existing reference count but hold an arbitrary “stored ptr”. In the following example, that arbitrary “stored ptr” in p2 points to the second member of a managed pair object:

shared_ptr<pair<string,int>> p1 { new pair{"hi"s, 5} };
shared_ptr<int>              p2 { p1, &p1->second };

cout << p1->first << ' ' << *p2 << '\n';  // prints "hi 5"

assert(p1.use_count() == 2 && p2.use_count() == 2);

The memory layout for this example is shown below:

17-alias

As we can see in the diagram, p1 and p2 share the same control block but p2 has a different “stored ptr”. Even if p2 is destructed after p1, the managed pair object will be properly destructed because it is deleted through the “managed ptr” in the control block.
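We can check this behavior with a type whose destructor reports when it runs. In the sketch below, Tracked and alias_demo are names invented for the example; the aliasing shared_ptr outlives the original owner and still triggers the destruction of the whole object:

```cpp
#include <memory>

// Hypothetical type that flags its own destruction.
struct Tracked {
    bool* destroyed;
    int value;
    ~Tracked() { *destroyed = true; }
};

// Returns 1 if the object was destroyed only after the alias went away.
int alias_demo() {
    bool destroyed = false;
    {
        std::shared_ptr<Tracked> owner{new Tracked{&destroyed, 5}};
        std::shared_ptr<int> alias{owner, &owner->value};

        owner.reset();  // the alias still shares ownership of the Tracked
        if (destroyed) return -1;
        if (alias.use_count() != 1 || *alias != 5) return -2;
    }                   // alias destroyed: the managed Tracked is deleted
    return destroyed ? 1 : 0;
}
```

Although alias only exposes an int*, the control block still holds the Tracked* it was created with, so the complete object is deleted at the end of the inner scope.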

std::weak_ptr

STL offers a variant of shared_ptr called weak_ptr. A weak_ptr points to an existing object managed by shared_ptr, but does not participate in the shared reference count. The following code demonstrates its usage:

int main() {
    auto weak_test = [](weak_ptr<string> wp) {
        shared_ptr<string> sp = wp.lock();
        if (sp) {
            assert(sp.use_count() == 2);
            cout << *sp << '\n';
        } else {
            cout << "weak_ptr expired\n";
        }
    };

    weak_ptr<string> wp;

    {
        auto sp = make_shared<string>("hello");
        assert(sp.use_count() == 1);

        wp = sp;                     // weak_ptr does not increase
        assert(sp.use_count() == 1); // the shared_ptr's ref count

        weak_test(wp); // sp is alive, so is wp
    }

    weak_test(wp); // sp is gone, so wp is expired
}

The main() function creates a shared_ptr, sp, that manages the string object "hello". We copy-assign sp to a weak_ptr, wp, making it also point to the managed string. The following diagram depicts the memory layout at this point:

17-std-weakptr

We see that the shared_ptr and weak_ptr have the same “stored ptr” and “control block ptr”. The diagram shows that the control block also tracks how many weak_ptrs point to the managed object. The destructor for the managed object is invoked once the shared reference count drops to zero, but the memory for the control block (including the embedded object) is not deallocated until both counts drop to zero. In the real STL implementation, the “weak count” might be tweaked to account for the existence of other shared_ptrs to improve performance in multi-threaded contexts. We omit such detail for simplicity.
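The point that the managed object is destroyed as soon as the shared count reaches zero, even while a weak_ptr is still around, can be checked with a small sketch. The type Noisy and the function weak_demo are hypothetical names:

```cpp
#include <memory>

// Hypothetical type that flags its own destruction.
struct Noisy {
    bool* destroyed;
    ~Noisy() { *destroyed = true; }
};

// Returns 0 if every observation matches the description above.
int weak_demo() {
    bool destroyed = false;
    std::weak_ptr<Noisy> wp;
    {
        std::shared_ptr<Noisy> sp{new Noisy{&destroyed}};
        wp = sp;
        // The weak_ptr observes the object without owning it.
        if (wp.expired() || wp.use_count() != 1) return -1;
    }
    // The last shared_ptr is gone: the destructor has already run,
    // even though wp still refers to the control block.
    if (!destroyed) return -2;
    if (!wp.expired()) return -3;
    if (wp.lock()) return -4;  // lock() yields an empty shared_ptr
    return 0;
}
```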

After the weak_ptr is attached to the managed string object, we see that sp.use_count() is still 1, confirming that weak_ptr does not participate in the shared reference count.

The weak_test lambda demonstrates how to use a weak_ptr. In order to access the managed object, we must create another shared_ptr to the managed object using wp.lock(). This ensures that the object is not destroyed while we’re using it. If the managed object was already destroyed when wp.lock() is invoked – i.e., there are only weak_ptrs left – then wp.lock() returns an empty shared_ptr that evaluates to false.

The program produces the following output:

hello
weak_ptr expired

The first call to weak_test prints “hello” because the managed object is still alive. The second call to weak_test happens after sp goes out of scope, destroying the managed string object. In this case wp.lock() returns an empty shared_ptr, causing weak_test to print “weak_ptr expired”.

std::unique_ptr<T>

STL also offers a simpler smart pointer called std::unique_ptr, which represents unique ownership of a managed object. unique_ptr deletes the managed object when it gets destroyed, just like shared_ptr, but it doesn’t do any reference counting. Thus, unique_ptr needs to ensure that there is only one unique_ptr pointing to the managed object at any time. It achieves this by deleting copy operations. It can, however, be moved.
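The move-only nature of unique_ptr can be verified at compile time with type traits. The sketch below also shows a common consequence: a function that takes a unique_ptr by value forces the caller to hand over ownership with std::move. The names StringPtr, consume, and ownership_moves_to_callee are made up for this example:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

using StringPtr = std::unique_ptr<std::string>;

// Copy operations are deleted; move operations are available.
static_assert(!std::is_copy_constructible_v<StringPtr>);
static_assert(!std::is_copy_assignable_v<StringPtr>);
static_assert(std::is_move_constructible_v<StringPtr>);
static_assert(std::is_move_assignable_v<StringPtr>);

// Hypothetical sink: the parameter takes over ownership of the string.
std::size_t consume(StringPtr p) {
    return p->size();
}   // p is destroyed here, deleting the string

bool ownership_moves_to_callee() {
    auto up = std::make_unique<std::string>("hello");
    const std::size_t n = consume(std::move(up));
    return n == 5 && up == nullptr;  // a moved-from unique_ptr is empty
}
```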

The program 17/uniqueptr, shown below, demonstrates how to use unique_ptr:

unique_ptr<string[]> create_array(size_t n, string s) {
    auto up = make_unique<string[]>(n);
    for (size_t i = 0; i < n; ++i) {
        up[i] = s;
    }
    return up;
}

int main() {
    size_t n = 1000;
    unique_ptr<string[]> up1 = create_array(n, "hello");
  
    // unique_ptr holds just a pointer
    static_assert(sizeof(up1) == 8);

    for (size_t i = 0; i < n; ++i) { assert(up1[i] == "hello"s); }

    unique_ptr<string[]> up2;
    assert(!up2);

    // This doesn't compile -- unique_ptr cannot be copied
    // up2 = up1;

    // But it can be moved
    up2 = std::move(up1);
    assert(!up1 && up2);

    // It can also be moved to a shared_ptr
    shared_ptr<string[]> sp { std::move(up2) };
    assert(!up2 && sp.use_count() == 1);

    for (size_t i = 0; i < n; ++i) { assert(sp[i] == "hello"s); }
}

The main() function invokes create_array() to create an array with 1000 string objects. The create_array() function returns the newly created array wrapped in a unique_ptr, which gets moved into up1 in main().

From the static_assert, we see that the size of up1 is just 8 bytes on our 64-bit platform, the same as a raw pointer. This makes unique_ptr as efficient as using a raw pointer, with the added benefit of the RAII paradigm.

Like shared_ptr, unique_ptr can also be constructed with a custom deleter. If the deleter functor is stateless, the compiler can perform certain optimizations to keep the size of unique_ptr the same as a raw pointer. If the deleter functor is stateful, the size of unique_ptr will increase to account for it.
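The size difference can be observed directly. The following sketch compares a unique_ptr with a stateless functor deleter against one whose deleter is a function pointer, which is state that has to be stored in the object. All names are hypothetical, and the pointer-sized result for the stateless case is what common implementations do rather than something the standard guarantees:

```cpp
#include <cstdlib>
#include <memory>

// Stateless functor deleter: it has no data members.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Plain function, used below through a function pointer.
void free_memory(void* p) { std::free(p); }

using StatelessPtr = std::unique_ptr<int, FreeDeleter>;
using FnPtrPtr     = std::unique_ptr<int, void (*)(void*)>;

constexpr bool stateless_is_pointer_sized =
    sizeof(StatelessPtr) == sizeof(int*);
constexpr bool fn_pointer_deleter_grows =
    sizeof(FnPtrPtr) > sizeof(int*);

// Both kinds of unique_ptr release malloc'ed memory with std::free.
int sum_through_both() {
    StatelessPtr a{static_cast<int*>(std::malloc(sizeof(int)))};
    FnPtrPtr     b{static_cast<int*>(std::malloc(sizeof(int))), free_memory};
    if (!a || !b) return -1;  // an allocation failed
    *a = 1;
    *b = 2;
    return *a + *b;
}
```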

The main() function then declares an empty unique_ptr, up2, and asserts that it evaluates to false. The expression up2 = up1 doesn’t compile because unique_ptrs cannot be copied. Instead, we move up1 to up2. After the move, up1 becomes empty and up2 gains ownership of the object.

Lastly, we demonstrate that the ownership of the managed object can be transferred from a unique_ptr to a shared_ptr because shared_ptr has the following constructor function template:

template <typename Y, typename Deleter>
shared_ptr(std::unique_ptr<Y, Deleter>&& r);

In most situations where we need a smart pointer, we don’t really need more than one pointer at a time. Common advice is to start with a unique_ptr and switch to shared_ptr only when needed.
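One way to follow this advice is to have factory functions return unique_ptr and let each caller decide whether shared ownership is needed. The sketch below assumes a hypothetical factory named make_greeting; the conversion to shared_ptr relies on the constructor shown above:

```cpp
#include <memory>
#include <string>

// Hypothetical factory: it commits only to unique ownership.
std::unique_ptr<std::string> make_greeting() {
    return std::make_unique<std::string>("hello");
}

// A caller that needs shared ownership upgrades at the call site.
long owners_after_upgrade() {
    std::shared_ptr<std::string> sp = make_greeting();  // moved from the temporary
    auto sp2 = sp;                                      // a second owner
    return sp.use_count();
}
```

The function returns 2. Going in the other direction, from shared_ptr to unique_ptr, is not supported, which is one reason to begin with the more restrictive type.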


Last updated: 2025-11-28