4
\$\begingroup\$

C++ shared_ptr implemented as a coding practice and learning purposes. It uses std::shared_ptr interface. Basic tests are included (using single header Catch 2) Some methods are omitted to keep the code shorter and readable. For simplicity it expects C++ 20. See comments for more details.

  • Is this implementation correct?

  • Does it fulfill multithreaded guarantees?

  • Does it fulfill exception guarantees?

  • Can this code be simplified? I mean made more idiomatic and readable?

  • Can it be speed up? (In parallel/concurrent use)

  • What unit test to add?

Thank you in advance for your comments.

Find the code bellow and on: https://github.com/tintera/SharedPtr (Repo contains cmake, tests, project, sln)

#pragma once
#include <atomic>

/// Lock free smart ptr similar to shared ptr.
/// - Destructor of pointed object must not throw. Or operators =, == have undefined behavior.
/// - Published as single header file for readability.
///
/// Known limits:
/// - Some race condition exist. Best to fix them and keep implementation lock free. And keep default constructor noexcept (as in std::)
/// - Owned object is not part of control block.
/// - No custom deleter or allocator.
/// - No separate template type for constructors. (std::shared_ptr constructor has another template type Y)
/// - No make_shared function.
/// - No std::hash<std::shared_ptr>
/// - No std::atomic<std::shared_ptr>
///
/// Omitted (not much to learn in implementing them IMHO)
/// - reset
/// - swap
/// - operator[] managing arrays not implemented at all.
/// - unique as it's removed in C++ 20
/// - owner_before
/// - operator<<(std::shared_ptr)
///
namespace smart_ptr
{

template<typename T>
class weak_ptr;

template<typename T>
class shared_ptr
{
    friend class weak_ptr<T>;

    struct control_block
    {
        explicit control_block(T* payload)
            : payload_(payload)
        {
        }

        std::atomic<int> usages_{1};
        std::atomic<int> weak_usages_{0};
        T* payload_{nullptr};
    };

    control_block* control_{nullptr};

    void finish_one_instance_()
    {
        if (!control_)
        {
            return;
        }
        --control_->usages_;
        if (control_->usages_ == 0)
        {
            // Last owner. We are in a single threaded scenario now.
            delete control_->payload_;
            if (control_->weak_usages_ == 0)
            {
                delete control_;
            }
        }
    }

public:
    constexpr shared_ptr() noexcept = default;

    constexpr explicit shared_ptr(nullptr_t) noexcept{}

    explicit shared_ptr(T* ptr)
    try
        : control_(new control_block(ptr))
    {
    }
    catch(...)
    {
        delete ptr;
        throw;
    }

    explicit shared_ptr(std::unique_ptr<T>&& ptr)
        : control_(new control_block(ptr.get()))
    {
    }

    ~shared_ptr() noexcept
    {
        finish_one_instance_();
    }

    shared_ptr(const shared_ptr& other) noexcept
        : control_{other.control_}
    {
        if(control_)
        {
            // TODO: Race condition, control can be deleted before usages_ is incremented.
            ++control_->usages_;
        }
    }

    shared_ptr(shared_ptr&& other) noexcept
    {
        std::swap(control_, other.control_);
    }

    template< class Y >
    explicit shared_ptr( const weak_ptr<Y>& r )
    {
        if (r.expired())
        {
            throw std::bad_weak_ptr{};
        }
        control_ = r.control_;
        ++control_->usages_;
    }

    shared_ptr& operator=(const shared_ptr& other) noexcept
    {
        if(this == &other) {
            return *this;
        }
        finish_one_instance_();
        control_ = other.control_;
        ++other.use_count();
        return *this;
    }

    shared_ptr& operator=(shared_ptr&& other) noexcept
    {
        std::swap(this, shared_ptr{other}); 
        return *this;
    }

    [[nodiscard]] explicit operator bool() const noexcept
    {
        return static_cast<bool>(control_);
    }
    
    void reset() noexcept 
    {
        finish_one_instance_();
        control_->payload_ = nullptr;
    }

    [[nodiscard]] T* get() const noexcept
    {
        // No race condition. The control exists while it's owned by this shared_ptr.
        return control_ ? control_->payload_ : nullptr;
    }

    [[nodiscard]] T& operator*() const noexcept
    {
        return *get();
    }

    [[nodiscard]] T* operator->() const noexcept
    {
        return get();
    }

    [[nodiscard]] long use_count() const noexcept
    {
        // No race condition. The control_ exists while it's owned by this shared_ptr
        return control_ ? control_->usages_.load() : 0;
    }

};

template< class T, class U >
std::strong_ordering operator<=>( const shared_ptr<T>& lhs, const shared_ptr<U>& rhs ) noexcept
{
    return lhs.control_ <=> rhs.control_;
};

template<typename T>
class weak_ptr
{
    friend class shared_ptr<T>;

    typename shared_ptr<T>::control_block* control_{nullptr};
public:
    constexpr weak_ptr() noexcept = default;

    ~weak_ptr()
    {
        if (control_)
        {
            --control_->weak_usages_;
            if (control_->usages_ == 0 && control_->weak_usages_ == 0)
            {
                delete control_;
            }
        }
    }

    explicit weak_ptr( const shared_ptr<T>& r ) noexcept
        : control_(r.control_)
    {
        if (control_)
        {
            // TODO: Race condition, control can be deleted before weak_usages_ is incremented.
            ++control_->weak_usages_;
        }
    }

    // TODO: explicit dod not compile
    weak_ptr( const weak_ptr& r ) noexcept
    {
        control_ = r.control_;
        if (control_)
        {
            // TODO: Race condition, control can be deleted before weak_usages_ is incremented.
            ++r.control_->weak_usages_;
        }
    }

    explicit weak_ptr(weak_ptr&& r) noexcept
    {
        std::swap(*this, r);
    }

    weak_ptr& operator=(const weak_ptr& other) noexcept
    {
        if(this == &other)
        {
            return *this;
        }
        std::swap(*this, weak_ptr{other});
        return *this;
    }

    weak_ptr& operator=(weak_ptr&& r) noexcept
    {
        std::swap(*this, r);
        return *this;
    }

    [[nodiscard]] bool expired() const noexcept
    {
        return (!control_) || (control_->usages_ == 0);
    }

    shared_ptr<T> lock()
    {
        return expired() ? shared_ptr<T>() : shared_ptr<T>(*this);
    }
};

}
\$\endgroup\$
4
  • 1
    \$\begingroup\$ If you want suggestions for test cases please add the test case file to the code embedded in the question. While we can use GitHub repositories as references, we can't review the any code that isn't embedded in the question/post. \$\endgroup\$
    – pacmaninbw
    Commented Sep 26, 2023 at 14:14
  • \$\begingroup\$ Just a suggestion, but all the comments at the top of the header file should be in your readme file in your GitHub repository. For this question/post please pick the tag that identifies the version of C++ you are targeting, you have both C++11 and C++20, they are very different. \$\endgroup\$
    – pacmaninbw
    Commented Sep 26, 2023 at 14:18
  • 1
    \$\begingroup\$ Another thing you could add to the limitations: doesn't implement an analog of std::enable_shared_from_this. \$\endgroup\$ Commented Sep 26, 2023 at 19:08
  • \$\begingroup\$ FYI, there are no race conditions in the constructors that take shared_ptr const&, because you know with 100% certainty that, whatever else may be going on the universe, control_->usages_ is always at least 1… because you have other right there. You are correct that there is a race condition in the weak_ptr copy constructor, but that can be easily fixed by: weak_ptr(weak_ptr const& r) noexcept : weak_ptr{r.lock()} {}. Generally, making more of a shared pointer is easy, safe, and efficient… making less of a shared pointer is considerably trickier, and costlier. \$\endgroup\$
    – indi
    Commented Sep 30, 2023 at 17:50

4 Answers 4

7
\$\begingroup\$

Critical Bugs

Bug 1

    explicit shared_ptr(std::unique_ptr<T>&& ptr)
        : control_(new control_block(ptr.get()))
    {
    }

This is broken. You pass by R-Value reference so the object is moved into ptr. Which means this parameter is now the owning copy. You get() the pointer and store it in the control block.

BUT ptr now goes out of scope and calls delete on the object. So you are now storing a pointer to a released object in your shared_ptr!

Fix: Call release() rather than get().


Bug 2

The increment here is on a temporary:

    shared_ptr& operator=(const shared_ptr& other) noexcept
    {
        if(this == &other) {
            return *this;
        }
        finish_one_instance_();
        control_ = other.control_;

        // This line does not increment the count.
        // This returns a value (not a reference).
        // So here you are incrementing a local temporary.
        ++other.use_count();

        // This should probably be (something like this).
        if (countrol_) {
           ++control->usages_.load();
        }
        return *this;
    }

Code Review Opinion:

If you don't support custom deleter's then you should not support taking ownership of the generic unique_ptr.

    explicit shared_ptr(std::unique_ptr<T>&& ptr)
        : control_(new control_block(ptr.get()))
    {
    }

Please specialize so you can only take a standard unique_ptr that does not have a specialized destructor (or support custom deleters). What I would not like to see is somebody accidentally converting the unique pointer into your shared ptr and then it not being correctly deleted.


Note: if ptr has the value nullptr. Then you are still creating a control block. Are you sure you want to do this? Nearly every function you have check for the existence of the control block (and when you explicitly pass a nullptr value you don't create a control_block, so I would expect the behavior to be the same.

    explicit shared_ptr(T* ptr)
    try
        : control_(new control_block(ptr))

        /// I think I would change the above line to:
        : control_(ptr ? new control_block(ptr) : nullptr)
        
    {
    }
    catch(...)
    {
        delete ptr;
        throw;
    }

I will admit this is how I used to do it as well.

    shared_ptr(shared_ptr&& other) noexcept
    {
        std::swap(control_, other.control_);
    }

But I think a better solution is:

    shared_ptr(shared_ptr&& other) noexcept
        : control_(std::exchange(other.control_, nullptr))
    {}

The reason for this being a better solution (in my opinion) is that the consensus is that you should (when reasonable) use the initializer list to make sure that members are set correctly. Otherwise, you will end up initializing members twice (once in the implicit initializer list and then once in the body). For simple POD types this may not be an issue, but anything with a constructor or assignment operator this can become expensive. So having the habit of always using the initializer list puts you into a good habit (then when you have experience you can then make an argument for alternatives (as long as you comment why you are not following standard techniques)).


Assignment:

Checking for self assignment. Normally I would say this is an anti-pattern and you should use the copy and swap odiom. But shared_ptr may be a special case. Let me think about this some more.

    shared_ptr& operator=(const shared_ptr& other) noexcept
    {
        if(this == &other) {
            return *this;
        }
        finish_one_instance_();
        control_ = other.control_;
        ++other.use_count();
        return *this;
    }

    shared_ptr& operator=(shared_ptr&& other) noexcept
    {
        std::swap(this, shared_ptr{other}); 
        return *this;
    }

OK. Not sure about this. But I still think the modern pattern of using a single assignment operation to support both copy and move works in this situation.

    // Pass by value.
    // Usually achieves the optimum code for both R-Value and L-Values.
    // 
    // Your multi threading and locking and increment/decrement throw some intricacies in there. But I still think this is a better solution to your two options above.
    shared_ptr& operator=(shared_ptr other) noexcept
    {
        std::swap(this, shared_ptr{other}); 
        return *this;
    }

Lets actually see the code for Copy/Move for your original compared to the single assignment version.

Here is the code in the move:

 // MOVE ORIG                        MOVE Single Method
 // Explicit Move                    // Implicit Move.
               // Identical     
               : control_(std::exchange(other.control_,nullptr))

 // Body
               // Identical
               std::swap(this, shared_ptr{other});

 // Destructor of param             // Destructor of param
               // Identical
               finish_one_instance_();

 // Return                         // Return
               // Identical
               return *this;

Here is the code in the copy.

  // COPY ORIG                       COPY Single method
                                     // Implicit Copy (ptr)
                                         : control_{other.control_}
                                     if(control_) {
                                         ++control_->usages_;
                                     }

  // Assignment operator             // Assignment operator
  if(this == &other) {               std::swap(this, shared_ptr{other});
      return *this;
  }
                                     // Implicit destructor (ptr)
  finish_one_instance_();            finish_one_instance_();

  control_ = other.control_;
  if (countrol_) {            
      ++control->usages_;
  }
  return *this;                      return *this;
  

So the order of instructions are different. The only significant difference is the check for self assignment Vs swap().

Swap is a trivial operation, while branch (if statement) is a pain for optimization and thus most likely self-deprecating. So same number of increments and decrements. So I would go with the single assignment method version.

Note: I deliberately don't look at threading. There is more complexity here, but I don't think it changes this analysis. When you get it working correctly in a threading environment, the single assignment method will still be optimal.


Same as above I would use std::exchange.

    explicit weak_ptr(weak_ptr&& r) noexcept
    {
        std::swap(*this, r);
    }

Personal Stuff:

You can feel free to ignore this section.

Don't like snake_case (In C++). I don't think I have seen any C++ code that uses it.


This is not portable.

#pragma once

Prefer to use proper include guards. That way it will work on all conforming compilers.

Threadingness

I think others have covered the threading issues. I will ignore anything to do with it (because it is hard and I don't want to think that hard :-)

    struct control_block
    {
        explicit control_block(T* payload)
            : payload_(payload)
        {
        }

        std::atomic<int> usages_{1};
        std::atomic<int> weak_usages_{0};
        T* payload_{nullptr};
    };

Extra:

I wrote a lot about creating a smart pointer here:

UniquePointer
SharedPointer
Smart-Pointer-Constructors

\$\endgroup\$
5
  • \$\begingroup\$ Upvoted for the critical bug. Some of the other points are debatable. \$\endgroup\$
    – Davislor
    Commented Sep 26, 2023 at 16:25
  • 2
    \$\begingroup\$ Comment on snake_case... google c++ style guide says to use snake_case. a lot of places do it that way ime as well. google.github.io/styleguide/cppguide.html#Variable_Names \$\endgroup\$
    – g19fanatic
    Commented Sep 26, 2023 at 17:36
  • 1
    \$\begingroup\$ @g19fanatic I generally see snake_case in C programming, personally I use camelCase in C++. \$\endgroup\$
    – pacmaninbw
    Commented Sep 26, 2023 at 17:46
  • \$\begingroup\$ @g19fanatic Though Google Style guide has improved over the years, the last time I looked I still not agree with a lot of stuff. I personally don't think it is a good guide and don't recommend it. Maybe it has improved since I last looked. \$\endgroup\$ Commented Sep 26, 2023 at 18:36
  • \$\begingroup\$ @g19fanatic But in this specific instance, the snake case is just a personal preference. I don't think there is anything wrong apart from it being slightly strange looking (for C++). \$\endgroup\$ Commented Sep 26, 2023 at 18:40
7
\$\begingroup\$

Much of the same feedback I’ve given on other shared_ptr implementations applies here. In particular:

This design of control_block is Never Going to Work

You’re unfortunately going to need to start over from scratch.

You Need One Lock-Free Atomic Update of Both Counters

Your stated goal, appropriately, is a lock-free, thread-safe data structure. The counters you have now are:

        std::atomic<int> usages_{1};
        std::atomic<int> weak_usages_{0};

First, int is a signed type, and negative values are always errors, so it makes more sense to use unsigned types for the counter and get better range. Second, int has been anything from 16-bit to 64-bit on real-world implementations, which makes it a bit of a newbie trap. Here, there are real compilers where std::atomic<int> would overflow to a negative value after only 32,767 copies. Third, and most importantly, two separate atomic variables cannot meet the requirements.

If you have two separate atomic counters, you can only update them with two separate atomic transactions. If one thread is decrementing usages and another is decrementing weak_usages at the same time, there is no guarantee that one and only one thread will see both counters equal to 0, which is neccessary to properly free the control block. Memory ordering will not help: sequential consistency only applies to updates from the same thread. For example, the thread scheduler could choose to suspend your thread in between the instructions that check the two counters.

Therefore, you might pack the two counters (which should be an unsigned type at least 32 bits wide) into a single atomic variable, and update both values with a single atomic compare-and-swap. Possible implementation choices: a bitfield containing two 32-bit fields; an atomic struct of two uint32_t values with alignas(8), or a std::atomic<uint_least64_t> with some utility functions to update either value. All mainstream x86_64 compilers will also generate cmpxchg16b instructions if you select an instruction set that has them and use a naturally-aligned atomic struct of two 64-bit values.

Whatever you choose, you may want to static_assert that your atomic counter-pair is_always_lock_free.

Edit: That isn’t the only approach that works. It’s possible to fix the algorithm along the lines of Deduplicator’s suggestions, by among other things:

Changing lines such as

        --control_->usages_;
        if (control_->usages_ == 0)

to the atomic

        if (--control->usages == 0)

And having the functions that increment the strong count (such as copy construction) do so only if it is not 0, in a single atomic operation. This could allow the destructors to use an algorithm with one or two decrement operations rather than a CAS loop, which might be a more-optimal wait-free algorithm on some architectures such as x86, or less efficient on other architectures where the atomic decrements would need to be two store-conditional loops rather than one.

You Cannot Store the Data Pointer in the Control Block

This implementation will not meet the requirement (in [util.smartptr.shared.const]) that a shared_ptr<T> constructed from a shared_ptr<Y>, where T* and Y* are compatible types, share ownership of the object. In particular, the T* payload can have a different object representation than the Y* from which it is constructed. (For example, the virtual base class of a derived class with multiple inheritance.)

Doing it that way will make each shared_ptr slightly smaller, but at the cost of requiring an extra dereference each time it accesses the data pointer. Since a shared_pointer will access its data often and its counters rarely, it makes sense anyway to optimize for the common case by putting both the control_block and payload pointers in the shared_ptr object itself. On many architectures, a shared_ptr will still fit in a register and a std::atomic<shared_ptr> will still be lock-free. (On some, forcing natural alignment of the shared_ptr will generate slightly-more-optimal code.)

Comparisons

In general, your comparison operators should be non-member friend functions, so that you can properly resolve overloads. For example, you might want it to be possible to compare a shared_ptr to a different type of pointer.

If you implement the <=> spaceship operator, you must also implement == separately (for complex technical reasons involving inheritance and overriding). The compiler will then generate the remaining comparison operators for you.

Use Copy-and-Swap

You mention that you omitted swap because there’s “not much to learn from” it. Granted, this is true for now, because your class trivially wraps a pointer to the control block. You then say you want to implement the required overload of std::atomic<shared_ptr>.

Once you move the payload from shared_ptr::control_block into shared_ptr, this is going to stop being so trivial. You’ll now find it very useful to have a way to update the object that is atomic, lock-free and exception-free. You’re currently calling std::swap on the control-block pointers in your move constructor and assignment operator. If you’d written swap and used that in both places, you would not need to change them now, when you’re forced to alter the contents of shared_ptr.

There Are Other Thread-Safety Bugs

You mention that you’re aware of some race conditions. As G. Sleipen mentions in more detail, a great many of these are caused by writing multiple separate atomic operations, when what you need is a single atomic operation (usually an atomic compare-and-swap). Read their advice, too.

Future Directions: get_deleter

You haven’t implemented custom deleters at all, but since you say you’re thinking about them: when you do, it might be very helpful to turn control_block into an abstract base class that concrete_control_block<Deleter> derives from. (Since the shared pointer no longer stores its payload in the control block, the abstract interface no longer needs to remember what T is. The counter-manipulation and destroying the control block are both fully generic. However, if you are implementing get_deleter, it is useful to remember the type of the stored deleter and store a copy of the original payload.) That lets you implement a helper such as

template<typename T, typename Deleter>
    // requires clauses go here.
virtual void* concrete_control_block<T, Deleter>::get_deleter_helper(const std::type_info type) const override noexcept

which you can then call from get_deleter<Deleter, T>, as

static_cast<Deleter*>(sp.control_->get_deleter_helper(typeid(Deleter)))

The simpler alternative you mentioned in your comment will work too: store a std::function<void(T*)> in control_block<T> and implement get_deleter by calling std::function::target.

\$\endgroup\$
8
  • \$\begingroup\$ Davislor wrote: "You’re unfortunately going to need to start over from scratch.". I plan to address the issues found. Is is then better to create new question (and link it here from a comment) or somehow edit this question (and how)? Note: I'm new to StackExchange Code review. \$\endgroup\$ Commented Sep 27, 2023 at 11:08
  • \$\begingroup\$ @TomasTintera Okay, that was an overstatement. The part I originally recommended you “start over from scratch” was control_block. Then Deduplicator showed up and corrected me that there is a way to get thread-safety with separate atomic counters in the control block. You might use either approach; it’s up to you. To implement all the requirements (including get_deleter), you will definitely need to add several other pieces of data to the control block, as well as a copy of the object pointer to each shared_ptr instance. \$\endgroup\$
    – Davislor
    Commented Sep 27, 2023 at 14:57
  • \$\begingroup\$ @TomasTintera But making a new second draft sounds like a great idea! \$\endgroup\$
    – Davislor
    Commented Sep 27, 2023 at 14:59
  • \$\begingroup\$ If you replace the payload pointer with a deleter std::function or similar, then you could keep that in the control block. (But then I guess you do need a payload pointer in the shared_ptr/weak_ptr to keep track of what subobject the current smart pointer object refers to.) \$\endgroup\$ Commented Sep 27, 2023 at 16:37
  • \$\begingroup\$ @DanielSchepler In order to implement get_deleter, you need to store both the deleter in the control block, as well as either a copy of the original object pointer or the object itself. This is in addition to the T* that you need to store in the shared_ptr instance, because the conversion from shared_ptr<Y> to shared_ptr<T> could require a T* different from the original Y* to share the same control block. \$\endgroup\$
    – Davislor
    Commented Sep 27, 2023 at 16:49
6
\$\begingroup\$

Make sure your operations are really atomic

Using atomic variables does not automatically make the operations on your objects atomic. Consider these lines in finish_one_instance_():

--control_->usages_;
if (control_->usages_ == 0)
    …

Those are two separate atomic operations. Together they are no longer atomic. And this immediately leads to a race condition: consider two threads, each holding a shared_ptr that is pointing to the same control_block. Now they both destroy their shared_ptr object at exactly the same time. Then they first could both execute --control_->usages_, and only then do they both execute the if-statement. Both will now see that control_->usages_ == 0, and a double free will happen that at best crashes your program, at worst causes unexpected things to happen.

You can fix this by writing this instead:

if (--control_->usages_ == 0)
    …

This was an easy fix. But the same mistake happens when calling lock() on a weak_ptr:

return expired() ? shared_ptr<T>() : shared_ptr<T>(*this);

Don't let the fact that this is written on one line fool you. It is equivalent to:

if (expired())
    return shared_ptr<T>();
else
    return shared_ptr<T>(*this);

Consider that expired() is executed and returns false. However, another thread can then jump in and causes the usages_ to drop to zero. Then the first thread calls shared_ptr<T>(*this). Sure, the constructor of shared_ptr that takes a weak_ptr also checks expired(), but this has, again, exactly the same problem.

The fix here is more complicated. You must increment usages_ atomically, but only if it isn't zero. You have to use a compare-and-exchange method in a loop here.

Finally, it can happen that one thread destroys the last remaining shared_ptr, and another the last remaining weak_ptr. See Davislor's answer about the issue and possible fix for that.

weak_ptr::lock() should be noexcept

The standard says that lock() is noexcept. This should not be hard to do, just make sure you don't throw anywhere when trying to lock an expired() object.

\$\endgroup\$
5
\$\begingroup\$

Currently, your two use-counts are handled completely independent, the pitfalls of which others already explained (namely, if they are decremented to zero at the same time, who did it last?).

A better fix than stuffing them together or some such and hoping you get an efficient lock-free update for the resultant bigger structure, just make the existence of any strong claim at all hold down a single weak claim (properly counted in the member storing the weak count, like all others). Thus, you start out with a single strong claim and the single weak claim all strong claims collectively own.

Last strong claim removed means cleaning up the resource and then releasing the collective weak claim.
Last weak claim removed means cleaning up the control block.

No operation involving strong claims looks at weak claims, aside from the removal of the last strong claim (removing the collective weak claim at the end, proper sequencing is crucial).

No operation involving weak claims looks at strong claims, aside from trying to add a strong claim while only holding a weak one (which doesn't look at weak claims).

\$\endgroup\$
1

Not the answer you're looking for? Browse other questions tagged or ask your own question.