I recently discovered the atomic wait/notify mechanism in C++20 and wrote this readers-writer lock in the style of the Linux kernel's seqlock. Writes have to happen inside a lock()/unlock() pair, while reads are optimistic. It is very fast, but I am wondering whether it is wrong in some way, e.g. whether it could lock up beyond the hypothetical timing issues (such as reads taking much longer than writes).

#include <atomic>
#include <cstdint> // uint64_t

class RwSeqLock
{
    // *** NB this can indeed lock up and is dangerous. see stackexchange replies ***
    std::atomic<uint64_t> m_count = { 0 };   // sequence counter: odd while a writer holds the lock
    std::atomic<uint64_t> m_waiting = { 0 }; // number of threads blocked in m_count.wait()

public:
    void lock()
    {
        // fast path: counter is even (no writer), try to bump it to odd
        uint64_t count = m_count.load();
        if (!(count & 1) && m_count.compare_exchange_weak(count, count + 1))
            return;

        // slow path: register as a waiter and retry until the CAS succeeds
        count = m_count.load();
        m_waiting.fetch_add(1);
        while (1)
        {
            if (!(count & 1))
            {
                if (m_count.compare_exchange_weak(count, count + 1))
                {
                    m_waiting.fetch_sub(1);
                    return;
                }
            }
            else
            {
                m_count.wait(count);
                count = m_count.load();
            }
        }
    }

    void unlock()
    {
        m_count.fetch_add(1); // make the counter even again, releasing the lock
        if (m_waiting.load())
            m_count.notify_one();
    }

    template<class Func>
    auto read(const Func& func)
    {
        // optimistic fast path: run func() and verify no write intervened
        uint64_t count = m_count.load();
        if (!(count & 1))
        {
            auto val = func();
            if (m_count.load() == count)
                return val;
        }

        // slow path: register as a waiter and retry
        count = m_count.load();
        m_waiting.fetch_add(1);
        while (1)
        {
            if (!(count & 1))
            {
                auto val = func();
                uint64_t count_after = m_count.load();
                if (count_after == count)
                {
                    m_waiting.fetch_sub(1);
                    return val;
                }
                else
                    count = count_after;
            }
            else
            {
                m_count.wait(count);
                count = m_count.load();
            }
        }
    }

    // stats
    uint64_t count() const { return m_count.load(); }
    uint64_t waiting() const { return m_waiting.load(); }
};
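
For reference, a minimal usage sketch (the g_lock/g_value names are just illustrative): writers mutate the data between lock() and unlock(), while readers pass a lambda to read(), which retries until it observes a consistent snapshot.

#include <thread>

RwSeqLock g_lock;
int g_value = 0; // data protected by g_lock

void writer()
{
    g_lock.lock();
    ++g_value;       // mutate while the counter is odd
    g_lock.unlock();
}

int reader()
{
    // runs the lambda optimistically; retries if a write intervened
    return g_lock.read([] { return g_value; });
}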

1 Answer

About those hypothetical timing issues

I think you are referring to the fact that seqlocks can livelock readers. However, this is not hypothetical at all; it is a real property of seqlocks. That's why they should only be used in scenarios where writes are rare compared to reads.

Readers can steal notifications other threads are waiting on

Both readers (in read()) and writers (in lock()) can call m_count.wait(). So when you have two writers and one reader, and the first writer holds the lock, the reader and the second writer can both be waiting. However, when the first writer unlocks, it calls m_count.notify_one(). If the reader gets this notification, the second writer will never be woken.

A similar situation arises when you have one writer but two readers waiting; only one of them will get notified.
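
To make the lost wakeup concrete, here is a minimal, timing-dependent harness (my own sketch, not code from the post) that stages exactly this interleaving. Because the outcome depends on which waiter the implementation picks, it may also complete normally on a given run:

#include <atomic>
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>
// assumes the RwSeqLock class from the question is in scope

int main()
{
    RwSeqLock lock;
    std::atomic<bool> writer_b_done{ false };
    int data = 0;

    lock.lock(); // writer A holds the lock, m_count is now odd

    std::thread reader([&] {
        (void)lock.read([&] { return data; }); // blocks in m_count.wait()
    });
    std::thread writer_b([&] {
        lock.lock();                           // blocks in m_count.wait()
        lock.unlock();
        writer_b_done = true;
    });

    // give both threads time to enter m_count.wait()
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    lock.unlock(); // the single notify_one() may go to the reader

    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    if (!writer_b_done)
    {
        std::puts("writer B never woke up: the notification was stolen");
        std::_Exit(1); // don't try to join the stuck thread
    }
    reader.join();
    writer_b.join();
}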

One solution is to call m_count.notify_all() in unlock(); this makes sure all waiting threads get notified. This might cause a thundering herd problem if you have lots of writers queued up, but that should be rare, or else you shouldn't have used a seqlock to begin with (see above).
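
A sketch of that variant; only unlock() changes:

void unlock()
{
    m_count.fetch_add(1);     // counter becomes even again
    if (m_waiting.load())
        m_count.notify_all(); // wake every waiter, not just one
}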

Another solution is to have the reader check whether m_waiting is still non-zero after calling m_waiting.fetch_sub(1), and if so, call m_count.notify_one() to pass the notification on to the next waiter.
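
A sketch of that hand-off on read()'s success path, using the value returned by fetch_sub() (the count before the decrement) to detect remaining waiters; presumably lock()'s success path should do the same:

// instead of a plain m_waiting.fetch_sub(1) on success:
if (m_waiting.fetch_sub(1) > 1)  // > 1 means other threads are still waiting
    m_count.notify_one();        // pass the notification on
return val;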

  • Thanks, this is the sort of analysis I was looking for. I edited the code to say this naive approach is dangerous. I did not see the issue in some simple testing: read/write was decided by an RNG, not by some real state like a queue being full, for example.
    – MDH
    Commented Feb 27, 2023 at 15:25
  • I think it is a rather good attempt. But even for something as simple as this seqlock, it's hard to prove it is correct. I found this issue, but I am not 100% sure there is no other problem. Randomly locking and unlocking from multiple threads is still not good enough to find all race conditions. There might be some tools that can help provide formal proofs for your code, but I have never used those and wouldn't know what to recommend. See also this post. Maybe ask Computer Science?
    – G. Sliepen
    Commented Feb 27, 2023 at 15:49
  • So SeqLocks in standard C++ are apparently a well-known and sticky problem. The core of it is the second m_count.load() in read(). Because this is only a load, and not a store/release as in a regular rwlock, there is nothing preventing the reads above it from being reordered after it. The standard solution is apparently to replace that second load with an m_count.fetch_add(0), which (some) compilers now have specific optimizations for (see the sketch after these comments). A lot more here: hpl.hp.com/techreports/2012/HPL-2012-68.pdf
    – MDH
    Commented Mar 2, 2023 at 3:47
  • In my own very limited testing, clang on x64 recognizes count.fetch_add(0) and compiles it to an mfence instruction followed by a regular load. gcc does not, however; there you would presumably need to insert a fence manually with inline assembly or intrinsics.
    – MDH
    Commented Mar 2, 2023 at 4:32
  • There is std::atomic_thread_fence(), so you don't need inline assembly or intrinsics AFAICS. You might want to write an answer yourself (it's totally allowed!) about the issue you found.
    – G. Sliepen
    Commented Mar 2, 2023 at 14:00
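
To sketch what the last few comments describe (my reconstruction, not code from the thread): read() needs an acquire fence between the data reads and the second load of the counter, which std::atomic_thread_fence() provides portably; the m_count.fetch_add(0) trick achieves the same ordering. Here the m_waiting registration is hoisted up front for simplicity, at the cost of an occasional spurious notify, and the notification problems discussed in the answer are left unaddressed:

template<class Func>
auto read(const Func& func)
{
    m_waiting.fetch_add(1); // register up front so unlock() will notify us
    while (true)
    {
        uint64_t count = m_count.load(std::memory_order_acquire);
        if (count & 1)
        {
            m_count.wait(count); // a writer is active, wait for it
            continue;
        }
        auto val = func();       // optimistic read of the protected data
        // keep the reads inside func() from being reordered past
        // the re-check of the sequence counter
        std::atomic_thread_fence(std::memory_order_acquire);
        if (m_count.load(std::memory_order_relaxed) == count)
        {
            m_waiting.fetch_sub(1);
            return val;          // no write intervened, the snapshot is valid
        }
    }
}

Strictly speaking, the C++ memory model also wants the data reads inside func() to be (relaxed) atomic loads, so that the race with a concurrent writer is not undefined behaviour; the HP report linked above goes into that in detail.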
