132

For something simple like a counter that multiple threads will be incrementing, I read that mutex locks can decrease efficiency since the threads have to wait. So, to me, an atomic counter would be the most efficient, but then I read that internally it is basically a lock? I'm confused how either could be more efficient than the other.

2
  • Should this answer be for all platforms and programming languages supporting pthreads or some subset? I do not completely understand the relationships between pthreads, operating systems and programming languages but it seems these relationships could be relevant. Commented Sep 16, 2017 at 16:37
  • Well, in C++, every implementation on every machine should follow the rules. If not, it's a bug. There are plenty of bugs all over, you never know. Commented May 19 at 4:55

7 Answers

163

Atomic operations leverage processor support (compare-and-swap instructions) and don't use locks at all, whereas locks are more OS-dependent and perform differently on, for example, Windows and Linux.

Locks actually suspend thread execution, freeing up CPU resources for other tasks, but incurring obvious context-switching overhead when stopping/restarting the thread. By contrast, threads attempting atomic operations don't wait; they keep trying until they succeed (so-called busy-waiting), so they don't incur context-switching overhead, but they don't free up CPU resources either.

Summing up: in general, atomic operations are faster when contention between threads is sufficiently low. You should definitely benchmark, as there is no other reliable way of knowing which overhead is lower, context-switching or busy-waiting.

4
  • 4
    "Locks actually suspend thread execution" is not true in a general sense. You can have a spin lock or a non-spin lock. It entirely depends on how the lock is implemented, and it's critical that you, as a programmer, know what kind of lock you're using.
    – Allison
    Commented Jan 10, 2022 at 1:34
  • I'm a bit confused that you say atomic operations don't use locks at all. If I look at the assembly generated for the compare_exchange() function in C++, it has the instruction lock cmpxchg %rcx,(%rsi) (on x86). Commented Mar 23 at 21:36
  • @QwertYuiop The Intel opcode prefix "lock" has nothing to do with a mutex lock. It locks on internal register/write buffer access. At this level CPU architecture gets really sophisticated with MESI protocol etc.
    – Lothar
    Commented Apr 11 at 10:31
  • @Lothar, lock does not equal mutex. A spinlock for instance is also still a lock. The point is it still uses some kind of lock to make one go before the other and not at the same time. Commented Apr 11 at 15:25
71

If you have a counter for which atomic operations are supported, it will be more efficient than a mutex.

Technically, the atomic will lock the memory bus on most platforms. However, there are two ameliorating details:

  • It is impossible to suspend a thread during the memory bus lock, but it is possible to suspend a thread during a mutex lock. This is what lets you get a lock-free guarantee (which doesn't say anything about not locking - it just guarantees that at least one thread makes progress).
  • Mutexes eventually end up being implemented with atomics. Since you need at least one atomic operation to lock a mutex, and one atomic operation to unlock it, a mutex lock takes at least twice as long, even in the best of cases.
1
  • 1
    It is important to understand it depends how well the compiler or interpreter supports the platform to generate the best machine instructions (in this case lock-free instructions) for the platform. I think this is what @Cort Ammon meant by "supported". Also some mutexes might make guarantees about forward progress or fairness for some or all threads that are not made by simple atomic instructions. Commented Sep 16, 2017 at 16:28
38

A minimal (standards compliant) mutex implementation requires 2 basic ingredients:

  • A way to atomically convey a state change between threads (the 'locked' state)
  • Memory barriers to ensure that memory operations protected by the mutex stay inside the protected area.

There is no way you can make it any simpler than this because of the 'synchronizes-with' relationship the C++ standard requires.

A minimal (correct) implementation might look like this:

class mutex {
    std::atomic<bool> flag{false};

public:
    void lock()
    {
        while (flag.exchange(true, std::memory_order_relaxed));
        std::atomic_thread_fence(std::memory_order_acquire);
    }

    void unlock()
    {
        std::atomic_thread_fence(std::memory_order_release);
        flag.store(false, std::memory_order_relaxed);
    }
};

Due to its simplicity (it cannot suspend the thread of execution), it is likely that, under low contention, this implementation outperforms a std::mutex. But even then, it is easy to see that each integer increment, protected by this mutex, requires the following operations:

  • an atomic store to release the mutex
  • an atomic exchange (read-modify-write) to acquire the mutex (possibly multiple times)
  • an integer increment

If you compare that with a standalone std::atomic<int> that is incremented with a single (unconditional) read-modify-write (e.g. fetch_add), it is reasonable to expect that the atomic operation (using the same ordering model) will outperform the case where a mutex is used.

0
15

An atomic integer is a user-mode object and is therefore much more efficient than a mutex, which runs in kernel mode. The scope of an atomic integer is a single application, while the scope of a mutex is all running software on the machine.

2
  • 6
    This is almost true. Modern mutex implementations, like Linux's Futex, do tend to leverage atomic operations to avoid the switch to kernel mode on the fast path. Such mutexes only have to jump into kernel mode if the atomic operation failed to do the desired task (such as the case where the thread needs to block).
    – Cort Ammon
    Commented Sep 20, 2017 at 22:14
  • I think the scope of an atomic integer is a single process, which is significant insofar as applications can be comprised of multiple processes (e.g., Python multiprocessing for parallelism).
    – weberc2
    Commented Nov 10, 2019 at 23:35
6

The atomic variable classes in Java are able to take advantage of compare-and-swap instructions provided by the processor.

Here's a detailed description of the differences: http://www.ibm.com/developerworks/library/j-jtp11234/

0
3

A mutex is a kernel-level semantic which provides mutual exclusion even at the process level. Note that it can be helpful in extending mutual exclusion across process boundaries, not just within a process (for threads). It is costlier.

An atomic counter, AtomicInteger for example, is based on CAS, and usually keeps attempting the operation until it succeeds. Basically, threads race or compete to increment/decrement the value atomically. Here, you may see CPU cycles being burned by a thread repeatedly retrying the operation on the current value.

Since you want to maintain a counter, AtomicInteger/AtomicLong will be the best fit for your use case.

3

Most processors support an atomic read or write, and often an atomic compare-and-swap. This means that the processor itself reads or writes the latest value in a single operation, though a few cycles may be lost compared to a normal integer access, especially since the compiler can't optimise around atomic operations nearly as well as around normal ones.

On the other hand, a mutex is a number of lines of code to enter and leave, and during that execution other processors that try to access the same location are totally stalled, so clearly a big overhead on them. In unoptimised high-level code, the mutex enter/exit and the atomic operation will be function calls, but for the mutex, any competing processor will be locked out from the moment your mutex enter function returns until your exit function starts. For the atomic operation, only the duration of the operation itself locks others out. Optimisation should reduce that cost, but not all of it.

If you are trying to increment, then your modern processor probably supports atomic increment/decrement, which will be great.

If it does not, then it is either implemented using the processor atomic cmp&swap, or using a mutex.

Mutex:

get the lock
read
increment
write
release the lock

Atomic cmp&swap:

atomic read the value
calc the increment
do {
    atomic cmp&swap value, increment
    recalc the increment
} while the cmp&swap did not see the expected value

So this second version has a loop (in case another processor increments the value between our atomic operations, so the value no longer matches and the increment would be wrong) which can get long if there are many competitors, but it should generally still be quicker than the mutex version. The mutex version, however, may allow that processor to task-switch.
