Disruptor –
Ultrafast communication
March 2012
#theedge2012
Guy Raz Nir
Disruptor
» Introduction
» The problem …
» The (not so good) alternatives
» Architecture
» Summary
Agenda
Quiz!
Introduction
» QUEUE !
» Communication facility between threads.
» London Multi Asset Exchange platform
» Can handle up to 6,000,000 TPS
▪ Dual socket, 3GHz quad-core Nehalem processors.
» 98% of transactions complete in under 38ms.
» Average transaction length: 9.22ms
LMAX
» To learn about LMAX Disruptor abilities and
usability.
» To practice “different thinking” in order to
solve complex problems.
Why are we here ?
"Any intelligent fool can make things bigger,
more complex, and more violent.
It takes a touch of genius,
and a lot of courage
to move in the opposite direction."
Albert Einstein
57.3 MB/s
-20%
“Mechanical Sympathy”
Hardware and software working together in harmony *
* Martin Thompson’s blog
The problem …
» Test case: executing 10,000,000 primitive increments.
» Single thread execution: ~ 7ms
▪ No concurrency
Multi-threading test
long value = 0;
while (value < 10000000L) {
    value++;
}
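The single-thread baseline can be reproduced with a sketch along these lines. The class and method names are illustrative and not from the talk. Timings depend heavily on JIT warm-up and hardware, so the ~7 ms figure should be read as specific to the speaker's machine.

```java
// Illustrative sketch: times the uncontended increment loop from the slide.
final class IncrementBaseline {
    static final long ITERATIONS = 10_000_000L;

    // Runs the loop and returns the final counter value, so the JIT
    // cannot discard the work as dead code.
    static long run() {
        long value = 0;
        while (value < ITERATIONS) {
            value++;
        }
        return value;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("value=" + result + ", elapsed=" + elapsedMs + " ms");
    }
}
```

A single cold run mostly measures interpreter and compilation overhead; repeating `run()` several times before timing gives a steadier number.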
» Synchronized approach:
synchronized (syncObj) {
    value++;
}
» Mutual exclusion (lock) approach:
java.util.concurrent.locks.ReentrantLock lock = …
lock.lock();
try {
    value++;
} finally {
    lock.unlock();
}
» AtomicLong:
AtomicLong value = new AtomicLong(0);
long result = value.incrementAndGet();
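To compare the three approaches under identical conditions, they can be wrapped behind methods of one class, as in this minimal sketch. The class name `CounterVariants` and its method names are assumptions made for illustration; only `synchronized`, `ReentrantLock` and `AtomicLong` come from the slides.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: one counter per synchronization style.
final class CounterVariants {
    private final Object syncObj = new Object();
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicLong atomicValue = new AtomicLong(0);
    private long syncValue = 0;
    private long lockValue = 0;

    // Intrinsic monitor lock.
    long incrementSynchronized() {
        synchronized (syncObj) {
            return ++syncValue;
        }
    }

    // Explicit lock; unlock in finally so an exception cannot leak the lock.
    long incrementWithLock() {
        lock.lock();
        try {
            return ++lockValue;
        } finally {
            lock.unlock();
        }
    }

    // Lock-free compare-and-swap (CAS) based increment.
    long incrementAtomic() {
        return atomicValue.incrementAndGet();
    }
}
```

Calling each method 10,000,000 times in a loop and timing it with `System.nanoTime()` gives figures comparable in spirit to the ones reported in the talk, although absolute numbers will differ per JVM and CPU.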
» Single thread, bare execution (value++)
▪ About 7 milliseconds
» Single thread, AtomicLong
▪ 68 milliseconds (x8.5)
» Single thread with lock
▪ 125 milliseconds (x15.5)
» Single thread with synchronized approach
▪ 450 milliseconds (x56)
» Single thread, bare execution (value++)
▪ About 7 milliseconds
» Two threads, AtomicLong
▪ 270 milliseconds (x33.7)
» Two threads with lock
▪ 298 milliseconds (x37.5)
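The contended case can be sketched as several threads hammering one shared `AtomicLong`. The class `ContendedIncrement` and its parameters are hypothetical, and the talk does not say how the 10,000,000 increments were split between threads, so the split below is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: N threads increment one shared AtomicLong.
final class ContendedIncrement {
    // Starts `threads` workers, waits for them, and returns the final count.
    static long run(int threads, final long incrementsPerThread) {
        final AtomicLong value = new AtomicLong(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (long n = 0; n < incrementsPerThread; n++) {
                    value.incrementAndGet(); // every call contends on the same cache line
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return value.get();
    }
}
```

The result is always correct, but each increment forces the cores to exchange ownership of the counter's cache line, which is consistent with the slowdown shown in the measurements.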
[Chart: Concurrent execution latency (increment). X-axis: number of threads. Y-axis: time (ms). Series: Synchronized, CAS, Lock.]
[Chart: Concurrent execution latency (PI calculation). X-axis: number of threads. Y-axis: time (ms). Series: Synchronized, Lock.]
“A good preliminary design
beats any last-minute patch”
Guy (Raz) Nir, The Edge 2012
» Per core (cores #1 to #4):
▪ 64-bit registers
▪ L1 cache: 32KB instructions + 32KB data
▪ L2 cache: 256KB
» Shared by all cores:
▪ L3 cache: 2MB to 16MB
Model CPU architecture
[Chart: Non-volatile vs volatile. Y-axis: time (ms). Bars: Non-volatile, Volatile. The only data label that survives from the chart is 7.]
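The comparison in the chart can be approximated with a sketch like this one, which increments a plain field and a `volatile` field the same number of times. The class `VolatileCost` is hypothetical. Both methods are meant to be called from a single thread, because `++` on a `volatile` field is a read followed by a write and is not atomic.

```java
// Illustrative sketch: same loop, once on a plain field and once on a volatile one.
final class VolatileCost {
    private long plain = 0;
    private volatile long shared = 0;

    // The JIT may keep `plain` in a register for the whole loop.
    long incrementPlain(long iterations) {
        for (long i = 0; i < iterations; i++) {
            plain++;
        }
        return plain;
    }

    // Every volatile write must become visible to other cores,
    // so the value cannot simply stay in a register.
    long incrementVolatile(long iterations) {
        for (long i = 0; i < iterations; i++) {
            shared++;
        }
        return shared;
    }
}
```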
The (not so good) alternatives
» Linked-list based queue
▪ Requires re-allocation of units
▪ Memory fragmentation
▪ Garbage collection
▪ Bad contention
#0 #1 #2 #3 #4
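The allocation and garbage-collection cost of a linked-list queue can be seen in a minimal sketch such as the following. `LinkedQueueSketch` and its allocation counter are hypothetical and exist only to make the per-element `new` visible.

```java
// Illustrative sketch: a linked queue allocates one node per element.
final class LinkedQueueSketch<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private Node<E> head;
    private Node<E> tail;
    private long allocations = 0;

    synchronized void offer(E e) {
        Node<E> node = new Node<E>(e); // a fresh heap object for every element
        allocations++;
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
    }

    synchronized E poll() {
        if (head == null) {
            return null;
        }
        E item = head.item;
        head = head.next;              // the old node is now garbage for the GC
        if (head == null) {
            tail = null;
        }
        return item;
    }

    synchronized long allocations() {
        return allocations;
    }
}
```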
» Cyclic array-based queue
▪ Bad contention
#0 #1 #2 #3 #4
Head
Tail
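A cyclic array-based queue avoids per-element allocation, but a conventional implementation still funnels producers and consumers through one lock that guards head, tail and count together. The sketch below is an assumption-laden illustration (the class `CyclicArrayQueue` is hypothetical), not the JDK implementation.

```java
// Illustrative sketch: bounded cyclic queue with head and tail indices.
final class CyclicArrayQueue<E> {
    private final Object[] slots;
    private int head = 0;  // index of the next element to take
    private int tail = 0;  // index of the next free slot
    private int count = 0; // number of stored elements

    CyclicArrayQueue(int capacity) {
        slots = new Object[capacity];
    }

    // Producers and consumers share the same monitor: this is the contention point.
    synchronized boolean offer(E e) {
        if (count == slots.length) {
            return false;                  // queue is full
        }
        slots[tail] = e;
        tail = (tail + 1) % slots.length;  // wrap around
        count++;
        return true;
    }

    @SuppressWarnings("unchecked")
    synchronized E poll() {
        if (count == 0) {
            return null;                   // queue is empty
        }
        E e = (E) slots[head];
        slots[head] = null;
        head = (head + 1) % slots.length;  // wrap around
        count--;
        return e;
    }
}
```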
java.util.concurrent.ArrayBlockingQueue (abridged)
// Put a new element in the queue.
public boolean offer(E e, long timeout, TimeUnit unit) throws InterruptedException {
    // Acquire the queue's single lock for writing.
    final ReentrantLock lock = this.lock;
    lock.lockInterruptibly();
    // ...
}
// Take one element from the queue.
public E poll() {
    // Acquire the same lock for reading.
    final ReentrantLock lock = this.lock;
    lock.lock();
    // ...
}
Sun (Oracle) JDK 1.7.0_u2
» General purpose assumptions:
▪ Multiple readers, multiple writers
▪ Queues can run as big as memory
▪ Other operations that degrade design
» No regard for the underlying hardware
Other problems
Architecture
» Barriers
» Ring buffer
» Sequences
Main components
Barriers
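The ring buffer
[Diagram: a ring of numbered slots (1 to 4) with two pointers, "Next read sequence" and "Available sequence".]
MyDataType[] buffer = ...;
// The sequence keeps growing; modulo maps it onto a slot of the array.
int offset = (int) (sequence % buffer.length);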
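The mapping from an ever-growing sequence number to an array slot can be sketched in two ways: plain modulo, as on the slide, and a bit mask, a common optimisation that only works when the buffer size is a power of two. The class `RingIndex` and its method names are hypothetical.

```java
// Illustrative sketch: two ways to map a sequence to a ring-buffer slot.
final class RingIndex {
    // General case: works for any positive buffer size.
    static int indexByModulo(long sequence, int bufferSize) {
        return (int) (sequence % bufferSize);
    }

    // Power-of-two sizes only: a mask replaces the division.
    static int indexByMask(long sequence, int bufferSize) {
        if (bufferSize <= 0 || Integer.bitCount(bufferSize) != 1) {
            throw new IllegalArgumentException("bufferSize must be a power of two");
        }
        return (int) (sequence & (bufferSize - 1));
    }
}
```

For a buffer of 8 entries, sequence 10 lands in slot 2 with either method, since 10 % 8 and 10 & 7 are both 2.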
» Array-based cyclic buffer.
▪ Fast index-based access.
» Allows us to allocate all entries in advance
▪ Saves GC time
▪ Contiguous block allocation
▪ Saves the cost of new at runtime.
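Pre-allocation can be sketched as filling the array once, up front, with reusable entry objects that producers later overwrite in place. The class `PreallocatedRing` is hypothetical, and `java.util.function.Supplier` stands in here for whatever factory type the real library uses.

```java
import java.util.function.Supplier;

// Illustrative sketch: every entry is created once, at construction time.
final class PreallocatedRing<T> {
    private final Object[] entries;

    PreallocatedRing(int size, Supplier<T> factory) {
        entries = new Object[size];
        for (int i = 0; i < size; i++) {
            entries[i] = factory.get(); // the only allocations this ring ever makes
        }
    }

    // Returns the reusable entry for a sequence; callers mutate it in place.
    @SuppressWarnings("unchecked")
    T get(long sequence) {
        return (T) entries[(int) (sequence % entries.length)];
    }
}
```

Because sequences 1 and 5 map to the same slot in a ring of four, `get(1)` and `get(5)` return the very same object, so no garbage is produced as the ring wraps.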
The ring buffer
[Diagram: ring buffer with five numbered slots.]
Barriers
[Diagram: ring buffer with five numbered slots. The producer holds "sequence" and the consumer holds "nextSequence".]
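public class StandardProducer {
    public void offer(Object o) {
        // ...
    }
}
// Single-threaded producer example.
public class DisruptorProducer {
    private RingBuffer buffer;
    public void addMessage(String message, long timestamp) {
        // Claim the next sequence and map it onto a slot of the ring.
        long seq = buffer.writeSequenceNumber++;
        int index = (int) (seq % buffer.data.length);
        buffer.data[index].msg = message;
        buffer.data[index].timestamp = timestamp;
        // Publish last, so consumers only ever see fully written entries.
        buffer.availableSequenceNumber = seq;
    }
}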
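// Single-threaded consumer example.
public class DisruptorConsumer {
    private RingBuffer buffer;
    long nextSequenceNumber;
    public Object take() {
        // Wait until the producer has published the sequence we want to read.
        while (nextSequenceNumber > buffer.availableSequenceNumber) { .. }
        return buffer.get(nextSequenceNumber++);
    }
}
[Diagram: ring buffer showing the buffer's available sequence number ahead of the consumer's own sequence number.]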
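The producer and consumer pseudocode can be combined into one runnable sketch for exactly one producer thread and one consumer thread. Everything here is an assumption for illustration: the class `SpscRing`, its field names, the use of `long` payloads, and `Thread.yield()` as the waiting policy. The real library offers pluggable wait strategies and is considerably more refined.

```java
// Illustrative sketch: single-producer, single-consumer ring with volatile sequences.
final class SpscRing {
    private final long[] data;
    private volatile long published = -1; // highest sequence visible to the consumer
    private volatile long consumed = -1;  // highest sequence the consumer is done with
    private long nextToWrite = 0;         // touched by the producer thread only
    private long nextToRead = 0;          // touched by the consumer thread only

    SpscRing(int size) {
        data = new long[size];
    }

    // Call from the producer thread only.
    void put(long value) {
        long seq = nextToWrite++;
        // The slot for seq was last used by seq - size; wait until that was consumed.
        while (seq - consumed > data.length) {
            Thread.yield();
        }
        data[(int) (seq % data.length)] = value;
        published = seq; // volatile write: makes the slot contents visible
    }

    // Call from the consumer thread only.
    long take() {
        long seq = nextToRead++;
        // Wait until the producer has published this sequence.
        while (seq > published) {
            Thread.yield();
        }
        long value = data[(int) (seq % data.length)];
        consumed = seq;  // volatile write: frees the slot for reuse
        return value;
    }
}
```

No lock is taken anywhere: each thread writes only its own sequence and reads the other's, which is the core idea the slides build towards.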
Multi consumers
[Diagram: ring buffer with five numbered slots, a sequence barrier, and three consumers at nextSequence = 2, nextSequence = 3 and nextSequence = 4.]
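One way to read the diagram: with several consumers at different positions, the producer must not overwrite a slot that the slowest consumer has not read yet. The sketch below shows that bookkeeping under stated assumptions. `GatingSketch` and its methods are hypothetical names, and each consumer value is taken to mean "the next sequence this consumer will read". In the Disruptor itself this tracking is handled by the library.

```java
// Illustrative sketch: the slowest consumer limits how far the producer may run ahead.
final class GatingSketch {
    // Lowest "next sequence to read" among all consumers, i.e. the slowest reader.
    static long minimumSequence(long[] consumerNextSequences) {
        long min = Long.MAX_VALUE;
        for (long s : consumerNextSequences) {
            min = Math.min(min, s);
        }
        return min;
    }

    // True when writing nextToWrite would not overwrite a slot still waiting to be read.
    static boolean canClaim(long nextToWrite, int bufferSize, long[] consumerNextSequences) {
        return nextToWrite - minimumSequence(consumerNextSequences) < bufferSize;
    }
}
```

With five slots and consumers at 2, 3 and 4, sequence 6 reuses the slot of sequence 1, which everyone has read, so it may be claimed. Sequence 7 would reuse the slot of sequence 2, which the slowest consumer still needs.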
» Allows us to fetch multiple elements at once.
» Uses event processors
▪ Callbacks
Batches & Events
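Batching with callbacks can be sketched as follows: once the consumer learns the highest available sequence, it delivers every entry up to that point in one pass and tells the callback which entry closes the batch. The `Handler` interface and the `drain` method are hypothetical and only loosely modelled on an event-handler style API.

```java
// Illustrative sketch: deliver a whole batch of entries through a callback.
final class BatchSketch {
    // Hypothetical callback type.
    interface Handler<T> {
        void onEvent(T event, long sequence, boolean endOfBatch);
    }

    // Hands every entry from nextSequence to availableSequence (inclusive) to the
    // handler and returns the next sequence the consumer should read.
    static <T> long drain(T[] ring, long nextSequence, long availableSequence, Handler<T> handler) {
        for (long seq = nextSequence; seq <= availableSequence; seq++) {
            T event = ring[(int) (seq % ring.length)];
            handler.onEvent(event, seq, seq == availableSequence);
        }
        // Nothing available yet: stay where we are.
        return Math.max(nextSequence, availableSequence + 1);
    }
}
```

The `endOfBatch` flag lets a handler defer expensive work, such as flushing to disk or the network, until the last entry of a batch.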
Code sample – Create ring buffer
//
// Create a new ring buffer.
//
RingBuffer<MyEvent> ringBuffer = new RingBuffer<MyEvent>(
new MyOwnFactory(),
new SingleThreadedClaimStrategy(sizeOfRing),
new SleepingWaitStrategy());
Code sample - Producer
// Request the next available sequence number.
long sequence = ringBuffer.next();
// Fetch the object at that location.
MyEvent event = ringBuffer.get(sequence);
//
// ... do something with the event.
//
// Notify the rest of the world this event is ready to be consumed.
ringBuffer.publish(sequence);
Code sample - Consumer
// Extract a consumer's barrier.
SequenceBarrier barrier = ringBuffer.newBarrier();
// Wait for an event to come.
barrier.waitFor(nextSequence);
// Take the event (data).
MyEvent event = ringBuffer.get(nextSequence);
» Disruptor is a smart queue.
» Latest release is 2.8
» Exploits hardware acceleration points.
» Won the Duke’s 2011 award for innovation!
Summary
» Google code:
▪ http://code.google.com/p/disruptor/
» Technical paper:
▪ http://disruptor.googlecode.com/files/Disruptor-1.0.pdf
» Martin Thompson’s blog:
▪ http://mechanical-sympathy.blogspot.com
» Trisha Gee’s blog:
▪ http://mechanitis.blogspot.com/
» InfoQ on Disruptor (session video):
▪ http://www.infoq.com/presentations/LMAX
References
Guy Raz Nir
guyn@alphacsp.com


Editor's Notes

  1. 16 threads. About 375K TPS per thread. About 518 billion transactions per day. At 1 cent per transaction, that is about 5 billion Euros per day. Won the Duke’s 2011 award for innovation!
  2. Sun (Oracle) JDK 1.7. Intel i7 2600K (Sandy Bridge) + overclocking.
  3. Intel i7 2600K (Sandy Bridge). L1 cache speed: 450GB/sec. CPU to memory speed: about 18GB/sec (x25 slower than L1).
  4. Single-threaded example
  5. Single-threaded example
  6. Producers work in the same way. Disruptor provides various barriers for various models.