Building FoundationDB

Building a next-generation database

david [dot] rosenthal@foundationdb.com
Twitter: @FoundationDB

Motivation
Ease of building successful applications:
• High performance
• Ease of scaling out
• Ease of building abstractions
• Ease of operation

Historical Perspective: 2008
Future

NoSQL doesn’t really exist yet

Databases in 2008
Relational is entrenched; NoSQL emerging
with some interesting advantages:
• Voldemort
• Cassandra
• HBase
…but the fine print about data guarantees
doesn’t look so good.

The CAP2008 theorem
• Brewer: Pick 2 out of 3
• Werner Vogels (CTO Amazon.com): “Data
inconsistency in large-scale reliable
distributed systems has to be tolerated …
[for performance and to handle faults]”
• Wrong descriptions all over the web: “The
availability property means that the system
is ‘online’ and the client of the system can
expect to receive a response for its
request.”

CAP2008 Conclusions?
• Scaling requires distributed design
• Distributed requires high availability
• Availability requires no C

So, if we want scalability we have to give up C,
the cornerstone of ACID.

Right?

Thinking about CAP2008
• Is a partition worse than a failure?
• Three computers can’t agree?
• Keyword: Availability…

Availability != high availability

Flash forward to CAP2012
• Brewer: “Why ‘2 of 3’ is misleading”
• Brewer: “CAP prohibits … perfect availability”
• Vogels: “Achieving strict consistency can come at
a cost in update or read latency, and may result in
lower throughput…”
• Google (Spanner): “…it is better to have
application programmers deal with performance
problems due to overuse of transactions as
bottlenecks arise, rather than always coding
around the lack of transactions.”

The FoundationDB concept
• Attack CAP2008 and deliver transactions at
NoSQL performance and scale
• Reduce core to minimal feature set
• Add features back with higher-level
abstractions—“Layers”
• Decouple choice of data model and
choice of storage technology

FoundationDB
Database software:
• Ordered key-value API
• Scalable
• Transactional
• Fault tolerant

[Diagram: Application / Layer / Key-value API]
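
To make "ordered key-value API" concrete, here is a minimal in-memory sketch. It is not FoundationDB's API: the class `OrderedKV` and its method names are hypothetical, chosen only to show why key ordering matters (a range read returns every key sharing a prefix, in order).

```python
import bisect


class OrderedKV:
    """Toy in-memory ordered key-value store (hypothetical, for illustration)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def clear(self, key):
        if key in self._values:
            del self._values[key]
            del self._keys[bisect.bisect_left(self._keys, key)]

    def get_range(self, begin, end):
        """Return (key, value) pairs with begin <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


db = OrderedKV()
db.set(b"user/2", b"bob")
db.set(b"user/1", b"alice")
db.set(b"zone/1", b"eu")

# b"0" is the byte right after b"/", so this range covers the "user/" prefix.
users = db.get_range(b"user/", b"user0")
print(users)
```

Because keys are ordered, related records can be placed next to each other and fetched with one range read, which is the property higher-level layers rely on.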

Engineering pressures
Engineering challenge | Strategy
Engineering for extreme reliability and fault tolerance of large clusters under adverse conditions | Simulation
Many asynchronous communicating processes | Erlang?
Fast algorithms; efficient I/O | C++

We need new tools!

First tool: Flow
• A new programming language
• Adds actor-model concurrency to C++11
• New keywords: ACTOR, future, promise,
wait, choose, when, streams
• Flow code -> C++11 code -> binary
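
Flow code itself is not shown in these slides. As a rough analogy only, the sketch below uses Python's asyncio to illustrate the ideas behind the keywords: `await` plays the role of `wait` on a future, and waiting for the first of several futures plays the role of `choose`/`when`. The mapping is an assumption made for illustration, and the function names are hypothetical.

```python
import asyncio


async def delayed_value(value, delay):
    """Stands in for an actor that produces a value after some time."""
    await asyncio.sleep(delay)
    return value


async def get_with_timeout(delay, timeout):
    """Wait for whichever finishes first: the reply or the timer."""
    reply = asyncio.ensure_future(delayed_value("reply", delay))
    timer = asyncio.ensure_future(delayed_value("timed out", timeout))
    done, pending = await asyncio.wait(
        {reply, timer}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the losing branch is abandoned
    return done.pop().result()


fast = asyncio.run(get_with_timeout(delay=0.01, timeout=0.5))
slow = asyncio.run(get_with_timeout(delay=0.5, timeout=0.01))
print(fast, "/", slow)
```

The point of the style is that asynchronous logic reads like sequential code while still running as many cooperating actors on one thread.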

Seriously?

Flow allows…
• Testability by enabling simulation.
• Performance by compiling to native.
• Easier ACTOR-model coding.

Flow performance
Joe Armstrong (author of “Programming Erlang”):

“Write a ring benchmark. Create N processes in a ring.
Send a message round the ring M times so that a total
of N * M messages get sent. Time how long this takes
for different values of N and M. Write a similar
program in some other programming language you are
familiar with. Compare the results. Write a blog, and
publish the results on the internet!”

Flow performance
(N=1000, M=1000)
• Ruby (using threads): 1990 seconds
• Ruby (queues): 360 seconds
• Objective C (using threads): 26 seconds
• Java (threads): 12 seconds
• Stackless Python: 1.68 seconds
• Erlang: 1.09 seconds
• Google Go: 0.87 seconds
• Flow: 0.075 seconds
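
For readers who want to try the ring benchmark themselves, here is a minimal sketch using Python's asyncio. It is not the code behind any of the numbers above, and its timings are not comparable to them; the function names and the hop-counting token are assumptions of this sketch.

```python
import asyncio
import time


async def node(inbox, outbox, done, counter):
    """Receive a token and forward it until its remaining-hop count hits zero."""
    while True:
        remaining = await inbox.get()
        if remaining == 0:
            done.set()  # the last message has arrived
        else:
            counter[0] += 1
            await outbox.put(remaining - 1)


async def run_ring(n, m):
    """Send one token around a ring of n nodes m times (n * m messages)."""
    queues = [asyncio.Queue() for _ in range(n)]
    done = asyncio.Event()
    counter = [0]  # total messages sent
    tasks = [
        asyncio.ensure_future(node(queues[i], queues[(i + 1) % n], done, counter))
        for i in range(n)
    ]
    start = time.perf_counter()
    counter[0] += 1  # the injected token counts as the first message
    await queues[0].put(n * m - 1)
    await done.wait()
    elapsed = time.perf_counter() - start
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return counter[0], elapsed


sent, elapsed = asyncio.run(run_ring(100, 100))
print(f"{sent} messages in {elapsed:.3f} s")
```

Raising `n` and `m` to 1000 each reproduces the message count used on the slide.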

Second Tool: Lithium
• Enabled by Flow
• Simulate physical interfaces
• Simulate failure modes
• Deterministic simulation of entire system
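
The slides do not show how Lithium works internally. The sketch below only illustrates the general idea of deterministic simulation under simple assumptions: all randomness (latency, message drops) comes from one seeded generator and time is a simulated clock, so the same seed always replays the same run. The function `simulate` and its parameters are hypothetical.

```python
import heapq
import random


def simulate(seed, n_messages=20, drop_rate=0.2):
    """Run a toy simulated network and return its event trace."""
    rng = random.Random(seed)  # every random decision flows from this seed
    events = []                # priority queue of (delivery time, message id)
    trace = []
    for msg in range(n_messages):
        send_time = float(msg)  # simulated time, never the wall clock
        if rng.random() < drop_rate:
            trace.append((send_time, msg, "dropped"))  # injected failure
            continue
        latency = rng.uniform(0.1, 5.0)
        heapq.heappush(events, (send_time + latency, msg))
    # Drops were recorded at send time; deliveries are now replayed
    # in simulated-time order, which may differ from send order.
    while events:
        clock, msg = heapq.heappop(events)
        trace.append((round(clock, 6), msg, "delivered"))
    return trace


first = simulate(seed=42)
second = simulate(seed=42)
print(first == second)
```

Because a failing run can be replayed exactly from its seed, a bug found under simulated faults can be reproduced and debugged instead of chased.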

Traditional approaches
• Glue together smaller transactional
systems
– Two-phase commit (X/Open XA)
– Paxos
• Build on a distributed file system
– BigTable/HBase

The FoundationDB approach
• Deconstruct a traditional transactional
database and scale the individual parts
• Each part must also be fault tolerant
• The parts:
– Accept requests
– Check for transaction conflicts
– Log transactions
– Store data

Key insight
Checking for transaction conflicts
• Problem is scalable
• When highly optimized, is a small
percentage of the total work.
• Is tricky to make fault tolerant…
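
The slides do not describe the conflict-checking algorithm. As an illustration of what such a check can look like, here is a generic optimistic scheme: a transaction commits only if none of the keys it read were written after its read version. The class `ConflictChecker` and its methods are hypothetical and are not FoundationDB's implementation.

```python
class ConflictChecker:
    """Toy optimistic conflict check keyed by last committed write version."""

    def __init__(self):
        self.version = 0
        self.last_write = {}  # key -> version of the most recent committed write

    def read_version(self):
        return self.version

    def try_commit(self, read_version, read_keys, write_keys):
        """Return the commit version, or None if a read key changed since read_version."""
        for key in read_keys:
            if self.last_write.get(key, 0) > read_version:
                return None  # conflict: the caller should retry
        self.version += 1
        for key in write_keys:
            self.last_write[key] = self.version
        return self.version


checker = ConflictChecker()
rv_a = checker.read_version()
rv_b = checker.read_version()

# A and B both read "x" at the same version; A commits a write to "x" first.
a = checker.try_commit(rv_a, read_keys={"x"}, write_keys={"x"})
b = checker.try_commit(rv_b, read_keys={"x"}, write_keys={"y"})
# B retries with a fresh read version and now succeeds.
c = checker.try_commit(checker.read_version(), read_keys={"x"}, write_keys={"y"})
print(a, b, c)
```

Note that the check only needs keys and versions, not the data itself, which is one reason it can be a small share of the total work.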

Training montage
• Paxos coordination algorithm
• Multi-versioned data structures
• SSD optimizations
• Application-managed page cache
• Prioritization deeply integrated
• Control theory for queue sizes
• Testing, testing, testing

Did we reach our big goals?
• High performance
• Ease of scaling out
• Ease of building abstractions
• Ease of operation

High performance
FoundationDB
delivers performance
exceeding other
NoSQL databases, but
with transactions!

Ease of scaling out
• Add and remove nodes on-the-fly
• Single key-space with global transactions
• Validated to 96 cores, 48 SSDs

Ease of building abstractions
• Transactions enable abstraction
• Abstractions are very hard to build on
non-transactional systems
• Ordered data model for performance

Abstractions built on a scalable, fault
tolerant, transactional foundation inherit those
properties.
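
A small sketch of why transactions make layers easy: a secondary index stays consistent because the record and its index entry are written in one all-or-nothing step. `TinyTxnStore` is a hypothetical single-process stand-in for a transactional key-value store, not a real binding, and the key layout is an assumption of this sketch.

```python
_DELETE = object()  # marker for a buffered delete


class TinyTxnStore:
    """Toy store: a transaction's writes are applied together or not at all."""

    def __init__(self):
        self.data = {}

    def transact(self, fn):
        writes = {}
        result = fn(self.data, writes)  # fn reads data and buffers its writes
        # Reached only if fn raised nothing, so partial updates never land.
        for key, value in writes.items():
            if value is _DELETE:
                self.data.pop(key, None)
            else:
                self.data[key] = value
        return result


def set_city(store, user_id, city):
    """Layer operation: update a user's city and the by-city index together."""
    def body(data, writes):
        old = data.get(("user", user_id))
        if old is not None:
            writes[("by_city", old, user_id)] = _DELETE
        writes[("user", user_id)] = city
        writes[("by_city", city, user_id)] = ""
    store.transact(body)


def users_in(store, city):
    return sorted(k[2] for k in store.data if k[0] == "by_city" and k[1] == city)


store = TinyTxnStore()
set_city(store, 1, "Paris")
set_city(store, 2, "Paris")
set_city(store, 1, "Rome")
print(users_in(store, "Paris"), users_in(store, "Rome"))
```

Without the transaction, a crash between the two writes would leave an index entry pointing at the wrong city, and every layer would need its own repair logic.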

Examples of “ease”
• SQL database in one day
• Indexed table layer (3 days * 1 intern)
• Fractal spatial index in 200 lines
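
The 200-line index mentioned on the slide is not reproduced in this transcript. The sketch below is a separate illustration of one common way to build a spatial index on ordered keys: Z-order (Morton) encoding, which interleaves coordinate bits so that a bounding box maps to a single key range plus a filter. All names are hypothetical, and a sorted list stands in for the ordered key-value store.

```python
import bisect


def interleave(x, y, bits=16):
    """Interleave the bits of x and y into one Z-order (Morton) code."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


class ZIndex:
    """Toy spatial index over a sorted list of (z, x, y, name) entries."""

    def __init__(self):
        self._entries = []

    def insert(self, x, y, name):
        bisect.insort(self._entries, (interleave(x, y), x, y, name))

    def query(self, x0, y0, x1, y1):
        """Return names of points inside the inclusive box [x0, x1] x [y0, y1]."""
        # Every point in the box has a code between the codes of its corners.
        lo = bisect.bisect_left(self._entries, (interleave(x0, y0),))
        hi = bisect.bisect_right(self._entries, (interleave(x1, y1), float("inf")))
        # The code range can include points outside the box, so filter them.
        return [
            name
            for (_, x, y, name) in self._entries[lo:hi]
            if x0 <= x <= x1 and y0 <= y <= y1
        ]


index = ZIndex()
index.insert(2, 3, "a")
index.insert(10, 10, "b")
index.insert(3, 2, "c")
index.insert(7, 1, "d")
hits = index.query(0, 0, 4, 4)
print(hits)
```

A production index would split the box into several tighter code ranges to cut down on filtered-out points; this sketch scans a single range for brevity.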

Ease of operation
• Automatic data partitioning/replication
• Highly fault-tolerant
• Minimal management

Try to break it yourself!

Conclusion
• Our mission is to solve the problem of state
management so that developers can focus on
building their applications
• 3+ years in the making, now ready for your
applications
• Bindings for C, Python, JVM, Node.js, Ruby
