KeyValue Stores

Jedi Master Edition

Who?
Antonio Garrote
@antoniogarrote

Mauro Pompilio
@malditogeek

Pablo Delgado
@pablete

Agenda
•Why?
•Definitions
•CouchDB
•Redis
•Cassandra
•Ruby Libraries
•Demo application
•Data modeling
•Benchmark

Why?
•Scalability
•Availability
•Fault Tolerance
•Schema-free
•Ease of use
•Performance
•Elasticity
•blah blah blah

NoSQL != No SQL
NoSQL = Not Only SQL

Taxonomy
•Key-value stores:
Redis, Voldemort, Cassandra
•Column-oriented datastores:
Cassandra, HBase
•Document collection databases:
CouchDB, MongoDB
•Graph database:
Neo4J, AllegroGraph
•Data structure store:
Redis

CouchDB
relax!
•Damien Katz
•Erlang - OTP compliant
•schema-less documents
•high availability
•completely distributed
•made for the web

CouchDB

B-Trees . MapReduce . MVCC

Ruby Libraries
•CouchDB

•Pure: net/http + JSON implementation

•Thin wrapper: Couchrest
http://github.com/jchris/couchrest

•ORM/ActiveRecord: ActiveCouch, CouchObject, RelaxDB, etc.
http://github.com/arunthampi/activecouch
http://github.com/paulcarey/relaxdb
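
The "pure" option can be sketched with Ruby's standard library alone. The snippet below builds, but does not send, the PUT request that would store a JSON document. The database name `blog`, the document ID `first_post`, and a CouchDB listening on localhost:5984 are assumptions made for illustration.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical database and document ID, for illustration only.
uri = URI('http://localhost:5984/blog/first_post')

doc = { 'type' => 'post', 'title' => 'Hello CouchDB' }

# CouchDB stores a document on PUT /<db>/<doc_id> with a JSON body.
request = Net::HTTP::Put.new(uri.path, 'Content-Type' => 'application/json')
request.body = JSON.generate(doc)

# Sending it needs a running CouchDB instance, so it is left commented out:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# JSON.parse(response.body)
```

The wrapper and ORM libraries listed above hide this request building behind friendlier method calls.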

CouchDB
•Rocks
•Simplicity and elegance
•Much more than a DB
•New possibilities for web apps

•Sucks
•Speed
•Speed
•Speed

Redis
il meglio d'Italia

classy as a Giulietta
tasty as a pizza

Redis
•Salvatore 'antirez' Sanfilippo
•ANSI C - POSIX compliant

•MemCache-like (on steroids)
•Data structures store:
•strings
•counters
•lists
•sets + sorted sets (>= 1.1)

Ruby Libraries
•Redis

•Client: redis-rb
http://github.com/ezmobius/redis-rb

•Hash/Object mapper: Ohm
http://github.com/soveran/ohm

•ORM: RedisRecord
http://github.com/malditogeek/redisrecord

Redis
require 'redis'
redis = Redis.new

# Strings
redis['foo'] = 'bar' # => 'bar'
redis['foo'] # => 'bar'

# Expirations
redis.expire('foo', 5) # will expire existing key 'foo' in 5 sec
redis.set('foo', 'bar', 5) # set 'foo' with 5 sec expiration

# Counters
redis.incr('counter') # => 1
redis.incr('counter', 10) # => 11
redis.decr('counter') # => 10

Redis
# Lists
%w(1st 2nd 3rd).each { |item| redis.push_tail('logs', item) }
redis.list_range('logs', 0, -1) # => ["1st", "2nd", "3rd"]
redis.pop_head('logs') # => "1st"
redis.pop_tail('logs') # => "3rd"

# Sets
%w(one two).each { |item| redis.set_add('foo-tags', item) }
%w(two three).each { |item| redis.set_add('bar-tags', item) }
redis.set_intersect('foo-tags', 'bar-tags') # => ["two"]
redis.set_union('foo-tags', 'bar-tags') # => ["three", "two", "one"]

Redis
•Rocks
•Speed, in memory dataset
•Asynchronous, non-blocking persistence
•Non-blocking replication
•Data structures with atomic operations
•Ease of use and deployment
•Sucks
•Sharding (client-side only at the moment)
•Datasets > RAM
•Very frequent code updates (?)

Redis
Upcoming coolness...

•1.1
•Sorted sets (ZSET), append-only journaling
•1.2
•HASH type, JSON dump tool
•1.3
•Virtual memory (datasets > RAM)
•1.4
•Redis-cluster proxy: consistent hashing and fault-tolerant nodes
•1.5
•Optimizations, UDP GET/SET
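
Consistent hashing, mentioned for the planned cluster proxy and equally usable for client-side sharding today, can be sketched as follows. This is a generic illustration of the technique, not how Redis implements or will implement it. The `HashRing` class, the node names, and the choice of MD5 with 100 virtual points per node are all assumptions.

```ruby
require 'digest/md5'

# Minimal consistent-hash ring (hypothetical helper, not part of any Redis library).
class HashRing
  def initialize(nodes, replicas = 100)
    @ring = {}
    nodes.each do |node|
      # Several virtual points per node even out the key distribution.
      replicas.times { |i| @ring[hash_of("#{node}:#{i}")] = node }
    end
    @sorted_points = @ring.keys.sort
  end

  # The owner is the node at the first ring point at or after the key's hash,
  # wrapping around to the first point when the hash is past the last one.
  def node_for(key)
    h = hash_of(key)
    point = @sorted_points.bsearch { |p| p >= h } || @sorted_points.first
    @ring[point]
  end

  private

  # Use the first 32 bits of the MD5 digest as the position on the ring.
  def hash_of(value)
    Digest::MD5.hexdigest(value)[0, 8].to_i(16)
  end
end

ring = HashRing.new(%w[redis-a redis-b redis-c])
owner = ring.node_for('Post:5:title')
```

The point of the ring is that removing a node only moves the keys that node owned; every other key keeps its owner.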

Cassandra

BigTable (by Google) + Dynamo (by Amazon)

Cassandra
Structured storage system over a P2P network

•Developed at Facebook
•Java

•Dynamo: partitioning and replication
•Bigtable: log-structured ColumnFamily data model

Ruby Libraries
•Cassandra

•Client: cassandra
http://github.com/fauna/cassandra

•ORM: cassandra_object
http://github.com/NZKoz/cassandra_object

•ORM: BigRecord
http://github.com/openplaces/bigrecord

Cassandra
•Rocks
•High Availability
•Incremental Scalability
•Minimal Administration
•No Single Point of Failure
•Sucks
•Thrift API (...not so bad)
•Change Schema, restart server
•The Logo

Demo Application

http://github.com/antoniogarrote/conf_rails_hispana_2009

Data Modeling
•Class mapping
•ID generation
•Relationships
•one-to-one
•one-to-many
•many-to-many
•Index sorting
•Pagination
•Data filtering

Cassandra
•Class mapping
• ColumnFamily :Blog, :Post

•ID generation
•UUID.new(Time.now)

•Relationships
•Use ColumnFamily :PostsforUser to hold all posts that belong to a user
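
The three ideas above can be sketched with nested Ruby hashes standing in for column families. This is an in-memory illustration only, not the cassandra gem's API. The `time_ordered_id` helper is a hypothetical stand-in for the time-based UUID on the slide, and the sample user and post are made up.

```ruby
require 'securerandom'

# In-memory stand-in for a keyspace: column family => row key => columns.
store = Hash.new { |cfs, cf| cfs[cf] = Hash.new { |rows, key| rows[key] = {} } }

# Hypothetical ID helper: a millisecond timestamp prefix keeps IDs roughly
# time-ordered, and the random suffix avoids collisions within a millisecond.
def time_ordered_id(now = Time.now)
  format('%013d-%s', (now.to_f * 1000).to_i, SecureRandom.hex(4))
end

# Class mapping: one row in the :Post column family per post.
post_id = time_ordered_id
store[:Post][post_id] = { 'title' => 'First post', 'user_id' => 'pablete' }

# Relationship: one row per user, one column per post that user owns.
store[:PostsforUser]['pablete'][post_id] = ''

# Following the relationship: read the column names, then fetch each post.
posts_for_user = store[:PostsforUser]['pablete'].keys.map { |id| store[:Post][id] }
```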

Cassandra
•Index sorting
•Columns within a ColumnFamily are stored in sorted order. Keys are also sorted (if using OrderPreservingPartitioner)
•Pagination
•for keys get_range (start, finish, count)
•for columns get_slice (start, finish, count)
•Data filtering
•Use get_range/get_slice and play around with start/finish
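
Slice-based pagination can be sketched in memory as below. The `get_slice` function here is a hypothetical pure-Ruby stand-in that mirrors the (start, finish, count) shape from the slide. It is not the Thrift call itself, and treating an empty string as "unbounded" is an assumption of this sketch.

```ruby
# Hypothetical in-memory column slice: columns are sorted by name, and a page
# is "up to `count` columns whose names fall between start and finish".
def get_slice(columns, start, finish, count)
  columns.sort
         .select { |name, _| (start.empty? || name >= start) && (finish.empty? || name <= finish) }
         .first(count)
end

columns = { 'post-03' => 'c', 'post-01' => 'a', 'post-04' => 'd', 'post-02' => 'b' }

# First page: no bounds, two columns.
page1 = get_slice(columns, '', '', 2)

# Next page: start at the last column already seen, fetch one extra,
# and drop the overlapping first entry.
page2 = get_slice(columns, page1.last.first, '', 3).drop(1)
```

Filtering works the same way: narrowing start and finish restricts the result to a range of column names.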

Redis
•Class mapping
• Namespaced keys: 'Post:5:title'

•ID generation
•Redis counters: incr('Post:ids')

•Relationships
•Redis lists: push_tail('Post:5:_rating_ids', 4)
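
To make the key layout concrete without a running server, here is a sketch against a tiny in-memory stand-in. The `InMemoryStore` class is hypothetical and only imitates the handful of method names used on the slides. Swapping it for a real client object is left as an assumption.

```ruby
# Tiny in-memory stand-in for the Redis calls used on the slides (hypothetical).
class InMemoryStore
  def initialize
    @data = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value.to_s
  end

  # Atomic counter in Redis; a plain increment is enough for this sketch.
  def incr(key)
    @data[key] = (@data[key] || 0).to_i + 1
  end

  def push_tail(key, value)
    (@data[key] ||= []) << value.to_s
  end

  def list_range(key, first, last)
    (@data[key] || [])[first..last] || []
  end
end

redis = InMemoryStore.new

# ID generation: a counter per class.
id = redis.incr('Post:ids')

# Class mapping: one namespaced key per attribute.
redis["Post:#{id}:title"] = 'First post'

# Relationship: a list of related IDs stored under the owner's namespace.
redis.push_tail("Post:#{id}:_rating_ids", 4)
redis.push_tail("Post:#{id}:_rating_ids", 7)
rating_ids = redis.list_range("Post:#{id}:_rating_ids", 0, -1)
```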

Redis
•Index sorting
•Redis sort:
•sort 'Post:list', by 'Post:*:score', get 'Post:*:id'

•Pagination
•Redis lists: list_range('Post:list', 0, 9) (first 10 items)

•Data filtering
•Lookups: 'Post:permalink:fifth_post' => 5
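
What the sort, pagination, and lookup bullets do can be emulated with a plain Hash. The `sort_by_pattern` helper and the sample data are made up for illustration. The real work would be done server-side by the Redis SORT command.

```ruby
# Plain Hash standing in for the Redis keyspace (sample data is made up).
db = {
  'Post:list'    => %w[1 2 3],
  'Post:1:score' => '10', 'Post:1:id' => '1',
  'Post:2:score' => '30', 'Post:2:id' => '2',
  'Post:3:score' => '20', 'Post:3:id' => '3',
  'Post:permalink:third_post' => '3'
}

# Emulates SORT <list> BY <pattern> GET <pattern>: each list element replaces
# '*' in the patterns, and elements are ordered by the numeric BY value.
def sort_by_pattern(db, list_key, by:, get:)
  db.fetch(list_key)
    .sort_by { |element| db[by.sub('*', element)].to_f }
    .map { |element| db[get.sub('*', element)] }
end

sorted_ids = sort_by_pattern(db, 'Post:list', by: 'Post:*:score', get: 'Post:*:id')

# Pagination: inclusive index ranges, as with Redis list ranges.
per_page = 2
page = 0
page_ids = sorted_ids[(page * per_page)..((page + 1) * per_page - 1)]

# Data filtering: a lookup key maps an attribute value straight to an ID.
post_id = db['Post:permalink:third_post']
```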

CouchDB
•Type attribute in each document
•CouchDB automatic ID generation
•Related document IDs in the attributes
•Views with complex keys
•Special attributes for view functions

CouchDB
View: relation_blog_posts

function(doc) {
  if (doc.type == "post") {
    emit([doc.blog_id, doc.created_at], doc);
  }
}

CouchDB
View: relation_blog_posts

GET /db/design_doc/relation_blog_posts?startkey=[blog_1]
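
The effect of the complex key can be simulated in Ruby: emit one row per post, keep the rows sorted by key, then select a key range. This is only an illustration of the idea. The sample documents are made up, the exclusive upper bound `["blog_2"]` is an assumption not shown on the slide, and CouchDB's own collation rules differ from Ruby's array comparison in some cases.

```ruby
docs = [
  { 'type' => 'post', 'blog_id' => 'blog_1', 'created_at' => '2009-11-02', 'title' => 'B' },
  { 'type' => 'post', 'blog_id' => 'blog_2', 'created_at' => '2009-11-01', 'title' => 'C' },
  { 'type' => 'post', 'blog_id' => 'blog_1', 'created_at' => '2009-11-01', 'title' => 'A' },
  { 'type' => 'blog', 'name' => 'blog_1' }
]

# Map step: emit a [blog_id, created_at] key for every post, skip other types.
rows = docs.select { |doc| doc['type'] == 'post' }
           .map { |doc| { key: [doc['blog_id'], doc['created_at']], value: doc } }

# View rows are kept sorted by key; arrays compare element by element.
rows.sort_by! { |row| row[:key] }

# Range query: everything from ["blog_1"] up to, but excluding, ["blog_2"].
startkey = ['blog_1']
endkey   = ['blog_2']
blog_1_posts = rows.select do |row|
  (row[:key] <=> startkey) >= 0 && (row[:key] <=> endkey) < 0
end

titles = blog_1_posts.map { |row| row[:value]['title'] }
```

Because the date is the second key element, the posts of one blog come back already ordered by creation time.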

VPork
•Utility for load-testing a distributed hash table.
•Allows you to test raw throughput via concurrent read/write operations.
•Hardware:
•2 x commodity servers: Core Duo 2.5 GHz, 4 GB RAM, 7200 RPM disks
•CouchDB: 2 instances, round-robin balanced
•Cassandra: 2 instances
•Redis: 1 instance

http://github.com/antoniogarrote/vpork

VPork
Throughput with read probability 0.2
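
A read probability of 0.2 means roughly one operation in five is a read and the rest are writes. The sketch below shows that kind of workload against a plain Hash. It is not VPork's implementation: the `run_workload` helper, the key space of 100 keys, and the fixed random seed are assumptions made for illustration.

```ruby
# Hypothetical workload generator: each operation is a read with the given
# probability, otherwise a write.
def run_workload(store, operations:, read_probability:, rng: Random.new(42))
  counts = { reads: 0, writes: 0 }
  operations.times do
    key = "key-#{rng.rand(100)}"
    if rng.rand < read_probability
      store[key]
      counts[:reads] += 1
    else
      store[key] = rng.rand(1_000_000).to_s
      counts[:writes] += 1
    end
  end
  counts
end

store = {}
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
counts = run_workload(store, operations: 10_000, read_probability: 0.2)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# Raw throughput in operations per second for this single-threaded run.
throughput = 10_000 / elapsed
```

A real benchmark would run many such loops concurrently against the datastore under test instead of a local Hash.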

Conclusions
•Complementary to relational solutions
•Each K/V store addresses a different problem
•Best use case:
•CouchDB: distributed/scalable JavaScript-only app (no backend)
•Cassandra: large volume of writes, no SPOF
•Redis: datasets < RAM, lookups, cache, buffers

Credits
•All sponsored products, company names, brand names, trademarks and logos are the property of their respective owners.
•Alfa Romeo Giulietta: http://www.flickr.com/photos/mauboi/3296469097/
•Pizza: http://reportingfrombelgium.wordpress.com/2009/05/20/belgian-summer-holidays/
•Sammy: http://www.yuddy.com/celebrity/Sammy-Davis-Jr/bio
•Everything else is from teh internets and is free.
