This document provides an overview and introduction to Cassandra, an open source distributed database management system designed to handle large amounts of data across many commodity servers. It discusses Cassandra's origins in the influential Bigtable and Dynamo papers and its properties, including flexibility, scalability, and high availability. The document also covers Cassandra's data model of keyspaces and column families, its consistency options, and its API (including Thrift and language drivers), and provides usage examples for an address book app and for storing time-series data.
This document discusses using Redis and the Redis::Client Perl module to build scalable distributed job queues. It provides an overview of Redis, describing it as a key-value store that is simple, fast, and open-source. It then covers the various Redis data types like strings, lists, hashes, sets and sorted sets. Examples are given of how to work with these types using Redis::Client. The document discusses using Redis lists to implement job queues, with jobs added via RPUSH and popped via BLPOP. Benchmark results show the Redis-based job queue approach significantly outperforms using a MySQL jobs table with polling. Some caveats are provided about the benchmarks.
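The RPUSH/BLPOP job-queue pattern described above can be sketched as follows. This is a minimal Python sketch (in place of the talk's Perl Redis::Client code), and `MiniQueue` is a hypothetical in-memory stand-in for a Redis list so the example runs without a live server; a real BLPOP would block until a job arrives rather than return `None`:

```python
from collections import deque

class MiniQueue:
    """Toy in-memory stand-in for a Redis list used as a job queue."""
    def __init__(self):
        self.lists = {}

    def rpush(self, key, value):
        # Producer side: append a job to the tail of the list (Redis RPUSH).
        self.lists.setdefault(key, deque()).append(value)
        return len(self.lists[key])

    def blpop(self, key):
        # Worker side: pop a job from the head of the list (Redis BLPOP).
        # A real BLPOP blocks until a job is available; this toy returns None.
        q = self.lists.get(key)
        if q:
            return key, q.popleft()
        return None

queue = MiniQueue()
queue.rpush("jobs", '{"task": "resize", "image": "a.png"}')
queue.rpush("jobs", '{"task": "resize", "image": "b.png"}')

# Jobs come back in FIFO order, one per worker call.
print(queue.blpop("jobs"))  # ('jobs', '{"task": "resize", "image": "a.png"}')
```

Because BLPOP blocks instead of polling, workers wake up the moment a job is pushed, which is the property behind the benchmark advantage over a polled MySQL jobs table.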
This document summarizes a presentation about building a negative lookup caching translator for GlusterFS. The presentation demonstrates adding caching functionality to speed up lookups by caching previous misses. It shows the steps to hook the translator together, build it, configure it, debug it, and test its performance. Finally, it briefly introduces glupy, a new project for writing GlusterFS translators in Python, and demonstrates a Python implementation of the negative lookup cache.
This document discusses binary files and CSV (comma separated value) files in Python. It covers creating and reading binary files using the pickle module's dump() and load() functions. It also covers various binary file operations like inserting/appending, searching, updating, and deleting records. For CSV files, it describes the characteristics and advantages/disadvantages of the CSV format. It provides examples of writing to and reading from CSV files in Python using the csv module.
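The two workflows described above can be sketched side by side; the filenames and record fields below are illustrative, not taken from the original slides:

```python
import csv
import pickle

# --- Binary file with pickle: dump() writes a record, load() reads it back ---
record = {"roll": 1, "name": "Amit", "marks": 81}
with open("students.dat", "wb") as f:
    pickle.dump(record, f)

with open("students.dat", "rb") as f:
    restored = pickle.load(f)
print(restored["name"])  # Amit

# --- CSV file: writer()/reader() from the csv module ---
rows = [["roll", "name", "marks"], [1, "Amit", 81], [2, "Sara", 92]]
with open("students.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open("students.csv", newline="") as f:
    data = list(csv.reader(f))
print(data[2])  # ['2', 'Sara', '92']
```

Note that csv.reader yields every field as a string, so numeric columns must be converted back explicitly, whereas pickle preserves Python types exactly.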
Redis is a networked data structure server that provides fast, simple access to data types such as Strings, Lists, Sets, Sorted Sets, and Hashes. It uses an abstract data type interface: operations take a key as their first parameter and must match the type of the object stored at that key. For example, list operations like LPUSH take a key and a value, and LRANGE takes a key and a range and returns the corresponding elements of the list. Redis has clients for many programming languages and can be used for tasks like leaderboards, shopping carts, and user profiles.
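The LPUSH/LRANGE semantics described above can be sketched with a small in-memory model; `MiniList` is a hypothetical stand-in for one Redis list key so the example runs without a server. The detail worth noticing is that LRANGE's stop index is inclusive and negative offsets count from the end, unlike Python slicing:

```python
class MiniList:
    """Toy model of a single Redis list key and its LPUSH/LRANGE operations."""
    def __init__(self):
        self.items = []

    def lpush(self, value):
        # LPUSH prepends, so the most recently pushed element sits at index 0.
        self.items.insert(0, value)
        return len(self.items)

    def lrange(self, start, stop):
        # Unlike Python slices, LRANGE's stop index is inclusive,
        # and -1 means "the last element".
        n = len(self.items)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        return self.items[start:stop + 1]

recent = MiniList()
for player in ["alice", "bob", "carol"]:
    recent.lpush(player)

print(recent.lrange(0, -1))  # ['carol', 'bob', 'alice']
print(recent.lrange(0, 1))   # ['carol', 'bob']
```

This newest-first ordering is why an LPUSH-fed list works naturally for "most recent items" views such as activity feeds.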
This document discusses using Redis as a work queue for distributing tasks across worker processes. It provides an overview of Redis, describes how to implement a basic work queue using Redis lists, and shows various work queue patterns like synchronous and asynchronous producer-consumer models. It also covers options for scaling out queues and ensuring high availability and reliability. Code examples are provided using the Redis.pm Perl module.
- Reika is a domain-specific language for querying time-series databases, built on ANTLR. It aims to provide a SQL-like syntax that supports multiple backends.
- The current implementation includes a lexer, parser, and AST generation using ANTLR, plus an interpreter. Symbol and type checking are also implemented.
- Lessons learned: check a library's source code before using it; problems can cascade; deeper understanding comes only after an initial implementation.
- Related work includes InfluxQL and other query languages for time-series data.
GlusterFS uses "translators" to modify and route file requests between users and storage bricks. Translators can convert request types, modify request properties like paths or flags, intercept or block requests, and spawn new requests. This allows GlusterFS to provide features like replication, caching, and integration with other systems, and also enables custom file systems to be built by writing or modifying translators. The asynchronous programming model and shared context objects allow translators to coordinate complex workflows across multiple servers.
Whether running load tests or migrating historic data, bulk-loading data directly into Cassandra, bypassing the normal write path, can be very useful. In this webinar, we will look at how data is stored on disk in sstables, how to generate these structures directly, and how to load this data rapidly into your cluster using sstableloader. We'll also review different use cases for when you should and shouldn't use this method.
Abstract: These days, only the laziest among us have not written their own metrics storage and aggregation system. I am lazy, so instead I had to choose what to use and how to use it. To save you from doing the same work, I decided to share my considerations on architectures along with my test results.
The document summarizes a presentation about HTTP clients in Common Lisp. Eitaro Fukamachi discusses several Common Lisp HTTP client libraries, including Drakma and his own library called Dexador. He notes some pitfalls of Drakma, such as forcing URL encoding and poor error handling. Dexador is presented as an alternative with simpler APIs, better language support, and improved error handling including automatic retrying. Benchmarks show that Dexador is faster than Drakma for local requests and comparable for remote requests, but connection pooling in Dexador can further improve performance for multiple requests.
This document provides an overview of key Kubernetes concepts including containers, pods, volumes, deployments, services, configmaps, secrets, replica sets, and horizontal pod autoscaling. It describes the basic building blocks in Kubernetes like pods, containers, volumes, labels, and selectors. It also covers the different types of services, deployments for declarative updates, replica sets for scaling pods, and horizontal pod autoscaling for scaling based on CPU utilization.
Over the past few years, the NoSQL movement has produced a variety of open-source document stores. Most of them focus on high availability and horizontal scalability, and are designed to run on commodity hardware. These products have gained great traction in the industry for storing large amounts of flexible data (mostly JSON). In the meantime, XQuery has evolved into a standardized, full-fledged programming language for XML with native support for complex queries, indexes, updates, full-text search, and scripting. Moreover, JSON has recently been added as a first-class datatype in the language. As of today, it is without doubt the most robust and productive technology for processing flexible data. The aim of this talk is to showcase the benefits that can be achieved by integrating the Zorba XQuery Processor with MongoDB. We will introduce the 28msec platform, which seamlessly stores, indexes, and manages flexible data entirely in XQuery. The data itself is stored in MongoDB. The platform leverages MongoDB's indexes, sharding, and consistency guarantees to scale out horizontally. The talk will conclude by showing a benchmark of the platform and discussing the perspectives of the outlined approach.
This document discusses using Fluentd and AWS together. It provides an overview of how Treasure Data uses Fluentd to collect log data from applications on AWS and forwards it to various AWS services like S3, DynamoDB, and Redshift for storage and analysis. It also describes how Fluentd can be used to collect logs from EC2 instances to monitor them and address issues. The document highlights Fluentd's pluggable architecture and some of its core plugins for buffering, routing, and input/output of log data.
This document discusses integrating Bareos backups with the Gluster distributed file system for scalable backups. It begins with an agenda that covers the Gluster integration in Bareos, an introduction to GlusterFS, a quick start guide, an example configuration and demo, and future plans. It then provides more details on GlusterFS architecture including concepts like bricks, volumes, peers and site replication. The remainder of the document outlines quick start instructions for setting up Gluster and configuring Bareos to use the Gluster backend for scalable backups across multiple servers.
This document provides an introduction and overview of Gluster, an open source scale-out network-attached storage file system. It discusses what Gluster is, its architecture using distributed and replicated volumes, a quick start guide, use cases, features, and how to get involved in the community. The presentation aims to explain the benefits and capabilities of Gluster for scalable, high performance storage.
The document discusses GlusterD 2.0, a redesign of the Gluster distributed file system management daemon. Some key points:
- GlusterD 1.0 had scalability and consistency issues that limited it to hundreds of nodes. GlusterD 2.0 was rewritten from scratch in Go for better performance.
- GlusterD 2.0 uses etcd for centralized management and configuration storage. It has REST APIs and plugins for modularity.
- Components include REST interfaces, an etcd backend, an RPC framework, a transaction system, and a flexible volume generator.
- Upgrades from Gluster 3.x to 4.x will be disruptive but will provide a migration path.
This document discusses the architecture and technical challenges of handling a large volume of requests for an online advertising platform. It summarizes three key projects handled by the platform that delivered 3 billion, 14 billion, and 20 billion requests per month respectively. It describes the technologies used, including Solr, Redis, MySQL, Hadoop and Amazon Web Services instances. It also outlines optimizations made to improve performance, such as data compression, query optimizations, and Java 7 improvements. The goal was to process over 11,000 requests per second on average while maintaining response times below 100ms.
The document discusses Cassandra Query Language (CQL), a new structured query language for Apache Cassandra that is similar to SQL. CQL aims to provide a simpler alternative to Cassandra's existing Thrift API, which is difficult for clients to use and unstable due to its tight coupling to Cassandra's internal APIs. The document outlines some benefits of CQL over the Thrift API, such as requiring less client-side abstraction and being more intuitive through its use of a familiar query and data model.
Cassandra presentation given at the 3rd annual Palmetto Open Source Software Conference (POSSCON 2010).
CQL is a structured query language for Apache Cassandra that is similar to SQL. It provides an alternative interface to the existing Thrift API, with the goals of being more stable, easier to use, and providing a better mental model for querying and data. The document outlines the motivations for developing CQL, including limitations of the existing Thrift API, and provides details on CQL specification, drivers, and additional resources.
This document is an introduction to Cassandra presented by Eric Evans. It provides an outline that covers the project history, description of Cassandra as a massively scalable and decentralized structured data store, and lists some of the people and companies involved in Cassandra including Facebook, Digg, IBM Research, Rackspace and Twitter. The document discusses Cassandra's capabilities such as tunable consistency levels, structured columns and supercolumns, querying, updates, client APIs and performance compared to MySQL.
This document summarizes Cassandra, an open source distributed database management system designed to handle large amounts of data across many commodity servers. It discusses Cassandra's history, key features like tunable consistency levels and support for structured and indexed columns. Case studies describe how companies like Digg, Twitter, Facebook and Mahalo use Cassandra to handle terabytes of data and high transaction volumes. The roadmap outlines upcoming releases that will improve features like compaction, management tools, and support for dynamic schema changes.
Cassandra is a distributed database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single points of failure and linear scalability as nodes are added. Cassandra uses a peer-to-peer distributed architecture and tunable consistency levels to achieve high performance and availability without requiring strong consistency. It is based on Amazon's Dynamo and Google's Bigtable papers and provides a combination of their features.
This document discusses Apache Cassandra, a distributed database management system designed to handle large amounts of data across many commodity servers. It summarizes Cassandra's origins from Amazon Dynamo and Google Bigtable, describes its data model and client APIs. The document also provides examples of using Cassandra and discusses considerations around operations and performance.
This document provides an overview of Apache Cassandra, a distributed database designed for managing large amounts of structured data across commodity servers. It discusses Cassandra's data model, which is based on Dynamo and Bigtable, as well as its client API and operational benefits like easy scaling and high availability. The document uses a Twitter-like application called StatusApp to illustrate Cassandra's data model and provide examples of common operations.
This document provides an overview of various AWS big data services including Athena, Redshift Spectrum, EMR, and Hive. It discusses how Athena allows users to run SQL queries directly on data stored in S3 using Presto. Redshift Spectrum enables querying data in S3 using standard SQL from Amazon Redshift. EMR is a managed Hadoop framework that can run Hive, Spark, and other big data applications. Hive provides a SQL-like interface to query data stored in various formats like Parquet and ORC on distributed storage systems. The document demonstrates features and provides best practices for working with these AWS big data services.
Cassandra is a highly scalable, eventually consistent, distributed, structured columnfamily store with no single points of failure, initially open-sourced by Facebook and now part of the Apache Incubator. These slides are from Jonathan Ellis's OSCON 09 talk: http://en.oreilly.com/oscon2009/public/schedule/detail/7975
Enterprise applications are complex, making it difficult to fit everything into one model. NoSQL is taking a leading role in the next generation of database technologies, and polyglot persistence is a good option for leveraging the strengths of multiple data stores. This talk will introduce the Spring Data project, an umbrella project that provides a familiar and consistent Spring-based programming model for a wide range of data access technologies such as Redis, MongoDB, HBase, and Neo4j, while retaining store-specific features and capabilities.
JNoSQL is an open source project that provides a common API for working with different NoSQL databases. It includes Diana, which defines a common communication layer, and Artemis, a CDI-based annotation framework. The goal is to simplify development of NoSQL applications by handling differences in data models and query languages between databases in a standardized way.
To date, Hadoop usage has focused primarily on offline analysis--making sense of web logs, parsing through loads of unstructured data in HDFS, etc. But what if you want to run map/reduce against your live data set without affecting online performance? Combining Hadoop with Cassandra's multi-datacenter replication capabilities makes this possible. If you're interested in getting value from your data without the hassle and latency of first moving it into Hadoop, this talk is for you. I'll show you how to connect all the parts, enabling you to write map/reduce jobs or run Pig queries against your live data. As a bonus I'll cover writing map/reduce in Scala, which is particularly well-suited for the task.