Cassandra Explained

    Berlin Buzzwords
      June 6, 2010

            Eric Evans
●   Background
●   Description
●   API
●   Examples
Influential Papers
●   BigTable
    ●   Strong consistency
    ●   Sparse map data model
    ●   GFS, Chubby, et al.

●   Dynamo
    ●   O(1) distributed hash table (DHT)
    ●   BASE (aka eventual consistency)
    ●   Client tunable consistency/availability

NoSQL Big data
●   HBase           ●   Hypertable
●   MongoDB         ●   HyperGraphDB
●   Riak            ●   Memcached
●   Voldemort       ●   Tokyo Cabinet
●   Neo4J           ●   Redis
●   Cassandra       ●   CouchDB
Bigtable / Dynamo
        Bigtable              Dynamo
●   HBase          ●   Riak
●   Hypertable     ●   Voldemort

            Cassandra ??
Dynamo-Bigtable Lovechild

CAP Theorem “Pick Two”
●   CP               ●   AP
    ●   Bigtable         ●   Dynamo
    ●   Hypertable       ●   Voldemort
    ●   HBase            ●   Cassandra
   ●   Consistency
   ●   Availability
   ●   Partition Tolerance
●   Symmetric
    ●   No single point of failure
    ●   Linearly scalable
    ●   Ease of administration

●   Flexible partitioning, replica placement
●   Automated provisioning
●   High availability (eventual consistency)

P2P Routing
●   Random
    ●   128-bit namespace (MD5)
    ●   Good distribution
●   Order Preserving
    ●   Tokens determine namespace
    ●   Natural order (lexicographical)
    ●   Range / cover queries
●   Yours ??
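
The random partitioning idea above can be sketched as follows. This is a minimal illustration, not Cassandra's actual partitioner code: the helper names (`md5_token`, `owner`, `node_for`) and the evenly spaced 4-node ring are assumptions made for the example.

```python
import hashlib
from bisect import bisect_left

def md5_token(key: bytes) -> int:
    """Map a row key into a 128-bit namespace using MD5."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")

def owner(tokens, token):
    """Index of the first node token >= token, wrapping around the ring.

    `tokens` must be sorted in ascending order.
    """
    i = bisect_left(tokens, token)
    return i % len(tokens)

# Hypothetical 4-node ring with evenly spaced tokens.
RING = [i * (2**128 // 4) for i in range(4)]

def node_for(key: bytes) -> int:
    """Index of the node responsible for a row key on the hypothetical ring."""
    return owner(RING, md5_token(key))
```

An order-preserving partitioner would instead use the key itself as the token, which keeps keys in their natural order on the ring and is what makes range queries possible, at the cost of the even distribution that hashing gives.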
Replica Placement
●   SimpleSnitch
    ●   Default
    ●   N-1 successive nodes
●   RackInferringSnitch
    ●   Infers DC/rack from IP
●   PropertyFileSnitch
    ●   Configured w/ a properties file
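
A minimal sketch of the default placement described above, where the remaining N-1 replicas go to the nodes that follow the primary on the ring. The function name `simple_replicas` and the node names are hypothetical; rack- and datacenter-aware placement is not modeled here.

```python
def simple_replicas(ring, primary, n):
    """Return the primary node plus the next n-1 nodes, walking the ring in order.

    `ring` is a list of nodes in token order, `primary` is the index of the
    node that owns the key, and `n` is the replication factor.
    """
    n = min(n, len(ring))  # cannot place more replicas than there are nodes
    return [ring[(primary + i) % len(ring)] for i in range(n)]

# Hypothetical node names, listed in token order.
ring = ["node-a", "node-b", "node-c", "node-d"]
print(simple_replicas(ring, primary=2, n=3))  # ['node-c', 'node-d', 'node-a']
```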

Choosing Consistency

            Write                            Read
Level     Description            Level     Description
ZERO      Hail Mary              ZERO      N/A
ANY       1 replica (HH)         ANY       N/A
ONE       1 replica              ONE       1 replica
QUORUM    (N / 2) + 1            QUORUM    (N / 2) + 1
ALL       All replicas           ALL       All replicas

N = replication factor; HH = hinted handoff
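
The quorum arithmetic in the table can be worked through with a short sketch. The function names are made up for the example; the overlap rule (R + W > N) is the standard condition under which every read set shares at least one replica with every write set.

```python
def quorum(n: int) -> int:
    """Quorum size for replication factor n: (N / 2) + 1 with integer division."""
    return n // 2 + 1

def overlaps(n: int, w: int, r: int) -> bool:
    """True when any r replicas read must include one of the w replicas written."""
    return r + w > n

for n in (1, 2, 3, 5):
    q = quorum(n)
    print(f"N={n}: quorum={q}, quorum reads and writes overlap: {overlaps(n, q, q)}")
```

With N = 3, writing at QUORUM and reading at QUORUM touches 2 + 2 = 4 > 3 replicas, so a read always sees the latest acknowledged write. Writing at ONE and reading at ONE (1 + 1 = 2) gives no such guarantee.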


Quorum ((N/2) + 1)
Data Model
●   Keyspace
    ●   Uppermost namespace
    ●   Typically one per application
●   ColumnFamily
    ●   Associates records of a similar kind
    ●   Record-level Atomicity
    ●   Indexed
●   Column
    ●   Basic unit of storage

Sparse Table
●   name
    ●   byte[]
    ●   Queried against (predicates)
    ●   Determines sort order
●   value
    ●   byte[]
    ●   Opaque to Cassandra
●   timestamp
    ●   long
    ●   Conflict resolution (Last Write Wins)
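
A minimal sketch of the column triple and of Last Write Wins resolution. The `Column` type and `resolve` function are illustrative names, and keeping the first argument on a timestamp tie is an arbitrary choice made for this sketch, not a statement about how Cassandra breaks ties.

```python
from typing import NamedTuple

class Column(NamedTuple):
    name: bytes      # queried against, determines sort order
    value: bytes     # opaque payload
    timestamp: int   # supplied by the client, used for conflict resolution

def resolve(a: Column, b: Column) -> Column:
    """Last Write Wins: keep the version with the higher timestamp."""
    return a if a.timestamp >= b.timestamp else b

old = Column(b"city", b"San Antonio", 1000)
new = Column(b"city", b"Austin", 2000)
print(resolve(old, new).value)  # b'Austin'
```

Because the timestamp comes from the client, the outcome depends on client clocks being reasonably synchronized.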
Column Comparators
●    Bytes
●    UTF8
●    TimeUUID
●    Long
●    LexicalUUID
●    Composite (third-party)
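
Since column names are raw bytes, the comparator decides what "sorted" means. The sketch below contrasts a byte-wise ordering with one that interprets each name as an 8-byte big-endian long; the function names are hypothetical and only mimic the idea behind the Bytes and Long comparators.

```python
from struct import pack, unpack

def bytes_order(names):
    """Sort names by lexicographic comparison of their raw bytes."""
    return sorted(names)

def long_order(names):
    """Sort names by their value as signed 8-byte big-endian integers."""
    return sorted(names, key=lambda b: unpack(">q", b)[0])

names = [pack(">q", n) for n in (1, -1, 256)]

as_bytes = [unpack(">q", b)[0] for b in bytes_order(names)]
as_longs = [unpack(">q", b)[0] for b in long_order(names)]
print(as_bytes)  # [1, 256, -1]
print(as_longs)  # [-1, 1, 256]
```

The two orderings agree for non-negative values and differ once negative numbers are involved, because the sign bit makes a negative long start with a high byte.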

Low / High
●    Thrift
      ●   Compact binary RPC framework
      ●   12 different languages
●    Idiomatic
      ●   Hector (Java)
      ●   Pycassa (Python)
      ●   Others...
Thrift Read Methods
●   get() → Column
●   get_slice() → list<Column>
●   multiget_slice() → map<key, list<Column>>
●   get_count() → int
●   multiget_count() → map<key, int>
●   get_range_slices()
Thrift Write Methods
●   insert()
●   batch_insert()
●   remove()
●   batch_mutate()

Pycassa – Python Client API
●    connect() → Thrift proxy
●    cf = ColumnFamily(proxy, ksp, cfname)
●    cf.insert() → long
●    cf.get() → dict
●    cf.get_range() → dict
Address Book – Setup

<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses"
                Comment="Too lame" />
</Keyspace>
Adding an entry
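from uuid import uuid4

# row key: a random UUID, as a string
key = str(uuid4())

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   '',
    'city':    'Austin',
    'zip':     '78250',   # column values are bytes, so pass a string
}

# addresses is the ColumnFamily object for "Addresses"
addresses.insert(key, columns)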
Fetching a record
# fetching the record by key
record = addresses.get(key)

# accessing columns by name
zipcode = record['zip']
city = record['city']

<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses"
                Comment="Too lame" />
  <ColumnFamily Name="ByCity"
                CompareWith="UTF8Type" />
</Keyspace>
Updating the index
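from uuid import uuid4

# row key: a random UUID, as a string
key = str(uuid4())

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   '',
    'city':    'Austin',
    'zip':     '78250',   # column values are bytes, so pass a string
}

addresses.insert(key, columns)
# index row: the city is the row key, each address key is a column name
byCity.insert('Austin', {key: ''})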
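
Reading through such a hand-maintained index takes two steps: fetch the index row for the city, then fetch each address it points to. The sketch below runs against a small in-memory stand-in rather than a real cluster; `FakeColumnFamily`, `add_entry`, `find_by_city`, and the sample rows are all hypothetical and only mimic the `insert()` / `get()` shape used in this example.

```python
class FakeColumnFamily:
    """Hypothetical in-memory stand-in: row key -> {column name: value}."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, columns):
        # Merge the given columns into the row, creating it if needed.
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        # Return a copy of the row's columns; raises KeyError if absent.
        return dict(self.rows[key])

addresses = FakeColumnFamily()
by_city = FakeColumnFamily()

def add_entry(key, columns):
    """Write the record, then add its key to the index row for its city."""
    addresses.insert(key, columns)
    by_city.insert(columns['city'], {key: ''})

def find_by_city(city):
    """Look up the index row, then fetch every record it references."""
    return [addresses.get(k) for k in sorted(by_city.get(city))]

# Made-up sample rows for illustration.
add_entry('k1', {'first': 'Eric', 'city': 'Austin'})
add_entry('k2', {'first': 'Ada', 'city': 'Berlin'})
add_entry('k3', {'first': 'Bob', 'city': 'Austin'})

austin = [rec['first'] for rec in find_by_city('Austin')]
print(austin)  # ['Eric', 'Bob']
```

Since atomicity is per record, the two inserts in `add_entry` are separate writes, so an application using this pattern has to tolerate an index entry and its record briefly disagreeing.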

<Keyspace Name="Sites">
  <ColumnFamily Name="Stats"
Logging values
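from struct import pack
from time import time

# time in microseconds as a long, binary, network-order (big-endian)
ts = pack('>q', int(time() * 1e6))

stats.insert('org.apache', {ts: value})

# bounds of a one-day range starting at s (seconds since the epoch)
begin = pack('>q', int(s * 1e6))
end = pack('>q', int((s + 86400) * 1e6))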
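
The `begin` / `end` pair bounds a one-day slice of columns. The sketch below shows that selection against a plain dict instead of a live column family, assuming column names are 8-byte big-endian longs (struct format `'>q'`) holding microseconds. `ts_name`, `slice_columns`, the start time `s`, and the sample values are hypothetical.

```python
from struct import pack

def ts_name(seconds):
    """Encode a time in seconds as a big-endian long of microseconds."""
    return pack('>q', int(seconds * 1e6))

def slice_columns(row, begin, end):
    """Return (name, value) pairs with begin <= name <= end, in name order."""
    return [(name, row[name]) for name in sorted(row) if begin <= name <= end]

s = 1275782400  # hypothetical start of the day, seconds since the epoch

# One row of logged values, keyed by packed timestamp (made-up data).
row = {
    ts_name(s - 10):    'before',
    ts_name(s + 60):    'a',
    ts_name(s + 3600):  'b',
    ts_name(s + 90000): 'after',
}

begin, end = ts_name(s), ts_name(s + 86400)
values = [value for _, value in slice_columns(row, begin, end)]
print(values)  # ['a', 'b']
```

For non-negative timestamps the packed names sort the same way byte-wise as they do numerically, which is why a plain `sorted()` over the bytes is enough in this sketch.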


