SlideShare a Scribd company logo
Cassandra Explained


      Disruptive Code
    September 22, 2010

            Eric Evans
    eevans@rackspace.com
           @jericevans
    http://blog.sym-link.com
Outline
●   Background
●   Description
●   API(s)
●   Code Samples
Background
The Digital Universe
Consolidation
Old Guard
Vertical Scaling Sucks
Influential Papers
●   BigTable
    ● Strong consistency
    ● Sparse map data model


    ● GFS, Chubby, et al


●   Dynamo
    ●   O(1) distributed hash table (DHT)
    ●   BASE (aka eventual consistency)
    ●   Client tunable consistency/availability
NoSQL
●   HBase          ●   Hypertable
●   MongoDB        ●   HyperGraphDB
●   Riak           ●   Memcached
●   Voldemort      ●   Tokyo Cabinet
●   Neo4J          ●   Redis
●   Cassandra      ●   CouchDB
NoSQL Big data
●   HBase           ●   Hypertable
●   MongoDB         ●   HyperGraphDB
●   Riak            ●   Memcached
●   Voldemort       ●   Tokyo Cabinet
●   Neo4J           ●   Redis
●   Cassandra       ●   CouchDB
Bigtable / Dynamo
        Bigtable              Dynamo
●   HBase          ●   Riak
●   Hypertable     ●   Voldemort



            Cassandra ??
Dynamo-Bigtable Lovechild
CAP Theorem “Pick Two”
●   CP               ●   AP
    ●   Bigtable         ●   Dynamo
    ●   Hypertable       ●   Voldemort
    ●   HBase            ●   Cassandra
CAP Theorem “Pick Two”



   ●   Consistency
   ●   Availability
   ●   Partition Tolerance
Description
Properties
●   Symmetric
    ● No single point of failure
    ● Linearly scalable


    ● Ease of administration


●   Flexible partitioning, replica placement
●   Automated provisioning
●   High availability (eventual consistency)
P2P Routing
P2P Routing
Partitioning
●   Random
    ●   128bit namespace, (MD5)
    ●   Good distribution
●   Order Preserving
    ●   Tokens determine namespace
    ●   Natural order (lexicographical)
    ●   Range / cover queries
●   Yours ??
Replica Placement
●   SimpleSnitch
    ●   Default
    ●   N-1 successive nodes
●   RackInferringSnitch
    ●   Infers DC/rack from IP
●   PropertyFileSnitch
    ●   Configured w/ a properties file
Bootstrap
Bootstrap
Bootstrap
Choosing Consistency

         Write                      Read
Level     Description      Level     Description
ZERO      Hail Mary        ZERO      N/A
ANY       1 replica (HH)   ANY       N/A
ONE       1 replica        ONE       1 replica
QUORUM    (N / 2) +1       QUORUM    (N / 2) +1
ALL       All replicas     ALL       All replicas

                       R+W>N
Quorum ((N/2) + 1)
Quorum ((N/2) + 1)
Data Model
Overview
●   Keyspace
    ●   Uppermost namespace
    ●   Typically one per application
●   ColumnFamily
    ●   Associates records of a similar kind
    ●   Record-level Atomicity
    ●   Indexed
●   Column
    ●   Basic unit of storage
Sparse Table
Column
●   name
    ●   byte[]
    ●   Queried against (predicates)
    ●   Determines sort order
●   value
    ●   byte[]
    ●   Opaque to Cassandra
●   timestamp
    ●   long
    ●   Conflict resolution (Last Write Wins)
Column Comparators
●    Bytes
●    UTF8
●    TimeUUID
●    Long
●    LexicalUUID
●    Composite (third-party)


    http://github.com/edanuff/CassandraCompositeType
API
RPC




THRIFT
RPC




THRIFT
RPC




AVRO
RPC




AVRO
Idiomatic Client Libraries
●    Pelops, Hector (Java)
●    Pycassa (Python)
●    Cassandra (Ruby)
●    Others …




    http://wiki.apache.org/cassandra/ClientOptions
Code Samples
Pycassa – Python Client API
# creating a connection
from pycassa import connect
hosts = ['host1:9160', 'host2:9160']
client = connect('Keyspace1', hosts)

# creating a column family instance
from pycassa import ColumnFamily
cf = ColumnFamily(client, “Standard1”)

# reading/writing a column
cf.insert('key1', {'name': 'value'})
print cf.get('key1')['name']

   1. http://github.com/vomjom/pycassa
Address Book – Setup
# conf/cassandra.yaml
keyspaces:
  - name: AddressBook
    column_families:
      - name: Addresses
        compare_with: BytesType
        rows_cached: 10000
        keys_cached: 50
        comment: 'No comment'
Adding an entry
key = uuid()

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'San Antonio',
    'zip':     78250
}

addresses.insert(key, columns)
Fetching a record
# fetching the record by key
record = addresses.get(key)

# accessing columns by name
zipcode = record['zip']
city = record['city']
Indexing (manual)
# conf/cassandra.yaml
keyspaces:
  - name: AddressBook
    column_families:
      - name: Addresses
        compare_with: BytesType
        rows_cached: 10000
        keys_cached: 50
        comment: 'No comment'
      - name: ByCity
        compare_with: UTF8Type
Updating the index
key = uuid()

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'San Antonio',
    'zip':     78250
}

addresses.insert(key, columns)
byCity.insert('San Antonio', {key: ''})
Indexing (auto)
# conf/cassandra.yaml
keyspaces:
  - name: AddressBook
    column_families:
      - name: Addresses
        compare_with: BytesType
        rows_cached: 10000
        keys_cached: 50
        comment: 'No comment'
        column_metadata:
           - name: city
             index_type: KEYS
Querying the Index
from pycassa.index import create_index_expression
from pycassa.index import create_index_clause

e = create_index_expression('city', 'San Antonio')
clause = create_index_clause([e])

results = address.get_indexed_slices(clause)

for (key, columns) in results.items():
    print “%(first)s %(last)s” % columns
Timeseries
# conf/cassandra.yaml
keyspaces:
  - name: Sites
    column_families:
      - name: Stats
        compare_with: LongType
Logging values
# time as long (milliseconds since epoch)
tstmp = long(time() * 1e6)

stats.insert('org.apache', {tstmp: value})
Slicing
begin = long(start * 1e6)

stats.get_range('org.apache',
                column_start=begin)

end = long((start + 86400) * 1e6)

stats.get_range(start='org.apache',
                finish='org.debian',
                column_start=begin,
                column_finish=end)
Questions?

More Related Content

What's hot

Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
DataStax
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
Nathan Milford
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
DataStax Academy
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
DataStax
 
Scaling Twitter with Cassandra
Scaling Twitter with CassandraScaling Twitter with Cassandra
Scaling Twitter with Cassandra
Ryan King
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
Gokhan Atil
 
Cassandra + Spark + Elk
Cassandra + Spark + ElkCassandra + Spark + Elk
Cassandra + Spark + Elk
Vasil Remeniuk
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
jbellis
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
Robert Stupp
 
Introduction to Real-Time Analytics with Cassandra and Hadoop
Introduction to Real-Time Analytics with Cassandra and HadoopIntroduction to Real-Time Analytics with Cassandra and Hadoop
Introduction to Real-Time Analytics with Cassandra and Hadoop
Patricia Gorla
 
Heuritech: Apache Spark REX
Heuritech: Apache Spark REXHeuritech: Apache Spark REX
Heuritech: Apache Spark REX
didmarin
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
Brian Hess
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab
Fabio Fumarola
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
DataStax
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
SoftwareMill
 
memcached Distributed Cache
memcached Distributed Cachememcached Distributed Cache
memcached Distributed Cache
Aniruddha Chakrabarti
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
Markus Klems
 
Data analysis scala_spark
Data analysis scala_sparkData analysis scala_spark
Data analysis scala_spark
Yiguang Hu
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)
gdusbabek
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into Cassandra
Jim Hatcher
 

What's hot (20)

Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Scaling Twitter with Cassandra
Scaling Twitter with CassandraScaling Twitter with Cassandra
Scaling Twitter with Cassandra
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Cassandra + Spark + Elk
Cassandra + Spark + ElkCassandra + Spark + Elk
Cassandra + Spark + Elk
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Introduction to Real-Time Analytics with Cassandra and Hadoop
Introduction to Real-Time Analytics with Cassandra and HadoopIntroduction to Real-Time Analytics with Cassandra and Hadoop
Introduction to Real-Time Analytics with Cassandra and Hadoop
 
Heuritech: Apache Spark REX
Heuritech: Apache Spark REXHeuritech: Apache Spark REX
Heuritech: Apache Spark REX
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab
 
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui,...
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
memcached Distributed Cache
memcached Distributed Cachememcached Distributed Cache
memcached Distributed Cache
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Data analysis scala_spark
Data analysis scala_sparkData analysis scala_spark
Data analysis scala_spark
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into Cassandra
 

Viewers also liked

Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
Michelle Darling
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Apache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckApache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide Deck
DataStax Academy
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
DataStax Academy
 
NoSQL Essentials: Cassandra
NoSQL Essentials: CassandraNoSQL Essentials: Cassandra
NoSQL Essentials: Cassandra
Fernando Rodriguez
 
Raft
RaftRaft
Software Productivity and Serverless
Software Productivity and ServerlessSoftware Productivity and Serverless
Software Productivity and Serverless
Nick Gottlieb
 
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
Takahiro Moteki
 
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支える サーバーレスアーキテクチャーと開発としてのビジ...
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支えるサーバーレスアーキテクチャーと開発としてのビジ...Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支えるサーバーレスアーキテクチャーと開発としてのビジ...
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支える サーバーレスアーキテクチャーと開発としてのビジ...
Hiroyuki Hiki
 
now
nownow
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyoBluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
Masahiro NAKAYAMA
 
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
Yu Yamada
 
分散システムについて語らせてくれ
分散システムについて語らせてくれ分散システムについて語らせてくれ
分散システムについて語らせてくれ
Kumazaki Hiroki
 
Growing up serverless
Growing up serverlessGrowing up serverless
Growing up serverless
Amazon Web Services Japan
 
「サーバレスの薄い本」からの1年 #serverlesstokyo
「サーバレスの薄い本」からの1年 #serverlesstokyo「サーバレスの薄い本」からの1年 #serverlesstokyo
「サーバレスの薄い本」からの1年 #serverlesstokyo
Masahiro NAKAYAMA
 

Viewers also liked (16)

Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Apache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckApache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide Deck
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Google Dremel
Google DremelGoogle Dremel
Google Dremel
 
NoSQL Essentials: Cassandra
NoSQL Essentials: CassandraNoSQL Essentials: Cassandra
NoSQL Essentials: Cassandra
 
Raft
RaftRaft
Raft
 
Software Productivity and Serverless
Software Productivity and ServerlessSoftware Productivity and Serverless
Software Productivity and Serverless
 
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
[serverlessconf2017]FaaSで簡単に実現する数十万RPSスパイク負荷試験
 
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支える サーバーレスアーキテクチャーと開発としてのビジ...
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支えるサーバーレスアーキテクチャーと開発としてのビジ...Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支えるサーバーレスアーキテクチャーと開発としてのビジ...
Serverlessconf Tokyo 2017 Biz serverless お客様のビジネスを支える サーバーレスアーキテクチャーと開発としてのビジ...
 
now
nownow
now
 
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyoBluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
BluetoothメッシュによるIoTシステムを支えるサーバーレス技術 #serverlesstokyo
 
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
Step functionsとaws batchでオーケストレートするイベントドリブンな機械学習基盤
 
分散システムについて語らせてくれ
分散システムについて語らせてくれ分散システムについて語らせてくれ
分散システムについて語らせてくれ
 
Growing up serverless
Growing up serverlessGrowing up serverless
Growing up serverless
 
「サーバレスの薄い本」からの1年 #serverlesstokyo
「サーバレスの薄い本」からの1年 #serverlesstokyo「サーバレスの薄い本」からの1年 #serverlesstokyo
「サーバレスの薄い本」からの1年 #serverlesstokyo
 

Similar to Cassandra Explained

Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
Eric Evans
 
NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!
Otávio Santana
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
Stu Hood
 
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
tdc-globalcode
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
Murat Çakal
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
shimi_k
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Data Con LA
 
Online Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and CassandraOnline Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and Cassandra
Robbie Strickland
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
Neville Li
 
Clojure ♥ cassandra
Clojure ♥ cassandra Clojure ♥ cassandra
Clojure ♥ cassandra
Max Penet
 
Cassandra
CassandraCassandra
Cassandra
Robert Koletka
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
javier ramirez
 
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Boris Yen
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
Omid Vahdaty
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
rantav
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
Andrey Lomakin
 
C# as a System Language
C# as a System LanguageC# as a System Language
C# as a System Language
ScyllaDB
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
Jihyun Ahn
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
Roger Xia
 

Similar to Cassandra Explained (20)

Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
 
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
 
Online Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and CassandraOnline Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and Cassandra
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
 
Clojure ♥ cassandra
Clojure ♥ cassandra Clojure ♥ cassandra
Clojure ♥ cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
 
C# as a System Language
C# as a System LanguageC# as a System Language
C# as a System Language
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 

More from Eric Evans

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Eric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
Eric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
Eric Evans
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
Eric Evans
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
Eric Evans
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3
Eric Evans
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)
Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
Eric Evans
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
Eric Evans
 
CQL: SQL In Cassandra
CQL: SQL In CassandraCQL: SQL In Cassandra
CQL: SQL In Cassandra
Eric Evans
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
Eric Evans
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQL
Eric Evans
 
NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?
Eric Evans
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra
Eric Evans
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
Eric Evans
 
An Introduction To Cassandra
An Introduction To CassandraAn Introduction To Cassandra
An Introduction To Cassandra
Eric Evans
 

More from Eric Evans (20)

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
 
CQL: SQL In Cassandra
CQL: SQL In CassandraCQL: SQL In Cassandra
CQL: SQL In Cassandra
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQL
 
NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
An Introduction To Cassandra
An Introduction To CassandraAn Introduction To Cassandra
An Introduction To Cassandra
 

Recently uploaded

Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Erasmo Purificato
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
Bert Blevins
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Choose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presenceChoose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presence
rajancomputerfbd
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Eric D. Schabell
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
Sally Laouacheria
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
ScyllaDB
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
SynapseIndia
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
KAMAL CHOUDHARY
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
ScyllaDB
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
Matthew Sinclair
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
Vijayananda Mohire
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
welrejdoall
 
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALLBLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
Liveplex
 

Recently uploaded (20)

Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Choose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presenceChoose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presence
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
 
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALLBLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
 

Cassandra Explained

  • 1. Cassandra Explained Disruptive Code September 22, 2010 Eric Evans eevans@rackspace.com @jericevans http://blog.sym-link.com
  • 2. Outline ● Background ● Description ● API(s) ● Code Samples
  • 8. Influential Papers ● BigTable ● Strong consistency ● Sparse map data model ● GFS, Chubby, et al ● Dynamo ● O(1) distributed hash table (DHT) ● BASE (aka eventual consistency) ● Client tunable consistency/availability
  • 9. NoSQL ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 10. NoSQL Big data ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 11. Bigtable / Dynamo Bigtable Dynamo ● HBase ● Riak ● Hypertable ● Voldemort Cassandra ??
  • 13. CAP Theorem “Pick Two” ● CP ● AP ● Bigtable ● Dynamo ● Hypertable ● Voldemort ● HBase ● Cassandra
  • 14. CAP Theorem “Pick Two” ● Consistency ● Availability ● Partition Tolerance
  • 16. Properties ● Symmetric ● No single point of failure ● Linearly scalable ● Ease of administration ● Flexible partitioning, replica placement ● Automated provisioning ● High availability (eventual consistency)
  • 19. Partitioning ● Random ● 128bit namespace, (MD5) ● Good distribution ● Order Preserving ● Tokens determine namespace ● Natural order (lexicographical) ● Range / cover queries ● Yours ??
  • 20. Replica Placement ● SimpleSnitch ● Default ● N-1 successive nodes ● RackInferringSnitch ● Infers DC/rack from IP ● PropertyFileSnitch ● Configured w/ a properties file
  • 24. Choosing Consistency Write Read Level Description Level Description ZERO Hail Mary ZERO N/A ANY 1 replica (HH) ANY N/A ONE 1 replica ONE 1 replica QUORUM (N / 2) +1 QUORUM (N / 2) +1 ALL All replicas ALL All replicas R+W>N
  • 28. Overview ● Keyspace ● Uppermost namespace ● Typically one per application ● ColumnFamily ● Associates records of a similar kind ● Record-level Atomicity ● Indexed ● Column ● Basic unit of storage
  • 30. Column ● name ● byte[] ● Queried against (predicates) ● Determines sort order ● value ● byte[] ● Opaque to Cassandra ● timestamp ● long ● Conflict resolution (Last Write Wins)
  • 31. Column Comparators ● Bytes ● UTF8 ● TimeUUID ● Long ● LexicalUUID ● Composite (third-party) http://github.com/edanuff/CassandraCompositeType
  • 32. API
  • 37. Idiomatic Client Libraries ● Pelops, Hector (Java) ● Pycassa (Python) ● Cassandra (Ruby) ● Others … http://wiki.apache.org/cassandra/ClientOptions
  • 39. Pycassa – Python Client API # creating a connection from pycassa import connect hosts = ['host1:9160', 'host2:9160'] client = connect('Keyspace1', hosts) # creating a column family instance from pycassa import ColumnFamily cf = ColumnFamily(client, “Standard1”) # reading/writing a column cf.insert('key1', {'name': 'value'}) print cf.get('key1')['name'] 1. http://github.com/vomjom/pycassa
  • 40. Address Book – Setup # conf/cassandra.yaml keyspaces: - name: AddressBook column_families: - name: Addresses compare_with: BytesType rows_cached: 10000 keys_cached: 50 comment: 'No comment'
  • 41. Adding an entry key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'San Antonio', 'zip': 78250 } addresses.insert(key, columns)
  • 42. Fetching a record # fetching the record by key record = addresses.get(key) # accessing columns by name zipcode = record['zip'] city = record['city']
  • 43. Indexing (manual) # conf/cassandra.yaml keyspaces: - name: AddressBook column_families: - name: Addresses compare_with: BytesType rows_cached: 10000 keys_cached: 50 comment: 'No comment' - name: ByCity compare_with: UTF8Type
  • 44. Updating the index key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'San Antonio', 'zip': 78250 } addresses.insert(key, columns) byCity.insert('San Antonio', {key: ''})
  • 45. Indexing (auto) # conf/cassandra.yaml keyspaces: - name: AddressBook column_families: - name: Addresses compare_with: BytesType rows_cached: 10000 keys_cached: 50 comment: 'No comment' column_metadata: - name: city index_type: KEYS
  • 46. Querying the Index from pycassa.index import create_index_expression from pycassa.index import create_index_clause e = create_index_expression('city', 'San Antonio') clause = create_index_clause([e]) results = address.get_indexed_slices(clause) for (key, columns) in results.items(): print “%(first)s %(last)s” % columns
  • 47. Timeseries # conf/cassandra.yaml keyspaces: - name: Sites column_families: - name: Stats compare_with: LongType
  • 48. Logging values # time as long (milliseconds since epoch) tstmp = long(time() * 1e6) stats.insert('org.apache', {tstmp: value})
  • 49. Slicing begin = long(start * 1e6) stats.get_range('org.apache', column_start=begin) end = long((start + 86400) * 1e6) stats.get_range(start='org.apache', finish='org.debian', column_start=begin, column_finish=end)