SlideShare a Scribd company logo
Project History
                                 Description
                                      People




                An Introduction To Cassandra

                                  Eric Evans
                             eevans@rackspace.com
                                  @jericevans


                                OpenSQL Camp
                               November 14, 2009




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                    Description
                                         People




A prophetess in Troy during the Trojan War. Her predictions were
always true, but never believed.


   Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                    Description
                                         People




A massively scalable, decentralized, structured data store (aka
database).




   Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                       Description
                                            People


Outline



  1   Project History


  2   Description


  3   People




      Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra

Introduction to Apache Cassandra (September 2014). Design principles, replication, consistency, clusters, CQL.

cassandra cql replication consistency
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and Consistency

A short introduction to replication and consistency in the Cassandra distributed database. Delivered April 28th, 2010 at the Seattle Scalability Meetup.

seahugdynamonosql
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained

Cassandra is a distributed database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single points of failure and linear scalability as nodes are added. Cassandra uses a peer-to-peer distributed architecture and tunable consistency levels to achieve high performance and availability without requiring strong consistency. It is based on Amazon's Dynamo and Google's Bigtable papers and provides a combination of their features.

apache cassandra distributed database
Project History
                                 Description
                                      People




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                 Description
                                      People




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                 Description
                                      People




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                 Description
                                      People




 4 new committers added
 Dozens of contributors
 60+ people on IRC
 Hundreds of closed issues (bugs, features, etc)
 2 major releases, 1 point release
 0.5.0 RSN




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra

Apache Cassandra is a free, distributed, open source, and highly scalable NoSQL database that is designed to handle large amounts of data across many commodity servers. It provides high availability with no single point of failure, linear scalability, and tunable consistency. Cassandra's architecture allows it to spread data across a cluster of servers and replicate across multiple data centers for fault tolerance. It is used by many large companies for applications that require high performance, scalability, and availability.

apache cassandranosqldatastax
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial

I don't think it's hyperbole when I say that Facebook, Instagram, Twitter & Netflix now define the dimensions of our social & entertainment universe. But what kind of technology engines purr under the hoods of these social media machines? Here is a tech student's perspective on making the paradigm shift to "Big Data" using innovative models: alphabet blocks, nesting dolls, & LEGOs! Get info on: - What is Cassandra (C*)? - Installing C* Community Version on Amazon Web Services EC2 - Data Modelling & Database Design in C* using CQL3 - Industry Use Cases

nosqlawsec2
Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)

The Wikimedia Foundation is a non-profit and charitable organization driven by a vision of a world where every human can freely share in the sum of all knowledge. Each month Wikimedia sites serve over 18 billion page views to 500 million unique visitors around the world. Among the many resources offered by Wikimedia is a public-facing API that provides low-latency, programmatic access to full-history content and meta-data, in a variety of formats. Commonly, results from this system are the product of computationally intensive transformations, and must be pre-generated and persisted to meet latency expectations. Unsurprisingly, there are numerous challenges to providing low-latency storage of such a massive data-set, in a demanding, globally distributed environment. This talk covers Wikimedia Content API, and it's use of Apache Cassandra, a massively-scalable distributed database, as storage for a diverse and growing set of use-cases. Trials, tribulations, and triumphs, of both a development and operational nature are discussed.

wikimediacassandranosql
Project History
                                       Description
                                            People


Outline



  1   Project History


  2   Description


  3   People




      Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


Cassandra is...




      O(1) DHT
      Eventual consistency
      Tunable trade-offs, consistency vs. latency




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                 Description
                                      People




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


But...




         Values are structured, indexed
         Columns / column families
         Slicing w/ predicates (queries)




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case

This document summarizes Eric Evans' presentation on using Cassandra as the backend for Wikimedia's content API. It discusses Wikimedia's goals of providing free knowledge, key metrics about Wikipedia and its architecture. It then focuses on how Wikimedia uses Cassandra, including their data model, compression techniques, and ongoing work to optimize compaction strategies and reduce node density to improve performance.

wikimediacassandranosql
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case

Among the resources offered by Wikimedia is an API providing low-latency access to full-history content, in many formats. Its results are often the product of computationally intensive transforms, and must be pre-generated and stored to meet latency expectations. Unsurprisingly, there are many challenges to providing low-latency access to such a large data-set, in a demanding, globally distributed environment. This presentation covers the Wikimedia content API and its use of Apache Cassandra as storage for a diverse and growing set of use-cases. Trials, tribulations, and triumphs, of both a development and operational nature will be discussed.

apachewikimediacassandra
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)

This document discusses using Apache Cassandra to store and retrieve time series data more efficiently than the traditional RRDTool approach. It describes how Cassandra is well-suited for time series data due to its high write throughput, ability to store data sorted on disk, and partitioning and replication. The document also outlines a data model for storing time series metrics in Cassandra and discusses Newts, an open source time series data store built on Cassandra.

time seriesapachecondistributed database
Project History
                                      Description
                                           People


Column families




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


Supercolumn families




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                     Description
                                          People


Querying



     By column
     By column for multiple keys
     Slice by names, or ranges of names
             returning columns
             returning super columns
     Slice for multiple keys
     Range of keys
     Slice on a key range RSN




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                     Description
                                          People


Column comparators



     TimeUUID
     LexicalUUID
     UTF8
     Long
     Bytes
     ...




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra

This document discusses using Apache Cassandra to store and manage time series data in OpenNMS. It describes some limitations of the existing RRDTool-based data storage, such as high I/O requirements for updating and aggregating data. Cassandra is presented as an alternative that is optimized for write throughput, flexible data modeling, high availability, and ability to perform aggregations at read time rather than write time. The Newts project is introduced as a standalone time series data store built on Cassandra that aims to provide fast storage and retrieval of raw samples along with flexible aggregation capabilities.

apachecassandrastrangeloop
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra

Whether it's statistics, weather forecasting, astronomy, finance, or network management, time series data plays a critical role in analytics and forecasting. Unfortunately, while many tools exist for time series storage and analysis, few are able to scale past memory limits, or provide rich query and analytics capabilities outside what is necessary to produce simple plots; For those challenged by large volumes of data, there is much room for improvement. Apache Cassandra is a fully distributed second-generation database. Cassandra stores data in key-sorted order making it ideal for time series, and its high throughput and linear scalability make it well suited to very large data sets. This talk will cover some of the requirements and challenges of large scale time series storage and analysis. Cassandra data and query modeling for this use-case will be discussed, and Newts, an open source Cassandra-based time series store under development at The OpenNMS Group will be introduced.

cassandradistributedbbuzz
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD

OpenNMS User Conference Europe presentation on using Apache Cassandra and Newts for time-series data storage.

big datadatabasenetwork management
Project History
                                     Description
                                          People


Updating




     Insert column (by key)
     Batch insert (multi-column but still by key)
     Remove (by key)
     Remove key range RSN




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


Consistency



  CAP Theorem: choose any two of Consistency, Availability, or
  Partition tolerance.
      Zero
      One
      Quorum ((N / 2) + 1)
      All




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


Client API


      Thrift (12 different languages!)
      Ruby
              http://github.com/fauna/cassandra/tree/master
              http://github.com/NZKoz/cassandra object/tree/master
      Python
              http://github.com/digg/lazyboy/tree/master
              http://github.com/driftx/Telephus/tree/master (Twisted)
      Scala
              http://github.com/viktorklang/Cassidy/tree/master
              http://github.com/nodeta/scalandra/tree/master




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                     Description
                                          People


Performance vs MySQL w/ 50GB




     MySQL
              300ms write
              350ms read

     Cassandra
              0.12ms write
              15ms read




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra

Presented at Cassandra London (April 7, 2014); The challenges of time-series storage and analytics in OpenNMS, with an introduction to Newts, a new Cassandra-based time-series data store.

distributed databasenosqlapache
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra

A presentation on the recent work to transition Cassandra from its naive 1-partition-per-node distribution, to a proper virtual nodes implementation.

cassandradatabasedistributed database
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3

This document summarizes a presentation about modeling data with Cassandra Query Language (CQL) using examples from a Twitter-like application called Twissandra. It introduces CQL as an alternative to Thrift for querying Cassandra and describes how to model users, followers, tweets, timelines and other social media data structures in Cassandra tables. The presentation emphasizes denormalizing data and using materialized views to optimize queries, and concludes by noting that applications can be built in various languages thanks to Cassandra drivers.

nosqldatabasecassandra
Project History
                                      Description
                                           People


Writes




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


About writes...



      No reads
      No seeks
      Sequential disk access
      Atomic within a column family
      Fast
      Any node
      Always writeable (hinted hand-off)




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                     Description
                                          People


Reads




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                      Description
                                           People


About reads...




      Any node
      Read repair
      Usual caching conventions apply




     Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

Recommended for you

Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3

CQL is the query language for Apache Cassandra that provides an SQL-like interface. The document discusses the evolution from the older Thrift RPC interface to CQL and provides examples of modeling tweet data in Cassandra using tables like users, tweets, following, followers, userline, and timeline. It also covers techniques like denormalization, materialized views, and batch loading of related data to optimize for common queries.

ddtx13distributed databasesql
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)

The document discusses topology and partitioning in Cassandra distributed hash tables (DHTs). It describes issues with poor load distribution and data distribution in traditional DHT designs. It proposes using virtual nodes, where each physical node is assigned multiple tokens, to better distribute partitions and improve performance. Configuration options for Cassandra are presented that implement virtual nodes using a random token assignment strategy.

distributed databasenosqlvirtual nodes
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra

The document discusses Cassandra's topology and how it is moving from a single token per node model to a virtual node model where each node is assigned multiple tokens. This improves load balancing and data distribution in the cluster. Specifically, it addresses problems with the single token approach like poor load distribution when nodes fail and inefficient data movement when adding or replacing nodes. The virtual node model with random token assignment provides better scaling properties as the number of nodes and data size increases.

nosqldatabaseapache
Project History
                                       Description
                                            People


Outline



  1   Project History


  2   Description


  3   People




      Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                     Description
                                          People


Droppin’ Names




     Facebook
     Digg
     IBM Research
     Rackspace
     Twitter




    Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra
Project History
                                 Description
                                      People




       http://incubator.apache.org/cassandra

                #cassandra / irc.freenode.net




Eric Evans eevans@rackspace.com @jericevans     An Introduction To Cassandra

More Related Content

Viewers also liked

Cassandra ppt 1
Cassandra ppt 1Cassandra ppt 1
Cassandra ppt 1
Skillwise Group
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
Eric Evans
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
nickmbailey
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
Robert Stupp
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and Consistency
Benjamin Black
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
Eric Evans
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
DataStax
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
Michelle Darling
 

Viewers also liked (8)

Cassandra ppt 1
Cassandra ppt 1Cassandra ppt 1
Cassandra ppt 1
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and Consistency
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

More from Eric Evans

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Eric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
Eric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
Eric Evans
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
Eric Evans
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
Eric Evans
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3
Eric Evans
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)
Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
Eric Evans
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
Eric Evans
 
CQL: SQL In Cassandra
CQL: SQL In CassandraCQL: SQL In Cassandra
CQL: SQL In Cassandra
Eric Evans
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
Eric Evans
 

More from Eric Evans (16)

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
 
CQL: SQL In Cassandra
CQL: SQL In CassandraCQL: SQL In Cassandra
CQL: SQL In Cassandra
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
 

Recently uploaded

論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
Toru Tamaki
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
Matthew Sinclair
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
ScyllaDB
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
Stephanie Beckett
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
UiPathCommunity
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Eric D. Schabell
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
 
find out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challengesfind out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challenges
huseindihon
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
Vijayananda Mohire
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
jackson110191
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Best Programming Language for Civil Engineers
Best Programming Language for Civil EngineersBest Programming Language for Civil Engineers
Best Programming Language for Civil Engineers
Awais Yaseen
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
BookNet Canada
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
ScyllaDB
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 

Recently uploaded (20)

論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
UiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs ConferenceUiPath Community Day Kraków: Devs4Devs Conference
UiPath Community Day Kraków: Devs4Devs Conference
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
 
find out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challengesfind out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challenges
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Best Programming Language for Civil Engineers
Best Programming Language for Civil EngineersBest Programming Language for Civil Engineers
Best Programming Language for Civil Engineers
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...Transcript: Details of description part II: Describing images in practice - T...
Transcript: Details of description part II: Describing images in practice - T...
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 

An Introduction To Cassandra

  • 1. Project History Description People An Introduction To Cassandra Eric Evans eevans@rackspace.com @jericevans OpenSQL Camp November 14, 2009 Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 2. Project History Description People A prophetess in Troy during the Trojan War. Her predictions were always true, but never believed. Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 3. Project History Description People A massively scalable, decentralized, structured data store (aka database). Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 4. Project History Description People Outline 1 Project History 2 Description 3 People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 5. Project History Description People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 6. Project History Description People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 7. Project History Description People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 8. Project History Description People 4 new committers added Dozens of contributors 60+ people on IRC Hundreds of closed issues (bugs, features, etc) 2 major releases, 1 point release 0.5.0 RSN Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 9. Project History Description People Outline 1 Project History 2 Description 3 People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 10. Project History Description People Cassandra is... O(1) DHT Eventual consistency Tunable trade-offs, consistency vs. latency Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 11. Project History Description People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 12. Project History Description People But... Values are structured, indexed Columns / column families Slicing w/ predicates (queries) Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 13. Project History Description People Column families Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 14. Project History Description People Supercolumn families Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 15. Project History Description People Querying By column By column for multiple keys Slice by names, or ranges of names returning columns returning super columns Slice for multiple keys Range of keys Slice on a key range RSN Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 16. Project History Description People Column comparators TimeUUID LexicalUUID UTF8 Long Bytes ... Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 17. Project History Description People Updating Insert column (by key) Batch insert (multi-column but still by key) Remove (by key) Remove key range RSN Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 18. Project History Description People Consistency CAP Theorem: choose any two of Consistency, Availability, or Partition tolerance. Zero One Quorum ((N / 2) + 1) All Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 19. Project History Description People Client API Thrift (12 different languages!) Ruby http://github.com/fauna/cassandra/tree/master http://github.com/NZKoz/cassandra object/tree/master Python http://github.com/digg/lazyboy/tree/master http://github.com/driftx/Telephus/tree/master (Twisted) Scala http://github.com/viktorklang/Cassidy/tree/master http://github.com/nodeta/scalandra/tree/master Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 20. Project History Description People Performance vs MySQL w/ 50GB MySQL 300ms write 350ms read Cassandra 0.12ms write 15ms read Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 21. Project History Description People Writes Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 22. Project History Description People About writes... No reads No seeks Sequential disk access Atomic within a column family Fast Any node Always writeable (hinted hand-off) Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 23. Project History Description People Reads Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 24. Project History Description People About reads... Any node Read repair Usual caching conventions apply Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 25. Project History Description People Outline 1 Project History 2 Description 3 People Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 26. Project History Description People Droppin’ Names Facebook Digg IBM Research Rackspace Twitter Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra
  • 27. Project History Description People http://incubator.apache.org/cassandra #cassandra / irc.freenode.net Eric Evans eevans@rackspace.com @jericevans An Introduction To Cassandra