The Cassandra Distributed Database

             Eric Evans
        eevans@rackspace.com
             @jericevans


             FOSDEM
          February 7, 2010
A prophetess in Troy during the Trojan War. Her predictions were
always true, but never believed.
A massively scalable, decentralized, structured data store (aka
database).
Outline



1 Project History


2 Description


3 Case Studies


4 Roadmap

• 7 new committers added
• Dozens of contributors
• 100+ people on IRC
• Hundreds of closed issues (bugs, features, etc)
• 3 major releases, 2 point releases
• Graduation to TLP?

2 Description
Cassandra is...




• O(1) DHT
• Eventual consistency
• Tunable trade-offs, consistency vs. latency
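
The "O(1) DHT" bullet can be illustrated with a toy token ring: if every node knows the whole ring, any node can locate a key's replicas in one step. This is only a sketch under assumptions. `ToyRing`, `token()`, and the node names are hypothetical, MD5 is used purely as an example hash, and node tokens are derived from node names for brevity rather than assigned by an operator.

```python
import hashlib
from bisect import bisect_left


def token(name):
    """Map a string onto the ring (MD5 here is an assumption for illustration)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Every node holds the full, sorted token list, so lookup is one local step."""

    def __init__(self, nodes):
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key, n=1):
        """Return the n nodes that follow the key's token clockwise on the ring."""
        size = len(self.ring)
        # First node token >= key token; wrap to the start past the last token.
        start = bisect_left(self.tokens, token(key)) % size
        return [self.ring[(start + i) % size][1] for i in range(min(n, size))]


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = ToyRing(nodes)
owners = ring.replicas("jsmith", n=3)
print(owners)  # three distinct nodes, the same on every call
```

The lookup cost does not grow with the number of routing hops because there are none: the node that receives the request computes the owners itself.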
But...




• Values are structured, indexed
• Columns / column families
• Slicing w/ predicates (queries)

Column families
Supercolumn families
Querying



• get(): retrieve by column name
• multiget(): by column name for a set of keys
• get_slice(): by column name, or a range of names
    • returning columns
    • returning super columns
• multiget_slice(): a subset of columns for a set of keys
• get_count(): number of columns or sub-columns
• get_range_slice(): subset of columns for a range of keys
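
To make the slice semantics above concrete, here is a small in-memory model in plain Python. It is not the Thrift client API: `ToyColumnFamily` and its method signatures are hypothetical, and column names are compared as ordinary strings.

```python
from bisect import bisect_left, bisect_right


class ToyColumnFamily:
    """In-memory stand-in for one column family: row key -> named columns."""

    def __init__(self):
        self.rows = {}  # row key -> {column name: value}

    def insert(self, key, column, value):
        self.rows.setdefault(key, {})[column] = value

    def get(self, key, column):
        return self.rows[key][column]

    def get_slice(self, key, start="", finish="", count=100):
        """Columns of one row with names in [start, finish], in sorted order."""
        names = sorted(self.rows.get(key, {}))
        lo = bisect_left(names, start) if start else 0
        hi = bisect_right(names, finish) if finish else len(names)
        return [(n, self.rows[key][n]) for n in names[lo:hi][:count]]

    def multiget_slice(self, keys, start="", finish="", count=100):
        """The same slice applied to several row keys."""
        return {k: self.get_slice(k, start, finish, count) for k in keys}

    def get_count(self, key):
        return len(self.rows.get(key, {}))


cf = ToyColumnFamily()
cf.insert("jsmith", "email", "j@example.com")
cf.insert("jsmith", "first", "John")
cf.insert("jsmith", "last", "Smith")
print(cf.get_slice("jsmith", start="first", finish="last"))
# [('first', 'John'), ('last', 'Smith')]
```

An empty `start` or `finish` leaves that end of the range open, so `get_slice(key)` returns the first `count` columns of the row.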
Column comparators



• TimeUUID
• LexicalUUID
• UTF8
• Long
• Bytes
• ...
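
A comparator decides the order in which columns are sorted within a row, and so what a "range of names" means in a slice. The sketch below only illustrates why the choice matters. It uses decimal strings for readability rather than any real on-disk encoding, and the variable names are made up for the example.

```python
import uuid

# The same three column names under two different comparators.
names = [b"10", b"9", b"100"]

# Bytes-style ordering: compare the raw bytes lexically.
bytes_order = sorted(names)

# Long-style ordering: interpret each name as an integer before comparing.
long_order = sorted(names, key=lambda n: int(n))

print(bytes_order)  # [b'10', b'100', b'9']
print(long_order)   # [b'9', b'10', b'100']

# TimeUUID-style ordering: sort version 1 UUIDs by their embedded timestamp,
# which is useful when column names record the time of an event.
ids = [uuid.uuid1() for _ in range(3)]
time_order = sorted(ids, key=lambda u: u.time)
```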

Updating




• insert(): add/update column (by key)
• batch_insert(): add/update multiple columns (by key)
• remove(): remove a column
• batch_mutate(): like batch_insert() but can also delete
  (new for 0.6, deprecates batch_insert())
• Remove key range RSN
Consistency



CAP Theorem: choose any two of Consistency, Availability, or
Partition tolerance.
  • Zero
  • One
  • Quorum ((N / 2) + 1)
  • All
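
The levels above translate into a number of replicas that must respond, where N is the replication factor. A short sketch of the arithmetic follows. The function names are hypothetical, and the `R + W > N` overlap rule is the standard quorum argument rather than something stated on the slide.

```python
def required_acks(level, n):
    """Replica responses needed at each consistency level for replication factor n."""
    levels = {
        "ZERO": 0,              # writes only: do not wait for any replica
        "ONE": 1,
        "QUORUM": n // 2 + 1,   # (N / 2) + 1 with integer division
        "ALL": n,
    }
    return levels[level]


def read_overlaps_write(n, r, w):
    """True when every read set of r replicas shares a node with every write set of w."""
    return r + w > n


n = 3
print(required_acks("QUORUM", n))  # 2
print(read_overlaps_write(n, r=2, w=2))  # True: quorum reads see quorum writes
print(read_overlaps_write(n, r=1, w=1))  # False: faster, but may read stale data
```

This is the consistency versus latency trade-off in miniature: lower levels wait for fewer replicas and answer sooner, at the cost of possibly stale reads.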
Client API


• Thrift (12 different languages!)
• Ruby
    • http://github.com/fauna/cassandra/tree/master
    • http://github.com/NZKoz/cassandra_object/tree/master
• Python
    • http://github.com/digg/lazyboy/tree/master
    • http://github.com/driftx/Telephus/tree/master (Twisted)
• Scala
    • http://github.com/viktorklang/Cassidy/tree/master
    • http://github.com/nodeta/scalandra/tree/master
Performance vs MySQL w/ 50GB




• MySQL
   • 300ms write
   • 350ms read

• Cassandra
    • 0.12ms write
    • 15ms read

Writes
About writes...



• No reads
• No seeks
• Sequential disk access
• Atomic within a column family
• Fast
• Any node
• Always writeable (hinted hand-off)
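
"No reads, no seeks, sequential disk access" is what a log-structured write path gives you. The sketch below assumes the usual design of an append-only commit log plus an in-memory table, which the slide does not spell out. `ToyWritePath`, the JSON record format, and the file name are all hypothetical.

```python
import json
import os
import tempfile


class ToyWritePath:
    """Append-only commit log plus an in-memory table (memtable)."""

    def __init__(self, log_path):
        # Append mode: every write lands at the end of the file, so no seeks.
        self.log = open(log_path, "a", encoding="utf-8")
        self.memtable = {}  # row key -> {column name: (value, timestamp)}

    def write(self, key, column, value, timestamp):
        record = {"key": key, "column": column, "value": value, "ts": timestamp}
        self.log.write(json.dumps(record) + "\n")
        self.log.flush()
        # Update memory only; nothing is read back from disk.
        row = self.memtable.setdefault(key, {})
        current = row.get(column)
        if current is None or timestamp >= current[1]:
            row[column] = (value, timestamp)

    def close(self):
        self.log.close()


log_path = os.path.join(tempfile.mkdtemp(), "commitlog.jsonl")
node = ToyWritePath(log_path)
node.write("jsmith", "email", "old@example.com", timestamp=1)
node.write("jsmith", "email", "new@example.com", timestamp=2)
node.close()
```

Both writes are appended to the log in order, and the in-memory table keeps the value with the highest timestamp. The sketch leaves out flushing the memtable to disk and hinted hand-off.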
Reads
About reads...




• Any node
• Read repair
• Usual caching conventions apply
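
Read repair can be sketched as: ask the replicas, return the newest version, and push that version to any replica that was behind. This is a simplified illustration. `read_with_repair` is a hypothetical helper, each replica is a plain dict of `key -> (value, timestamp)`, and the repair happens inline rather than in the background.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and bring stale replicas up to date."""
    versions = [(replica.get(key), replica) for replica in replicas]
    present = [v for v, _ in versions if v is not None]
    if not present:
        return None
    newest = max(present, key=lambda v: v[1])  # highest timestamp wins
    for version, replica in versions:
        if version != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[0]


a = {"jsmith": ("old@example.com", 1)}
b = {"jsmith": ("new@example.com", 2)}
c = {}  # this replica missed the write entirely

print(read_with_repair([a, b, c], "jsmith"))  # new@example.com
print(a == b == c)  # True: all three replicas now agree
```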

3 Case Studies
Case 1: Digg




Digg is a social news site that allows people to discover and share
content from anywhere on the Internet by submitting stories and
links, and voting and commenting on submitted stories and links.

Ranked 98th by Alexa.com.
Digg
Problem




• Terabytes of data; high transaction rate (read-dominated)
• Multiple clusters; heavily sharded
• Management nightmare (high effort, error prone)
• Unsatisfied availability requirements (geographic isolation)

Solution




• Currently in production on "Green Badges"
• Cassandra as primary data store RSN
• Datacenter and rack-aware replication
Case 2: Twitter




Twitter is a social networking and microblogging service that
enables its users to send and read tweets, text-based posts of up to
140 characters.

Ranked 12th by Alexa.com.
Twitter
MySQL




• Terabytes of data, ~1,000,000 ops/s
• Calls for heavy sharding, light replication
• Schema changes are very difficult (if possible at all)
• Manual sharding is very high effort
• Automated sharding and replication is Hard

Case 3: Facebook




Facebook is a social networking site where users can create a
profile, add friends, and send them messages. Users can also join
groups organized by location or other points of common interest.

Ranked #2 by Alexa.com.
Inbox Search




• 100 TB
• 160 nodes
• 1/2 billion writes per day (2yr old number?)
Case 4: Mahalo




Mahalo.com is a web directory and knowledge exchange. It
differentiates itself by tracking and building hand-crafted result
sets for many of the popular search terms.

(it also means "thank you" in Hawaiian)
MySQL




• Partial deployment; 16 million video records (and growing)
• Writes (and storage) rapidly exceeding single box limitations
• Manageability suffering (clustering is painful)
• Concerns over availability

4 Roadmap
0.6


• batch_mutate command
• authentication (basic)
• new consistency level, ANY
• fat client
• mmapped i/o reads (default on 64bit jvm)
• improved write concurrency (HH)
• networking optimizations
• row caching
• improved management tools
• per-keyspace replication factor
0.7


• more efficient compactions (row sizes bigger than memory)
• easier (dynamic?) column family changes
• SSTable versioning
• SSTable compression
• support for column family truncation
• improved configuration handling
• remove key range command
• even more improved management tools
• vector clocks w/ server-side conflict resolution
THE END

Recommended for you

Cassandra
Cassandra Cassandra
Cassandra

Cassandra is a highly scalable, open-source distributed database designed to handle large amounts of structured data across many servers. It provides high availability with no single point of failure and was created by Facebook to power search on their messaging platform. Cassandra uses a decentralized peer-to-peer architecture and replicates data across multiple data centers for fault tolerance. It emphasizes performance and scalability over more complex query options and does not support features like joins typically found in relational databases. Companies like Netflix and Hulu use Cassandra for its availability, scalability, and ability to span large clusters with minimal maintenance.

Presentacion mercy angulo
Presentacion  mercy anguloPresentacion  mercy angulo
Presentacion mercy angulo

Mercy Natalia Angulo Pinillos es una profesora que trabaja en el Instituto Educativo Antonio José de Sucre en el Valle del Cauca. Ella tiene títulos de Normalista Bachiller, Normalista Superior, Licenciada en Ciencias Naturales y Especialista en Informática Educativa. Imparte clases de grado primero y preescolar y su principal dificultad es el transporte. En su tiempo libre le gusta leer y visitar amigos. Sus aspiraciones son aplicar mejor las TIC en su labor docente y ampliar su conocimiento sobre cómo utilizar las herramientas

Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra

Cassandra presentation given at the 3rd annual Palmetto Open Source Software Conference (POSSCON 2010).

cassandra nosql database dbms db

More Related Content

What's hot

Cassandra
CassandraCassandra
Cassandra
Upaang Saxena
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
DataStax Academy
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0
Asis Mohanty
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
Aaron Ploetz
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
Christian Johannsen
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
Sean Murphy
 
Cassandra Architecture FTW
Cassandra Architecture FTWCassandra Architecture FTW
Cassandra Architecture FTW
Jeffrey Carpenter
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
SoftwareMill
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
DataStax
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
Michelle Darling
 
Cassandra
CassandraCassandra
Cassandra
Edureka!
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
Jeremy Hanna
 
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
DataStax Academy
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
jbellis
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
Benoit Perroud
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
András Fehér
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
Markus Klems
 
Introduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache CassandraIntroduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache Cassandra
Chetan Baheti
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
DataStax Academy
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
DataStax Academy
 

What's hot (20)

Cassandra
CassandraCassandra
Cassandra
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Cassandra Architecture FTW
Cassandra Architecture FTWCassandra Architecture FTW
Cassandra Architecture FTW
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
Cassandra
CassandraCassandra
Cassandra
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Introduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache CassandraIntroduction to NoSQL & Apache Cassandra
Introduction to NoSQL & Apache Cassandra
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 

The Cassandra Distributed Database

  • 1. The Cassandra Distributed Database Eric Evans eevans@rackspace.com @jericevans FOSDEM February 7, 2010
  • 2. A prophetess in Troy during the Trojan War. Her predictions were always true, but never believed.
  • 3. A massively scalable, decentralized, structured data store (aka database).
  • 4. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  • 8.
      • 7 new committers added
      • Dozens of contributors
      • 100+ people on IRC
      • Hundreds of closed issues (bugs, features, etc.)
      • 3 major releases, 2 point releases
      • Graduation to TLP?
  • 9. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  • 10. Cassandra is...
      • O(1) DHT
      • Eventual consistency
      • Tunable trade-offs, consistency vs. latency
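A minimal sketch of the DHT idea on this slide, assuming a consistent-hashing ring in which every node knows all tokens, so a request can be routed to the owning node in a single hop (one reading of "O(1)"). The `Ring` class, the `token` helper, the use of MD5, and the node names are illustrative assumptions and are not taken from the talk or from Cassandra's code.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a string onto the ring (MD5 is used here purely for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical token ring: each node owns the range that ends at its token."""

    def __init__(self, nodes):
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key: str, n: int = 3):
        """Return up to n distinct nodes, walking clockwise from the key's token."""
        start = bisect.bisect_left(self._tokens, token(key))
        count = min(n, len(self._entries))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42"))  # three distinct nodes; which ones depends on the hash
```

Because the lookup only needs the sorted token list, any node can compute the replica set locally without asking its peers.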
  • 12. But...
      • Values are structured, indexed
      • Columns / column families
      • Slicing w/ predicates (queries)
  • 15. Querying
      • get(): retrieve by column name
      • multiget(): by column name for a set of keys
      • get_slice(): by column name, or a range of names
          • returning columns
          • returning super columns
      • multiget_slice(): a subset of columns for a set of keys
      • get_count: number of columns or sub-columns
      • get_range_slice(): subset of columns for a range of keys
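To make the column-family model and the slice queries more concrete, here is a toy in-memory sketch. It borrows the operation names from the slide, but the function signatures, the nested-dict layout, and the sample data are assumptions made for illustration; this is not the Thrift API. Python's string ordering stands in for a UTF8 column comparator.

```python
from typing import Dict, List, Optional, Tuple

# Assumed layout: column family -> row key -> {column name: (value, timestamp)}
Store = Dict[str, Dict[str, Dict[str, Tuple[str, int]]]]


def insert(store: Store, cf: str, key: str, name: str, value: str, ts: int) -> None:
    """Add or update one column; an older timestamp never overwrites a newer one."""
    row = store.setdefault(cf, {}).setdefault(key, {})
    current = row.get(name)
    if current is None or ts >= current[1]:
        row[name] = (value, ts)


def get(store: Store, cf: str, key: str, name: str) -> Optional[str]:
    """Retrieve a single column value by name."""
    col = store.get(cf, {}).get(key, {}).get(name)
    return None if col is None else col[0]


def get_slice(store: Store, cf: str, key: str, start: str = "",
              finish: str = "", count: int = 100) -> List[Tuple[str, str]]:
    """Return columns whose names fall in [start, finish], in comparator order."""
    row = store.get(cf, {}).get(key, {})
    names = sorted(n for n in row if n >= start and (finish == "" or n <= finish))
    return [(n, row[n][0]) for n in names[:count]]


def get_count(store: Store, cf: str, key: str) -> int:
    """Number of columns stored under a row key."""
    return len(store.get(cf, {}).get(key, {}))


store: Store = {}
insert(store, "Users", "jsmith", "email", "j@example.com", 1)
insert(store, "Users", "jsmith", "name", "John", 1)
insert(store, "Users", "jsmith", "zip", "12345", 1)
print(get_slice(store, "Users", "jsmith", start="f", finish="o"))  # [('name', 'John')]
```

The point of the sketch is that a row is a sorted map of columns, so a "query" is a range scan over column names rather than a lookup of an opaque value.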
  • 16. Column comparators
      • TimeUUID
      • LexicalUUID
      • UTF8
      • Long
      • Bytes
      • ...
  • 17. Updating
      • insert(): add/update column (by key)
      • batch_insert(): add/update multiple columns (by key)
      • remove(): remove a column
      • batch_mutate(): like batch_insert() but can also delete (new for 0.6, deprecates batch_insert())
      • Remove key range RSN
  • 18. Consistency
      CAP Theorem: choose any two of Consistency, Availability, or Partition tolerance.
      • Zero
      • One
      • Quorum ((N / 2) + 1)
      • All
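The quorum formula on this slide uses integer division. A short sketch of the arithmetic follows, with N as the replication factor. The helper names and the mapping of levels to acknowledgement counts are illustrative assumptions. The `r + w > n` overlap check is a general rule of thumb for replicated stores and does not appear on the slide.

```python
def quorum(n: int) -> int:
    """Quorum size for n replicas: (N / 2) + 1 with integer division."""
    return n // 2 + 1


def required_acks(level: str, n: int) -> int:
    """Replica acknowledgements awaited for each level named on the slide."""
    return {"ZERO": 0, "ONE": 1, "QUORUM": quorum(n), "ALL": n}[level]


def overlaps(n: int, r: int, w: int) -> bool:
    """True when every read set must intersect every write set (r + w > n)."""
    return r + w > n


for n in (3, 4, 5):
    print(n, quorum(n))  # 3 2, then 4 3, then 5 3
```

With N = 3, reading and writing at Quorum gives 2 + 2 > 3, so a read always reaches at least one replica holding the latest write. Reading and writing at One gives 1 + 1 = 2, which trades that guarantee for lower latency.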
  • 19. Client API
      • Thrift (12 different languages!)
      • Ruby
          • http://github.com/fauna/cassandra/tree/master
          • http://github.com/NZKoz/cassandra_object/tree/master
      • Python
          • http://github.com/digg/lazyboy/tree/master
          • http://github.com/driftx/Telephus/tree/master (Twisted)
      • Scala
          • http://github.com/viktorklang/Cassidy/tree/master
          • http://github.com/nodeta/scalandra/tree/master
  • 20. Performance vs MySQL w/ 50GB
      • MySQL
          • 300ms write
          • 350ms read
      • Cassandra
          • 0.12ms write
          • 15ms read
  • 22. About writes...
      • No reads
      • No seeks
      • Sequential disk access
      • Atomic within a column family
      • Fast
      • Any node
      • Always writeable (hinted hand-off)
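One way to picture why a write needs no reads and no seeks is a log-structured write path: append to a commit log, update an in-memory table, and periodically flush that table as a sorted, immutable batch. The sketch below assumes that design. The class name, the tiny flush threshold, and the in-memory list standing in for the on-disk log are illustrative assumptions, and timestamp reconciliation is left out.

```python
class ColumnFamilyStore:
    """Toy write path: append-only log, in-memory memtable, sorted flushes."""

    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []   # append-only; stands in for a sequential on-disk log
        self.memtable = {}     # most recent writes, held in memory
        self.sstables = []     # immutable, sorted batches flushed from the memtable
        self.memtable_limit = memtable_limit

    def write(self, key: str, column: str, value: str, timestamp: int) -> None:
        # Existing data is never read or modified in place on the write path.
        self.commit_log.append((key, column, value, timestamp))
        self.memtable[(key, column)] = (value, timestamp)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # One sequential write of the whole memtable, sorted by (key, column).
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}


store = ColumnFamilyStore(memtable_limit=2)
store.write("k1", "a", "1", 1)
store.write("k0", "b", "2", 2)
print(store.sstables)  # [[(('k0', 'b'), ('2', 2)), (('k1', 'a'), ('1', 1))]]
```

Every step is either an append or an in-memory update, which is consistent with the "sequential disk access" bullet above.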
  • 23. Reads
  • 24. About reads...
      • Any node
      • Read repair
      • Usual caching conventions apply
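Read repair can be sketched as follows: compare the versions returned by the replicas, answer with the newest, and push that version to any replica that is stale or missing it. The function below is a simplified, synchronous illustration under that assumption. Replicas are modeled as plain dicts mapping a key to a (value, timestamp) pair, and the function name is hypothetical.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and update stale replicas in place."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    versions = [version for _, version in responses if version is not None]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[1])  # highest timestamp wins
    for replica, version in responses:
        if version is None or version[1] < newest[1]:
            replica[key] = newest  # repair the stale or missing copy
    return newest[0]


a = {"k": ("old", 1)}
b = {"k": ("new", 2)}
c = {}
print(read_with_repair([a, b, c], "k"))  # new
print(a["k"], c["k"])                    # ('new', 2) ('new', 2)
```

After the read, all three replicas agree, which is how reads help an eventually consistent store converge.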
  • 25. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  • 26. Case 1: Digg
      Digg is a social news site that allows people to discover and share content from anywhere on the Internet by submitting stories and links, and voting and commenting on submitted stories and links. Ranked 98th by Alexa.com.
  • 27. Digg
  • 28. Problem
      • Terabytes of data; high transaction rate (reads dominated)
      • Multiple clusters; heavily sharded
      • Management nightmare (high effort, error prone)
      • Unsatisfied availability requirements (geographic isolation)
  • 29. Solution
      • Currently production on "Green Badges"
      • Cassandra as primary data store RSN
      • Datacenter and rack-aware replication
  • 30. Case 2: Twitter
      Twitter is a social networking and microblogging service that enables its users to send and read tweets, text-based posts of up to 140 characters. Ranked 12th by Alexa.com.
  • 32. MySQL
      • Terabytes of data, ~1,000,000 ops/s
      • Calls for heavy sharding, light replication
      • Schema changes are very difficult (if possible at all)
      • Manual sharding is very high effort
      • Automated sharding and replication is Hard
  • 33. Case 3: Facebook
      Facebook is a social networking site where users can create a profile, add friends, and send them messages. Users can also join groups organized by location or other points of common interest. Ranked #2 by Alexa.com.
  • 34. Inbox Search
      • 100 TB
      • 160 nodes
      • 1/2 billion writes per day (2yr old number?)
  • 35. Case 4: Mahalo
      Mahalo.com is a web directory and knowledge exchange. It differentiates itself by tracking and building hand-crafted result sets for many of the popular search terms. (It also means "thank you" in Hawaiian.)
  • 36. MySQL
      • Partial deployment; 16 million video records (and growing)
      • Writes (and storage) rapidly exceeding single box limitations
      • Manageability suffering (clustering is painful)
      • Concerns over availability
  • 37. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  • 38. 0.6
      • batch_mutate command
      • authentication (basic)
      • new consistency level, ANY
      • fat client
      • mmapped i/o reads (default on 64bit jvm)
      • improved write concurrency (HH)
      • networking optimizations
      • row caching
      • improved management tools
      • per-keyspace replication factor
  • 39. 0.7
      • more efficient compactions (row sizes bigger than memory)
      • easier (dynamic?) column family changes
      • SSTable versioning
      • SSTable compression
      • support for column family truncation
      • improved configuration handling
      • remove key range command
      • even more improved management tools
      • vector clocks w/ server-side conflict resolution