Cassandra Explained


    Berlin Buzzwords
      June 6, 2010

            Eric Evans
    eevans@rackspace.com
           @jericevans
    http://blog.sym-link.com
Outline
●   Background
●   Description
●   API
●   Examples
Background
Influential Papers
●   BigTable
    ● Strong consistency
    ● Sparse map data model


    ● GFS, Chubby, et al


●   Dynamo
    ●   O(1) distributed hash table (DHT)
    ●   BASE (aka eventual consistency)
    ●   Client tunable consistency/availability

NoSQL
●   HBase          ●   Hypertable
●   MongoDB        ●   HyperGraphDB
●   Riak           ●   Memcached
●   Voldemort      ●   Tokyo Cabinet
●   Neo4J          ●   Redis
●   Cassandra      ●   CouchDB
NoSQL Big data
●   HBase           ●   Hypertable
●   MongoDB         ●   HyperGraphDB
●   Riak            ●   Memcached
●   Voldemort       ●   Tokyo Cabinet
●   Neo4J           ●   Redis
●   Cassandra       ●   CouchDB
Bigtable / Dynamo
        Bigtable              Dynamo
●   HBase          ●   Riak
●   Hypertable     ●   Voldemort



            Cassandra ??
Dynamo-Bigtable Lovechild

CAP Theorem “Pick Two”
●   CP               ●   AP
    ●   Bigtable         ●   Dynamo
    ●   Hypertable       ●   Voldemort
    ●   HBase            ●   Cassandra
CAP Theorem “Pick Two”



   ●   Consistency
   ●   Availability
   ●   Partition Tolerance
Description
Properties
●   Symmetric
    ● No single point of failure
    ● Linearly scalable


    ● Ease of administration


●   Flexible partitioning, replica placement
●   Automated provisioning
●   High availability (eventual consistency)

P2P Routing
Partitioning
●   Random
    ●   128bit namespace, (MD5)
    ●   Good distribution
●   Order Preserving
    ●   Tokens determine namespace
    ●   Natural order (lexicographical)
    ●   Range / cover queries
●   Yours ??
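
As a rough illustration of the random partitioner idea, the sketch below hashes a key with MD5 to get a token in the 128-bit namespace and finds the node that owns it on a ring. The node names, the evenly spaced tokens, and the helper names `md5_token` and `owner` are assumptions made for this example; Cassandra's actual token arithmetic is not shown in these slides.

```python
import hashlib
from bisect import bisect_left

def md5_token(key):
    """Map a key (bytes) to an integer in the 128-bit MD5 namespace."""
    return int(hashlib.md5(key).hexdigest(), 16)

def owner(ring, token):
    """Return the node owning `token`: the first node whose token is >= it,
    wrapping around to the node with the lowest token."""
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]

# Hypothetical 4-node ring with evenly spaced tokens, sorted by token.
ring = [(i * 2**128 // 4, "node%d" % i) for i in range(4)]

token = md5_token(b"org.apache")
print(owner(ring, token))
```

An order-preserving partitioner would replace `md5_token` with a function that keeps the natural order of the keys, which is what makes range queries over keys possible.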
Replica Placement
●   SimpleSnitch
    ●   Default
    ●   N-1 successive nodes
●   RackInferringSnitch
    ●   Infers DC/rack from IP
●   PropertyFileSnitch
    ●   Configured w/ a properties file
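
The default placement described above (the owning node plus the next N-1 nodes around the ring) might be sketched as follows. The node list and the helper name `replicas` are hypothetical, and rack or datacenter awareness is deliberately left out.

```python
def replicas(ring_nodes, primary_index, n):
    """Return the primary node plus the N-1 nodes that follow it on the ring."""
    count = min(n, len(ring_nodes))
    return [ring_nodes[(primary_index + i) % len(ring_nodes)] for i in range(count)]

nodes = ["A", "B", "C", "D", "E"]   # nodes in ring (token) order
print(replicas(nodes, 3, 3))        # primary D, then E, wrapping around to A
```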

Bootstrap
Choosing Consistency

         Write                      Read
Level     Description      Level     Description
ZERO      Hail Mary        ZERO      N/A
ANY       1 replica (HH)   ANY       N/A
ONE       1 replica        ONE       1 replica
QUORUM    (N / 2) +1       QUORUM    (N / 2) +1
ALL       All replicas     ALL       All replicas

                       R+W>N
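
A small worked example of the arithmetic in the table, assuming integer division in the quorum formula. The function names are made up for illustration: with N = 3 a quorum is 2, so QUORUM reads combined with QUORUM writes satisfy R + W > N, while ONE with ONE does not.

```python
def quorum(n):
    """Quorum size for replication factor n: (N / 2) + 1, using integer division."""
    return n // 2 + 1

def overlaps(r, w, n):
    """True when R + W > N, i.e. every read set intersects every write set."""
    return r + w > n

n = 3
q = quorum(n)
print(q)                    # 2
print(overlaps(q, q, n))    # True: QUORUM reads + QUORUM writes
print(overlaps(1, 1, n))    # False: ONE + ONE may return stale data
```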

Quorum ((N/2) + 1)
Data Model
Overview
●   Keyspace
    ●   Uppermost namespace
    ●   Typically one per application
●   ColumnFamily
    ●   Associates records of a similar kind
    ●   Record-level Atomicity
    ●   Indexed
●   Column
    ●   Basic unit of storage

Sparse Table
Column
●   name
    ●   byte[]
    ●   Queried against (predicates)
    ●   Determines sort order
●   value
    ●   byte[]
    ●   Opaque to Cassandra
●   timestamp
    ●   long
    ●   Conflict resolution (Last Write Wins)
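
Last Write Wins can be pictured as comparing the timestamps of two versions of the same column and keeping the newer one. This is a minimal sketch, not Cassandra's code: the `Column` tuple, the `reconcile` helper, the sample values, and the tie-breaking rule are all assumptions.

```python
from collections import namedtuple

Column = namedtuple("Column", ["name", "value", "timestamp"])

def reconcile(a, b):
    """Pick the winner between two versions of one column: highest timestamp wins."""
    # On equal timestamps this sketch simply keeps `a`; that choice is arbitrary.
    return a if a.timestamp >= b.timestamp else b

old = Column(b"city", b"San Antonio", 1275800000000000)
new = Column(b"city", b"Austin", 1275800000000001)
print(reconcile(old, new).value)
```

Because the outcome depends only on the client-supplied timestamps, the order in which replicas receive the two writes does not matter.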
Column Comparators
●    Bytes
●    UTF8
●    TimeUUID
●    Long
●    LexicalUUID
●    Composite (third-party)


    http://github.com/edanuff/CassandraCompositeType
API

Low / High
●    Thrift
      ●   Compact binary RPC framework
      ●   12 different languages
●    Idiomatic
      ●   Hector (Java)
      ●   Pycassa (Python)
      ●   Others...


    http://wiki.apache.org/cassandra/ClientOptions
Thrift Read Methods
●   get() → Column
●   get_slice() → list<Column>
●   multiget_slice() → map<key, list<Column>>
●   get_count() → int
●   multiget_count() → map<key, int>
●   get_range_slices()
Thrift Write Methods
●   insert()
●   batch_insert()
●   remove()
●   batch_mutate()
Examples

Pycassa – Python Client API
●    connect() → Thrift proxy
●    cf = ColumnFamily(proxy, ksp, cfname)
●    cf.insert() → long
●    cf.get() → dict
●    cf.get_range() → dict




    http://github.com/vomjom/pycassa
Address Book – Setup

<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses"
                CompareWith="BytesType"
                RowsCached="10000"
                KeysCached="50%"
                Comment="Too lame" />
</Keyspace>
Adding an entry
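from uuid import uuid4

# addresses: the pycassa ColumnFamily for "Addresses"
key = str(uuid4())  # random row key

# column values are opaque bytes to Cassandra, so the zip is stored as a string
columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'Austin',
    'zip':     '78250'
}

addresses.insert(key, columns)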
Fetching a record
# fetching the record by key
record = addresses.get(key)

# accessing columns by name
zipcode = record['zip']
city = record['city']

Indexing

<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses"
                CompareWith="BytesType"
                RowsCached="10000"
                KeysCached="50%"
                Comment="Too lame" />
  <ColumnFamily Name="ByCity"
                CompareWith="UTF8Type" />
</Keyspace>
Updating the index
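from uuid import uuid4

# addresses, byCity: pycassa ColumnFamily objects for "Addresses" and "ByCity"
key = str(uuid4())  # random row key

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'Austin',
    'zip':     '78250'
}

addresses.insert(key, columns)
# one index row per city; each column name is the key of a matching address
byCity.insert('Austin', {key: ''})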
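
To read through the index, a client would fetch the `ByCity` row for a city and then look up each column name as a key in `Addresses`. The sketch below models the two column families as plain dictionaries rather than calling pycassa, so it runs without a cluster. The helper names `add_entry` and `find_by_city`, the row keys, and the second sample record are hypothetical.

```python
# In-memory stand-ins for the two column families: {row key: {column name: value}}
addresses = {}
by_city = {}

def add_entry(key, columns):
    """Write the record, then add its key as a column name in the city's index row."""
    addresses[key] = columns
    by_city.setdefault(columns['city'], {})[key] = ''

def find_by_city(city):
    """Read the index row, then fetch each referenced record."""
    return [addresses[key] for key in sorted(by_city.get(city, {}))]

add_entry('k1', {'first': 'Eric', 'last': 'Evans', 'city': 'Austin'})
add_entry('k2', {'first': 'Jane', 'last': 'Doe', 'city': 'Berlin'})
print([r['first'] for r in find_by_city('Austin')])
```

The index is maintained by the application: the two writes are separate operations, so a client has to tolerate an index entry whose record is missing, or the reverse.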
Timeseries

<Keyspace Name="Sites">
  <ColumnFamily Name="Stats"
                CompareWith="LongType"/>
</Keyspace>
Logging values
from struct import pack
from time import time

# time in microseconds as an 8-byte long ('>q'), binary, network-order
ts = pack('>q', int(time() * 1e6))

stats.insert('org.apache', {ts: value})
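
With `LongType`, column names are compared as 8-byte integers, so the columns in a row should come back in time order. The check below is independent of Cassandra and only demonstrates the encoding: for non-negative values, big-endian packed longs sort the same way byte-wise as they do numerically. The helper name `ts_name` and the sample times are assumptions.

```python
from struct import pack, unpack

def ts_name(seconds):
    """Encode a time in seconds as microseconds in an 8-byte big-endian long."""
    return pack('>q', int(seconds * 1e6))

names = [ts_name(s) for s in (1275782400.5, 1275782400.0, 1275868800.0)]
ordered = sorted(names)                       # plain byte-wise comparison
print([unpack('>q', n)[0] for n in ordered])
```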

Slicing
# s: start of the window, in seconds since the epoch
begin = pack('>q', int(s * 1e6))

stats.get_range('org.apache',
                column_start=begin)

end = pack('>q', int((s + 86400) * 1e6))

stats.get_range(start='org.apache',
                finish='org.debian',
                column_start=begin,
                column_finish=end)
Questions?

More Related Content

What's hot

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
Dvir Volk
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
George Platon
 
Disperse xlator ramon_datalab
Disperse xlator ramon_datalabDisperse xlator ramon_datalab
Disperse xlator ramon_datalab
Gluster.org
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Mike Friedman
 
Lcna example-2012
Lcna example-2012Lcna example-2012
Lcna example-2012
Gluster.org
 
Data file handling in python binary & csv files
Data file handling in python binary & csv filesData file handling in python binary & csv files
Data file handling in python binary & csv files
keeeerty
 
A Brief Introduction to Redis
A Brief Introduction to RedisA Brief Introduction to Redis
A Brief Introduction to Redis
Charles Anderson
 
Work WIth Redis and Perl
Work WIth Redis and PerlWork WIth Redis and Perl
Work WIth Redis and Perl
Brett Estrade
 
What Reika Taught us
What Reika Taught usWhat Reika Taught us
Lcna 2012-tutorial
Lcna 2012-tutorialLcna 2012-tutorial
Lcna 2012-tutorial
Gluster.org
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
DataStax
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov
Yulia Shcherbachova
 
Dexador Rises
Dexador RisesDexador Rises
Dexador Rises
fukamachi
 
Kubernetes
KubernetesKubernetes
Kubernetes
Diego Pacheco
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDB
William Candillon
 
Fluentd and AWS at classmethod
Fluentd and AWS at classmethodFluentd and AWS at classmethod
Fluentd and AWS at classmethod
Treasure Data, Inc.
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
Gluster.org
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
Gluster.org
 
Gluster d2
Gluster d2Gluster d2
Gluster d2
Gluster.org
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a month
Dmitriy Dumanskiy
 

What's hot (20)

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
 
Disperse xlator ramon_datalab
Disperse xlator ramon_datalabDisperse xlator ramon_datalab
Disperse xlator ramon_datalab
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
 
Lcna example-2012
Lcna example-2012Lcna example-2012
Lcna example-2012
 
Data file handling in python binary & csv files
Data file handling in python binary & csv filesData file handling in python binary & csv files
Data file handling in python binary & csv files
 
A Brief Introduction to Redis
A Brief Introduction to RedisA Brief Introduction to Redis
A Brief Introduction to Redis
 
Work WIth Redis and Perl
Work WIth Redis and PerlWork WIth Redis and Perl
Work WIth Redis and Perl
 
What Reika Taught us
What Reika Taught usWhat Reika Taught us
What Reika Taught us
 
Lcna 2012-tutorial
Lcna 2012-tutorialLcna 2012-tutorial
Lcna 2012-tutorial
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov
 
Dexador Rises
Dexador RisesDexador Rises
Dexador Rises
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDB
 
Fluentd and AWS at classmethod
Fluentd and AWS at classmethodFluentd and AWS at classmethod
Fluentd and AWS at classmethod
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Gluster d2
Gluster d2Gluster d2
Gluster d2
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a month
 


Cassandra Explained

  • 1. Cassandra Explained Berlin Buzzwords June 6, 2010 Eric Evans eevans@rackspace.com @jericevans http://blog.sym-link.com
  • 2. Outline ● Background ● Description ● API ● Examples
  • 4. Influential Papers ● BigTable ● Strong consistency ● Sparse map data model ● GFS, Chubby, et al ● Dynamo ● O(1) distributed hash table (DHT) ● BASE (aka eventual consistency) ● Client tunable consistency/availability
  • 5. NoSQL ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 6. NoSQL Big data ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 7. Bigtable / Dynamo
      ● Bigtable: HBase, Hypertable
      ● Dynamo: Riak, Voldemort
      ● Cassandra ??
  • 9. CAP Theorem “Pick Two”
      ● CP: Bigtable, Hypertable, HBase
      ● AP: Dynamo, Voldemort, Cassandra
  • 10. CAP Theorem “Pick Two” ● Consistency ● Availability ● Partition Tolerance
  • 12. Properties ● Symmetric ● No single point of failure ● Linearly scalable ● Ease of administration ● Flexible partitioning, replica placement ● Automated provisioning ● High availability (eventual consistency)
  • 15. Partitioning
      ● Random
          ● 128-bit namespace (MD5)
          ● Good distribution
      ● Order Preserving
          ● Tokens determine namespace
          ● Natural order (lexicographical)
          ● Range / cover queries
      ● Yours ??
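The random partitioner on slide 15 can be pictured with a small sketch: hash the key with MD5 to get a token in a 128-bit namespace, then find the node responsible for that token on a ring. The node names and evenly spaced tokens below are made up for illustration, and Cassandra's own token arithmetic may differ in detail.

```python
import hashlib
from bisect import bisect_left


def md5_token(key: bytes) -> int:
    """Map a key to a token in the 128-bit MD5 namespace."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def owner(ring, token):
    """Return the node whose token is the first one >= `token`, wrapping around."""
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token) % len(ring)
    return ring[i][1]


# Hypothetical 4-node ring with evenly spaced tokens.
RING = sorted((i * 2**128 // 4, "node%d" % i) for i in range(4))

if __name__ == "__main__":
    key = b"org.apache"
    print(key, "->", owner(RING, md5_token(key)))
```

An order-preserving partitioner would skip the hash and use the key itself as the token, which keeps range queries possible at the cost of a less even distribution.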
  • 16. Replica Placement
      ● SimpleSnitch
          ● Default
          ● N-1 successive nodes
      ● RackInferringSnitch
          ● Infers DC/rack from IP
      ● PropertyFileSnitch
          ● Configured w/ a properties file
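The default placement on slide 16 ("N-1 successive nodes") can be sketched as follows: the node that owns the key holds the first replica, and the remaining N-1 replicas go to the next nodes around the ring. The function name and node list are illustrative assumptions, not Cassandra's API.

```python
def replicas(nodes, primary_index, n):
    """Return the primary node plus the next n-1 nodes clockwise on the ring.

    `nodes` is the ring in token order; `n` is the replication factor.
    """
    count = min(n, len(nodes))  # cannot place more replicas than there are nodes
    return [nodes[(primary_index + i) % len(nodes)] for i in range(count)]


if __name__ == "__main__":
    ring = ["a", "b", "c", "d"]
    print(replicas(ring, 3, 3))
```

The rack- and datacenter-aware strategies differ in which successive nodes they pick, not in the ring walk itself.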
  • 20. Choosing Consistency
      Level    Write             Read
      ZERO     Hail Mary         N/A
      ANY      1 replica (HH)    N/A
      ONE      1 replica         1 replica
      QUORUM   (N / 2) + 1       (N / 2) + 1
      ALL      All replicas      All replicas
      R + W > N
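The arithmetic behind slide 20 is small enough to check by hand. The sketch below computes a quorum as (N / 2) + 1 with integer division and tests the R + W > N condition, which says that the replicas read and the replicas written must overlap in at least one node. The function names are illustrative.

```python
def quorum(n: int) -> int:
    """Quorum size for replication factor n: (n / 2) + 1, using integer division."""
    return n // 2 + 1


def read_sees_latest_write(r: int, w: int, n: int) -> bool:
    """True when every read set of size r overlaps every write set of size w."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    print("quorum:", q)
    print("QUORUM/QUORUM overlaps:", read_sees_latest_write(q, q, n))
    print("ONE/ONE overlaps:", read_sees_latest_write(1, 1, n))
```

With N = 3, QUORUM reads and writes (2 + 2 > 3) overlap, while ONE/ONE (1 + 1 > 3 is false) trades that guarantee for lower latency and higher availability.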
  • 24. Overview
      ● Keyspace
          ● Uppermost namespace
          ● Typically one per application
      ● ColumnFamily
          ● Associates records of a similar kind
          ● Record-level Atomicity
          ● Indexed
      ● Column
          ● Basic unit of storage
  • 26. Column
      ● name
          ● byte[]
          ● Queried against (predicates)
          ● Determines sort order
      ● value
          ● byte[]
          ● Opaque to Cassandra
      ● timestamp
          ● long
          ● Conflict resolution (Last Write Wins)
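A minimal sketch of the column triple from slide 26 and of last-write-wins conflict resolution: when two versions of the same column meet, the one with the higher timestamp is kept. The `Column` type and `reconcile` function are illustrative names, and the tie-breaking rule here (keep the first argument) is an arbitrary assumption.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: bytes       # queried against, determines sort order
    value: bytes      # opaque payload
    timestamp: int    # client-supplied, used for conflict resolution


def reconcile(a: Column, b: Column) -> Column:
    """Last write wins: keep the version with the higher timestamp."""
    return a if a.timestamp >= b.timestamp else b


if __name__ == "__main__":
    old = Column(b"city", b"Austin", 1000)
    new = Column(b"city", b"Berlin", 2000)
    print(reconcile(old, new).value)
```

Because timestamps come from clients, this scheme relies on reasonably synchronized client clocks.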
  • 27. Column Comparators ● Bytes ● UTF8 ● TimeUUID ● Long ● LexicalUUID ● Composite (third-party) http://github.com/edanuff/CassandraCompositeType
  • 28. API
  • 29. Low / High
      ● Thrift
          ● Compact binary RPC framework
          ● 12 different languages
      ● Idiomatic
          ● Hector (Java)
          ● Pycassa (Python)
          ● Others...
      http://wiki.apache.org/cassandra/ClientOptions
  • 30. Thrift Read Methods ● get() → Column ● get_slice() → list<Column> ● multiget_slice() → map<key, list<Column>> ● get_count() → int ● multiget_count() → map<key, int> ● get_range_slices()
  • 31. Thrift Write Methods ● insert() ● batch_insert() ● remove() ● batch_mutate()
  • 33. Pycassa – Python Client API ● connect() → Thrift proxy ● cf = ColumnFamily(proxy, ksp, cfname) ● cf.insert() → long ● cf.get() → dict ● cf.get_range() → dict http://github.com/vomjom/pycassa
  • 34. Address Book – Setup
      <!-- conf/storage-conf.xml -->
      <Keyspace Name="AddressBook">
        <ColumnFamily Name="Addresses"
                      CompareWith="BytesType"
                      RowsCached="10000"
                      KeysCached="50%"
                      Comment="Too lame" />
      </Keyspace>
  • 35. Adding an entry
      key = uuid()
      columns = {
          'first': 'Eric',
          'last': 'Evans',
          'email': 'eevans@rackspace.com',
          'city': 'Austin',
          'zip': 78250
      }
      addresses.insert(key, columns)
  • 36. Fetching a record
      # fetching the record by key
      record = addresses.get(key)
      # accessing columns by name
      zipcode = record['zip']
      city = record['city']
  • 37. Indexing
      <!-- conf/storage-conf.xml -->
      <Keyspace Name="AddressBook">
        <ColumnFamily Name="Addresses"
                      CompareWith="BytesType"
                      RowsCached="10000"
                      KeysCached="50%"
                      Comment="Too lame" />
        <ColumnFamily Name="ByCity"
                      CompareWith="UTF8Type" />
      </Keyspace>
  • 38. Updating the index
      key = uuid()
      columns = {
          'first': 'Eric',
          'last': 'Evans',
          'email': 'eevans@rackspace.com',
          'city': 'Austin',
          'zip': 78250
      }
      addresses.insert(key, columns)
      byCity.insert('Austin', {key: ''})
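The index pattern on slides 37 and 38 keeps a second column family whose row key is the city and whose column names are the keys of matching address records. The sketch below models both column families as plain Python dictionaries so that it runs without a cluster. `add_entry` and `find_by_city` are hypothetical helpers, not Pycassa calls.

```python
import uuid

# In-memory stand-ins for the two column families (assumption: no real cluster).
addresses = {}   # row key -> {column name: value}
by_city = {}     # city    -> {address row key: ''}


def add_entry(columns):
    """Insert a record and update the ByCity index in the same step."""
    key = uuid.uuid4().hex
    addresses[key] = dict(columns)
    by_city.setdefault(columns["city"], {})[key] = ""
    return key


def find_by_city(city):
    """Read the index row, then fetch each referenced record."""
    return [addresses[k] for k in sorted(by_city.get(city, {}))]


if __name__ == "__main__":
    add_entry({"first": "Eric", "last": "Evans", "city": "Austin"})
    print(find_by_city("Austin"))
```

The lookup is two reads: one for the index row and one per matching record. The application is responsible for keeping the index in step with the data.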
  • 39. Timeseries
      <!-- conf/storage-conf.xml -->
      <Keyspace Name="Sites">
        <ColumnFamily Name="Stats" CompareWith="LongType"/>
      </Keyspace>
  • 40. Logging values
      from struct import pack
      from time import time

      # time in microseconds as a long, binary, network-order
      ts = pack('>q', int(time() * 1e6))
      stats.insert('org.apache', {ts: value})
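A LongType column name is an 8-byte integer in network (big-endian) order. In Python's `struct` module the `'>q'` format produces exactly that, whereas `'>d'` would encode an IEEE 754 double. The sketch below encodes a timestamp in microseconds and shows that, for non-negative values, the packed bytes sort in the same order as the numbers. `encode_ts` and `decode_ts` are illustrative helper names.

```python
from struct import pack, unpack


def encode_ts(seconds: float) -> bytes:
    """Encode a time in seconds as an 8-byte big-endian microsecond count."""
    return pack(">q", int(seconds * 1e6))


def decode_ts(raw: bytes) -> int:
    """Decode the 8-byte column name back into microseconds."""
    return unpack(">q", raw)[0]


if __name__ == "__main__":
    names = [encode_ts(t) for t in (3.0, 1.5, 2.0)]
    print([decode_ts(n) for n in sorted(names)])
```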
  • 41. Slicing
      # s: start of the window, in seconds since the epoch
      begin = pack('>q', int(s * 1e6))
      stats.get_range('org.apache', column_start=begin)
      end = pack('>q', int((s + 86400) * 1e6))
      stats.get_range(start='org.apache', finish='org.debian',
                      column_start=begin, column_finish=end)
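What a column slice does can be sketched without a cluster: because the columns in a row are kept sorted by name, a slice between two names is a contiguous run that two binary searches can locate. The in-memory list below stands in for one row of the Stats column family. `column_slice` is an illustrative helper that treats both bounds as inclusive, not a Pycassa method.

```python
from bisect import bisect_left, bisect_right


def column_slice(columns, start, finish):
    """Return the (name, value) pairs with start <= name <= finish.

    `columns` must already be sorted by name, as the comparator guarantees.
    """
    names = [name for name, _ in columns]
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return columns[lo:hi]


if __name__ == "__main__":
    day = 86400 * 10**6  # one day in microseconds
    row = [(0, "a"), (day // 2, "b"), (day, "c"), (2 * day, "d")]
    print(column_slice(row, 0, day))
```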