Cassandra
Under the guidance of T. B. Patil Sir
Roll No. - 35
PRN - 1914110227
Name - Aditya Bhateja
Table of Contents
● NoSQL Database
● Apache Cassandra
● Architecture
● Data Model
● CQL
● Future Scope
● References
01
NoSQL Database
● A NoSQL database (sometimes called Not Only SQL)
is a database that provides a mechanism to store and
retrieve data other than the tabular relations used in
relational databases. These databases are typically schema-free,
support easy replication, are eventually consistent, and
can handle huge amounts of data.
● The primary objectives of a NoSQL database are
1. simplicity of design,
2. horizontal scaling, and
3. finer control over availability.
● NoSQL databases use different data
structures compared to relational
databases, which makes some operations
faster in NoSQL than in SQL. The suitability of a
given NoSQL database depends on the
problem it must solve.
Besides Cassandra, we have the
following NoSQL databases that are
quite popular −
1. Apache HBase
2. MongoDB
02
Apache Cassandra
Apache Cassandra is an open-source, distributed, and decentralized
storage system (database) for managing very large amounts of structured data
spread out across the world. It provides a highly available service with no single
point of failure.
Features of Cassandra
03
Architecture
The design goal of Cassandra is
to handle big data workloads
across multiple nodes without
any single point of failure.
Cassandra has a peer-to-peer
distributed architecture across its
nodes, and data is distributed
among all the nodes in a
cluster.
• All the nodes in a cluster play
the same role. Each node is
independent and at the same
time interconnected to other
nodes.
• Each node in a cluster can
accept read and write requests,
regardless of where the data is
actually located in the cluster.
• When a node goes down,
read/write requests can be
served from other nodes in the
network.
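The points above can be pictured with a minimal Python sketch. It is a toy model, not Cassandra's implementation: the `Node` class, the key `"user:1"`, and the choice of which nodes hold the replicas are all assumptions made for illustration. It shows only that any peer can act as coordinator and that a request still succeeds while one replica is down.

```python
class Node:
    """One peer in the cluster. Every node plays the same role."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}  # the rows this node stores as a replica

    def coordinate_write(self, key, value, replicas):
        """Accept a write on any node and forward it to the live replicas."""
        written = 0
        for replica in replicas:
            if replica.up:
                replica.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no live replica for " + key)
        return written

    def coordinate_read(self, key, replicas):
        """Accept a read on any node and answer from the first live replica."""
        for replica in replicas:
            if replica.up and key in replica.data:
                return replica.data[key]
        raise RuntimeError("no live replica holds " + key)


a, b, c = Node("a"), Node("b"), Node("c")
replicas = [b, c]  # assumption: "user:1" is replicated on b and c, not on a

a.coordinate_write("user:1", "alice", replicas)  # a coordinates but stores nothing
b.up = False                                     # one replica goes down
value = a.coordinate_read("user:1", replicas)    # still served, from c
```

Real clusters also track which writes a down node missed so that it can catch up later; that part is left out here.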
Data Replication in Cassandra
In Cassandra, one or more of the
nodes in a cluster act as replicas
for a given piece of data. If it is
detected that some of the nodes
responded with an out-of-date
value, Cassandra returns the
most recent value to the client.
After returning the most recent
value, Cassandra performs a read
repair in the background to update
the stale values.
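A rough Python sketch of the read-repair idea follows. It assumes each replica is a plain dictionary mapping a key to a `(timestamp, value)` pair and that the newest timestamp wins. The function name and the data are hypothetical, and the repair runs inline here rather than in the background.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and overwrite stale replicas.

    replicas: list of dicts mapping key -> (timestamp, value).
    Returns (value, number_of_replicas_repaired).
    """
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        raise KeyError(key)

    # The version with the highest write timestamp is treated as current.
    newest = max(versions, key=lambda version: version[0])

    # Any replica that is missing the key or holds another version is stale.
    stale = [replica for replica in replicas if replica.get(key) != newest]
    for replica in stale:
        replica[key] = newest

    return newest[1], len(stale)


replicas = [
    {"city": (2, "Pune")},    # up to date
    {"city": (1, "Mumbai")},  # out of date
    {},                       # never received the write
]
value, repaired = read_with_repair(replicas, "city")
```

After the call, all three dictionaries hold the version with timestamp 2.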
Components of Cassandra
The key components of Cassandra are as follows −
i. Node − It is the place where data is stored.
ii. Data center − It is a collection of related nodes.
iii. Cluster − A cluster is a component that contains one or more data
centers.
iv. Commit log − The commit log is a crash-recovery mechanism in
Cassandra. Every write operation is written to the commit log.
v. Mem-table − A mem-table is a memory-resident data structure.
After the commit log, the data is written to the mem-table.
Sometimes, for a single column family, there will be multiple
mem-tables.
vi. SSTable − It is a disk file to which the data is flushed from the
mem-table when its contents reach a threshold value.
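The last three components form the write path, which can be sketched in Python as below. This is a simplified model under stated assumptions: the class name `WritePath`, the threshold counted in entries, and the in-memory list standing in for disk files are all invented for illustration.

```python
class WritePath:
    """Toy model: commit log -> mem-table -> SSTable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only record kept for crash recovery
        self.memtable = {}     # memory-resident structure holding recent writes
        self.sstables = []     # stand-in for immutable disk files, newest last
        self.flush_threshold = flush_threshold  # assumption: counted in entries

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the write first
        self.memtable[key] = value            # 2. then apply it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. flush once the threshold is hit

    def flush(self):
        # An SSTable keeps its entries sorted by key.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need the log

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            if key in table:
                return table[key]
        return None


wp = WritePath(flush_threshold=2)
wp.write("a", 1)
wp.write("b", 2)  # reaches the threshold, so the mem-table is flushed
wp.write("a", 3)  # a newer value for "a" now lives in the fresh mem-table
```

Reads check the mem-table before the SSTables so that the most recent write is the one returned.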
04
Data Model
Cluster
A Cassandra database is distributed
over several machines that
operate together. The outermost
container is known as the Cluster.
For failure handling, every node
contains a replica, and in case of
a failure, the replica takes charge.
Cassandra arranges the nodes of a
cluster in a ring and
assigns data to them.
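How a ring can assign data to nodes is sketched below in Python. The sketch assumes each node holds one token and owns the keys whose hashed token falls at or before its own, wrapping around at the end of the ring. The node names, the token values, and the use of MD5 as the hash are illustrative assumptions, not a description of a particular Cassandra configuration.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # assumption: a small token space to keep numbers readable


def token(key):
    """Hash a partition key to a position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class Ring:
    def __init__(self, node_tokens):
        # node_tokens: hypothetical mapping of node name -> token it owns
        self.entries = sorted((t, name) for name, t in node_tokens.items())
        self.tokens = [t for t, _ in self.entries]

    def owner_of_token(self, t):
        # First node whose token is >= t; wrap to the first node past the end.
        i = bisect.bisect_left(self.tokens, t)
        return self.entries[i % len(self.entries)][1]

    def owner(self, key):
        return self.owner_of_token(token(key))


ring = Ring({"node-a": 100, "node-b": 200, "node-c": 300})
first = ring.owner_of_token(150)    # falls between 100 and 200
wrapped = ring.owner_of_token(301)  # past the last token, wraps around
```

Because placement depends only on the hash of the key, any node can work out where a piece of data lives without asking a central coordinator.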
Key Space
Key space is the outermost container for data in Cassandra. The basic attributes
of a Key space in Cassandra are −
1. Replication factor − It is the number of machines in the cluster that will
receive copies of the same data.
2. Replica placement strategy − It is the strategy used to place replicas
in the ring. The strategies are simple strategy (rack-unaware
strategy), old network topology strategy (rack-aware strategy), and network
topology strategy (datacenter-shard strategy).
3. Column families − Key space is a container for a list of one or more column
families. A column family, in turn, is a container of a collection of rows. Each
row contains ordered columns. Column families represent the structure of
your data. Each key space has at least one and often many column families.
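In CQL, the replication factor and the placement strategy are set when the keyspace is created. The Python helpers below only build the statement text, so they run without a cluster. The helper names, the keyspace name `student`, and the data center names `dc1` and `dc2` are assumptions for illustration.

```python
def simple_keyspace(name, replication_factor):
    """CQL for a keyspace that uses SimpleStrategy with one replication factor."""
    return (
        "CREATE KEYSPACE IF NOT EXISTS {} WITH replication = "
        "{{'class': 'SimpleStrategy', 'replication_factor': {}}};"
    ).format(name, int(replication_factor))


def network_topology_keyspace(name, factors_by_datacenter):
    """CQL for NetworkTopologyStrategy with one replication factor per data center."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    for datacenter, factor in sorted(factors_by_datacenter.items()):
        options.append("'{}': {}".format(datacenter, int(factor)))
    return "CREATE KEYSPACE IF NOT EXISTS {} WITH replication = {{{}}};".format(
        name, ", ".join(options)
    )


single_dc = simple_keyspace("student", 3)
multi_dc = network_topology_keyspace("student", {"dc1": 3, "dc2": 2})
```

The resulting strings could be pasted into cqlsh. The data center names passed to the second helper would have to match the names the cluster itself reports.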
Column Family
A column family is a container for an ordered
collection of rows. Each row, in turn, is an
ordered collection of columns. The following
table lists the points that differentiate a
column family from a table of relational
databases.
Column
Super Column
05
CQL
Cassandra provides the Cassandra Query
Language shell (cqlsh), a prompt that allows users to
communicate with it. Using this shell, you
can execute Cassandra Query Language (CQL) statements.
Using cqlsh, you can
• define a schema,
• insert data, and
• execute a query.
cqlsh − As discussed above, this command is
used to start the cqlsh prompt. It also
supports a few more options. The
following table explains all the options
of cqlsh and their usage.
Creating a Table
You can create a table using the
command CREATE TABLE. Given below is
the syntax for creating a table.
Syntax
CREATE (TABLE | COLUMNFAMILY) <tablename>
(<column-definition>, <column-definition>)
(WITH <option> AND <option>)
Altering a Table
You can alter a table using the
command ALTER TABLE. Given below is
the syntax for altering a table.
Syntax
ALTER (TABLE | COLUMNFAMILY) <tablename>
<instruction>
Given below is an example to create a table in Cassandra using cqlsh. Here we are
• Using the keyspace Student
• Creating a table named emp
It will have details such as employee name, id, city, salary, and phone number.
Employee id is the primary key.
Example
cqlsh> USE Student;
cqlsh:Student> CREATE TABLE emp(
   emp_id int PRIMARY KEY,
   emp_name text,
   emp_city text,
   emp_sal varint,
   emp_phone varint
   );
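Once the emp table exists, inserting data and running a query follow the same pattern. The Python sketch below renders the two statements as text that could be typed at the cqlsh prompt. The `render` helper and the sample employee values are hypothetical, and the string handling covers only numbers and text.

```python
INSERT_EMP = (
    "INSERT INTO emp (emp_id, emp_name, emp_city, emp_sal, emp_phone) "
    "VALUES (%s, %s, %s, %s, %s)"
)
SELECT_EMP = "SELECT emp_name, emp_city FROM emp WHERE emp_id = %s"


def literal(value):
    """Render a Python value as a CQL literal (text and integers only)."""
    if isinstance(value, str):
        # CQL text literals use single quotes; a quote inside is doubled.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def render(statement, params):
    """Fill the %s placeholders and terminate the statement for cqlsh."""
    return statement % tuple(literal(p) for p in params) + ";"


# Sample row: all values are made up for the example.
insert_text = render(INSERT_EMP, (1, "ram", "Hyderabad", 50000, 9848022338))
select_text = render(SELECT_EMP, (1,))
```

The SELECT filters on emp_id because it is the primary key of the table, which is the column Cassandra can look up directly.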
06
Future Scope
With the modern world being marked
by a data boom, having a robust database
management system is a necessity for
businesses. Apache Cassandra, a NoSQL
database, is an excellent choice for
use cases across many different
applications, such as business and
e-commerce apps. It can be scaled linearly,
provides high performance even
with variable workloads, and is highly
available. In addition, Cassandra supports
replication across multiple data centers,
an area where it is often regarded as
best-in-class, which lowers latency
for users. Operators value it because
it can survive regional outages.
Thank You
