CQL: SQL In Cassandra

CQL: SQL for Cassandra
Cassandra NYC
December 6, 2011

Eric Evans
eric@acunu.com
@jericevans, @acunu

● Overview, history, motivation
● Performance characteristics
● Coming soon (?)
● Drivers status

What?
● Cassandra Query Language
● aka CQL
● aka /ˈsēkwəl/
● Exactly like SQL (except where it's not)
● Introduced in Cassandra 0.8.0
● Ready for production use

SQL? Almost.

-- Inserts or updates
INSERT INTO Standard1 (KEY, col0, col1)
VALUES (key, value0, value1)
vs.
-- Inserts or updates
UPDATE Standard1
SET col0=value0, col1=value1 WHERE KEY=key

SQL? Almost.
-- Get columns for a row
SELECT col0,col1 FROM Standard1 WHERE KEY=key

-- Range of columns for a row
SELECT col0..colN
FROM Standard1 WHERE KEY=key

-- First 10 results from a range of columns
SELECT FIRST 10 col0..colN
FROM Standard1 WHERE KEY=key

-- Invert the sorting of results
SELECT REVERSED col0..colN
FROM Standard1 WHERE KEY=key

(Un)ease of use
// Insert a single column through the Thrift RPC API
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// Row key -> column family name -> list of mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map =
    new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);

CQL
INSERT INTO Standard1 (KEY, col0)
VALUES (key, value0)

Why? How about...
● Better stability guarantees
● Easier to use (you already know it)
● Better code readability / maintainability

Why? How about...
● Irritates the NoSQL purists
● (Still) irritates the SQL purists

Hotspot
Quoted string literals

UPDATE table SET 'name' = 'value'
WHERE KEY = 'somekey'

● Anything that appears between quotes
● Inlined Java constructs a StringBuilder to store the contents (slow, not fast)
● Incurred multiple times per statement

Hotspot
Marshalling

UPDATE table SET 'clear' = 'abffaadd10'
WHERE KEY = 'acfe12ff'

Hotspot
Marshalling

ascii blob

● Terms are marshalled to bytes by type
● String.getBytes is slow (AsciiType)
● Hex conversion is faster (BytesType)
● Incurred multiple times per statement
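
To make the two marshalling paths above concrete, here is a minimal sketch of how a quoted term might be turned into bytes depending on its type. The class and method names (TermMarshalSketch, fromAscii, fromHex) are hypothetical and are not Cassandra's actual AsciiType or BytesType code; the sketch only illustrates the difference between a charset encode and a hex decode.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Cassandra implementation.
final class TermMarshalSketch {

    // AsciiType-style path: the term text itself is the value,
    // so it goes through a charset encode (String.getBytes).
    static ByteBuffer fromAscii(String term) {
        return ByteBuffer.wrap(term.getBytes(StandardCharsets.US_ASCII));
    }

    // BytesType-style path: the term is a hex string,
    // so each pair of hex digits becomes one byte.
    static ByteBuffer fromHex(String term) {
        if (term.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits: " + term);
        }
        byte[] out = new byte[term.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(term.charAt(2 * i), 16);
            int lo = Character.digit(term.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + term);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return ByteBuffer.wrap(out);
    }

    public static void main(String[] args) {
        // Terms taken from the UPDATE example on the Marshalling slide.
        System.out.println(fromAscii("clear").remaining() + " bytes from the ascii term");
        System.out.println(fromHex("abffaadd10").remaining() + " bytes from the blob term");
    }
}
```

Either conversion runs once per term, which is why the cost is incurred several times per statement.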

Hotspot
Copying / Conversion

execute_cql_query(ByteBuffer query, enum compression)

● Query is binary to support compression (is it worth it?)
● And don't forget the String → ByteBuffer conversion on the client side
● Incurred only once per statement!
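
The conversion mentioned above can be sketched as follows. This is an assumption-laden illustration, not driver code: the class and method names (QueryBufferSketch, toBuffer, toQueryString) are hypothetical, UTF-8 is assumed as the encoding, and no compression is applied.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the client/server string conversions.
final class QueryBufferSketch {

    // Client side: the query text is encoded into the binary argument.
    static ByteBuffer toBuffer(String query) {
        return ByteBuffer.wrap(query.getBytes(StandardCharsets.UTF_8));
    }

    // Server side: the binary argument is decoded back into text before parsing.
    // duplicate() leaves the caller's buffer position untouched.
    static String toQueryString(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    public static void main(String[] args) {
        String query = "UPDATE table SET 'name' = 'value' WHERE KEY = 'somekey'";
        ByteBuffer wire = toBuffer(query);
        System.out.println(wire.remaining() + " bytes on the wire");
        System.out.println(toQueryString(wire));
    }
}
```

Unlike the per-term costs, this encode and decode happens once for the whole statement.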

Achtung!
(These tests weren't perfect)

● Unneeded String → ByteBuffer → String
● No query compression implemented
● Co-located client and server

Insert 20M rows, 5 columns

       Avg rate            Avg latency
RPC    20,953/s            1.6ms
CQL    19,176/s (-8%)      1.7ms (+9%)

Insert 10M rows, 5 cols (indexed)

       Avg rate            Avg latency
RPC    9,850/s             5.3ms
CQL    9,290/s (-6%)       5.5ms (+4%)

Counts, 10M rows, 5 cols

       Avg rate            Avg latency
RPC    18,052/s            1.7ms
CQL    17,635/s (-2%)      1.7ms

Reading 20M rows, 5 cols

       Avg rate            Avg latency
RPC    22,726/s            2.0ms
CQL    20,272/s (-11%)     2.3ms (+10%)
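
The throughput percentages in these results can be reproduced as a relative change against the RPC baseline. The sketch below assumes that is how they were derived and reads the RPC read rate as 22,726/s; the class and method names are hypothetical. The latency percentages are not recomputed here, since the latencies shown are rounded to one decimal place.

```java
// Hypothetical helper for checking the throughput deltas in the tables above.
final class RelativeChange {

    // Percent change of a measured value relative to a baseline.
    static double percentChange(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Average rates in operations per second: { RPC, CQL }.
        String[] labels = { "insert", "insert (indexed)", "counts", "read" };
        double[][] rates = {
            { 20953, 19176 },
            { 9850, 9290 },
            { 18052, 17635 },
            { 22726, 20272 },
        };
        for (int i = 0; i < labels.length; i++) {
            long pct = Math.round(percentChange(rates[i][0], rates[i][1]));
            System.out.println(labels[i] + ": " + pct + "%");
        }
    }
}
```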

In Summary
Don't step over dollars to pick up pennies!

Roadmap
● Prepared statements (CASSANDRA-2475)
● Compound columns (CASSANDRA-2474)
● Custom transport / protocol (CASSANDRA-2478)
● Performance testing (CASSANDRA-2268)
● Schema introspection (CASSANDRA-2477)
● Multiget support (CASSANDRA-3069)

Drivers
● Hosted on Apache Extras (Google Code)
● Tagged cassandra and cql
● Licensed using Apache License 2.0
● Conforming to a standard for database
connectivity (if applicable)
● Coming soon: automated testing and
acceptance criteria

Drivers

Driver              Platform    Status
cassandra-jdbc      Java        Good
cassandra-dbapi2    Python      Good
cassandra-ruby      Ruby        New
cassandra-pdo       PHP         New
cassandra-node      Node.js     Good

http://code.google.com/a/apache-extras.org/hosting/search?q=label%3aCassandra
