Planning your queries for maximum performance
Shlomi Livne
VP R&D, ScyllaDB
Shlomi Livne
Shlomi is VP of R&D at ScyllaDB. Prior to ScyllaDB
he led the research and development team at
Convergin, which was acquired by Oracle.
How Scylla executes
your queries
Cluster View
[Diagram: a client connected to a cluster of eight nodes (numbered 1 to 8), with the Coordinator and Replica nodes labeled.]

Coordinator Tasks
1. Prepare the statement
2. Single partition queries
   a. Select replicas (using cache heat info) and send query / digest requests, each requesting a page of results
   b. Compare the digests; if there is a mismatch:
      i. Request data from the selected replicas
      ii. Repair the data on the replicas
   c. Return the result
3. Partition scan queries
   a. Split the request up based on the ring
   b. Send requests for data using ranges, each requesting a page of results
   c. Merge the results
   d. Return the result
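The single partition flow above (send data and digest requests, compare the digests, repair on a mismatch) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the `Replica` class, the MD5 digest, and the newest-timestamp-wins reconciliation are hypothetical choices for the sketch, not a description of Scylla's internals.

```python
import hashlib


class Replica:
    """Toy replica: maps column name -> (value, write_timestamp)."""

    def __init__(self, cells):
        self.cells = dict(cells)

    def read_data(self):
        return dict(self.cells)

    def read_digest(self):
        # Digest of a canonical (sorted) representation of the row.
        canonical = repr(sorted(self.cells.items())).encode()
        return hashlib.md5(canonical).hexdigest()

    def repair(self, cells):
        self.cells = dict(cells)


def reconcile(versions):
    """Keep, per column, the cell with the highest write timestamp."""
    merged = {}
    for cells in versions:
        for column, (value, ts) in cells.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged


def coordinator_read(replicas):
    # One data request; the other replicas only need to return a digest.
    data = replicas[0].read_data()
    expected = replicas[0].read_digest()
    digests = [r.read_digest() for r in replicas[1:]]
    if all(d == expected for d in digests):
        return data
    # Mismatch: request full data from every selected replica,
    # reconcile it, and write the reconciled row back (repair).
    merged = reconcile(r.read_data() for r in replicas)
    for r in replicas:
        r.repair(merged)
    return merged


fresh = Replica({"A": (8, 100), "B": (7, 120)})
stale = Replica({"A": (8, 100), "B": (5, 90)})
result = coordinator_read([stale, fresh])
print(result)
```

After the call, both toy replicas hold the same row, so a second read takes the digest-match path and returns without a repair.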
Replica Tasks
1. Receive a data/digest/range request
2. Split the request up according to shards
3. On each shard:
   a. Execute the request, merging data from memtables + cache/sstables
   b. For a data request:
      i. prepare a result and return it (compute digest if RF > 1)
   c. For a digest request:
      i. compute the digest and return it
   d. For a partition scan request:
      i. return the partition range data (do not prepare a result)
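Step 3a, merging data from several sources on a shard, can be illustrated with a small sketch. It assumes each source holds a partial version of the row as column -> (value, timestamp) and that the newest timestamp wins per column; the function name and the timestamps are made up for the example, while the values follow the P8:R1 example used in the read diagrams.

```python
def merge_row(*sources):
    """Merge partial versions of one row; the newest timestamp wins per column."""
    merged = {}
    for cells in sources:
        for column, (value, ts) in cells.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged


# P8:R1 as spread over two sstables and the memtable (timestamps are hypothetical).
sstable_1 = {"A": (8, 10)}
sstable_2 = {"B": (7, 20)}
memtable = {"C": (3, 30)}

row = merge_row(sstable_1, sstable_2, memtable)
result = {column: value for column, (value, _) in sorted(row.items())}
print(result)  # {'A': 8, 'B': 7, 'C': 3}
```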
Replica Shard Read Diagram
[Diagram: Read Req -> Row Cache, Memtable, and four sstables (each with Bloom Filter, Summary, Index, Compression, and Data components) -> Result]
[Diagram: Read: P8:R1. The Row Cache holds P8:R1:A=8,B=7 and the Memtable holds P8:R1:C=3. Two sstables hold P8:R1:A=8 and P8:R1:B=7 in their Data components; a third sstable shows P8 only at its Bloom Filter.]

[Diagram: Read: P8:R1 with the Row Cache holding A=8,B=7 and the Memtable holding C=3. Result: P8:R1 A=8,B=7,C=3.]
[Diagram: Read: P8:R1 with an empty Row Cache. The Memtable holds P8:R1:C=3; the sstable Data components hold P8:R1:A=8 and P8:R1:B=7, and the read walks the Bloom Filter, Summary, Index, Compression, and Data components of the sstables.]

[Diagram: the data read from the sstables, P8:R1:A=8,B=7, is placed in the Row Cache.]
[Diagram: Result: P8:R1 A=8,B=7,C=3, combining the Row Cache entry (A=8,B=7) with the Memtable entry (C=3).]

Row Cache
▪ Cache stores complete row data
▪ In addition to storing existing rows, the cache stores information
about the completeness of clustering ranges (continuity), so a read that
falls between cached rows does not have to be a cache miss.
▪ Cache is populated on:
o Queries
o Memtable flush:
• Data is merged - to keep it up to date with new sstables written.
• Data is inserted - in case there is no data for that partition on disk.
Selecting Sstables
▪ Given a partition key (pk), the current set of sstables is reduced so that
sstable X will be included iff:
o min_partition_key(sstable X) <= pk <= max_partition_key(sstable X)
o bloom_filter(sstable X, pk) = True
▪ Scylla 2.0: SStables will be read in parallel
▪ Scylla 2.1:
o The reduced set of sstables is searched newest to oldest until a result can be
constructed and we can prove that older sstables are not relevant.
o SStables read parallelism will grow starting from a single sstable
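The selection rule above can be sketched as a simple filter. This is a minimal illustration with assumptions: the `SSTable` class is hypothetical, integer keys stand in for partition keys in ring order, the key-range bounds are treated as inclusive, and an exact membership set stands in for a real Bloom filter (which could also return false positives).

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    name: str
    min_key: int
    max_key: int
    keys: set = field(default_factory=set)

    def bloom_filter(self, pk):
        # Stand-in for a real Bloom filter: exact membership, so it never
        # returns a false positive (a real filter can).
        return pk in self.keys


def select_sstables(sstables, pk):
    """Keep sstables whose key range covers pk and whose filter may contain it."""
    return [
        s for s in sstables
        if s.min_key <= pk <= s.max_key and s.bloom_filter(pk)
    ]


sstables = [
    SSTable("sst-1", 1, 9, {1, 8, 9}),
    SSTable("sst-2", 5, 20, {5, 8, 20}),
    SSTable("sst-3", 2, 12, {2, 12}),    # range covers 8, filter rejects it
    SSTable("sst-4", 10, 30, {10, 30}),  # range excludes 8
]
selected = [s.name for s in select_sstables(sstables, 8)]
print(selected)  # ['sst-1', 'sst-2']
```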
7 Rules To
Optimize your Queries

Rule #1 - Use Prepared statements
▪ With an unprepared statement, the coordinator needs to pre-process the query on every execution:
o A lot of repetitive work that could be done only once
o Adds overhead to the execution of each query, which directly translates to lower throughput and higher latency
▪ The driver is not able to send the request to a coordinator node that
holds the data (an additional hop)
▪ tip: compare scylla_query_processor_statements_prepared to the
number of executed scylla_transport_requests_served
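One way to act on the tip is to read both counters from a node's Prometheus metrics output and compare them. The sketch below only does the arithmetic on text in the Prometheus exposition format; the sample values and the `shard` label are made up, the parser assumes no trailing timestamps, and how to fetch the text from a node is left out. How to read the ratio is a judgment call: a value near one (roughly one prepare per request) would suggest statements are prepared again for every execution rather than reused.

```python
from collections import defaultdict

PREPARED = "scylla_query_processor_statements_prepared"
SERVED = "scylla_transport_requests_served"


def sum_metrics(exposition_text):
    """Sum each metric over all of its label sets (for example, over shards)."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        head, value = line.rsplit(None, 1)  # assumes no trailing timestamp
        name = head.split("{", 1)[0].strip()
        totals[name] += float(value)
    return totals


def prepared_to_served_ratio(exposition_text):
    totals = sum_metrics(exposition_text)
    served = totals[SERVED]
    return totals[PREPARED] / served if served else None


# Made-up sample values, for illustration only.
SAMPLE = """\
# sample metrics text
scylla_query_processor_statements_prepared{shard="0"} 3
scylla_query_processor_statements_prepared{shard="1"} 1
scylla_transport_requests_served{shard="0"} 600
scylla_transport_requests_served{shard="1"} 400
"""

ratio = prepared_to_served_ratio(SAMPLE)
print(ratio)
```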
Sample: single Scylla server, using c-s
Results                      Unprepared   Prepared
op rate                      13037        18704
partition rate               13037        18704
row rate                     13037        18704
latency mean                 1.5          1.1
latency median               1.3          1
latency 95th percentile      2.9          1.6
latency 99th percentile      6.2          2.5
latency 99.9th percentile    12.2         7.1
latency max                  31.1         16.9
Total partitions             100000       100000
Rule #2 - Use Paging
▪ Paging disabled: the coordinator is forced to prepare a single
result that holds all the data and send it back:
o If the coordinator is not able to return a response (that is, to allocate enough
memory for the single result), an error is returned to the client
o tip: compare scylla_transport_unpaged_queries to scylla_cql_reads to
detect whether many of your read queries are unpaged
Rule #3 - Use correct Page Size
▪ Drivers enable paging by default, with a default page_size of 5000
rows (java, python, gocql)
▪ CQL requires returning at least one result and allows returning fewer
results than the page size
▪ Scylla utilizes this:
o Scylla caps a page at ~1MB of memory, so it will return fewer rows than
requested when rows are large
o Do not use the number of returned results as an indication of whether
there are more results
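The last point can be illustrated with a toy paging loop. This is a sketch under assumptions: `fetch_page` is a hypothetical stand-in for a server that returns at most `page_size` rows but also stops at a byte cap (1,000,000 bytes here, standing in for the ~1MB cap), and it signals the end of the results explicitly rather than through the row count. Real drivers expose an equivalent indicator (for example, `has_more_pages` on the Python driver's result set).

```python
PAGE_BYTE_CAP = 1_000_000  # assumed stand-in for the ~1MB cap


def fetch_page(rows, start, page_size):
    """Toy server: returns (page, next_start); next_start is None when done."""
    page, used = [], 0
    i = start
    while i < len(rows) and len(page) < page_size:
        size = len(rows[i])
        if page and used + size > PAGE_BYTE_CAP:
            break  # cap reached, but at least one row is always returned
        page.append(rows[i])
        used += size
        i += 1
    return page, (i if i < len(rows) else None)


def fetch_all(rows, page_size):
    """Client loop: stop on the explicit end marker, not on a short page."""
    out, pages, state = [], 0, 0
    while state is not None:
        page, state = fetch_page(rows, state, page_size)
        out.extend(page)
        pages += 1
    return out, pages


rows = [b"x" * 300_000 for _ in range(10)]  # ten large rows
first_page, next_state = fetch_page(rows, 0, 5000)
fetched, pages = fetch_all(rows, page_size=5000)
print(len(first_page), next_state, len(fetched), pages)
```

Here the first page holds only 3 rows even though 5000 were requested and more rows remain, so a client that stopped when a page came back short would silently lose the other 7 rows.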
[Figure: "Has more pages"]
Scylla 2.0: does the default page_size make sense?
Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes,
split into rows, using different page sizes

page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
10          timed out                2104.492031               331.087871                173.932543
50          5679.087615              737.148927                202.113023                168.165375
100         4034.920447              573.046783                186.384383                168.951807
500         2663.383039              415.760383                183.894015                173.015039
1000        2451.570687              395.313151                182.976511                168.427519
5000        2285.895679              400.031743                184.942591                169.345023
10000       2281.701375              399.769599                183.369727                169.738239
50000       2273.312767              396.099583                183.107583                170.000383
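A rough way to read the Scylla numbers is to count round trips: each page costs one request, and a page holds at most `page_size` rows or about 1MB, whichever is hit first. The sketch below is a back-of-the-envelope model only; the exact cap value (1,000,000 bytes) and the function name are assumptions, and it ignores every other cost.

```python
import math

PAGE_BYTE_CAP = 1_000_000  # assumed value for the "~1MB" cap


def round_trips(total_rows, row_bytes, page_size):
    """Estimated number of pages needed to fetch the whole partition."""
    rows_per_capped_page = max(1, PAGE_BYTE_CAP // row_bytes)
    effective_page = min(page_size, rows_per_capped_page)
    return math.ceil(total_rows / effective_page)


for page_size in (10, 5000, 50000):
    small = round_trips(10**6, 100, page_size)   # 10^6 rows of 100 bytes
    large = round_trips(1000, 10**5, page_size)  # 1000 rows of 10^5 bytes
    print(page_size, small, large)
```

Under this model the small-row case drops from 100,000 round trips at page size 10 to 200 at page size 5000, while the large-row case stays at 100 round trips for every page size, which is in line with the flat last column of the table.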
C* 3.11.0: does the default page_size make sense?
Test: duration in milliseconds of fetching a single wide partition of 10^8 bytes,
split into rows, using different page sizes

page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
10          timed out                4030.726143               903.872511                364.380159
50          12876.51328              1535.115263               419.430399                300.941311
100         8992.587775              1202.716671               405.274623                316.407807
500         6400.507903              907.542527                354.680831                348.651519
1000        6077.546495              874.512383                360.972287                370.409471
5000        5620.367359              791.674879                422.051839                358.612991
10000       5490.343935              793.772031                389.021695                360.447999
50000       5662.310399              913.833983                383.516671                355.467263
tip: consider changing the page size if your rows are large
Rule #4 - Beware of Multi Partition CQL IN queries
▪ Multi-partition CQL IN queries force the coordinator node to split
the query up into single-partition queries and aggregate the results.
Rule #5 - Beware of Single Partition CQL IN queries
Question: Should I split the CQL IN query?
Sample:
▪ CQL: “Select * from ks.cf where pk = X and ck in (Y1, Y2, … Yn)”
Translated to:
▪ CQL:
o “Select * from ks.cf where pk = X and ck = Y1”
o “Select * from ks.cf where pk = X and ck = Y2”
o …
o “Select * from ks.cf where pk = X and ck = Yn”
Question: Should I split the CQL IN query?
Answer: It depends on how wide your rows are
Comments:
▪ Prior to Scylla 2.0, single-partition CQL IN queries performed very
badly in some wide-partition cases.
▪ All reported results are using Scylla 2.0
Rule #6 - There’s a faster way to do full scans
▪ The blog post efficient-full-table-scans-with-scylla outlined an
algorithm to do full scans; at a high level:
o split the range up into small sub ranges
o run “enough” sub ranges in parallel
▪ The follow-up blog post How to scan 475 million partitions 12x faster
using efficient full table scan provided a sample implementation
applying this
▪ Is there an even “faster” way?
▪ Yes, there is:
o Using the token ownership of nodes in the ring, one can select ranges of
tokens. Once a “range” has been processed, the next “range” can be
selected based on the ownership in the ring.
o An even more optimized solution would use the “sharding” information and
aim ranges at the shards of a machine, so that all cores are executing
requests in parallel.
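The basic algorithm from the blog posts (split the range into small sub ranges, run enough of them in parallel) can be sketched as follows. This is a minimal illustration with assumptions: it assumes 64-bit signed tokens, splits the ring evenly rather than by node or shard ownership, and `scan_subrange` is a hypothetical placeholder that only builds the statement and its bound values instead of executing it through a driver. The `ks.cf` table and `pk` column are the sample names used in Rule #5.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_TOKEN = -(2**63)     # assumes 64-bit signed tokens
MAX_TOKEN = 2**63 - 1

QUERY = "SELECT * FROM ks.cf WHERE token(pk) > ? AND token(pk) <= ?"


def split_token_ring(n):
    """Split the token ring into n contiguous (start, end] sub ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // n for i in range(n + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def scan_subrange(token_range):
    # Placeholder: a real scanner would execute QUERY (as a prepared,
    # paged statement) with these bound values and process the rows.
    return QUERY, token_range


def full_scan(n_ranges, parallelism):
    """Run the sub range scans with a bounded number of workers."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(scan_subrange, split_token_ring(n_ranges)))


ranges = split_token_ring(8)
statements = full_scan(n_ranges=8, parallelism=4)
print(len(statements), statements[0])
```

The ownership-aware variants described above would replace `split_token_ring` with ranges derived from the ring (and, further, from the shards on each machine); the parallel driver loop stays the same.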
Rule #7: Use the tools …
▪ Probabilistic tracing
▪ Slow query tracing
▪ Wireshark
▪ CQL Trace
▪ Enable client-side tracing
THANK YOU
shlomi@scylladb.com
@ShlomiLivne
Any questions?

 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
 
Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALLBLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
BLOCKCHAIN FOR DUMMIES: GUIDEBOOK FOR ALL
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
Cookies program to display the information though cookie creation
Cookies program to display the information though cookie creationCookies program to display the information though cookie creation
Cookies program to display the information though cookie creation
 
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
 
find out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challengesfind out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challenges
 
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
 
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
 

Scylla Summit 2017: Planning Your Queries for Maximum Performance

  • 1. Planning your queries for maximum performance. Shlomi Livne, VP R&D, ScyllaDB
  • 2. Shlomi Livne: Shlomi is VP of R&D at ScyllaDB. Prior to ScyllaDB he led the research and development team at Convergin, which was acquired by Oracle.
  • 3. How Scylla executes your queries
  • 4. Cluster View (diagram: a client and a cluster of eight nodes, with the Coordinator and Replica roles labeled)
  • 5. Coordinator Tasks
      1. Prepare the statement
      2. Single partition queries
          a. Select replicas (using cache heat info) and send query / digest requests, requesting a page of results
          b. Compare the digests; if there is a mismatch:
              i. Request data from selected replicas
              ii. Repair the data on replicas
          c. Return result
      3. Partition scan queries
          a. Split the request up based on the ring
          b. Send requests for data using ranges, requesting a page of results
          c. Merge results
          d. Return result
  • 6. Replica Tasks
      1. Receive a data/digest/range request
      2. Split the request up according to shards
      3. On each shard:
          a. Execute the request, merging data from memtables + cache/sstables
          b. For a data request:
              i. Prepare a result and return it (compute digest if RF > 1)
          c. For a digest request:
              i. Compute digest and return it
          d. For a partition scan request:
              i. Return the partition range data (do not prepare a result)
  • 7. Replica Shard Read Diagram: a read request is answered from the Memtable, the Row Cache, and the sstables; each sstable has Bloom Filter, Summary, Index, Compression, and Data components.
  • 8-9. Replica Shard Read Diagram (row cache populated). Read: P8:R1. Row Cache: P8:R1:A=8,B=7. Memtable: P8:R1:C=3. Sstable data: P8:R1:A=8 in one sstable, P8:R1:B=7 in another. Result: P8:R1 A=8,B=7,C=3.
  • 10-17. Replica Shard Read Diagram (row cache initially empty, animation frames). Read: P8:R1. Memtable: P8:R1:C=3. The sstables whose Bloom Filter reports P8 are consulted through Summary and Index; their data (P8:R1:A=8 and P8:R1:B=7) populates the Row Cache with P8:R1:A=8,B=7. Result: P8:R1 A=8,B=7,C=3.
  • 18. Row Cache
      ▪ Cache stores complete row data
      ▪ In addition to storing existing rows, the cache stores information about the completeness of clustering ranges (continuity), so reads that fall between cached rows are not treated as misses.
      ▪ Cache is populated on:
          o Queries
          o Memtable flush:
              • Data is merged, to keep the cache up to date with newly written sstables.
              • Data is inserted, in case there is no data for that partition on disk.
  • 19. Selecting Sstables
      ▪ Given a partition key (pk), the current set of sstables is reduced so that sstable X is included iff:
          o min_partition_key(sstable X) ≤ pk ≤ max_partition_key(sstable X)
          o bloom_filter(sstable X, pk) = True
      ▪ Scylla 2.0: sstables are read in parallel
      ▪ Scylla 2.1:
          o The reduced set of sstables is searched newest to oldest until a result can be constructed and we can prove that older sstables are not relevant.
          o Sstable read parallelism will grow starting from a single sstable
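The selection rule on this slide can be sketched as a simple filter. This is a minimal illustration, not Scylla's implementation: the `SSTable` class, its fields, and `bloom_may_contain` are hypothetical names, partition keys are modeled as plain integers, the key-range bounds are treated as inclusive (an assumption), and the Bloom filter is replaced by an exact set, so it never yields the false positives a real filter can.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    # Hypothetical stand-in for an sstable's metadata.
    name: str
    min_key: int
    max_key: int
    keys: set = field(default_factory=set)

    def bloom_may_contain(self, pk):
        # Stand-in for a Bloom filter lookup: exact membership here,
        # whereas a real filter may also answer True for absent keys.
        return pk in self.keys


def select_sstables(sstables, pk):
    """Keep only sstables whose key range covers pk and whose filter says pk may be present."""
    return [
        s for s in sstables
        if s.min_key <= pk <= s.max_key and s.bloom_may_contain(pk)
    ]
```

In this sketch an sstable is skipped either because pk lies outside its key range or because the filter rules it out; only the remaining sstables would need to be read.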
  • 20. 7 Rules To Optimize your Queries
  • 21. Rule #1 - Use Prepared statements
      ▪ Without prepared statements, the coordinator needs to pre-process the query on every execution:
          o A lot of repetitive work that could be done only once
          o Adds overhead to the execution of each query, which directly translates to throughput and latency
      ▪ The driver is not able to send the request to a coordinator node that holds the data (an additional hop)
      ▪ tip: compare scylla_query_processor_statements_prepared to the number of executed scylla_transport_requests_served
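To make the "repetitive work that can be done only once" point concrete, here is a toy model of the difference. It is only a sketch under stated assumptions: `ToyCoordinator` and all of its methods are hypothetical, "parsing" is reduced to splitting a string, and nothing here reflects how Scylla or any driver actually represents prepared statements.

```python
class ToyCoordinator:
    """Toy model: counts how often a query string has to be pre-processed."""

    def __init__(self):
        self.parse_count = 0
        self._prepared = {}

    def _parse(self, query):
        # Stand-in for the real pre-processing work done per statement.
        self.parse_count += 1
        return tuple(query.split())

    def execute_unprepared(self, query, params):
        # Unprepared: the query text is pre-processed on every execution.
        return self._parse(query), params

    def prepare(self, query):
        # Prepared: pre-process once and hand back an id for later executions.
        stmt_id = len(self._prepared)
        self._prepared[stmt_id] = self._parse(query)
        return stmt_id

    def execute_prepared(self, stmt_id, params):
        # Only the id and the bound values are needed at execution time.
        return self._prepared[stmt_id], params
```

Executing the same statement many times through `execute_unprepared` grows `parse_count` with every call, while the `prepare` plus `execute_prepared` path parses exactly once.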
  • 22. Sample: single Scylla server, using c-s
      Results                      Unprepared   Prepared
      op rate                      13037        18704
      partition rate               13037        18704
      row rate                     13037        18704
      latency mean                 1.5          1.1
      latency median               1.3          1
      latency 95th percentile      2.9          1.6
      latency 99th percentile      6.2          2.5
      latency 99.9th percentile    12.2         7.1
      latency max                  31.1         16.9
      Total partitions             100000       100000
  • 23. Rule #2 - Use Paging
      ▪ Paging disabled: the coordinator is forced to prepare a single result that holds all the data and send it back:
          o If the coordinator is not able to return a response (cannot allocate enough memory for the single result), an error is returned to the client
          o tip: compare scylla_transport_unpaged_queries to scylla_cql_reads to detect whether many of your read queries are unpaged
  • 24. Rule #3 - Use correct Page Size
      ▪ Drivers enable paging by default, with a default page_size of 5000 rows (java, python, gocql)
      ▪ CQL requires returning at least one result and allows returning fewer results than the page size
      ▪ Scylla utilizes this:
          o Scylla caps a page at ~1MB of memory, so it will return fewer rows than requested when rows are large
          o Do not use the number of returned results as an indication of whether there are more results
  • 25. (Figure; the only recoverable label is "Has more pages".)
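The advice not to infer the end of the results from the row count can be illustrated with a small simulation. This is a sketch, not driver code: `fetch_page` and `fetch_all` are hypothetical helpers, the "server" is a Python list of byte strings, and the 1,000,000-byte cap is an assumed stand-in for the ~1MB limit mentioned on the slide. The client loop stops only when the has-more-pages flag is false.

```python
def fetch_page(rows, start, page_size, max_bytes=1_000_000):
    """Simulated server: return up to page_size rows, capped by total bytes.

    Always returns at least one row when any remain, and reports whether
    more pages exist independently of how many rows were returned.
    """
    page, used, i = [], 0, start
    while i < len(rows) and len(page) < page_size:
        size = len(rows[i])
        if page and used + size > max_bytes:
            break  # byte cap reached: return a short page
        page.append(rows[i])
        used += size
        i += 1
    has_more_pages = i < len(rows)
    return page, i, has_more_pages


def fetch_all(rows, page_size):
    """Client loop: keep fetching until the flag says there is nothing left."""
    out, pos, more, pages = [], 0, True, 0
    while more:
        page, pos, more = fetch_page(rows, pos, page_size)
        out.extend(page)
        pages += 1
    return out, pages
```

With five rows of 400,000 bytes each and a page size of 5000, every page in this simulation is far shorter than the requested size, yet a client that stopped on the first short page would miss most of the data.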
  • 26. Scylla 2.0: does the default page_size make sense
      Test: duration in milliseconds fetching a single wide partition with 10^8 bytes split into rows, using different page sizes
      page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
      10          timed out                2104.492031               331.087871                173.932543
      50          5679.087615              737.148927                202.113023                168.165375
      100         4034.920447              573.046783                186.384383                168.951807
      500         2663.383039              415.760383                183.894015                173.015039
      1000        2451.570687              395.313151                182.976511                168.427519
      5000        2285.895679              400.031743                184.942591                169.345023
      10000       2281.701375              399.769599                183.369727                169.738239
      50000       2273.312767              396.099583                183.107583                170.000383
  • 27. C* 3.11.0: does the default page_size make sense
      Test: duration in milliseconds fetching a single wide partition with 10^8 bytes split into rows, using different page sizes
      page size   10^6 rows of 100 bytes   10^5 rows of 1000 bytes   10^4 rows of 10^4 bytes   1000 rows of 10^5 bytes
      10          timed out                4030.726143               903.872511                364.380159
      50          12876.51328              1535.115263               419.430399                300.941311
      100         8992.587775              1202.716671               405.274623                316.407807
      500         6400.507903              907.542527                354.680831                348.651519
      1000        6077.546495              874.512383                360.972287                370.409471
      5000        5620.367359              791.674879                422.051839                358.612991
      10000       5490.343935              793.772031                389.021695                360.447999
      50000       5662.310399              913.833983                383.516671                355.467263
      tip: consider changing the page size if your rows are large
  • 28. Rule #4 - Beware of Multi Partition CQL IN queries
      ▪ Multi-partition CQL IN queries force the coordinator node to split the query into single partition queries and aggregate the results.
  • 29. Rule #5 - Beware of Single Partition CQL IN queries
      Question: Should I split the CQL IN query?
      Sample:
      ▪ CQL: "Select * from ks.cf where pk = X and ck in (Y1, Y2, … Yn)"
      Translated to:
      ▪ CQL:
          o "Select * from ks.cf where pk = X and ck = Y1"
          o "Select * from ks.cf where pk = X and ck = Y2"
          o …
          o "Select * from ks.cf where pk = X and ck = Yn"
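The translation on this slide can be sketched as a small helper that produces both forms, so the two can be benchmarked against each other. It is illustrative only: `in_query` and `split_in_query` are hypothetical names, the column names `pk` and `ck` follow the slide, and bind markers (`?`) are used on the assumption that the statements would be prepared, in line with Rule #1.

```python
def in_query(table, pk, cks):
    """Single statement using IN over the clustering keys."""
    markers = ", ".join("?" for _ in cks)
    cql = f"SELECT * FROM {table} WHERE pk = ? AND ck IN ({markers})"
    return cql, (pk, *cks)


def split_in_query(table, pk, cks):
    """Equivalent list of single-row statements, one per clustering key."""
    cql = f"SELECT * FROM {table} WHERE pk = ? AND ck = ?"
    return [(cql, (pk, ck)) for ck in cks]
```

The split form reuses one statement text for every clustering key, so it needs to be prepared only once, whereas the IN form has a different text for each distinct number of keys.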
  • 30-33. (Image-only slides; no text was captured in the transcript.)
  • 34. Question: Should I split the CQL IN query?
      Answer: It depends on how wide your rows are
      Comments:
      ▪ Prior to Scylla 2.0, single partition CQL IN queries performed very badly in some wide partition cases.
      ▪ All reported results are using Scylla 2.0
  • 35. Rule #6 - There's a faster way to do full scans
      ▪ The blog post efficient-full-table-scans-with-scylla outlined an algorithm to do full scans; at a high level:
          o split the range up into small sub ranges
          o run "enough" sub ranges in parallel
      ▪ The follow-up blog post, How to scan 475 million partitions 12x faster using efficient full table scan, provided a sample implementation applying this
      ▪ Is there an even "faster" way?
  • 36. ▪ Yes there is:
          o Using the token ownership of nodes in the ring, one can select ranges of tokens. Once a "range" has been processed, the next "range" can be selected based on the ownership in the ring.
          o An even more optimized solution would use the "sharding" information and aim ranges at the shards of a machine, so that all cores are executing requests in parallel.
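The basic split-and-parallelize step from Rule #6 can be sketched as follows. This is a minimal illustration under assumptions, not the blog posts' implementation: it assumes the Murmur3 partitioner's signed 64-bit token space, `scan_subrange` is a hypothetical callable standing in for a query of the form `WHERE token(pk) >= ? AND token(pk) <= ?`, and it does not attempt the ownership-aware or shard-aware range selection described on the last slide.

```python
from concurrent.futures import ThreadPoolExecutor

# Token bounds of a signed 64-bit token space (assumed: Murmur3 partitioner).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_range(n):
    """Split the full token space into n contiguous, inclusive (start, end) sub ranges."""
    total = 2**64  # number of distinct tokens
    bounds = [MIN_TOKEN + total * i // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]


def parallel_scan(scan_subrange, n_ranges, parallelism):
    """Run scan_subrange(start, end) over every sub range, `parallelism` at a time."""
    ranges = split_token_range(n_ranges)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(lambda r: scan_subrange(*r), ranges))
```

Choosing many more sub ranges than workers keeps each request small while the pool size bounds how many are in flight at once; both numbers would need tuning against a real cluster.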
  • 37. Rule #7: Use the tools
      ▪ Probabilistic tracing
      ▪ Slow query tracing
      ▪ Wireshark
      ▪ CQL Trace
      ▪ Enable client-side tracing
  • 38. THANK YOU. Any questions? shlomi@scylladb.com @ShlomiLivne