MY DATABASE SKILLS
KILLED THE SERVER
Dave Ferguson
@dfgrumpy
CFSummit 2015
WHO AM I?
I am an Adobe Community Professional
I started building web applications a long time ago
Contributor to Learn CF in a Week
I have a ColdFusion podcast called
CFHour w/ Scott Stroz (@boyzoid)
(please listen)
3x District Champion in Taekwondo
WHAT WILL WE COVER?
• Running Queries
• When good SQL goes bad
• Bulk processing
• Large volume datasets
• Indexes
• Outside influences
My Database Skills Killed the Server
(I KNOW SQL)
“WHY AM I HERE?”
Because you have probably written something like
this…
select * from myTable
“I can write SQL in my sleep”
select * from myTable where id = 2
“I can write joins and
other complex SQL”
Select mt.* from myTable mt
join myOtherTable mot
on mt.id = mot.id
where mot.id = 2
“I might even create a table”
CREATE TABLE `myFakeTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`someName` varchar(150) NOT NULL DEFAULT '',
`someDescription` text,
`type` varchar(50) DEFAULT NULL,
`status` int(11) NOT NULL,
PRIMARY KEY (`id`)
);
But, how do you know if what
you did was the best / most
efficient way to do it?
Did the
internet tell
you it was
right?
Did you get
some advice
from someone?
“My app works fine. It has
thousands of queries and we
only see slowness every once in
a while.”
Have you ever truly looked at
what your queries are doing?
Most developers don't bother.
They leave all that technical
database stuff up to the DBA.
But what if you are the
developer AND the DBA?
Query Plan
Read-only and shared; reused across executions
At most a serial and a parallel copy per query
Execution Context
Holds the data specific to one execution (e.g. parameter values)
Created for each execution of a query
QUERY EXECUTION
Execution Context & Query Plan
Have you ever looked
at a query plan?
Do you know what a query plan is?
Query Plan, in the event you were curious…
WHAT A QUERY PLAN WILL TELL YOU
• Path taken to get data
• Almost like a Java stack trace
• Index usage
• How the indexes are being used
• Cost of each section of plan
• Possible suggestions for performance improvement
• Whole bunch of other stuff
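The slides don’t show the commands, but if you want to pull up a plan yourself, something like this works (SQL Server and MySQL shown; the myTable query is the example from earlier):

-- SQL Server: return the estimated plan as XML instead of running the query
SET SHOWPLAN_XML ON;
GO
select id, name from myTable where id = 2;
GO
SET SHOWPLAN_XML OFF;
GO

-- MySQL: ask the optimizer how it plans to execute the query
EXPLAIN select id, name from myTable where id = 2;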
How long are plans / contexts kept?
► 1 Hour
► 1 Day
► ‘Til SQL Server restarts
► Discards it immediately
► The day after forever
► Till the server runs out of cache space
What can cause plans to be flushed from cache?
► Forced via code (example below)
► Memory pressure
► ALTER statements
► Statistics updates
► AUTO_UPDATE_STATISTICS being on
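As an aside (not on the slide), “forced via code” on SQL Server usually means something like this; it clears the entire plan cache, so treat it as a diagnostic tool rather than a routine fix:

-- Flush all cached plans (use sparingly, and almost never in production)
DBCC FREEPROCCACHE;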
HOW CAN WE KEEP THE
DATABASE FROM THROWING
AWAY THE PLANS?
MORE IMPORTANTLY,
HOW CAN WE GET THE DATABASE
TO USE THE CACHED PLANS?
• Force it
• Use params
• Use stored procedures
• Get more RAM
• Use fewer queries
SIMPLE ANSWER
HOW DOES SQL
DETERMINE IF THERE
IS A QUERY PLAN?
Something like this…
THIS QUERY WILL
CREATE AN EXECUTION CONTEXT…
select id, name from myTable where id = 2
THAT…
WILL NOT BE USED BY
THIS QUERY.
select id, name from myTable where id = 5
WHY IS THAT?
Well, the queries are
not the same.
According to the SQL optimizer,
select id, name from myTable where id = 2
select id, name from myTable where id = 5
this query…
and this query…
are not the same.
So, they each get their own execution context.
PLANS CAN BECOME DATA HOGS
select id, name from myTable where id = 2
If the query above ran 5,000 times over the
course of an hour (with different ids), you could
have that many plans cached.
At roughly 24 KB per plan, that could equal around 120 MB of cache space!
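A rough way to see this piling up on SQL Server (illustrative, not from the slides) is to look at single-use ad hoc plans in the plan cache:

-- Largest ad hoc plans currently cached, with how often each has been reused
SELECT TOP 20
    cp.usecounts,
    cp.size_in_bytes,
    st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Adhoc'
ORDER BY cp.size_in_bytes DESC;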
TO RECAP…
EXECUTION CONTEXTS
ARE GOOD
TOO MANY ARE BAD
USING QUERY PARAMS…
The secret sauce to plan reuse
<cfquery name="testQuery">
select a.ARTID, a.ARTNAME from ART a
where a.ARTID = <cfqueryparam value="5" cfsqltype="cf_sql_integer">
</cfquery>
Using a simple query… let’s add a param for the id.
select a.ARTID, a.ARTNAME from ART a
where a.ARTID = ?
THE QUERY OPTIMIZER SEES THIS…
testQuery (Datasource=cfartgallery, Time=1ms, Records=1)
in /xxx/x.cfm
select a.ARTID, a.ARTNAME from ART a
where a.ARTID = ?
Query Parameter Value(s) -
Parameter #1(cf_sql_integer) = 5
THE DEBUG OUTPUT LOOKS LIKE THIS…
testQuery (Datasource=cfartgallery, Time=8ms, Records=5) in /xxx/x.cfm
select a.ARTID, a.ARTNAME from ART a
where a.ARTID in (?,?,?,?,?)
Query Parameter Value(s) -
Parameter #1(CF_SQL_CHAR) = 1
Parameter #2(CF_SQL_CHAR) = 2
Parameter #3(CF_SQL_CHAR) = 3
Parameter #4(CF_SQL_CHAR) = 4
Parameter #5(CF_SQL_CHAR) = 5
IT EVEN WORKS ON LISTS…
testQuery (Datasource=cfartgallery, Time=3ms, Records=1) in /xxx/x.cfm
select a.ARTID, a.ARTNAME, (
select count(*) from ORDERITEMS oi where oi.ARTID = ?
) as ordercount
from ART a
where a.ARTID in (?)
Query Parameter Value(s) -
Parameter #1(cf_sql_integer) = 5
Parameter #2(cf_sql_integer) = 5
MORE ACCURATELY, THEY WORK ANYWHERE
YOU WOULD HAVE DYNAMIC INPUT...
When can plans cause more harm
than help?
► When your data structure changes
► When data volume grows quickly
► When you have data with a high
degree of cardinality.
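For the high-cardinality case in particular, a plan cached for one value can be a poor fit for another (parameter sniffing). One hedge on SQL Server, not covered on the slide, is to ask for a fresh plan on a problem statement:

select id, someName
from myFakeTable
where status = 2
OPTION (RECOMPILE); -- build a new plan for this execution instead of reusing the cached one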
How do I deal
with all this data?
What do I mean by large data sets?
► Tables over 1 million rows
► Large databases
► Heavily denormalized data
Ways to manage large data
► Only return what you need (no “select *”)
► Try to page the data in some fashion (see the paging sketch after this list)
► Optimize indexes to speed up where clauses
► Avoid triggers on large-volume inserts / multi-row updates
► Reduce any post-query processing as much as possible
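A minimal paging sketch for the “page the data” bullet above, using the myFakeTable columns from earlier (syntax differs by engine):

-- SQL Server 2012+: rows 51–75, i.e. page 3 at 25 rows per page
select id, someName
from myFakeTable
order by id
OFFSET 50 ROWS FETCH NEXT 25 ROWS ONLY;

-- MySQL equivalent
select id, someName
from myFakeTable
order by id
LIMIT 25 OFFSET 50;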
Inserting / Updating large datasets
► Reduce calls to the database by combining queries
► Use the bulk loading features of your database (see the sketch after the combining-queries slides)
► Use XML/JSON to load data into the database
Combining Queries: Instead of doing this…
Do this…
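The original screenshots aren’t in this transcript, but the idea looks roughly like this, using the myFakeTable columns from earlier and hypothetical rows. Instead of one round trip per row:

insert into myFakeTable (someName, status) values ('Widget 1', 1);
insert into myFakeTable (someName, status) values ('Widget 2', 1);
insert into myFakeTable (someName, status) values ('Widget 3', 1);

send one multi-row statement per batch:

insert into myFakeTable (someName, status) values
    ('Widget 1', 1),
    ('Widget 2', 1),
    ('Widget 3', 1);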
Gotchas of query combining
► Errors could cause the whole batch to fail
► Overflowing the allowed query string size
► Database locking can be problematic
► Difficult to get a usable result back from the query
Upsides of query combining
► Reduces network calls to the database
► Processed as a single batch in the database
► Generally many times faster than doing the inserts one at a time
I have used this technique to insert over 50k rows
into MySQL in under one second.
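For the “bulk loading features” bullet from a couple of slides back, MySQL’s bulk loader is the usual tool when even combined inserts aren’t fast enough. A sketch with a hypothetical file path and layout:

-- Load a CSV straight into the table (path and column list are illustrative)
LOAD DATA INFILE '/tmp/import.csv'
INTO TABLE myFakeTable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(someName, someDescription, type, status);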
Indexes
The secret art
of a faster select
Index Types
► Unique
► Primary key or row ID
► Covering
► A collection of columns indexed in an order that matches your where clauses
► Clustered
► The way the data is physically stored
► A table can only have one
► NonClustered
► Only contains the indexed data, with a pointer back to the source data
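A few index definitions to make those types concrete, again using myFakeTable (the index names are illustrative):

-- Unique index: MySQL or SQL Server
CREATE UNIQUE INDEX ix_myFakeTable_someName ON myFakeTable (someName);

-- Covering / composite index: column order should match your where clauses
CREATE INDEX ix_myFakeTable_type_status ON myFakeTable (type, status);

-- SQL Server can also INCLUDE extra columns so the index alone covers the select list
CREATE NONCLUSTERED INDEX ix_myFakeTable_status
    ON myFakeTable (status) INCLUDE (someName);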
Seeking and Scanning
► Index SCAN (table scan)
► Touches all rows
► Useful only if the table contains a small number of rows
► Index SEEK
► Only touches rows that qualify
► Useful for large datasets or highly selective queries
► Even with an index, the optimizer may still opt to perform a scan
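A quick illustration of what tends to seek versus scan, assuming an index on someName:

-- Likely a SEEK: equality on an indexed, selective column
select id, someName from myFakeTable where someName = 'Widget 1';

-- Likely a SCAN: the leading wildcard keeps the index on someName from being seeked
select id, someName from myFakeTable where someName like '%idget 1';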
To index or not to index…
► DO INDEX
► Large datasets where 10 – 15% of the data is usually returned
► Columns used in where clauses with high cardinality
► e.g. a user name column where values are unique
► DON’T INDEX
► Small tables
► Columns with low cardinality
► e.g. any column with only a couple of unique values
Do I really need an index?
It Depends.
Really… it Depends!
Outside influences
Other things that can affect performance
► Processor load
► Memory pressure
► Hard drive I/O
► Network
Processor
► Give the SQL Server process CPU priority
► Watch for other processes on the server using excessive CPU cycles
► Have enough cores to handle your database activity
► Try to keep average processor load below 50% so the system can handle spikes gracefully
Memory (RAM)
► Get a ton (RAM is cheap)
► Make sure you have enough RAM to keep your server from doing excess paging
► Make sure your DB is using the RAM in the server
► Allow the DB to use RAM for cache (see the sketch below)
► Watch for other processes using excessive RAM
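On SQL Server, “allow the DB to use RAM for cache” while still leaving headroom for the OS usually comes down to the max server memory setting; the value here is only an example:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 8192; -- example value; size it for your server
RECONFIGURE;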
Drive I/O
► Drive I/O is usually the largest bottleneck on the server
► Drives can only perform one operation at a time
► Make sure you don’t run out of space
► Purge log files
► Don’t store all DB and log files on the same physical drives
► On Windows, don’t put your DB on the C: drive
► If possible, use SSD drives for tempdb or other highly transactional DBs
► Log drives should be in write priority mode
► Data drives should be in read priority mode
Network
► Only matters if the app server and DB server are on separate machines (they should be)
► Minimize network hops between servers
► Watch for network traffic spikes that slow data retrieval
► Retrieving only the data you need speeds up transfer from the DB server to the app server
► Split network traffic on the SQL server across multiple NICs so that general network traffic doesn’t impact DB traffic
Some Important
Database Statistics
Important stats
► Recompiles
► A proc recompiling while running shouldn’t occur
► Caused by code in the proc or by memory issues
► Latch Waits
► Low-level locks inside the DB; should be under 10 ms
► Lock Waits
► Data lock waits caused by a thread waiting for another lock to clear
► Full Scans
► Select queries not using indexes
Important stats continued…
► Cache Hit Ratio
► How often the DB is hitting the memory cache vs. disk (see the query sketch below)
► Disk Read / Write times
► Access or write times to the drives
► SQL Processor time
► SQL Server processor load
► SQL Memory
► Amount of system memory being used by SQL
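For the cache hit ratio, SQL Server exposes the raw counters in a DMV; the ratio has to be computed against its base counter (illustrative query, not from the slides):

SELECT (a.cntr_value * 100.0 / b.cntr_value) AS buffer_cache_hit_ratio_pct
FROM sys.dm_os_performance_counters AS a
JOIN sys.dm_os_performance_counters AS b
    ON a.object_name = b.object_name
WHERE a.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base'
  AND a.object_name LIKE '%Buffer Manager%';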
Where SQL goes wrong
(Good examples of bad SQL)
Inline queries that shouldn’t be
Over joining data
Transactions – Do you see the issue?
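The code screenshot for this slide isn’t in the transcript, but a common version of the transaction problem looks something like this (hypothetical statements): slow, unrelated work runs inside the transaction and holds locks the whole time.

-- Problem: the transaction stays open while slow work runs
BEGIN TRANSACTION;
    update myFakeTable set status = 2 where id = 5;
    -- ...call a web service, send an email, loop over a large result set...
COMMIT TRANSACTION;

-- Better: keep the transaction short and do the slow work after it commits
BEGIN TRANSACTION;
    update myFakeTable set status = 2 where id = 5;
COMMIT TRANSACTION;
-- slow work happens here, once the locks are released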
My Database Skills Killed the Server
THANK YOU
Dave Ferguson
@dfgrumpy
dave@dkferguson.com
www.cfhour.com
CFSummit 2015
Don’t forget to fill out the survey
