How to Think Like the SQL Server Engine

Welcome to the online class!
Slides, scripts, videos: BrentOzar.com/go/engine
Want to follow along? Download the small Stack Overflow 2010
database: BrentOzar.com/go/querystack
We’ll start at 5 minutes after the hour to give folks time to get
GoToWebinar working.
To chat with me & students: https://BrentOzar.com/slack in the
#BrentOzarUnlimited room. (Not GoToWebinar Q&A.)

How to Think Like
the SQL Server Engine
Brent Ozar, 2019/01/25

We’re using Stack Overflow data.
Open source, licensed with Creative Commons
SQL Server: BrentOzar.com/go/querystack
XML dump: archive.org/details/stackexchange
I’m using SQL Server 2019,
compatibility level 150/2019.
My cost threshold for parallelism is 5.

[Diagram: an 8KB page, made up of a page header, index or data rows, and a slot array]

You: SQL Server.
Me: end user.

First query:
SELECT Id
FROM dbo.Users

Your execution plan:
1. Shuffle through all of the pages,
saying the Id of each record out loud.

SET STATISTICS IO ON
Logical reads: the number of 8K pages we read.
(7,405 x 8KB = 59MB)
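
A minimal sketch of measuring this yourself, assuming the Stack Overflow 2010 database linked above is your current database. The 7,405 figure is the one from the slide; your own read count may differ.

```sql
-- Report page reads for each statement in the Messages tab.
SET STATISTICS IO ON;

SELECT Id
FROM dbo.Users;

-- Convert the slide's logical reads into kilobytes: 7,405 pages x 8KB each.
SELECT 7405 * 8 AS KilobytesRead;  -- 59,240 KB, which the slide rounds to 59MB

SET STATISTICS IO OFF;
```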

Let’s add a filter.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'

Your execution plan:
1. Shuffle through all of the pages,
saying the Id of each record out loud,
if their LastAccessDate > '2014/07/01'.

Lesson:
Using WHERE
without a matching index
means scanning all the data.
(And there are some extra reads when queries go parallel – but more on that in
our more advanced classes.)
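
One rough way to see this lesson in the numbers, assuming no index on LastAccessDate exists yet: run both queries with STATISTICS IO on and compare the logical reads lines. They should come out about the same, because both queries read the whole table.

```sql
SET STATISTICS IO ON;

-- No filter: reads every page of the table.
SELECT Id
FROM dbo.Users;

-- With a filter but no supporting index: still reads every page,
-- it just returns fewer rows.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01';
```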

Lesson:
Estimated Subtree Cost is a rough measure
of CPU and IO work required for a query.
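
If you'd rather see the cost as text than hover over the plan, one option is SHOWPLAN_ALL. This is only a sketch: it returns the estimated plan as rows without running the query, and the TotalSubtreeCost column on the first row is the cost for the whole statement.

```sql
-- SET SHOWPLAN_ALL has to be the only statement in its batch, hence the GOs.
SET SHOWPLAN_ALL ON;
GO

-- Not executed while SHOWPLAN_ALL is on; only the estimated plan comes back.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01';
GO

SET SHOWPLAN_ALL OFF;
GO
```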

Let’s add a sort.
SELECT Id
FROM dbo.Users
ORDER BY LastAccessDate

Your execution plan
1. Shuffle through all of the pages,
writing down fields __________ for each record,
2. Sort the matching records by LastAccessDate.

Order By:
Cost is up about 2x
We needed space to
write down our results,
so we got a memory grant

You can see more in Properties

You can’t always get what you want.
Memory is set when the query
starts, and not revised.
SQL Server has to assume
other people will run queries at
the same time as you.
Your memory grant can change
with each time that you run a
query.
* - This screenshot is from a different query to show variances.
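
A sketch of watching grants live, assuming you have permission to view server state: run this from a second session while the sorting query is still running. Queries that have already finished will not show up here.

```sql
-- One row per query that has requested a memory grant right now.
SELECT session_id,
       requested_memory_kb,  -- what the query asked for
       granted_memory_kb,    -- what it actually got
       used_memory_kb,       -- what it is using at this moment
       max_used_memory_kb    -- the most it has used so far
FROM sys.dm_exec_query_memory_grants;
```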

And if you run out of memory…

Let’s get all the fields.
SELECT *
FROM dbo.Users
ORDER BY LastAccessDate

But why does it suck?
Do we work harder to read the data?
Do we work harder to write the data?
Do we work harder to sort the data?
Do we work harder to output the data?

Of a MUCH larger
overall cost.

            SELECT Id   SELECT *
No order        6          6
ORDER BY       13        871

Lesson:
Sorting data is expensive, and more fields
make it worse.

Let’s run it a few times.
SELECT *
FROM dbo.Users
ORDER BY LastAccessDate;
GO 100

Your execution plan
1. Shuffle through all of the pages,
writing down all the fields for each record,
2. Sort the matching records by LastAccessDate.
3. Keep the output so you could reuse it the next time
you saw this same query?

Oracle can.
(One of the reasons it costs $47,000 per core.)

Another reason

SQL Server reads & sorts 100 times.

Lesson:
SQL Server caches data pages, not query
output.
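
A rough way to see the data page cache for yourself, assuming you can view server state. This counts the 8KB pages from the current database that are sitting in memory. There is no equivalent view for cached query results, because SQL Server does not keep them.

```sql
SELECT COUNT(*)            AS CachedPages,
       COUNT(*) * 8 / 1024 AS CachedMB   -- 8KB per page, converted to MB
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
```
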
So how do we
make this fast?

Nonclustered indexes: copies.
Stored in order we want, include the fields we want
CREATE INDEX
IX_LastAccessDate_Id
ON dbo.Users(LastAccessDate, Id)
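
Since the index is a copy, it takes up its own pages. A sketch for comparing its size against the clustered index after you create it:

```sql
-- Pages used by each index on dbo.Users.
-- The clustered index holds every field; the nonclustered copy holds only its own.
SELECT i.name                  AS IndexName,
       i.type_desc             AS IndexType,
       SUM(ps.used_page_count) AS UsedPages
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.Users')
GROUP BY i.name, i.type_desc;
```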

Leaf pages
(we’re focusing on these)
“Index” pages
(but exist for both clustered and
nonclustered indexes)

Let’s go simple again.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate

Your execution plan
1. Grab IX_LastAccessDate_Id and seek to 2014/07/01.
2. Read the Ids out in order.

                       SELECT Id   SELECT *
No order                   6          6
ORDER BY                  13        871
ORDER BY, with index      <1         48

Why cheaper?
For starters, it does fewer
logical reads…

And less CPU, too.
SET STATISTICS TIME shows you
how much CPU time each query
burned up.
The index eliminates the sort, which
burned up our CPUs.
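
A sketch for comparing the CPU time yourself, assuming IX_LastAccessDate_Id already exists and the table's clustered index has index id 1. The hint on the second query is only there to imitate the "before" plan by forcing the clustered index and a sort.

```sql
SET STATISTICS TIME ON;

-- With the index available: rows come back already in order, no sort needed.
SELECT Id
FROM dbo.Users
ORDER BY LastAccessDate;

-- Forced onto the clustered index (index id 1): SQL Server has to sort.
SELECT Id
FROM dbo.Users WITH (INDEX (1))
ORDER BY LastAccessDate;

SET STATISTICS TIME OFF;
```

Compare the "CPU time" values in the Messages tab for the two statements.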

The index covers the fields
needed by the query,
so we call it a covering index.*
*But covering isn’t really a special kind of index –
it’s only covering when we’re talking about a query.

So nonclustered index seeks are
great, right?

“Seek” sounds small, right?
But that’s a lot of data.

You probably think “seek” means,
“I’m going to jump to a row and read that one row.”
You probably think “scan” means,
“I’m going to read the whole thing.”

Note that date
“Seek” = read all rows

SQL Server doesn’t know.
You and I know this means the whole table:
But SQL Server doesn’t, and can’t guarantee it
unless you tell it more about the data in the table,
like add a constraint.
(More on that in other classes.)

Seek means,
“I’m going to jump to a row and start reading.”
Scan means,
“I’m going to start at either end of the object
(might be either the start, or the end)
and start reading.”
Neither term defines
how many rows will be read.

Seeks vs scans
A seek can start at the first row,
and read the entire table.
A scan can start at one end of the table,
and only read a few pages.
We can’t just say, “All index seeks! We’re done.”
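
Two illustrative queries, assuming IX_LastAccessDate_Id exists and that every user's LastAccessDate falls after the year 1800. The date and the TOP value are made up for the example, and the plans you get may vary.

```sql
-- Likely an index seek that reads every row:
-- it jumps to 1800/01/01, and everything comes after that.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '1800/01/01'
ORDER BY LastAccessDate;

-- Likely a scan that reads only a few pages:
-- it starts at one end and can stop after 10 rows.
SELECT TOP 10 Id
FROM dbo.Users;
```

Check the number of rows read on each operator, not just its name.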

Lessons we learned
SET STATISTICS IO ON: shows # of 8KB pages read
SET STATISTICS TIME ON: shows CPU work done
WHERE without a supporting index: table scan
ORDER BY without a supporting index: CPU work
Indexes reduce page reads and sorts
Seek != awesome, and scan != terribad

Key Lookups and
Cardinality Estimation

Let’s add a couple of fields.
SELECT Id, DisplayName, Age
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate

One execution plan
1. Grab IX_LastAccessDate_Id, seek to 2014/07/01.
2. Write down the Id and LastAccessDate of matching
records.
3. Grab the clustered index (white pages), and look up
each matching row by their Id to get DisplayName and
Age.

That’s why SQL includes the key
For simplicity, I said I created this index with the Id.
SQL Server always includes your clustering keys whether
you ask for ‘em or not because it has to join indexes
together.
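
A sketch to prove it to yourself, assuming dbo.Users is clustered on Id. The index name IX_LastAccessDate is hypothetical, made up for this example.

```sql
-- Hypothetical index: only LastAccessDate is named as a key.
CREATE INDEX IX_LastAccessDate
ON dbo.Users (LastAccessDate);

-- Id is not in the index definition, yet forcing this index should still
-- give a plan with no key lookup, because Id rides along as the clustering key.
SELECT Id
FROM dbo.Users WITH (INDEX (IX_LastAccessDate))
WHERE LastAccessDate > '2014/07/01';
```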

Classic index
tuning sign
Key lookup is required when the
index doesn’t have all the fields we
need.
Hover your mouse over the key
lookup, look for the OUTPUT.
Small fields? Frequently used?
Add ‘em to the index.
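
Beyond spotting lookups in a single plan, one rough way to see how often they happen overall is the index usage DMV. A sketch, with the caveat that these counters reset when SQL Server restarts:

```sql
-- user_lookups on the clustered index counts key lookups done against it.
SELECT i.name AS IndexName,
       us.user_seeks,
       us.user_scans,
       us.user_lookups
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.object_id = us.object_id
 AND i.index_id  = us.index_id
WHERE us.database_id = DB_ID()
  AND us.object_id   = OBJECT_ID('dbo.Users');
```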

Lesson:
Even with indexes,
there’s a tipping point where it’s more efficient
for SQL to just scan the table once and get
out.
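
A sketch for finding the tipping point on your own data, assuming IX_LastAccessDate_Id exists and the clustered index has index id 1. The hints force each plan shape so you can compare logical reads side by side. Try moving the date earlier or later and watch which one wins.

```sql
SET STATISTICS IO ON;

-- Force the index seek + key lookup plan.
SELECT Id, DisplayName, Age
FROM dbo.Users WITH (INDEX (IX_LastAccessDate_Id))
WHERE LastAccessDate > '2014/07/01';

-- Force a clustered index scan instead.
SELECT Id, DisplayName, Age
FROM dbo.Users WITH (INDEX (1))
WHERE LastAccessDate > '2014/07/01';
```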

Statistics help SQL Server:
Decide which index to use
What order to process tables/indexes in
Whether to do seeks or scans
Guess how many rows will match your query
How much memory to allocate for the query
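
To look at the statistics SQL Server is working from, one option is DBCC SHOW_STATISTICS. This sketch assumes the IX_LastAccessDate_Id index from earlier, which gets a statistic of the same name.

```sql
-- Show the histogram: up to 200 steps describing how LastAccessDate is distributed.
-- RANGE_HI_KEY is the top value of each step; EQ_ROWS and RANGE_ROWS are row counts.
DBCC SHOW_STATISTICS ('dbo.Users', IX_LastAccessDate_Id)
WITH HISTOGRAM;
```

For a range filter, the row estimate comes roughly from adding up the rows in the steps that fall inside the range.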

WHERE LastAccessDate
> '2014/07/01'
Add it up

Examples of varchar & int stats

Two ways you can help
1. Keep your stats updated at least weekly.
Automatic stats updates aren’t enough. Consider Ola
Hallengren’s free scripts: Ola.Hallengren.com
2. Learn which T-SQL elements will cause cardinality
estimation problems, ignoring statistics
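
A minimal sketch of checking and refreshing stats by hand on one table. For real maintenance across a whole server, scheduled scripts like the ones mentioned above are a better fit than running this manually.

```sql
-- When was each statistic on dbo.Users last updated,
-- and how many modifications have happened since?
SELECT s.name AS StatName,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Users');

-- Refresh every statistic on the table.
UPDATE STATISTICS dbo.Users;
```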

Estimated 2,076 rows
Estimated 2 rows
Both produce the same 2,443 rows, but they
use 2 different ways to retrieve those rows due
to their different estimates.

The classic problem
SQL Server has to decide between:
• Scanning the entire table,
which is great for big data, or
• An index seek + key lookup,
which is better for small data
It bases this decision on
cardinality estimation – and it’s not perfect.
We can avoid this problem by
widening our nonclustered index.

CREATE INDEX IX_LastAccessDate_Id_DisplayName_Age
ON dbo.Users (LastAccessDate, Id, DisplayName, Age)
Or:
CREATE INDEX IX_LastAccessDate_Id_Includes
ON dbo.Users (LastAccessDate, Id)
INCLUDE (DisplayName, Age)

Same query again
SELECT Id, DisplayName, Age
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate

Yay! Back to a single operator.

Lessons we learned
Index seek + key lookup = we may need wider indexes
Statistics help SQL Server pick indexes, methods
Cardinality estimation isn’t perfect (especially with real-
world T-SQL and joins to multiple tables)
You can help by understanding SQL’s limitations
and crafting your T-SQL to avoid them

Your next steps
Full How to Think Like the Engine: free videos
Fundamentals of Index Tuning: 1-day online class
Mastering Index Tuning: 3-day online class
Learn more:
BrentOzar.com/go/engine
