"But It Worked In Development!" - 3 Hard SQL Server Problems
- 6. Agenda
About my server & workload
Problem #1: RESOURCE_SEMAPHORE
Problem #2: THREADPOOL
Problem #3: LOCK ESCALATION
- 7. SQL2016A
VM with 4 cores, 8GB RAM
Max server memory: 6GB RAM
Stack Overflow database: 100GB
dbo.Users table: <1GB
- 9. Symptoms
RESOURCE_SEMAPHORE waits present
Page life expectancy drops
(sometimes with no long-running queries in progress)
SQLServer: Memory Manager – Memory Grants Pending
counter goes above 0
Queries take an unpredictably long time to start
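One way to confirm these symptoms from a query window rather than Perfmon is sketched below. It assumes you hold VIEW SERVER STATE permission; the column aliases are illustrative only.

```sql
-- Requests still queued for a query memory grant:
-- grant_time stays NULL until SQL Server hands over the memory.
SELECT session_id,
       requested_memory_kb,
       wait_time_ms,
       queue_id
FROM sys.dm_exec_query_memory_grants
WHERE grant_time IS NULL
ORDER BY wait_time_ms DESC;

-- The Memory Grants Pending counter, read from inside SQL Server.
-- object_name differs between default and named instances, hence the LIKE.
SELECT cntr_value AS memory_grants_pending
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Memory Manager%'
  AND counter_name = N'Memory Grants Pending';
```

Any rows from the first query, or a non-zero value from the second, line up with the RESOURCE_SEMAPHORE waits described above.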
- 11. Where to learn more
Find queries with high memory grants:
sp_BlitzCache @SortOrder = 'memory grant'
Bonus points: train developers to watch this in dev, spotting
high grant queries before going to production
Paul White: Sorting, Row Goals, and TOP 100
http://sqlblog.com/blogs/paul_white/archive/2010/08/27/sorting-row-goals-and-the-top-100-problem.aspx
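If sp_BlitzCache is not installed on a development box, a rough stand-in for spotting high-grant queries is to read the plan cache directly. This is a minimal sketch, not what sp_BlitzCache does internally. It assumes a build where sys.dm_exec_query_stats exposes the grant columns (SQL Server 2016 does).

```sql
-- Cached queries ranked by the largest memory grant they have received.
-- A wide gap between max_grant_kb and max_used_grant_kb hints at an
-- over-estimated grant worth investigating before it reaches production.
SELECT TOP (10)
       qs.max_grant_kb,
       qs.max_used_grant_kb,
       qs.execution_count,
       st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.max_grant_kb DESC;
```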
- 13. Symptoms
Apps, users can’t connect to the SQL Server
Monitoring tools show gaps in history with no metrics
SQL Server error log doesn’t show anything unusual
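Because the error log stays quiet, the wait statistics and scheduler DMVs are one place to look for evidence of worker thread exhaustion. The sketch below assumes you can still get a connection, which during an outage may require the Dedicated Admin Connection.

```sql
-- Cumulative THREADPOOL waits since the last restart or wait-stats clear.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'THREADPOOL';

-- Configured worker ceiling versus workers currently allocated,
-- plus tasks queued on schedulers waiting for a worker to pick them up.
SELECT (SELECT max_workers_count FROM sys.dm_os_sys_info) AS max_workers,
       SUM(current_workers_count) AS current_workers,
       SUM(work_queue_count)      AS tasks_waiting_for_worker
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';
```

When current_workers sits at or near max_workers and tasks_waiting_for_worker is above zero, new logins have no thread to run on, which matches the "can't connect" symptom.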
- 14. Today’s fix
Create an index to let the readers go past:
CREATE INDEX IX_Reputation ON dbo.Users
(Reputation) INCLUDE (DisplayName, Location);
Other great options:
• Commit/shorten your transactions, or
• Consider different isolation levels, or
• Ease up on app server quantity & workloads
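The options on this slide all target blocking, so before choosing one it helps to see who is blocking whom. A minimal sketch using standard DMVs, assuming blocking is what is tying up the worker threads:

```sql
-- Requests currently blocked, longest waiters first.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.wait_resource,
       st.text     AS waiting_query
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;

-- Idle sessions that still hold an open transaction:
-- candidates for the "commit/shorten your transactions" option.
SELECT session_id,
       login_name,
       host_name,
       program_name,
       last_request_end_time
FROM sys.dm_exec_sessions
WHERE open_transaction_count > 0
  AND status = N'sleeping';
```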
- 15. Where to learn more
Enable & use the Dedicated Admin Connection:
https://www.mssqltips.com/sqlservertip/1801/enable-sql-server-dedicated-administrator-connection/
Train someone to log in when it hits and save sp_BlitzWho
or sp_WhoIsActive results to a table
Use sp_BlitzIndex to find “aggressive indexes” – indexes
with long lock waits, and tune those tables
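The first two items might look like this in practice. It is a sketch under assumptions: sp_WhoIsActive is already installed, and dbo.WhoIsActiveLog is a hypothetical table name chosen for illustration.

```sql
-- Allow the Dedicated Admin Connection from remote machines
-- (by default it only accepts local connections).
EXEC sys.sp_configure N'remote admin connections', 1;
RECONFIGURE;

-- One-time setup: ask sp_WhoIsActive for a matching CREATE TABLE script
-- and create the logging table if it does not exist yet.
IF OBJECT_ID(N'dbo.WhoIsActiveLog') IS NULL
BEGIN
    DECLARE @schema varchar(max);
    EXEC sp_WhoIsActive @return_schema = 1, @schema = @schema OUTPUT;
    SET @schema = REPLACE(@schema, '<table_name>', 'dbo.WhoIsActiveLog');
    EXEC (@schema);
END;

-- During the incident: save the current activity snapshot to the table.
EXEC sp_WhoIsActive @destination_table = 'dbo.WhoIsActiveLog';
```

With remote DAC enabled, tools such as SSMS or sqlcmd can connect using the `ADMIN:` server-name prefix (or `sqlcmd -A`) even when regular connections are refused.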
- 16. Problem 3 Demo:
LOCK ESCALATION (which isn’t usually typed in all capitals, but we’re on a roll here)
- 17. Symptoms
Production & development have different data sizes
Queries that worked fine suddenly start blocking
Queries with wildly variable workloads:
• Data warehouse nightly loads
• User-driven batch processing
Varying execution plans for writes
(narrow vs wide plans for different parameters)
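To check whether escalation is actually happening, rather than inferring it from blocking alone, the index operational stats DMV keeps per-index counters. A minimal sketch, run in the database you suspect. The counters reset when the index metadata leaves cache, for example after a restart.

```sql
-- Indexes in the current database where SQL Server has attempted
-- or completed lock escalation to the table level.
SELECT OBJECT_NAME(ios.object_id, ios.database_id) AS table_name,
       ios.index_id,
       ios.index_lock_promotion_attempt_count,
       ios.index_lock_promotion_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
WHERE ios.index_lock_promotion_count > 0
ORDER BY ios.index_lock_promotion_count DESC;
```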
- 18. Today’s fix
If writers are blocking readers (or vice versa),
let’s try Read Committed Snapshot Isolation (RCSI):
ALTER DATABASE [StackOverflow] SET
READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT;
Other great options:
• Work in bigger batches (update everything at once), or
• Purposely work in small batches, committing as you go to
avoid triggering escalation, or
• Consider different isolation levels
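The small-batch option can be sketched as a loop that touches a limited number of rows per statement. Escalation is typically attempted once a single statement holds roughly 5,000 locks on one object, so the batch size below stays well under that. The cleanup itself (filling in missing Location values) is a hypothetical workload, not the demo from the session.

```sql
DECLARE @BatchSize    int = 1000;  -- assumption: comfortably below ~5,000 locks
DECLARE @RowsAffected int = 1;

WHILE @RowsAffected > 0
BEGIN
    -- Each UPDATE runs as its own autocommit transaction, so its locks
    -- are released before the next batch starts.
    UPDATE TOP (@BatchSize) dbo.Users
    SET Location = N'Unknown'
    WHERE Location IS NULL;

    SET @RowsAffected = @@ROWCOUNT;
END;
```

The loop ends on its own because every updated row stops matching the WHERE clause. For predicates that do not shrink this way, track a key range instead, as the batch-coding article listed under "Where to learn more" discusses.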
- 19. Where to learn more
Michael J. Swart on batch coding:
http://michaeljswart.com/2014/09/take-care-when-scripting-batches/
SQLCAT team’s Fast Ordered Deletes technique, archived
by John Sansom:
https://www.johnsansom.com/fast-sql-server-delete/
Support KB #323630 on lock escalation:
https://support.microsoft.com/en-us/help/323630/