SharePoint Performance Monitoring with Sean P. McDonough
- 2. Enterprise Server Monitoring & Administration Tool
Monitor servers, users and apps live
Audit user’s logging activities
Monitor server usage and network performance
Track application and licence usage
Create inventory reports
Customize, export and email reports
Collect data from all your servers using one console
With real-time alerting, troubleshoot important issues faster
Citrix XenDesktop and XenApp monitoring
Remote Desktop Services and Gateway monitoring
SharePoint, SQL, and Windows Server monitoring
- 3. Today’s Presenter
Sean McDonough
Bitstream Foundry LLC
About Me
▪ Owner and Lead Bitsmith of Bitstream Foundry
▪ 50% dev, 50% admin = 100% confused
▪ AR/VR enthusiast and wannabe developer
▪ Former polymer chemist
▪ CTO for a non-profit mental health awareness
organization (http://www.schizophreniaoralhistories.com)
▪ Desktop DJ (http://www.bunkertuneage.com)
▪ Husband to a wonderful woman (Tracy)
▪ Father to two beautiful twins (Brendan & Sabrina)
▪ Coffee lover and occasional donut eater …
- 4. What We’ll Be Covering
1. Some Introductory Words
2. Farm Environments
3. Tools and Monitoring Servers
4. Page Performance Monitoring
5. Questions & Answers
6. References
- 7. Yes, I said farm, not stamp
▪ Subtle distinction, but it means we’re likely on-premises …
• No SharePoint Online / Office 365
• Unless you’re on a “farm in the cloud”
▪ Why on-premises?
• Significant surface reduction for monitoring in the cloud
• It’s “someone else’s” problem (i.e., a value-add for consumers)
• Administrative APIs very limited vs. on-premises
• Limited tools (no perfmon, developer dashboard, etc.)
• In short: we can’t get at the counters and logs we need!
Farm Environments
- 8. “An ounce of prevention is
worth a pound of cure.”
When you have the luxury of starting
from scratch, you can get the basics
right
Getting a Solid Start
- 9. Without a properly configured SQL Server environment,
no amount of SharePoint troubleshooting will amount to
anything.
So, some things to bear in mind …
▪ If virtualizing, then minimize abstractions
▪ Choose an appropriate storage sub-system
▪ Don’t skimp on disks!
▪ Put your I/O where you need it
A brief word …
- 10. “The Ultimate SharePoint Performance Guide”
by Vlad Catrinescu and Gokan Ozcifci
Special link:
https://leanpub.com/SharePointPerformanceGuide/c/SysKit
A Smart Investment
- 12. Why do we monitor performance? Reasons typically fall into one
of the following three categories:
▪ We are seeking to understand why our SharePoint
environment is underperforming
• Troubleshooting!
▪ We want to ensure that we have enough headroom to scale
and grow as desired.
• Capacity!
▪ We want to quantify changes we’ve made to our farm in
terms of performance
• Improvements!
Reasons
- 13. We’re looking for the source of a performance
problem. Where should we start?
Performance issues typically originate in at least
one general sub-system:
▪ Memory
▪ Network
▪ Processor (CPU)
▪ Storage (Disk)
Of course, SharePoint problems often muddy
the waters by spanning more than one category
Troubleshooting
- 15. Recommendation: start with monitoring the server(s)
over time to gain an understanding:
▪ First understand “the normal state” of a server
▪ Then observe the server when a problem occurs
Establishing a baseline when your environment is
running normally (and non-stressed) is critical.
▪ Baselines provide a reference point
▪ Without a baseline, all measurements are simply
relative to one another
Tools
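A minimal sketch of what comparing against a baseline could look like, assuming counter samples have already been exported as plain numbers. The function names, the sample values, and the three-standard-deviation cutoff are illustrative assumptions, not something from the presentation.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize readings captured while the server was running normally."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def deviates(baseline, reading, sigmas=3.0):
    """True when a reading is more than `sigmas` standard deviations from the baseline mean."""
    spread = baseline["stdev"]
    if spread == 0:
        return reading != baseline["mean"]
    return abs(reading - baseline["mean"]) / spread > sigmas


# Hypothetical "% Processor Time" samples from a normal, non-stressed period.
normal_cpu = [22.0, 25.0, 19.0, 24.0, 21.0, 23.0, 20.0, 26.0]
baseline = build_baseline(normal_cpu)

print(deviates(baseline, 24.5))  # prints False: within the normal range
print(deviates(baseline, 78.0))  # prints True: far outside the baseline
```

Without the baseline, the 78.0 reading has nothing to be compared against; with it, the reading stands out immediately.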
- 16. Many different tools at our disposal:
▪ Farm Health Analyzer
▪ Event Viewer
▪ ULS Viewer
▪ Fiddler
▪ Developer Dashboard
▪ Wireshark
▪ Diskspd
▪ CrystalDiskMark
Tools
- 20. Today’s focus for performance monitoring is on counters
▪ Specific performance counters that can help direct
further investigation and keep us out of the weeds
How do we view performance counters?
▪ Windows Performance Monitor (perfmon.exe)
▪ Windows Resource Monitor (resmon.exe)
▪ A more specialized tool (like SysKit Monitor) ☺
Performance Counters
- 23. Performance Counter Basics
The operating system exposes counters
▪ Memory, CPU, network, and more
Applications oftentimes expose their own counters
▪ For instance, SharePoint alone exposes over 20
categories and hundreds of counters
Bottom line: unless you know what to watch, you’ll
suffer a cruel and horrible death at the hands of the
Performance Counter Gods.
Performance Counters
- 24. We may need to configure our farm to facilitate better data capture
(covered in the references):
▪ Turn off Event Log Flooding Protection
▪ Reduce the interval on the SharePoint Foundation Usage Data
Import Timer Job
▪ Enable all diagnostic providers
▪ Lower job-diagnostics-performance-counter-###-provider
schedule interval (where ### is “wfe” and “sql”)
▪ Enable stack tracing for content requests
▪ Enable the Developer Dashboard
▪ Enable additional usage data collection.
Configuration
- 25. What should I be watching?
That depends on the role of the server
▪ Web Front-End
▪ Application Server
▪ SQL Server
Server Roles and Counters
- 27. WFEs serve up pages through IIS, so we want low values for all of these counters
▪ ASP.NET: Requests Queued (should be “low”)
▪ ASP.NET: Requests Rejected (should be 0)
▪ ASP.NET: Request Wait Time (should be near 0)
▪ ASP.NET: Worker Process Restarts (should be 0)
WFEs also use their memory for caching to accelerate web requests.
▪ ASP.NET Applications: Cache API Trims (should be near 0)
▪ ASP.NET Applications: Cache API Hit Ratio (should be “high”)
▪ SharePoint Publishing Cache: Total Number of Cache Compactions (should be near 0)
▪ SharePoint Publishing Cache: Publishing Cache Hit Ratio (should be “high”)
▪ SharePoint Publishing Cache: Publishing Cache Flushes / Second (should be 0)
Web Front-Ends
- 28. WFEs use disks for BLOB caching
▪ SharePoint Publishing Cache: BLOB Cache % Full (maintain headroom)
Web Front-Ends
- 30. Unless an application server is experiencing issues specific to its function (which might
require monitoring specialized counters), consider monitoring the following:
▪ Processor: % Processor Time (>75% - 85% is bad)
▪ Memory: Available MBytes (<2 GB is bad)
▪ Memory: Cache Faults/sec (>1 is bad)
▪ Memory: Pages/sec (>10 is bad)
▪ Disk: Avg. Disk Queue Length (depends)
▪ Disk: % Idle Time (<90% is bad)
▪ Disk: % Free Space (<30% is bad)
These are valid for WFEs as well!
Application Servers
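The rules of thumb above can be written down as a small table and checked mechanically. This is only a sketch: it assumes readings arrive as a dictionary keyed by the counter labels used on the slide, and the sample readings are made up. Avg. Disk Queue Length is left out because its limit "depends", and 75 is used as the lower edge of the 75% - 85% processor band.

```python
import operator

# (counter label, "is bad" comparison, limit) from the slide's rules of thumb.
THRESHOLDS = [
    ("Processor: % Processor Time", operator.gt, 75.0),
    ("Memory: Available MBytes", operator.lt, 2048.0),  # 2 GB expressed in MB
    ("Memory: Cache Faults/sec", operator.gt, 1.0),
    ("Memory: Pages/sec", operator.gt, 10.0),
    ("Disk: % Idle Time", operator.lt, 90.0),
    ("Disk: % Free Space", operator.lt, 30.0),
]


def find_problems(readings):
    """Return the counter labels whose readings cross the slide's limits."""
    problems = []
    for label, is_bad, limit in THRESHOLDS:
        value = readings.get(label)
        if value is not None and is_bad(value, limit):
            problems.append(label)
    return problems


# Hypothetical readings for one application server.
sample = {
    "Processor: % Processor Time": 88.0,
    "Memory: Available MBytes": 6144.0,
    "Memory: Cache Faults/sec": 0.4,
    "Memory: Pages/sec": 35.0,
    "Disk: % Idle Time": 96.0,
    "Disk: % Free Space": 22.0,
}

for label in find_problems(sample):
    print("Investigate:", label)
```

A single reading over a limit is a prompt to look closer, not a verdict; sustained values compared against the baseline matter more.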
- 31. Consider watching the following:
▪ SQLServer:Buffer Manager: Buffer Cache Hit Ratio
▪ SQLServer:Databases: Transactions/sec
▪ SQLServer:General Statistics: User Connections
▪ SQLServer:Latches: Average Latch Wait Time (ms)
▪ SQLServer:Latches: Latch Waits/sec
▪ SQLServer:Locks: Average Wait Time (ms)
▪ SQLServer:Locks: Lock Wait Time (ms)
▪ SQLServer:Locks: Number of Deadlocks/sec
▪ SQLServer:Plan Cache: Cache Hit Ratio
▪ SQLServer:SQL Statistics: SQL Compilations/sec
▪ SQLServer:SQL Statistics: SQL Re-Compilations/sec
SQL Servers
- 36. We’ve been looking at server-side performance
monitoring thus far. It represents only half of the
overall equation.
We need to put ourselves in the role of the
end-user to monitor and diagnose a number of
other issues, including page performance issues.
What can we do from the other end of the wire?
Page Performance Monitoring
- 38. The answer is “quite a bit”
Your browser is an amazingly capable
performance tool – if you understand
how to use it.
Requests and their responses are
recorded chronologically – including all
sorts of information such as HTTP
headers, response codes, cookies, and
much more.
Page Performance Monitoring
- 41. X-SharePointHealthScore
▪ A measure of the front-end’s general load or
stress. Values from 0 (no stress) to 10 (max stress).
We want this low.
SPRequestDuration
▪ The amount of time your request spends
processing on the server (in ms). Ideally less than
three seconds (3000ms)
SPIisLatency
▪ The amount of time your request spends waiting
on the server (in ms). Should be near zero.
Page Performance Monitoring
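As a rough sketch, the three headers can be pulled from a captured response and checked against the guidance above. The 3000 ms ceiling comes from the slide; the cutoffs for the health score and IIS latency (`health_limit`, `latency_limit_ms`) are assumptions chosen for illustration, as are the function names and sample values.

```python
def parse_sharepoint_headers(headers):
    """Extract the three diagnostic values from a response-header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "health_score": int(lowered["x-sharepointhealthscore"]),
        "request_ms": float(lowered["sprequestduration"]),
        "iis_latency_ms": float(lowered["spiislatency"]),
    }


def review(metrics, health_limit=5, latency_limit_ms=50.0):
    """Return a note for each value that looks unhealthy.

    health_limit and latency_limit_ms are illustrative assumptions; the
    3000 ms ceiling for request duration comes from the slide.
    """
    notes = []
    if metrics["health_score"] > health_limit:
        notes.append("front-end is under stress")
    if metrics["request_ms"] >= 3000:
        notes.append("server processing is slow")
    if metrics["iis_latency_ms"] > latency_limit_ms:
        notes.append("requests are waiting in IIS")
    return notes


# Hypothetical header values copied by hand from the browser's network tab.
sample = {
    "X-SharePointHealthScore": "0",
    "SPRequestDuration": "51",
    "SPIisLatency": "0",
}
print(review(parse_sharepoint_headers(sample)))  # prints []
```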
- 44. Round Trip Time – (SPRequestDuration + SPIisLatency) = Time lost “Elsewhere”
For example:
▪ Round Trip Time = 76.04ms
▪ SPRequestDuration = 51ms
▪ SPIisLatency = 0
▪ Time Lost Elsewhere = 25.04ms
This is a high-performance SharePoint
farm that is not under load.
▪ May not reflect real world conditions
Page Performance Monitoring
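The arithmetic from the example, as a short sketch. The function name is an assumption; the numbers are the ones shown on the slide.

```python
def time_lost_elsewhere(round_trip_ms, request_duration_ms, iis_latency_ms):
    """Round trip time minus the time accounted for on the server."""
    return round_trip_ms - (request_duration_ms + iis_latency_ms)


lost = time_lost_elsewhere(76.04, 51, 0)
print(f"{lost:.2f} ms")  # prints 25.04 ms
```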
- 46. This will work for …
▪ SharePoint 2013 on-prem
▪ SharePoint 2016 on-prem
SharePoint On-Premises
- 48. I’ve got consistently high SPRequestDuration values
▪ This is oftentimes where we find questionable dev practices
▪ May be related to server (over-)load or other factors
▪ X-SharePointHealthScore can corroborate (or not)
I’m seeing a lot of “time lost elsewhere”
▪ Network congestion or failure
▪ Web proxies inserting themselves between you and SharePoint
▪ DNS resolution issues
▪ Routing problems
The Common Outcomes
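The two outcomes can be sketched as a first-pass triage for a single request. The 3000 ms figure is the ideal ceiling mentioned earlier in the deck; `elsewhere_share` is a hypothetical cutoff for what counts as "a lot" of time lost elsewhere. In practice the pattern should hold across many requests before any conclusion is drawn.

```python
def triage(round_trip_ms, request_duration_ms, iis_latency_ms,
           elsewhere_share=0.5):
    """Suggest where to look first for a slow request.

    elsewhere_share is a hypothetical cutoff: the fraction of the round trip
    that must be unaccounted for before the network is suspected.
    """
    lost_ms = round_trip_ms - (request_duration_ms + iis_latency_ms)
    if request_duration_ms >= 3000:
        return "server: review custom code, then check X-SharePointHealthScore for load"
    if round_trip_ms > 0 and lost_ms / round_trip_ms > elsewhere_share:
        return "network: check congestion, proxies, DNS, and routing"
    return "no obvious culprit"


# Hypothetical measurements, all in milliseconds.
print(triage(5200.0, 4800.0, 10.0))  # high SPRequestDuration
print(triage(2400.0, 300.0, 0.0))    # most of the time lost elsewhere
print(triage(76.04, 51.0, 0.0))      # the healthy example from the deck
```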
- 51. References
1. Storage and SQL Server Capacity Planning and Configuration (SharePoint Server)
https://technet.microsoft.com/en-us/library/cc298801(v=office.16).aspx
2. Best Practices for SQL Server in a SharePoint Server Farm
https://technet.microsoft.com/en-us/library/hh292622(v=office.16).aspx
3. Diskspd Utility: A Robust Storage Testing Tool (superseding SQLIO)
https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223
4. GitHub repository for diskspd
https://github.com/microsoft/diskspd
5. Using Microsoft DiskSpd to Test Your Storage Subsystem
https://sqlperformance.com/2015/08/io-subsystem/diskspd-test-storage
6. CrystalDiskMark 6.0.0
https://crystalmark.info/download/index-e.html
7. The Ultimate SharePoint Performance Guide
https://leanpub.com/SharePointPerformanceGuide/c/SysKit
- 52. References
8. Monitoring and maintaining SharePoint Server 2013
https://technet.microsoft.com/en-us/library/ff758658(v=office.16).aspx
9. Performance Testing for SharePoint Server 2013
https://technet.microsoft.com/en-us/library/ff758659(v=office.16).aspx
10. Capacity management and sizing overview for SharePoint Server 2013
https://technet.microsoft.com/en-us/library/ff758647(v=office.16).aspx
11. SharePoint Performance Monitoring – How and Why?
http://blog.syskit.com/sharepoint-performance-monitoring
12. Performance Counters for ASP.NET
https://msdn.microsoft.com/en-us/library/fxk122b4.aspx
13. Monitor Cache Performance in SharePoint Server 2016
https://technet.microsoft.com/en-us/library/ff934623(v=office.16).aspx
14. ASP.NET Performance Monitoring, and When to Alert Administrators
https://msdn.microsoft.com/en-us/library/ms972959.aspx
- 53. References
15. MOSS Object Cache Memory Tuning is not an Intuitive Process
https://sharepointinterface.com/2009/08/30/moss-object-cache-memory-tuning-is-not-an-intuitive-process/
16. High Avg Disk Queue Length and Finding the Cause
http://www.ithacks.com/2008/09/12/high-avg-disk-queue-length-and-finding-the-cause/
17. SharePoint Performance: Best Practices from the Field
https://www.slideshare.net/jasonhimmelstein/sharepoint-performance
18. ULS Viewer
https://www.microsoft.com/en-us/download/details.aspx?id=44020
19. Fiddler
https://www.telerik.com/download/fiddler
20. Using the Developer Dashboard
https://msdn.microsoft.com/en-us/library/office/ff512745(v=office.16).aspx
21. The Five-Minute Page Performance Troubleshooting Guide for SharePoint Online
https://sharepointinterface.com/2017/07/07/the-five-minute-page-performance-troubleshooting-guide-for-sharepoint-online/
- 54. References
22. Akamai Reveals 2 Seconds As The New Threshold Of Acceptability For Ecommerce Web Page Response Times
https://www.akamai.com/us/en/about/news/press/2009-press/akamai-reveals-2-seconds-as-the-new-threshold-of-acceptability-for-ecommerce-web-page-response-times.jsp
23. How Loading Time Affects Your Bottom Line
https://blog.kissmetrics.com/loading-time/
- 55. Sean P. McDonough
SharePoint and Office 365 Gearhead,
Tinkerer, Microsoft MVP
My Company: Bitstream Foundry LLC
Twitter: @spmcdonough
Blog: http://SharePointInterface.com
About: http://about.me/spmcdonough