SQL 2005 Disk I/O Performance By Bryan Oliver SQL Server Domain Expert
Agenda Disk I/O Performance Call to Action Performance Analysis Demo Q & A
Some Questions To Think About Two queries: the first runs once a week and takes 10 minutes to return its result set; the second runs 10 thousand times a week and takes 1 second to return its result set. Which of these two queries has the greater potential to affect Disk I/O? Two computers: the first uses RAID 5 for its data drive; the second uses RAID 10 for its data drive. Which of these two computers will return data faster, all else being equal?
The Basics of I/O A single fixed disk is inadequate except for the simplest needs Database applications require a Redundant Array of Inexpensive Disks (RAID) for: Fault tolerance Availability Speed Different levels offer different pros/cons
RAID Level 5 Pros Highest Read data transaction rate; Medium Write data transaction rate Low ratio of parity disks to data disks means high efficiency Good aggregate transfer rate Cons Disk failure has a medium impact on throughput; Most complex controller design Difficult to rebuild in the event of a disk failure (compared to RAID 1) Individual block data transfer rate same as single disk
RAID Level 1 Pros One Write or two Reads possible per mirrored pair 100% redundancy of data RAID 1 can (possibly) sustain multiple simultaneous drive failures Simplest RAID storage subsystem design Cons High disk overhead  (100%) Cost
RAID Level 10 (a.k.a. 1 + 0) Pros RAID 10 is implemented as a striped array whose segments are RAID 1 arrays RAID 10 has the same fault tolerance as RAID level 1 RAID 10 has the same overhead for fault tolerance as mirroring alone High I/O rates are achieved by striping RAID 1 segments RAID 10 array can (possibly) sustain multiple simultaneous drive failures Excellent solution for sites that would otherwise go with RAID 1 but need an additional performance boost
SAN (Storage Area Network) Pros Supports multiple systems  Newest technology matches RAID1 / RAID1+0 performance Cons Expense and setup Must measure for bandwidth requirements of systems, internal RAID, and I/O requirements
Overview by Analogy
Monitoring Disk Performance Physical Disk Logical Disk
Monitoring Raw Disk Physical Performance Counters: Avg. Disk sec/Read and Avg. Disk sec/Write Transaction Log Access Avg. Disk sec/Write should be <= 1 ms (with array accelerator enabled) Database Access Avg. Disk sec/Read should be <= 15-20 ms Avg. Disk sec/Write should be <= 1 ms (with array accelerator enabled) Remember checkpointing in your calculations!
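The same per-file latency picture can also be sampled from inside SQL Server with fn_virtualfilestats (the function the trending slide later in this deck relies on). This is a minimal sketch, not part of the original deck: IoStallMS, NumberReads, and NumberWrites are cumulative since instance startup, so sample twice and difference the values for a true interval figure.

    SELECT DbId, FileId, NumberReads, NumberWrites,
           CASE WHEN NumberReads + NumberWrites = 0 THEN 0
                ELSE 1.0 * IoStallMS / (NumberReads + NumberWrites)
           END AS AvgStallMsPerIO  -- rough average ms of I/O wait per request
    FROM sys.fn_virtualfilestats(DB_ID(), -1);  -- -1 = all files in the current database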
Monitoring Raw I/O Physical Performance Counters: Disk Transfers/sec, Disk Reads/sec, and Disk Writes/sec. Calculate the number of transfers/sec for a single drive: first divide the number of I/O operations/sec by the number of disk drives, then factor in the appropriate RAID overhead. You shouldn't see more I/O requests (disk transfers)/sec per disk drive than:

    8KB I/O Requests     10K RPM 9-72 GB    15K RPM 9-18 GB
    Sequential Write          ~166               ~250
    Random Read/Write          ~90               ~110
Estimating Average I/O Collect long-term averages of the I/O counters (Disk Transfers/sec, Disk Reads/sec, and Disk Writes/sec) Use the following equations to calculate I/Os per second per disk drive: I/Os per sec. per drive w/RAID 1 = (Disk Reads/sec + 2*Disk Writes/sec)/(number of drives in volume) I/Os per sec. per drive w/RAID 5 = (Disk Reads/sec + 4*Disk Writes/sec)/(number of drives in volume) Repeat for each logical volume. (Remember checkpoints!) If your per-drive values exceed the ceilings on the previous slide, increase throughput by: Adding drives to the volume Getting faster drives
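To make the arithmetic concrete, here are the slide's formulas worked in T-SQL. The counter averages and drive count are invented for illustration only:

    DECLARE @reads decimal(10,2), @writes decimal(10,2), @drives int;
    SET @reads  = 400.0;   -- long-term average Disk Reads/sec  (hypothetical)
    SET @writes = 100.0;   -- long-term average Disk Writes/sec (hypothetical)
    SET @drives = 6;       -- disk drives in the volume

    SELECT (@reads + 2 * @writes) / @drives AS IOsPerDrive_RAID1,  -- = 100
           (@reads + 4 * @writes) / @drives AS IOsPerDrive_RAID5;  -- = 133.33

At roughly 133 I/Os per drive, the RAID 5 figure exceeds the ~110 random read/write ceiling for a 15K drive from the previous slide, so this volume would need more spindles or faster drives.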
Queue Lengths Counters - Avg. Disk Queue Length and Current Disk Queue Length Avg Disk Queue should be <= 2 per disk drive in the volume Calculate by dividing the queue length by the number of drives in the volume Example: In a 12-drive array, max queued disk requests = 22 and average queued disk requests = 8.25 Do the math for max: 22 (max queued requests) divided by 12 (disks in array) = 1.83 queued requests per disk during peak. We're ok since we're <= 2. Do the math for avg: 8.25 (avg queued requests) divided by 12 (disks in array) = 0.69 queued requests per disk on average. Again, we're ok since we're <= 2.
Disk Time Counters - % Disk Time (%DT), % Disk Read Time (%DRT), and % Disk Write Time (%DWT) Use %DT with % Processor Time to determine time spent executing I/O requests and processing non-idle threads. Use %DRT and %DWT to understand the types of I/O performed. The goal is to have most time spent processing non-idle threads (i.e. %DT and % Processor Time >= 90). If %DT and % Processor Time are drastically different, there's usually a bottleneck.
Database I/O Counters – Page Reads/sec, Page Requests/sec, Page Writes/sec, and Readahead Pages/sec Page Reads/sec If consistently high, it may indicate low memory allocation or an insufficient disk drive subsystem. Improve by optimizing queries, using indexes, and/or redesigning the database Related to, but not the same as, the Reads/sec reported by the Logical Disk or Physical Disk objects Page Writes/sec: The ratio of Page Reads/sec to Page Writes/sec is typically 5:1 or higher in OLTP environments. Readahead Pages/sec Included in the Page Reads/sec value Read-ahead performs full-extent reads of eight 8KB pages (64KB per read)
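These Buffer Manager counters can also be read from inside SQL Server 2005 through the sys.dm_os_performance_counters DMV. A sketch follows; note these counter values are cumulative ticks, so take two snapshots and divide the difference by the elapsed seconds to get true per-second rates:

    SELECT counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE object_name LIKE '%Buffer Manager%'  -- instance-name prefix varies
      AND counter_name IN ('Page reads/sec', 'Page writes/sec',
                           'Readahead pages/sec');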
Tuning I/O When bottlenecked on too much I/O, software solutions include: Tuning queries (reads) or transactions (writes) Tuning or adding indexes Tuning fill factor Placing tables and/or indexes in separate filegroups on separate drives (see the sketch below) Partitioning tables Hardware solutions include: Adding spindles (reads) or controllers (writes) Adding or upgrading drive speed Adding or upgrading controller cache (however, beware write cache without battery backup) Adding memory or moving to 64-bit memory
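As one example of the filegroup technique above, this sketch rebuilds a nonclustered index onto its own filegroup on a separate drive, combining it with a tuned fill factor. All database, file, path, and index names here are hypothetical:

    ALTER DATABASE Sales ADD FILEGROUP IndexFG;
    ALTER DATABASE Sales ADD FILE
        (NAME = SalesIx1, FILENAME = 'E:\SQLIndexes\SalesIx1.ndf',
         SIZE = 2GB, FILEGROWTH = 512MB)
    TO FILEGROUP IndexFG;

    -- Rebuild an existing index onto the new filegroup
    -- (DROP_EXISTING = ON assumes the index already exists)
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        WITH (FILLFACTOR = 90, DROP_EXISTING = ON)
        ON IndexFG;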
Trending and Forecasting Trending and forecasting is hard work! Create a tracking table to store: Number of records in each table Number of data pages and index pages, or space consumed Track I/O using fn_virtualfilestats (a capture sketch follows this slide) Run a daily job to capture the data Perform analysis: Export tracking data to Excel Forecast and graph off the data in the worksheet Then repeat the capture-and-analysis cycle
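A minimal version of the capture step might look like the following. The table layout and names are assumptions, while fn_virtualfilestats and its NumberReads/NumberWrites/BytesRead/BytesWritten columns are standard in SQL 2005; note the function reports per database file rather than per table. Schedule the INSERT as a daily SQL Agent job:

    CREATE TABLE dbo.IoTrend (
        CaptureDate  datetime NOT NULL DEFAULT GETDATE(),
        DbId         smallint NOT NULL,
        FileId       smallint NOT NULL,
        NumberReads  bigint   NOT NULL,  -- cumulative since instance start
        NumberWrites bigint   NOT NULL,
        BytesRead    bigint   NOT NULL,
        BytesWritten bigint   NOT NULL
    );

    INSERT INTO dbo.IoTrend (DbId, FileId, NumberReads, NumberWrites,
                             BytesRead, BytesWritten)
    SELECT DbId, FileId, NumberReads, NumberWrites, BytesRead, BytesWritten
    FROM sys.fn_virtualfilestats(DB_ID(), -1);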
Disk Rules of Thumb for Better Performance Put SQL Server data devices on a non-boot disk Put logs and data on separate volumes and, if possible, on independent SCSI channels Pre-size your data and log files; don't rely on AUTOGROW (see the sketch below) RAID 1 and RAID 1+0 are much better than RAID 5 Tune TEMPDB separately Create 1 data file (per filegroup) per physical CPU on the server Make all data files the same size within a database Add spindles for read speed, controllers for write speed Partitioning … for the highly stressed database Monitor, tune, repeat…
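For the pre-sizing rule, here is a sketch of a database created with equal, pre-sized data files and the log on a separate volume. Paths, names, and sizes are hypothetical; growth is set in large fixed chunks so AUTOGROW is a safety net rather than the sizing strategy:

    CREATE DATABASE Sales
    ON PRIMARY
        (NAME = Sales1, FILENAME = 'E:\SQLData\Sales1.mdf',
         SIZE = 4GB, FILEGROWTH = 1GB),
        (NAME = Sales2, FILENAME = 'E:\SQLData\Sales2.ndf',
         SIZE = 4GB, FILEGROWTH = 1GB)   -- same size as Sales1, per the rule
    LOG ON
        (NAME = Sales_log, FILENAME = 'F:\SQLLogs\Sales_log.ldf',
         SIZE = 2GB, FILEGROWTH = 512MB);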
Resources See Kevin Kline's webcast and read his SQL Server Magazine article 'Bare Metal Tuning' to learn about file placement, RAID comparisons, etc. Check out  www.baarf.com  and  www.SQL-Server-Performance.com   Storage Top 10 Best Practices at  http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/storage-top-10.mspx
Call to Action – Next Steps Attend a live demo:  http://www.quest.com/landing/qc_demos.asp Download white papers:  http://www.quest.com/whitepapers   Get a trial version:  http://www.quest.com/solutions/download.asp Email us with your questions:  [email_address]  or go to  www.quest.com
Q & A Send questions to me at:  [email_address]   Send broader technical questions to:  [email_address] For sales questions, go to:  www.quest.com   THANK YOU!

Editor's Notes

  1. Each entire data block is written on a data disk; parity for blocks in the same rank is generated on Writes, recorded in a distributed location and checked on Reads. RAID Level 5 requires a minimum of 3 drives to implement
  2. For Highest performance, the controller must be able to perform two concurrent separate Reads per mirrored pair or two duplicate Writes per mirrored pair. RAID Level 1 requires a minimum of 2 drives to implement Other Pros: Twice the Read transaction rate of single disks, same Write transaction rate as single disks; Transfer rate per block is equal to that of a single disk 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk
  3. RAID Level 10 requires a minimum of 4 drives to implement. Cons Very expensive / High overhead All drives must move in parallel to properly track, lowering sustained performance Very limited scalability at a very high inherent cost Note that RAID 0 + 1 is this diagram turned on its side. RAID 0 + 1 is excellent for logs.
  4. Note that on Windows 2000 Server you must run diskperf -y to enable the physical disk counters. Otherwise, PerfMon will return values for the physical disk counters, but they'll actually be logical values. This setting is enabled by default on Windows Server 2003. It incurs a small overhead of perhaps 1-2%.
  5. The Avg. Disk sec/Read and Avg. Disk sec/Write counters monitor the average number of seconds for read or write operations from or to a disk, respectively. If your values significantly exceed those listed above for database access, you may want to increase the speed of your disk subsystem by either using faster drives or adding more drives to the system. Checkpointing note: The above values will temporarily increase during periods of heavy I/O activity, such as during the checkpoint. When monitoring these values, take an average over a longer period of time, and/or monitor periods that do not contain a checkpoint.
  6. Monitors the rate of I/O operations handled by the disk subsystem. Remember that with several drives allocated to a logical disk volume, the counters monitor the total number of disk transfers for the entire volume. Note: With the array accelerator enabled, you may actually see substantially higher I/O per second per drive rates than those suggested in the table above. This is due to the array controller caching some of these I/Os.
  7. This slide details how to estimate the average number of I/O requests per second for each disk drive. With RAID 1, each write is duplicated onto a mirrored drive, hence two disk writes per logical write. With RAID 5, each write generates four I/O operations: reading the data block, reading the parity block, writing the data block, and writing the parity block. Repeat these steps for each logical volume. If the values significantly exceed those suggested above, increase the speed of your disk subsystem by adding more or using faster drives. As the equations illustrate, RAID 0 has the lowest impact on performance but offers no data protection. RAID 5, on the other hand, slows performance but offers low-cost data protection. The Disk Reads/sec and Disk Writes/sec counters can be used to determine an application's read-to-write ratio. They can also be used to profile disk I/O at a lower level. The sum of these two counters should equal the Disk Transfers/sec value. Note: The above values will temporarily increase during periods of heavy I/O activity, such as during a checkpoint. When monitoring these values, take an average over a longer period of time, and/or monitor periods that do not contain a checkpoint.
  8. The Avg. Disk Queue Length and Current Disk Queue Length counters monitor the average number and the instantaneous number, respectively, of reads and writes queued for the selected disk. Disk devices composed of multiple spindles, such as logical volumes configured on Smart Array controllers, will have several active requests at any point in time and several requests waiting for different disk drives. You therefore need to factor in the number of disks in the logical volume that are servicing the I/O requests. For example, a twelve-drive array facing a maximum of 22 queued disk requests and an average of 8.25 queued disk requests has 22/12 = 1.83 queued requests per drive at the peak, and 8.25/12 = 0.69 queued requests per drive on average. You should not average more than two queued disk requests per disk drive. The Avg. Disk Read Queue Length and Avg. Disk Write Queue Length counters provide more insight into which type of I/O request is being queued the most. Remember that these values will temporarily increase under spikes of heavy I/O, such as during a checkpoint.
  9. The % Disk Time, % Disk Read Time, and % Disk Write Time counters monitor the percentage of time spent servicing particular I/O requests during the sampling interval. Use the % Disk Time counter in conjunction with the % Processor Time counter to determine the time the system spends executing I/O requests or processing non-idle threads. Use the % Disk Read Time and % Disk Write Time counters to gain further insight into the type of I/O being performed. Your goal is to have a high percentage of time spent executing non-idle threads (high % Processor Time) AND executing I/O (high % Disk Time). On a highly optimized system, these counters consistently measure over 90 percent. If one of these counters reads substantially lower than the other, this usually indicates a bottleneck, and further investigation is necessary. With high % Disk Time, use the % Disk Read Time and % Disk Write Time counters to get the I/O breakdown. With high % Processor Time, use % User Time and % Privileged Time to get a further CPU utilization breakdown.
  10. Page Reads/sec: Monitors the number of pages read from disk per second. Depending on your environment, this counter may be high before your system reaches a steady state, and then gradually decrease. If your database fits entirely into memory, the counter should be zero. If the counter is consistently high, it may indicate low memory allocation or an insufficient disk drive subsystem. You may be able to reduce the number of reads per second by optimizing your queries, using indexes, and/or redesigning your database. Page Reads/sec is related to, but not the same as, the Reads/sec reported by the Logical Disk or Physical Disk objects: multiple pages can be read with a single logical or physical disk read. The number of physical reads and page reads should be roughly the same in OLTP environments.
  Page Requests/sec: A page request occurs when SQL Server looks in the buffer pool for a database page. If the page is in the buffer pool, it can be processed immediately; if not, a page read is issued.
  Page Writes/sec: Eventually all modified database pages have to be written back to disk. The Page Writes/sec counter reports the rate at which this occurs. The ratio of Page Reads/sec to Page Writes/sec typically ranges from 2:1 to 5:1 in OLTP environments. Most business intelligence applications perform few updates and, as a result, few page writes. Excessive page writes can be caused by insufficient memory or frequent checkpoints.
  Readahead Pages/sec: The SQL Server storage architecture supports optimizations that allow SQL Server to determine in advance which database pages will be requested (read-ahead). A full scan, in which every page of an index or table is read, is the simplest case. Read-ahead occurs when SQL Server issues the read request before the thread that is processing the query or transaction needs the page. Readahead Pages/sec is included in the Page Reads/sec counter; the number of read requests issued due to cache misses (the requested page was not found in the data cache) can be calculated by subtracting Readahead Pages/sec from Page Reads/sec. SQL Server typically reads entire extents when performing read-aheads: all eight pages of an extent are read with a single 64KB read. Read-aheads will therefore cause the Avg. Disk Bytes/Read reported by the Logical Disk or Physical Disk object to be larger than 8KB. Note that read-aheads are performed in sequential order, allowing much higher throughput than random accesses.
  11. Do not put SQL Server data devices on the boot disk. Put logs on a RAID 1 on an independent SCSI channel. Put data on a RAID 5 on an independent SCSI channel. If read disk queuing is high on your data device (avg 2 or higher per spindle), put non-clustered indexes in a new filegroup on a RAID 5 on an independent SCSI channel. If tempdb is stressed (consistent blocking in dbid 2 is one common indicator) and you cannot redesign to relieve the stress, put tempdb on a RAID 1+0 on an independent SCSI channel and the tempdb log on yet another RAID 1+0 on an independent channel. If your database holds highly sensitive data, consider RAID 1+0 over RAID 5. Avoid unnecessary complexity (KISS). With thanks to Bill Wunder and his article on the SIGs at the PASS website (http://sigs.sqlpass.org/Resources/Articles/tabid/35/ctl/ArticleView/mid/349/articleId/58/Default.aspx).
  12. Bare Metal Tuning facts, for example: The commodity platform was RAID 5. Extra spindles added 1.4% each over baseline. RAID 1 & RAID 1+0 each provided about a 319% boost over baseline, and extra spindles then added 5% each over baseline.