10

I recently started working with SQL Server 2008 as a DBA trainee. I need calculate the size of the database but also estimate its growth over recent months and the predicted growth for the next 12 months.

I can use the sp_spaceused statement to calculate the actual size but how do I calculate everything else?

6 Answers 6

21

The other answers are technically correct, but not real-world correct. Here's what you need to ask the business:

What time horizon am I aiming for? In your case, you're looking for a 12-month number.

During that time, will we be archiving data, or keeping all data? In some businesses, you're allowed to (or required to) only keep a certain amount of data, like the last 12 months. In that case, you'll need to figure out the data growth (which the subsequent questions will answer) but then back down to the last rolling 12 months. You can't just say, "Right now that amount of data is 100GB," because if your data volume is growing, then the last 12 months is growing too. The time amount might be constant, but the data is not.

Will we be adding additional users? For example, the business might be growing into new territories or acquiring new customers. If they double the user base, then in some cases, the data will start doubling as well.

Do we expect the business volume to grow? If you're tracking sales on a web site, for example, and you start running Super Bowl or World Cup ads, your data volume can hit the hockey stick growth curve.

Will we be adding additional functionality in the app? If the app suddenly starts storing images, this will dramatically affect database size.

Will we be adding data from another source, or logging new data? If you start capturing web site clicks, or in a data warehouse, adding additional sources, then data volume will grow.

Will developers or DBAs be performance tuning indexes? If you're going to let people create indexes, you can easily double (or triple, or quadruple) the size of your data depending on how overzealous they get.

And as long as you're asking these questions, you should also ask if performance is expected to stay the same, degrade, or get better. I like mapping out the projected growth on a line chart, and then comparing hardware and staff training investments over that same timeline.

2
  • 7
    so, IT DEPENDS™ !?!?
    – Hannah Vernon
    Commented Jul 25, 2014 at 21:01
  • 3
    It depends on what people are putting into the database, yes.
    – Brent Ozar
    Commented Jul 26, 2014 at 20:38
14

You can't accurately project future growth without a history of previous growth. You can however cheat and get a rough trend using backup history, as detailed by Erin Stellato in Trending Database Growth From Backups.

Plot the output of the following query in Excel:

SELECT
    [Database] = [database_name]
    , [Month] = DATEPART(month,[backup_start_date])
    , [Backup Size MB] = AVG([backup_size]/1024/1024)
    , [Compressed Backup Size MB] = AVG([compressed_backup_size]/1024/1024)
    , [Compression Ratio] = AVG([backup_size]/[compressed_backup_size])
FROM 
    msdb.dbo.backupset
WHERE 
    [database_name] = N'YourDatabaseName'
AND [type] = 'D'
GROUP BY 
    [database_name]
    , DATEPART(mm, [backup_start_date]);
2
  • I use this one constantly, then modify it to go by year if happen to have that much history on a client server. Love looking at this kind of data for a server.
    – user507
    Commented Jul 24, 2014 at 22:30
  • I also like combining it with @BrentOzar [backup scripts from here])brentozar.com/archive/2012/03/…).
    – user507
    Commented Jul 24, 2014 at 22:34
1

There are many ways of how you can do database capacity planning.

msdb backup history if gets regular trimmed, you wont be having much data left for analysis

As Mark pointed out, it can be done using the method described by Erin - trending database growth from backup.

You can even use PIVOT to find out the database growth over a period of 12 months from the backup history as below :

DECLARE @startDate DATETIME;

SET @startDate = GetDate();

SELECT PVT.DatabaseName
    ,PVT.[0]
    ,PVT.[-1]
    ,PVT.[-2]
    ,PVT.[-3]
    ,PVT.[-4]
    ,PVT.[-5]
    ,PVT.[-6]
    ,PVT.[-7]
    ,PVT.[-8]
    ,PVT.[-9]
    ,PVT.[-10]
    ,PVT.[-11]
    ,PVT.[-12]
FROM (
    SELECT BS.database_name AS DatabaseName
        ,DATEDIFF(mm, @startDate, BS.backup_start_date) AS MonthsAgo
        ,CONVERT(NUMERIC(10, 1), AVG(BF.file_size / 1048576.0)) AS AvgSizeMB
    FROM msdb.dbo.backupset AS BS
    INNER JOIN msdb.dbo.backupfile AS BF ON BS.backup_set_id = BF.backup_set_id
    WHERE BS.database_name NOT IN (
            'master'
            ,'msdb'
            ,'model'
            ,'tempdb'
            )
        AND BS.database_name IN (
            SELECT db_name(database_id)
            FROM master.SYS.DATABASES
            WHERE state_desc = 'ONLINE'
            )
        AND BF.[file_type] = 'D'
        AND BS.backup_start_date BETWEEN DATEADD(yy, - 1, @startDate)
            AND @startDate
    GROUP BY BS.database_name
        ,DATEDIFF(mm, @startDate, BS.backup_start_date)
    ) AS BCKSTAT
PIVOT(SUM(BCKSTAT.AvgSizeMB) FOR BCKSTAT.MonthsAgo IN (
            [0]
            ,[-1]
            ,[-2]
            ,[-3]
            ,[-4]
            ,[-5]
            ,[-6]
            ,[-7]
            ,[-8]
            ,[-9]
            ,[-10]
            ,[-11]
            ,[-12]
            )) AS PVT
ORDER BY PVT.DatabaseName;

There is another way that you will find really useful as described excellently by Chad Miller on SSC - Database Space Capacity Planning. He also focuses on days remaining which is very useful.

1
  • I am using above query and it is giving me result like SSISDB 11059.5 10233.6 9322.9 8338.8 7675.6 7075.1 6383.7 5592.6 4862.1 (for 0,-1,-2,-3... etc) What does this value mean? Does it mean that my row size in MB is 11059 and it will be increase by 10233 mb next month? I am confuse with the output.. can you please assist me Commented May 27, 2015 at 12:28
1

Hope this code Helps:

Works based on backup size history (in MB), gives month by month min MB, avg MB, max MB and difference from other month in MB.

Lists all databases with backups except for system databases.

-- T-SQL script - Analyses database growth using backup information (Last (12) months in that case
-- Looks only to FULL backups information
-- Parameters: Date GetDate() and nr of months to analyse

SET NOCOUNT ON
DECLARE @endDate datetime, @months smallint; 
SET @endDate = GetDate();  -- Data atual
SET @months = 12;          -- Nr. de meses a analisar

;WITH HIST AS 
   (SELECT BS.database_name AS DatabaseName 
          ,YEAR(BS.backup_start_date) * 100 
           + MONTH(BS.backup_start_date) AS YearMonth 
          ,CONVERT(numeric(10, 1), MIN(BS.backup_size / 1048576.0)) AS MinSizeMB 
          ,CONVERT(numeric(10, 1), MAX(BS.backup_size / 1048576.0)) AS MaxSizeMB 
          ,CONVERT(numeric(10, 1), AVG(BS.backup_size / 1048576.0)) AS AvgSizeMB 
    FROM msdb.dbo.backupset as BS 
    WHERE NOT BS.database_name IN 
              ('master', 'msdb', 'model', 'tempdb') 
          AND BS.type = 'D' 
          AND BS.backup_start_date BETWEEN DATEADD(mm, - @months, @endDate) AND     @endDate 
    GROUP BY BS.database_name 
            ,YEAR(BS.backup_start_date) 
            ,MONTH(BS.backup_start_date)) 
SELECT @@SERVERNAME
      ,MAIN.DatabaseName 
      ,MAIN.YearMonth 
      ,MAIN.MinSizeMB 
      ,MAIN.MaxSizeMB 
      ,MAIN.AvgSizeMB 
      ,MAIN.AvgSizeMB  
       - (SELECT TOP 1 SUB.AvgSizeMB 
          FROM HIST AS SUB 
          WHERE SUB.DatabaseName = MAIN.DatabaseName 
                AND SUB.YearMonth < MAIN.YearMonth 
          ORDER BY SUB.YearMonth DESC) AS GrowthMB 
FROM HIST AS MAIN 
ORDER BY MAIN.DatabaseName 
        ,MAIN.YearMonth
1
  • Thank you (after 10 years). The GrowthMB column is very useful. Commented Jun 30 at 22:10
1

There is other method involving mathematical calculations and this would give accurate results. As already pointed backups would be best to refer to data growth since you said you need to calculate and predict size of database below Microsoft links would help you

Estimate Size of Database

Estimate Size of Clustered Index

Estimate size of heap

Estimate size of table

0

I think Brent Ozar's post is spot on. I've been in a massively bloating DB project and had exactly the same issue you do here, and it just isn't that simple.

Since it's better to at least do something - even if not really that accurate -, I'd set up the required tables and a job (or whichever other method you want, anything to just query the sizes and store it somewhere reliably) to track the rows and space used for DB and all its tables on a weekly basis and use that to project the most likely growth curve. Using the backup history is also a great idea. But regardless of the method, you need time for getting even remotely reliable data.

Other than that, it really depends on your situation. It may be the use% of your DB is now only a fraction of what it will be in the next 6 months, for example when your software gains more ground, making it impossible to predict the explosive growth that's coming. It may be there are yearly massive data transfers that will double the size of the DB, but you will only find out about that mass after the fact.

But as said, if growth is a concern, then you absolutely should do something to keep track of it. Last thing you want is to find yourself 6 months from now with a DB twice as massive as its original lifetime projection, having to explain to your customer how or why that happened, not to even mention having to start guessing how much more it will grow in the next 6 months. There are also some very obvious benefits of knowing where the new data has gone and what's the relative growth of each table in a given amount of time, as it can provide valuable information on different trends, potential software issues etc. all for relatively small effort.

Not the answer you're looking for? Browse other questions tagged or ask your own question.