14

I'm working on creating a chart for my client and they want to get the total customer count over a 24 hour, 3 day, 1 week, 1 month, etc period. I'm honestly not the best with SQL, so generating these queries aren't my forte.

In regards to getting the customers over 24 hours, I've come across two "where" statements that may work, but I'm not sure which is the best.

First version:

WHERE DATEDIFF(hh,CreatedDate,GETDATE())>24

Second Version:

WHERE CreatedDate >= DATEADD(HH, -24, GETDATE())

The first version generates 21 rows and the second generates 17 rows (from the same dataset, of course) so obviously one is more accurate than the other. I'm leaning towards the first, but I would like your opinion... please.

Thanks, Andrew

2
  • I think that WHERE DATEDIFF(hh,CreatedDate,GETDATE())>24 needs to be WHERE DATEDIFF(hh,CreatedDate,GETDATE())<24 Commented Jul 20, 2010 at 19:22
  • 1
    ...or more accurately, WHERE DATEDIFF(hh,CreatedDate,GETDATE())<=24 to match the logic of the second :) Commented Jul 20, 2010 at 19:28

3 Answers 3

11

Avoid the first version. First, because it disables index utilization. The second (functional) issue with the first version is, DATEDIFF(HOUR...) returns all values less than 25 hours. Try this for clarity:

SELECT DATEDIFF(HOUR, '2010-07-19 00:00:00', '2010-07-20 00:59:59.99')
11

The first version is not accurate.

WHERE DateDiff(hh, CreatedDate, GETDATE()) > 24

This will return values somewhere between from 23.0001 hours ago to 24.9999 hours ago because you are counting "boundaries crossed", not an actual 24-hour period. Consider that from 1:59:59 to 2:00:00 is only one second, but DateDiff by hours will return 1 hour. Similarly, 1:00:00 to 2:59:59 is almost 2 hours, but DateDiff by hours returns the same 1 hour.

The second version is correct.

WHERE CreatedDate >= DateAdd(hh, -24, GETDATE())

Subtracting 24 hours from the current date will yield a time exactly 24.0 hours ago, to the millisecond. This will thus return 24 hours of data.

Also, the first version would be bad even if it was what you wanted because the engine would have to perform date math on every single row in the entire table, making any potential index useless and consuming a bunch of needless CPU. Instead, do the math on the opposite side of the expression from the column name. To duplicate the logic of your first expression without the performance penalty would look like this:

WHERE CreateDate >= DateAdd(hh, DateDiff(hh, 0, GETDATE()) - 24, 0)

Example:

  • GetDate() = '20100720 17:52'
  • DateDiff(hh, 0, '20100720 17:52') = 969065
  • DateAdd(hh, 969065 - 24, 0) = '20100719 17:00'

and to prove this is the same as your first expression:

  • DateDiff(hh, '20100719 17:00', '20100720 17:52') = 24
0

Maybe these for each of your where clauses?

WHERE CreatedDate < GETDATE() AND CreatedDate >= DATEADD(hh, -24, GETDATE)

WHERE CreatedDate < GETDATE() AND CreatedDate >= DATEADD(day, -3, GETDATE)

WHERE CreatedDate < GETDATE() AND CreatedDate >= DATEADD(wk, -1, GETDATE)

WHERE CreatedDate < GETDATE() AND CreatedDate >= DATEADD(mm, -1, GETDATE)

And as flo mentioned about indexes, just make sure you're indexing your CreatedDate column.

Not the answer you're looking for? Browse other questions tagged or ask your own question.