The problem I have solved is as follows:
Consider a table of dogs,
DOGGIES
, that records on each row theID
of a doggy, one skill that they have, when they started having it, when they stopped having it (assume a useful placeholder if they never stop), and an extra bool column calledIS_NO_LONGER_GOOD_BOY
. This bool column is magically a property of the skill and not of the dog or the history. Find every doggy who has a skill with a0
value forIS_NO_LONGER_GOOD_BOY
without, at some point in time, having that same skill with a1
value forIS_NO_LONGER_GOOD_BOY
. For each doggy found, return theIS_NO_LONGER_GOOD_BOY = 0
row(s) that caused them to appear.Doggies can have multiple skills at any given time, but doggies with no skills aren't in the data. There is no need to worry about NULLs, malformed data, or other such traps.
This was my solution. It works, but the code comments explain my dislike of it.
SELECT [ID], [SKILL_NAME], [START_DATE], [STOP_DATE], [IS_NO_LONGER_GOOD_BOY]
FROM [DOGGIES] AS [DOGS_OUT] --Not sure if the alias is needed, but it adds clarity?
WHERE EXISTS
(
SELECT 1
FROM
--Really don't like this bit.
--This SELECT 1 part only exists so I can write a WHERE after the indented query's HAVING.
--You'd really think there'd be a way to put this inside the indented query.
--But the innermost query needs the HAVING part, so I think it's inescapable?
(
SELECT [ID], [SKILL_NAME]
FROM [DOGGIES]
GROUP BY [ID], [SKILL_NAME]
HAVING
COUNT(CASE WHEN [IS_NO_LONGER_GOOD_BOY] = 0 THEN 1 END) > 0 --SQL has no COUNTIF
AND
COUNT(CASE WHEN [IS_NO_LONGER_GOOD_BOY] = 1 THEN 1 END) = 0
) AS [DOG_SKILL_PAIRS]
WHERE
[DOG_SKILL_PAIRS].[ID] = [DOGS_OUT].[ID]
AND
[DOG_SKILL_PAIRS].[SKILL_NAME] = [DOGS_OUT].[SKILL_NAME]
)
My question is simply if what I've done can be improved upon. Some suspicions of mine are as following:
- I get the feeling that all of this code should only need at most one level of nesting, rather than having a correlated subquery inside the outermost query.
- Working from the innermost layer outwards, I don't like how I've gone from a
HAVING
clause without aWHERE
clause to aWHERE
clause that filters the result of theHAVING
clause. You typically go fromWHERE
toHAVING
, not the other way around. - Can window functions save the day?
COUNT(CASE WHEN [IS_NO_LONGER_GOOD_BOY] = 0 THEN 1 END) OVER (PARTITION BY [ID], [SKILL_NAME]) AS [FALSE_COUNT]
sounds like it ought to be a good idea, but I couldn't make it fit.
I'm on a 2016 version of SQL Server.