I'm having trouble calculating the median for my result set and could use some help. I need to report the median, max, min, average, and standard deviation. There are currently 222 rows, but that number can vary, and I'm not sure that what I have so far is an accurate way of calculating the median. Here is my query:

    Select 
    min(nodes) as min_nodes
    ,max(nodes) as max_nodes
    ,avg(nodes) as avg_nodes
    ,(max(nodes) + min(nodes))/2 as median_nodes
    ,stddev(nodes) as sd_nodes
    from Table

  • Unless DB2 has a median function built in, you'll have to do it in multiple steps: get the number of rows in the result set, figure out the midpoint, and there's your median.
    – Marc B
    Commented Sep 2, 2014 at 19:48

2 Answers

You can do it using window functions:

    -- 2*seqnum in (cnt, cnt + 1, cnt + 2) matches the single middle row when cnt is odd
    -- and the two middle rows when cnt is even
    Select min(nodes) as min_nodes, max(nodes) as max_nodes, avg(nodes) as avg_nodes,
           avg(case when 2*seqnum in (cnt, cnt + 1, cnt + 2) then nodes end) as median_nodes,
           stddev(nodes) as sd_nodes
    from (select t.*, row_number() over (order by nodes) as seqnum,
                 count(*) over () as cnt
          from table t
         ) t

The avg() is there to handle an even number of values, where the median is conventionally taken as the midpoint of the two middle values. With an odd number of values only the single middle row passes the filter, so avg() returns that value unchanged.
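As a sanity check on the even-count case, here is a minimal sketch run against made-up data. The `sample_nodes` CTE and its four values are hypothetical, the `values` clause inside a CTE assumes DB2 for LUW, and the cast to double is an assumption to keep avg() from doing integer arithmetic on an integer column. It uses the filter `2*seqnum in (cnt, cnt + 1, cnt + 2)`, which for cnt = 4 selects rows 2 and 3.

```sql
-- Hypothetical sample data: four values, so the two middle rows hold 3 and 5.
with sample_nodes (nodes) as (
    values (1), (3), (5), (100)
)
select avg(case when 2*seqnum in (cnt, cnt + 1, cnt + 2)
                then cast(nodes as double)   -- assumption: avoid integer averaging
           end) as median_nodes
from (select s.nodes,
             row_number() over (order by s.nodes) as seqnum,
             count(*) over () as cnt
      from sample_nodes s
     ) t
```

With these values the two middle rows, 3 and 5, are averaged to 4, whereas (max + min)/2 would give 50.5, which is the midrange and not the median.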

Here's one way to calculate the median:

    select avg(nodes)
    from (
        select nodes
             , row_number() over(order by nodes asc) as rn1
             , row_number() over(order by nodes desc) as rn2
        from table
    ) as x(nodes, rn1, rn2)
    where rn1 in (rn2, rn2 - 1, rn2 + 1)

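Enumerating the rows in both directions is an optimization: no separate count(*) is needed, because the middle row or rows are the ones whose ascending and descending row numbers are equal or differ by one.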
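A small sketch of the same query on made-up data may help show which row survives the filter. The `sample_nodes` CTE and its values are hypothetical, the `values` clause inside a CTE assumes DB2 for LUW, and the values are deliberately distinct: with duplicates, the two row_number() orderings are not guaranteed to mirror each other, so a unique tie-breaker column in both `order by` clauses would probably be needed.

```sql
-- Hypothetical sample data: five distinct values, so the single middle row holds 7.
with sample_nodes (nodes) as (
    values (2), (4), (7), (9), (50)
)
select avg(cast(nodes as double)) as median_nodes   -- assumption: avoid integer averaging
from (
    select nodes
         , row_number() over(order by nodes asc)  as rn1
         , row_number() over(order by nodes desc) as rn2
    from sample_nodes
) as x(nodes, rn1, rn2)
where rn1 in (rn2, rn2 - 1, rn2 + 1)
```

Here only the row holding 7 has rn1 = rn2 = 3. Every other row has row numbers that differ by at least two, so the result is 7.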
