3

I have a table, match, which looks like this (please see attached image). I wanted to retrieve a dataset with a column holding the average of home_goal and away_goal, using this code:

SELECT 
    m.country_id, 
    m.season,
    m.home_goal,
    m.away_goal,
    AVG(m.home_goal + m.away_goal)  AS avg_goal
FROM match AS m;

However, I got this error

column "m.country_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 3:  m.country_id,

My question is: why was a GROUP BY clause required? Why couldn't SQL work out how to take the average of two columns row by row?

Thank you.


1
  • Tag your question with the database you are using. Commented Dec 13, 2020 at 12:49

5 Answers

4

Try this:

SELECT 
    m.country_id, 
    m.season,
    m.home_goal,
    m.away_goal,
    (m.home_goal + m.away_goal) / 2.0  AS avg_goal  -- 2.0 avoids integer division
FROM match AS m;

You were asked for a GROUP BY because AVG(), much like SUM(), works on multiple values of one column. Every selected column that is not inside such a column-wise operation has to be listed in the GROUP BY.

You are looking to average two distinct columns, which is a row-wise operation rather than a column-wise one.
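To make the row-wise versus column-wise distinction concrete, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module only so that it is self-contained. The sample rows are made up, and the asker's database (PostgreSQL, judging by the error message) may format results differently.

```python
import sqlite3

# Hypothetical sample data; the real match table has more columns and rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE match (country_id INT, season TEXT, home_goal INT, away_goal INT)"
)
conn.executemany(
    "INSERT INTO match VALUES (?, ?, ?, ?)",
    [(1, "2011/2012", 2, 1), (1, "2011/2012", 0, 0), (2, "2011/2012", 3, 2)],
)

# Column-wise: AVG() folds all rows into a single value.
aggregated = conn.execute(
    "SELECT AVG(home_goal + away_goal) FROM match"
).fetchall()

# Row-wise: a plain expression is evaluated once per row.
per_row = conn.execute(
    "SELECT (home_goal + away_goal) / 2.0 FROM match"
).fetchall()

print(len(aggregated), "row from AVG():", aggregated)
print(len(per_row), "rows from the expression:", per_row)
```

The aggregate returns one row for the whole table, while the arithmetic expression returns one value for each of the three rows.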

3
  • Thanks, DPH. Now I understand the reason. Just to clarify: built-in functions such as SUM and AVG work column-wise, and otherwise one has to use GROUP BY if one wants to apply them row-wise?
    – Nemo
    Commented Dec 13, 2020 at 1:12
  • 1
    @Nemo if you want operations across columns (horizontal), think of a formula (a + b, etc.). When working column-wise, you use GROUP BY to identify the variables that make up the groups (subgroups, etc.) you want to compute sum(), avg(), etc. for. When you select a column without a function (sum(), avg(), etc.) alongside a column-wise operation, SQL needs the explicit instruction to group by it, because the dataset is being compressed vertically by the column-wise operation. The values with no operation would not get compressed, and SQL cannot tell how to line them up with the ones that are.
    – DPH
    Commented Dec 13, 2020 at 1:20
  • Thanks for the clarification. I didn't expect to learn that much when posting the question.
    – Nemo
    Commented Dec 13, 2020 at 1:30
1

how to take average of two columns row by row?

You don't use AVG() for this; it is an aggregate function that operates over a set of rows. Here, it seems like you just want a simple math computation:

SELECT 
    m.country_id, 
    m.season,
    m.home_goal,
    m.away_goal,
    (m.home_goal + m.away_goal) / 2.0 AS avg_goal
FROM match AS m;

Note the decimal denominator (2.0): this avoids integer division in databases that implement it.
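A quick way to see the integer-division effect is the sketch below. It uses SQLite through Python's sqlite3 module, which is an assumption made only so the snippet runs on its own; SQLite happens to truncate integer division, and other databases may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both operands are integers, so the fractional part is discarded.
int_result = conn.execute("SELECT (2 + 1) / 2").fetchone()[0]

# A decimal denominator makes the division non-integer.
dec_result = conn.execute("SELECT (2 + 1) / 2.0").fetchone()[0]

print(int_result, dec_result)
```

With a home score of 2 and an away score of 1, the first form loses the half goal and the second keeps it.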

5
  • Thanks for the solution, GMB. But why couldn't I use the AVG function? Could you please elaborate on that in your answer so that I can accept it as the solution?
    – Nemo
    Commented Dec 13, 2020 at 0:56
  • 1
    @Nemo: no, that's not what AVG() does. There is no built-in function for the simple computation that you want to do.
    – GMB
    Commented Dec 13, 2020 at 0:58
  • Do you mean functions AVG, SUM etc can't be used at row level, only at aggregate level?
    – Nemo
    Commented Dec 13, 2020 at 1:03
  • 1
    SQL likes columns, so that's usually what built-in functions are geared towards. You can do anything you want with SQL, but that doesn't mean it's going to be efficient or easy to implement. Commented Dec 13, 2020 at 1:12
  • Thanks, Mark. Your comment helped me understand more about SQL mechanism.
    – Nemo
    Commented Dec 13, 2020 at 1:19
1

AVG, as used in the query above, calculates the average of a column's values across rows, not the average of the two values in the same row. It is an aggregate function, and that's why the GROUP BY clause is required.

To take the average of two columns in the same row, add them and divide by 2.
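For comparison, this is roughly what AVG() is meant for: one average per group. The sketch below assumes a cut-down match table with made-up rows and runs through Python's sqlite3 module so it is self-contained. Grouping by country_id and season is just one possible choice of groups.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE match (country_id INT, season TEXT, home_goal INT, away_goal INT)"
)
conn.executemany(
    "INSERT INTO match VALUES (?, ?, ?, ?)",
    [
        (1, "2011/2012", 2, 1),
        (1, "2011/2012", 0, 1),
        (2, "2011/2012", 3, 2),
        (2, "2012/2013", 1, 1),
    ],
)

# Every non-aggregated column in the SELECT list also appears in GROUP BY,
# so each output row describes exactly one (country_id, season) group.
rows = conn.execute(
    """
    SELECT country_id, season, AVG(home_goal + away_goal) AS avg_goal
    FROM match
    GROUP BY country_id, season
    ORDER BY country_id, season
    """
).fetchall()

for row in rows:
    print(row)
```

Here avg_goal is the average total goals per match within each group, which is a different quantity from the per-row average of the two columns.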

1

Let's consider the following table:

CREATE TABLE Numbers([x] int, [y] int, [category] nvarchar(10));

INSERT INTO Numbers ([x], [y], [category])
VALUES
    (1, 11, 'odd'),
    (2, 22, 'even'),
    (3, 33, 'odd'),
    (4, 44, 'even');

Here is an example of using two aggregate functions - AVG and SUM - with GROUP BY:

SELECT
  Category,
  AVG(x) as avg_x,
  AVG(x+y) as avg_xy,
  SUM(x) as sum_x,
  SUM(x+y) as sum_xy
FROM Numbers
GROUP BY Category;

The result has two rows:

Category    avg_x   avg_xy  sum_x   sum_xy
even        3       36      6       72
odd         2       24      4       48

Please note that Category is available in the SELECT part because the results are grouped by it. If a GROUP BY is not specified, the result is a single row and Category is not available (which value should be displayed if we have sums and averages for multiple rows with different categories?).

What you want is to compute a new column and for this you don't use aggregate functions:

SELECT
  (x+y)/2 as avg_xy,
  (x+y) as sum_xy
FROM Numbers

This returns all rows:

avg_xy  sum_xy
6       12
12      24
18      36
24      48

If your columns are integers, the division is an integer division, so don't forget to handle that if you need the fractional part. For example: (CAST(x AS DECIMAL) + y) / 2 AS avg_xy

1

The simple arithmetic calculation:

(m.home_goal + m.away_goal) / 2.0

is not exactly equivalent to AVG(), because of NULL values: if either column is NULL the arithmetic result is NULL, whereas AVG() ignores NULLs. Databases that support lateral joins provide a pretty easy (and efficient) way to actually use AVG() within a row.

The safe version looks like:

(coalesce(m.home_goal, 0) + coalesce(m.away_goal, 0)) /
nullif( (case when m.home_goal is not null then 1 else 0 end +
         case when m.away_goal is not null then 1 else 0 end
        ), 0
      )

Some databases have syntax extensions that allow the expression to be simplified.
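The difference between the plain arithmetic and the NULL-safe expression can be seen in a small sketch. The rows are made up, the id column is a hypothetical addition used only for ordering, and the SQL runs through Python's sqlite3 module. The leading 1.0 * is also an addition not present in the expression above: it is there because SQLite would otherwise perform integer division on these integer columns.

```python
import sqlite3

# Hypothetical rows: one complete, one with a missing away score, one all NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE match (id INT, home_goal INT, away_goal INT)")
conn.executemany(
    "INSERT INTO match VALUES (?, ?, ?)",
    [(1, 2, 1), (2, 3, None), (3, None, None)],
)

rows = conn.execute(
    """
    SELECT
        -- Plain arithmetic: NULL in either column makes the result NULL.
        (m.home_goal + m.away_goal) / 2.0 AS naive_avg,
        -- NULL-safe: sum the non-NULL values, divide by how many there are.
        1.0 * (coalesce(m.home_goal, 0) + coalesce(m.away_goal, 0)) /
        nullif( (case when m.home_goal is not null then 1 else 0 end +
                 case when m.away_goal is not null then 1 else 0 end
                ), 0
              ) AS safe_avg
    FROM match AS m
    ORDER BY m.id
    """
).fetchall()

for naive_avg, safe_avg in rows:
    print(naive_avg, safe_avg)
```

For the row with only a home score, the plain version yields NULL while the safe version averages over the single available value. When both columns are NULL, nullif turns the zero count into NULL, so the result is NULL instead of a division-by-zero error.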

2
  • Thanks for the new concepts (coalesce, nullif), Gordon. What did you mean by some databases, as in hierarchical, relational, graph, network etc?
    – Nemo
    Commented Dec 13, 2020 at 23:14
  • @Nemo . . . No. The traditional relational databases all extend the standard in one way or another, or support "uncommon" features of the standard. For instance, Postgres supports FILTER and MySQL treats booleans as integers. Commented Dec 14, 2020 at 0:22
