Bash command to calculate average on each row and each column

Question

Suppose we have a log file like marks.log and the content looks something like this:

Fname   Lname   Net Algo    
Jack    Miller  15  20  
John    Compton 12  20  
Susan   Wilson  13  19

I want to add a new column that contains the average for each person, and a new row that contains the average for each course. The result has to look like this:

Fname   Lname   Net  Algo  Avg
Jack    Miller  15   20    17.5
John    Compton 12   20    16
Susan   Wilson  13   19    16
Average         13.3 19.6  -

Re: "I'm new to bash can i can't figure the syntax for loops and awk etc." First, awk is not bash, they are two completely different languages. Second, if you don't know the syntax, you should go read a tutorial instead of delegating it to other people. — 4ae1e1, Commented Oct 3, 2015 at 8:53
Honestly, if this is to run on Linux or any modern UNIX, I would suggest scripting it in some other language more suitable for complex scripting. Perl and Python are almost always available to you, and in many cases Ruby and PHP are there as well on a default Linux installation. Although doable in bash, it will be much easier to do in one of these languages (I suspect no more than a few lines of code in any of them) — shevron, Commented Oct 3, 2015 at 8:57
Please edit your question to include your expected output, given this sample input. Good luck. — shellter, Commented Oct 3, 2015 at 9:11
I would suggest using something like "Overall Average" as the final row, just so it has the same number of fields as the previous rows. It will make life easier for any formatting you do afterwards, using space as the field delimiter. — seumasmac, Commented Oct 3, 2015 at 9:37

seumasmac · Accepted Answer · 2015-10-03 10:14:06Z

4

If your data is in datafile.txt, the syntax for awk could be something like:

awk '
  {
    # If it is the first row
    if (NR==1)
      print $0, "Avg";
    else
      # Print all fields, then the average of fields 3 & 4
      print $0, ($3+$4)/2;
    # Get the total for field 3 and field 4
    t3+=$3; t4+=$4
  }
  # Once that is done...
  END {
    # Print the final line
    printf "Overall Average %.1f %.1f -\n",
      # The average of field 3 (NR is the Number of Records)
      t3/(NR-1),
      # The average of field 4 (NR is the Number of Records)
      t4/(NR-1);
  }' datafile.txt

That's the long version with comments. The one-liner looks like:

awk '{if (NR==1) print $0, "Avg"; else print $0,($3+$4)/2; t3+=$3; t4+=$4}END{printf "Overall Average %.1f %.1f -\n",t3/(NR-1),t4/(NR-1);}' datafile.txt

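This should produce the desired values, though not the exact layout: the final row is labelled "Overall Average" and, because %.1f rounds, the Algo average prints as 19.7 rather than 19.6.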
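
Both commands above append the average to each input line unchanged, so the output keeps the input's spacing rather than the fixed-width layout shown in the question. A minimal sketch of one way to get that layout, assuming the four-column input from the question; the column widths (8, 8, 5, 6), the temporary sample file and the counter n are illustrative choices, not part of the original answer:

```shell
#!/bin/sh
# Hypothetical sample file mirroring marks.log from the question.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
Fname   Lname   Net Algo
Jack    Miller  15  20
John    Compton 12  20
Susan   Wilson  13  19
EOF

result=$(awk '
  # Header row: reprint the names in fixed-width columns and add "Avg"
  NR == 1 { printf "%-8s%-8s%-5s%-6s%s\n", $1, $2, $3, $4, "Avg"; next }
  {
    # Data row: same widths, plus the mean of fields 3 and 4
    printf "%-8s%-8s%-5s%-6s%s\n", $1, $2, $3, $4, ($3 + $4) / 2
    t3 += $3; t4 += $4; n++
  }
  # Final row: column means over the n data rows (skipped if there are none)
  END { if (n > 0) printf "%-16s%-5.1f%-6.1f%s\n", "Average", t3 / n, t4 / n, "-" }
' "$datafile")

printf '%s\n' "$result"
rm -f "$datafile"
```

Counting the data rows in n and checking n > 0 avoids a division by zero when the file contains only the header line.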

edited Oct 3, 2015 at 10:14

answered Oct 3, 2015 at 9:28


How can you do this if rows are not defined to have 2 integer entries and not restricted to fields 3 or 4? So let's ignore the column names and say row 1 has 2 integers. row 2 has 4, row 3 has zero, row 5 has 1. How can you work the average out of each row in that instance via the methods defined including the Overall Average?
– Data
Commented Feb 9, 2021 at 13:58
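
For the ragged case raised in the comment above, one possible approach is to loop over the fields of each row and average only those that look like integers. This is a sketch under assumptions: the sample rows are made up, "Overall Average" is taken to mean the mean of every integer in the file (per-column averages are not well defined when rows differ in length), and rows without integers get a "-":

```shell
#!/bin/sh
# Hypothetical input: rows hold a varying number of integers (row 3 has none).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
a 10 20
b 1 2 3 4
c
d 7
EOF

result=$(awk '
  {
    sum = 0; n = 0
    # Add up only the fields that consist of an optional sign and digits
    for (i = 1; i <= NF; i++)
      if ($i ~ /^-?[0-9]+$/) { sum += $i; n++ }
    total += sum; count += n
    # Rows without integers print "-" instead of dividing by zero
    print $0, (n ? sum / n : "-")
  }
  END { if (count) printf "Overall Average %.1f\n", total / count }
' "$datafile")

printf '%s\n' "$result"
rm -f "$datafile"
```

Because the fields are tested one by one, the same loop works no matter which positions the numbers occupy.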


Component 10 · Accepted Answer · 2015-10-03 11:33:36Z

2

How about:

gawk '{if (NR==1) { print $0, "Avg"; tn = 0; ta = 0; c = 0; } else { print $0,($3+$4)/2; tn = tn + $3; ta = ta + $4; c = c + 1; } } END {print "Average", tn/c, ta/c, c; }' <filename>

edited Oct 3, 2015 at 11:33

answered Oct 3, 2015 at 9:28


Thanks for making me feel lazy :) Updated my answer doing the first line properly.
– seumasmac
Commented Oct 3, 2015 at 10:15


ewcz · Accepted Answer · 2015-10-03 09:38:43Z

A lengthier solution that does not use awk could be:

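#!/bin/bash
# Running totals for the Net and Algo columns
A=0
B=0

# Add fields 3 and 4 of the current row to the running totals
process(){
  A=$(( A + $3 ))
  B=$(( B + $4 ))
}
# Print the mean of fields 3 and 4 to one decimal place
get_mean(){
  val=$( echo "($3 + $4)/2" | bc -l)
  printf "%.1f" "$val"
}

line_id=0
# The || test keeps a last line that has no trailing newline
while read -r line || [ -n "$line" ]
do
  line_id=$(( line_id + 1 ))
  # The first line is the header: print it with the extra column name
  if [ "$line_id" -le 1 ]; then
    echo "Fname   Lname   Net  Algo  Avg"
    continue
  fi

  # $line is deliberately unquoted so that it splits into fields
  process $line
  mean=$(get_mean $line)

  echo $line "$mean"
done
# line_id also counts the header, so divide by line_id-1
A=$(echo "$A/($line_id-1)" | bc -l)
B=$(echo "$B/($line_id-1)" | bc -l)
printf "Average\t\t%.1f %.1f -\n" "$A" "$B"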

Then one can invoke this script as ./test.sh < input.
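
The script above calls bc because shell arithmetic only handles integers. As a rough sketch of an alternative that avoids the external call, the mean can be computed in tenths with integer arithmetic and printed with the decimal point put back. This swaps in fixed-point arithmetic for bc, assumes non-negative integer marks, and uses a hypothetical helper name, mean_tenths:

```shell
#!/bin/bash
# mean_tenths TOTAL COUNT: print TOTAL/COUNT rounded to one decimal place,
# using integer arithmetic only (assumes TOTAL >= 0 and COUNT > 0).
mean_tenths() {
  # Scale by 10 and add COUNT/2 so the integer division rounds to nearest
  local tenths=$(( ($1 * 10 + $2 / 2) / $2 ))
  printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

row_mean=$(mean_tenths $(( 15 + 20 )) 2)        # Jack's two marks
col_mean=$(mean_tenths $(( 20 + 20 + 19 )) 3)   # the Algo column
echo "row: $row_mean, column: $col_mean"
```

For the sample data this prints 17.5 for Jack's row and 19.7 for the Algo column.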
