
I'm new here and want to ask: how do I add up the sizes of files that share the same name? My file names are actually all different from each other, but I trimmed them so that related files end up grouped together.

Here are my original files with their sizes (sample).

sample.txt has this data inside:

  1. 12345 a_1.txt
  2. 12234 b_1.txt
  3. 32123 c_1.txt
  4. 11122 a_2.txt

Then I cut the filenames inside sample.txt to remove everything from the '_' (underscore) onward. They now look like this:

  1. 12345 a
  2. 12234 b
  3. 32123 c
  4. 11122 a

Now I want to add up the sizes of all files that have the same name, as seen above. The output should look like this:

  1. 23467 a
  2. 12234 b
  3. 32123 c

Please help. Thanks a lot, guys. I've been stuck on this for hours now.

  • I'm not sure whether line numbers exist in sample.txt and whether you want them in the output. It looks like a numbered list, so maybe not. Please paste your file and desired output as a code sample to avoid confusion. Commented Apr 19, 2016 at 5:22
  • Wow Kamil, it worked! Can you please explain the code to me? Especially the awk command. I'm not familiar with it. Thank you so much! Commented Apr 19, 2016 at 7:11

2 Answers

Assuming there are no line numbers in sample.txt:

cut -f 1 -d _ sample.txt | awk '{a[$2] += $1} END{for (i in a) print a[i], i}'

You may want to add | sort -k 2 at the end, because awk's for (i in a) loop visits the names in no guaranteed order.


EDIT1 - explanation as requested:

The cut command splits every line at the _ delimiter and keeps only the first part. You have already done this with your original file.
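As a quick illustration of that step, here is a minimal sketch that feeds one line from the question's sample through the same cut options. The variable name shortened is only there for the example.

```shell
# One line from the sample, split at "_" with only the first part kept.
shortened=$(printf '12345 a_1.txt\n' | cut -f 1 -d _)
printf '%s\n' "$shortened"   # prints: 12345 a
```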

Then the awk command splits every line into two fields. We call them size and name, but awk refers to them as $1 and $2. For every line it adds to one element of an array a (the name a is chosen arbitrarily and has nothing to do with the filename a in the sample). The name $2 selects which element to update, so it acts as the index; the size $1 is the amount added. awk treats an element of a as 0 the first time it is mentioned, so no explicit initialization is needed. A given element grows every time its index (name) appears as the second field of an input line. At the end (after the last line of the input), awk goes through every known index of a and prints the value (which is now the cumulative size) followed by the index (name).
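The explanation above can be tried end to end with the sketch below. It assumes sample.txt holds the four lines from the question without line numbers; the temporary file and the names sample, result, total and name are made up for the example (total plays the role of the array a). The sort at the end is there because the for-in order in awk is unspecified.

```shell
# Hypothetical copy of the question's sample data (no line numbers).
sample=$(mktemp)
cat > "$sample" <<'EOF'
12345 a_1.txt
12234 b_1.txt
32123 c_1.txt
11122 a_2.txt
EOF

result=$(
  cut -f 1 -d _ "$sample" |
  awk '
    # Runs once per input line: $1 is the size, $2 is the shortened name.
    { total[$2] += $1 }
    # Runs once after the last line: print each summed size with its name.
    END { for (name in total) print total[name], name }
  ' |
  sort -k 2
)
printf '%s\n' "$result"
rm -f "$sample"
```

With the sample data this prints 23467 a, 12234 b and 32123 c on separate lines, which matches the output requested in the question.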

  • Wow Kamil, it worked! Can you please explain the code to me? Especially the awk command. I'm not familiar with it. Thank you so much! Commented Apr 19, 2016 at 6:43

To get the total size of all files starting with a_, you could do this:

du -c a_* | grep total

du reports the disk usage of each file, and -c adds a grand total line at the end. The grep keeps only that total line rather than the lines for the individual files.

  • Thank you, Paul, but it's not working. I just want to clarify that those 2 sets of files are in .txt. I'll edit my question again. Sorry. Commented Apr 19, 2016 at 4:47
