How to add sizes of all files with the same name in UNIX Shell scripts

Question

I'm new here. How can I add up the sizes of files that share the same name? My file names actually differ from each other, but I trimmed them so that related files are grouped together.

Here are my original files with their sizes (sample).

sample.txt has this data inside:

12345 a_1.txt
12234 b_1.txt
32123 c_1.txt
11122 a_2.txt

Then I cut the file names in sample.txt to remove everything from the '_' (underscore) onward. They now look like this:

12345 a
12234 b
32123 c
11122 a

Now I want to add up the sizes of all files with the same name, as seen above. The output should look like this:

23467 a
12234 b
32123 c

Please help. Thanks a lot, guys. I've been stuck on this for hours.

I'm not sure whether line numbers exist in sample.txt and whether you want them in the output. It looks like a numbered list, so maybe not. Please paste your file and desired output as a code sample to avoid confusion. — Kamil Maciorowski, Commented Apr 19, 2016 at 5:22
Wow Kamil, it worked! Can you please explain the code to me? Especially the awk command. I'm not familiar with it. Thank you so much! — Zero Darbelll, Commented Apr 19, 2016 at 7:11

Kamil Maciorowski · Accepted Answer · 2016-04-19 07:32:50Z

Assuming there are no line numbers in sample.txt:

cut -f 1 -d _ sample.txt | awk '{a[$2] += $1} END{for (i in a) print a[i], i}'

You may want to add | sort -k 2 at the end.
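To try the pipeline without touching real data, here is a minimal sketch that recreates the sample from the question in a temporary directory and runs the command with the sort step appended. The names workdir and result are just illustrative, not part of the original answer.

```shell
# Scratch directory so no existing sample.txt gets overwritten.
workdir=$(mktemp -d)

# Recreate the sample input from the question.
printf '%s\n' '12345 a_1.txt' '12234 b_1.txt' '32123 c_1.txt' '11122 a_2.txt' \
  > "$workdir/sample.txt"

# Keep the text before the first underscore, sum sizes per name, sort by name.
result=$(cut -f 1 -d _ "$workdir/sample.txt" |
  awk '{a[$2] += $1} END {for (i in a) print a[i], i}' |
  sort -k 2)

printf '%s\n' "$result"
# 23467 a
# 12234 b
# 32123 c
```

Without the sort, the order of the lines depends on how awk walks the array, so it may differ between awk implementations.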

EDIT1 - explanation as requested:

The cut command splits every line at the _ delimiter and keeps only the first part. You have already done this with your original file.

Then the awk command finds two fields in every line. We call them size and name, but awk refers to them as $1 and $2 internally. For every line it increments one element of an array a (the name a is chosen arbitrarily and has nothing to do with the file name a in the sample). The name $2 is the index that tells which element to increment; the size $1 is the amount to add. awk is smart enough to initialize an element of a to 0 the first time it is mentioned. That element is then incremented every time its index (name) appears as the second field of an input line. At the end (after the last line of the input) awk goes through every known index of a and prints the value (which is now the cumulative size) followed by the index (name).
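The same steps can also be written out with descriptive names, which may be easier to follow. This is only a sketch of one possible variant: the function name sum_by_name and the variables size, name and total are made up for illustration, and the underscore trimming is done inside awk with sub() instead of cut.

```shell
# Hypothetical helper: prints "<total size> <name>" for each group.
# $1 is the path to a file of "<size> <filename>" lines.
sum_by_name() {
  awk '
    {
      size = $1
      name = $2
      sub(/_.*/, "", name)   # drop everything from the first underscore on
      total[name] += size    # an unseen element counts as 0 before the add
    }
    END {
      # Runs once, after the last input line has been read.
      for (name in total) print total[name], name
    }
  ' "$1" | sort -k 2
}
```

With the original, untrimmed sample.txt you would call it as sum_by_name sample.txt.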

Paul · Answer · 2016-04-19 04:10:20Z

To get the total of every file starting with a_ you could do this:

du -c a_* | grep total

du works out the size of each file, and -c adds a grand total at the end. The grep then extracts just that total line rather than all the individual files.
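Note that du reports disk usage in blocks, which can differ from the byte sizes a listing shows. If the byte count is what matters, a rough sketch along these lines might do. The function name total_bytes is made up for illustration.

```shell
# Hypothetical helper: prints the combined size in bytes of the files given.
total_bytes() {
  # Concatenate the files and count the bytes of the combined stream.
  # tr strips the padding some wc implementations put before the number.
  cat -- "$@" | wc -c | tr -d ' '
}
```

For the files in this answer the call would be total_bytes a_*. It reads every file in full, so it is slower than du on large files.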

Thank you Paul, but it's not working. I just want to clarify that those two sets of files are listed in a .txt file. I'll edit my question again. Sorry. – Zero Darbelll, Commented Apr 19, 2016 at 4:47
