8

I have a text-file that contains a list of filenames (one filename per line).

Now I would like to calculate the size of all these files. I think I will have to do a ls -la on every line of the file and then accumulate the filesize.

I think that awk will be part of the solution, but that's just a guess.

6 Answers

11

You just need the last line of the du -c output:

du -ch $(<list) | tail -1
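
Note that du reports disk usage rather than file lengths, and the command above splits names on spaces. If byte counts are wanted, a minimal sketch along the following lines may help. It assumes GNU du (for -b, i.e. apparent size in bytes) and GNU xargs (for -d and -a); the function name total_apparent_size is hypothetical.

```shell
# total_apparent_size LISTFILE
# Hypothetical helper: prints the summed apparent size, in bytes, of the
# files named in LISTFILE (one path per line). Assumes GNU du and GNU xargs.
total_apparent_size() {
  # -d '\n' : one argument per line, so spaces and glob characters are literal
  # -r      : do not run du at all if the list is empty
  # du -cb  : apparent size in bytes, with a grand total on the last line
  xargs -rd '\n' -a "$1" du -cb -- | tail -n 1 | cut -f1
}
```

For a very long list, xargs may run du more than once; the last line then only covers the final batch, so this is best suited to lists that fit in a single command line.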
11

With GNU stat:

stat -c %s -- $(<list) | paste -d+ -s - | bc
  • stat displays information about the file
    • -c specifies the format, %s gives the filesize in bytes
  • paste -d+ -s concatenates the output lines into a single line with + as the delimiter
  • bc evaluates the resulting expression, which gives the sum

Add the -L option to stat if, for symlinks, you'd rather count the size of the file that the symlink eventually resolves to.

That assumes a shell like ksh, bash or zsh with the $(<file) operator to invoke split+glob on the content of a file.

Here list is expected to be a space, tab or newline (assuming the default value of $IFS) delimited list of file patterns (as in *.txt /bin/*). For a list of file paths, one per line, you'd need to disable globbing and limit $IFS to newline only, or with GNU xargs:

xargs -rd '\n' -a list stat -c %s -- | paste -sd+ - | bc
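
The shell-only route mentioned above (disable globbing and limit $IFS to newline) could be sketched as follows. The function name sum_sizes is hypothetical, GNU stat is assumed, and awk is swapped in for paste and bc to do the summing; it is a sketch, not the only way to do it.

```shell
# sum_sizes LISTFILE
# Hypothetical helper: prints the total size in bytes of the files named in
# LISTFILE (one path per line). Assumes GNU stat.
sum_sizes() (
  # The function body is a subshell, so the settings below do not leak
  # into the calling shell and need not be restored.
  set -f            # disable globbing: * and ? in names stay literal
  IFS='
'                   # split the list on newlines only
  stat -c %s -- $(cat -- "$1") |
    awk '{ s += $1 } END { printf "%.0f\n", s }'
)
```

Called as sum_sizes list, it prints a single number. An empty list makes stat complain about a missing operand, so a real script would probably want to check for that first.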
1
  • Nice... One may have to handle spaces in filenames using something like $ find . -type f -exec stat -c %s {} \; | paste -s -d+ | bc
    – tonioc
    Commented Oct 31, 2017 at 11:52
5

I would use the -s file test and perl:

-s File has nonzero size (returns size in bytes).

Something like this:

#!/usr/bin/env perl
use strict;
use warnings;

my $sum = 0;
while ( my $filename = <> ) {
    chomp ( $filename );
    $sum += -s $filename;
}

print "Sum is $sum bytes\n";

(reads filenames either from STDIN or from a file specified on command line, e.g. myscript.pl file_list.txt)

You could "one liner" this:

perl -nle '$sum += -s $_; END { print $sum }'

(and either pipe in a 'file name list' or specify a file argument after it as before)

5
  • 1
    Nice. One tweak for efficiency for the one-liner: perl -nle '$sum += -s $_} END { print $sum' -- that way you don't redefine the END sub for every iteration of the loop. Check both versions with perl -MO=Deparse -nle ... Commented Oct 5, 2015 at 13:20
  • 1
    Ah yeah. That's kinda cute. I've run into that before, but am wary about its impact on readability. Then again, this is a perl one liner....
    – Sobrique
    Commented Oct 5, 2015 at 13:21
  • 2
    If you want unreadability: perl -nlE'$s+=-s$_}END{say$s' Commented Oct 5, 2015 at 13:24
  • @glennjackman: The END can be omitted entirely: perl -nle '$sum += -s $_}{ print $sum'
    – cuonglm
    Commented Oct 5, 2015 at 17:27
  • You can, but they're somewhat obscure tricks which I'd shy away from when giving someone a suggestion :)
    – Sobrique
    Commented Oct 9, 2015 at 16:06
1

Another alternative, using ordinary shell commands. Even handles the filename-with-spaces case. Assumes that list of file names is in a file named fnames.

tr '\n' '\0' < fnames | xargs -0 cat | wc -c

wc is oddly useful in counting situations. Keep it in mind.

2
  • 1
    Nice thinking-outside-the-box.  But: (1) The question doesn't say anything about access.  It goes without saying that the user must have execute permission on the directories containing the files, but this is the only answer that requires read access.  (2) This approach is very expensive if the file(s) are very large. Commented Oct 5, 2015 at 21:49
  • @Scott - legitimate criticisms. Thanks for thinking about it, and writing them down.
    – user732
    Commented Oct 5, 2015 at 22:00
0

I also came up with a solution:

cat files.txt | while read f; do ls -la $f; done | awk '{s+=$5;} END {print s;}'
6
  • 1
    This will break if any of your filenames contain newlines or spaces.
    – Sobrique
    Commented Oct 5, 2015 at 13:06
  • ahh.. just out of curiosity: is there a way to make the above one-liner work with all possible filenames? Commented Oct 5, 2015 at 13:07
  • 1
    Difficult, because of the way you're parsing the output of ls. By doing so, you'll always be subject to its formatting. I tend to ignore the filenames with newlines problem, but files with spaces are distressingly common. But quoting filenames and using the stat answer as above would probably do the trick.
    – Sobrique
    Commented Oct 5, 2015 at 13:11
  • @TobiasGassmann did specify in the question that the filenames are "one per line", so the only change required would be to quote the "$f" in the ls portion.
    – Jeff Schaller
    Commented Oct 5, 2015 at 13:11
  • 3
    The rule of thumb is don't parse ls -- there are utilities like stat that are built for this exact purpose. Commented Oct 5, 2015 at 13:22
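
Putting the suggestions from these comments together (quote "$f", use stat instead of parsing ls), a loop-based version might look roughly like this. It is a sketch that assumes GNU stat; the function name sum_listed_sizes is hypothetical. Names containing newlines are still out of scope, since the list is one name per line.

```shell
# sum_listed_sizes LISTFILE
# Hypothetical helper: prints the total size in bytes of the files named in
# LISTFILE (one path per line). Assumes GNU stat.
sum_listed_sizes() (
  # Subshell body: total, f and size stay local to the function.
  total=0
  # IFS= and -r keep leading/trailing blanks and backslashes intact;
  # the || test also picks up a last line that lacks a trailing newline.
  while IFS= read -r f || [ -n "$f" ]; do
    [ -n "$f" ] || continue                  # skip empty lines
    size=$(stat -c %s -- "$f") || continue   # skip names stat cannot read
    total=$((total + size))
  done < "$1"
  printf '%s\n' "$total"
)
```

This runs one stat process per file, so it is slower than the single-invocation answers on long lists, but it keeps going when an entry is missing (stat reports the problem on stderr).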
0

For filenames with space in my list, I used this (inspired by @Costas's answer):

SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
du -ch $(<list) |tail -1
IFS=$SAVEIFS
