
du was showing some drives with much less space than I was expecting, and ls -alh also showed the total at the top to be a factor of three more than the sum of the individual file sizes. Following this answer, I checked with ls -s, and sure enough, most of the files are using three times as much disk space as their size. What causes this, and can I do anything to get the disk usage down?

Edit

I'm seeing output like this from ls -alhs:

 50K -rw-------   1 xxx xxx 9.0K Jan 29 20:34 20120103.gz
242K -rw-------   1 xxx xxx  67K Jan 29 20:53 20121130.gz

so the problem isn't that my file sizes are much less than 4k.
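
For reference, stat reports both figures directly, which makes the discrepancy easy to quantify (assuming GNU coreutils; the file name is taken from the listing above):

    # Apparent size vs. blocks actually allocated
    # (%b is counted in %B-byte units, normally 512 bytes)
    stat --format '%n: %s bytes, %b blocks of %B bytes' 20120103.gz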

  • What is your typical file size? How many bytes are actually being wasted?
    – Zoredache
    Commented Jan 31, 2014 at 17:21
  • Wow, that is impressive. I don't think I have ever seen, or at least have never noticed, space being wasted like in your example. Just for additional information: what type of file system is this on? ext2/3/4, or something else?
    – Zoredache
    Commented Jan 31, 2014 at 17:38
  • So then NFS to what? What is the OS and file system on the NFS server? Do you see the same results if you look at the ls output on the NFS server?
    – Zoredache
    Commented Jan 31, 2014 at 17:44
  • Well, in any case your problem has nothing to do with your Linux machine then. It is the mechanics of the file system on the server that matters. I am not familiar with that equipment.
    – Zoredache
    Commented Jan 31, 2014 at 17:49
  • Interesting that those files are compressed archives. Perhaps the "large size" is the uncompressed size, and the "small size" is the compressed size? (A quick way to check this is sketched after these comments.)
    – sawdust
    Commented Jan 31, 2014 at 19:14
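
Following up on sawdust's comment: gzip itself reports both sizes, so that theory is easy to test (file name taken from the listing above):

    gzip -l 20120103.gz
    # columns: compressed size, uncompressed size, ratio, uncompressed name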

1 Answer


I don't know what file system you are using or what its cluster size is, but here is some generic information that should help.

The file system allocates space in groups, called clusters by some file systems. The cluster size varies, but it is in most cases a power of 2 and at least 512 bytes. The 512 bytes matches the physical sector size of all but the newest hard drives, which use 4096-byte sectors.

Each file uses at least one cluster, and in most cases the last cluster is not fully used. The leftover space at the end of each file cannot be allocated to anything else. On FAT, FAT32, and NTFS the cluster size cannot exceed 64 KB, but the same limit does not apply to Linux file systems.
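
As a concrete check with GNU coreutils (somefile here is a placeholder), the per-file overhead and the mount's block size can be read off like this:

    du -B1 --apparent-size somefile   # bytes the file claims to contain
    du -B1 somefile                   # bytes actually allocated on disk
    stat -f --format '%S bytes per block' .   # block (cluster) size of the mount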

ls -alhs

How big are the . and .. entries at the top of that listing?

So if you have a lot of files each wasting a tiny amount of space, it all adds up to a large amount of wasted space.
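
A rough way to total that slack over a directory tree, assuming GNU find and awk (sparse or transparently compressed files will skew the estimate):

    # %b = allocated 512-byte blocks, %s = apparent size in bytes
    find . -type f -printf '%b %s\n' |
        awk '{ slack += $1 * 512 - $2 } END { print slack, "bytes of slack" }'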

You would have to look into the exact details of your file system to find this out. Changing file systems can have a major impact on the overhead. I tried Btrfs and it wasted a ton of space: after a fresh install and updates, disk usage was 2x or more that of other file systems.

Ext4 also does poorly with a large number of small files; a perfect example is the kernel source tree, a single copy of which contains tens of thousands of small files.

It is entirely possible that your file system is responsible for the wasted space, and the only way to change that is to change the file system.

In addition, some file systems support snapshots, which let backup copies of the same file be kept within the file system. The distro controls how the feature is configured and whether it is on by default. Every file you change or delete could still be held in a snapshot rather than actually freed. There is a command to remove old snapshots, but I don't recall what it is.
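
If the file system turns out to be Btrfs, for example (an assumption on my part), snapshots are subvolumes and can be listed and removed like this:

    sudo btrfs subvolume list /              # lists subvolumes, including snapshots
    sudo btrfs subvolume delete /path/to/snapshot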
