41

I can reformat a brand new 2TB WD Passport drive to exFAT, with choice of many "Allocation Unit Size":

128 KB
256 KB
512 KB
1024 KB
4096 KB
16384 KB
32768 KB

Which one is best if this drive will mainly be used for recording HDTV programs with Media Center on Windows 7? Thanks.

This is related to question: Is it best to reformat our external Hard drive to exFAT for compatibility with Mac?

2
  • 1
    I created some plots and did some analysis on a trade study you might find interesting, in my new answer here. For disks with only large files, I recommend 128 KiB cluster sizes. For disks with lots of small files, I recommend 8 KiB to keep disk usage down. Wasted space on a system with 32 MiB cluster sizes, for instance, is 32768/8 = 4096x greater than on systems with the same data using 8 KiB cluster sizes. Commented May 23, 2023 at 7:16
  • So, if you had 10000 files that were 1 byte each, that would be about 10000 x 8 KiB / 1024 = 78 MiB on an 8-KiB-cluster exFAT drive, and a ridiculous 10000 x 32 MiB/1024 = 312.5 GiB on a 32-MiB-cluster exFAT drive. Again, 4096 times higher. Commented May 23, 2023 at 7:20

6 Answers

36

You should first understand what Allocation Unit Size (AUS) means.

It is the smallest block of data that can be allocated on the disk. Your data is split into units of that size when it is saved to the disk. For example, if you have a 512 KB file and a 128 KB allocation unit size, your file will be saved in 4 units on the disk (512 KB / 128 KB).

If your file's size is 500 KB and you have a 128 KB AUS, your file will still occupy 4 units on the disk, because 128 KB is the smallest size of an allocation unit: 384 KB is allocated in 3 units, the remaining 116 KB goes into a final unit, and 12 KB of that unit stays empty. You can observe this behaviour in the file properties dialog on Windows: a file's size and the space it actually occupies on the disk are two different things. At the low level, the operating system reads the disk one allocation unit at a time.
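The arithmetic above can be sketched in a few lines of Python (a hypothetical helper for illustration, not part of any exFAT tooling):

```python
import math

def cluster_usage(file_size_kb, aus_kb):
    """Return (clusters used, on-disk size, slack), all sizes in KB."""
    clusters = math.ceil(file_size_kb / aus_kb)   # last unit is rounded up
    on_disk = clusters * aus_kb                   # space actually occupied
    return clusters, on_disk, on_disk - file_size_kb

print(cluster_usage(512, 128))  # (4, 512, 0)   -> fits exactly
print(cluster_usage(500, 128))  # (4, 512, 12)  -> 12 KB of the last unit stays empty
```

This matches the 500 KB example: 4 units are allocated, and 12 KB of the last one is wasted.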

That being said, a large AUS wastes free space, because the last allocation unit of each file is rarely filled completely. As a side effect, the number of files you can store on the disk also drops, for the same reason: the last AU is not fully used. But here's the trade-off: a large AUS significantly improves disk read performance, since the OS can read more data in a single read operation. Imagine the OS needing only a couple of disk reads to read an entire GB-sized file!

A small AUS improves free space utilization but reduces disk read performance: the same trade-offs as a large AUS, just in reverse.

So, what is the conclusion? If you will store large (I mean large!) files on the disk, a higher AUS will give an appreciable read performance boost, at the cost of some wasted space and a lower maximum file count.

Which AUS should you use? That depends on your average file size. You can also compute the free space utilization from your actual file sizes.
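That computation can be sketched in Python (a hypothetical estimator, with made-up example sizes) by totalling the slack for a list of file sizes at a few candidate allocation unit sizes:

```python
import math

def total_slack_mb(file_sizes_bytes, aus_bytes):
    """Total space wasted (in MB) if these files were stored with the given AUS."""
    slack = sum(math.ceil(s / aus_bytes) * aus_bytes - s
                for s in file_sizes_bytes if s > 0)
    return slack / 1e6

# Assumed workload: 10,000 small files of ~2 KB each.
sizes = [2048] * 10_000
for aus in (4096, 131_072, 33_554_432):  # 4 KB, 128 KB, 32 MB
    print(f"{aus:>10} B clusters -> {total_slack_mb(sizes, aus):>10.1f} MB wasted")
```

For this workload the waste grows from about 20 MB at 4 KB clusters to over 1 GB at 128 KB clusters, which is the pattern the answer describes.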

4
  • Very lucid breakdown. But does each cluster have any inherent storage overhead (e.g. indices or the cluster equivalent of sector headers)? And are there any interactions with physical/emulated sector sizes or cache sizes? Lastly, do larger cluster sizes negatively affect random access performance? 4KB sector HDDs seem to have lower random access performance even though they have higher throughput than 512byte HDDs. Commented Apr 27, 2012 at 2:40
  • 2
    There is no significant storage overhead at a high level. Besides, there is already enough hardware overhead, since the actual physical sector size is 512 bytes... Part of the file system format records the cluster information, from how many sectors each cluster is built, up to the partition structure. Sector size emulation is the disk driver's job. The OS file system layer handles logical organization (NTFS, FAT, etc.) at high-level OS operations and smallest-unit reads/writes at low-level OS operations, and the disk driver itself must work back to back with the controller (hardware) for low-level hardware... Commented Apr 27, 2012 at 3:33
  • ...access, which includes the emulation. And caching is not the OS's job; it is done by the hardware itself. The OS asks for certain data, and the disk decides whether to look in its cache or on the platter... Random access performance shouldn't really be a general performance criterion when tuning a parameter like AUS. Think of it this way: ... Commented Apr 27, 2012 at 3:33
  • .. N-sized units, M units, an N*M capacity disk: "what is the probability of hitting this unit?" And remember, the disk has to be more precise in locating the beginnings of the units. So random access performance is roughly bound by M^2/N. 4K units, 8 units, a 32K capacity disk: RA bound is 64/4. 8K units, 4 units, same capacity, same disk: RA becomes 16/8. You won't find an article about this kind of calculation, but believe me :) It is more work to "randomly" locate data using large unit sizes than small ones. Commented Apr 27, 2012 at 3:50
7

For filesystems with tons of small files, use 8 KiB cluster sizes. For filesystems with only large files, like media, use 128 KiB clusters. If not sure, use 8 KiB cluster sizes. There is negligible speed improvement for clusters larger than 128 KiB (see the top-left plot in the group of plots below), but the possibility for huuuuuge disk usage if you go to larger clusters.

Example: for my case, with tons of small files (just over 1 million files, comprising 74 GB):

  1. 8 KiB cluster size --> 82 GB of storage space used up for my 74 GB of data; 338 MB/s write speed
  2. 128 KiB cluster size --> 194 GB of storage space used up for the same 74 GB of data; 390 MB/s write speed
  3. 32 MiB cluster size --> 32768 GB (32.8 TB) (nope! not an error) of storage space used up for my 74 GB of data; 428 MB/s write speed

Microsoft's default values for cluster size max out at 128 KiB too. See the table at the end of my answer, from Microsoft.

Study these plots I meticulously made. The top-left plot trends apply regardless of how many small files you have, as I did those tests by rsyncing over a 5.3 GB file to the exFAT external SSD, but the other three plots are exacerbated by the quantity of tiny files I have. The log-linear trend in the bottom-right applies to everyone, but its slope, and therefore the y-axis values, depends on how many small files you have.

[Image: four plots of write speed and disk usage vs. exFAT cluster size]

I have a full writeup on it on my website here if interested: https://gabrielstaples.com/exfat-clusters/

Full Python matplotlib/numpy plotting code is here: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/stack_exchange/format_exFAT_PLOTS.py


I'm not answering the OP's question about "recording HDTV programs". I'm answering for people with lots of small files, including those doing a full disk backup, as they will certainly land on this page too.


I'd choose the smallest allocation unit possible if you have a lot of small files. This avoids wasted space for tiny files. Ex: use 4 KiB allocation unit size on exFAT instead of 128 KiB.

I just backed up 74 GB of data from an Apple APFS filesystem onto an external SSD with an exFAT filesystem with 128 KB allocation unit sizes (the default apparently when formatting to exFAT using Gnome Disks in Linux Ubuntu), and on the external drive, the formerly-74 GB of data on the APFS filesystem now takes up a whopping 194 GB on the exFAT filesystem! That's nuts! That's 2.62x more space taken, for nothing!

It's because I have thousands and thousands of tiny files that are, for instance, only a couple hundred bytes each. On the old drive with the APFS filesystem, each of those took up a single 512-byte cluster; on the new external drive with the exFAT filesystem, those same tiny files each take up a whopping 128 KiB cluster, which is 128 KiB * 1024 bytes/KiB / 512 bytes = 256 times more storage space on the exFAT drive. Reducing that 128 KiB allocation unit to only 4 KiB, for instance, would take up 128/4 = 32x less space!
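You can estimate this blow-up before reformatting by walking the tree and rounding every file up to whole clusters. A minimal sketch (a hypothetical helper; it ignores directory and filesystem metadata overhead, and the path is a placeholder):

```python
import math
import os

def tree_size_at_cluster(root, cluster_bytes):
    """Bytes the files under `root` would occupy if each were stored in whole
    clusters of `cluster_bytes` (metadata overhead not counted)."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            size = os.path.getsize(os.path.join(dirpath, name))
            # Round each file up to a whole number of clusters
            # (a zero-byte file counts as 0 here).
            total += math.ceil(size / cluster_bytes) * cluster_bytes
    return total

# e.g. compare candidate cluster sizes before reformatting:
# print(tree_size_at_cluster("/path/to/backup", 4096))        # 4 KiB clusters
# print(tree_size_at_cluster("/path/to/backup", 128 * 1024))  # 128 KiB clusters
```

Running it with 4 KiB vs. 128 KiB on a tree full of tiny files shows the same multiplier the backup above suffered.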

See also

  1. My answer: Unix & Linux: Create and format exFAT partition from Linux - I show the commands for both Linux Ubuntu 20.04 with mkexfatfs, and Linux Ubuntu 22.04 with mkfs.exfat.

  2. My answer on how to determine the cluster size of any filesystem you have: Server Fault: How to find the cluster size of any filesystem, whether NTFS, Apple APFS, ext4, ext3, FAT, exFAT, etc.

  3. [REALLY USEFUL] Support.Microsoft.com: Default cluster size for NTFS, FAT, and exFAT:

    Default cluster sizes for exFAT

    The following table describes the default cluster sizes for exFAT.

    Volume size     Default cluster size (Windows 7, Windows Server 2008 R2, Windows Server 2008,
                    Windows Vista, Windows Server 2003, Windows XP)
    7 MB–256 MB     4 KB
    256 MB–32 GB    32 KB
    32 GB–256 TB    128 KB
    > 256 TB        Not supported
3
  • "mainly used for recording HDTV programs" Commented May 21, 2023 at 19:34
  • 1
    @JoepvanSteen, true for the OP. But, if someone asks the same question but says "for tons of tiny 50 byte files", it will surely get closed as a duplicate of this one. I'm just covering that case now for anyone who lands here. Commented May 21, 2023 at 19:41
  • 1
I had exactly this issue! It even made my SSD invisible to macOS. I have millions of very small image files, 30-50 KB in size. Commented May 29 at 9:43
6

Given that HD recordings are large files, a large allocation unit (16384 or 32768 KB) will give better performance. The impact of slack space (space wasted because allocation units are not fully used; files are stored in whole allocation units) will be limited with a small number of files. On the other hand, if you have many smaller files, use a smaller allocation unit to reduce wasted space.
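A quick back-of-the-envelope check of that "limited impact" claim, assuming (as a rough rule of thumb) that each file wastes about half a cluster on average, and using a made-up recording count:

```python
# Rough slack estimate: each file wastes ~half a cluster on average.
cluster = 32 * 1024 * 1024   # 32 MiB allocation units
n_recordings = 100           # assumed number of large HDTV files
wasted_gib = n_recordings * (cluster / 2) / 2**30
print(f"~{wasted_gib:.1f} GiB wasted")  # ~1.6 GiB
```

About 1.6 GiB of slack on a 2 TB drive is indeed negligible; with hundreds of thousands of small files, the same formula gives a very different answer.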

2
I wonder if someone has done a speed test comparison table/chart with different allocation sizes. I couldn't find anything googling, though.
    – Shayan
    Commented Dec 9, 2022 at 9:59
  • 1
as a downside you will lose up to 32 MB for every small file (like metadata or subtitles or just a txt file)
    – Zibri
    Commented Jan 17, 2023 at 21:15
3

You can safely use a 4K allocation unit for exFAT. Even if you have thousands of small files you won't waste a lot of space. With the default 128 KB allocation unit on e.g. a 64 GB USB stick, 1024 files of 4 KB each will occupy 128 MB instead of 4 MB, since every file requires at least one allocation unit.

If you use your disk mostly for audio and video files use a larger allocation unit.

FAT32 is not an option for disks larger than 32 GB, so choose whatever Windows allows.

4
  • Which size is a good intermediate? I'd like to store both very small and very large files.
    – PythonNut
    Commented Oct 19, 2015 at 17:24
  • 1
    @PythonNut: 4k. Always use 4k. There is no significant benefit to larger allocation units, but if you ever might store small files on the drive, there are huge disadvantages to larger units. Commented Oct 29, 2018 at 3:10
  • 1
    When formatting a 4TB drive as exFAT the smallest AUS that Windows 10 will offer me is 256kb. I'm not sure if 4k is available on smaller drives, or if you were thinking of NTFS.
    – Codemonkey
    Commented Nov 7, 2019 at 12:42
  • 4kb? I can't remember but I guess the minimum is not 128kb? Commented Feb 4, 2020 at 23:55
0

Basically, the larger the files you intend to keep, the larger the allocation unit size you may want to use (but not too big or too small!). I think DragonLord explained it pretty well.

So if wasted space bugs you, then maybe you should think about using a different file system, something like ext4 perhaps. The problem there is that Microsoft OSes (Windows, really) don't work well with anything other than FAT (vFAT, FAT32, etc.) or NTFS. And if you ever end up with files larger than 4 Gig, you may end up cursing any FAT-type system you are using. Therefore, I would recommend the NTFS file system with the recommended allocation unit size (I believe that's 4K). That way, if you end up with files larger than 4 Gig you will still be able to store your monster files, at least until you can break them up or transcode them into something smaller. (I assume we're talking about huge multimedia files, which is why I bring up transcoding; I always seem to find ways to make files smaller when I transcode, especially if they were recorded using MCE.)

About the only reason I can see for using FAT (vFAT, FAT32, FAT16, etc.) is so that other operating systems can read/write files on the storage device. FAT is about as universally accepted as it gets. Otherwise, I don't recommend using FAT (unless the device's capacity is 4Gig or less) - use NTFS at least for Windows. You can always make another partition with a different file system even if it's on the same physical drive. Hope it helps.

-1

As Wikipedia says:

To provide improvement in the allocation of cluster storage for a new file, Microsoft incorporated a method to pre-allocate contiguous clusters and bypass the use of updating the FAT table.

So basically you could choose a 4 KB or smaller allocation unit with exFAT and be safe when writing bigger files, like HD video material.

