
I'm specifically interested in the answer for NAND storage (SD cards, memory sticks).

I have a Windows 7 laptop where I run two VMs inside VMware, as well as several programs in the host OS. VM disk access is highly random, and using a 16 GB memory stick with ReadyBoost gave my system an excellent performance improvement. I formatted the flash drive with exFAT so that I could allocate a 16 GB sfcache file. I used a 32 MB cluster size, but I'm not sure that's optimal.

My understanding of cluster size is that larger clusters waste space on small files but give faster reads and less allocation-table overhead if you mostly have large files. However, I realised that ReadyBoost only helps with random I/O, not sequential I/O. This made me wonder whether a smaller cluster size would be better for random I/O and a larger cluster size better for sequential I/O.
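To put rough numbers on the allocation-table overhead part, here is a quick back-of-the-envelope sketch in Python (assuming exFAT's 4-byte FAT entries and my 16 GB volume; the cluster sizes mirror the ones I tested):

    # Rough FAT-size estimate for a 16 GB exFAT volume at different cluster
    # sizes; exFAT keeps one 32-bit (4-byte) FAT entry per cluster.
    VOLUME_BYTES = 16 * 1024**3
    FAT_ENTRY_BYTES = 4

    for cluster in (1024, 4096, 8192, 32 * 1024**2):
        entries = VOLUME_BYTES // cluster
        print(f"{cluster:>10,} B clusters -> {entries:>10,} FAT entries "
              f"(~{entries * FAT_ENTRY_BYTES / 1024**2:,.2f} MiB of table)")

By this estimate, 1 KB clusters need roughly 64 MiB of allocation table, versus only about 2 KB at 32 MB clusters.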

I ran some I/O tests on the 16 GB thumb drive.
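As a rough illustration of what such a random-vs-sequential measurement involves, here is a minimal Python sketch (the path is hypothetical, and a user-mode loop like this does not bypass the Windows file cache, so its numbers are only indicative; a dedicated benchmark tool is more trustworthy):

    # Minimal random-vs-sequential 4 KiB read timing; buffering=0 avoids
    # Python-level buffering but NOT the OS cache, so treat numbers as rough.
    import os, random, time

    PATH = r"E:\testfile.bin"   # hypothetical test file on the flash drive
    BLOCK, COUNT = 4096, 2048   # request size and number of requests

    size = os.path.getsize(PATH)
    with open(PATH, "rb", buffering=0) as f:
        t0 = time.perf_counter()
        for _ in range(COUNT):              # sequential pass
            f.read(BLOCK)
        seq = COUNT * BLOCK / (time.perf_counter() - t0) / 1024**2

        t0 = time.perf_counter()
        for _ in range(COUNT):              # random pass
            f.seek(random.randrange(0, size - BLOCK))
            f.read(BLOCK)
        rnd = COUNT * BLOCK / (time.perf_counter() - t0) / 1024**2

    print(f"sequential ~{seq:.1f} MiB/s, random ~{rnd:.1f} MiB/s")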

The images show the results for cluster sizes of 1 KB, 4 KB, 8 KB, and 32 MB.

[Benchmark screenshots: random and sequential I/O results for each of the four cluster sizes.]

1 KB had the worst random I/O speed, while 4 KB had the best (only slightly slower than 8 KB for sequential I/O, possibly within the margin of error). I'm not sure how to interpret these results. Is it possible the manufacturer optimized the device for the default Windows allocation unit size?

2 Answers


If the files you're creating are written once when the volume is set up and don't change size afterwards, you are fine with the defaults. But if you have relatively few files and folders that grow and shrink, it is better to go slightly above the defaults (8 KB or 16 KB). Remember that files and folders smaller than the cluster size still occupy a full cluster on disk.
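To make that concrete, here is a small sketch (with made-up file sizes) showing how a file's on-disk footprint rounds up to whole clusters:

    import math

    def allocated(size_bytes, cluster_bytes):
        """On-disk footprint: file size rounded up to whole clusters."""
        return math.ceil(size_bytes / cluster_bytes) * cluster_bytes

    for size in (200, 5_000, 100_000):            # hypothetical file sizes
        for cluster in (4096, 16_384, 32 * 2**20):
            waste = allocated(size, cluster) - size
            print(f"{size:>7} B file @ {cluster:>10} B clusters "
                  f"-> {waste:>10,} B of slack")

A 200-byte file in a 32 MB cluster wastes nearly the whole 32 MB, which is why huge clusters only make sense when the files are large.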

It is worth experimenting with cluster sizes for FAT32, exFAT, and NTFS depending on how a partition will be used, so you can extract the most performance from your storage, although I strongly recommend keeping the value as close to the defaults as possible. Also take into account the drive's minimum I/O request size: usually 512 bytes for HDDs and 4096 bytes for SSDs.
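If you want to check what a mounted volume is actually using, one way is a sketch via the Win32 GetDiskFreeSpaceW call (the drive letter is an assumption):

    # Query bytes-per-sector and sectors-per-cluster for a mounted Windows
    # volume with the Win32 GetDiskFreeSpaceW API (Windows only).
    import ctypes
    from ctypes import wintypes

    root = "E:\\"            # hypothetical drive letter of the volume
    spc, bps = wintypes.DWORD(), wintypes.DWORD()
    free, total = wintypes.DWORD(), wintypes.DWORD()

    if not ctypes.windll.kernel32.GetDiskFreeSpaceW(
            root, ctypes.byref(spc), ctypes.byref(bps),
            ctypes.byref(free), ctypes.byref(total)):
        raise ctypes.WinError()

    print(f"bytes/sector = {bps.value}, "
          f"cluster = {spc.value * bps.value} bytes")

Note that the sector size reported here is the logical one; the underlying NAND page size of a flash device can be larger.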

I'll give an example: I have a Windows 10 installation on a 480 GB SATA III SSD, plus a 500 GB HDD that I use to store game launchers and the games themselves, so I dedicated separate partitions to these different purposes:

  1. 16 GB NTFS @ 8 KB cluster size - all game stores (Origin, Steam, Epic, Uplay)
  2. 384 GB NTFS @ 16 KB cluster size - games that still receive updates
  3. 64 GB exFAT @ 32 KB cluster size - old games that no longer change

The default cluster size for NTFS is 4096 bytes; note that NTFS compression is not available with clusters larger than this, if that matters to you.

The bigger the cluster, the less fragmentation you get when files change size a lot or are deleted and recreated at different sizes. In the real world, workloads are neither purely random nor purely sequential; they are mixed. But larger clusters can make mixed workloads less random and more sequential, and many small operations are worse than fewer, larger ones.

  • I agree with this, and would add a second consideration in support: if you use System Restore (which creates snapshots on NTFS), its overhead multiplies with smaller cluster sizes, and Windows will by design occasionally also defragment SSDs that use System Restore. 8 KB/16 KB clusters versus 4 KB significantly reduce metadata fragmentation, since there is less metadata overhead.
    – Chris C
    Commented May 28, 2023 at 4:27

I personally would not bother, as deviations from standard formatting have nearly always caused me problems down the road, if not immediately, in terms of reliability and support. Deviating from a standard should be well considered, because the code paths for this kind of configuration are not necessarily as well tested as the standard general case.

Unless you have a real need, I would not experiment, especially with any data you want to keep long term.

  • I'm not concerned about long-term storage; I just want to optimize the one use case: random access. I was also curious about the different results.
    – Jim
    Commented May 18, 2014 at 10:42
