So, this took a lot of digging but I did eventually figure this out. I learned a lot about our servers through this issue.
Background of the folders
To begin, we have two folders in question. These folders are 100% identical in terms of their data, down to the binary. These folders live on one of our servers.
This specific server was recently taken offline and upgraded from Windows Server 2008 to 2012 to 2016. Along with every other file on this server, one of the folders stayed on the volume while the server went through its upgrades. The other folder was duplicated from a snapshot of the server in its 2008 state and then placed onto the current 2016 server. So we have the original folder and the duplicated folder. The discrepancy is that the duplicated folder takes up more space on disk than the original.
What I tried
My line of reasoning was to drill down into the duplicated folder and find out whether all files had a mismatching size on disk or only certain ones. To make this a lot easier on myself I used WizTree by Antibody Software, which is similar to WinDirStat except that by default it has a column showing each file's size on disk as well as its size. WinDirStat only shows size, I believe. So I drilled down and found that not every subfolder or file had a mismatching size on disk, only some. And the ones that did had something very peculiar to me: files with 0 size on disk, even though they had nonzero size.
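The same check can be sketched in a few lines of Python. This is only an illustration of the filtering step: the `FileEntry` type, the sample paths, and the sizes are all made up for the example, and the listing is assumed to come from whatever tool reports both size and size on disk.

```python
from typing import List, NamedTuple


class FileEntry(NamedTuple):
    """One row of a hypothetical file listing (names are illustrative)."""
    path: str
    size: int          # logical size in bytes
    size_on_disk: int  # allocated size in bytes, as reported by the tool


def zero_allocation_files(entries: List[FileEntry]) -> List[FileEntry]:
    """Return files that hold data but report no allocated space."""
    return [e for e in entries if e.size > 0 and e.size_on_disk == 0]


# Made-up sample data, not taken from the real folders.
sample = [
    FileEntry("docs/readme.txt", 312, 0),
    FileEntry("docs/manual.pdf", 180_000, 180_224),
    FileEntry("docs/empty.log", 0, 0),
    FileEntry("docs/notes.ini", 540, 4096),
]

suspects = zero_allocation_files(sample)
for entry in suspects:
    print(entry.path, entry.size, entry.size_on_disk)
```

Truly empty files are excluded on purpose, since 0 size on disk is expected for them.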
Some NTFS background
That discovery led me to find this answer on another Super User question. In context of my issue this is what I gathered from that answer.
- If a file is so small that the data of the file and the filesystem bookkeeping are less than 1 KB, NTFS will store the data within the file record itself (in the MFT) and no cluster has to be allocated for it. There is no size on disk because there's nothing beyond the file record. This is called a resident file.
- Before Windows 8, the NTFS "size on disk" calculation did not take resident vs nonresident files into account and just rounded each file's size up to the next multiple of the cluster size. Now files with resident data are counted as 0 KB size on disk, meaning that the calculation used in Windows 8 and later is smarter than the one used in Windows 7 and earlier.
- Once a file has passed the threshold from a resident file to a nonresident file, the file cannot go back.
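To make the two calculations concrete, here is a minimal sketch of how I understand them. It is a simplified model, not how Windows actually implements it: the function names are my own, the 4 KB cluster size is the one from my volume, and whether a file is resident is passed in as a flag because it depends on how the file was written, not only on its size.

```python
CLUSTER_SIZE = 4096  # bytes; assumed 4 KB clusters


def size_on_disk_legacy(size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Older reporting: round the size up to a whole number of clusters."""
    clusters = -(-size // cluster_size)  # ceiling division
    return clusters * cluster_size


def size_on_disk_resident_aware(size: int, is_resident: bool,
                                cluster_size: int = CLUSTER_SIZE) -> int:
    """Newer reporting: resident files occupy no clusters, so report 0."""
    if is_resident:
        return 0
    return size_on_disk_legacy(size, cluster_size)


# The same 500-byte file, seen three ways.
print(size_on_disk_legacy(500))                 # 4096
print(size_on_disk_resident_aware(500, True))   # 0
print(size_on_disk_resident_aware(500, False))  # 4096
```

The last two lines are the whole story of my two folders: identical data, but a different resident flag, so a different size on disk.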
How I ended up with exactly the same files that have a different size on disk
The original folder that went through the server upgrades would have had its data rewritten in the process. Windows therefore reran its NTFS calculation, and when it found files that it now knew could become resident files, it updated the bookkeeping to accommodate them. A few hundred files with 4 KB (one cluster) size on disk turned into files with 0 KB size on disk, thereby reducing a 245 MB folder to 244 MB.
When our IT Dept. used the 3rd party duplication software, not only was the data copied, but the bookkeeping information was as well. This application is meant to duplicate everything and that includes the bookkeeping.
The original folder in 2008 took up 245 MB since NTFS did not handle resident files in its calculation. When the data was rewritten, NTFS ran the up-to-date calculation and made certain files resident. The duplicate of the 2008 original had its bookkeeping information copied as well, so the nonresident files stayed nonresident.
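As a sanity check on the numbers, the roughly 1 MB gap is about what a few hundred one-cluster files add up to. The sketch below uses 256 files purely as a hypothetical stand-in for "a few hundred" (I did not count them exactly), with the 4 KB cluster size from above.

```python
CLUSTER_SIZE = 4 * 1024  # bytes; 4 KB clusters
SMALL_FILE_COUNT = 256   # hypothetical count standing in for "a few hundred"


def allocated_bytes(file_count: int, clusters_per_file: int,
                    cluster_size: int = CLUSTER_SIZE) -> int:
    """Total allocated space for files that each use the same cluster count."""
    return file_count * clusters_per_file * cluster_size


# Duplicated folder: the small files are nonresident, one cluster each.
duplicate_small_files = allocated_bytes(SMALL_FILE_COUNT, 1)
# Original folder: the same files are resident, so no clusters at all.
original_small_files = allocated_bytes(SMALL_FILE_COUNT, 0)

difference = duplicate_small_files - original_small_files
print(difference, "bytes =", difference / (1024 * 1024), "MB")
```

With these assumed values the difference comes out to exactly 1 MB, which lines up with 245 MB versus 244 MB.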
So, a succinct answer without all of the background needed to understand how this happened is:
One folder contains resident files, while the other folder does not. This is because one folder had its data rewritten with calculations that handle resident files, while the other folder had its data and bookkeeping information copied from a system that did not handle resident files.
Noteworthy mentions
- I can replicate this issue by copying either the original (244 MB) or the duplicated (245 MB) folder. The new copy will always be 244 MB with resident files.
- If I look at the C: drive's administrative share from a Windows 7 computer, both folders show 245 MB, since Windows 7 calculates the Size on Disk property without accounting for resident files. If I look from a Windows 10 computer, it shows 244 MB for the original and 245 MB for the duplicate, since the calculation is smart enough to look for resident files.
- If (from a Windows 8+ machine) I copy a nonresident file from the duplicated folder that is a resident file in the original folder, the copy results in a resident file since the data was rewritten.