4

Recently I've run across something odd. An exact copy of a folder full of files has a different Size on Disk than the original. I could understand how this could happen between drives with different structures, except that these files are on the same exact drives.

I have run multiple tests to ensure that every file is exactly the same and that all of the properties are exactly the same as well.
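One way a byte-for-byte check like this could be run is sketched below. It is only an illustration, not the test actually used here: the function names are made up, and it reads each file fully into memory, which is fine for small trees but not for very large files.

```python
import hashlib
from pathlib import Path


def folder_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def folders_identical(a, b):
    """True if both trees hold the same relative paths with identical contents."""
    return folder_digests(a) == folder_digests(b)
```

A check like this only compares file contents and names. It says nothing about how the file system has allocated space for each file.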

What could be an explanation for this?

Screenshot of the properties of the two folders.

Screenshot of Disk Management.

Considerations

  1. Compression is not turned on for either folder.
  2. There is only one partition on the drive, ergo both folders would share cluster size, file system (NTFS), etc.
2
  • Out of curiosity, what type of data is stored here? What is this folder? Commented Jun 14, 2019 at 3:52
  • @Appleoddity Everything from 12x12 bitmap images to large pdfs, executables, DLLs, etc. If you take a look at my answer you will see that the issue arose from the extremely small files, and the circumstances surrounding their existence.
    – Tyler N
    Commented Jun 18, 2019 at 15:08

4 Answers

1

Compare Size vs Size on Disk: The contents of the files in your picture above are the same size, but the space taken to store them differs.

Files are stored in chunks ('clusters'), and the size of those chunks depends on the file system used to format the drive and on the cluster size chosen (if you do not accept the default). This Microsoft document shows the default cluster size for each formatting choice and drive size.
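As a rough illustration of the rounding described above, here is a minimal sketch. The 4096-byte cluster size is an assumption (check your own volume), and the function ignores compression, sparse files, and any other special cases.

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Round file_size up to the next whole multiple of cluster_size."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


print(size_on_disk(1))     # 4096
print(size_on_disk(4096))  # 4096
print(size_on_disk(4097))  # 8192
```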

You may also have Alternate Data Stream forks attached to files in the directory off the root.

6
  • Everything you mentioned is true, but how does it apply to my situation in which the same exact files on the same exact drive somehow take up different size on the disk? I'm fully aware of clusters but this isn't just files of the same size, it's exact copies of the same files on the same exact drive (same size, partition and ergo cluster size)?
    – Tyler N
    Commented Jun 13, 2019 at 20:38
  • Files have two sizes: the real data size, which can be 5 MB, and the space reserved in the file system, which can be 100 MB. It depends on how the specific app creates the file.
    – pbies
    Commented Jun 13, 2019 at 22:16
  • @pbies The "real data size" vs "size on disk" or "reserved space in file system" is not in question. The question is how the same exact files would have different "size on disk" between the original and the copies where the only difference is the file path (C Drive vs. subfolder on C Drive)
    – Tyler N
    Commented Jun 13, 2019 at 22:41
  • 1
    @K7AAY Read my reply to that answer, in my testing subfolders do not affect size on disk. And the information he references is for Unix-based systems.
    – Tyler N
    Commented Jun 13, 2019 at 22:47
  • 1
    @K7AAY I have already done exactly that. I usually only ask questions on forums when I've exhausted all of my attempts to answer something on my own. From every test I've run, the number of folders that a file resides in has no effect on the "size on disk".
    – Tyler N
    Commented Jun 13, 2019 at 22:54
1

So, this took a lot of digging but I did eventually figure this out. I learned a lot about our servers through this issue.

Background of the folders

To begin, we have two folders in question. These folders are 100% identical in terms of their data, down to the binary. These folders live on one of our servers.

This specific server was recently taken offline and upgraded from Windows Server 2008 to 2012 to 2016. Along with every other file on this server, one of the folders stayed along for the ride on the volume while the server went through its upgrades. The other folder was actually duplicated from a snapshot of the server in its 2008 state and then placed onto the current 2016 server. So we have the original folder and the duplicated folder. The discrepancy is that the duplicated folder takes up more size on disk than the original.

What I tried

My line of reasoning was to drill down into the duplicated folder and find out whether all files had a mismatching size on disk or only certain ones. To make this task easier I used WizTree by Antibody Software, which is similar to WinDirStat except that by default it has a column showing each file's size on disk as well as its size. WinDirStat only shows size, I believe. I drilled down and found that not every subfolder or file had a mismatching size on disk, only some. The ones that did had something very peculiar to me: files with 0 size on disk, even though they had nonzero size.
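The drill-down described above amounts to comparing two listings file by file. Here is a minimal sketch of that comparison. It assumes you have already exported each folder's listing as a mapping from relative path to `(size, size_on_disk)`; the function name and the sample data are hypothetical.

```python
def find_mismatches(original, duplicate):
    """Return files whose size matches but whose size on disk differs.

    Both arguments map a relative path to a (size, size_on_disk) tuple.
    """
    mismatches = []
    for path in sorted(original.keys() & duplicate.keys()):
        o_size, o_disk = original[path]
        d_size, d_disk = duplicate[path]
        if o_size == d_size and o_disk != d_disk:
            mismatches.append((path, o_size, o_disk, d_disk))
    return mismatches


# Hypothetical sample listings, in bytes.
original = {"icons/a.bmp": (486, 0), "docs/manual.pdf": (5_000_000, 5_001_216)}
duplicate = {"icons/a.bmp": (486, 4096), "docs/manual.pdf": (5_000_000, 5_001_216)}

for path, size, o_disk, d_disk in find_mismatches(original, duplicate):
    print(path, size, o_disk, d_disk)
```

In the sample data only the tiny bitmap is reported, which mirrors the pattern found here: same size, but 0 on disk in one folder and one cluster in the other.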

Some NTFS background

That discovery led me to find this answer on another Super User question. In context of my issue this is what I gathered from that answer.

  1. If a file is so small that its data plus the filesystem bookkeeping come to less than 1 KB, NTFS stores the data within the file's record in the Master File Table (MFT) and no cluster has to be allocated for it. There is no size on disk because there's nothing beyond the file record. This is called a resident file.
  2. Before Windows 8, the NTFS "size on disk" calculation did not distinguish resident from nonresident files and just rounded each file's size up to the next multiple of the cluster size. Now files with resident data are counted as 0 KB size on disk, meaning that the calculation used in Windows 8 and later is smarter than the one used in Windows 7 and earlier.
  3. Once a file has passed the threshold from a resident file to a nonresident file, the file cannot go back.
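The three points above can be modeled roughly as follows. This is a simplified sketch rather than how NTFS or Explorer is actually implemented: the 4096-byte cluster size is an assumption, and residency is treated as a flag carried with the file (per point 3) instead of being derived from its size.

```python
from dataclasses import dataclass

CLUSTER_SIZE = 4096  # assumed cluster size in bytes


@dataclass
class FileRecord:
    size: int       # logical size in bytes
    resident: bool  # True if the data lives inside the MFT record


def round_up_to_cluster(size, cluster=CLUSTER_SIZE):
    return -(-size // cluster) * cluster  # ceiling division


def legacy_size_on_disk(rec):
    """Older calculation: ignores residency, always rounds up."""
    return round_up_to_cluster(rec.size)


def resident_aware_size_on_disk(rec):
    """Newer calculation: resident files occupy no clusters."""
    return 0 if rec.resident else round_up_to_cluster(rec.size)


tiny_resident = FileRecord(size=300, resident=True)
tiny_nonresident = FileRecord(size=300, resident=False)

print(legacy_size_on_disk(tiny_resident), legacy_size_on_disk(tiny_nonresident))  # 4096 4096
print(resident_aware_size_on_disk(tiny_resident), resident_aware_size_on_disk(tiny_nonresident))  # 0 4096
```

Two files with identical contents can therefore report different sizes on disk under the newer calculation, purely because of the residency flag.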

How I ended up with exactly the same files that have a different size on disk

The original folder that went through the server upgrades would have had its data rewritten during the process. Windows therefore reran its NTFS calculation, and when it found files that it now knew could become resident files, it updated the bookkeeping to accommodate them. As a result, a few hundred files with a 4 KB (one cluster) size on disk turned into files with a 0 KB size on disk, reducing a 245 MB folder to 244 MB.
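The arithmetic can be checked with a quick sketch. It assumes 4 KB clusters and treats 1 MB as 1024 × 1024 bytes. The figure of 300 files is only an example of "a few hundred", and Explorer's rounding to whole megabytes means the true difference need not be exactly 1 MB.

```python
CLUSTER_SIZE = 4 * 1024  # assumed 4 KB clusters
MIB = 1024 * 1024


def bytes_saved(files_made_resident, cluster=CLUSTER_SIZE):
    """Space freed when each file gives up exactly one cluster."""
    return files_made_resident * cluster


def files_needed(target_bytes, cluster=CLUSTER_SIZE):
    """How many one-cluster files must become resident to free target_bytes."""
    return -(-target_bytes // cluster)  # ceiling division


print(files_needed(MIB))       # 256
print(bytes_saved(300) / MIB)  # 1.171875
```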

When our IT Dept. used the 3rd party duplication software, not only was the data copied, but the bookkeeping information was as well. This application is meant to duplicate everything and that includes the bookkeeping.

The original folder in 2008 took up 245 MB since NTFS did not account for resident files in its calculation. When the data was rewritten, NTFS ran the up-to-date calculation and made certain files resident. The duplicate of the 2008 original had its bookkeeping information copied as well, so its nonresident files stayed nonresident.

So, a succinct answer without all of the background needed to understand how this happened is: one folder contains resident files, while the other does not. This is because one folder had its data rewritten with calculations that handle resident files, while the other folder had its data and bookkeeping information copied from a system that did not handle resident files.

Noteworthy mentions

  1. I can replicate this issue by copying either the original (244 MB) or the duplicated (245 MB) folder. The new copy will always be 244 MB with resident files.
  2. If I look at the C: drive's administrative share from a Windows 7 computer, both folders show 245 MB, since Windows 7 calculates the Size on Disk property without accounting for resident files. If I look from a Windows 10 computer, one folder shows 244 MB and the other 245 MB, since the calculation is smart enough to look for resident files.
  3. If (from a Windows 8+ machine) I copy a nonresident file from the duplicated folder that is a resident file in the original folder, the copy results in a resident file since the data was rewritten.
0

Windows automatically compresses files that do not get used frequently. You can check whether a folder has been compressed by Windows:

Right click on folder > Properties > Advanced attributes

Check whether "Compress contents to save disk space" is enabled.

1
  • Compression is not turned on for either folder; I will update my question to include that information.
    – Tyler N
    Commented Jun 13, 2019 at 20:02
0

The extra 1MB of space is likely due to added metadata in the inode or index node...

The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block location(s) of the object's data. File-system object attributes may include metadata (times of last change, access, modification), as well as owner and permission data.

Even though they are copies, the bigger one is in a different (nested) directory so, it has a different set of info for indexing.

3
  • 1
    It looks like the "inode" you're referencing is for Unix-based systems, which I believe Windows is not. I believe NTFS is proprietary to Microsoft. Either way, I was unable to replicate the issue by creating a different subfolder and copying the original files into that subfolder. In that scenario the size on disk matched. I would think that if the number of subfolders affected size on disk, the issue would be repeatable.
    – Tyler N
    Commented Jun 13, 2019 at 22:46
  • I did find in my research that the MFT is the NTFS equivalent of the Unix inodes that you mentioned. In NTFS the file's location has no effect on its size on disk because the file's record lives in the MFT, which is kept separately from the clusters that hold the data of your files.
    – Tyler N
    Commented Jun 18, 2019 at 15:16
  • This sounds basically like the chosen answer: a newer version of the filesystem utilities copied files differently, storing tiny files inside the inode/MFT. Part of the chosen answer should also reflect that different versions of the OS/filesystem utilities wrote one set of files; that's vitally important.
    – Xen2050
    Commented Jun 6 at 18:42
