
I have a 2TB Seagate ST2000DM001 HDD with one NTFS partition. I hadn't used it in months, and when I plugged it in again the partition had inexplicably become inaccessible: the volume's letter appears in Windows Explorer, but the partition's size is no longer recognized, and there's an error if I try to open it. It appears as “RAW” in the storage manager. CHKDSK gives up analyzing it right away, with an error message saying that it's unable to determine the version and the state of the volume.

Yet if I open that drive with R-Studio, the partition appears right away with its correct size, and I can open it (no scanning is even required) and access all the files that were there the last time I used it normally, with the whole directory tree; the files' contents seem 100% correct as far as I can see. Likewise, if I open the whole drive with WinHex, it correctly recognizes the partition and displays the files and folders with their correct contents. I also tested two defragmentation programs (in analysis mode only): MyDefrag can list the partition's contents and provides valid information for each block hovered over with the mouse pointer (file name, size, LBA...), but Defraggler can't.

I also opened it with DMDE: like R-Studio, it recognizes the partition's contents instantly, but it also displays a red warning regarding MFT records 1, 2 and 3. These normally correspond to $MFTMirr, $LogFile and $Volume, three important system files, which are indeed missing from the “$MetaData” directory. Back in R-Studio, I can see that those files are also missing from the “Metafiles” directory.

If I examine the beginning of the MFT with WinHex, I can see that MFT record 0 is fine (it points to the MFT itself), but MFT records 1, 2 and 3 are corrupted: they are filled with “FF” (hex) / “ÿ” (ASCII). And the strange thing is that the MFT mirror has the exact same corruption pattern: the first record is fine, then the next three are filled with “FF”. (I can still locate the mirror with WinHex using an old volume snapshot made before the problem appeared; its location is also indicated by R-Studio in its partition properties panel, and apparently both the MFT and MFTMirr have their LBA written in the boot sector.)
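For readers who want to run the same check outside a hex editor, here is a minimal Python sketch. It assumes the commonly documented NTFS boot sector layout (OEM ID at offset 0x03, bytes per sector at 0x0B, sectors per cluster at 0x0D, $MFT and $MFTMirr cluster numbers at 0x30 and 0x38, record size code at 0x40). The function names are illustrative, and the offsets it returns are relative to the start of the volume, not the start of the disk.

```python
import struct

def parse_ntfs_boot_sector(bs: bytes) -> dict:
    """Extract the $MFT and $MFTMirr locations from an NTFS boot sector."""
    if bs[3:11] != b"NTFS    ":
        raise ValueError("OEM ID is not 'NTFS    '")
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", bs, 0x0B)
    mft_lcn, mirr_lcn = struct.unpack_from("<QQ", bs, 0x30)
    (raw_size,) = struct.unpack_from("<b", bs, 0x40)
    cluster_size = bytes_per_sector * sectors_per_cluster
    # A negative code means the record size is 2**(-code) bytes (-10 -> 1024).
    record_size = 2 ** (-raw_size) if raw_size < 0 else raw_size * cluster_size
    return {
        "cluster_size": cluster_size,
        "record_size": record_size,
        "mft_offset": mft_lcn * cluster_size,        # bytes from volume start
        "mftmirr_offset": mirr_lcn * cluster_size,   # bytes from volume start
    }

def is_wiped(record: bytes) -> bool:
    """True if the record consists only of 0xFF bytes, as described above."""
    return len(record) > 0 and record == b"\xff" * len(record)
```

Reading `record_size` bytes at `mft_offset + n * record_size` (and the same at `mftmirr_offset`) from an image of the volume, then passing them to `is_wiped`, would reproduce the observation made in WinHex for records 1 to 3.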

Now, my guess is that the partition is inaccessible because those three MFT records are missing, so the corresponding files can't be found, and CHKDSK must require at least those files to operate properly. How could that happen? How could both the MFT and its mirror (which is actually only a copy of the first 4 records, but in this particular case that should have been enough to fix the issue, since the 3 corrupted records are among those 4) end up corrupted at the same time?
And how could I fix / recreate those missing MFT records, so as to repair the partition “in place”, instead of extracting all the files (which I already did as a safety measure), re-formatting the partition, and transferring them back? I could copy the valid records from another partition and change the variable values, knowing the template, but so far I could only identify the timestamps (which I can copy from other system files on the same partition, as they're all created at the exact same time); I can't yet locate the fields indicating the size and the cluster location. I also found out that $Volume, which is a resident file (located entirely in the MFT), contains the partition's unique identifier, which might be the most problematic hurdle here: does it have to be the same as before for the partition to be properly recognized? If so, is it stored somewhere in the system? Or can it be chosen at random, and if so, is there a particular pattern it has to conform to? Information about the basic structure of MFT records seems to be scarce, or very hard to find among thousands of pages of meandering forum threads or articles too broad in scope to be of any use in a case like this.

Screenshots: the partition opened in DMDE, with three error warnings; WinHex showing what should be the MFT record of $Volume; the valid MFT record of $Volume on another partition.

I described the issue in more detail on HDDGuru, but didn't get a relevant answer to the question “how can I fix it?” (regular contributors there are highly knowledgeable when it comes to hardware / firmware failures, but with this kind of logical failure they seem to give up rather quickly).
http://forum.hddguru.com/viewtopic.php?f=1&t=36969

  • First, I'd image the drive, lest repairs make it worse. Then I'd try opening it on another PC, or from a Live USB, because an OS can get "confused" if an HDD is removed without being ejected... The OS might be caching erroneous information. Perhaps try opening under a different OS, which might be more tolerant of missing MFT. These are not fixes, but easy to implement quick tests. Commented Jun 20, 2018 at 21:19
  • Read cgsecurity.org/wiki/Advanced_NTFS_Boot_and_MFT_Repair and this section Repair An NTFS MFT and see if testdisk helps you.
    – cybernard
    Commented Jun 20, 2018 at 21:44
  • This is super weird. The MFT "mirror" (which is not actually a mirror) is in a fixed location inside the filesystem (it starts at half of it) and it can be rebuilt from the MFT itself. It's really strange that chkdsk doesn't work. Commented Jun 21, 2018 at 13:22

1 Answer


So, I managed to fix the issue on my own. I did some research about the structure of MFT records in general, and the particular structure of the 3 corrupted records which I had to re-create (see the linked HDDGuru thread for details), while examining them on several valid partitions. Then basically I copied them from a valid partition into a temporary file in WinHex, changed some key values which differed from one partition to another, and copied the 3072-byte file (three 1024-byte records) directly onto the partition. I then ran CHKDSK, which could proceed and (after some trial and error) reported that the volume had no errors, and now the partition is normally accessible again. I still don't know how it happened, but it's fixed!
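The copy step itself is just offset arithmetic, and can be sketched in Python as below. This is an illustration on an in-memory copy of an image, not the WinHex procedure I actually used; the function name is made up, and the 1024-byte record size is an assumption that follows from 3 records making up 3072 bytes. Anything like this should only ever be tried on an image or after a full backup.

```python
RECORD_SIZE = 1024  # assumption: 3072 bytes / 3 records

def splice_records(image: bytearray, mft_offset: int, donor: bytes,
                   first_record: int = 1) -> None:
    """Overwrite consecutive MFT records in `image`, starting at `first_record`.

    `mft_offset` is the byte offset of MFT record 0 inside the image and
    `donor` holds the already edited replacement records, back to back.
    """
    if len(donor) == 0 or len(donor) % RECORD_SIZE != 0:
        raise ValueError("donor must be a whole number of records")
    start = mft_offset + first_record * RECORD_SIZE
    end = start + len(donor)
    if end > len(image):
        raise ValueError("donor records would run past the end of the image")
    # Same-length slice assignment: the image size does not change.
    image[start:end] = donor
```

With the default `first_record=1`, record 0 (the MFT's own record, which was intact) is left untouched and records 1 to 3 are replaced.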

The values I had to change were:
– the timestamps: all system files have the same timestamps, so I just copied the timestamps fields from the first MFT record (which points to the MFT itself) on the corrupted partition;
– there's a short field in each record called “fixup” in DMDE (2 bytes in the standard NTFS record layout), present at three distinct spots in each record, which, as someone explained to me on the HDDGuru thread, is just “a check to make sure the record is valid and not corrupt” and can be set to any value, so long as it's the same in all three instances of that field;
– the first cluster LBA for the $LogFile record (I knew where it was located thanks to the old WinHex volume snapshot, otherwise I would have had to search the file based on its header to get that value; its default size is exactly 64MB, or 67108864 bytes, same on all partitions I examined);
– for the $Volume record (which also contains the actual $Volume file, since that file is “resident”, i.e. entirely contained in its MFT record), I had to find the original 16-byte ID (or “object identifier”) of the volume, which was the trickiest part; after some unsuccessful attempts, I found that value inside the “tracking.log” file, located in the “System Volume Information” directory (I had first checked that file on a valid partition, where the value matched the one in $Volume, so I copied the corresponding field from “tracking.log” on the corrupted partition and pasted it into the volume ID field in $Volume);
– also in $Volume, I changed the volume's name, to have the same name as before, but that was not necessary: I could have left the name from the other partition and changed it later on in the volume's properties once it was accessible again (actually I used a little trick here: I copied the end of the $Volume record from a partition called “TEMP”, then, instead of replacing that name with “Stockage” as the partition was called before, I put “Stoc”, so that it would have the same number of characters, in order to avoid an unexpected offset, and to be sure that the “used size” value would match, since I don't yet fully understand the structure of the record);
– since I changed the volume's name, the length of the file record actually used was different, so I had to change the field corresponding to the “used size” to reflect this and preserve consistency (I put the same “used size” as the one from the partition called “TEMP”);
– there was another value which was different, at the beginning of the $Volume record, called “LSNlo” in DMDE: based on my research, “LSN” stands for “$LogFile Sequence Number”, and it's a reference to the last change recorded in the $LogFile regarding a given file record in the $MFT. It's not crucial for consistency, and anyway I found out that the $LogFile, being limited in size, regularly “purges” old records, so, since I hadn't used that drive in months, the record corresponding to the value that was there before it was wiped has been deleted, so I just put zeroes in that field. [2022/04/09] {Re-reading this post, and editing it primarily to fix a typo, I'm not sure about what I wrote in that part: if the drive hadn't been used in months, the $LogFile entries were unlikely to have been “purged”... but anyway, that value is not crucial since it worked without it. And I'm glad if it could help some people, based on the number of votes these two posts garnered over the past few years.}
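Several of the fields listed above sit in the fixed header of each record, so they can be read back and sanity-checked before running CHKDSK. The Python sketch below assumes the commonly documented “FILE” record header layout (fixup array offset and count at 0x04 and 0x06, LSN at 0x08, used and allocated sizes at 0x18 and 0x1C) and 512-byte sectors; the function names are illustrative, and it is not what I used at the time.

```python
import struct

SECTOR = 512  # assumption: the fixup value ends each 512-byte sector

def parse_record_header(rec: bytes) -> dict:
    """Read the header fields discussed above from one MFT record."""
    if rec[:4] != b"FILE":
        raise ValueError("record does not start with the FILE signature")
    usa_offset, usa_count = struct.unpack_from("<HH", rec, 0x04)
    (lsn,) = struct.unpack_from("<Q", rec, 0x08)
    used_size, allocated_size = struct.unpack_from("<II", rec, 0x18)
    return {
        "usa_offset": usa_offset,   # where the fixup array starts
        "usa_count": usa_count,     # 1 + number of sectors in the record
        "lsn": lsn,                 # "LSNlo" and its upper half together
        "used_size": used_size,
        "allocated_size": allocated_size,
    }

def fixups_consistent(rec: bytes) -> bool:
    """Check that the fixup value also ends every sector of the record."""
    h = parse_record_header(rec)
    fixup = rec[h["usa_offset"]:h["usa_offset"] + 2]
    sectors = h["usa_count"] - 1
    return all(
        rec[(i + 1) * SECTOR - 2:(i + 1) * SECTOR] == fixup
        for i in range(sectors)
    )
```

For a 1024-byte record this checks exactly the three spots mentioned above: the value in the fixup array and the last two bytes of each of the two sectors.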


@DrMoishe Pippik: As a measure of safety, I extracted the whole contents with R-Studio to another drive, prior to attempting this in-place fix. I also made a partial backup of the first 5GB (which is enough to contain all the relevant filesystem structures – although it should be stressed that it's not always enough to get the whole MFT, I learned it the hard way !). I haven't tried to access the drive on another computer, but I don't see how it would have been different (on a Windows system anyway – perhaps it would still have been recognized and accessible on a Linux system), since those three MFT records appeared effectively wiped in WinHex, and the issue persisted after a reboot.

@cybernard: I tried TestDisk; that was one of my first ideas after checking the S.M.A.R.T. status (which was and is still fine). It did find the partition, and could list the files (just like the other tools I mentioned), but it couldn't fix the issue, since all it can do in a case of MFT corruption is repair the first 4 records by copying them from $MFTMirr, and here those three corrupted records were also corrupted in $MFTMirr, in exactly the same way.
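To make the limitation concrete, here is a small Python sketch of the decision a mirror-based repair has to make for each of the first 4 records. It is not TestDisk's actual code: the function names, the status strings and the “starts with FILE” validity test are simplifications chosen for illustration, and the 1024-byte record size is an assumption.

```python
RECORD_SIZE = 1024   # assumption: 1024-byte MFT records
MIRRORED = 4         # $MFTMirr holds copies of the first 4 records

def record_ok(rec: bytes) -> bool:
    """Crude validity test: a usable record starts with the FILE signature."""
    return rec[:4] == b"FILE"

def classify(mft: bytes, mirror: bytes) -> list:
    """Say, for records 0..3, whether a copy from the mirror could help."""
    statuses = []
    for i in range(MIRRORED):
        a = mft[i * RECORD_SIZE:(i + 1) * RECORD_SIZE]
        b = mirror[i * RECORD_SIZE:(i + 1) * RECORD_SIZE]
        if record_ok(a):
            statuses.append("ok")
        elif record_ok(b):
            statuses.append("restore from mirror")
        else:
            statuses.append("lost in both")
    return statuses
```

On this drive, records 1 to 3 would all come out as “lost in both”, which is why copying from the mirror could not help.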

@Andrea Lazzarotto: According to my observations, $MFTMirr is always located at sector 16, thus at the very beginning of the volume, way before the actual $MFT, which is by default around the 3GB mark. CHKDSK probably didn't work because $MFTMirr was also corrupted, and so it couldn't access the apparently crucial $Volume file, which had been wiped as well since it is part of its MFT record. [2022/04/09] {This is true for Windows 7 and most likely also 8-10-11. The default location of the $MFT and $MFTMirr areas is linked to the version of Windows used to format the partition, and has changed several times over the operating system's history.}
