
To verify the integrity and refresh the magnetic strength of data stored on disks I use to archive data (intended to last 30 years or more), I want to read and re-write every block of data on each drive every year or two. Some of the disks are HFS+ and some are NTFS. This answer suggests a utility that will do that when run from a Windows machine, but I don't have a Windows machine handy, and even if I did, I don't think the Windows utility would work with HFS+ disks.

I want to make sure that I am refreshing important "hidden" data like the partition map itself, so I'm looking for a procedure that I can run on a Mac that will simply treat the disk like raw block storage and just read and re-write each block on the disk, but at the same time provide enough information to call out which files are damaged if it encounters a read or write error. (Since I have 2 archive copies of everything, I hope I can recover a bad file on one archive with a good file from the other archive.)

I can think of a bunch of ways to read all the data on the disk if I can get the Mac to mount it as a raw drive, but no satisfactory way to write the data back to the same block or to identify which file a bad block belongs to.
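The closest I've come is a chunk-by-chunk dd loop along the lines of the sketch below (the device name and chunk count are placeholders I'd have to fill in by hand), but I'm not confident it is safe or complete, and it certainly can't tell me which file a failing chunk belongs to.

#!/bin/sh
# Rough sketch only (run as root). The device name and chunk count below are
# hypothetical placeholders; check `diskutil list` for the real identifiers.
DEV=/dev/rdisk2                # raw device of the archive disk
BS=1048576                     # work in 1 MiB chunks
NCHUNKS=1907730                # disk size in bytes divided by BS, rounded up
TMP=$(mktemp /tmp/chunk.XXXXXX)

diskutil unmountDisk disk2     # never write to the raw device while it is mounted

i=0
while [ "$i" -lt "$NCHUNKS" ]; do
    # read one chunk into a temp file, then write the same bytes back in place
    dd if="$DEV" of="$TMP" bs="$BS" skip="$i" count=1 2>/dev/null || echo "read error at chunk $i"
    dd if="$TMP" of="$DEV" bs="$BS" seek="$i" count=1 2>/dev/null || echo "write error at chunk $i"
    i=$((i + 1))
done
rm -f "$TMP"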

A solution that re-writes the data would still be helpful even if it cannot flag which file is affected when a bad block is found. If you know of a solution that works only on Linux or Windows, I'd like to hear about it as long as it can handle both HFS+ and NTFS drives. Also, if you know of a utility that can determine which file a bad block is part of, given a raw block ID, that would be useful, too, as half of a two-part solution.

1 Answer


First, a remark: for long-term archiving, a hard disk is not the best medium. Current M-Disc technology can keep your data good for a thousand years. These discs used to be costly, but their price has come down; for example, on Amazon a Verbatim M-Disc 5-pack of 25 GB BD-Rs is currently $14.27, for a total of 125 GB. You will also need a compatible burner. The advantage is that the data requires very little maintenance once it has been written.

Now about magnetic hard disks: studies have shown that shelved disks lose about 1% of their magnetism every year. Although it would take more than 50 years to lose more than 50% of the magnetic field, it is still advisable to do a preventive refresh every 3-5 years.

It also turns out that modern disk drives will rewrite every sector whose magnetic field has gone below a certain built-in threshold. If the disk is left powered on long enough, every sector will eventually be checked by the firmware. If you don't wish to wait, all you need to do is force a read of the entire disk (a surface scan) so that every sector gets verified.

Some commands that can read the whole disk are:

# stream every sector of the raw device and discard the data (read-only)
sudo cat /dev/rdisk0 > /dev/null

# read-only badblocks surface scan: 4 KiB blocks, 32768 blocks tested at a time;
# -p 1 repeats until a full pass finds no new bad blocks
sudo badblocks -b 4096 -p 1 -c 32768 /dev/rdisk0

You should also keep an eye on the S.M.A.R.T. statistics of the disk. The Backblaze article Hard Drive SMART Stats lists five S.M.A.R.T. metrics that indicate impending disk drive failure:

  • SMART 5 – Reallocated_Sector_Count
  • SMART 187 – Reported_Uncorrectable_Errors
  • SMART 188 – Command_Timeout
  • SMART 197 – Current_Pending_Sector_Count
  • SMART 198 – Offline_Uncorrectable

Backblaze uses metric 187 as its criterion and recommends replacing the drive once it becomes non-zero. However, the other metrics are just as serious: for example, metric 197 counts sectors that could not be read reliably and are pending reallocation. For archiving, I would say it is preferable that all of these metrics remain at zero.
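If you install smartmontools (for instance via Homebrew, which is an assumption about your setup), you can check just those five attributes with something like:

# assumes smartmontools is installed (e.g. `brew install smartmontools`);
# disk0 is an example device -- external USB enclosures may also need `-d sat`
sudo smartctl -A /dev/disk0 | grep -E '^ *(5|187|188|197|198) '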

  • Unfortunately, I would need more than 100 M-Discs, and even then they would not preserve the directory structure, requiring a lengthy effort to reconstitute the drives should the project need to be revived, and then requiring a difficult and/or expensive effort to archive the project in its revised state. Also, badblocks is not present on my Mac. Do you have a reference to support your claim that "drives will rewrite every sector whose magnetic field has gone below a certain built-in threshold"? How can I confirm that my drives actually do that?
    – Old Pro
    Commented Apr 13, 2017 at 18:32
  • (1) badblocks.c and badblocks.h make up a very simple one-file program that you could compile and run. (2) M-Discs nowadays can go up to 100 GB each. (3) An example article referring to disk refresh is here. To know whether your drive does this, you would need information from the manufacturer. You could also just dd the data off and back again on any kind of drive and any kind of disk (see the sketch after these comments).
    – harrymc
    Commented Apr 13, 2017 at 20:01
  • One more possibility among many: the Sony 1.5 TB Write-Once Optical Disc Cartridge ($146) is "guaranteed 50+ years archive life".
    – harrymc
    Commented Apr 13, 2017 at 20:06
  •
    Archival media is still costly; there is no way around it. As I said, cheap magnetic media is a possibility, but the S.M.A.R.T. metrics have to be checked periodically. Such media should also be refreshed from time to time, even with tools as simple as dd, and replaced at the least indication of weakness. Opinions about the frequency of such maintenance vary from every 2 to every 10 years. The best safety is in keeping duplicates. Large enterprise-grade hard disks are more rugged and not that costly, and you only need one, or two for a duplicate.
    – harrymc
    Commented Apr 14, 2017 at 6:52
  •
    You should also be thinking about preserving equipment for reading the disks. Today's disks use SATA, but will it still be available 20-30 years from now? So you should also mothball a computer for reading them. But in 20-30 years, will that computer be able to communicate with modern networks if Ethernet no longer exists? That's why investing in archival media might be worth the price. The other possibility is to transfer the data to new, modern media every decade or two. These are all long-term considerations that need to be addressed now.
    – harrymc
    Commented Apr 14, 2017 at 9:29
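
To illustrate the "dd the data off and back again" suggestion from the comments, here is a minimal sketch, assuming the archive disk is disk2 and that a scratch volume with enough free space is mounted at /Volumes/Scratch (both names are placeholders):

# Hypothetical sketch: image the whole raw disk to a scratch file, then write it back.
diskutil unmountDisk disk2
sudo dd if=/dev/rdisk2 of=/Volumes/Scratch/archive.img bs=1m
sudo dd if=/Volumes/Scratch/archive.img of=/dev/rdisk2 bs=1m

Every sector is both read and rewritten, which also refreshes the partition map and other metadata, but like the surface-scan commands above it cannot tell you which file a failed sector belongs to.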
