
I am creating a copy of a hard drive that has lost data so I can attempt to recover the data from the copy and keep the original safe. I am using dd on a Mac; it has worked very well on other drives and storage media, creating exact duplicates of entire drives.

I had dd running for 45 hours and it copied only 410 GB in that time. I noticed the disk indicator LED periodically stopped flashing for about 5 seconds, and Activity Monitor showed only 360 GB written even though 410 GB had been read. I was using pv in the middle of the pipe to watch progress, and it also reported 410 GB, so I think Activity Monitor is just inaccurate at measuring the amount of written data. In Activity Monitor's graph of read and write operations, only the reads showed any activity; the write graph was completely flat. I thought the second dd process, the one responsible for writing to the output file, had maybe locked up.
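For reference, the pipeline was along these lines (device and file names here are placeholders, not my exact command):

    # Read from the disk, show progress in the middle, write the image at the end
    dd if=/dev/source_disk | pv | dd of=/path/to/image.dmg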

The first attempt, the one that took so long, would write for only a couple of seconds and then immediately pause for a much longer period, and it kept getting worse the closer it got to the end of the operation. When the computer rebooted I found the output file, and it was actually 410 GB, just as the read figure in Activity Monitor suggested, so the write figure was badly off. Activity Monitor always seems to report an incorrect number of written bytes, so I no longer trust what it says. Why is it so inconsistent?

I don't want to wear down the old hard drive from which the data needs to be recovered. It could fail at any moment, especially if I keep having to run it for days on end doing huge numbers of read operations.

Why does dd constantly pause, and why does it get progressively worse over time? It starts off very fast, but with large amounts of data it just gets slower and slower, spending more and more time seemingly doing nothing or waiting for something. What is it doing? Is there a better way to copy a disk to a file and have it be an exact copy of every single byte on the drive?

Here are some images of the IO activity graphs:

[IO activity graph: right after starting]

[IO activity graph: after 4 hours]

[IO activity graph: after 11 hours]

[IO activity graph: after 26 hours]


I ran some tests copying an old USB stick using the different methods I was given. All output files had identical checksums (compared along the lines sketched after this list), and all these methods ran on a single thread. The results are not what I expected:

  • dd: 42:07
  • ddrescue: 47:22
  • cat: 42:36
  • pv: 42:38
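For reference, the checksum comparison was something like this (the image file names are placeholders for whatever the outputs were called):

    # All four images should print the same digest
    shasum -a 256 dd.img ddrescue.img cat.img pv.img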
  • Probably because the drive has seen better days? Commented Apr 24, 2023 at 21:59
  • A better option is probably HDDSuperClone, as it is purpose-built for copying failing drives: youtu.be/_xSXXW42Ouw Commented Apr 25, 2023 at 0:39
  • @JoepvanSteen Is there a reason to prefer it to ddrescue (mentioned in the answers)?
    – wizzwizz4
    Commented Apr 25, 2023 at 18:10
  • The author of HDDSuperClone took ddrescue as a conceptual starting point, more or less. Its advantages are more direct access to the hardware and head-map learning routines. Here's the author's explanation: data-medics.com/forum/threads/hddsuperclone-vs-ddrescue.1648/…. BTW, it's now open sourced. Commented Apr 26, 2023 at 13:25

3 Answers


You asked, "Why is dd constantly pausing and why does it get progressively worse over time?"

Without data it is impossible to give a definitive answer, but it is possible that either:

  • Your in-memory destination disk cache is filling up, and the pause allows the cache to finish writing to the physical disk. This is particularly (but not exclusively) true of USB-connected devices.
  • Your source disk has hardware errors that prevent easy reading of some disk blocks. In this situation you can use ddrescue to recover as much as possible (see below).

Given your graphs of IO over time I would say you have a faulty disk.

You then asked, "Is there a better way to copy a disk to a file and have it be an EXACT copy of every single byte on the drive?" Yes, there is.

If you are using a command such as dd if=/dev/source_disk of=/dev/target_disk (replacing source_disk and target_disk with appropriate values), then you are using dd inefficiently and it will run horrendously slowly*. Here are two solutions:

  1. If the source disk is good then use cat or pv instead of dd:

    cat /dev/source_disk >/dev/target_disk
    pv /dev/source_disk >/dev/target_disk     # If you have pv installed
    
  2. If the source disk might be faulty at all, use ddrescue instead of dd. You will need to install it through Homebrew (brew install ddrescue) or some other third-party tools repository:

    ddrescue /dev/source_disk /dev/target_disk /some/temporary/logspace
    

    Ensure that the logspace file is on neither /dev/source_disk nor /dev/target_disk.
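    A common two-pass pattern (a sketch using standard GNU ddrescue flags; tune the retry count to taste, and note that direct access via -d may not be available on every platform) grabs the easy data first, then goes back for the bad areas, reusing the same mapfile so runs can be interrupted and resumed:

    # Pass 1: copy everything that reads cleanly, skipping the slow scraping phase
    ddrescue -n /dev/source_disk /dev/target_disk /some/temporary/logspace
    # Pass 2: retry the remaining bad areas up to 3 times with direct disc access
    ddrescue -d -r3 /dev/source_disk /dev/target_disk /some/temporary/logspace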

In both cases /dev/source_disk and /dev/target_disk must be unmounted and otherwise unused.
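On macOS you can unmount a disk's volumes without ejecting the device using diskutil; a sketch, keeping the placeholder device names from above:

    # Unmount every volume on each disk so nothing else touches them
    diskutil unmountDisk /dev/source_disk
    diskutil unmountDisk /dev/target_disk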


* There are many good answers here and particularly on Unix & Linux that explain why dd is used wrongly and ineffectively in so many use cases. The issues here are (a) that dd uses a default block size of 512 bytes, so it requires multiple reads for each data block on the hard disk (eight 512-byte reads for each 4 KiB block on an SSD), (b) that any hardware read error will be re-read multiple times by the underlying disk subsystem, and (c) that there is no useful recovery from such a permanent read error. cat is no slower than dd with its default settings and can be significantly faster when used in simple and straightforward situations.
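If you do stick with dd, issue (a) can be mitigated with an explicit block size; a minimal sketch (device names are placeholders, and macOS's BSD dd wants a lowercase m suffix):

    # Read and write in 1 MiB chunks instead of the 512-byte default
    dd if=/dev/source_disk of=/dev/target_disk bs=1m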


Heat? I've seen electronic parts' performance degrade (rather than fail outright) when the temperature changes, higher or lower, particularly after they've warmed up.

  • The hard drives I'm using get a lot warmer when running at full speed, but this one is being read very gently, at about 10% of its normal rate. I don't think it's heat: the disk is on an external power supply, so it kept spinning non-stop even when the computer rebooted, and when the process was restarted it just calmly kept going at a constant temperature. And the slow transfer rate was extremely noticeable every single time.
    – Foxyz
    Commented Apr 25, 2023 at 19:31
  • 1
    @Foxyz There are circuits that handle the data stream, of course. If they are idle, they might cool down enough. Once they are in use again, they warm up and perhaps their performance degrades?
    – kackle123
    Commented Apr 26, 2023 at 19:57

When running dd you can press Ctrl-T (the terminal's status character, which sends SIGINFO) to show the current state of dd:

load: 2.46  cmd: dd 33584 running 0.53u 0.79s
2619487+0 records in
2619487+0 records out
1341177344 bytes transferred in 1.325725 secs (1011655769 bytes/sec)

(Ahhhh if all devices could run as fast as /dev/zero and /dev/null...)

This will tell you what dd actually did on both sides.
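If dd is running in another terminal or in the background, you can send SIGINFO to it directly instead of pressing Ctrl-T; a sketch, assuming pgrep matches exactly your dd process:

    # Send SIGINFO to every process whose name is exactly "dd"
    kill -INFO $(pgrep -x dd)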

Note that in some cases (conv=noerror) dd can skip over read errors on the input, but by default it does not pad the output for the blocks it failed to read, which means data after the error shifts to the wrong offset and will probably be useless. To avoid that, you also need to specify sync (i.e. conv=noerror,sync), which pads each failed or short read with zeros. But there are quite a few more caveats with this, so dd probably isn't the best tool.
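For completeness, such an invocation might look like this (device and image names are placeholders; a small bs means each zero-padded block costs at most 512 bytes of data):

    # Continue past read errors, zero-padding failed blocks so later
    # data keeps its original offset in the image
    dd if=/dev/source_disk of=/path/to/image.dmg bs=512 conv=noerror,sync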

  • That's very handy! I completely forgot that was a thing. Thanks for the reply!
    – Foxyz
    Commented Apr 26, 2023 at 17:41
