42

So here's what's happening.

I started a backup of a drive on my server from a Linux live USB. I began copying the first drive with plain dd, just sudo dd if=/dev/sda of=/dev/sdc1, and then I remembered that this leaves the console blank until it finishes.

I needed to run a different backup to the same drive anyway, so I started that one as well with sudo dd if=/dev/sdb of=/dev/sdc3 status=progress and then I got a line of text that shows the current rate of transfer as well as the progress in bytes.

I was hoping for a method that shows the backup progress as a percentage, instead of me doing the math of how many bytes have been copied out of 1.8 TB. Is there an easier way to do this than status=progress?

5 Answers

69

See answers from this question [1]

pv

For example you can use pv before you start

sudo apt-get install pv    # if you do not have it
pv < /dev/sda > /dev/sdc3   # it is reported to be faster
pv /dev/sda > /dev/sdc3     # it seems to have the same speed as the previous one
# or
sudo dd if=/dev/sda | pv -s 1844G | sudo dd of=/dev/sdc3  # maybe slower

Output [2]:

440MB 0:00:38 [11.6MB/s] [======>                             ] 21% ETA 0:02:19

Notes:
Especially for large files, you may want to read man dd and set the options that speed things up on your hardware, e.g. bs=100M to set the block size, oflag=sync so that the byte count reflects what was actually written, maybe direct...
The -s option only takes integer values, so 1.8T becomes 1844G.
As the first two command lines show, you do not need dd at all.
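If you would rather not work out the -s value by hand, the device size can be read from the system. The following is only a sketch: it assumes blockdev from util-linux is installed, and the helper names device_size_bytes and bytes_to_gib_ceil are made up for illustration. pv -s also accepts a plain byte count, so the rounding helper is only needed if you want the G form used above.

```shell
# Hypothetical helpers, not part of pv or dd.

# Size of a block device in bytes (blockdev is part of util-linux; needs root).
device_size_bytes() {
    blockdev --getsize64 "$1"
}

# Round a byte count up to whole GiB, e.g. to build a "pv -s 1844G" argument.
bytes_to_gib_ceil() {
    echo $(( ($1 + 1073741823) / 1073741824 ))
}

# Usage (not executed here, because it would overwrite /dev/sdc3):
#   size=$(sudo blockdev --getsize64 /dev/sda)
#   sudo dd if=/dev/sda | pv -s "$size" | sudo dd of=/dev/sdc3
```

Passing the exact byte count avoids the rounding step altogether and keeps the percentage and ETA slightly more accurate.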


kill -USR1 pid

If you have already launched dd, once you have identified its PID (Ctrl-Z + bg and read it, or pgrep ^dd ...) you can send it the USR1 signal (or SIGUSR1, or SIGINFO, see below) and read the output.
If the PID of the program is 1234, then with

kill -USR1 1234

dd will print to the terminal attached to its stderr something similar to

4+1 records in
4+0 records out
41943040 bytes (42 MB) copied, 2.90588 s, 14.4 MB/s

Warning: Under OpenBSD you should check the behaviour of kill[3] in advance and use kill -SIGINFO 1234 instead.
There the signal that requests a status report is SIGINFO. SIGUSR1, in this case, should terminate the program (dd)...
Under Ubuntu use -SIGUSR1 (10).
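To avoid sending the wrong signal, and to repeat the request periodically, something like the sketch below could be used. The function names dd_status_signal and poll_dd are hypothetical. Only the Linux and OpenBSD cases come from the notes above; listing the other BSDs and macOS next to OpenBSD is an assumption that you should check on a harmless test copy first.

```shell
# Pick the signal that asks dd for its statistics on a given OS.
# $1 is the OS name as printed by `uname -s`.
dd_status_signal() {
    case "$1" in
        Linux) echo USR1 ;;
        # OpenBSD is confirmed above; the rest are assumed to behave alike.
        OpenBSD|FreeBSD|NetBSD|Darwin) echo INFO ;;
        *) return 1 ;;   # unknown system: refuse to guess
    esac
}

# Ask the dd with PID $1 for its statistics every $2 seconds.
# The loop ends when kill fails, i.e. when the process has exited.
poll_dd() {
    sig=$(dd_status_signal "$(uname -s)") || return 1
    while kill -s "$sig" "$1" 2>/dev/null; do
        sleep "$2"
    done
}

# Usage: poll_dd 1234 10
```

The statistics still appear on the terminal where dd is running, not where poll_dd was started.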

10
  • 9
    you'll almost certainly find that using 'bs' on the dd command hugely speeds it up. Like dd if=/dev/blah of=/tmp/blah bs=100M to transfer 100M blocks at a time
    – Sirex
    Commented Jan 31, 2018 at 1:49
  • 1
    @Sirex Of course you have to set the bs to optimize the transfer rate in relation with your hardware... In the answer is just repeated the commandline of the OP. :-)
    – Hastur
    Commented Jan 31, 2018 at 8:05
  • 3
    @Criggie: that's maybe because dd had already finished all the write() system calls, and fsync or close was blocked waiting for the writes to reach disk. With a slow USB stick, the default Linux I/O buffer thresholds for how large dirty write-buffers can be lead to qualitatively different behaviour than with big files on fast disks, because the buffers are as big as what you're copying and it still takes noticeable time.
    Commented Feb 1, 2018 at 11:00
  • 5
    Great answer. However, I do want to note that in OpenBSD the right kill signal is SIGINFO, not SIGUSR1. Using -USR1 in OpenBSD will just kill dd. So before you try this out in a new environment, on a transfer that you don't want to interrupt, you may want to familiarize yourself with how the environment acts (on a safer test).
    – TOOGAM
    Commented Feb 2, 2018 at 5:17
  • 1
    the signals advice for dd is really great info, especially for servers where you can't/don't want to install pv
    – mike
    Commented Feb 3, 2018 at 11:48
39

My go-to tool for this kind of stuff is progress:

This tool can be described as a Tiny, Dirty, Linux-and-OSX-Only C command that looks for coreutils basic commands (cp, mv, dd, tar, gzip/gunzip, cat, etc.) currently running on your system and displays the percentage of copied data. It can also show estimated time and throughput, and provides a "top-like" mode (monitoring).

[Screenshot: progress in action]

It simply scans /proc for interesting commands, and then looks at directories fd and fdinfo to find opened files and seek positions, and reports status for the largest file.
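The same idea can be tried by hand. This is a rough sketch of the principle, not how progress is actually implemented: the helper names fdinfo_pos and percent_of are invented here, and the usage lines assume a GNU dd with PID 1234 whose input is a block device open on file descriptor 0.

```shell
# Print the current file offset stored in a Linux fdinfo file,
# e.g. /proc/1234/fdinfo/0, which contains a line like "pos:<TAB>4096".
fdinfo_pos() {
    awk '$1 == "pos:" { print $2 }' "$1"
}

# Integer percentage of $1 (bytes done) out of $2 (total bytes).
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

# Usage (as root, with the assumptions named above):
#   pos=$(fdinfo_pos /proc/1234/fdinfo/0)
#   total=$(blockdev --getsize64 "$(readlink /proc/1234/fd/0)")
#   echo "$(percent_of "$pos" "$total")%"
```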

It's very light, and compatible with virtually any command.

I find it particularly useful because:

  • compared to pv in pipe or dcfldd, I don't have to remember to run a different command when I start the operation, I can monitor stuff after the fact;
  • compared to kill -USR1, it works on virtually any command, I don't have to always double-check the manpage to make sure I'm not accidentally killing the copy; also, it's nice that, when invoked without parameters, it shows the progress for any common "data transfer" command currently running, so I don't even have to look up the PID;
  • compared to pv -d, again I don't need to look up the PID.
3
  • 1
    Note: You can monitor more than just coreutils processes. Simply specify the name of the command with --command <command-name>.
    – jpaugh
    Commented Feb 1, 2018 at 15:19
  • 1
    This.Is.AWESOME!
    – Floris
    Commented Feb 3, 2018 at 19:03
  • 1
    progress -m will keep the app running in monitor mode.
    – sleblanc
    Commented Sep 28, 2020 at 14:39
27

Run dd, then, in a separate shell, invoke the following command:

pv -d $(pidof dd) # root may be required

This makes pv gather statistics on every file descriptor the dd process has open. It shows you the current position in both the input and the output.

3
  • 3
    Works after the fact!? Amazing!!
    – jpaugh
    Commented Jan 31, 2018 at 21:16
  • 3
    That's very cool. It avoids the memory-bandwidth + context-switch overhead of actually piping all the data through 3 processes! @jpaugh: I guess it just looks at /proc/$PID/fdinfo for file positions, and at /proc/$PID/fd to see which files (and thus the sizes). So yes, very cool, and good idea for a feature, but I wouldn't call it "amazing" because there are Linux APIs that let it poll the file positions of another process.
    Commented Feb 1, 2018 at 10:56
  • @PeterCordes I didn't realize file-position was exposed by the kernel. (I've been spending my life carefully preparing pv pipelines in advance.) Of course, I assumed as much once I saw that this does work.
    – jpaugh
    Commented Feb 1, 2018 at 15:05
10

There's an alternative to dd : dcfldd.

dcfldd is an enhanced version of GNU dd with features useful for forensics and security.

Status output - dcfldd can update the user of its progress in terms of the amount of data transferred and how much longer operation will take.

dcfldd if=/dev/zero of=out bs=2G count=1 # test file
dcfldd if=out of=out2 sizeprobe=if
[80% of 2047Mb] 52736 blocks (1648Mb) written. 00:00:01 remaining.

http://dcfldd.sourceforge.net/
https://linux.die.net/man/1/dcfldd

1
  • It's a longer command name... clearly, it is inferior. (+1)
    – jpaugh
    Commented Feb 1, 2018 at 15:12
6

As a percentage you'd have to do some maths, but you can get the progress of a dd in human readable form, even after already starting, by doing kill -USR1 $(pidof dd)

The running dd process will print something similar to:

11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s
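The maths can be scripted. A minimal sketch, assuming you paste the status line in yourself and already know the total size in bytes; dd_bytes_copied and dd_percent are made-up helper names:

```shell
# Extract the leading byte count from a dd status line such as
# "11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s".
dd_bytes_copied() {
    echo "$1" | awk '$2 == "bytes" { print $1 }'
}

# Percentage of $1 (bytes copied) out of $2 (total bytes), truncated to
# one decimal place, using integer arithmetic only.
dd_percent() {
    tenths=$(( $1 * 1000 / $2 ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

# Usage:
#   copied=$(dd_bytes_copied "11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s")
#   dd_percent "$copied" 44469116
```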

2
  • 4
    That's basically the same thing that status=progress gives
    – rakslice
    Commented Jan 30, 2018 at 23:01
  • 1
    I was actually about to say that's the exact same thing that status=progress gives.
    – user865814
    Commented Jan 30, 2018 at 23:02
