45

ORIGINAL QUESTION:

If I have 2 identical hard drives with the following characteristics:

  • SATA 6.0 Gb/s
  • 5400 rpm
  • 3TB

How long should a full dd copy take to complete?

So far it's been running for 5 hours and still going...

I am using Linux Ubuntu 12.04 64bit and the command I am using is:

dd if=/dev/sdb of=/dev/sdc

UPDATE: 1

I can now see the progress, and it has taken 6+ hours to copy 430GB. The HDD is 3TB...

Is there no faster way of doing this?


UPDATE: 2

This seems a lot better than before (Thanks to Groxxda for the suggestions):

sudo dd if=/dev/sdb bs=128K | pv -s 3000G | sudo dd of=/dev/sdc bs=128K

ETA is about 9 hours for 3TB, whereas before it reached 430GB after 6 hours, so I am guessing it would have taken about 36 hours using the previous command.

Cadoiz
  • 276
oshirowanen
  • 2,621
  • 1
    Try to grab the statistics of the process:
    Sending a USR1 signal to a running 'dd' process makes it print I/O statistics to standard error and then resume copying.
    $ dd if=/dev/zero of=/dev/null& pid=$!
    $ kill -USR1 $pid
    
    

    Check your man page for the actual signal as it differs for different dd implementations.

    – groxxda Jul 12 '14 at 13:20
  • @Groxxda, I have no idea how to do that. – oshirowanen Jul 12 '14 at 13:20
  • 1
    GNU dd uses SIGUSR1, and BSD dd uses SIGINFO – groxxda Jul 12 '14 at 13:22
  • Also what do you mean with "connected to the same sata cable"? Are you using some sort of port multiplier? (If you achieve a transfer rate of 150MB/s it should take you 5-6hrs, but I think half of that is more realistic.) – groxxda Jul 12 '14 at 13:31
  • @Groxxda, No, it's a single sata cable, which allows 2 hdds to connect to a single sata port. It's doing 19MB/s for some reason... – oshirowanen Jul 12 '14 at 13:33
  • 2
    You may be able to speed up the process by specifying a different (bigger) blocksize (bs= argument to dd). Also consider connecting each HDD to its own sata port. – groxxda Jul 12 '14 at 13:38
  • @Groxxda, what blocksize do you recommend? – oshirowanen Jul 12 '14 at 13:46
  • 1
    Have a look at this thread on superuser. I would suggest you try a few values from 64K to 4M and see what works best for you. They also mention a flag direct that might speed up the copy, but I haven't used that. – groxxda Jul 12 '14 at 13:49
  • What did you finally use for the other disks? dd (with direct?) ? bs? or cat was as fast? Any insight on a SATA HDD (now connected via USB) to SSD (new that is replacing the SATA HDD) ? – tgkprog Apr 15 '16 at 16:21
  • For good block sizes, these two questions can also be considered: https://serverfault.com/questions/147935 and https://unix.stackexchange.com/questions/9432 – Cadoiz Jan 19 '21 at 06:35

5 Answers

74

dd was useful in the old days when people used tapes (when block sizes mattered) and when simpler tools such as cat might not be binary-safe.

Nowadays, dd if=/dev/sdb of=/dev/sdc is just a complicated, error-prone, slow way of writing cat /dev/sdb >/dev/sdc. While dd is still useful for some relatively rare tasks, it is a lot less useful than the number of tutorials mentioning it would lead you to believe. There is no magic in dd; the magic is all in /dev/sdb.

Your new command sudo dd if=/dev/sdb bs=128K | pv -s 3000G | sudo dd of=/dev/sdc bs=128K is again needlessly slow and complicated. The data is read 128kB at a time (which is better than the dd default of 512B, but not as good as even larger values). It then goes through two pipes before being written.

Use the simpler and faster cat command. (In some benchmarks I made a couple of years ago under Linux, cat was faster than cp for a copy between different disks, and cp was faster than dd with any block size; dd with a large block size was slightly faster when copying onto the same disk.)

cat /dev/sdb >/dev/sdc

If you want to run this command in sudo, you need to make the redirection happen as root:

sudo sh -c 'cat /dev/sdb >/dev/sdc'

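If you want a progress report, since you're using Linux, you can easily get one by noting the PID of the cat process (say 1234) and looking at the position of its input or output file descriptor. The example below reads descriptor 0, which is the input if the source was passed on standard input (cat </dev/sdb >/dev/sdc); with cat /dev/sdb >/dev/sdc, look at descriptor 1 (the output) instead.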

# cat /proc/1234/fdinfo/0
pos:    64155648 
flags:  0100000

If you want a progress report and your unix variant doesn't provide an easy way to get at file descriptor positions, you can install and use pv instead of cat.
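
To turn that position into a percentage, something like the following could work. It is only a sketch: the helper names fd_position and copy_percent are made up for illustration, 1234 is the placeholder PID from above, and it assumes a Linux /proc and that blockdev (from util-linux) is available to report the size of the source disk. It defaults to descriptor 1, which is the redirected output of cat /dev/sdb >/dev/sdc.

```shell
# Hypothetical helpers (names are illustrative, not part of any tool).

# Print the current offset of file descriptor FD (default 1, i.e. stdout)
# of process PID, as reported by Linux in /proc/PID/fdinfo/FD.
fd_position() {
    awk '$1 == "pos:" { print $2 }' "/proc/$1/fdinfo/${2:-1}"
}

# Print POS as a percentage of TOTAL (both in bytes), with one decimal.
copy_percent() {
    awk -v pos="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", 100 * pos / total }'
}

# Example, run as root while the copy is in progress (1234 is a placeholder PID):
#   total=$(blockdev --getsize64 /dev/sdb)
#   copy_percent "$(fd_position 1234)" "$total"
```

Running the example lines every few minutes (or under watch) gives a crude progress display without restarting the copy.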

  • What is strange with large blocks is that the bottleneck is the disk, so what makes cat faster than dd ? Could it be that cat uses the cache ? – Emmanuel Jul 12 '14 at 21:37
  • 1
    @Gilles, thanks for the answer. I have another five 3TB drives to clone and will try the cat option next. As far as I can tell, that new dd command is going to take another 3 hours to complete, to about 11 hours in total. If the cat approach is faster than 11 hours for the second 3TB HDD, I will use that method for the remaining drives. – oshirowanen Jul 12 '14 at 21:44
  • @Emmanuel Both use the cache in the same way. I don't understand why there's a significant difference between cat, cp and dd with a large block size (it's easy to understand why dd with a small block size is slower: it makes more system calls for the same amount of data). – Gilles 'SO- stop being evil' Jul 12 '14 at 21:44
  • @oshirowanen You're getting a little over 80MB/s, which sounds pretty good for a 5400rpm drive. – Gilles 'SO- stop being evil' Jul 12 '14 at 21:45
  • @Gilles, I have pv installed and am currently copying the data using the cat method you suggested. How do I use pv to get a human readable progress report? – oshirowanen Jul 13 '14 at 06:36
  • @gilles I was thinking: if cat uses the cache, that would allow reading from one disk while writing to the other. If dd was intended for writing to tapes, perhaps it bypasses the cache for better control. – Emmanuel Jul 13 '14 at 06:37
  • 2
    @Gilles, so to get progress report, do I use sudo sh -c 'pv /dev/sdb >/dev/sdc' instead of sudo sh -c 'cat /dev/sdb >/dev/sdc'? – oshirowanen Jul 13 '14 at 06:46
  • 2
    @oshirowanen Yes, use pv where you'd use cat. – Gilles 'SO- stop being evil' Jul 13 '14 at 10:50
  • @Emmanuel No, dd doesn't (can't) bypass the cache. – Gilles 'SO- stop being evil' Jul 13 '14 at 10:57
  • @gilles I think it can with oflag=direct but doesn't by default. – Emmanuel Jul 13 '14 at 21:42
  • @Gilles +1 for suggesting using cat. Just copied a 128-Gb SSD drive, and it took only half an hour. – dr_ Jan 21 '16 at 12:48
  • I have a question: if I have two 2TB HDDs (sdc, sdb) and want to clone the first one, sdc, should I run cat /dev/sdc >/dev/sdb? Is that a safe way? Thanks!! – Milor123 Dec 26 '16 at 17:04
  • What happens if you do this on a multi-device btrfs fs? – unhammer Sep 20 '17 at 19:27
  • @unhammer I don't know how btrfs stores information about where to find the other devices. If it's able to assemble filesystem parts from their content alone regardless of their location (like Linux LVM), then just copying one or more of the devices byte for byte will work. A robust system like btrfs should be able to do this, otherwise plugging in an external disk or restoring from a backup would be very painful. – Gilles 'SO- stop being evil' Sep 21 '17 at 18:55
  • OK, I guess it'd have to be something like cat /dev/sdc1 /dev/sdd1 > /dev/sdb (given the original btrfs mount used -odevice=/dev/sdc1,device=/dev/sdd1). I felt a bit too unsure so I ended up waiting for /bin/cp -a instead, which wasn't too bad. – unhammer Sep 22 '17 at 07:41
  • 1
    @unhammer No! You would copy each device one by one, e.g. cat /dev/sdc1 >/dev/sdb1 && cat /dev/sdd1 >/dev/sde1. Concatenating the two parts doesn't make sense. If you want to change the structure of the btrfs volume, to change it from having two subvolumes to one, you need to use btrfs tools to change the structure, or as you did to create a new filesystem with the desired structure and copy the files. – Gilles 'SO- stop being evil' Sep 22 '17 at 08:44
  • OK, had a feeling it wouldn't work exactly like that, thanks for the confirmation! – unhammer Sep 22 '17 at 11:21
  • Never would have thought of cat /dev/sdb >/dev/sdc to replace dd. – WinEunuuchs2Unix Jul 05 '20 at 00:39
11

dd uses a very small blocksize by default (512 bytes). That results in a lot of overhead (one read() and write() syscall for every 512 bytes).

It goes a lot faster when you use a larger blocksize. Optimal speeds start at bs=64k or so. Most people use a still larger bs=1M so it becomes human readable (when dd says it copied 1234 blocks, you know it's 1234 MiB without doing any math). Using even larger blocksizes is unlikely to result in speed improvements, just higher memory consumption.

So the command should be:

dd bs=1M if=/dev/sdb of=/dev/sdc

If you already have a slow dd running, you can interrupt it and resume with a faster dd instance. For this it is important to know how far the copy progressed already. dd usually prints the progress when you cancel it, or you can send it the USR1 signal while it is running to make it print its progress.

kill -USR1 $(pidof dd)

For example if it copied more than 1234MiB, you can resume at position 1234MiB using:

dd bs=1M seek=1234 skip=1234 if=/dev/sdb of=/dev/sdc

If it copied fewer than 1234MiB, your copy will be incomplete. If it copied more than 1234MiB, it will re-copy some already copied parts, which normally does not do any harm. So if in doubt you should pick a value slightly smaller than what you believe was already copied.
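
Since dd reports the number of bytes copied, the skip/seek value for bs=1M can be derived by dividing by 1048576 and backing off a little. A minimal sketch, where the function name resume_block and the 16 MiB default safety margin are arbitrary choices for illustration:

```shell
# Hypothetical helper: convert dd's "bytes copied" figure into a
# skip=/seek= value for bs=1M, minus a safety margin in MiB blocks.
resume_block() {
    bytes=$1
    margin=${2:-16}                          # arbitrary default margin
    blocks=$(( bytes / 1048576 - margin ))   # whole MiB blocks, rounded down
    if [ "$blocks" -lt 0 ]; then
        blocks=0                             # never go before the start of the disk
    fi
    echo "$blocks"
}

# Example: dd reported 461708984320 bytes copied (430 GiB).
#   n=$(resume_block 461708984320)
#   dd bs=1M seek="$n" skip="$n" if=/dev/sdb of=/dev/sdc
```

Using the same number for seek and skip keeps the source and destination offsets aligned, which is what makes the resumed copy byte-identical.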

frostschutz
  • 48,978
6

Getting statistics about ongoing dd process

You can use the kill command with the appropriate signal to make dd output statistics to standard error.
From the GNU dd man page:

Sending a USR1 signal to a running 'dd' process makes it print I/O statistics to standard error and then resume copying.
      $ dd if=/dev/zero of=/dev/null& pid=$!
      $ kill -USR1 $pid
      18335302+0 records in
      18335302+0 records out
      9387674624 bytes (9.4 GB) copied, 34.6279 seconds, 271 MB/s

Make sure you check your man page for the correct signal first, as it may differ between dd implementations (BSD dd uses SIGINFO).

Speeding up the process

  1. Connect each HDD to its own SATA port so the data can be read from one device and written to the other at the same time.
  2. Use an appropriate blocksize using the bs= argument. Have a look at this thread on superuser and try some values for yourself.
  3. Use separate dd invocations for reading and writing and use a pipe to connect them (dd if=/dev/sda bs=1M | dd of=/dev/sdb bs=1M).
    If you do this and specify a blocksize, make sure you use the same blocksize on each invocation.
  4. You may try other optimizations such as dd's direct flag (for example oflag=direct), which bypasses the page cache.
  5. Make sure your hard disks are not mounted or it may result in a corrupt copy.
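
To compare block sizes as suggested in item 2, a rough read-only test along these lines may help. It is a sketch, not a rigorous benchmark: the function name bench_bs and the list of sizes are illustrative, it only measures reads (the output goes to /dev/null), and repeated runs over the same data can be skewed by the page cache.

```shell
# Hypothetical helper: read TOTAL bytes from SOURCE with several block
# sizes and print dd's summary line (bytes, time, rate) for each.
bench_bs() {
    src=$1
    total=$2
    for bs in 65536 262144 1048576 4194304; do   # 64K, 256K, 1M, 4M
        count=$(( total / bs ))                  # blocks needed for TOTAL bytes
        printf 'bs=%s: ' "$bs"
        # dd writes its statistics to stderr; keep only the last line.
        dd if="$src" of=/dev/null bs="$bs" count="$count" 2>&1 | tail -n 1
    done
}

# Example (as root): read the first 1 GiB of the source disk with each size.
#   bench_bs /dev/sdb 1073741824
```

Because nothing is written to a disk, this is safe to run on the source drive before starting the real copy.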
Jeff Schaller
  • 67,283
  • 35
  • 116
  • 255
groxxda
  • 1,028
  • You can use watch kill -USR1 $pid or also monitor the progress like that: https://www.howtogeek.com/428654/how-to-monitor-the-progress-of-linux-commands-with-pv-and-progress/ – Cadoiz Jan 19 '21 at 08:02
4

Have you tried 'gparted' ? You can literally copy-paste a partition from one drive to another and resize it accordingly as needed. You get transfer rate and time remaining. It uses 'e2image' underneath for linux partitions.

ioannis
  • 41
  • Note: you need to boot from a live-cd or live-usb with Ubuntu to be able to do it. Otherwise, the root host system will be blocked for copying. – Oleg Abrazhaev Oct 03 '21 at 12:42
0

There is a program called HDClone, available in both a free edition and paid editions. Either can create a bootable pendrive or DVD, as well as copy disks. Connect the source and destination HDs and follow the GUI-based instructions.

The free one will transfer at a rate of about 80MB/sec, while the paid versions can go much faster. For NTFS and FAT filesystems, the paid versions can be configured to clone only the occupied parts of the HD, which dramatically increases the speed of each clone.

Chris Davies
  • 116,213
  • 16
  • 160
  • 287