5

I'm copying a drive to a file, but underestimated the amount of space it would take up. So I am running out of space on the destination drive, and don't want to abort the process (it's been running quite a while on a damaged drive).

I can of course pause dd using CTRL-Z, and I'd like to swap out the destination drive for a larger one and copy the file over to the larger drive, then resume dd. Any ideas on how to accomplish this?

dd if=/dev/sdc conv=sync,noerror bs=64M | gzip -c -9 > /media/extD/drive.img.gz

EDIT: for those mentioning ddrescue as being a better tool to use, you're right. However, for whatever reason ddrescue kept barfing and erroring out on me, whilst dd has simply been chugging along without complaint.

Logos
  • Unfortunately you have to start dd again if you can't free up enough space on the destination drive. Next time don't pipe to gzip; then you can continue dd from the last position to another destination. – Ipor Sircer Aug 17 '16 at 01:02
  • If I kill dd but not gzip, can I then extract the dd img file and continue using that? Also, you say without using dd I can continue from last position but don't give any details on the how? – Logos Aug 17 '16 at 02:48
  • dd skip=X seek=X ... from manual:
       seek=N skip N obs-sized blocks at start of output
       skip=N skip N ibs-sized blocks at start of input
    – Ipor Sircer Aug 17 '16 at 02:51
  • @IporSircer not really, see my answer. I've done exactly this on my laptop. – Wyatt Ward Aug 17 '16 at 04:37
  • “it's been running quite a while on a damaged drive” I'd just start over with ddrescue. – Gilles 'SO- stop being evil' Aug 17 '16 at 21:35

2 Answers

4

It's possible, but kinda painful.

First, know that dd prints its current position in the copy (bytes copied) if sent the USR1 signal. So find the PID of dd, using either ps or something like pidof or pgrep (not POSIX and not available on all unix-y systems, IIRC).

A ps command that works for me (also using awk, in a Debian environment):

ps aux|awk '/dd/ {print $2}'|grep -v awk

grep -v awk is necessary to prevent the PID of awk from printing as well.

Having the PID of dd, send the USR1 signal:

kill -USR1 [pid of dd]

The console window dd is running in will print how many bytes it has copied. You may now kill dd for real (ctrl+c, kill -9, whatever). I don't recall whether a strictly POSIX dd reports its progress when aborted this way, so send the USR1 signal first.
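
If you prefer a one-liner for the PID lookup, here is a minimal sketch, assuming GNU dd and a pgrep that supports -x (exact name match); adjust if more than one dd is running:

DD_PID=$(pgrep -x dd)    # PID of the running dd (exact name match)
kill -USR1 "$DD_PID"     # GNU dd prints the byte count to its own terminal
# note the reported byte count, then stop dd for real:
kill "$DD_PID"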

dd may have copied a few more bytes between reporting its position and actually being killed, so run:

head -c [number of bytes reported copied by dd] /path/to/partial/image > \
     /path/to/drive/you/are/moving/to/filename.bin

to put a truncated copy on the destination disk. Instead of the precise number of bytes, you may want to choose something divisible by your desired block size, to speed up the transfer when you resume the copy. Just make note of whatever you choose, and make sure you are only truncating, not growing, the image.
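
As a minimal sketch with made-up numbers (the byte count and paths are hypothetical placeholders), rounding down to a whole number of 64M blocks looks like this:

BS=$(( 64 * 1024 * 1024 ))        # 67108864 bytes, matching bs=64M
REPORTED=322259902464             # hypothetical byte count printed by dd
TRUNC=$(( REPORTED / BS * BS ))   # round down to a whole number of blocks
head -c "$TRUNC" /path/to/partial/image > /new/drive/part1.img
echo "resume later with skip=$(( TRUNC / BS ))"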

Once you have this copied to the new drive, run:

dd if=/dev/sdc bs=64M skip=[truncated size in bytes divided by the block size; 64M is 67108864 bytes] \
   of=/path/to/part2

If you have space for the remainder of the image on the smaller original disk, delete the non-truncated partial image from it (you already copied a truncated version to the new disk in step 1) to free space, and have dd write part2 to that disk instead. When the transfer is complete, you can run

cat /path/to/part2 >> /path/to/part1

to add part 2 to the end of part 1, creating a complete disk image! Note that you will need at least as much free space on the disk holding part 1 as the size of part 2 in order to append it.
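
As a rough sanity check afterwards (a hedged sketch, assuming util-linux blockdev and GNU stat are available), compare the size of the assembled image with the size of the source drive; they should agree, apart from any zero padding added by conv=sync:

blockdev --getsize64 /dev/sdc    # size of the source drive in bytes
stat -c %s /path/to/part1        # size of the assembled image in bytes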

If you don't mind doing the whole transfer over, I'd do cat /dev/sdc | gzip -c - > /path/to/imagefile.img.gz to create a gzip compressed archive. This can be written to a hard disk partition with something like zcat /path/to/imagefile.img.gz > /dev/sdX.

[copied from my comment into the answer]

Additionally, I think (but do not remember for certain) that dd writes to stdout if of= is not specified. If this is true, you can skip writing part2 to a separate file and use:

dd bs=64M skip=[skip-block-count] if=/dev/sdc >> /path/to/part1

@MatijaNalis has rightly suggested using dd_rescue or ddrescue (two different programs that accomplish the same task) to copy the disk image. I'd do this if your partition/drive has erroneous sectors or other hardware faults.
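
For reference, a minimal ddrescue sketch (the paths are placeholders; the third argument is the map file ddrescue uses to keep track of what it has already read):

ddrescue /dev/sdc /path/to/drive.img /path/to/drive.map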

Wyatt Ward
  • additionally, I think (but do not remember for certain) that dd writes to stdout if of= is not specified. If this is true, you can skip writing part2 to a separate file and use dd bs=64M skip=[skip-block-count] if=/dev/sdc >> /path/to/part1. – Wyatt Ward Aug 17 '16 at 04:40
  • Also, if you were to start the transfer again, it would be much better to recover data from the damaged disk by using https://www.gnu.org/software/ddrescue/ (or http://www.garloff.de/kurt/linux/ddrescue/ ) instead of dd, if possible – Matija Nalis Aug 17 '16 at 07:22
  • That last do-over could be gzip -c /dev/sdc > /path/to/imagefile.img.gz – agc Aug 17 '16 at 11:30
  • So that procedure will let gzip properly exit, so I can access the truncated file in the archive? – Logos Aug 17 '16 at 13:33
  • Counting the number of bytes that dd has copied is pointless. The useful information is the number of bytes that gzip has written, and you get that with gzip -l. – Gilles 'SO- stop being evil' Aug 17 '16 at 21:38
  • @MatijaNalis Good point. – Wyatt Ward Aug 18 '16 at 01:57
  • @Gilles if you aren't using gzip and are doing a raw disk copy, then the gzip advice won't help. – Wyatt Ward Aug 18 '16 at 01:58
  • @Logos gzip when used on a file or pipe will stop at EOF (end of file). So yeah, should be okay. – Wyatt Ward Aug 18 '16 at 02:01
1

You can't pause the process and resume it after moving the file to another drive¹. However, you don't need to do that. Just kill the dd process, do your file transfer, and start a new copy process for the rest of the data.

To see where the first part stops, run gzip -l /media/extD/drive.img.gz. The “uncompressed” number is the number of bytes that were copied, and it's the offset where you need to start the new copy process. (Note that gzip records this size modulo 2^32, so for an image larger than 4 GiB you'll need to add back the appropriate multiple of 4 GiB.)
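
A minimal sketch for pulling that number out in a script, assuming GNU gzip's two-line gzip -l output:

NNNN=$(gzip -l /media/extD/drive.img.gz | awk 'NR==2 {print $2}')   # "uncompressed" column
echo "$NNNN"    # remember the modulo-2^32 caveat above for large images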

To copy the rest of the data, tell dd to start at that offset NNNN.

dd if=/dev/sdc conv=sync,noerror iflag=skip_bytes skip=NNNN bs=64M |
gzip -c -9 >> /media/extE/drive.img.gz

I do not recommend using dd to copy data. Contrary to a common legend, there is no magic in dd that lets it access disks: the magic is all in /dev/sdc. Furthermore, dd can lose data silently. I think that with the flags conv=sync,noerror it only replaces unreadable data with zeros, so all the data that is successfully copied ends up at the right offset, but do it at your own risk.

To copy a disk image from a working disk, just use cp or cat. To copy a part, tail -c +$((start_offset+1)) | head -c $bytes_to_copy.
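
For example, a minimal sketch with hypothetical numbers (the output path is a placeholder) that copies 10 GiB starting 250 GiB into the disk:

start_offset=$(( 250 * 1024 * 1024 * 1024 ))
bytes_to_copy=$(( 10 * 1024 * 1024 * 1024 ))
tail -c +$(( start_offset + 1 )) /dev/sdc | head -c "$bytes_to_copy" > /media/extE/part.img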

To rescue data from a failing disk, use ddrescue. Ddrescue is smart about skipping unreadable data and keeps track of what it's successfully read. It makes multiple passes; with failing hard drives, letting the drive rest for a bit and trying again can often let you recover the whole thing, even if some sectors are unreadable at the first attempt.

ddrescue can use your existing partial copy and complete it, but you'll have to keep it uncompressed.
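
A hedged sketch of that, assuming you have already decompressed the partial copy to /media/extE/drive.img (a placeholder path) and know how many bytes NNNN it contains; ddrescue's -i/--input-position option starts the rescue at that offset and records its progress in a map file:

ddrescue -i NNNN /dev/sdc /media/extE/drive.img /media/extE/drive.map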

¹ Actually, you can, with ptrace — attach to the process with a debugger, pause it, update its internal data structures to point to the new file, resume. But even if you were familiar with how to do that, it would be more complicated than the straightforward solution with appending.

  • This is good, and I agree on using cat instead, but if the disk is faulty dd_rescue is the way to go. And on certain types of (mostly extinct) hardware, block transfers are faster. Also, I believe that besides the ddrescue bit, my answer covered all of this, didn't it? (I've since added the ddrescue bit) – Wyatt Ward Aug 18 '16 at 02:06
  • Actually, I did not know that two gzip streams appended together would decompress properly into one file. – Wyatt Ward Aug 18 '16 at 02:12