20

I just made an image of a freshly installed dual boot (Ubuntu and Windows) using this command (which I've been using for a while for smaller images):

dd if=/dev/sda | gzip > /mnt/drive.img.gz

Less than 60G of this 500G drive is in use. Nevertheless, the image file is now 409G.

How is that? Shouldn't gzip manage to compress all those zeros? As I said, it is a freshly installed system. It couldn't be that cluttered.

I didn't expect the file to be 60G, but 400G seems huge to me.

chris137
  • 321

2 Answers

29

How is that? Shouldn't gzip manage to compress all those zeros?

Yes, if they were zeroes.

Unused disk space does not mean it contains zeros; it means it is unused, and may contain anything.
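To illustrate the point, here is a small sketch that compresses the same amount of zeros and of random data with gzip and compares the results. The 8 MiB sample size and the temporary file names are arbitrary choices for the demonstration; random data merely stands in for whatever leftover content unused blocks might hold.

```shell
#!/bin/sh
# Compress 8 MiB of zeros and 8 MiB of random bytes, then compare sizes.
# The sample size and file names are arbitrary, chosen only for this demo.
tmp=$(mktemp -d)

dd if=/dev/zero    bs=1048576 count=8 2>/dev/null | gzip > "$tmp/zeros.gz"
dd if=/dev/urandom bs=1048576 count=8 2>/dev/null | gzip > "$tmp/random.gz"

# $(( ... )) strips any padding that some wc implementations add.
zeros_size=$(( $(wc -c < "$tmp/zeros.gz") ))
random_size=$(( $(wc -c < "$tmp/random.gz") ))

echo "zeros:  $zeros_size bytes after gzip"
echo "random: $random_size bytes after gzip"

rm -rf "$tmp"
```

The zeros should shrink to a tiny fraction of their original size, while the random data stays at roughly 8 MiB. Unused blocks that still hold old data behave more like the second case.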

There are programs that wipe unused disk space to zeroes; I suggest you use one before making the disk image. (I don't recall any offhand. In Linux, I'd just run dd if=/dev/zero bs=1048576 of=somefile on each filesystem to create a file containing only zeroes until the filesystem is full, then remove those files before making the image. Also, I prefer xz over gzip.)
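The zero-fill approach could be sketched roughly as follows. The function name zero_fill, the file name zero.fill, and the optional size limit are assumptions made for illustration (the limit exists only so the function can be tried without filling a disk); the dd invocation itself is the one from the answer.

```shell
#!/bin/sh
# zero_fill DIR [MAX_MIB]
# Writes a file of zeros into DIR (a mounted filesystem), then deletes it.
# Without MAX_MIB, dd runs until the filesystem is full.
# "zero.fill" and MAX_MIB are hypothetical, used here for illustration only.
zero_fill() {
    dir=$1
    max_mib=${2:-}
    filler="$dir/zero.fill"

    if [ -n "$max_mib" ]; then
        # Limited run: write at most MAX_MIB mebibytes of zeros.
        dd if=/dev/zero of="$filler" bs=1048576 count="$max_mib" 2>/dev/null
    else
        # dd exits non-zero once the disk is full; that is expected here.
        dd if=/dev/zero of="$filler" bs=1048576 2>/dev/null || true
    fi

    sync              # ask the kernel to flush the zeros to disk
    rm -f "$filler"   # release the space again before imaging
}

# Example (not run here): zero_fill /mnt/ubuntu-root
```

Run it once per mounted filesystem on the drive, then create the image with dd and gzip (or xz) as before. Filling a filesystem completely can disturb running programs, so this is best done from a live system or when little else is running.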

  • Oh thanks, I didn't even think about what was on that drive before I installed it! I guess it's faster to unzip that image now and make that smaller (to which there seem to be solutions as well). I'll also consider using xz. Thanks! – chris137 Sep 29 '16 at 07:25
  • Writing 400G of zeros every time you want to make a clone does not seem like a good idea on an SSD. The other solution (using a filesystem-aware cloning program) seems better. – Federico Poloni Sep 29 '16 at 12:11
  • @FedericoPoloni: No, but the generated image is a perfect raw disk image. Disk cloning utilities use various formats, so you need the same utility to restore the image. I am not aware of any (but also have not checked) that can generate a raw image while replacing unused blocks in the image with zeroes; that would be truly useful. Using dd in Linux is the simplest way to fix the raw image size issue and, as a one-off, is okay even on an SSD. – Nominal Animal Sep 29 '16 at 14:29
  • @NominalAnimal Is there a way to use a gzipped disk image without first decompressing it anyway? As far as I know, gunzip drive.img.gz | mount -o loop - doesn't work, nor does any other command I can come up with. If the answer is no, then gzip is just another special format that one has to decompress before using it. – Federico Poloni Sep 29 '16 at 15:15
  • @FedericoPoloni: Yes, both gzip- and xz-compressed images can be mounted using nbdkit. (I'm not really worried about the formats, as long as they are open and supported by more than one utility; I'm only worried about utilities becoming unmaintained and buggy on new systems. I'd be happy to use SquashFS instead of xz or gzip, for example.) – Nominal Animal Sep 29 '16 at 17:26
  • @FedericoPoloni On an SSD, you'd trim. – Gilles 'SO- stop being evil' Sep 29 '16 at 22:11
  • @Gilles Trimmed space, when read, is not guaranteed to return zeros. Some controllers do that, others don't. – deviantfan Sep 30 '16 at 04:22
  • @FedericoPoloni It's not a "special" format, in that it doesn't require a very specific program to decompress - it's a very common interchange format. (It's "special" as far as the loop device driver is concerned, though) – user253751 Sep 30 '16 at 04:31
14

For backups of individual partitions you could use partclone instead.

Partclone reads the file system to see where files are stored, and backs up only those parts of the partition.
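As a rough sketch, assuming an ext4 partition and that partclone is installed, a backup could look like this. The function name and the device and image paths in the usage line are hypothetical. Note that the result is an image in partclone's own format, not a raw image, so partclone is needed again to restore it.

```shell
#!/bin/sh
# backup_ext4_partition DEVICE IMAGE
# Saves only the used blocks of an ext4 partition, compressed with gzip.
# The function name is hypothetical; run as root on an unmounted partition.
backup_ext4_partition() {
    device=$1
    image=$2

    # Bail out early if partclone is not available on this system.
    command -v partclone.ext4 >/dev/null 2>&1 || {
        echo "partclone.ext4 not found" >&2
        return 1
    }

    # -c: clone (save) mode, -s: source device, -o -: write image to stdout
    partclone.ext4 -c -s "$device" -o - | gzip -c > "$image"
}

# Example (not run here): backup_ext4_partition /dev/sda5 /mnt/sda5.pcl.gz
```

Other filesystems have their own partclone variants (for example partclone.ntfs for the Windows partition), and each partition is backed up separately.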

TRiG
  • 331
Pelle
  • 401