
I'm trying to figure out why my disk image is so much larger than the space actually used. Here are the source server's disks:

root # df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            1.9G     0  1.9G   0% /dev
tmpfs           395M  7.1M  388M   2% /run
/dev/sda         79G   43G   32G  58% /
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
tmpfs           395M     0  395M   0% /run/user/0

and I'm creating it with:

ssh root@109.74.201.x "dd if=/dev/sda " | dd of=/backup/server-images/west.img

It's still running and the image is 72 GB so far - almost double the 43 GB actually used on the disk it's backing up. What am I missing? If I have a 200 GB disk, is the image going to be 400 GB+?

  • /dev/sda is 80 GB in size, what do you expect? You are cloning the whole drive, including "empty" regions - not just your files. – Panki Dec 10 '20 at 09:18
  • @Panki haha yes, I just realised that. For some reason I was expecting it to be like a .tar, where it's only the size of the contents (but compressed) – Andrew Newby Dec 10 '20 at 09:19
  • You're reading from a block device - dd never knows what files it's actually writing to the image. Also, this way of creating a backup is not guaranteed to work. – Panki Dec 10 '20 at 09:20
  • @Panki do you have a suggestion for a better way to create an image? All I want to do is create an image of the server and download it locally. Then I at least have disaster recovery should the server go down – Andrew Newby Dec 10 '20 at 09:21
  • @roaima thanks - I will give that a go. I was just using the command Linode recommended for backing up: https://www.linode.com/docs/platform/disk-images/copying-a-disk-image-over-ssh/ – Andrew Newby Dec 10 '20 at 09:23

2 Answers


Using dd with no parameters is desperately inefficient: it will use 512-byte blocks for every read and write. You could adjust the dd blocksize, as sketched just below, but the alternatives that follow it are easier and far faster.
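
A blocksize-tuned version of your original command might look like this (a sketch only; bs=32M is an illustrative value, and the optimal size depends on your hardware):

ssh root@109.74.201.x "dd if=/dev/sda bs=32M" | dd of=/backup/server-images/west.img bs=32M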

Compressed image
ssh root@109.74.201.x "gzip --rsyncable </dev/sda" >/backup/server-images/west.img.gz

Uncompressed image

ssh root@109.74.201.x "gzip --rsyncable </dev/sda" | zcat >/backup/server-images/west.img

Uncompressed image with seriously fast network

ssh root@109.74.201.x "cat /dev/sda" >/backup/server-images/west.img

Notice that none of them uses dd. Omit the --rsyncable flag if your gzip doesn't understand it, or if you can absolutely guarantee you won't ever want to use rsync for transferring the compressed images around at a later date.
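
When disaster does strike, the image can be streamed back the same way. A minimal restore sketch, assuming the server is booted from rescue media (so /dev/sda is not in use) and the replacement disk is at least as large as the original:

zcat /backup/server-images/west.img.gz | ssh root@109.74.201.x "cat >/dev/sda"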

By the way, if any filesystem on /dev/sda is mounted, or any partition is otherwise in use, your backup will probably be corrupt. This is not how to perform a block-based backup of a live system.
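
If the disk were under LVM (which a stock Linode /dev/sda layout typically is not), a snapshot would give you a crash-consistent source to image instead. A minimal sketch, assuming a hypothetical volume group vg0 with a root logical volume and enough free extents for the snapshot:

ssh root@109.74.201.x "lvcreate --snapshot --size 5G --name root-snap /dev/vg0/root"
ssh root@109.74.201.x "gzip --rsyncable </dev/vg0/root-snap" >/backup/server-images/west.img.gz
ssh root@109.74.201.x "lvremove -f /dev/vg0/root-snap"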

Chris Davies
  • Thanks. Just giving the "cat" method a go now. Does seem to be a lot faster. How does it cope with hundreds of GBs (500 GB+)? The websites themselves are being backed up using a different method with incremental backups. So really this image is simply to put back up if the worst-case scenario happens and something installed buggers up the server, or the Linode is lost / corrupted. The website / DB files can then be plopped back in from the other backup system into this image – Andrew Newby Dec 10 '20 at 09:30
  • If it's a live system this will almost certainly result in a corrupt backup. If you're lucky it'll be in the filesystem structure and you'll notice – Chris Davies Dec 10 '20 at 09:31
  • hmm ok - so how do I go about it then? I've used Linode's backup services thus far, but now we are having issues due to having lots of small files on one of the servers (they said normally anything over 3 million files can cause issues with their backup service) – Andrew Newby Dec 10 '20 at 09:33
  • Filesystem snapshot (btrfs), or LVM snapshot otherwise, and back up the snapshot. Failing that, do a file-based backup with rsync, rsnapshot, or even tar (sketched after these comments). You may still get corruption, but at least it's only within a file that's actively being written at the point you're reading it, and good tools will notice and force a re-read anyway – Chris Davies Dec 10 '20 at 09:35
  • thanks will check into those methods. You obviously hope you never need to use these backups, but as I learned recently you need to have them or it can cause all kinds of headaches down the line – Andrew Newby Dec 10 '20 at 09:38
  • Take a look at Veeam Free Agent (proprietary, commercial but free to use). It does snapshot-based full and incremental backups – Chris Davies Dec 10 '20 at 09:43
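
As a sketch of the file-based fallback mentioned in the comments above (the flags are standard rsync options; the destination path and exclude list are illustrative, not prescriptive):

rsync -aAXH --numeric-ids --exclude={'/dev/*','/proc/*','/sys/*','/run/*','/tmp/*'} root@109.74.201.x:/ /backup/server-files/west/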

Eugh typical - just found this:

https://askubuntu.com/questions/537012/dd-image-size-does-it-equal-the-size-of-the-partition

So I guess that means this .img file will actually be the size of the disk itself, and not the size of the currently used space.
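
You can confirm this up front by comparing the raw device size (which is what dd will copy) against the used space; both commands below are standard util-linux/coreutils tools:

ssh root@109.74.201.x "blockdev --getsize64 /dev/sda"
ssh root@109.74.201.x "df -h /"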