I've got a Pi 2 (running Raspbian Jessie) nicely set up with a 2TB external USB drive (sda), partitioned so that I boot off /dev/sda1 (16GB), download torrents to /dev/sda2 (200GB), and save all my important documents in OwnCloud on /dev/sda3 (1.7TB).

df -h:

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        16G  2.0G   13G  14% /
/dev/mmcblk0p1   63M   21M   43M  33% /boot
/dev/sda3       2.5T  744G  1.7T  31% /media/owncloud
/dev/sda2       193G  131G   52G  72% /media/torrent

Now, as you can see from the above, I've got about 750GB stored on my OwnCloud. I'd really rather not lose any of that. And come to think of it, I'd really rather not lose any of the 130GB in torrents, nor the work I've put into getting the system running JUST how I like it in /dev/root.

So I'm going to be buying a second 2TB HDD.

The 1st question is: What is the best way to back up/save this data?

I've never set up a RAID array before, but from preliminary research, I'd need to start with 2 blank drives and then set it up from there. This isn't really a possibility (Question 2: or is it?) as I don't have anywhere to temporarily store the 870+GB currently on the drive. (Question 3:) Also, can a RAID1 be set up with USB drives?

I could cron an rsync to back up the primary drive periodically, (Question 4:) but is that the best way to do this? And if it really is... Bonus Question 5: What period should I run (after the initial sync)? Once a day will surely not be enough and every minute may be a bit much.

Jim
  • 240
  • 2
    It's usually one question per post. After all, if you get four answers, and every one answer will cover exactly one topic, which one will you accept? – Zeta Oct 04 '16 at 13:48
  • 3
    Either way, ___a RAID is not a backup___. Your backup drive shouldn't be online all the time. After all, every software-related error that destroys your data on drive 1 can also erase your data on drive 2, if both are available at the same time. – Zeta Oct 04 '16 at 13:52
  • Well, to answer (1) you first need to decide what you want your data to survive. RAID will make it survive a single-disk failure (but not a bug which corrupts both drives, or accidental deletion, or a virus, or an admin screwup, or a disaster destroying the computer, and so on). (2) and (3) are easy: yes it's possible, and yes you can use USB drives. – derobert Oct 04 '16 at 14:34

1 Answer

2

What kinds of danger do you expect? Data loss, of course, but how do you expect that data loss to happen? The answer immediately rules out several strategies. Regardless, RAID is not a backup. Some RAID levels (1, 5, 6, …) merely provide a way to keep your system running if a disk fails.

If there's an error in your system, e.g. an accidental rm -rf /media/*, all your data will be deleted across all your drives in your RAID. Note that it's possible in theory to create a RAID1 with only one drive, copy data to it, and then start mirroring, but again, it's not a backup.
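For reference, the one-drive RAID1 route mentioned above might look roughly like this with mdadm. It is only a sketch under several assumptions: /dev/sdb1 (a partition on the new disk), /dev/sda3 (the partition currently holding the data), /dev/md0 and the mount point /mnt/md0 are placeholders, and the old partition has to be at least as large as the array. Because these commands overwrite the named partitions, the function only prints the steps instead of running them.

```shell
#!/bin/sh
# Print (do not run) the steps for a RAID1 that starts out with one disk.
# Both arguments are placeholder partitions; adjust them to your layout.
raid1_migration_plan() {
    new="$1"   # partition on the blank second disk, e.g. /dev/sdb1
    old="$2"   # partition that currently holds the data, e.g. /dev/sda3
    cat <<EOF
mdadm --create /dev/md0 --level=1 --raid-devices=2 $new missing
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/md0
rsync -a /media/owncloud/ /mnt/md0/
umount /media/owncloud
mdadm --manage /dev/md0 --add $old
cat /proc/mdstat
EOF
}

raid1_migration_plan /dev/sdb1 /dev/sda3
```

The keyword `missing` stands in for the second member, so the array starts degraded; adding the old partition later wipes it and starts the resync, which /proc/mdstat reports on.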

So instead, just partition and format your second disk with ext4 or another file system of your choice. Now, we come to the next question: do you want incremental backups? Or do you want a mirror of your data?

A mirror is rather easy:

rsync -av --delete --progress /media/* /path/to/backupdrive/
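One thing worth guarding against with --delete is running the mirror while the backup drive is not actually mounted, in which case rsync would write everything into an ordinary directory on the root filesystem. A minimal sketch of such a guard follows. BACKUP_MOUNT is a placeholder, the two source directories are taken from the df output in the question, and the sketch assumes the backup drive is mounted somewhere outside /media so the mirror does not try to copy itself.

```shell
#!/bin/sh
# Mirror the data directories, but refuse to run unless the backup
# drive is really mounted. BACKUP_MOUNT is a placeholder path.
BACKUP_MOUNT="${BACKUP_MOUNT:-/path/to/backupdrive}"

mirror() {
    # Optional $1: pass -n for a dry run that only lists what would change.
    if ! mountpoint -q "$BACKUP_MOUNT"; then
        echo "backup drive not mounted at $BACKUP_MOUNT" >&2
        return 1
    fi
    rsync -av --delete ${1:-} /media/owncloud /media/torrent "$BACKUP_MOUNT"/
}
```

Calling `mirror -n` first shows which files rsync would copy or delete without touching anything; `mirror` on its own then performs the real run.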

But depending on your situation, you may want incremental backups. Several applications are available, e.g. borg, and they differ in features such as de-duplication and speed:

borg create /path/to/backup::repo-{now:%Y-%m-%d} /media/*

This has the nice side-effect that the mentioned rm -rf /media/* won't delete your backups (unless you've used rsync --delete).
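The borg command above needs a repository that has already been initialized, and old archives accumulate unless they are pruned. The following sketch, assuming borg 1.x, wraps the three steps in one function. REPO keeps the placeholder path from the answer, the source directories come from the question, the retention numbers are arbitrary examples, and the BORG variable is a hypothetical convenience that lets you print the commands (BORG=echo) instead of running them.

```shell
#!/bin/sh
# Incremental backup with borg. REPO is a placeholder path; set BORG=echo
# to print the borg commands instead of running them.
REPO="${REPO:-/path/to/backup}"
BORG="${BORG:-borg}"

borg_backup() {
    # Create the repository on first use (borg asks for a passphrase).
    [ -d "$REPO" ] || "$BORG" init --encryption=repokey "$REPO" || return 1
    # One archive per day; borg itself fills in the {now:...} placeholder.
    "$BORG" create --stats "$REPO::repo-{now:%Y-%m-%d}" \
        /media/owncloud /media/torrent || return 1
    # Thin out old archives; the retention numbers are only examples.
    "$BORG" prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 "$REPO"
}
```

Running `BORG=echo borg_backup` is a cheap way to check the command lines before pointing the function at a real repository.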

Regardless of the method you use, put it in a shell script, e.g. ~/utils/backup.sh. But don't create a cron job for that file. Instead, add a second file, ~/utils/backupreminder.sh, that sends you an email, SMS, or notification, or prints a page on your printer, to remind you to take your drive, go to your Raspberry, connect it, execute ~/utils/backup.sh, disconnect it, and put it back.
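As an illustration, the reminder script could be as small as the sketch below. It assumes a working `mail` command on the Pi; RECIPIENT is a hypothetical address, and the `--send` switch is just an assumption of this sketch so that the file does nothing when run without arguments.

```shell
#!/bin/sh
# ~/utils/backupreminder.sh: nag by email, meant to be started from cron.
# RECIPIENT is a hypothetical address; `mail` needs a working mail setup.
RECIPIENT="${RECIPIENT:-you@example.com}"

reminder_text() {
    printf 'Time to back up %s.\n' "$(hostname)"
    printf 'Plug in the backup drive, run ~/utils/backup.sh, then unplug it again.\n'
}

send_reminder() {
    reminder_text | mail -s "Backup reminder" "$RECIPIENT"
}

# Only send when asked to, e.g. from the crontab entry.
if [ "${1:-}" = "--send" ]; then
    send_reminder
fi
```

A crontab line such as `0 19 * * * $HOME/utils/backupreminder.sh --send` would then send the reminder every evening at 19:00.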

The physical distance is important. If your dog pulls your Raspberry from the shelf, any connected drive will likely die. If that's too much of a hassle (and your Raspberry is in an infant-safe location), at least unmount the drive after each backup.
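If the drive stays plugged in, mounting it only for the duration of the backup could be sketched like this. BACKUP_MOUNT is a placeholder, and the sketch assumes the drive has a "noauto" entry in /etc/fstab so that `mount` with just the directory works.

```shell
#!/bin/sh
# Mount the backup drive only while the backup command runs.
# BACKUP_MOUNT is a placeholder and is assumed to have a "noauto"
# entry in /etc/fstab so that `mount <dir>` works on its own.
BACKUP_MOUNT="${BACKUP_MOUNT:-/path/to/backupdrive}"

with_backup_drive() {
    # Usage: with_backup_drive COMMAND [ARGS...]
    mount "$BACKUP_MOUNT" || return 1
    status=0
    "$@" || status=$?          # remember whether the backup itself failed
    sync                       # flush pending writes to the disk
    umount "$BACKUP_MOUNT" || return 1
    return "$status"
}
```

For example, `with_backup_drive rsync -av --delete /media/owncloud /media/torrent /path/to/backupdrive/` mounts the drive, runs the mirror, and unmounts again even if rsync reports an error.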

Bonus Question 5: What period should I run (after the initial sync)? Once a day will surely not be enough and every minute may be a bit much.

That completely depends on you. If you file a very important document in your OwnCloud every day, you should back up every evening. If the contents of your OwnCloud and other folders only change every second day, and you can handle the loss of such a day, back up every fourth evening.
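Whatever interval you settle on, a script can check whether a backup is overdue, for instance so that a reminder only goes out when one is due. The sketch below is one way to do that. STAMP is a hypothetical file that backup.sh would `touch` after each successful run, MAX_AGE_DAYS is the interval you chose, and `stat -c %Y` assumes GNU coreutils, as shipped with Raspbian.

```shell
#!/bin/sh
# Decide whether a backup is due. STAMP is a hypothetical file that
# backup.sh would `touch` after every successful run.
STAMP="${STAMP:-$HOME/.last-backup}"
MAX_AGE_DAYS="${MAX_AGE_DAYS:-1}"

days_since() {
    # $1 = epoch seconds of the last backup, $2 = epoch seconds now
    echo $(( ($2 - $1) / 86400 ))
}

backup_due() {
    # No stamp file means no backup has been recorded yet.
    [ -e "$STAMP" ] || return 0
    last=$(stat -c %Y "$STAMP")   # modification time (GNU stat)
    now=$(date +%s)
    [ "$(days_since "$last" "$now")" -ge "$MAX_AGE_DAYS" ]
}
```

With this in place, something like `backup_due && echo "backup overdue"` only prints once the stamp file is at least MAX_AGE_DAYS whole days old.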

And if disk failure is your major concern, add a third drive for that RAID1. But don't forget the backups.

However, if that's all too much (which is understandable), you can always rent some space online for ~$60/year and back up your files there.

Zeta
  • 1,029
  • Awesome, thanks for all the effort you put into answering this. The only danger I'm really worried about is drive failure. There was a ticking noise coming from my little server cupboard that I didn't like. If someone breaks into my place and steals my hardware, I'll have bigger problems than losing data. Fortunately the hardware is WELL out of reach of my dogs! – Jim Oct 05 '16 at 07:08