
I have the following bash script that is running as a cronjob on OpenMediaVault:

BACKUP_DIR='/srv/dev-disk-by-uuid-9EE055CFE055ADF1/Backup dir/'
BACKUP_FILE_PATH="/srv/dev-disk-by-uuid-9EE055CFE055ADF1/Backup dir/Backup [ashen] ($(date +%d-%m-%Y)).tar.gz"
SERVER_DIR=/var/lib/docker/volumes/49c9e5c53ea5b9c893c0a80117860da9b493484395c0$
MAX_BACKUPS_COUNT=4

tar -zcf "$BACKUP_FILE_PATH" $SERVER_DIR

cd "$BACKUP_DIR"

[[ $( ls | wc -l ) -gt $MAX_BACKUPS_COUNT ]] && rm "$(ls -t | tail -1)"

The point of the script is to create a .tar.gz backup in the given location and, if there are more than 4 files in the backup directory, delete the oldest (the goal being to keep only the 4 most recent backups). The last line/command doesn't always work. Running it manually in a terminal works as expected, and sometimes the script executes it, but sometimes it stops working until I manually run the script/line again, at which point it seems to magically fix itself for some period of time.

Does anyone know why it sporadically stops executing the last line? Even when I see that backups are being created.

Culture
    What makes you think the last line is not executed? – berndbausch Mar 27 '21 at 14:41
  • Are you 100% sure that your file names will never contain spaces or other whitespace? This is a very, very fragile script that can only handle simple file names. What exactly does "not work" mean? Do you get any error messages? – terdon Mar 27 '21 at 14:52
  • @berndbausch, there are more than 4 files in the backup directory. A lot more. – Culture Mar 27 '21 at 14:54
  • So? Your command will only delete one file, so having more than 4 left doesn't mean it didn't run. What makes you say the command isn't executed? What do you mean that "the script stops"? Does it hang waiting for something? – terdon Mar 27 '21 at 14:56
  • @terdon, actually I am not 100% sure how the last line works. I copied it from this site. My file names will always contain spaces and other stuff; the naming convention is in the "BACKUP_FILE_PATH" variable and it contains whitespace. If you can give me a better alternative for a script that can 1. archive/backup a directory and 2. make sure only the last N backups are kept (probably by deleting the oldest if there are more than N) I'd be happy to use it. – Culture Mar 27 '21 at 14:56
  • @terdon, the only thing creating files in said directory is this script, no files are being deleted. I don't know how to check for cron script errors. :/ – Culture Mar 27 '21 at 14:57
  • Oh wow. Can you tell me where on this site you saw that? It's a very bad idea and I'd like to correct it. – terdon Mar 27 '21 at 15:03
  • @terdon, sorry, don't have the link but if you do a search for that exact command you'll probably find it. It was either here or on stackoverflow. Anyway, what command would you propose? – Culture Mar 27 '21 at 15:06
  • while I wouldn't do that with whitespace and parenthesis in the filenames, that doesn't seem obviously wrong. As long as there are no newlines in the names, ls | wc -l and ls -t | tail -1 should work fine. (Not quoting the $(ls | wc -l) also doesn't matter with default IFS, and the other one is properly quoted.) And while it only ever removes one file each run, if they're only created in the same script, one at a time, it shouldn't be able to end up having more. Hmm. – ilkkachu Mar 27 '21 at 15:12
  • @ilkkachu, what would you do with whitespace and parenthesis in the filenames? – Culture Mar 27 '21 at 15:14
  • @Culture, Not use them. Especially not when naming backup files or such, when using a - or _ is just much safer wrt. shell scripts and stuff. – ilkkachu Mar 27 '21 at 15:18
  • @ilkkachu, thanks but i'm curious - why is there a problem with the spaces and brackets? – Culture Mar 27 '21 at 15:25
  • @Culture, because not all the tools are exactly well designed to deal with them (though newlines are way worse), and esp. (POSIX) shell scripts give you a bunch of tools to do it wrong. see e.g. https://unix.stackexchange.com/questions/131766/why-does-my-shell-script-choke-on-whitespace-or-other-special-characters and https://dwheeler.com/essays/fixing-unix-linux-filenames.html (neither is too short, which kinda demonstrates the problem) – ilkkachu Mar 27 '21 at 15:29
  • You could modify the script to add set -x at the top, and then see from the output what it actually runs each time it runs. Assuming your cron is properly set up to send the output as email. – ilkkachu Mar 27 '21 at 16:09
  • @ilkkachu I think the main issue is that, as you point out, the script will only ever remove one file. So if we ever have more than 5 files, we will never get back down to 4. – terdon Mar 27 '21 at 18:19
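
Following up on the set -x suggestion in the comments above, here is a minimal sketch of how the top of the script could be changed so that every cron run leaves a trace you can inspect afterwards (the log path is just an example; pick somewhere writable by the user the cronjob runs as):

#!/bin/bash
# append all output, including the set -x trace, to a log file for later inspection
exec >>/var/log/backup-script.log 2>&1
set -x

# ... rest of the backup script unchanged ...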

1 Answer


Your script will fail if any of the file names contain newlines; parsing the output of ls is a very bad idea and very likely to break. Additionally, your script only ever deletes a single file per run (the oldest, since ls -t | tail -1 returns the last entry of a newest-first listing). So if there are 100 files, one run will leave you with 99. You seem to be expecting it to delete everything except the most recent 4, but the script's logic doesn't work that way.
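
To make the ls-parsing failure concrete, here is a minimal sketch in a throwaway directory (the filenames are hypothetical). When its output goes to a pipe, GNU ls prints a name containing a newline across two lines, so tail -1 captures only a fragment of the name and the subsequent rm targets a path that doesn't exist:

    $ cd "$(mktemp -d)"
    $ touch 'backup new.tar.gz'
    $ touch -d '2 days ago' $'backup\nold.tar.gz'
    $ ls -t | tail -1
    old.tar.gz
    $ rm "$(ls -t | tail -1)"
    rm: cannot remove 'old.tar.gz': No such file or directory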

Here's an alternative approach that can deal with arbitrary file names and actually deletes all but the most recent 4 files:

#!/bin/bash

# avoid using CAPS for local variable names in shell scripts
backup_dir='/srv/dev-disk-by-uuid-9EE055CFE055ADF1/Backup dir/'
backup_file_path="/srv/dev-disk-by-uuid-9EE055CFE055ADF1/Backup dir/Backup [ashen] ($(date +%d-%m-%Y)).tar.gz"
server_dir='/var/lib/docker/volumes/49c9e5c53ea5b9c893c0a80117860da9b493484395c0$'

# This needs to be set to the number of files you want to keep plus one,
# so that we can use tail -n +"$max_backups" below.
max_backups=5

tar -zcf "$backup_file_path" "$server_dir"

# delete all but the newest 4 tar.gz files in the
# backup directory
stat --printf '%Y %n\0' "$backup_dir"/*tar.gz | sort -rznk1,1 | tail -z -n +"$max_backups" | sed -z 's/^[0-9]* //' | xargs -0 rm -v

The work here is being done by that stat command and the various downstream pipes. Here's a breakdown of what the command is doing:

  • stat --printf '%Y %n\0' "$backup_dir"/*tar.gz: this prints, for every .tar.gz file in the backup directory, the file's age (its modification time in seconds since the epoch) followed by the file name. In order to be able to handle file names with newlines (\n), we need to end each entry with a NULL (\0). This is what the output looks like:

    $ stat --printf '%Y %n\0' * | tr '\0' '\n'
    1616867929 ./afile 5  tar.gz
    1616868565 ./file 10  tar.gz
    1616868560 ./file 1  tar.gz
    1616868561 ./file 2  tar.gz
    1616867927 ./file 3  tar.gz
    1616867928 ./file 4  tar.gz
    1616867930 ./file 6  tar.gz
    1616868562 ./file 7  tar.gz
    1616868563 ./file 8  tar.gz
    1616868564 ./file 9  tar.gz
    

For this example, I piped the output to tr '\0' '\n' so that it is legible, but in the actual output the end of each record has a \0 instead.

  • sort -rznk1,1: the output of the stat above is piped to sort, which will sort it numerically (-n), in reverse order (-r), using \0 as the record separator (-z) and only considering the 1st field (-k1,1), which is the file's age.

    The output looks like:

      $ stat --printf '%Y %n\0' "$backup_dir"/*tar.gz | 
          sort -rznk1,1 | tr '\0' '\n'
      1616868565 ./file 10  tar.gz
      1616868564 ./file 9  tar.gz
      1616868563 ./file 8  tar.gz
      1616868562 ./file 7  tar.gz
      1616868561 ./file 2  tar.gz
      1616868560 ./file 1  tar.gz
      1616867930 ./file 6  tar.gz
      1616867929 ./afile 5  tar.gz
      1616867928 ./file 4  tar.gz
      1616867927 ./file 3  tar.gz
    
  • tail -z -n +"$max_backups": the command tail -n +X prints its input starting from record X, skipping the first X-1 records. Here, X is the $max_backups variable, which is why that variable needs to be set to the number of files you want to keep plus one. The -z lets tail deal with null-terminated records.

    At this point, we have the list of files we want to delete, but each entry still carries the file's age, which we need to remove:

       $ stat --printf '%Y %n\0' "$backup_dir"/*tar.gz | sort -rznk1,1 |
           tail -z -n +5 | tr '\0' '\n'
      1616868561 ./file 2  tar.gz
      1616868560 ./file 1  tar.gz
      1616867930 ./file 6  tar.gz
      1616867929 ./afile 5  tar.gz
      1616867928 ./file 4  tar.gz
      1616867927 ./file 3  tar.gz
    
  • sed -z 's/^[0-9]* //': removes the file's age, leaving only the name. Once again, the -z is to deal with null-terminated records:

      $ stat --printf '%Y %n\0' "$backup_dir"/*tar.gz  | 
              sort -rznk1,1 | tail -z -n +5 | 
                  sed -z 's/^[0-9]* //' | tr '\0' '\n' 
      ./file 2  tar.gz
      ./file 1  tar.gz
      ./file 6  tar.gz
      ./afile 5  tar.gz
      ./file 4  tar.gz
      ./file 3  tar.gz
    
  • xargs -0 rm -v: the last step. This will delete the files and, once more, the -0 is what lets xargs handle the null-terminated records.
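
If you want to verify the selection before letting it delete anything, one option (just a suggestion, not part of the original answer) is to put echo in front of rm, so the pipeline only prints the command it would have run:

    $ stat --printf '%Y %n\0' "$backup_dir"/*tar.gz | sort -rznk1,1 |
          tail -z -n +"$max_backups" | sed -z 's/^[0-9]* //' | xargs -0 echo rm -v

Once the output lists only the backups you expect to lose, drop the echo.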

IMPORTANT: the script assumes you are using GNU tools. OpenMediaVault claims to be Linux and run Debian, so it should work for you, but I have never used that system so I cannot be sure.

terdon