
I have a folder on my system (Ubuntu) that gets synchronized with work using wget. Filenames are in the following format: A156.0.1.x, A156.0.y, A156.0.z, A156.0.a, A156.0.b. All files are created at some point at my office, and all have the same date and time. Rsync and any other connection to the office is not permitted.

I am synchronizing 4 times a day, and there is no pattern to how often new files will be created. There might not be a change in the folder for a couple of weeks, or there might be 10 changes in a day. Once a new set is created, the files will be named something like A156.1.[a,b,x,y,z]. Each file is huge (~500 MB).

So I end up having more than one set of files (5 per set) on my system, for a total of 10 files × 500 MB = 5 GB.

Is there any easy script that can be run by cron to check the folder frequently and delete the older files, so that I end up with only the latest set of 5? I could delete files that are older than x days, but we are never sure when the next set of files will be created.

john

3 Answers


You can use find piped into sort to list the files sorted by date, use cut on the output to strip the timestamps, and then use rm to delete all files except the latest 5. Running this periodically should have the result you are looking for.

I don't know of an existing script, but this should be fairly trivial to implement.
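For example, a minimal sketch along those lines, assuming GNU find, sort, and xargs, and filenames without embedded newlines (/path/to/work-folder is a placeholder for your download directory):

cd /path/to/work-folder || exit 1
find . -maxdepth 1 -type f -printf '%T@ %p\n' |  # epoch mtime, then filename
    sort -rn |                                   # newest first
    tail -n +6 |                                 # lines 6 onward = the older files
    cut -d' ' -f2- |                             # strip the timestamp
    xargs -r -d '\n' rm --                       # delete them, if any

This keeps the 5 newest files and removes the rest, so it could run from the same cron schedule that does the synchronization.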


If you can use zsh, it has glob qualifiers that make this pretty easy:

zsh -c 'rm work-folder/*(om[6,-1])'

That says to select all of the files in the work-folder directory, ordered (o) by modification time (m), newest first, and then keep only the matches from the sixth through the last ([6,-1]). This leaves the 5 most recent files in the folder.

This assumes that you have 6 or more files in the directory to begin with; you could wrap a test around the removal to be safer (all in zsh):

files=(work-folder/*(om))   # om: order by modification time, newest first
[ ${#files[@]} -gt 5 ] && echo rm -- "${(@)files[6,-1]}"

The (@) flag keeps each file name a separate word inside the quotes, and the echo makes this a dry run; drop it once the listed files look right.

It's more work in bash, as you need to call stat on each file and keep track yourself, along these lines.
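For instance, a minimal bash sketch (a dry run, assuming GNU stat and a folder that contains only the downloaded files, with no newlines in their names):

cd work-folder || exit 1
stat -c '%Y %n' -- * |    # epoch mtime, then filename
    sort -rn |            # newest first
    tail -n +6 |          # skip the 5 newest
    cut -d' ' -f2- |      # strip the timestamp
    while IFS= read -r f; do
        echo rm -- "$f"   # remove the echo to actually delete
    done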

Jeff Schaller

The following script will display a list of "new files" and "old files" in a directory. By "new files" is meant files that have been modified after the last run of the script, and by "old files", files that have not been modified since the last run of the script.

The script writes the output of date to a "timestamp file" and uses this file on the next run to determine which files have changed. On the first run, no output will be produced.

The script should be run manually; as it is written, it will only give you a way to detect which files have been modified in a particular directory.

#!/bin/sh

topdir=$HOME  # change this to point to the top dir where your files are

stamp="$topdir/timestamp"

if [ -f "$stamp" ]; then
    echo 'New files:'
    find "$topdir" -type f ! -name timestamp -newer "$stamp"

    echo 'Old files:'
    find "$topdir" -type f ! -name timestamp ! -newer "$stamp"
fi

date >"$stamp"

This could be modified to

  • prompt the user for deleting the old files,
  • detect only files matching a certain pattern (using -name 'pattern', e.g. -name 'A156.1.[abxyz]'; see the sketch after this list),
  • look at the inode change time ("ctime") instead of the modification time (using -cnewer instead of -newer if your find supports it),
  • etc.
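For instance, a minimal sketch combining the first two points, assuming the question's A156 naming (the pattern and the interactive rm -i prompt are illustrative, not part of the original script):

#!/bin/sh

topdir=$HOME  # change this to point to the top dir where your files are

stamp="$topdir/timestamp"

if [ -f "$stamp" ]; then
    # prompt before deleting each file matching the pattern that has
    # not been modified since the previous run
    find "$topdir" -type f -name 'A156.*.[abxyz]' ! -newer "$stamp" \
        -exec rm -i -- {} \;
fi

date >"$stamp"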
Kusalananda