3

The following command removes all files under the hive folder:

/usr/bin/find /var/log/hive -type f -print -delete

I am trying to do the following:

Remove the oldest files under /var/log/hive, but only if the folder size is more than 10G.

NOTE: purging should start only when the size of the hive folder exceeds 10G, and it should stop as soon as the size is back down to 10G.

Can this be done with the find command, or is there another approach?

Jeff Schaller
yael

3 Answers

6

On a GNU system, you could do something like:

cd /var/log/hive &&
  find . -type f -printf '%T@ %b :%p\0' |
    sort -zrn |
    gawk -v RS='\0' -v ORS='\0' '
      BEGIN {max = 10 * 1024 * 1024 * 1024} # 10GiB; use max=10e9 for 10GB
      {du += 512 * $2}
      du > max {
        sub("[^:]*:", ""); print
      }' | xargs -r0 echo rm -f

That is, sort the regular files by last modification time (from newest to oldest), then add up their cumulative disk usage (here assuming there are no hard links) and delete every file from the point where that total passes the 10GiB threshold.

Note that it doesn't take into account the size of the directory files themselves. It only considers the disk usage of regular files.

Remove echo when satisfied with the result.
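To see the cumulative-threshold logic in isolation, here is a minimal sketch that feeds made-up records to the same awk rules. The helper name select_over_limit, the sample names and the 8192-byte limit are all illustrative, and it uses newline-delimited input with plain awk rather than the NUL-delimited gawk pipeline above.

```shell
# Hypothetical helper: reads "mtime blocks :name" lines, newest first, and
# prints every name from the point where the running total exceeds the
# limit given in bytes.
select_over_limit() {
  awk -v max="$1" '
    { du += 512 * $2 }                       # %b counts 512-byte blocks
    du > max { sub(/[^:]*:/, ""); print }    # strip everything up to the first colon
  '
}

# Made-up sample: three files of 8 blocks (4096 bytes) each, newest first.
printf '%s\n' '300 8 :./new.log' '200 8 :./mid.log' '100 8 :./old.log' |
  select_over_limit 8192
```

The running totals are 4096, 8192 and 12288 bytes, so only ./old.log is printed: the comparison is strict, which means files are kept as long as the total is at most the limit.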

On one line:

find . -type f -printf '%T@ %b :%p\0' |sort -zrn|gawk -vRS='\0' -vORS='\0' '{du+=512*$2};du>10*(2^30){sub("[^:]*:","");print}'|xargs -r0 echo rm -f

To delete only *.wsp files when the cumulative disk usage of all regular files goes over 10GiB, you'd want to list the non-wsp files first. And at the same time, we can also account for the disk usage of directories and other non-regular files we were missing earlier:

cd /var/log/hive &&
  find . \( -type f -name '*.wsp' -printf WSP -o -printf OTHER \) \
     -printf ' %T@ %b :%p\0' |
    sort -zk 1,1 -k2,2rn |
    gawk -v RS='\0' -v ORS='\0' '
      BEGIN {max = 10 * 1024 * 1024 * 1024} # 10 GiB
      {du += 512 * $3}
      du > max && $1 == "WSP" {
        sub("[^:]*:", ""); print
      }' | xargs -r0 echo rm -f
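The effect of the two-key sort may be easier to follow on a few made-up records. This is only a sketch: the sample data and the 12288-byte limit are invented, and it uses newline-delimited input with plain sort and awk instead of the NUL-delimited form.

```shell
# Made-up sample records in the "TAG mtime blocks :name" layout used above.
sample() {
  printf '%s\n' \
    'WSP 100 8 :./a.wsp' \
    'OTHER 50 8 :./x.log' \
    'WSP 300 8 :./b.wsp' \
    'OTHER 400 8 :./y.log'
}

# OTHER sorts before WSP; within each tag, newest (largest mtime) first.
sample | sort -k1,1 -k2,2rn

# Every record counts toward the total, but only WSP records are printed.
sample | sort -k1,1 -k2,2rn |
  awk -v max=12288 '
    { du += 512 * $3 }
    du > max && $1 == "WSP" { sub(/[^:]*:/, ""); print }'
```

Because the non-wsp records come first, their disk usage is always counted before any .wsp file is considered, and with this sample only the oldest .wsp file (./a.wsp) ends up past the limit.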
  • Can I run it on one line? – yael Aug 09 '18 at 12:59
  • @yael, Sure, but why would you want to do that? All shells accept commands on several lines (though it can get awkward with csh or tcsh). – Stéphane Chazelas Aug 09 '18 at 13:01
  • Because I want to run it from crontab :-) – yael Aug 09 '18 at 13:05
  • @yael, OK. Though here, I'd rather create a script with that code and have the crontab run that script. See edit for the one line version. You may want to test that it works OK first. I've not tested it myself. – Stéphane Chazelas Aug 09 '18 at 13:08
  • Let's say we want to delete only the files ending in ".wsp". Can you please update the answer to show a second approach that deletes only the .wsp files? – yael Aug 09 '18 at 13:23
  • We tested your syntax: we created a 12G folder, ran the command, and then checked the folder size (du -sh .), which had dropped to 10G. But can you please show how to delete only the ".wsp" files? – yael Aug 09 '18 at 13:36
  • Not exactly. The concept is the same: if the total size of hive is more than 10G (as reported by du -sh .), no matter which files make it up, then only the oldest .wsp files should be deleted. – yael Aug 09 '18 at 13:53
  • I can open another question if you want. – yael Aug 09 '18 at 13:54
  • Great, let me test this syntax. I will be back soon with results. – yael Aug 09 '18 at 14:19
  • In case we want the folder size to be 5G instead of 10G, can we change du > 10 * (2^30) && $1 == "WSP" to du > 5 * (2^30) && $1 == "WSP"? – yael Aug 09 '18 at 15:54
  • First, thank you so much. I see that you updated the answer, so just to summarize: if, for example, we want to limit the folder to 5G, we need to change the line to BEGIN {max = 5 * 1024 * 1024 * 1024} # 10 GiB, am I right? – yael Aug 09 '18 at 17:43
  • I see that you made a change. Why not run find /var/log/hive? Why do we need "cd /var/log/hive" if find can search the logs under /var/log/hive directly? – yael Aug 10 '18 at 12:49
  • @yael, that would work too, but using find . means that file names will be shorter, which reduces the amount of processing by all utilities. It can also reduce the number of rm commands that xargs needs to run. It also avoids potential problems with /var/log/hive being renamed while the script is running. More generally, as find can't accept arbitrary directory/file names as arguments, I got into the habit of using it with . wherever possible. – Stéphane Chazelas Aug 10 '18 at 13:04
  • About my previous question ("to start purging when the total size exceeds 5G"): am I right? – yael Aug 10 '18 at 13:07
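For the crontab and 5G questions in the comments above, one possible shape is a small script in which the limit is a parameter. This is a sketch only: the function name purge_oldest and its arguments are hypothetical, and it assumes GNU find, sort and xargs plus gawk, exactly as the answer does.

```shell
#!/bin/sh
# purge_oldest DIR MAX_BYTES
# Hypothetical wrapper around the pipeline from the answer; the function name
# and its arguments are illustrative. Assumes GNU find, sort, xargs and gawk.
purge_oldest() {
  ( cd "$1" &&
    find . -type f -printf '%T@ %b :%p\0' |
      sort -zrn |
      gawk -v RS='\0' -v ORS='\0' -v max="$2" '
        { du += 512 * $2 }
        du > max { sub("[^:]*:", ""); print }' |
      xargs -r0 echo rm -f    # dry run: drop "echo" to really delete
  )
}

# Example call with a 5 GiB limit (commented out so nothing runs by accident):
# purge_oldest /var/log/hive "$((5 * 1024 * 1024 * 1024))"
```

Saved to a file, this could be run from crontab in place of the one-line version, which is the arrangement the answerer says he would prefer. The subshell keeps the cd from affecting the caller, and if the cd fails nothing else runs.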
2

Try this,

Option 1: Delete files that are older than 90 days and larger than 10G

find /var/log/hive -size +10G -mtime +90 -type f -print -delete

Option 2: Delete the oldest file that is larger than 10G

find /var/log/hive -size +10G -type f -printf '%T+ %p\n' | sort | head -n 1 | cut -d" " -f2- | xargs -r -d '\n' rm
Siva
  • So will both options delete files until the size is exactly 10G? (For example, if the size has grown to 11.5G.) – yael Aug 09 '18 at 12:36
  • This deletes files larger than 10Gb. – Kusalananda Aug 09 '18 at 12:36
  • I do not want to delete files that are larger than 10G. What I mean is: only when the total size under the hive folder is more than 10G should the deletion mechanism kick in and delete the oldest files. – yael Aug 09 '18 at 12:39
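The point made in the comments, that -size tests each file on its own rather than the directory total, can be checked on a scratch directory. This is a sketch with made-up file names and sizes; the function name size_test_demo is hypothetical, and the k suffix assumes a find that supports it (GNU find does).

```shell
# Hypothetical demo, run in a subshell so the cd does not leak out.
size_test_demo() (
  dir=$(mktemp -d) && cd "$dir" || exit 1
  # Two files of 2 KiB each: 4 KiB in total.
  dd if=/dev/zero of=a.log bs=1024 count=2 2>/dev/null
  dd if=/dev/zero of=b.log bs=1024 count=2 2>/dev/null

  echo "larger than 3k:"
  find . -type f -size +3k          # no single file exceeds 3 KiB

  echo "larger than 1k:"
  find . -type f -size +1k | sort   # each file exceeds 1 KiB

  cd / && rm -rf "$dir"
)

size_test_demo
```

Nothing is listed under the first heading even though the directory holds 4 KiB, which is why -size +10G cannot express "the folder has grown beyond 10G".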
1

How about

while test "$(du -s /var/log/hive | cut -f1)" -gt 10000000 ; do rm -i /var/log/hive/"$(ls -t /var/log/hive | tail -1)" ; done

?
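If parsing the output of ls is a concern, the oldest file could be picked with find instead. A minimal sketch, assuming GNU find and a coreutils version whose sort, head and cut accept -z; the helper name oldest_file is hypothetical, and unlike ls -t it looks at regular files in subdirectories too.

```shell
# Hypothetical helper: print the path of the oldest regular file under DIR.
# Assumes GNU find and GNU coreutils (sort -z, head -z, cut -z).
oldest_file() {
  find "$1" -type f -printf '%T@ %p\0' |   # "mtime path", NUL-terminated
    sort -zn |                             # oldest first
    head -zn 1 |                           # keep the first record only
    cut -zd ' ' -f2- |                     # drop the mtime, keep the path
    tr -d '\0'                             # strip the trailing NUL
}

# Example (commented out): show the oldest file under the log directory.
# oldest_file /var/log/hive
```

Its output could stand in for the "$(ls -t /var/log/hive | tail -1)" part of the loop above. Since the result is a full path, it would be passed to rm as is, without prefixing the directory.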

Hermann
  • Related: https://unix.stackexchange.com/questions/128985 – Kusalananda Aug 09 '18 at 12:37
  • Does it stop deleting when the size is 10G? (What we need is for the deletion to stop when the size is exactly 10G.) I also updated the question with a note to make this clearer. – yael Aug 09 '18 at 12:41
  • @yael: Yes, it stops if the size reaches 10GB or less (10GB, not 10GiB).

    @Kusalananda: I took this from https://unix.stackexchange.com/questions/242496/bash-script-to-remove-the-oldest-file-from-from-a-folder#242502. How to select the oldest file in a directory without parsing ls?

    – Hermann Aug 09 '18 at 16:10
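Since the thread mixes 10GB and 10GiB, the arithmetic behind the different thresholds can be spelled out with shell arithmetic. This sketch assumes du reports 1024-byte units, which is what du -sk requests and what GNU du does by default unless POSIXLY_CORRECT is set.

```shell
# Thresholds in bytes.
gib10=$((10 * 1024 * 1024 * 1024))   # 10 GiB
gb10=$((10 * 1000 * 1000 * 1000))    # 10 GB
echo "10 GiB = $gib10 bytes"
echo "10 GB  = $gb10 bytes"

# The same thresholds expressed in 1024-byte units, as du -sk reports them.
echo "10 GiB = $((gib10 / 1024)) KiB"
echo "10 GB  = $((gb10 / 1024)) KiB"

# What a test against 10000000 such units corresponds to in bytes.
echo "10000000 KiB = $((10000000 * 1024)) bytes"
```

Under that assumption, a du-based test would compare against 10485760 for a 10 GiB limit or 9765625 for a 10 GB limit, while 10000000 falls between the two.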