I need to delete directories that only contain files (created/modified) older than X days.

I tried this, but it will only show files from today and not recursively...

for d in *; do
  find "$d" -mindepth 1 -mtime -180 -print -quit | grep -q . ||
  echo rm -rf "$d"
done
Kusalananda
  • Firstly, mtime doesn't indicate the age of the files. It indicates the modification time. If that is actually what you want, then you need -mtime +180. -mtime -180 means modified within the last 180 days. I would also recommend against using * because you could accidentally return data that you don't want. To make sure, simply use the parent directory as the argument. – Nasir Riley Feb 08 '22 at 12:40
  • What should happen to a directory that contains child directories? Presumably you don't want to delete it as the children in turn might contain files that are younger than your age limit – Chris Davies Feb 08 '22 at 12:54
  • @NasirRiley Whether a file was created or modified does not really matter. I need to delete the directory if the content is older than x. – Vercingetorix Feb 08 '22 at 13:49

2 Answers

2

You want "to delete directories that only contain files older than X days".

Here's one way of handling it:

  • For each directory in turn, starting from leaf nodes:
  • Count the number of its child directories; skip if non-zero
  • Count the number of its non-directory items; skip if zero
  • Count the number of its files matching your criterion (age)
  • Delete the directory if the two file counts are the same

This solution requires GNU find or an alternate version that understands -mindepth, -maxdepth, and -printf.

This particular example sets your X to 180 (days).

days=180
find ./* -depth -type d -exec sh -c '
    [ -z "'"$days"'" ] && exit 1
    printf "Considering directory: %s\n" "$1"

    # Skip any directory that has child directories
    dirs=$(find "$1" -mindepth 1 -maxdepth 1 -type d -print | xargs)
    if [ -n "$dirs" ]
    then
        printf "Found child directories: (%s)\n" "$dirs"
        exit 0
    fi

    # Count all non-directory entries, then the regular files older than the limit
    all=$(find "$1" -maxdepth 1 ! -type d -printf x | wc -c)
    got=$(find "$1" -maxdepth 1 -type f -mtime +'"$days"' -printf x | wc -c)
    printf "Found %d item(s) and matched %d\n" $all $got

    if [ $all -gt 0 ] && [ $all -eq $got ]
    then
        printf "Delete directory: %s\n" "$1"
        : rm -rf "$1"
    fi
' _ {} \;

The seemingly strange quoting around the two uses of the variable $days is to ensure that although the rest of the script is enclosed in single quotes, the variable is enclosed in double-quotes so that its value can be determined. It may be more easily understood as '..start of script..' "$variable" '..remainder of script..', except that we have no spaces between the ending of one type of quotes and the start of the other.
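As a minimal illustration of that quoting (a sketch only, with echo standing in for the real script), the first form below splices the value into the script text the way the answer does. The other two are alternatives, one of which is suggested in a comment on this answer: pass the value through the environment or as a positional parameter, so that the script can stay entirely inside single quotes.

```shell
days=180

# Form used in the answer: close the single quotes, insert the
# double-quoted variable, then reopen the single quotes.
a=$(sh -c 'echo "limit is '"$days"' days"')

# Alternative: pass the value through the environment of the child shell.
b=$(days="$days" sh -c 'echo "limit is $days days"')

# Alternative: pass it as a positional parameter ($0 is the placeholder "_").
c=$(sh -c 'echo "limit is $1 days"' _ "$days")

printf '%s\n' "$a" "$b" "$c"
```

All three lines of output should be identical. The splice form makes the value part of the script's source text, which is why the two alternatives are usually considered more robust if the value could ever contain shell metacharacters.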

Remove the printf statements if you want the code to run silently. Remove the colon (:) in front of the rm -rf when you are ready to execute the delete action.
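Before removing the colon, it may be worth trying the logic on a scratch tree. The following is a hedged sketch, not the answer's script itself: it is a condensed variant of the same checks that only lists candidates, it passes days through the environment as suggested in the comments, and it assumes GNU find and GNU touch (for -d). The directory names (old, mixed, parent/child, empty) are invented for the test.

```shell
days=180
tmp=$(mktemp -d)
mkdir -p "$tmp/old" "$tmp/mixed" "$tmp/parent/child" "$tmp/empty"
touch -d '400 days ago' "$tmp/old/a" "$tmp/old/b" "$tmp/mixed/a" "$tmp/parent/child/a"
touch "$tmp/mixed/b"   # recent file, so "mixed" must be kept

candidates=$(
    cd "$tmp" &&
    days=$days find ./* -depth -type d -exec sh -c '
        # Skip directories that have child directories.
        [ -n "$(find "$1" -mindepth 1 -maxdepth 1 -type d -print -quit)" ] && exit 0
        # Count every non-directory entry, then the old regular files.
        all=$(find "$1" -maxdepth 1 ! -type d -printf x | wc -c)
        got=$(find "$1" -maxdepth 1 -type f -mtime +"$days" -printf x | wc -c)
        if [ "$all" -gt 0 ] && [ "$all" -eq "$got" ]; then
            printf "%s\n" "$1"
        fi
    ' _ {} \; | sort
)

printf 'Would delete:\n%s\n' "$candidates"
rm -rf "$tmp"
```

Here ./old and ./parent/child are reported. ./mixed is kept because of its recent file, ./empty because it has no files at all, and ./parent because it still has a child directory during a dry run.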

Chris Davies
  • shouldn't you put -mtime +179 for 180 days and more? this way it'll be 181 days and more... reference: https://unix.stackexchange.com/questions/92346/why-does-find-mtime-1-only-return-files-older-than-2-days – golder3 Feb 08 '22 at 14:39
  • @golder3 the question actually says "older than X days", so I'll adjust my 180 to be X – Chris Davies Feb 08 '22 at 14:43
  • my point was that if you input 180 as X you actually get files older than 181 days – golder3 Feb 08 '22 at 14:55
  • @golder3 -mtime +180 will match files that were either modified precisely 181 days ago, or modified more than 181 days ago. That's what the OP asked for: "files older than 180 days". – Chris Davies Feb 08 '22 at 15:15
  • Yes, and logically, if you don't round the number of days, a file 180 days and 2 hours old is more than 180 days old and it won't be deleted... But OK, just wanted to point that out to OP – golder3 Feb 08 '22 at 15:22
  • @golder3 it's an interesting issue, but (as I suspect you know) it's down to the integer size of the units being considered. Since we're in days, 180.9 days is still 180 days – Chris Davies Feb 08 '22 at 16:24
  • The OP had -mtime -180, and with POSIX-compliant find implementations including GNU find, the opposite of that is -mtime +179 (or '(' -mtime 180 -o -mtime +180 ')'), not -mtime +180. So you'd need -mtime +"$((days - 1))" for files older than $days days: 180.9 days is more than 180 days by most people's definition of "more". – Stéphane Chazelas Feb 09 '22 at 08:48
  • You could make it days=180 find to avoid that funny quoting. – Stéphane Chazelas Feb 09 '22 at 08:49
  • Note that while -mindepth and -maxdepth are now widespread, AFAIK, -printf is still GNU-specific. Omitting the list of files find is to work on (here .) is also non-standard, non-portable. – Stéphane Chazelas Feb 09 '22 at 08:50
  • The script considers directory files (-type d), regular files (-type f), but what about the other types of files (symlinks, devices, fifos etc)? – Stéphane Chazelas Feb 09 '22 at 08:51
  • @StéphaneChazelas I'm pretty sure they're covered ($all counts them), in that if they exist in the directory then the criterion "only contain files […]" cannot match $got counting just files – Chris Davies Feb 09 '22 at 10:05
  • Ah sorry missed that, so you're skipping directories that contain non-directory non-regular files even if they're all old. I suppose that makes sense and makes it a fail-safe if files of those types are not expected. – Stéphane Chazelas Feb 09 '22 at 11:15
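
The rounding behaviour discussed in these comments can be checked on a scratch directory. This is a sketch that assumes GNU touch (for -d with a relative time); the file names are made up for illustration. Ages are given in hours (4332 h = 180.5 days, 4356 h = 181.5 days) to keep the arithmetic exact.

```shell
tmp=$(mktemp -d)
touch -d '4332 hours ago' "$tmp/age-180.5-days"
touch -d '4356 hours ago' "$tmp/age-181.5-days"

# find discards the fractional part of the age, so 180.5 days counts as 180
# and is therefore not "more than 180" as far as -mtime +180 is concerned.
plus180=$(cd "$tmp" && find . -type f -mtime +180 | sort)
plus179=$(cd "$tmp" && find . -type f -mtime +179 | sort)

printf 'With -mtime +180:\n%s\n' "$plus180"
printf 'With -mtime +179:\n%s\n' "$plus179"
rm -rf "$tmp"
```

Only the 181.5-day-old file matches +180, while +179 matches both, which is the difference the two commenters are describing.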
1

Assuming you want the shallowest directories that don't contain, at any level, non-directory files that are not older than 180 days (if neither ./a/b nor ./a contains recent files, only report ./a, as ./a/b is part of it), that would be a variation on "Listing only shallowest directories containing no files, all the way down", so you could use the same approach as my answer there if on a GNU system:

find . -type d -print0 -o -mtime -180 -printf 'f/%h\0' |
  LC_ALL=C sort -zru |
  LC_ALL=C awk -F/ -v RS='\0' '
    function parent(path) {
      sub("/[^/]*$", "", path)
      return path
    }
    $1 == "f" {
      sep = path = ""
      for (i = 2; i <= NF; i++) {
        black[path = path sep $i]
        sep = FS
      }
      next
    }
    ! ($0 in black) && ($0 == "." || parent($0) in black)'

Where instead of painting black the directories that have any non-directory file, we paint black the ones that have any non-directory file not older than 180 days.

Or a variation which removes directories from an array when a recent file is found:

find . -type d -printf '%p/\0' -o -mtime -180 -printf '%h/f\0' |
  LC_ALL=C sort -zu |
  LC_ALL=C awk -F/ -v RS='\0' '
    function parent(path) {
      sub("[^/]+/?$", "", path)
      return path
    }
    /\/$/ {dir[$0]; next}
    {
      path = ""
      for (i = 1; i <= NF; i++)
        delete dir[path = path $i FS]
    }
    END {
      for (path in dir) if (! (parent(path) in dir)) print path
    }'

If you want both ./a and ./a/b above (not only the shallowest), it becomes simpler:

find . -type d -printf '%p/\0' -o -mtime -180 -printf '%h/f\0' |
  LC_ALL=C sort -zu |
  LC_ALL=C awk -F/ -v RS='\0' '
    /\/$/ {dir[$0]; next}
    {
      path = ""
      for (i = 1; i <= NF; i++)
        delete dir[path = path $i FS]
    }
    END {
      for (path in dir) print path
    }'

If you want to delete them, add a -v ORS='\0' to awk to print them NUL-delimited, and pipe to xargs -r0 rm -rf, but beware that if ./ is returned (if there's no recent file anywhere in the current directory), most rm implementations will refuse to do anything¹.
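
Before piping anything into rm, the simpler variant can be tried on a scratch tree. The sketch below is an adaptation, not the command above verbatim: it uses newline-delimited records instead of NUL so that it does not depend on awk handling RS='\0', which means it assumes no file name contains a newline. It also assumes GNU find (-printf) and GNU touch (-d), and the tree layout is invented for the test.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b" "$tmp/c/d"
touch -d '400 days ago' "$tmp/a/old" "$tmp/a/b/old" "$tmp/c/old"
touch "$tmp/c/d/new"            # recent file: rules out ./c/d/, ./c/ and ./

stale=$(
  cd "$tmp" &&
  find . -type d -printf '%p/\n' -o -mtime -180 -printf '%h/f\n' |
    LC_ALL=C sort -u |
    LC_ALL=C awk -F/ '
      /\/$/ {dir[$0]; next}
      {
        path = ""
        for (i = 1; i <= NF; i++)
          delete dir[path = path $i FS]
      }
      END {
        for (path in dir) print path
      }' |
    LC_ALL=C sort
)

printf '%s\n' "$stale"
rm -rf "$tmp"
```

Only ./a/ and ./a/b/ are listed, since the recent file under ./c/d/ removes that directory and all of its ancestors from the array.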

Since those approaches run find and traverse the whole directory tree only once, they're more efficient than more naive approaches that crawl the contents of each directory for recent files.

One such less efficient approach in zsh:

print -rC1 -- **/*(ND/^e['()(($#)) $REPLY/**/*(NDm-180Y1^/)'])

(won't include ., though you could add it to the list with {.,**/*}(...))

If it's only the subdirectories of the current directory that you want to consider, then that approach becomes more efficient (as the Y1 above stops at the first match) as well as being shorter/simpler:

print -rC1 -- *(ND/^e['()(($#)) $REPLY/**/*(NDm-180Y1^/)'])

¹ as a workaround for the fact that the expansion of .* in some shells includes . and .. and could be catastrophic with rm -rf .*. The rm builtin of zsh (a shell that doesn't have that misfeature), which you enable with zmodload zsh/files, will accept rm -rf ./ and empty the current working directory.
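
The refusal described in this footnote can be observed safely inside a throwaway directory. This is a sketch that assumes the external rm utility is used (not the zsh builtin mentioned above); the file name keep-me is arbitrary.

```shell
tmp=$(mktemp -d)
touch "$tmp/keep-me"

# Run in a subshell so the caller's working directory is left alone;
# rm is only reached if the cd into the scratch directory succeeded.
if (cd "$tmp" && rm -rf ./ 2>/dev/null); then
    status="rm reported success"
else
    status="rm reported an error"
fi

if [ -e "$tmp/keep-me" ]; then survived=yes; else survived=no; fi
printf '%s; file survived: %s\n' "$status" "$survived"
rm -rf "$tmp"
```

With an rm that refuses ./ as an operand, the file is still there afterwards, so a pipeline that returns ./ would silently delete nothing for that entry.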