The following bash script searches through all directories and tabulates how many files are in each directory.

Sample output from testing on my own R installation is below; it is exactly what I need.

find . -mindepth 6 -type d -print0 | while IFS= read -r -d '' i ; do echo -n "$i: " ; ls -p "$i" | grep -v / | wc -l ; done

My question is: how can I feed this output (saved as "files.txt", for example) into another command, such as this:

xargs rm -f files.txt    # ("<" is missing)

to read the contents of files.txt, which lists all directories traversed, and delete all files (and only files, NOT folders; the directory structure must not be changed) in those directories with MORE than one file in them?

In the sample output below, the files in every directory would be deleted, except for these three directories:

./R/R-3.6.1/src/library/tcltk/R/windows: 1  
./R/R-3.6.1/src/library/compiler/man: 1   
./R/R-3.6.1/src/library/compiler/R: 1

Sample output:

./R/R-3.6.1/src/library/tools/man: 64   
./R/R-3.6.1/src/library/tools/tests: 3   
./R/R-3.6.1/src/library/tools/src: 16  
./R/R-3.6.1/src/library/tools/po: 23  
./R/R-3.6.1/src/library/tools/R: 49    
./R/R-3.6.1/src/library/tcltk: 4   
./R/R-3.6.1/src/library/tcltk/man: 14  
./R/R-3.6.1/src/library/tcltk/exec: 12  
./R/R-3.6.1/src/library/tcltk/src: 7   
./R/R-3.6.1/src/library/tcltk/po: 21  
./R/R-3.6.1/src/library/tcltk/R: 6   
./R/R-3.6.1/src/library/tcltk/R/unix: 2  
./R/R-3.6.1/src/library/tcltk/R/windows: 1  
./R/R-3.6.1/src/library/tcltk/demo: 5  
./R/R-3.6.1/src/library/compiler: 4  
./R/R-3.6.1/src/library/compiler/man: 1  
./R/R-3.6.1/src/library/compiler/noweb: 2  
./R/R-3.6.1/src/library/compiler/tests: 10  
./R/R-3.6.1/src/library/compiler/po: 10  
./R/R-3.6.1/src/library/compiler/R: 1   
./R/R-3.6.1/src/library/graphics: 4

Thanks.

Chris Davies
  • Why not use the exec option of find to remove those files? – Panki Nov 11 '19 at 15:26
  • Or perhaps just ask your actual question, which is more like "how can I delete all files (and only files, NOT folders, directory structure must not be changed) of those directories with MORE than one file in it?" – Jeff Schaller Nov 11 '19 at 15:35

3 Answers

Something like this?

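find . -mindepth 6 -type f \
  | sed -E 's:/[^/]+$::' \
  | sort \
  | uniq -c \
  | awk '$1 > 1 {print $2}' \
  | xargs -I% find % -maxdepth 1 -type f -delete

Breakdown

find . -mindepth 6 -type f gets a list of files (-mindepth comes first because GNU find warns when a global option follows a test such as -type)

sed -E 's:/[^/]+$::' removes the file names, leaving just the directory

sort is required for the next command to work correctly

uniq -c counts consecutive identical lines

awk '$1 > 1 {print $2}' filters out directories where only 1 file was found; what remains has 2 or more files in it (print $2 assumes directory names contain no whitespace)

xargs -I% find % -maxdepth 1 -type f -delete searches each resulting directory for files and removes them all; -maxdepth 1 keeps find from descending into subdirectories, which may themselves hold only a single file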
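To see what the counting stages produce before anything is deleted, they can be run on their own. This is only a sketch: crowded_dirs is a hypothetical helper name and the three paths are made-up sample input.

```shell
# Hypothetical helper: read file paths on stdin and print each parent
# directory that occurs more than once, i.e. holds more than one listed file.
crowded_dirs() {
    sed -E 's:/[^/]+$::' | sort | uniq -c | awk '$1 > 1 {print $2}'
}

# Made-up sample input: two files in ./lib/man, one file in ./lib/R.
printf '%s\n' ./lib/man/a.Rd ./lib/man/b.Rd ./lib/R/only.R | crowded_dirs
# prints ./lib/man
```

Because awk prints only $2, a directory name containing a space would be cut at the first space.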

With some backflips and a temp file we could probably avoid doing the xargs find, instead getting the list of things to delete from the input.
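A rough sketch of that idea, assuming file names contain no newlines. purge_crowded_dirs and its depth argument are hypothetical names, not part of the pipeline above. find writes the file list to a temp file once, and awk reads it twice: the first pass counts files per directory, the second prints the files whose directory holds more than one.

```shell
# Hypothetical helper: delete regular files whose parent directory directly
# contains more than one regular file. $1 = start directory, $2 = minimum
# depth of the files (defaults to 6). The body runs in a subshell, so its
# variables do not leak into the caller.
purge_crowded_dirs() (
    root=$1
    depth=${2:-6}
    list=$(mktemp) || exit 1

    # Single find pass: one candidate file per line.
    find "$root" -mindepth "$depth" -type f > "$list"

    # Pass 1 (NR == FNR): count files per parent directory.
    # Pass 2: print each file whose parent directory was counted more than once.
    awk '{ dir = $0; sub("/[^/]*$", "", dir) }
         NR == FNR { count[dir]++; next }
         count[dir] > 1' "$list" "$list" |
    while IFS= read -r f; do
        rm -f -- "$f"
    done

    rm -f -- "$list"
)
```

It could be called as purge_crowded_dirs . 6. Since whole lines are passed around rather than awk fields, spaces in directory names are not a problem here.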

bxm

If you want to feed the file to rm, then each line needs to be in a form that tells rm to remove all of the files in the listed directory, which is

rm ./path/to/delete/*

So....

sed -E "s|:\s[0-9]+$|/*|" files.txt

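This tells sed to replace, on every line of your file, the sequence of a colon :, a whitespace character \s, one or more digits [0-9]+ and the line end $ with /*. Note that it rewrites every line regardless of the count, so directories holding a single file are included as well.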

If you feed this to rm as it stands, rm will print an error for every subdirectory the glob matches, because without -r it refuses to remove directories. If the messages bug you, redirect stderr

rm $(sed -E "s|:\s[0-9]+$|/*|" files.txt) 2>/dev/null

This will fail if directory names contain whitespace. In that case, still sticking with feeding your file to rm, you could set IFS to a newline and restore it afterwards

OFS=$IFS; IFS=$'\n'; rm $(sed -E "s|:\s[0-9]+$|/*|" files.txt) 2>/dev/null; IFS=$OFS
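To leave directories with a single file alone, the listing could be filtered on the count before it is turned into glob patterns. A minimal sketch, assuming every line of files.txt has the form path: count; globs_for_crowded_dirs is a hypothetical name.

```shell
# Hypothetical helper: print one "dir/*" pattern for every line of the
# listing ($1) whose count is greater than 1.
globs_for_crowded_dirs() {
    # awk keeps lines whose last ": "-separated field is > 1;
    # sed then swaps the ": count" suffix for "/*".
    awk -F': ' '$NF > 1' "$1" | sed -E 's|: [0-9]+[[:space:]]*$|/*|'
}
```

Its output can stand in for the sed call in the commands above, for example rm $(globs_for_crowded_dirs files.txt) 2>/dev/null, with the same IFS caveat for whitespace in names.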
bu5hman

Since you already have a method to detect which directories contain more than one file, and the result is stored in a file (you called it files.txt), you could use a shell script to do the task:

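#!/bin/bash

# Each line of files.txt looks like "path: count".
# IFS is set for read only, so the rest of the script keeps the default.
while IFS=":" read -r path count
do
    if (( count > 1 ))
    then
        echo "Remove all files in $path (count = $count)"
        # Without -r, rm refuses to remove subdirectories (it prints an error instead).
        rm "$path/"*
    fi
done < files.txt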
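If the error messages for subdirectories are unwanted, a variant could let find pick out the regular files instead of using a glob. This is only a sketch under the same assumption about the listing format (each line ends directly with the count); clean_from_listing is a hypothetical name, and -maxdepth and -delete are GNU/BSD find extensions.

```shell
# Hypothetical helper: for every "path: count" line in the listing ($1)
# with count > 1, delete the regular files directly inside path.
clean_from_listing() {
    while IFS= read -r line; do
        # Skip lines that do not end in ": <digit>".
        case $line in
            *": "*[0-9]) ;;
            *) continue ;;
        esac
        path=${line%: *}     # text before the last ": "
        count=${line##*: }   # text after the last ": "
        if [ "$count" -gt 1 ] 2>/dev/null; then
            # -maxdepth 1 leaves subdirectories and their contents untouched.
            find "$path" -maxdepth 1 -type f -delete
        fi
    done < "$1"
}
```

Unlike the glob, find also removes hidden files, which the original ls-based count does not include.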
AdminBee
  • This is great!!! Would there be a way to add one additional criteria:
    to rm -f only files > "X" number of days? Thanx.
    – jumboshrimps Nov 12 '19 at 16:33
  • That should also be possible, but I have to ask -- do you mean only files which are older than X days? – AdminBee Nov 13 '19 at 07:51