4

I want to remove all the files from different directories and keep only the latest 20 files. Is this the correct command to do this?

ls -t1 /mnt/dwh/ftp/dwh_ftp_cbs/ARLOGS/ | tail -n +22 | xargs rm -f
Ali
    Useful tip 1: replace the rm -f after xargs with ls -ld to see what it would delete. Tip 2: if you're using GNU xargs (the default on Linux), you can use -d'\n' to use newlines as the input delimiter. This will prevent files with spaces etc. in their names from causing problems. Tip 3: ls -1 is mostly safe, but be wary of parsing the output of ls, in particular ls -l. Tip 4: find is usually a better choice than ls for this kind of job. – cas Oct 30 '15 at 07:24

3 Answers

3

Your command has several drawbacks, chief among them that it parses the output of ls, which is a well-known pitfall. See this article on why not to do this.

This approach uses no GNU extensions, and is tested on bash, zsh, ksh93, dash and ash:

for f in * .*; do 
  [ -f "$f" ] && printf "%s " $(perl -e 'print((stat("$ARGV[0]"))[9])' -- "$f") && \
  printf "%s\0" "$f";
done | tr '\0\n' '\n\0' | sort -k1nr | \
sed "1,20d;s/'/'\"'\"'/g;s/[^ ]* \(.*\)/rm -f -- '\1';/" | tr '\0' '\n' | sh

This is the weirdest filename I was able to produce, and it works even with this:

touch -- '--a'"'"'b"c
d$e\nf\r!g;h^i`j(k)l*m%n=o?p.txt'

Or a filename which tries to "jump out":

touch '\'"'"'echo test '"'"'\'
chaos
  • It's still GNU-specific in that most other sort or sed implementations will choke on input containing NUL characters. You'd want to do the whole thing in perl. Note that stat will check the modification time of the target of symlinks; use lstat so that it behaves more like ls -t. Also, it will not support sub-second resolution. – Stéphane Chazelas Oct 30 '15 at 13:28
  • Also note that the . regexp operator in GNU sed won't match bytes not forming valid characters, so you should fix the locale to C. – Stéphane Chazelas Oct 30 '15 at 13:45
3

With zsh and glob-qualifiers:

print -rl -- *(D.Om[1,-21])

will list all regular files except the last (most recently modified) twenty.
D selects hidden files, . selects only regular files, Om means reverse sort by mtime (so oldest first) and [1,-21] selects from the first up to the 21st-to-last.
If you're happy with the result, replace print -rl with rm:

rm -- *(D.Om[1,-21])

If you have a huge number of files, you may have to use zargs to avoid an "argument list too long" error:

autoload zargs
zargs ./*(D.Om[1,-21]) -- rm
don_crissti
3
ls -t1 /mnt/dwh/ftp/dwh_ftp_cbs/ARLOGS/ | tail -n +22 | xargs rm -f

This won't work because the output of ls only includes the files' names, not their full paths. It would also skip files whose names start with a dot (.).

cd /mnt/dwh/ftp/dwh && ls -At1 | tail -n +21 | xargs rm -f --

would solve that, but you'd still have problems with filenames that contain blanks or newline or apostrophe or backslash or double quote characters (and possibly file names with bytes that don't form valid characters in the locale).

(export LC_ALL=C
 cd /mnt/dwh/ftp/dwh && ls -At1 |
   tail -n +21 |
   sed 's/"/\\"/g;s/.*/"&"/' |
   xargs rm -f --
)

would fix most of those, but you'd still have a problem with filenames containing newline characters. The problem is that the output of ls -At1 is not post-processable.
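The splitting behaviour described above can be observed safely, without deleting anything, by counting the arguments xargs would hand to a command. This is only an illustrative sketch: count_args is a hypothetical helper name, not part of the original answer, and the -d option assumes GNU xargs.

```shell
# count_args (hypothetical helper): print how many arguments xargs would
# pass to a command for the input read on stdin.
# Any options given (e.g. -d '\n' or -0) are forwarded to xargs.
count_args() {
  xargs "$@" sh -c 'echo "$#"' sh
}

# Default parsing: a blank splits one filename into two arguments.
printf '%s\n' 'my file.log' | count_args          # prints 2

# GNU xargs with newline as the only delimiter: one argument.
printf '%s\n' 'my file.log' | count_args -d '\n'  # prints 1

# NUL-delimited input: even a name containing a newline stays whole.
printf 'a\nb\0' | count_args -0                   # prints 1
```

Only the NUL-delimited form copes with every possible filename, which is why the approaches further down produce NUL-delimited lists.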

The output of ls -Atd1 ./.* ./* could be post-processed, though not easily (those ./ prefixes indicate where each filename starts in the output), but you run the risk of reaching the limit on the number of arguments passed to a command by passing the list of filenames to ls like that.

Best would be to use a shell that can do that sorting in its globs, or rely on GNU extensions if you only need to work on GNU systems:

 cd /mnt/dwh/ftp/dwh &&
   find . ! -name . -prune ! -type d -printf '%T@\t%p\0' |
     tr '\0\n' '\n\0' |
     sort -rn |
     tail -n +21 |
     cut -f 2- |
     tr '\0\n' '\n\0' |
     xargs -r0 rm -f

Here, we're excluding files of type directory, which will affect the numbering but is probably closer to what you want, as I don't suppose you want to remove directories. If you do want to remove directories, remove the ! -type d and add the -r option to rm.
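Before running the find pipeline above for real, it may help to preview what it would select. The following is a sketch under the same GNU assumptions (find -printf, xargs -0); list_prunable is a hypothetical function name introduced here, taking the directory and the number of files to keep as its two arguments.

```shell
# list_prunable (hypothetical name): same stages as the find pipeline above,
# but the selected names are printed NUL-delimited instead of being removed.
#   $1 = directory to examine, $2 = number of newest files to keep
list_prunable() {
  (
    cd -- "$1" &&
    find . ! -name . -prune ! -type d -printf '%T@\t%p\0' |
      tr '\0\n' '\n\0' |          # swap NUL and newline so line tools work
      LC_ALL=C sort -rn |         # newest first
      tail -n "+$(($2 + 1))" |    # skip the $2 newest entries
      cut -f 2- |                 # drop the timestamp column
      tr '\0\n' '\n\0'            # swap back: output is NUL-delimited
  )
}
```

For example, list_prunable /some/dir 20 | xargs -r0 ls -ld -- run from inside that directory shows the candidates; the names are relative (./name), so the listing or removal command has to run with the same working directory.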

Or:

eval "set -- $(COLUMNS=4294967295 ls -Atx --quoting-style=shell-always)"
if [ "$#" -gt 20 ]; then
  shift 20
  printf '%s\0' "$@" | xargs -r0 rm -f --
fi

Or use a general-purpose programming language such as perl, python, ruby...