I run this command to find the biggest files:
du -Sh | sort -rh | head -5
Then I do rm -rf someFile.
Is there a way to automatically delete the files found from the former command?
If you're using GNU tools (which are standard on Linux), you could do something like this:
stat --printf '%s\t%n\0' ./* |
sort -z -rn |
head -z -n 5 |
cut -z -f 2- |
xargs -0 -r echo rm -f --
(remove the 'echo' once you've tested it).
The stat command prints out the file size and name of each file in the current directory, separated by a tab, with each record terminated by a NUL (\0) byte. The sort command sorts the NUL-terminated records in reverse numeric order. The head command lists only the first five such records, then cut removes the file size field from each record. Finally, xargs takes that (still NUL-terminated) input and uses it as the arguments for echo rm -f.
Because this uses NUL as the record (filename) terminator, it copes with filenames that have any valid character in them.
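If you want to see that in action, here's a disposable test you could run in an empty directory (nul-test and the file names are made up for the demo); the echo at the end should show both files passed through intact:
mkdir nul-test && cd nul-test
truncate -s 100 "$(printf 'bad\nname')"   # 100-byte file with a newline in its name
truncate -s 50 plain-name                 # smaller file for comparison
stat --printf '%s\t%n\0' ./* |
sort -z -rn |
head -z -n 5 |
cut -z -f 2- |
xargs -0 -r echo rm -f --
The awkward-looking filename comes out the other end as a single intact argument, because nothing in the pipeline treats newlines as separators.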
If you want a minimum file size, then you could insert awk or something between the stat and the sort, e.g.
stat --printf '%s\t%n\0' ./* |
awk 'BEGIN {ORS = RS = "\0" } ; $1 > 25000000' |
sort -z -rn | ...
NOTE: GNU awk doesn't have a -z option for NUL-terminated records, but it does allow you to set the record separator to whatever you want. We have to set both the output record separator (ORS) and the input record separator (RS) to NUL.
Here's another version that uses find to explicitly limit itself to regular files (i.e. excluding directories, named pipes, sockets, etc.) in the specified directory only (-maxdepth 1, no subdirs) which are larger than 25M in size (so there's no need for awk). This version doesn't need stat because GNU find also has a -printf feature. BTW, note the difference in the format string: stat uses %n for the filename, while find uses %p.
find . -maxdepth 1 -type f -size +25M -printf '%s\t%p\0' |
sort -z -rn |
head -z -n 5 |
cut -z -f 2- |
xargs -0 -r echo rm -f --
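As an illustration of that %n/%p difference, both of these should emit the same size/name record for a given file (somefile is just a placeholder name):
stat --printf '%s\t%n\0' ./somefile                    # %n: name as given on the command line
find . -maxdepth 1 -name somefile -printf '%s\t%p\0'   # %p: the file's path as find sees it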
To run it for a different directory, replace the . in the find command, e.g. find /home/web/ ....
Shell script version:
#!/bin/sh
for d in "$@" ; do
find "$d" -maxdepth 1 -type f -size +25M -printf '%s\t%p\0' |
sort -z -rn |
head -z -n 5 |
cut -z -f 2- |
xargs -0 -r echo rm -f --
done
Save it as, e.g., delete-five-largest.sh somewhere in your PATH and run it as delete-five-largest.sh /home/web /another/directory /and/yet/another
This runs the find ... once for each directory specified on the command line. This is NOT the same as running find once with multiple path arguments (which would look like find "$@" ..., without any for loop in the script). The loop version deletes the 5 largest files in each directory, while running it without the for loop would delete only the five largest files found while searching all of the directories, i.e. five per directory vs five total (a sketch of that variant follows below).
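For comparison, the "five total" variant would look something like this (an untested sketch; it differs from the script above only in dropping the for loop):
#!/bin/sh
# five largest files across ALL the given directories combined
find "$@" -maxdepth 1 -type f -size +25M -printf '%s\t%p\0' |
sort -z -rn |
head -z -n 5 |
cut -z -f 2- |
xargs -0 -r echo rm -f --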
stat --printf '%s\t%n\0' /home/web/* | ...
– cas Jul 24 '17 at 11:26
I ran sh r.sh, and nothing is printed out. I put the script in the directory that has all the folders for files, with the files directly in them (no more subfolders).
– Dan P. Jul 24 '17 at 12:08
Folder1/file.jpg (and many other files)
Folder2/blabla.mp3 (and many other files)
r.sh is on the same level as Folder1 and Folder2.
– Dan P. Jul 24 '17 at 12:08
. means the current directory, not the directory containing the script. I'll add an example of how to turn this into a script.
– cas Jul 24 '17 at 12:10
That's find's -maxdepth 1 option: it explicitly limits find to the specified directory only, without any recursion into subdirectories. That's quite deliberate. If you save the shell script as in my updated answer above, you'd then run it like delete-five-largest.sh /path/to/Folder[12]/
– cas Jul 24 '17 at 12:22
With recent GNU tools (you're already using GNU-specific options):
du -S0 . | sort -zrn | sed -z 's@[^/]*@.@;5q' | xargs -r0 echo rm -rf
(remove the echo if happy).
The -0/-z is to be able to cope with files/directories with arbitrary names.
Note that most rm implementations will refuse to remove . (the current working directory), so you may want to do it from one level up and do:
du -S0 dir | sort -zrn | sed -z 's@^[0-9]*\t@@;5q' | xargs -r0 echo rm -rf
So it can remove dir if that's one of the biggest files (note that it would also remove all the subdirs). It's not clear from your requirements if it's really what you want.
Now, if all you want is to remove the 5 biggest regular files (excluding other types of files like directories, devices, symlinks...), it's just a matter of using zsh and:
echo rm -f ./**/*(D.OL[1,5])
(OL is to reverse-sort by length, i.e. size, not disk usage).
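Spelling out the glob qualifiers (same command as above, annotated; zsh only):
# ./**/*   recurse through all subdirectories
# D        also match dot (hidden) files
# .        regular files only
# OL       order by length (file size), largest first
# [1,5]    take only the first five matches
echo rm -f ./**/*(D.OL[1,5])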
According to du -Sh | sort -rh | head -5, after following your command (without the echo), the 5 biggest files are still there.
– Dan P. Jul 24 '17 at 11:35
Change rm -f to rm -fv to get it to print out what it's deleting as it deletes them.
– cas Jul 24 '17 at 11:36
You need -r to remove directories (and their content). See also the edit.
– Stéphane Chazelas Jul 24 '17 at 11:40
du reports the disk usage, not the size (it also shows hard links to a given file only once).
– Stéphane Chazelas Jul 24 '17 at 11:45
So should I run du -S0 . |sort -zrn | sed -z 's@[^/]*@.@;5q' | xargs -r0 echo rm -f ./**/*(D.OL[1,5])?
– Dan P. Jul 24 '17 at 11:48
After truncate -s100T file, you have a 100TiB file that takes no space on disk.
– Stéphane Chazelas Jul 24 '17 at 11:49
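To see the size/usage split that comment describes, a quick illustration (hypothetical session, assuming a filesystem with sparse-file support):
truncate -s100T file   # sparse file: apparent size 100TiB, no data written
ls -lh file            # shows the apparent size (~100T)
du -h file             # shows the disk usage (0)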
Just echo rm -f ./**/*(D.OL[1,5]), but in the zsh shell (and remove the echo if happy).
– Stéphane Chazelas Jul 24 '17 at 11:50
That's why I avoided du and went straight for stat (and later, find). I also assumed he wanted to delete files rather than directories, so I didn't use rm -r and took steps to avoid sub-directories.
– cas Jul 24 '17 at 11:52
I used zsh and rm -f ./**/*(D.OL[1,500]) with success. Cleared up some good space on the server. Thanks.
– Dan P. Jul 24 '17 at 12:04
Here you've got a (subshell-intensive) loop over each file. Replace the echo with your rm command:
du -Sh /your/search/path/ |
sort -rh |
head -5 |
awk '{print $2}' |
while read file ; do
echo "$file"
done
Works in current bash. But this is anything but a nice script, and I am sure to earn some comments because of whitespace inside filenames. ;) They are welcome!
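One possible improvement along those lines, as a sketch: cut on the tab that du emits instead of letting awk split on whitespace, and read each line verbatim (this still breaks on newlines in filenames):
du -Sh /your/search/path/ |
sort -rh |
head -5 |
cut -f2- |
while IFS= read -r file ; do
    echo "$file"
done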
If you are familiar with cron jobs, you can execute this script periodically.
I quickly put this together based on the du command and didn't verify it. Right now there are some competent answers you could take a look at ...
– ChristophS Jul 24 '17 at 11:29
Here's a simple answer that hopefully helps you:
find / -type f -size +1G -exec rm {} \;
This will find anything under root that is a regular file, not a directory, and over 1G in size, and will remove it. You can add extra tests (e.g. -name) before the -exec if you need to choose files by name, for example. The size suffix can be changed to M (megabytes), k (kilobytes), or c (bytes). There are many options to find and it's a powerful command, check out the man page! :)
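For example, to narrow it to a particular directory and name pattern, and to preview the deletions first (/var/log and *.log are just placeholders; drop the echo once you're happy with the list):
find /var/log -type f -size +1G -name '*.log' -exec echo rm -f {} \;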