17

I have a folder with 266778 subfolders. How can I delete it?

I have tried

cd ~/.local/share/Trash/
sudo rm -rf *

but it takes a long time. After 1 minute 25 seconds of real time and 0.072 seconds of user time, it had deleted only 2500 folders. At this rate, it will take over two hours to delete this folder.

Is there a faster way to delete this folder? Why is there such a big difference between user time and real time?

real    1m25.474s
user    0m0.072s
sys     0m28.142s

I use Linux 2.6.32 (Ubuntu 10.04.4 LTS).

Braiam
  • 35,991
Martin Thoma
  • 2,842
  • I have just googled this problem and it seems that some people have discovered that rsync can be used as a "many-files-deletion" tool quite efficiently. Whether it truly is faster remains up to you to evaluate. – Johan Mar 04 '13 at 12:51
  • 2
    For what it's worth: performance when deleting many folders/files is highly filesystem dependent. In my experience the difference when deleting millions of small files on ext3 (slow) vs. XFS (fast) can be hours. – pdo Apr 18 '16 at 11:11
  • If you often have this case and you can plan ahead, using a filesystem like btrfs with a dedicated subvolume lets you speed things up considerably by just deleting that subvolume. – PlasmaHH Apr 18 '16 at 12:49
  • Here is where you can find the answer. The perl one is the fastest. https://unix.stackexchange.com/questions/37329/efficiently-delete-large-directory-containing-thousands-of-files – SDsolar Aug 17 '17 at 10:10

5 Answers

22

It depends on your definition of fast. The answers already here give a good solution for actually removing the directories from the filesystem, but if what you really need is to free the directory name as fast as possible, a rename on the same filesystem is instantaneous:

{ mv directory directory.gone && rm -rf directory.gone; } &

Technically this is cheating since I haven't sped up the actual deletion, but practically it's very useful: I use this trick all the time so I don't have to wait for slow deletion operations.
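A minimal sketch of the same trick with a unique holding name, along the lines of the mktemp idea raised in the comments. The function name fast_rm and the .gone.XXXXXX template are hypothetical, chosen only for illustration; the sketch assumes a mktemp that accepts -d with a template path, and it creates the holding directory next to the target so that the mv stays on the same filesystem:

```shell
# Hypothetical helper: free the name "$1" at once, delete the data in the background.
fast_rm() {
    target=${1%/}
    parent=$(dirname -- "$target")
    # Holding directory lives beside the target, so mv is a plain rename.
    holding=$(mktemp -d "$parent/.gone.XXXXXX") || return 1
    mv -- "$target" "$holding/" || return 1
    # The slow part runs in the background.
    rm -rf -- "$holding" &
}

# Demo on a throwaway tree.
demo=$(mktemp -d)
mkdir -p "$demo/directory/a/b"
fast_rm "$demo/directory"
# Only needed here so the demo can check the result; normally you would not wait.
wait
```

Because each call gets its own holding directory, repeated calls should not collide on a fixed name such as directory.gone.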

kojiro
  • 4,644
  • Great. What is your use case for doing this all the time? If you do it a lot, isn't there a danger you will backlog, get multiple 'directory.gone's and fail? I presume you use a suffix like '$$' or '%(date ...)' – smci Nov 28 '17 at 00:37
  • 1
    If I needed that I could probably use mktemp with arguments that ensure it stays on the same filesystem. But I can’t say I have a specific example right now. – kojiro Nov 28 '17 at 00:40
  • kojiro yeah thanks, mktemp is what I was trying to remember... – smci Nov 28 '17 at 01:03
19

If your version of "find" implements the -delete action, then you can try

find directory -delete

In this case:

find ~/.local/share/Trash/ -delete

Some commands, like rm, perform most of their work in the kernel, in the file-system routines to be exact. Time spent in system calls is accounted as system (sys) time rather than user time, so while your "rm" command runs for a long time, it doesn't do much work in user-land: the system calls perform most of the work. Whatever real time is left over after user and sys time is typically spent waiting, for example on disk I/O.
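If the goal is to empty the directory while keeping the directory itself (a point raised in the comments), one possible sketch is to add -mindepth 1. This assumes a find that supports both -mindepth and -delete, as GNU and BSD find do; the demo runs on a throwaway directory rather than the real Trash folder:

```shell
# Build a small throwaway tree that stands in for the Trash folder.
trash=$(mktemp -d)
mkdir -p "$trash/files/a/b" "$trash/info"
touch "$trash/files/a/b/x" "$trash/info/y"

# -mindepth 1 skips the starting directory itself;
# -delete implies -depth, so children are removed before their parents.
find "$trash" -mindepth 1 -delete
```

After this, the starting directory still exists but is empty.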

Johan
  • 4,148
  • +1 ; though this also deletes the parent dir and I suspect the OP only wanted to delete the contents of the Trash folder not the folder itself – don_crissti Mar 04 '13 at 14:34
  • 1
    @don_crissti : good remark. if the OP wanted to only delete subdirs under ~/.local/share/Trash (and not files on the 1st level), then : find ~/.local/share/Trash/*/ -delete (of course, this will also delete files (and dirs) in any of those Trash/*/ subdirs as well) – Olivier Dulac Mar 04 '13 at 16:16
  • 4
    Is find directory -delete really faster than rm -rf directory? After all, they perform the same work, and there aren't two ways to do it. – Gilles 'SO- stop being evil' Mar 04 '13 at 23:00
  • @Gilles That is a good question and I believe the only reason why find is faster is because of the implementation. Now you got me curious as to the why - I will make time to trace this and find out! – Johan Mar 05 '13 at 07:55
  • 1
    @Johan find is really fast. Did you ever get a chance to find out the reason? – Harshdeep Mar 16 '17 at 16:35
  • As far as I can tell it is because Find doesn't sort the directory entries. – Johan Mar 17 '17 at 08:42
2

rm -rf directory or rm -rf * of course is the fastest method unless your local rm implementation is broken.

Using find gives no advantages.

Whether this is fast or slow mainly depends on the filesystem and OS implementation. So the question seems to be inappropriate.

UFS and ZFS on Solaris are known to be very fast with this kind of task as both filesystem implementations include delayed background delete code that causes the unlink() and rmdir() calls to return fast even when the related object will take more time in total.

With the delayed background delete in the kernel, the directory updates can be done fast as well, and this helps to speed up the whole operation.

schily
  • 19,173
0

This is only a partial answer, shedding light on the three values the command returns; quoted from the time(1) manpage:

"(i) the elapsed real time between invocation and termination, (ii) the user CPU time (the sum of the tms_utime and tms_cutime values in a struct tms as returned by times(2)), and (iii) the system CPU time (the sum of the tms_stime and tms_cstime values in a struct tms as returned by times(2))."
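To see how real time and CPU time can diverge, here is a small sketch that should work in any POSIX shell. It measures wall-clock time around a command that only waits, then prints the accumulated CPU times with the shell's times builtin. The choice of sleep 2 is just an illustration:

```shell
start=$(date +%s)
sleep 2                      # waits without using the CPU
end=$(date +%s)
elapsed=$((end - start))     # wall-clock ("real") seconds

echo "real: ${elapsed}s"
# First line: user and system time of the shell itself.
# Second line: user and system time of its child processes.
times
```

The real time is at least two seconds, while the user and system figures printed by times stay close to zero. In the question the gap is filled partly by sys time and partly by waiting.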

deizel.
  • 103
schaiba
  • 7,631
0

If you don't want to wait, want to avoid downtime, or just need to get rid of the folder fast, use mv to queue the deletion for your next reboot. As long as the destination is on the same filesystem, mv is only a rename, so it returns almost immediately: you don't have to wait for a blocking file I/O operation and can carry on with what you were doing.

Just mv folder_to_be_deleted /tmp/folder_queue_for_deletion. On systems that clear /tmp at boot, the files will be deleted upon your next reboot. If /tmp is on a different filesystem (a tmpfs, for example), mv has to copy the data and then delete the source, which is slower than rm -rf.
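Whether the mv is a cheap rename depends on both paths being on the same filesystem. A rough way to check, sketched here with a hypothetical helper named same_fs, is to compare the mount points reported by df -P (this assumes mount point paths without spaces):

```shell
# Hypothetical helper: succeed if both paths live on the same mounted filesystem.
same_fs() {
    a=$(df -P "$1" | awk 'NR==2 {print $6}')   # mount point of first path
    b=$(df -P "$2" | awk 'NR==2 {print $6}')   # mount point of second path
    [ "$a" = "$b" ]
}

src=$(mktemp -d)    # stands in for folder_to_be_deleted
if same_fs "$src" /tmp; then
    result="rename"   # mv into /tmp should be near-instant
else
    result="copy"     # mv into /tmp would copy the data, then delete the source
fi
echo "mv to /tmp would be a $result"
```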

Benchmark:

$ cat make_million_files.sh
#!/usr/bin/env bash

mkdir folder_to_be_deleted
for i in $(seq 0 1000000); do touch "folder_to_be_deleted/$i"; done

$ time ./make_million_files.sh
real    66m3.613s
user    5m47.507s
sys     61m15.432s

IO blocking operation

$ time rm -rf folder_to_be_deleted
real    0m32.451s
user    0m2.086s
sys     0m25.094s

Non-IO blocking operation (Queue deletion on next reboot)

$ time mv folder_to_be_deleted /tmp/folder_queue_for_deletion
real    0m0.012s
user    0m0.001s
sys     0m0.010s

In essence, the benefit is noticeable if you have multiple folders with lots of subfolders and you want them out of the way fast, without downtime in your work. In that case you might consider this solution, as it only takes about 60 seconds at boot to delete a million folders with lots of subfolders.

For 1000 folders with lots of subfolders, it would take ~1 hour of blocking IO using rm -rf, vs. 12 seconds via mv, and then just 60 seconds of boot time to delete everything. Finally, if you don't want to reboot, just mv the folder out of your way and rm -rf it from somewhere else (another TTY session, etc.).

  • 1
    Why should this be faster than rm -rf? – Martin Thoma Nov 29 '22 at 16:55
  • If you have ~100k files in a folder, ‘rm -rf’ would take some time to complete. ‘mv folder /tmp/folder’ would take less than 3 seconds. – JB Juliano Nov 30 '22 at 17:25
  • you are cheating. You need to consider the time to restart as well. Similarly, you could enter a cron job and claim its faster as it requires no time at all (at the moment at least) – Martin Thoma Nov 30 '22 at 21:19
  • Why downvote? It's not cheating, /tmp is just a convenient example, because it exists on all *nix platforms. You can create a TMPFS and just remount it, and it's just the same without restarting. – JB Juliano Dec 01 '22 at 08:12
  • Because I have serious doubts that it's faster. Your benchmark does not account for the time necessary for the restart. On my system the restart is ~3 minutes - which is well above the 30s. – Martin Thoma Dec 01 '22 at 15:24
  • How could you be unreasonable? I did indicate that this solution is for people who "just need to get rid of the folder fast" and "want to queue the delete operation on the next reboot" without dealing with "blocking file-io operation". Everyone restarts eventually, so it's a queued deletion, not entirely using "rm -rf".

    As for the safety, what's unsafe with 'mv' vs. 'rm'?

    – JB Juliano Dec 01 '22 at 16:23
  • I'm not going to discuss this more. I gave my reasons for the downvote. As it is, the answer is not useful to me. It is also not directly related to the question. Calling me unreasonable does not help to change my opinion. – Martin Thoma Dec 01 '22 at 17:17
  • You've asked "How can I delete a folder with lots of subfolders fast?", and you have tried "rm -rf", "but it takes much time." And I provide an alternative.

    The answer found here in SO is not for you only, it's a public channel for everyone to see, not just for you. Some might adopt this solution for themselves and use it, for you it's useless, but not for everyone.

    – JB Juliano Dec 01 '22 at 21:40