
I've just caught a confusing error:

rm: cannot remove `xxx/app/cache/prod': Directory not empty

which was caused by the following command:

rm -rf $cache_dir/*

where $cache_dir is defined as xxx/app/cache

So my reading is: rm removed everything inside the cache/prod dir, and then, just before it attempted to remove the cache/prod directory itself, another program created a file or directory inside it, which caused rm to fail.

Is my assumption correct?

zerkms
    Your assumption is correct - rm -r is not atomic. If you want to be sure that no more files get created in the directory while the rm -rf is running, you could rename it first, then remove the renamed directory. – Johnny Oct 22 '13 at 00:19
  • @Johnny: yep, that's what I actually already implemented :-) – zerkms Oct 22 '13 at 00:21
  • Though even that isn't completely safe. If an app is currently operating out of that directory, it'll just go with the move and keep operating normally. – phemmer Oct 22 '13 at 12:40
  • This has nothing to do with rm -rf being thread-safe: if you run it multiple times concurrently on the same directory, the directory gets deleted. This is about rm -r not being atomic. – Gilles 'SO- stop being evil' Oct 22 '13 at 22:47
  • @Gilles: it depends: "A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time". So if we treat each rm invocation as a "thread", we may speak about thread-safety. But anyway, it doesn't change anything – zerkms Oct 22 '13 at 23:36
  • @zerkms rm -rf is thread-safe: rm -rf & rm -rf works as expected. The combination of rm -rf and file creation isn't thread-safe. – Gilles 'SO- stop being evil' Oct 23 '13 at 02:32
  • This sort of command makes me shudder, when that $cache_dir var, despite your best intentions, somehow resolves to a null string. I hope there is plenty of testing around that. – Aitch Oct 25 '13 at 21:20
  • @Aitch: the code that works with it handles all the possible cases. So even if the directory or its contents disappear, completely or partially, at any moment, it just regenerates them. – zerkms Oct 25 '13 at 23:15
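The race the question describes can be replayed deterministically by performing rm -r's two steps by hand, with a stand-in "concurrent writer" in between (the directory and file names here are illustrative, not the questioner's actual paths):

```shell
# Manual replay of rm -r's steps, with a simulated concurrent writer.
mkdir -p cache/prod
touch cache/prod/old-entry

rm cache/prod/old-entry     # step 1: rm -r unlinks the directory's contents
touch cache/prod/new-entry  # ...another process creates a file meanwhile...
rmdir cache/prod 2>&1       # step 2: the final rmdir(2) fails with ENOTEMPTY
```

This produces the same "Directory not empty" failure the real rm reported; the only difference is that the window between the two steps is widened by hand.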

1 Answer


The error message given was "Directory not empty" (ENOTEMPTY). Given this, your assumption sounds correct: it's a race condition in which a program created a file in that directory just before rm tried to remove the directory, producing the expected ENOTEMPTY error from the underlying rmdir(2).

NOTE: To be on the safe side, you could move/rename the directory to a new name first, and then delete the renamed directory.
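The move-then-delete suggestion can be sketched as follows (the path is the question's placeholder, and `doomed` is an illustrative variable name, not anything from the questioner's code):

```shell
# Sketch of the rename-then-remove pattern for a cache directory.
cache_dir=xxx/app/cache
mkdir -p "$cache_dir/prod"     # stand-in cache populated for the demo
touch "$cache_dir/prod/entry"

# Guard against an unset/empty variable turning the cleanup into "rm -rf /*":
: "${cache_dir:?cache_dir is not set}"

# rename(2) is atomic on the same filesystem: once mv returns, other
# processes can no longer create entries under the old path, so the
# rm -rf that follows cannot race with writers of $cache_dir/prod.
doomed="$cache_dir/prod.delete.$$"
mv "$cache_dir/prod" "$doomed" && rm -rf "$doomed"
```

Note the caveat from phemmer's comment above: a process that already has the directory open simply follows the rename and keeps operating inside the renamed copy, so this prevents the ENOTEMPTY failure but does not stop in-flight writers.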

slm
  • That's another good reason indeed. (but in my case - that directory only stores the write-once-only files). – zerkms Oct 21 '13 at 23:44
  • @zerkms - I'm trying to work out an example that shows this better. – slm Oct 21 '13 at 23:45
  • it's actually enough for me, thanks :-) Do that only if you wish, for future readers – zerkms Oct 21 '13 at 23:45
  • 2
    This answer is wrong, you can remove directory entries even when a file is in use, and then delete the directory. A simple test of mkdir x; cat > x/a &; tail -f x/a &; rm -r x shows that a directory can be removed even when files are in use, regardless of whether they are open for reading or writing. – wingedsubmariner Oct 22 '13 at 01:15
  • @wingedsubmariner - that's not my entire answer, the file descriptors behind those files and directories cannot be deleted, a simple look through /proc/ will show that the FDs are still in play even after the files are deleted. Re-read the 2nd paragraph, last sentence. – slm Oct 22 '13 at 01:17
  • 1
    Yes, the files still exist, but this has nothing to do with why deleting the directory didn't succeed. This statement in your answer specifically is false: "The system will not delete a directory that has files residing in it that are opened in read/write mode". There is some good stuff in your answer, it just doesn't pertain to the question :) – wingedsubmariner Oct 22 '13 at 01:22
  • 1
    Also, be careful not to confuse file descriptors with files. File descriptors are never deleted, only closed. – wingedsubmariner Oct 22 '13 at 01:23
  • @wingedsubmariner - OK, I'll remove that statement, I've been googling trying to find a definitive answer, I too was perplexed with having a file open in a dir. and still being able to delete it, wasn't sure how to explain that away. – slm Oct 22 '13 at 01:26
  • @wingedsubmariner - any insight into why the directory couldn't have been deleted? I've run into this scenario, but wasn't able to replicate it now that I actually wanted to. 8-) I know having a directory open in some fashion was the ultimate reason, but the nature of it had something to do with it. – slm Oct 22 '13 at 01:28
  • @wingedsubmariner - BTW, I've removed the offending paragraph. – slm Oct 22 '13 at 01:30
  • 1
    Your first paragraph might need some work too. You are right about the file deletion not happening when the files are still open, it's just that once the file has been unlinked from that directory they don't prevent the directory from being deleted. Yes, this means UNIX allows files to exist that aren't in any directory, as weird as that seems at first. – wingedsubmariner Oct 22 '13 at 01:32
  • 1
    I really can only think of two reasons why the deletion would fail, either OP's intuition was correct and a new file was created, or it's a permission error. rm complains about permission errors, so I think we can eliminate that. I'm not confident enough to post an answer though. – wingedsubmariner Oct 22 '13 at 01:35
  • @wingedsubmariner - agreed on the 1st PP as well. I've added your example and details of what's going on throughout it. I'm pretty confident I've run into this issue in the past and it wasn't new files or permissions. There's a 3rd reason here. I've run into .nfsxxxxx files getting written on NFS mounted shares, FUSE does this too, but the experiences I've run into were neither of these. – slm Oct 22 '13 at 01:39
  • @wingedsubmariner - would you mind re-reading my answer one more time and critiquing it again? I've re-worked the 1st part as well. – slm Oct 22 '13 at 01:47
  • @slm The error message given was "Directory not empty" (ENOTEMPTY), so your answer's opening line "a process had the directory opened" is not correct. The underlying unlinkat(2) and rmdir(2) system calls have no errors related to having a directory open; you can remove open directories. @zerkms's assumption is correct: it was a race condition where a program created a file in that directory just before rm tried to remove the directory, giving the expected ENOTEMPTY error from the underlying rmdir(2). – Ian D. Allen Oct 22 '13 at 06:30
  • 1
    @IDAllen - OK thank you for confirming this. I'll take that bit out of the answer. So is his approach of moving/renaming it going to work then? – slm Oct 22 '13 at 06:39
  • @IDAllen - if you have a second could you please double check my updates? Thank you! – slm Oct 22 '13 at 06:48
  • @slm Your opening two paragraphs answer the question, by confirming the OP's assumption and providing a solution. All the rest of your answer, starting with "When you encounter these types of errors you could try to determine if there is another process that may be accessing this directory", has nothing to do with the ENOTEMPTY error or answering the question, since the ENOTEMPTY has nothing to do with other processes accessing the directory. All that process access and strace stuff belongs somewhere else, not as an answer to "Why did I get ENOTEMPTY?". – Ian D. Allen Oct 22 '13 at 07:19
  • 1
    @IDAllen - OK, I've removed the rest as you suggested. Thanks for taking the time to help clarify all this! – slm Oct 22 '13 at 07:22
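wingedsubmariner's point in the comments above, that an unlinked-but-still-open file does not block removal of its directory, can be demonstrated with a minimal POSIX sh sketch (the names x and x/a are illustrative):

```shell
# Removing a file's directory entry does not destroy the file while a
# descriptor stays open, and the emptied directory can then be removed.
mkdir x
echo "still alive" > x/a
exec 3< x/a   # hold x/a open for reading on fd 3
rm x/a        # unlink succeeds despite the open descriptor
rmdir x       # the directory is now empty, so rmdir succeeds
cat <&3       # the unlinked file's data is still readable via fd 3
exec 3<&-     # closing the last descriptor finally frees the file
```

This is the "files can exist outside any directory" behavior the comment describes: the data lives on until the last open descriptor is closed.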