
I have a folder containing many subfolders, each of which in turn contains many small files. They are created programmatically, so I do not know how many there are.

I decided to remove all of these subfolders and files, so I issued the command,

rm -rf foldername/

However, the rm command is taking a long time to execute, which I believe is perfectly normal since it has to unlink every file.

Meanwhile, I decided to check whether the size of this folder was shrinking by issuing the command,

du -sh foldername/

However, the above command gives me the error,

du: cannot access `foldername/file': No such file or directory

Why is this error happening?

Ramesh

2 Answers

2

du, like any command that traverses directory trees recursively, operates in the following way:

  1. Read information about a file, accessed via its path¹. In the case of du, the system call stat provides the file type (in particular, whether it's a directory) and size. Initially, the names are taken from the command line.
  2. If the file is a directory, open it and read the list of file names.
  3. For each file name in the directory, construct a file path (DIRECTORY/ENTRY_NAME) and act on it recursively starting at step 1. This step may be performed partly in parallel with the previous one (it depends on the implementation).

rm is running and deleting files one by one. Occasionally, du reads a file name in step 2, but by the time it gets around to processing it in step 3, rm has deleted it. Whether you see this error at all and how many times depends on the relative speed of rm and du and is pretty much unpredictable.
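The race can be reproduced deterministically by performing the steps by hand. This is only a sketch under stated assumptions: the scratch directory from mktemp and the entry name "file" are hypothetical, and the explicit rm stands in for the concurrently running rm of the question.

```shell
#!/bin/sh
# Hypothetical scratch directory; nothing here touches real data.
dir=$(mktemp -d)
touch "$dir/file"

# Step 2: read the list of entry names in the directory.
names=$(ls "$dir")

# Meanwhile, another process (rm in the question) unlinks the entry.
rm "$dir/file"

# Step 3: build DIRECTORY/ENTRY_NAME and look it up by path.
# The name was read earlier but no longer exists, so du fails.
for name in $names; do
    msg=$(du -s "$dir/$name" 2>&1) && status=0 || status=$?
    printf 'du exit status %s: %s\n' "$status" "$msg"
done

rmdir "$dir"
```

The name was captured before the deletion, so the later lookup by path fails with a nonzero exit status, which is the situation du runs into while rm is still working through the tree.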

¹ There are only two ways to directly access a file: by path (including directory information, relative or absolute), or (if the file is open) by descriptor.

  • This pretty much explains how the du command works. Thanks for the wonderful explanation. So if a file is unlinked by the rm command, will it still be accessible via its path or a descriptor? – Ramesh Jun 28 '14 at 22:38
  • @Ramesh Once a file is unlinked by rm, it is no longer accessible via its path (that's what “unlink” means). The file can still be accessed by a process that has it open, if there is one. du doesn't open files apart from directories; if it has a directory open while rm deletes it, it can still read from it, but the read will return “no more entries” since rm has removed them all. The file could still be accessed by other names (hard links) if it has them. – Gilles 'SO- stop being evil' Jun 28 '14 at 22:58
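The path-versus-descriptor distinction described in this comment can be seen from a shell. This is a minimal sketch: the temporary file and the choice of descriptor 3 are arbitrary assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical temporary file holding one line of text.
tmp=$(mktemp)
echo "still here" > "$tmp"

exec 3< "$tmp"   # open the file for reading on descriptor 3
rm "$tmp"        # unlink its only name

# Lookup by path now fails...
[ -e "$tmp" ] && by_path=present || by_path=missing

# ...but the already-open descriptor still reaches the data.
read -r line <&3
exec 3<&-        # closing the last descriptor lets the kernel free the file

printf 'path: %s, descriptor read: %s\n' "$by_path" "$line"
```

du never holds regular files open like this, so once rm has removed a name, du has no remaining way to reach the file.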
1

Just Ignore the du command error

As per this link, I can suppress the du errors by filtering them out of standard error:

du 2> >(grep -v '^du: cannot \(access\|read\)' >&2)

But I am more interested in what is happening while the files are deleted: why can't du report the size, and why does it report this error once rm has unlinked the files?

This is explained in this link. I am just rephrasing to see what has happened here.

  1. The rm command has unlinked the file (i.e. deleted the file name entry from its parent directory).
  2. A file handle that a process already holds open remains valid even though no file name is associated with it any more. du, however, looks the file up by its path, so it reports that there is no such file or directory.

Verification

I did some more research on verifying that the files are in fact unlinked.

I got the PID of the rm process using the ps command. Then I issued the command below to see which files it still has open.

lsof +L | grep 11771

The above command gave me the below output.

rm   11771  root  4r DIR  8,17 175882240     2   47333397 /foldername/filename

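In this output the entry is of type DIR with a link count (the NLINK column added by +L) of 2, so it shows rm holding a directory open for reading while it unlinks the entries inside it.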

Since the rm command is still running and removing entries, the du command reports the error.

Ramesh