2

I want to get the number of directories and subdirectories under a folder. The commands I tried take a very long time: I waited about an hour and they still had not finished.

The commands I used:

$ find . -type d | wc -l

and

$ du -ch | wc -l 

Both commands ran for more than an hour without completing. The main folder I'm trying to get this information for is about 120 GB.

slm
Özzesh

2 Answers

3

If you only want a single level, then there is a trick to doing this without having to enumerate the directory. If you want recursion, then what you've got is the best you're going to get.

The single level trick:

stat --printf='%h\n' /path/to/dir

...and subtract 2. The result is the number of directories within that directory (non-recursive).

That command shows the number of hard links to the specified file. Whenever you create a directory inside a directory, the subdirectory contains a hard link back to the parent directory: the .. entry. So creating a subdirectory increases the parent directory's hard link count by one. We subtract 2 because every directory starts off with 2 hard links. One is the entry in its own parent directory that points to it: the dir entry inside /path/to. The other is the directory's link to itself: the . entry.
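A minimal sketch of the arithmetic above, assuming GNU stat and find. The helper name count_subdirs and the temporary directory are made up for illustration. Because not every filesystem maintains link counts this way, the sketch falls back to listing the directory when the reported count is below 2:

```shell
# Hypothetical helper: count the immediate subdirectories of "$1".
count_subdirs() {
    links=$(stat --printf='%h' "$1")
    if [ "$links" -ge 2 ]; then
        # "." and the entry in the parent account for 2 links;
        # every further link is the ".." entry of one subdirectory.
        echo $((links - 2))
    else
        # Link count not usable (e.g. btrfs): enumerate one level instead.
        find "$1" -mindepth 1 -maxdepth 1 -type d -printf . | wc -c
    fi
}

demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b" "$demo/c"
touch "$demo/file"            # regular files do not add links to the directory
subdirs=$(count_subdirs "$demo")
echo "$subdirs"
rm -rf "$demo"
```

The fast path only reads the directory's own metadata, so its cost does not depend on how many entries the directory holds.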


However with recursion, you have to examine each directory. The problem is that there's no way to say "give me a list of only directories within this directory". You have to get a list of every single entry in the directory, and then stat each one to find out if it's a directory or a file.

Now when you stat a directory, you can use the hardlink trick above to tell whether it contains any subdirectories. If it does not (a link count of 2), you can save a little time by not descending into it. The find utility actually uses this trick to get a small performance gain.

So basically, using find is going to be the best you can do if you want recursion.

phemmer
  • 2
    Note that the hardlink trick doesn't work on all filesystems. It doesn't work on btrfs for instance. You can do find . -type d -printf x -links 2 -prune | wc -c to use it here. – Stéphane Chazelas Feb 24 '14 at 10:59
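As a rough check of the command in the comment above, the sketch below builds a small throwaway tree (the names a, b, c, f1 and f2 are made up) and compares a plain count with the pruning variant. It assumes GNU find, and both counts include the start directory:

```shell
tree=$(mktemp -d)
mkdir -p "$tree/a/b" "$tree/c"
touch "$tree/a/b/f1" "$tree/c/f2"

# Plain traversal: one "x" per directory, including the start directory.
plain=$(find "$tree" -type d -printf x | wc -c)

# Same output, but directories with exactly 2 links are not descended into.
pruned=$(find "$tree" -type d -printf x -links 2 -prune | wc -c)

echo "plain=$plain pruned=$pruned"
rm -rf "$tree"
```

On a filesystem where directories never report exactly 2 links, nothing is pruned and the second command simply behaves like the first.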
2

find . -type d | wc -l does not give you the correct value if any directory name contains a newline. Furthermore, it counts the start directory, which probably isn't intended. I do not believe that the pipeline is the bottleneck, but this can easily be optimized:

find . -mindepth 1 -type d -printf . | wc -c
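To illustrate the difference, here is a small sketch on a throwaway tree (the directory names are made up, and GNU find is assumed). One directory name contains a newline, so the line-based count is inflated, while the character-based count is not:

```shell
top=$(mktemp -d)
mkdir "$top/plain"
mkdir "$top/$(printf 'with\nnewline')"

# Line-based count: includes the start directory, and the embedded
# newline splits one name across two lines of output.
by_lines=$(cd "$top" && find . -type d | wc -l)

# Character-based count: one "." per directory below the start directory.
by_chars=$(cd "$top" && find . -mindepth 1 -type d -printf . | wc -c)

echo "by_lines=$by_lines by_chars=$by_chars"
rm -rf "$top"
```

Printing a fixed character per match also means find never has to write the full paths, which keeps the pipe traffic small on large trees.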
Hauke Laging