How can I use find to generate a list of the directories which contain the most files? I'd like the list to be ordered from highest to lowest. I'd only like the listing to go 1 level deep, and I'd typically run this command from the top of my filesystem, i.e. /.

-
Different question (actually the same but asked differently), but wouldn't the answer solve your question as well? http://unix.stackexchange.com/questions/117093/find-where-inodes-are-being-used – phemmer Apr 03 '14 at 02:25
-
Also related - http://stackoverflow.com/questions/15216370/how-to-count-number-of-files-in-each-directory. This is what I based my original answer on the inode question off of, although I think my approach offers some improvements over the ones there. – Graeme Apr 03 '14 at 02:36
-
@Patrick - this is a loaded Q just to house Graeme's A. True the bits are buried in the other Q's A's, but this was to bring this bit out so that it could be referenced going forward. – slm Apr 03 '14 at 02:38
-
@slm Then I really don't get why this isn't a duplicate. His answer seems to be just an elaboration of an answer on another question. So now we have 3 questions for the same thing. I think the answer on my link is cleaner too. Launching a shell for every directory found just feels dirty. – phemmer Apr 03 '14 at 02:44
-
@Patrick - just b/c the same answer can be used for 2 Q's doesn't make them dups. The other Q is asking about finding inodes, this one is asking about files/directories in the first level. If you feel your A is better on that Q then feel free to post it on this one as a potential A. – slm Apr 03 '14 at 03:03
-
@Patrick - I'm sure we have more than 3 Q&A's with some of these bits kicking around here. I would expect a user to think of looking for files/dirs. but not necessarily understand inodes, that's why I created this Q as well. – slm Apr 03 '14 at 03:07
-
@Patrick, I have reworked the answer so that the GNU solution doesn't start a new shell for every directory. Though note this is the standard solution to deal with any filename portably. – Graeme Apr 03 '14 at 03:38
-
@slm This doesn't address inodes. These are directory listings - nothing more. You can easily have many more directory listings than you do inodes. – mikeserv Apr 03 '14 at 09:43
-
@Graeme - I fixed my answer so that it does handle inodes now. – mikeserv Apr 03 '14 at 12:11
5 Answers
UPDATE: I did all of that below, which is cool, but I came up with a better way of sorting directories by inode use:
du --inodes -S | sort -rh | sed -n \
'1,50{/^.\{71\}/s/^\(.\{30\}\).*\(.\{37\}\)$/\1...\2/;p}'
And if you want to stay in the same filesystem you do:
du --inodes -xS
Here's some example output:
15K /usr/share/man/man3
4.0K /usr/lib
3.6K /usr/bin
2.4K /usr/share/man/man1
1.9K /usr/share/fonts/75dpi
...
519 /usr/lib/python2.7/site-packages/bzrlib
516 /usr/include/KDE
498 /usr/include/qt/QtCore
487 /usr/lib/modules/3.13.6-2-MANJARO/build/include/config
484 /usr/src/linux-3.12.14-2-MANJARO/include/config
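If you just want the raw counts without the line-shortening, a hedged equivalent is to sort numerically and keep the top 50 with head instead of the sed expression above:
du --inodes -xS | sort -rn | head -n 50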
NOW WITH LS:
Several people mentioned they do not have up-to-date coreutils and the --inodes option is not available to them. So, here's ls:
sudo ls -AiR1U ./ |
sed -rn '/^[./]/{h;n;};G;
s|^ *([0-9][0-9]*)[^0-9][^/]*([~./].*):|\1:\2|p' |
sort -t : -uk1.1,1n |
cut -d: -f2 | sort -V |
uniq -c |sort -rn | head -n10
This is providing me pretty much identical results to the du command:
DU:
15K /usr/share/man/man3
4.0K /usr/lib
3.6K /usr/bin
2.4K /usr/share/man/man1
1.9K /usr/share/fonts/75dpi
1.9K /usr/share/fonts/100dpi
1.9K /usr/share/doc/arch-wiki-markdown
1.6K /usr/share/fonts/TTF
1.6K /usr/share/dolphin-emu/sys/GameSettings
1.6K /usr/share/doc/efl/html
LS:
14686 /usr/share/man/man3:
4322 /usr/lib:
3653 /usr/bin:
2457 /usr/share/man/man1:
1897 /usr/share/fonts/100dpi:
1897 /usr/share/fonts/75dpi:
1890 /usr/share/doc/arch-wiki-markdown:
1613 /usr/include:
1575 /usr/share/doc/efl/html:
1556 /usr/share/dolphin-emu/sys/GameSettings:
I think the include thing just depends on which directory the program looks at first - because they're the same files and hardlinked. Kinda like the thing above. I could be wrong about that though - and I welcome correction...

The underlying method to this is that I replace every one of ls's filenames with its containing directory name in sed.
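To make that concrete, here's a hedged sketch of the intermediate data (the directory names and inode numbers are invented for illustration). ls -AiR1U prints a header line for each directory followed by inode-prefixed entries:
% ls -AiR1U ./somedir
./somedir:
 5000 sub
 5001 .hidden

./somedir/sub:
 5002 file1
 5003 file2
The sed holds each directory header and appends it to every entry line, so after the substitution each remaining line pairs an inode with its parent directory, e.g. 5002:./somedir/sub. sort -t : -uk1.1,1n then keeps each inode only once (so a hardlinked file is only attributed to one directory), cut drops the inode, and uniq -c tallies how many entries each directory ends up with.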
Following on from that... Well, I'm a little fuzzy myself. I'm fairly certain it's accurately counting the files, as you can see here:
% _ls_i ~/test
> 100 /home/mikeserv/test/realdir
> 2 /home/mikeserv/test
> 1 /home/mikeserv/test/linkdir
DU DEMO
% du --version
> du (GNU coreutils) 8.22
Make a test directory:
% mkdir ~/test ; cd ~/test
% du --inodes -S
> 1 .
Some children directories:
% mkdir ./realdir ./linkdir
% du --inodes -S
> 1 ./realdir
> 1 ./linkdir
> 1 .
Make some files:
% printf 'touch ./realdir/file%s\n' `seq 1 100` | . /dev/stdin
% du --inodes -S
> 101 ./realdir
> 1 ./linkdir
> 1 .
Some hardlinks:
% printf 'n="%s" ; ln ./realdir/file$n ./linkdir/link$n\n' `seq 1 100` |
. /dev/stdin
% du --inodes -S
> 101 ./realdir
> 1 ./linkdir
> 1 .
Look at the hardlinks:
% cd ./linkdir
% du --inodes -S
> 101
% cd ../realdir
% du --inodes -S
> 101
They're counted alone, but go one directory up...
% cd ..
% du --inodes -S
> 101 ./realdir
> 1 ./linkdir
> 1 .
Then I ran my script from below and:
> 100 /home/mikeserv/test/realdir
> 100 /home/mikeserv/test/linkdir
> 2 /home/mikeserv/test
And Graeme's:
> 101 ./realdir
> 101 ./linkdir
> 3 ./
So I think this shows that the only way to count inodes is by inode. And because counting files means counting inodes, you cannot doubly count inodes - to count files accurately inodes cannot be counted more than once.
OLD:
I find this faster, and it's portable:
sh <<-\CMD
{ echo 'here='"$PWD"
printf 'cd "${here}/%s" 2>/dev/null && {
set --
for glob in ".[!.]*" "[!.]*" ; do
set -- $glob "$@" &&
[ -e "./$1" ] || shift
done
printf "%%s\\t%%s\\n" $# "$PWD"
}\n' $( find . -depth -type d 2>/dev/null )
} | . /dev/stdin |
sort -rn |
sed -n \
'1,50{/^.\{71\}/s/^\(.\{30\}\).*\(.\{37\}\)$/\1...\2/;p}'
CMD
It doesn't have to -exec for every directory - it only uses the one shell process and one find. I have to get the set -- $glob right still to include .hidden files and all else, but it's very close and very fast. You would just cd into whatever your root directory should be for the check and off you go.

Here's a sample of my output run from /usr:
14684 /usr/share/man/man3
4322 /usr/lib
3650 /usr/bin
2454 /usr/share/man/man1
1897 /usr/share/fonts/75dpi
...
557 /usr/share/gtk-doc/html/gtk3
557 /usr/share/doc/elementary/latex
539 /usr/lib32/wine/fakedlls
534 /usr/lib/python2.7/site-packages/bzrlib
500 /usr/lib/python3.3/test
I also use sed at the bottom there to trim it to the top 50 results. head would be faster, of course, but I also trim each line if necessary:
...
159 /home/mikeserv/.config/hom...hhkdoolnlbekcfllmednbl/4.30_0/plugins
154 /home/mikeserv/.config/hom...odhpcledpamjachpmelml/1.3.11_0/js/ace
...
It's crude, admittedly, but it was a thought. Another crude device I use is redirecting stderr for both find and cd into /dev/null. It's just cleaner than looking at permission errors for directories I can't read without root access - perhaps I should specify that to find. Well, it's a work in progress.
Ok, so I did fix the shell globs like this:
for glob in ".[!.]*" "[!.]*" ; do
set -- $glob "$@" &&
[ -e "./$1" ] || shift
done
I was actually going to ask a question on how it could be done, but as I was typing in the question title the site pointed me to a suggested related question where, lo and behold, Stephane had already weighed in. So that was convenient. Apparently [^.], while well-supported, is not portable and you have to use the ! bang. I found that in Stephane's comment there.

Anyway, just pulling in hidden files wasn't enough though, obviously. So I have to set twice in order to avoid searching positionals for the literal $glob. Still, it doesn't seem to affect performance at all, and it reliably adds every file in the directory.
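Here's a hedged sketch of why the two-glob trick works (the /tmp/globdemo directory and the file names in it are invented for the demonstration): a glob that matches nothing is left as a literal word, and the [ -e "./$1" ] || shift test throws that literal away, so $# ends up as the number of real entries, hidden ones included:
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch visible .hidden
set --
for glob in ".[!.]*" "[!.]*" ; do
    set -- $glob "$@" &&      # prepend this glob's matches (or the unexpanded pattern)
    [ -e "./$1" ] || shift    # a pattern that matched nothing fails the test and gets dropped
done
printf '%s\t%s\n' "$#" "$PWD"   # -> 2   /tmp/globdemo (assuming a fresh directory)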
-
@Graeme You know, neither of our solutions are actually handling inodes, though. A lot of those files we're listing are likely hard-linked to one another. I think I could do this with ls -i and... I guess... probably grep... maybe - well, you're using -xdev, which is a start... uniq and sort? – mikeserv Apr 03 '14 at 05:05
-
That's a bleeding edge feature :-) I'm running 8.21. Looks like it was added 2013-07-27: http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=333dc83d52e014a0b532e316ea8cd93b048f1ac6 – phemmer Apr 03 '14 at 13:10
-
Also, if you don't mind, could you post that on this question. I don't think I'll accept it as it's not very portable, but I will upvote, and it'd be nice to have another solution on the question. – phemmer Apr 03 '14 at 13:13
-
@Patrick - It's not bleeding edge - it's stable GNU Coreutils, I'm not running a beta version. Still, yeah, I could. I can make it work with ls -i as well. – mikeserv Apr 03 '14 at 13:22
-
No --inodes for me either, I guess Debian is behind the game with packaging this one. – Graeme Apr 03 '14 at 13:44
-
@Graeme I can do it with ls - with one invocation of ls even - I just got an idea of how. But Debian's kind of famously behind the game so it's no real surprise. But a little while longer and I'll show you the other way. – mikeserv Apr 03 '14 at 13:47
-
The ls one works well and is much faster than the find approach. I would add that a mount can be used to search the root filesystem. Also, it won't work with relative paths unless you have ./ (or ../) at the start. – Graeme Apr 04 '14 at 14:14
-
@Graeme I thought I fixed that path thing. Maybe I just didn't save the edit - it's probably open somewhere in one of these tabs. The truth is though that this should be done with awk, I think. sed is very difficult here because I can't squeeze the inode numbers in regex - whereas awk (I think) could with ease. Unfortunately - I never learned how to use it. Possibly I could also use grep, but I really think at least two of those sorts would be entirely unnecessary if awk did this. – mikeserv Apr 04 '14 at 14:19
-
@Graeme it does handle any path as written, however, I guess I had two different ones up there. One was right(er?) - and one was wrong. It's rectified now I think. Sorry about that. – mikeserv Apr 04 '14 at 14:51
-
If I use the version here on a directory mnt, I just get a single number printed. It's the /^[./]/ that does it AFAICT. Not having a . or / at the beginning makes it more difficult to identify a path (since you could have a relative path starting with numbers). I would just leave it and note the requirement. Btw my last comment was supposed to say 'bind mount', which probably makes more sense since ls doesn't have a -xdev equivalent. – Graeme Apr 04 '14 at 15:59
-
What if I only want to display the folders that have at least 100 files in them? – Nike Dattani Feb 03 '23 at 05:42
Using GNU tools:
find / -xdev -type d -print0 |
while IFS= read -d '' dir; do
echo "$(find "$dir" -maxdepth 1 -print0 | grep -zc .) $dir"
done |
sort -rn |
head -50
This uses two find commands. The first finds directories and pipes them to a while loop that runs the next find for each directory. The second lists all the child files/directories in the first level while grep counts them. The grep allows -print0 to be used with the second find since wc does not have a -z equivalent. This stops filenames with a newline from being counted twice (although using wc and no -print0 wouldn't make much difference).
The result of the second find is placed in the argument to echo so it and the directory name can easily be placed on the same line (the $(..) construct automatically trims the newline at the end of grep). Lines are then sorted by number and the 50 largest numbers shown with head.
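As a hedged illustration of why grep -zc . is used for the counting (the /tmp/countdemo directory and the file names are invented for the example), a filename containing a newline is counted twice by wc -l but only once as a NUL-delimited record:
mkdir -p /tmp/countdemo && cd /tmp/countdemo
touch normal 'two
lines'                                    # a file whose name contains a newline
find . -maxdepth 1 | wc -l                # 4 - the newline in the name produces an extra line
find . -maxdepth 1 -print0 | grep -zc .   # 3 - ., ./normal and the odd name counted once each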
Note that this will also include the top level directories of mount points. A simple way to get around this is to use a bind mount and then use the directory of the mount. To do this:
sudo mount --bind / /mnt
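For instance (a sketch that just reuses the loop above; /mnt is simply the mount point chosen here), the whole sequence would look like:
sudo mount --bind / /mnt    # a plain bind mount does not carry other mounted filesystems with it
find /mnt -xdev -type d -print0 |
while IFS= read -d '' dir; do
echo "$(find "$dir" -maxdepth 1 -print0 | grep -zc .) $dir"
done |
sort -rn |
head -50
sudo umount /mnt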
A more portable solution uses a different shell instance for each directory (also answered here):
find / -xdev -type d -exec sh -c '
echo "$(find "$0" | grep "^$0/[^/]*$" | wc -l) $0"' {} \; |
sort -rn |
head -50
Sample output:
9225 /var/lib/dpkg/info
6322 /usr/share/qt4/doc/html
4927 /usr/share/man/man3
2301 /usr/share/man/man1
2097 /usr/share/doc
2097 /usr/bin
1863 /usr/lib/x86_64-linux-gnu
1679 /var/cache/apt/archives
1628 /usr/share/qt4/doc/src/images
1614 /usr/share/qt4/doc/html/images
1308 /usr/share/scilab/modules/overloading/macros
1083 /usr/src/linux-headers-3.13-1-common/include/linux
1071 /usr/src/linux-headers-3.13-1-amd64/include/config
847 /usr/include/qt4/QtGui
774 /usr/include/qt4/Qt
709 /usr/share/man/man8
616 /usr/lib
611 /usr/share/icons/oxygen/32x32/actions
608 /usr/share/icons/oxygen/22x22/actions
598 /usr/share/icons/oxygen/16x16/actions
579 /usr/share/bash-completion/completions
574 /usr/share/icons/oxygen/48x48/actions
570 /usr/share/vim/vim74/syntax
546 /usr/share/scilab/modules/m2sci/macros/sci_files
531 /usr/lib/i386-linux-gnu/wine/wine
530 /usr/lib/i386-linux-gnu/wine/wine/fakedlls
496 /etc/ssl/certs
457 /usr/share/mime/application
454 /usr/share/man/man2
450 /usr/include/qt4/QtCore
443 /usr/lib/python2.7
419 /usr/src/linux-headers-3.13-1-common/include/uapi/linux
413 /usr/share/fonts/X11/misc
413 /usr/include/linux
375 /usr/share/man/man5
374 /usr/share/lintian/overrides
372 /usr/share/cmake-2.8/Modules
370 /usr/share/fonts/X11/75dpi
370 /usr/share/fonts/X11/100dpi
356 /usr/share/icons/gnome/24x24/actions
356 /usr/share/icons/gnome/22x22/actions
356 /usr/share/icons/gnome/16x16/actions
353 /usr/share/icons/gnome/48x48/actions
353 /usr/share/icons/gnome/32x32/actions
341 /usr/lib/ghc/ghc-7.6.3
326 /usr/sbin
324 /usr/share/scilab/modules/compatibility_functions/macros
324 /usr/share/scilab/modules/cacsd/macros
320 /usr/share/terminfo/a
319 /usr/share/i18n/locales
-
To use Graeme's find solution on OSX I first needed to install findutils via brew (brew install findutils) ... and then gfind . -xdev -type d -exec sh -c 'echo "$(find "$0" | grep "^$0/[^/]*$" | wc -l) $0"' {} \; | sort -rn | head -50 – robbogdan Aug 10 '22 at 21:47
-
@robbogdan Ok, but none of the commands that you show requires GNU find. The part that requires GNU find from this answer is -print0, which is also supported by find on macOS. – Kusalananda Aug 15 '22 at 13:05
Why not use something like KDirStat? Although it was originally written for KDE, it works fine with GNOME as well. It gives you the best view of the number of files/directories and their respective usage in a GUI.
To find a list of the top directories that contain the biggest number of entries (files and directories), I ended up with a simple command (GNU tools):
find /usr -xdev -type d -print | xargs -n1 du --inodes -sS | sort -rn | head -10
and the output looks as follows:
20418 /usr/share/doc/libreoffice-7.3.6.2/sdk/docs/idl/ref
12155 /usr/share/man/man3
5989 /usr/share/gtk-doc/html/gtk4
3866 /usr/lib64
3862 /usr/share/doc/openssl-1.1.1q/html/man3
3046 /usr/share/gtk-doc/html/gdk4
2478 /usr/bin
2382 /usr/share/fonts/noto
2376 /usr/share/man/man1
2371 /usr/src/linux-5.16.20-gentoo/arch/arm/boot/dts
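If some directory names contain spaces or other whitespace, a hedged variant of the same pipeline using NUL-delimited names keeps xargs from splitting them:
find /usr -xdev -type d -print0 | xargs -0 -n1 du --inodes -sS | sort -rn | head -10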

That calls for zsh glob qualifiers:
print -rC1 -- **/*(ND/nOe['(){REPLY=$#;} $REPLY/*(NDoN)'][1,50])
- print -rC1 --: prints its arguments raw on 1 Column
- **/*: recursive globbing: any file in any number of subdirectories
- (N...): glob qualifiers to further qualify the glob expansion
- N: Nullglob. Does not complain if there's no match.
- D: Dotglob. Also consider hidden files.
- nOe[code]: reverse Order numerically based on the evaluation of the code.
  - the code here sets $REPLY to the number of files in the directory by passing the expansion of $REPLY/*(NDoN) to an anonymous function that stores its number of arguments in $REPLY.
- [1,50]: return only the first 50.
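If you also want the counts printed next to the directory names, a small hedged variation (re-expanding the glob inside a loop; n is just a scratch array name) would be:
for d in **/*(ND/nOe['(){REPLY=$#;} $REPLY/*(NDoN)'][1,50]); do
  n=($d/*(NDoN))        # entries (hidden ones included) directly inside $d
  print -r -- $#n $d    # entry count followed by the directory name
done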
