This is reposted from here at the asker's behest:
du --inodes --separate-dirs | sort -rh | sed -n \
'1,50{/^.\{71\}/s/^\(.\{30\}\).*\(.\{37\}\)$/\1...\2/;p}'
And if you want to stay in the same filesystem you do:
du --inodes --one-file-system --separate-dirs
Here's some example output:
15K /usr/share/man/man3
4.0K /usr/lib
3.6K /usr/bin
2.4K /usr/share/man/man1
1.9K /usr/share/fonts/75dpi
...
519 /usr/lib/python2.7/site-packages/bzrlib
516 /usr/include/KDE
498 /usr/include/qt/QtCore
487 /usr/lib/modules/3.13.6-2-MANJARO/build/include/config
484 /usr/src/linux-3.12.14-2-MANJARO/include/config
NOW WITH LS:
Note that the above require GNU du (i.e., from GNU coreutils), because POSIX du does not support --inodes, --one-file-system or --separate-dirs.
(If you have Linux, you probably have GNU coreutils. And if you have GNU du, you can abbreviate --one-file-system to -x (lower case) and --separate-dirs to -S (upper case). POSIX du recognizes -x, but not -S or any long options.)
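For what it's worth, here is a minimal sketch of the short-option form wrapped in a function. It assumes GNU du from coreutils 8.22 or later; the name top_inode_dirs and its two arguments are made up for illustration and are not part of the original commands.

```shell
# Hypothetical helper: list the N directories that hold the most inodes.
# Assumes GNU du (coreutils 8.22+): -x is --one-file-system, -S is --separate-dirs.
# The body runs in a subshell so dir and n do not leak into the caller.
top_inode_dirs() (
    dir=${1:-.}     # directory to scan, default: current directory
    n=${2:-10}      # number of lines to show, default: 10
    # Without -h, du --inodes prints plain integers, so a numeric sort is enough.
    du --inodes -xS -- "$dir" | sort -rn | head -n "$n"
)
```

Something like top_inode_dirs /usr 20 would then print the twenty busiest directories under /usr without crossing into other filesystems.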
Several people mentioned they do not have up-to-date coreutils and the --inodes option is not available to them. (But it was present in GNU coreutils version 8.22; if you have a version older than that, you should probably upgrade.)
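If you're not sure which camp you're in, a quick probe along these lines should tell you. This is only a sketch: the variable name du_inodes is made up, and /dev/null is used purely as a cheap argument that always exists.

```shell
# Probe whether the installed du understands --inodes.
# An older or non-GNU du is expected to reject the option and exit non-zero.
if du --inodes /dev/null >/dev/null 2>&1; then
    du_inodes=yes
else
    du_inodes=no
fi
echo "du --inodes supported: $du_inodes"
```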
So, here's ls:
ls ~/test -AiR1U |
sed -rn '/^[./]/{h;n;}; G;
s|^ *([0-9][0-9]*)[^0-9][^/]*([~./].*):|\1:\2|p' |
sort -t : -uk1.1,1n |
cut -d: -f2 | sort -V |
uniq -c | sort -rn | head -n10
If you're curious, the heart-and-soul of that tedious bit of regex there is replacing the filename in each of ls's recursive search results with the directory name in which it was found. From there it's just a matter of squeezing repeated inode numbers, then counting repeated directory names and sorting accordingly.
The -U option is especially helpful with the sorting in that it specifically does not sort, and instead presents the entries in directory order, as they are stored on disk. And of course -A for (almost) all, -i for inode and -R for recursive, and that's the long and short of it. The -1 (one) option was included out of force of habit.
The underlying method to this is that I replace every one of ls's filenames with its containing directory name in sed. Following on from that... Well, I'm a little fuzzy myself. I'm fairly certain it's accurately counting the files, as you can see here:
% _ls_i ~/test
100 /home/mikeserv/test/realdir
2 /home/mikeserv/test
1 /home/mikeserv/test/linkdir
(where _ls_i represents the above ls-sed-... pipeline, defined as an alias or a script).
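The same idea (tag each entry with its directory, keep each inode once, count per directory) can also be sketched in awk instead of sed. To be clear, this is not the pipeline above, just an illustration of the technique. It assumes GNU ls, file names without newlines, and a starting path that is absolute or begins with ./ so that header lines never start with a digit. The name count_inodes_per_dir is made up, and its counts need not match the figures quoted here exactly.

```shell
# Hypothetical helper: count first-seen inodes per directory from ls -AiR1U output.
# $1: directory to scan (absolute or ./-prefixed path).
count_inodes_per_dir() {
    ls -AiR1U "$1" | awk '
        /^ *[0-9]+ / {                  # entry line: "<inode> <name>"
            if (!($1 in seen)) {        # count each inode only the first time it appears
                seen[$1] = 1
                count[dir]++
            }
            next
        }
        /:$/ {                          # header line: "<directory>:"
            dir = substr($0, 1, length($0) - 1)
        }
        END { for (d in count) print count[d], d }
    ' | sort -rn
}
```

Because a hard-linked inode is credited to whichever directory ls happens to list first, a directory whose files were all seen earlier does not appear in the output at all.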
This is providing me pretty much identical results to the du command:
DU:
15K /usr/share/man/man3
4.0K /usr/lib
3.6K /usr/bin
2.4K /usr/share/man/man1
1.9K /usr/share/fonts/75dpi
1.9K /usr/share/fonts/100dpi
1.9K /usr/share/doc/arch-wiki-markdown
1.6K /usr/share/fonts/TTF
1.6K /usr/share/dolphin-emu/sys/GameSettings
1.6K /usr/share/doc/efl/html
LS:
14686 /usr/share/man/man3:
4322 /usr/lib:
3653 /usr/bin:
2457 /usr/share/man/man1:
1897 /usr/share/fonts/100dpi:
1897 /usr/share/fonts/75dpi:
1890 /usr/share/doc/arch-wiki-markdown:
1613 /usr/include:
1575 /usr/share/doc/efl/html:
1556 /usr/share/dolphin-emu/sys/GameSettings:
If you tediously compare the above, line by line, you'll notice that the 8th line of the du output is /usr/share/fonts/TTF (1.6K) while the 8th line of the ls output is /usr/include (1613).
I think the include thing just depends on which directory the program looks at first, because they're the same files and hardlinked. Kinda like the thing above. I could be wrong about that though, and I welcome correction....
DU DEMO
% du --version
du (GNU coreutils) 8.22
Make a test directory:
% mkdir ~/test ; cd ~/test
% du --inodes --separate-dirs
1 .
Some children directories:
% mkdir ./realdir ./linkdir
% du --inodes --separate-dirs
1 ./realdir
1 ./linkdir
1 .
Make some files:
% printf 'touch ./realdir/file%s\n' `seq 1 100` | . /dev/stdin
% du --inodes --separate-dirs
101 ./realdir
1 ./linkdir
1 .
Some hard links:
% printf 'n="%s" ; ln ./realdir/file$n ./linkdir/link$n\n' `seq 1 100` |
. /dev/stdin
% du --inodes --separate-dirs
101 ./realdir
1 ./linkdir
1 .
Look at the hard links:
% cd ./linkdir
% du --inodes --separate-dirs
101
% cd ../realdir
% du --inodes --separate-dirs
101
They're counted alone, but go one directory up...
% cd ..
% du --inodes --separate-dirs
101 ./realdir
1 ./linkdir
1 .
Then I ran my ls script and got:
100 /home/mikeserv/test/realdir
100 /home/mikeserv/test/linkdir
2 /home/mikeserv/test
And output from Graeme's answer to a similar question:
101 ./realdir
101 ./linkdir
3 ./
So I think this shows that the only way to count inodes is by inode. And because counting files means counting inodes, to count files accurately no inode can be counted more than once.
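To make the "count by inode" point concrete, here is a small sketch that counts distinct inodes under a directory. It assumes GNU find, since -printf is a GNU extension, and the function name count_unique_inodes is made up for illustration.

```shell
# Hypothetical helper: count distinct inodes under a directory.
# Assumes GNU find; -xdev keeps the walk on one filesystem,
# %i prints each entry's inode number, and sort -u drops repeated inodes.
count_unique_inodes() {
    find "$1" -xdev -printf '%i\n' | sort -u | wc -l
}
```

On a tree where some names are hard links to the same file, this number comes out lower than a plain find "$dir" | wc -l, which counts directory entries rather than inodes.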
ls -a is a bad choice for scripting recursion, because it shows . and .. and then you'll have duplicated data. You can use -A instead of -a. – PersianGulf Feb 26 '14 at 18:13
[...] /tmp and then later the system is configured to mount a tmpfs on /tmp. Then you won't be able to find the files with find alone. Unlikely scenario, but worth noting. – Graeme Feb 26 '14 at 18:25
[...] sort in the command? That should not be necessary. The entries will already be grouped. – phemmer Feb 26 '14 at 20:38
[...] find may output a/b, a/b/c, a/b (try find . -printf '%h\n' | uniq | sort | uniq -d) – Stéphane Chazelas Feb 26 '14 at 20:39
[...] ls -l | wc -l. But if I had seen this post earlier, I could have checked the file system once before backing up. But nevertheless, +1 for a great answer and the explanation :) – Ramesh Jul 03 '14 at 23:55
[...] -printf appears to be a GNU extension to find, as the BSD version available in OS X does not support it. – Xiong Chiamiov Jun 11 '16 at 19:19
[...] du --inodes -x / | sort -n – OrangeDog Oct 02 '19 at 10:56
[...] --max-depth=1 in du? – ᴍᴇʜᴏᴠ Dec 03 '19 at 13:36
[...] sudo find / -xdev -printf "%h\n" | gawk '{a[$1]++}; END{for (n in a){ if (a[n]>1000){ print a[n],n } } }' | sort -nr | less – Aaron_H Dec 16 '19 at 08:33