5

My friend works at an IT company where he is required to keep a count of the XML files inside a directory (including its subdirectories).

However, counting with ls -LR | grep .xml or similar takes a lot of time, as there are millions of files.

I was wondering what a better approach might be.

Could some kind of background process take care of this, so that whenever a file is created or its modification time changes, the counter (the number of files) is updated?


2 Answers

6

The daemon you describe could use inotify.
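A minimal sketch of such a daemon, assuming inotifywait from the inotify-tools package; the watch path is a placeholder, and instead of recounting on each event it keeps a running counter, incrementing on creates and decrementing on deletes:

DIR="/path/to/watch"    # placeholder: directory to watch

# Seed the counter with the current count of .xml files.
count=$(find "$DIR" -type f -name '*.xml' | wc -l)
echo "$count"

# Watch recursively; %e prints the event name(s), %f the file name.
# (File names containing newlines will confuse this sketch.)
inotifywait -m -r -e create -e delete -e moved_to -e moved_from \
        --format '%e %f' "$DIR" 2> /dev/null |
    while read -r event file; do
        case "$file" in
            *.xml)
                case "$event" in
                    CREATE*|MOVED_TO*)   count=$((count + 1)) ;;
                    DELETE*|MOVED_FROM*) count=$((count - 1)) ;;
                esac
                echo "$count"
                ;;
        esac
    done

As with any userspace recursion over inotify, files created in a brand-new subdirectory before its watch is established can be missed, so the counter may drift and should be reseeded occasionally.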

But perhaps using find and wc -l would already be fast enough?

I only ran one quick test, but there is a significant difference between time ls -lR /mm/|grep -c jpg (real 0m2.168s) and time find /mm -type f -name \*jpg|wc -l (real 0m0.397s) on my system. Both return approximately 42,000 files, so on larger directory trees the difference would probably be even bigger.

(I ran both commands several times to exclude disk caching effects.)
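Adapted to the original question's XML files (the path is a placeholder), that would be:

# Count regular files ending in .xml, recursively.
# Note wc -l counts lines, so file names containing newlines are miscounted.
find /path/to/dir -type f -name '*.xml' | wc -l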

Bram
  • Thanks for your input. find and wc -l take a lot of time; also, some kind of notifier would be better, as the number of file changes is very small, so repeatedly running find and wc -l doesn't suit my case. – Novice User May 12 '12 at 13:41
  • Counting lines might take a while. But you can use find's -printf . option and pipe the output to wc -c to count the printed dots. That should be faster (see my answer). – hluk May 12 '12 at 15:41
3

The following script will watch a directory indefinitely and print the number of *.xml files every time the directory contents change.

# Placeholder: set to the directory to watch.
DIR="a_path_to_directory_to_watch"
# The leading echo triggers an initial count; inotifywait then prints
# one line per create/delete event under $DIR, recursively, forever.
(echo; inotifywait -m -r -e create -e delete "$DIR" 2> /dev/null) |
    while read -r _; do
        # One dot per file, counted by wc -c (robust to odd file names).
        find "$DIR" -type f -name '*.xml' -printf . | wc -c
    done
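Two caveats: inotifywait comes from the inotify-tools package and -printf is a GNU find extension, so this is Linux/GNU-specific; and the script re-runs the full find on every event, so with millions of files each update still costs a complete traversal.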
hluk