
I wrote the following script to count the PDF and TeX files under the current directory, including subdirectories and hidden files. It finds the PDF files up to two levels of subdirectories down, but beyond that it reports that there are no subdirectories.

#!/bin/bash

touch t.txt

k=`find -type d |wc -l`
k1=`expr $k - 1`

echo $k1

message1="*.pdf *.tex"
count=`ls -al $message1|wc -l`
find -type d > t.txt

i=2

while [ $i -le $k ]
do
    kd=`head -$i t.txt|tail -1`
    echo $kd
    touch $kd/t.txt
    cp t.txt $kd/t.txt
    i=`expr $i + 1`
done

i=2
while [ $i -le $k ]
do
    nd=`head -$i t.txt|tail -1`
    set -x
    echo $nd
    set +x
    cd $nd
    j=`ls -al $message1|wc -l`
    count=`expr $count + $j`
    i=`expr $i + 1`
done
#set +x

echo $count

3 Answers


You can do this in pure bash:

shopt -s nullglob dotglob globstar
set -- **/*.pdf **/*.tex
echo "$#"

set sets the positional parameters of the current shell to the result of the glob. $# then retrieves the number of these parameters set.

If your script already uses the positional parameters (the asker's script does not), you can do the same with an array instead:

shopt -s nullglob dotglob globstar
files=(**/*.pdf **/*.tex)
echo "${#files[@]}"
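
To reuse the array approach without changing the glob options of the calling shell, it can be wrapped in a function. This is only a sketch: the name count_pdf_tex is made up for illustration, it assumes bash 4.0 or later is installed (globstar is not available before that), and like the glob itself it counts anything whose name matches, directories included.

```shell
# Hypothetical helper: count *.pdf and *.tex entries below a directory
# (default: the current directory), hidden ones included.
count_pdf_tex() {
    # A child bash keeps the shopt changes away from the calling shell.
    bash -c '
        shopt -s nullglob dotglob globstar   # globstar needs bash 4.0+
        cd -- "$1" || exit 1
        files=(**/*.pdf **/*.tex)            # ** also matches the top level
        echo "${#files[@]}"
    ' _ "${1:-.}"
}
```

Calling count_pdf_tex with no argument counts below the current directory; count_pdf_tex some/dir counts below that directory instead.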
Chris Down

find works fine for me:

$ find . -name '*.pdf' -o -name '*.tex' | wc -l
75
$ find . -name '*.pdf' | wc -l
16
$ find . -name '*.tex' | wc -l
59
$ echo $((16+59))
75

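Edit: to handle the special case of a newline in a filename, print one character per match and count characters instead of lines (-printf is a GNU find extension):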

$ find . \( -name '*.pdf' -o -name '*.tex' \) -printf x | wc -c
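
The difference between counting lines and counting characters can be checked in a scratch directory. The following is a sketch rather than part of the original answer: the function names are invented for the comparison, and it uses -exec printf x \; in place of -printf so that it should also run where find lacks that GNU extension, at the cost of starting one process per match.

```shell
# Hypothetical helpers comparing the two counting strategies on a directory.
count_by_lines() {
    # One line per path: a newline inside a name inflates the result.
    echo $(( $(find "$1" \( -name '*.pdf' -o -name '*.tex' \) | wc -l) ))
}

count_by_chars() {
    # One "x" per match, so odd file names cannot skew the count.
    echo $(( $(find "$1" \( -name '*.pdf' -o -name '*.tex' \) -exec printf x \; | wc -c) ))
}

# Two files whose names each contain a newline.
demo=$(mktemp -d)
: > "$demo/$(printf 'foo\nbar.pdf')"
: > "$demo/$(printf 'baz\nqux.tex')"

count_by_lines "$demo"   # counts lines of output, not files
count_by_chars "$demo"   # counts one character per file
rm -rf "$demo"
```

With these two files the line-based count comes out at 4 and the character-based count at 2.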
kev
  • This will break for files with newlines in their filename. – Chris Down Dec 22 '11 at 08:36
  • It does break, as you can quite clearly see executing the following code: > $'foo\nbar.pdf' ; > $'baz\nqux.tex' ; find . -name '*.pdf' -o -name '*.tex' | wc -l -- the reply is 4, which is not correct (there are two files). – Chris Down Dec 22 '11 at 08:47
  • @ChrisDown. You are right. – kev Dec 22 '11 at 08:54
  • @ChrisDown: I am always reluctant to make code more complex only to account for "newlines in filenames", because I have never seen such a case in everyday situations. Obviously, for code released to the public, it is right to account for every possibility. Are you aware of cases where "newlines in filenames" were not created by mistake or deliberately to test software? – enzotib Dec 22 '11 at 09:15
  • @enzotib I've seen it multiple times, but only by people using graphical file managers. Often it happens when they go to paste something from another source that contains newlines into a filename, and they don't expect the newlines to still be present. – Chris Down Dec 22 '11 at 09:16
  • @kev: But say I have a hidden file named '.pdf' (it has no extension): 'find . -name '*.pdf' | wc -l' counts that too, and the script also takes it into account. Also, 'ls -al' sometimes does not show the hidden files. – user13522 Dec 22 '11 at 14:03
  • @user13522. Just another special case: -name '*?.pdf' – kev Dec 22 '11 at 14:08
  • @kev: Great, that also solved the problem with the script I had already posted. One doubt: 'ls -l *.pdf' shows PDF files only, but why does 'ls -al *.pdf' show the same files (it does not show the hidden PDF files)? And what does the -name attribute of the find command do? – user13522 Dec 23 '11 at 05:07
  • Instead of counting the filenames, how about counting characters? You don't need the names anyway: find . \( -name '*.pdf' -o -name '*.tex' \) -printf x | wc -c – l0b0 Jan 03 '12 at 11:38
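
On the -name question raised in the comments: -name compares the last component of each path against a shell pattern, and in find a leading dot can be matched by *, which is why a file named just '.pdf' is counted by '*.pdf'. The '*?.pdf' pattern suggested above requires at least one character before the extension. A small sketch, with count_with_stem as a made-up name and -exec printf x \; standing in for GNU's -printf x:

```shell
# Hypothetical helper: count *.pdf and *.tex files, skipping files that
# are named exactly ".pdf" or ".tex" (no stem before the extension).
count_with_stem() {
    echo $(( $(find "$1" \( -name '*?.pdf' -o -name '*?.tex' \) -exec printf x \; | wc -c) ))
}
```

Hidden files with a real stem, such as '.notes.tex', still match '*?.tex' and are counted.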

I would recommend using locate instead of find, if it is available. It queries a prebuilt database, so results are nearly instant and put practically no load on the system. The database is only refreshed when updatedb runs, though, so if you need up-to-the-second results you would have to run updatedb first, which does put load on the system. Which is better depends on how you intend to use the search.

You can then filter the results with whatever regex meets your needs (quote the patterns so the shell does not expand them first):

system1:/unix.stackexchange # locate '*.tex' '*.pdf' | grep 'unix.stack.*'
   /unix.stackexchange/access_me/1/file.pdf
   /unix.stackexchange/access_me/1/file.tex
2bc