6

A simple for loop in Unix would be:

for FILE in "$BASE_WORK_DIR"/*.pdf
do
  echo "$FILE"
done

This echoes every .pdf file located directly inside the BASE_WORK_DIR directory.

What if BASE_WORK_DIR also contains subdirectories, and those subdirectories contain PDF files as well?

In that case, how can I write my for loop so that it picks up all PDF files from BASE_WORK_DIR as well as from its subdirectories?

Avinash Raj
  • 3,703
Vicky
  • 205

3 Answers

6

In bash 4, if following symlinks to directories is desirable, you can enable globstar and use **:

shopt -s globstar
for file in "$base_work_dir"/**/*.pdf
do
  echo "$file"
done

Otherwise in an sh script, find is probably the best way:

IFS='
'
set -f
for file in $(find "$base_work_dir" -name '*.pdf')
do
  echo "$file"
done

(Add the -L option to find to follow symlinks, as bash's globstar does.)

Note that you will have issues here if any filenames contain newlines.

Graeme
  • 34,027
  • Better use ksh93 -o globstar or zsh whose ** don't have the symlink issue. – Stéphane Chazelas Jan 22 '14 at 15:26
  • @Stephane why is the set -f necessary when the *.pdf is quoted? – Graeme Jan 22 '14 at 15:39
  • In every shell but zsh globbing is performed upon command substitution, so for instance if there's a file called *.pdf in $base_work_dir/foo, all the PDFs in foo except *.pdf will be counted twice. Leaving a variable or command substitution unquoted is the split+glob operator. If you don't want the glob part, you need set -f. – Stéphane Chazelas Jan 22 '14 at 15:45
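To make the split+glob point from the comments concrete, here is a small sketch on a scratch directory. The directory layout, the file names, and the count_words helper are made up for illustration; it assumes a POSIX-style sh or bash (not zsh) and a system that provides mktemp -d.

```shell
# Scratch tree: two ordinary PDFs plus one file literally named "*.pdf".
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/foo"
touch "$demo_dir/foo/a.pdf" "$demo_dir/foo/b.pdf" "$demo_dir/foo/*.pdf"

# Hypothetical helper: prints how many words the shell passed to it.
count_words() { echo "$#"; }

old_ifs=$IFS
IFS='
'
# Globbing enabled: the path ending in /*.pdf that find prints is
# expanded again by the shell, so a.pdf and b.pdf show up twice.
with_glob=$(count_words $(find "$demo_dir" -name '*.pdf'))

# Globbing disabled: only the split on newlines happens.
set -f
without_glob=$(count_words $(find "$demo_dir" -name '*.pdf'))
set +f
IFS=$old_ifs

echo "with globbing: $with_glob words, with set -f: $without_glob words"
rm -rf "$demo_dir"
```

With three matching files on disk, the first count comes out higher than the second, which is the double counting described above.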
6

The standard, canonical, and reliable syntax is:

find . -type f -name '*.pdf' -exec sh -c '
  for f do
    something with "$f"
  done' sh {} +

(Note that it may run several sh invocations if the list of files is very large.)
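As a rough, runnable illustration of that pattern, the sketch below builds a throwaway tree and substitutes a plain printf for "something with". The file names are invented for the demo, and mktemp -d is assumed to be available.

```shell
# Throwaway tree with PDFs at several depths, one with a space in its name.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub/deeper"
touch "$demo_dir/a.pdf" "$demo_dir/sub/b c.pdf" \
      "$demo_dir/sub/deeper/d.pdf" "$demo_dir/sub/notes.txt"

# find passes the matching paths to the inline script as positional
# parameters; "for f do" loops over "$@". The trailing "sh" fills $0.
pdf_count=$(
  find "$demo_dir" -type f -name '*.pdf' -exec sh -c '
    for f do
      printf "%s\n" "$f"
    done' sh {} + | wc -l | tr -d ' '
)

echo "PDF files seen by the inner loop: $pdf_count"
rm -rf "$demo_dir"
```

Because the paths arrive as arguments rather than through word splitting, the name containing a space reaches the loop as a single item.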

With zsh, the equivalent would be:

for f (./**/*.pdf(.NDoN)) {
  something with "$f"
}

(. for -type f, D to include hidden files, oN to not bother sorting the list, N to not complain if there's no matching file), except that you wouldn't get error messages for the directories you don't have access to.

3

Another solution with find that would solve the problem of newlines in file names:

find "$BASE_WORK_DIR" -name '*.pdf' -print0 |
  while IFS= read -d '' -r file;do
    # your magic here
  done

Caveat emptor: this only works in Zsh and Bash.

-print0 only works in a few find implementations like GNU find. Use -exec printf '%s\0' {} + if your find doesn't support it.
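To see why the NUL delimiter matters, here is a sketch that uses the -exec printf fallback and compares a line-based count with a NUL-based count. The scratch directory and file names are hypothetical; it avoids read -d so it should not depend on Bash or Zsh, but it does assume mktemp -d and a tr that accepts octal escapes.

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
# One ordinary name and one name with an embedded newline.
touch "$demo_dir/plain.pdf" "$demo_dir/sub/two
lines.pdf"

# Line-based counting treats the embedded newline as a record separator.
line_count=$(find "$demo_dir" -name '*.pdf' | wc -l | tr -d ' ')

# NUL-terminated names: keep only the NUL bytes and count them,
# which gives one per file regardless of what the names contain.
nul_count=$(find "$demo_dir" -name '*.pdf' -exec printf '%s\0' {} + |
  tr -cd '\000' | wc -c | tr -d ' ')

echo "lines: $line_count, NUL-terminated names: $nul_count"
rm -rf "$demo_dir"
```

Only the NUL-based count matches the number of files actually created; the line-based one overcounts.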

Joseph R.
  • 39,549