I have this short if-then-else script:

for INV in "$(ls np4178/*pdf)" ;\
do INVNUMB="$(pdfgrep -ho 'IN[0-9]{6,6}' $INV)" ; \
if [[ -z ${INVNUMB+x} ]]; \
then \
  echo "\n$INVNUMB" ; \
else \
  echo "wrong  \n$INVNUMB" ; \
fi ; \
done

Which produces this:

wrong  \nIN353886
IN353897
IN353905
IN353910
IN353902
IN353864
IN353875
IN353840
IN353862
IN353922
IN353739
IN353876
IN353920

However, if I make a change to the else statement I get this:

for INV in "$(ls np4178/*pdf)" ;\
   do INVNUMB="$(pdfgrep -ho 'IN[0-9]{6,6}' $INV)" ; \
   if [[ -z ${INVNUMB+x} ]]; \
   then \
     echo "\n$INVNUMB" ; \
   else \
     echo "\nIN000000" ; \
   fi ; \
done

Then I only get this:

\nIN000000

Why? How can changing the text string in the else clause cause the entire behaviour and results of the script to change?

Why is the newline character \n printed as a literal in the else clause?

  • Bash pitfall number one. See where it reads "You can't simply double-quote the substitution either". – Kamil Maciorowski May 11 '21 at 20:02
  • None of the backslashes at the ends of lines are needed. – choroba May 11 '21 at 20:06
  • To begin with, I was using the wrong test. -z should be -n – James B. Byrne May 11 '21 at 20:17
  • If the xpg_echo shopt option was set, \n would be special to echo; but, then, every echo "wrong \n$INVNUMB" would print wrong + a newline + the expansion of $INVNUMB. This is not what we see, telling us that xpg_echo is not set. Hence, (since echo -e is not used) \n is not special to echo, meaning that your first output snippet is produced by a single echo "wrong \n$INVNUMB". Which seems consistent with the result you get when you change "wrong \n$INVNUMB" into "\nIN000000". – fra-san May 11 '21 at 20:22
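
The comment above about \n can be checked with printf, which is used here because echo's handling of backslashes varies between shells and shell options. This is only a minimal sketch with a made-up sample value: %s prints its argument verbatim, while %b asks printf to interpret backslash escapes inside the argument.

```bash
# Hypothetical sample value containing a backslash followed by n,
# in the same shape as the string from the question.
msg='wrong  \nIN353886'

# %s prints the argument verbatim: the backslash and the n stay literal.
literal=$(printf '%s' "$msg")

# %b interprets backslash escapes inside the argument: \n becomes a newline.
expanded=$(printf '%b' "$msg")

printf 'with %%s: %s\n' "$literal"
printf 'with %%b: %s\n' "$expanded"
```

The first line of output keeps the two characters \n, as in the question's output; the second is split across two lines.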

2 Answers

5

There are some flaws in this code.

  1. Don't parse ls. And because the command substitution is quoted, the entire output of ls is passed to INV as a single string, so the loop body runs only once. Instead you can just loop over the glob results:
for inv in np4178/*pdf; do
  2. You don't need any of the ;'s or \'s in that code. In fact, the way they are written, they cancel each other out: ; separates commands just as a newline does, and \ at the end of a line escapes the newline, joining that line to the next.

  3. In bash, echo doesn't interpret backslash escapes such as \n by default. You could use echo -e, but printf is the better choice.

  4. As you have already discovered, your test construct checks whether the variable is empty when you apparently want to check that it is not empty.

  5. There is absolutely no reason to do ${INVNUMB+x}. That expansion yields x whenever the variable is set, even to an empty string, and INVNUMB is always set by the assignment on the line before the test.
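
The points above about the quoted command substitution and about ${INVNUMB+x} can be checked in isolation. The following is a minimal sketch, not the original script: it uses a scratch directory with three empty stand-in files (made-up names) instead of real invoices, and plain [ ... ] tests with quoted operands so that it also runs in a POSIX sh; the [[ ... ]] tests from the question give the same results for these cases.

```bash
# Scratch directory with three stand-in files (hypothetical names).
dir=$(mktemp -d)
touch "$dir/a.pdf" "$dir/b.pdf" "$dir/c.pdf"

# Quoted command substitution: the whole ls output is ONE word,
# so the loop body runs once.
quoted=0
for f in "$(ls "$dir"/*.pdf)"; do
    quoted=$((quoted + 1))
done

# Plain glob: one word per file, so the loop body runs three times.
globbed=0
for f in "$dir"/*.pdf; do
    globbed=$((globbed + 1))
done

# ${var+x} expands to x whenever var is set, even to the empty string,
# so a -z test on it is false for any variable that has been assigned.
invnumb=''
if [ -z "${invnumb+x}" ]; then plus_x=empty; else plus_x=non-empty; fi

# A -n test on the value itself distinguishes empty from non-empty.
if [ -n "$invnumb" ]; then plain=non-empty; else plain=empty; fi

printf 'quoted loop: %s iteration(s), glob loop: %s iteration(s)\n' "$quoted" "$globbed"
printf 'test on ${invnumb+x}: %s, test on $invnumb: %s\n' "$plus_x" "$plain"

rm -r "$dir"
```

Taken together, the two effects account for the output in the question: the loop runs once with every file name in a single string, and the -z test is always false, so the else branch runs every time.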

Here is what the code you want may look like:

for inv in np4178/*pdf; do
    invnumb="$(pdfgrep -ho 'IN[0-9]{6,6}' "$inv")"
    if [[ -n "$invnumb" ]]; then
        printf '\n%s\n' "$invnumb"
    else
        printf '\n%s\n' "IN000000"
    fi
done
jesse_b
  • Why do you even store the output of pdfgrep in a variable? if ! pdfgrep ... "$inv"; then echo IN000000; fi. This is assuming pdfgrep is sane about its exit status. – Kusalananda May 11 '21 at 20:42
  • I've never used pdfgrep, but it's possible that this is just a pared-down example of OP's code and the invnumb variable will be used for something else – jesse_b May 11 '21 at 20:43
  • I'm just assuming it works like grep. – Kusalananda May 11 '21 at 20:44
  • @Kusalananda: it appears that it does https://pdfgrep.org/doc.html – jesse_b May 11 '21 at 20:45
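
The exit-status approach suggested in the comments above could be sketched as follows. To keep the example runnable without PDF files, plain grep on text files stands in for pdfgrep; this assumes, as the comments do, that pdfgrep follows grep's convention of exiting with 0 on a match and non-zero otherwise. The report function and the sample files are made up for illustration.

```bash
# Stand-in for pdfgrep: plain grep on text files, with the same -h/-o
# options. ASSUMPTION: pdfgrep's exit status follows grep's convention.
dir=$(mktemp -d)
printf 'Invoice IN353886 for order 17\n' > "$dir/with.txt"
printf 'No invoice number here\n'        > "$dir/without.txt"

# Hypothetical helper: grep prints the match itself when there is one;
# the placeholder is printed only when grep reports no match.
report() {
    if ! grep -E -h -o 'IN[0-9]{6}' "$1"; then
        printf '%s\n' 'IN000000'
    fi
}

found=$(report "$dir/with.txt")
missing=$(report "$dir/without.txt")
printf '%s\n%s\n' "$found" "$missing"

rm -r "$dir"
```

No intermediate variable is needed inside the loop, because the command's own output and exit status carry all the information.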
-1

Thank you for all the help. It was very useful. The example I gave was not the complete script, obviously. I did not comprehend the significance of enclosing variable expansions in double quotes, so lesson learned. It was also an early attempt at understanding the problem on my part, so I was somewhat confused about what exactly I was trying to accomplish.

I posted this as an answer to get the code formatting right. What follows is, more or less, what I ended up with:

for INV in "$SRCDIR"/HP3000-INV*pdf
do
  # search files for IN###### and return only first occurrence of match
  INVNUMB="$(pdfgrep -h -o -m 1 'IN[0-9]{6,6}' "$INV")"
  if [[ -n ${INVNUMB} ]]
  then
    # prepend the invoice number to the file name
    # and move renamed file to xfr directory
    mv -f "$INV" "$(dirname "$INV")/$TODAY/$INVNUMB-$(basename "$INV")"
  fi
done
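
Before running a loop that moves files, it can help to print the destination path instead of calling mv. The following is a minimal dry-run sketch under stated assumptions: dest_path is a hypothetical helper, and the source path, date and invoice number are made-up sample values, chosen to include a space so that the effect of quoting is visible.

```bash
# Hypothetical helper: build the destination path used by the mv above
# without touching the filesystem.
dest_path() {
    # $1 = source file, $2 = dated subdirectory, $3 = invoice number
    printf '%s/%s/%s-%s\n' "$(dirname "$1")" "$2" "$3" "$(basename "$1")"
}

# Made-up sample values; the directory name contains a space on purpose.
dest=$(dest_path '/tmp/src dir/HP3000-INV0001.pdf' '2021-05-11' 'IN353886')
printf '%s\n' "$dest"
```

With every expansion quoted, the directory name containing a space stays in one piece; unquoted, it would be split into separate words before dirname and basename saw it.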