0

I have a bash script that, for each file in a set, greps the file for lines containing a string. It then splits each matching line on commas, converts the 7th element to a float, and adds that value to a running total.

It looks like this:

for filename in data*.CSV; do
   echo $filename
   ACTUAL_COST=0
   grep '040302010' "$filename" | while read -r line ; do
       IFS=',' read -r -a array <<< "$line"
       ACTUAL_COST=$(echo "$ACTUAL_COST + ${array[6]}" | bc)
       echo $ACTUAL_COST
   done
   echo $ACTUAL_COST
done

But the problem I'm having is that this produces output like this:

53.4
72.2
109.1
0

The last value is always 0. After Googling a bit, I think this is because the while loop is executing in a subshell, and so the outer variable isn't changed.
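
The effect can be reproduced without any CSV files. A minimal sketch, assuming bash with its default settings (the lastpipe option off); the variable name count is only for illustration:

```bash
#!/usr/bin/env bash
# Each part of a pipeline runs in its own subshell, so the increment
# below changes a copy of "count" that is discarded when the loop ends.
count=0
printf 'a\nb\nc\n' | while read -r _; do
    count=$((count + 1))
done
echo "count after the pipeline: $count"
```

Under those assumptions the final echo reports 0 rather than 3, which matches the behaviour of ACTUAL_COST above.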

I understand that I probably need to execute the inner loop in a function.

Rui F Ribeiro
Richard

3 Answers

2

That's not how you do shell scripting. You're running several commands in sequence for each line of the files!

Here you want something like:

awk -F, '/040302010/ {actual_cost += $7}
         ENDFILE {print FILENAME ":", +actual_cost; actual_cost=0}
        ' data*.CSV

(This assumes GNU awk, which provides the ENDFILE pattern.)

That's one command in total for all your files.
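
If GNU awk is not available, the per-file reset can probably be approximated in POSIX awk by testing FNR == 1, which is true on the first line of each new file. A sketch under that assumption; the function name sum_costs and the variable names prev and total are made up here, and unlike ENDFILE this version prints nothing for a completely empty file:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints "<file>: <sum of field 7>" for each file given.
sum_costs() {
    awk -F, '
        # First line of a later file: report the previous file, then reset.
        FNR == 1 && NR > 1 { print prev ":", +total; total = 0 }
        # Remember which file the running total belongs to.
        FNR == 1           { prev = FILENAME }
        /040302010/        { total += $7 }
        # Report the last file, provided at least one line was read.
        END                { if (NR > 0) print prev ":", +total }
    ' "$@"
}

# Usage: sum_costs data*.CSV
```

It is still a single awk process for all the files, so the performance argument above is unchanged.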

1

To avoid the subshell you can use the following:

while read -r line
do
    your_stuff
done < <(grep '040302010' "$filename")

Because the loop now runs in the current shell rather than in a subshell, variables assigned inside it keep their values after the loop ends.
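
Applied to the loop from the question, this might look as follows. A sketch, assuming bash (for process substitution and read -a), that bc is installed, and that the cost is the seventh comma-separated field, i.e. index 6 of the zero-based array; the function name sum_file and the variable names total and fields are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical helper: sums the 7th field of matching lines in one file.
sum_file() {
    local filename=$1 total=0 fields
    # The loop reads from a process substitution, so it is not part of a
    # pipeline and "total" is updated in the current shell.
    while IFS=',' read -r -a fields; do
        total=$(echo "$total + ${fields[6]}" | bc)
    done < <(grep '040302010' "$filename")
    # "total" still holds the sum here, after the loop has finished.
    echo "$total"
}

# Usage: for filename in data*.CSV; do sum_file "$filename"; done
```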

Lambert
0

It might be handy to introduce another command substitution, with the core loop logic defined in a function:

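sum_cost() {
   # Reads matching lines on stdin and prints the sum of their 7th field.
   local sum=0 array
   while read -r line ; do
       IFS=',' read -r -a array <<< "$line"
       # Arrays are zero-based, so the 7th field is index 6.
       sum=$(echo "$sum + ${array[6]}" | bc)
   done
   echo "$sum"
}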

for filename in data*.CSV; do
   echo "$filename"
   ACTUAL_COST=$(grep '040302010' "$filename" | sum_cost)
   echo "$ACTUAL_COST"
done
yaegashi