
With my code I'm trying to sum the values in one column of a CSV file, where the column is picked by the name given as input. Here's my code:

#!/bin/bash

updatedata() {

index=0
while IFS="" read -r line
do
    IFS=';' read -ra array <<< "$line"
    for arrpos in "${array[@]}"
    do
        if [ "$arrpos" == *"$1"* ] || [ "$1" == "$arrpos" ]
        then
            break
        else
            let index=index+1
        fi
    done
    break

done < data.csv
((index=$index+1))



if [ $pos -eq 0 ]
then
    v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 1 ]
then
    v1=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 2 ]
then
    v2=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
elif [ $pos -eq 3 ]
then
    v3=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
fi



}

In the middle of the code, at v0=, you can see I was experimenting a little, but I just keep getting errors. First I tried this:

v0=$(awk -F";" '{x+=$index}END{print x}' ./data.csv)

but it gave me this error: 'awk: line 1: syntax error at or near }'

So then I decided to try this (as you can see in the code):

v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )

And I got this error: 'awk: run time error: cannot command line assign to index type clash or keyword FILENAME="" FNR=0 NR=0'

I don't know what to do. Can you help me?

  • index is a built-in awk function. You may want to use another name for this variable (and use $(varname) in awk). You also should not have a comma after -F ';'. Not turning this into an answer as a real answer should probably also point to better ways of doing this operation (the shell loop is probably not needed). – Kusalananda Aug 28 '20 at 09:33
  • See why-is-using-a-shell-loop-to-process-text-considered-bad-practice. If you edit your question to include concise, testable sample input and expected output then we could help you do whatever it is you're trying to do the right way. – Ed Morton Aug 28 '20 at 13:37

1 Answer


Given some CSV data in data.csv,

A;B;C
1;2;3
4;5;6
-1.2;3;3.3

the following script would calculate the sum of the column named by the colname variable given on the command line:

BEGIN {
    FS = ";"

    if (colname == "") {
        print "Did not get column name (colname) to work with" >"/dev/stderr"
        exit 1
    }
}

FNR == 1 {
    colnum = 0

    for (i = 1; i <= NF; ++i)
        if ($i == colname) {
            colnum = i
            break
        }

    if (colnum == 0) {
        printf "Did not find named column (colname = \"%s\")\n", colname >"/dev/stderr"
        exit 1
    }

    sum = 0
    next
}

{ sum += $colnum }

END { print sum }

Testing it:

$ awk -v colname='A' -f script.awk data.csv
3.8
$ awk -v colname='B' -f script.awk data.csv
10
$ awk -v colname='C' -f script.awk data.csv
12.3
$ awk -v colname='D' -f script.awk data.csv
Did not find named column (colname = "D")

Shorter variant of the script without so much error checking:

BEGIN { FS = ";" }

FNR == 1 {
    for (i = 1; i <= NF; ++i)
        if ($i == colname)
            break

    if (i > NF) exit 1
    next
}

{ sum += $i }

END { print sum }

or, as a "one-liner":

$ awk -v colname='A' -F ';' 'FNR == 1 { for (i = 1; i <= NF; ++i) if ($i == colname) break; if (i > NF) exit 1; next } { sum += $i } END { print sum }' data.csv
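
To get the result back into a shell variable, as the function in the question tries to do, the one-liner could be wrapped in a small shell function. This is only a sketch: the function name sum_column and its argument order are made up here, and the missing flag is an addition so that nothing is printed when the column is not found.

```shell
#!/bin/sh

# sum_column NAME FILE
# Print the sum of the column whose header equals NAME in the
# semicolon-separated FILE. The name and argument order are hypothetical.
sum_column() {
    awk -v colname="$1" -F ';' '
        FNR == 1 {
            # Find the field number of the named column in the header line.
            for (i = 1; i <= NF; ++i)
                if ($i == colname)
                    break
            if (i > NF) { missing = 1; exit 1 }
            next
        }
        { sum += $i }
        END {
            # exit still runs END, so bail out here if the column was not found.
            if (missing) exit 1
            print sum + 0
        }
    ' "$2"
}
```

Called as v0=$(sum_column A data.csv), this should leave 3.8 in v0 for the sample data above, and return a non-zero exit status with empty output for a column name that does not exist.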

Ideally, though, you'd use some form of CSV parser, like CSVkit:

$ csvstat --sum -c A data.csv
3.8

The csvstat utility calculates several different statistics for any given CSV file. Here, it figures out on its own that the delimiter is ;. In this example I ask it for the sum of the column named A.
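
As for the two error messages in the question: the comment under the question puts them down to index being the name of a built-in awk function and to the stray comma after -F";". In addition, a shell variable inside single quotes is never expanded, so awk sees the literal text $index. A minimal sketch of the usual fix, assuming the column number has already been computed in the shell, is to pass it in with -v under a different name (col is an arbitrary name chosen here, and the sample file is recreated in a scratch file so the snippet runs on its own):

```shell
#!/bin/sh

# Recreate the sample data from this answer in a scratch file.
tmp=$(mktemp)
printf '%s\n' 'A;B;C' '1;2;3' '4;5;6' '-1.2;3;3.3' > "$tmp"

# 1-based column number, as the shell loop in the question would compute it
# for column B. Hard-coded here for illustration.
index=2

# -v col=... hands the shell value to awk under a name that is not a built-in.
# Inside awk, $col then means "the field whose number is stored in col".
# NR > 1 skips the header line.
v0=$(awk -F ';' -v col="$index" 'NR > 1 { x += $col } END { print x + 0 }' "$tmp")

echo "$v0"
rm -f "$tmp"
```

With index=2 this sums column B of the sample data and prints 10.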

Kusalananda