2

I have a scenario where i want to sum the multiple column

The data in the file is :

ID|NAME|SAL|COST|PER|TAG

1|A|10|10|20|10|

1|B|10|15|20|10|

1|C|10|17|25|80|

1|D|115|110|20|100|

1|E|10|10|10|10|

i want sum of COLUMN - SAL | COST | PER | TAG

one i did with simple command but how to do by creating function

awk '{FS="|"}{s+=$3}END{print s}' file.txt

The function should be parameterised so that when i pass column name it should calculate sum for that column

The sum column's may differ. their may be requirement like only two column sum needed then it should take the two column names and process the sum for that

muru
  • 72,889

3 Answers3

2

Taking inspiration from https://stackoverflow.com/a/32616101 :

$ col=SAL
$ colnum=$(awk -v RS='|' '/'$col'/{ print NR; exit}' testfile)
$ awk '{FS="|"}{s+='$colnum'}END{print s}' testfile 
18

The trick here is the quoting: The vars are outside the single-quotes.

That's easily wrapped into a script sumcols.sh:

#!/bin/bash
FILE="$1"
COLUMNS="${@:2}"
for col in $COLUMNS; do
  colnum=$(awk -v RS='|' '/'$col'/{ print NR; exit}' $FILE)
  awk '{FS="|"}{s+='$colnum'}END{print "'$col' ", s}' $FILE
done | column -t

Call it with the file to process as the first positional parameter and follow with the columns to process. E.g.:

$ ./sumcols.sh testfile SAL COST
SAL   18
COST  24
markgraf
  • 2,860
2

Using a sufficiently recent version of Miller

$ mlr --csvlite --allow-ragged-csv-input --fs '|' stats1 -a sum -f SAL file.txt
SAL_sum
155

(recent version only required because your input is ragged i.e. has a trailing empty column with no corresponding name in the header). You can easily sum multiple columns in Miller by passing their names as a comma separated list to the -f option:

... stats1 -a sum -f SAL,COST,PER,TAG ...

Similarly with GNU Datamash

$ datamash -Ht '|' sum SAL,COST,PER,TAG < file.txt
sum(SAL)|sum(COST)|sum(PER)|sum(TAG)
155|162|95|210
steeldriver
  • 81,074
2

Using awk (and losing the spaces in your input string)

myv='SAL|COST|PER|TAG'
awk -v ar="$myv" '
  BEGIN{FS="|"; getline; for (i=1;i<=NF;i++) {if ($i ~ ar) head[i]=0;title[i]=$i}}
  NF>1{for (h in head) head[h]+=$h}
  END{for (h in head) print title[h]":\t"head[h]}
' file

This assumes the regex matching is unique. If not then ...

myv='SAL|COST|PER|TAG'
awk -v ar="$myv" '
  BEGIN{FS="|"; getline; for (i=1;i<=NF;i++) head[$i]=i; split(ar,titles,"|")}
  NF>1{for (i=1; i<=NF; i++) val[i]+=$i}
  END{for (t in titles) print titles[t]":\t"val[head[titles[t]]]}
' file

Output

SAL:    155
COST:   162
PER:    95
TAG:    210
bu5hman
  • 4,756