4

I had a csv file of data such like that when read into shell:

name,income,reward,payment
Jackson,10000,2000,1000
Paul,2500,700,200
Louis,5000,100,1800

and I want to find the net earning for each person, use formula: "net = income+reward-payment".

when I used command to do this, it only calculate the first row of data.

$ cat data.csv | awk -F ',' '{for (i=1;i<=NF;i++) net[i] = $2+$3-$4} END {for (p in total) print p, "net = ", net[p]}' > result.txt

How can I do the calculation here?

By the way, the names are not unique, so I try (for loop) to create index for the array [net].

My expected output is:

1 Jackson net = 11000
2 Paul net = 3000
3 Louis net = 3300
αғsнιη
  • 41,407
Polsop
  • 43

3 Answers3

5
$ awk -F, -v OFS=, 'NR>1 { print $1, $2+$3-$4 }' data.csv 
Jackson,11000
Paul,3000
Louis,3300

Or if you want the net appended to the existing data, along with the (updated) header line:

$ awk -F, -v OFS=, 'NR==1 {print $0,"net"}; NR>1 {print $0, $2+$3-$4}' data.csv 
name,income,reward,payment,net
Jackson,10000,2000,1000,11000
Paul,2500,700,200,3000
Louis,5000,100,1800,3300
cas
  • 78,579
  • Assuming the CSV is really simple (no escaped commas, etc.)… – D. Ben Knoble May 03 '21 at 03:21
  • 1
    @D.BenKnoble yep. To parse CSV properly (with quotes and escapes etc), you need a real CSV parser. Either a tool like csvkit, or a CSV-parsing library for perl (e.g. Text::CSV) or python (e.g. csv) or whatever language you're using. This answer is only good for simple comma-delimited input like in the OP's example, not full CSV. – cas May 03 '21 at 07:32
4

do in this way:

$ awk 'BEGIN{ FS=OFS="," } { $5=NR>1?$2+$3-$4:"net" }1' infile
name,income,reward,payment,net
Jackson,10000,2000,1000,11000
Paul,2500,700,200,3000
Louis,5000,100,1800,3300

or having your updated expected output:

$ awk -F, 'NR>1{ print NR-1, $1, "net= " $2+$3-$4 }' ifnile
1 Jackson net= 11000
2 Paul net= 3000
3 Louis net= 3300
αғsнιη
  • 41,407
1

Using awk:

awk -F',' '{ p[NR] = $1;net[NR] = $2+$3-$4} END {for (i=2;i<=NR;i++) print (i-1), p[i], "net = ", net[i]}' data.csv

Because this is a csv file, FS = ',' or -F ','

Next p and net arrays created. These are indexed on NR(record numbers).

Then a for loop (i-1), p[i], "net = ", net[i] works as you planned. for loop is being started from 2 because first line is header line.

pipe in this case is useless use. Anohter thing your loop is good but limit should be NR not NF.