3

I would like to sum numerical values listed in the third column of the table below based on entries were given in the first column of the same table. The table content is as follows:

John|Login|2
Mary|Login|10
Mary|Payroll|100
John|Login|200
John|Logout|10
Mary|Payroll|10

The expected outcome is as follows:

John|Login|202
John|Logout|10
Mary|Login|10
Mary|Payroll|110

How can I use awk to get this output?

2 Answers2

1

This command worked for me. it should work for you :

     awk -F '|' '{a[$1"|"$2"|"]+= $3} END{for (i in a) print i, a[i]}' filename | sort -k 1,1
John|Login| 202
John|Logout| 10
Mary|Login| 10
Mary|Payroll| 110
GAD3R
  • 66,769
1

If you have GNU awk installed, you don't have to use another tool sort to set the order right, but you can do it in awk itself. You can set the way the array traversal is done by setting a special variable PROCINFO["sorted_in"]. See Using Predefined Array Scanning Orders with gawk

In your case though, you can set it to sort by the index in ascending order

awk -vFS="|" -vOFS="|"  '{ 
                             primaryKey=($1 FS $2)
                         }{ 
                             db[primaryKey]+=$3; next 
                         } END { 
                             PROCINFO["sorted_in"] = "@ind_str_asc"
                             for(key in db) 
                                 print key, db[key] 
                         }' file
Inian
  • 12,807