I have
Sample_A 100
Sample_A 200
Sample_B 300
Sample_B 100
And I want to print the average of the values in column 2 for each key in column 1:
Sample_A 150
Sample_B 200
I can print the sums of the values in column 2 for each key in column 1 using the excellent answer to another question: Sum First Column on basis of Second Column
The command is:
awk 'NR { k = $1; cnt[k] += $2 } END { print; for (k in cnt) print k,cnt[k]}' File.txt
And this produces
Sample_A 300
Sample_B 400
But in order to calculate the average, I need a way to save the number of occurrences of the key, something like
awk 'NR { k = $1; cnt[k] += $2; count(k)=$2} END { print; for (k in cnt) print k,cnt[k]/count(k)}' File.txt
But my count(k) code is kind of a shot in the dark and doesn't work.
Increment count rather than assign $2 to it, i.e. count[k]++ or count[k] += 1
– steeldriver Aug 30 '18 at 17:46
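Following that suggestion, a minimal sketch of the averaging command might look like the one below. It assumes the four-line sample input from the question and uses square brackets for the count array, since awk arrays are indexed with [] and not (). The array name sum (in place of cnt) and the trailing sort are illustrative choices, not part of the original command. The sort is there only because the order of for (k in ...) is unspecified in awk.

```shell
# Recreate the sample input from the question.
cat > File.txt <<'EOF'
Sample_A 100
Sample_A 200
Sample_B 300
Sample_B 100
EOF

# sum[k] accumulates column 2 per key; count[k] counts the lines seen for that key.
# In END, dividing one by the other gives the per-key average.
averages=$(awk '{ sum[$1] += $2; count[$1]++ }
                END { for (k in sum) print k, sum[k] / count[k] }' File.txt | sort)

printf '%s\n' "$averages"
```

With the sample data this should print Sample_A 150 and Sample_B 200, which matches the desired output in the question.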