How to use awk to print nth column and remove duplicates?

Question

I am using the awk pipeline below to print the 8th column and remove duplicates in that column.

awk -F "," '{print $8}' filecsv | awk '!NF || !seen[$0]++'

How can I do this with a single awk instead of running awk twice as in the pipeline above?

score 3 · Accepted Answer · answered Sep 04 '18 at 15:00

awk -F , '!seen[$8]++ { print $8 }' filecsv

This checks whether the value of the eighth field has already been seen, and only if it hasn’t, prints it.
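
To make the behaviour concrete, here is a small sketch using made-up sample data (the `sample` variable and its values are hypothetical, standing in for `filecsv`). It also shows one difference worth knowing: the `!NF` test in the original pipeline prints every blank value, whereas the single-awk version prints a blank value only once. The third command is one possible way to keep the pipeline's behaviour in a single awk, assuming "blank" means empty or made up only of spaces and tabs.

```shell
# Hypothetical sample data: eight comma-separated fields per line.
sample='a,b,c,d,e,f,g,red
a,b,c,d,e,f,g,
a,b,c,d,e,f,g,red
a,b,c,d,e,f,g,
a,b,c,d,e,f,g,blue
a,b,c,d,e,f,g,blue'

# seen[$8]++ yields the previous count (0 the first time) and then increments it,
# so !seen[$8]++ is true only on the first occurrence of each value.
one_pass=$(printf '%s\n' "$sample" | awk -F , '!seen[$8]++ { print $8 }')

# The two-stage pipeline from the question, for comparison.
two_pass=$(printf '%s\n' "$sample" | awk -F , '{ print $8 }' | awk '!NF || !seen[$0]++')

# Assumed variant: always print blank eighth fields, deduplicate everything else.
keep_blank=$(printf '%s\n' "$sample" | awk -F , '$8 ~ /^[ \t]*$/ || !seen[$8]++ { print $8 }')

printf 'single awk:\n%s\n\n' "$one_pass"
printf 'original pipeline:\n%s\n\n' "$two_pass"
printf 'single awk, blanks kept:\n%s\n' "$keep_blank"
```

With this data the single-awk command prints `red`, one empty line, and `blue`, while the other two print `red`, two empty lines, and `blue`. If the eighth column never contains blank values, the simpler command is enough.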

Stephen Kitt

Note that seen is not an awk command but a variable name that can be anything (e.g. !_[$8]++). This led to some confusion about the strange syntax when I first saw this really cool solution. – pLumo Sep 04 '18 at 15:05

@RoVo indeed; it’s quite a common AWK “trick”, see this answer for example. – Stephen Kitt Sep 04 '18 at 15:08
