I have a list like this:

2017-12-11  AAOI    40.33
2017-11-15  AAOI    44.3492
2017-12-15  AEIS    70.98
2017-11-15  AEIS    80.137
2017-10-23  AIEQ    25.1601
2017-11-15  AMBA    52.6501
2017-12-05  ATHM    57.2
2017-11-09  AUDC    7.02
2017-12-22  BEW     0.58
2017-10-17  BIOP    8.19
2017-12-08  BLDP    4.86
2017-12-21  BLOC    2.3
2017-12-12  BLOC    2.7
2017-12-11  BLOC    2.32
2017-12-04  BLOC    2.39
2017-11-27  BLOC    2.6
2017-11-15  BOX     21.63
2017-12-22  BTL     10.5638
etc.

I want to get the first (most recent) match for each symbol, where the symbol is held in the second column. With the sample input above, this should be the output:

2017-12-11  AAOI    40.33
2017-12-15  AEIS    70.98
2017-10-23  AIEQ    25.1601
2017-11-15  AMBA    52.6501
2017-12-05  ATHM    57.2
2017-11-09  AUDC    7.02
2017-12-22  BEW     0.58
2017-10-17  BIOP    8.19
2017-12-08  BLDP    4.86
2017-12-21  BLOC    2.3
2017-11-15  BOX     21.63
2017-12-22  BTL     10.5638

The list is already sorted by column 2 ascending, then column 1 descending.

I am thinking along the lines of using awk to set the matching pattern to $2 (second column) and pipe matches based on this pattern into head.

This is not the first unique occurrence of a whole line; it is the first occurrence where uniqueness is based on column 2 only, like a uniq by column that returns only the first occurrence. Hence the generous tagging.

I am failing to connect the dots. How would you do it?

Jeff Schaller

2 Answers

3

Two ways to do it:

sort -u -k2,2 infile
awk -F" " '!_[$2]++' infile
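The awk one-liner is terse, so here is a minimal sketch of the same idea with the array spelled out. The array name `seen` and the variable names are chosen here for readability and are not part of the original answer; the rows are a few lines taken from the question.

```shell
# A few rows from the question, already sorted by symbol with the newest date first.
input='2017-12-21  BLOC    2.3
2017-12-12  BLOC    2.7
2017-12-11  BLOC    2.32
2017-11-15  BOX     21.63
2017-12-22  BTL     10.5638'

# seen[$2]++ evaluates to 0 (false) the first time a symbol appears and to a
# positive count afterwards, so !seen[$2]++ is true only for the first row of
# each symbol. A true pattern with no action prints the row.
first_per_symbol=$(printf '%s\n' "$input" | awk '!seen[$2]++')

printf '%s\n' "$first_per_symbol"
```

This keeps one row each for BLOC, BOX and BTL. As for the `sort -u -k2,2` variant, it depends on sort keeping the first line of each run of equal keys. GNU sort documents that behaviour, but as far as I know POSIX only says that all but one line per key is suppressed, so on other systems the awk form is probably the safer choice.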

0

I did this with a combination of awk and sed.


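# For each distinct symbol, print the first line that has it as a whitespace-delimited field, then quit sed.
for w in $(awk '{print $2}' filename | sort -u); do sed -n "/[[:space:]]$w[[:space:]]/{p;q;}" filename; done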

output


2017-12-11  AAOI    40.33
2017-12-15  AEIS    70.98
2017-10-23  AIEQ    25.1601
2017-11-15  AMBA    52.6501
2017-12-05  ATHM    57.2
2017-11-09  AUDC    7.02
2017-12-22  BEW     0.58
2017-10-17  BIOP    8.19
2017-12-08  BLDP    4.86
2017-12-21  BLOC    2.3
2017-11-15  BOX     21.63
2017-12-22  BTL     10.5638
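
The loop rereads the file once per symbol. Since the question says the input is already grouped by symbol with the newest date first, a single pass that compares each row's symbol with the previous row's may be enough. This is only a sketch under that sorting assumption; `prev` and the shell variable names are chosen here for illustration.

```shell
# A few rows from the question, sorted by symbol, then by date descending.
input='2017-12-11  AAOI    40.33
2017-11-15  AAOI    44.3492
2017-12-15  AEIS    70.98
2017-11-15  AEIS    80.137
2017-10-23  AIEQ    25.1601'

# Print a row only when its symbol differs from the previous row's symbol,
# then remember the current symbol for the next comparison.
# prev starts out empty, so the very first row is always printed.
newest_rows=$(printf '%s\n' "$input" | awk '$2 != prev { print } { prev = $2 }')

printf '%s\n' "$newest_rows"
```

Unlike the array-based awk approach, this keeps no per-symbol state, but it gives wrong results if rows for the same symbol are not adjacent.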