I have a file containing two columns and 10 million rows. The first column contains many repeated values, but there is a distinct value in column 2. I want to remove the repeated rows and want to keep only one using awk
. Note: the file is sorted with values in column 1. For example:
1.123 -4.0
2.234 -3.5
2.234 -3.1
2.234 -2.0
4.432 0.0
5.123 +0.2
8.654 +0.5
8.654 +0.8
8.654 +0.9
.
.
.
.
Expected output
1.123 -4.0
2.234 -3.5
4.432 0.0
5.123 +0.2
8.654 +0.5
.
.
.
.
sort -buk1,1
– Stéphane Chazelas Oct 08 '14 at 11:07