$ cat data.txt
aaaaaa
aaaaaa
cccccc
aaaaaa
aaaaaa
bbbbbb
$ cat data.txt | uniq
aaaaaa
cccccc
aaaaaa
bbbbbb
$ cat data.txt | sort | uniq
aaaaaa
bbbbbb
cccccc
$
What I need is to display every line of the original file with all duplicates removed (not just the consecutive ones), while preserving the original order of the lines in the file.
In this example, the result I was actually looking for is:
aaaaaa
cccccc
bbbbbb
How can I perform this generalized uniq operation?
[...] { if (!seen[$0]++) print } – camh Apr 24 '11 at 22:32
[...] if, print, parentheses, and braces: awk '!seen[$0]++' – Gordon Davisson Apr 25 '11 at 06:29
[...] seen? I've searched the User's Guide and can't find anything. – Christoph Wurm Jan 17 '12 at 14:58
[...] '!LarryWall[$0]++' for all awk cares, but "seen" helps people understand the program better. – cjm Jan 17 '12 at 19:14
[...] ++ do? – Christoph Wurm Jan 18 '12 at 09:55
[...] awk trick is also well explained here. – Serge Stroobandt May 20 '14 at 22:43
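For anyone still puzzling over seen and ++ in the comments, here is a minimal sketch of both forms of the awk program quoted there, run against the sample data from the question. The function names dedup_keep_order and dedup_keep_order_verbose are hypothetical wrappers added only for illustration; seen is an ordinary user-chosen array name, not an awk built-in.

```shell
#!/bin/sh
# Hypothetical wrapper names; only the awk programs come from the thread.

# Short form: the pattern alone decides whether the line is printed.
# seen[$0] is empty (treated as 0) the first time a line appears, so
# !seen[$0] is true and awk's default action prints the line. The
# post-increment then bumps the counter, so later copies are skipped.
dedup_keep_order() {
    awk '!seen[$0]++' "$@"
}

# Verbose form: the same logic with the if and print spelled out.
dedup_keep_order_verbose() {
    awk '{ if (!seen[$0]++) print }' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' aaaaaa aaaaaa cccccc aaaaaa aaaaaa bbbbbb | dedup_keep_order
```

Both functions should print aaaaaa, cccccc, bbbbbb in that order, since each line is emitted only on its first occurrence. Note that the array keeps one entry per distinct line, so memory use grows with the number of unique lines in the input.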