If the input is

foo,bar,baz
bar,baz,qux
qux,quux,baz
bar,foo,qux
waldo,fred,garply

the output should be

foo,bar,baz
bar,baz,qux
waldo,fred,garply

As you can see, records are deduplicated based on the 3rd column's value. If multiple records share the same 3rd column value, keep any one of them (a random one or the first one; it doesn't matter).

2 Answers

The idiomatic awk answer is

    awk -F, '!seen[$3]++' file

That will print a line the first time a value is seen in the 3rd column.
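
To make the mechanics explicit: `seen[$3]++` evaluates to 0 the first time a 3rd-column value appears (and then increments), so `!seen[$3]++` is true only on that first occurrence, and awk's default action prints the record. Below is a long-hand sketch of the same idea, run against the sample input from the question. The function name `dedup_by_third` and the variable `input` are made up for illustration.

```shell
# Sample input from the question, held in a variable instead of a file.
input='foo,bar,baz
bar,baz,qux
qux,quux,baz
bar,foo,qux
waldo,fred,garply'

# Long-hand equivalent of: awk -F, '!seen[$3]++'
# Reads records on stdin and keeps the first record for each 3rd-field value.
dedup_by_third() {
    awk -F, '
        # Print the record only if this 3rd-field value has not been recorded yet.
        !($3 in seen) { print }
        # Remember the value so later records with it are skipped.
        { seen[$3] = 1 }
    '
}

printf '%s\n' "$input" | dedup_by_third
```

Because records are printed as they are read, the surviving lines stay in their original input order.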

glenn jackman
  • 85,964

If you're not bothered about the order of the output, you can just use sort, as below.

  • -t, sets the field delimiter to ","
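  • -k3 defines the sort key as starting at the third field and running to the end of the line (here the third field is also the last, so this is the third field; with more columns, -k3,3 restricts the key to the third field alone)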
  • -u indicates that only unique results are wanted

    $ sort -t, -k3 -u file
    foo,bar,baz
    waldo,fred,garply
    bar,baz,qux
    $
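
One caveat worth illustrating: `-k3` starts the key at the third field and extends it to the end of the line, which only equals "the third field" when there are exactly three columns. The sketch below uses a hypothetical four-column sample (not from the question) to show how `-k3` and `-k3,3` can differ; the variable names are made up, and `LC_ALL=C` is set only to keep the comparison locale-independent.

```shell
# Hypothetical four-column sample; two records share "baz" in column 3.
sample='a,b,baz,1
c,d,baz,2
e,f,qux,3'

# Key runs from field 3 to end of line: "baz,1" and "baz,2" differ,
# so both baz records survive.
to_eol=$(printf '%s\n' "$sample" | LC_ALL=C sort -t, -k3 -u)

# Key is field 3 only: the two baz records compare equal,
# so only one of them survives.
third_only=$(printf '%s\n' "$sample" | LC_ALL=C sort -t, -k3,3 -u)

echo "with -k3:"
printf '%s\n' "$to_eol"
echo "with -k3,3:"
printf '%s\n' "$third_only"
```

For the three-column input in the question the two forms give the same result, so this only matters if the file gains extra columns.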
    
RalfFriedl
  • 8,981
steve
  • 21,892