I have a file in the format below, where the columns are separated by commas.

[1], Value1,   UAC,                 AB
[2.2], Check1, BOH D2A D2A BOH,     SD
[63], name2,   MFB MFB,              k
...

I want to remove duplicate values within a column (say, the 3rd column), like below:

[1], Value1,   UAC,             AB
[2.2], Check1, BOH D2A ,        SD
[63], name2,   MFB,              k
...

How can I use uniq or awk on a particular column?

αғsнιη
  • 41,407
Jack15
  • 13

2 Answers

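With awk:

awk -F, '{
    # print columns 1 and 2 unchanged, each followed by the separator
    printf "%s%s%s%s", $1, FS, $2, FS;
    split($3, arr, / +/); for(val in arr) !uniq_arr[ arr[val] ]++;
    for (key in uniq_arr) { 
        printf "%s", (key!="")? SPACE key:""; SPACE=" "; delete uniq_arr[key]
    };
    # print the separator and column 4, then end the line
    printf "%s%s\n", FS, $4
}' infile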

[1], Value1, UAC, AB
[2.2], Check1, D2A BOH, SD
[63], name2, MFB, k
  • split($3, arr, / +/) splits column 3 into the array arr, using one or more spaces as the separator.

  • In for(val in arr) !uniq_arr[ arr[val] ]++, we build a second array whose keys are the values of arr. Array keys are unique, so the keys of uniq_arr end up being the distinct values of column 3 for that line.

  • Next, we print the keys saved in uniq_arr and delete each key after printing it, so the array is empty again for the next line. Note that awk does not guarantee the order of a for (key in ...) loop, which is why the second line comes out as D2A BOH. Columns 1, 2 and 4 are printed separately.
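
If the original order of the values matters (the question asks for BOH D2A rather than D2A BOH), a variant that walks the split array by index may be closer to what is wanted. This is only a sketch under a few assumptions: the function name dedup_column is hypothetical, the column number is taken as 1-based, and the cleaned field is rebuilt with a single leading space.

```shell
# Hypothetical helper: remove repeated words inside one comma-separated
# column, keeping the first occurrence of each word in input order.
# Usage: dedup_column COLUMN_NUMBER < file
dedup_column() {
    awk -v col="$1" '
    BEGIN { FS = OFS = "," }
    {
        n = split($col, words, / +/)      # words[1..n], in input order
        out = ""
        for (i = 1; i <= n; i++)
            if (words[i] != "" && !seen[words[i]]++)
                out = out (out == "" ? "" : " ") words[i]
        split("", seen)                   # empty the array for the next line
        $col = " " out                    # assumption: one leading space
        print
    }'
}

printf '%s\n' '[2.2], Check1, BOH D2A D2A BOH,     SD' | dedup_column 3
# prints: [2.2], Check1, BOH D2A,     SD
```

The other columns keep whatever spacing they had in the input, because only the chosen field is reassigned.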

αғsнιη
  • 41,407

The -f option of uniq may help you; please check it.
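
For reference, a small sketch of what -f does, assuming a POSIX uniq: -f N skips the first N blank-separated fields when comparing adjacent lines. It works on whole lines, so on its own it may not be enough to remove repeated words inside one comma-separated column. The sample data below is made up for illustration.

```shell
# Made-up sample: the first two lines differ only in their first field,
# so with -f 1 they compare equal and only the first one is kept.
printf '%s\n' 'a X Y' 'b X Y' 'c X Z' | uniq -f 1
# prints:
# a X Y
# c X Z
```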