Remove duplicate words between two lists: example

We have two lists; the first list contains:

a
b
c
d

The second list contains:

a
b
c
d
e
f

I want to compare the first and second lists and remove the entries that appear in both, resulting in this:

e
f

I couldn't find a way to do that in Bash, but I did find one in Python: https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists/7961390#7961390

ajgringo619
  • 3,276
Notme
  • 83

1 Answer

You can use diff with its --GTYPE-group-format=GFMT option. From man diff:

--GTYPE-group-format=GFMT
    format GTYPE input groups with GFMT

LTYPE is 'old', 'new', or 'unchanged'. GTYPE is LTYPE or 'changed'.

GFMT (only) may contain:

%< lines from FILE1

%> lines from FILE2

%= lines common to FILE1 and FILE2

In your case, you can use diff --new-group-format='%>' --unchanged-group-format='' list1 list2

$ cat list1
a
b
c
d

$ cat list2
a
b
c
d
e
f

$ diff --new-group-format='%>' --unchanged-group-format='' list1 list2
e
f

Explanation

  • --new-group-format='%>' prints the lines from FILE2 (%>) that don't exist in FILE1.
  • --unchanged-group-format='' prevents diff from printing lines that are identical in both files.
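
If you want to try this without creating the files by hand, here is a minimal sketch that rebuilds the two lists from the question in a temporary directory and runs the same diff command. It assumes GNU diff, since the group-format options are not part of POSIX diff. The function name new_lines and the temporary file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name is an assumption): print the lines that are
# present in FILE2 ($2) but not in FILE1 ($1).
# Assumes GNU diff; the --*-group-format options are GNU extensions.
new_lines() {
    # diff exits with status 1 when the files differ, which is the expected
    # case here, so treat status 1 as success and anything higher as an error.
    diff --new-group-format='%>' --unchanged-group-format='' "$1" "$2" \
        || [ $? -eq 1 ]
}

# Recreate the two lists from the question, one entry per line.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d > "$tmpdir/list1"
printf '%s\n' a b c d e f > "$tmpdir/list2"

# Prints the entries that only the second list contains.
new_lines "$tmpdir/list1" "$tmpdir/list2"

rm -rf "$tmpdir"
```

Note that diff compares the files line by line in order, so this works as shown when both lists are in the same order, as they are in the question.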
annahri
  • 2,075