-1

I have two files that I want to compare: one sorted and one unsorted.

Example fileA (sorted):

 ABA 
 FRE 
 DIR 

Example fileB (unsorted):

 AJGHEKSLANVJJFABAKEIFJA 
 OPTOEKSMKVMGKVABAALKKSK 

Is there a way to find which words from fileA exist in fileB?

2 Answers

0

There may be faster tools for this, but you could read the first file line by line in a loop and check each word against fileB, like this:

while read -r pat; do
    if grep -q "$pat" fileB; then
        printf '%s has a match' "$pat"
    fi
done < fileA
Eric Renouf
  • Does this code contain any mistakes? – Angelos Souleles Feb 23 '17 at 18:20
  • @AngelosSouleles That's hard to answer for sure, but with the sample files you gave, it finds ABA and neither of the other two. As written, it treats the words in fileA as patterns, not fixed strings; add -F to the grep command if you want to change that. The printf also doesn't include a newline, so that could be a mistake as well. – Eric Renouf Feb 23 '17 at 18:35
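
For reference, the loop above with both points from the last comment applied (fixed-string matching and a trailing newline) might look like the sketch below. It assumes a POSIX shell; the function name report_matches and its two arguments are illustrative and not part of the original answer.

```shell
# report_matches WORDS_FILE TEXT_FILE
# Hypothetical helper: prints one line per word of WORDS_FILE that occurs
# anywhere in TEXT_FILE.
report_matches() {
    words_file=$1
    text_file=$2
    # With the default IFS, read -r trims leading and trailing whitespace
    # and leaves backslashes alone. The "|| [ -n ... ]" part also handles a
    # last line that has no trailing newline.
    while read -r pat || [ -n "$pat" ]; do
        # Skip blank lines: an empty fixed string matches every line.
        [ -n "$pat" ] || continue
        # -F: treat the word as a fixed string, not a regular expression
        # -q: print nothing, only set the exit status
        # -e: keeps a word that starts with "-" from being read as an option
        if grep -qF -e "$pat" "$text_file"; then
            printf '%s has a match\n' "$pat"
        fi
    done < "$words_file"
}
```

Called as report_matches fileA fileB with the sample files from the question, this should print only the line for ABA. Note that it runs grep once per word, so it may be slow for a long word list.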
0

Try this:

grep -f fileB fileA

This prints every line of fileA that matches a line of fileB; each line of fileB is used as a pattern.

ss_iwe
  • I don't think that's what the question is going for. For example, ABA appears in the middle of the second line, so it is a word from fileA that's in fileB, but your search wouldn't report it. – Eric Renouf Feb 23 '17 at 15:18
  • Perhaps, yes, but the question "is there a way to find which words from fileA exist in fileB?" led me to answer the way I did. – ss_iwe Feb 23 '17 at 15:26
  • I would like the ABA on the second line to be reported by the search, sorry for the misunderstanding – Angelos Souleles Feb 23 '17 at 15:30
  • @EricRenouf Apart from the misunderstanding, the -x and -w options could tackle this problem. – FelixJN Feb 23 '17 at 16:16
  • @Fiximan Since the pattern can appear anywhere in a line, I don't think that -x or -w will help with this particular problem. don_cristi's answer that he linked to above has a good potential solution: grep -oFf fileA fileB | sort -u – Eric Renouf Feb 23 '17 at 17:19
  • @Fiximan None of the solutions worked. I tested them with files that I knew contained some of the fileA words, but I didn't get any results. – Angelos Souleles Feb 23 '17 at 18:22
  • @AngelosSouleles @EricRenouf Not what I meant. I was talking about searching for e.g. "A" and matching only lines like "this A that", "A that" and "this A" (-w option, note the spaces) or lines that are "A" (and this only, -x option), respectively, WITHOUT matching lines like "thisAthat". But it looks like I misunderstood the requirements, when the questioner was referring to "words" that needed matching. – FelixJN Feb 24 '17 at 09:45
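
The one-liner mentioned in the comments above, grep -oFf fileA fileB | sort -u, could be wrapped as in the sketch below. It assumes a grep that supports -o (GNU and BSD grep do; POSIX does not require it). The function name list_common_words is hypothetical.

```shell
# list_common_words WORDS_FILE TEXT_FILE
# Hypothetical helper: prints each word of WORDS_FILE that occurs anywhere
# in TEXT_FILE, once per word, in sorted order.
list_common_words() {
    # -f FILE: read one pattern per line from WORDS_FILE
    # -F: treat the patterns as fixed strings, not regular expressions
    # -o: print only the matched part of each line, one match per output line
    # sort -u: sort the matches and drop duplicates
    grep -oFf "$1" "$2" | sort -u
}
```

With the sample files from the question, list_common_words fileA fileB should print ABA. Unlike the loop, this reads fileB in a single grep run. Fixed-string matching is exact, so stray spaces or carriage returns at the ends of the lines in the word file would prevent any match.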