I have 2 files:

  • a file full of values I want to look for
  • my source text file

I wrote a short shell loop that goes through my list of values and greps for each one in my source file. If a value isn't found in the file, I want to print it.

The problem is that it prints every value, so I must not be grepping for the value correctly: the match always fails, and the value gets printed as a result. Hopefully someone can tell me what I'm doing wrong. Thanks in advance.

Here's my script:

for i in `cat uniq_val.out`
do
   found=`grep "$i" fd.out`
   if [ -z "${found}" ]
   then
      echo $i
   fi
done

So for example, if my uniq_val.out contains this:

abc123
def456
ghi789
jkl101112
mno131415

And my fd.out contains this:

abc123
def456
mno131415

I want my shell script to return

ghi789
jkl101112
    You have a typo in your script. There must be no space before the = sign in the assignment of found. – AdminBee Aug 07 '23 at 14:02
    Are the lines in file1 patterns to match against the lines in file2, i.e. can you have fo*b that would match foobar? Or are you looking for one-to-one matches between identical lines? – ilkkachu Aug 07 '23 at 19:36
  • @ilkkachu, thanks for looking at my question. I've simplified it for my example, so in reality the lines are not identical. What I'm trying to do: uniq_val.out contains unique strings, such as serial numbers or part numbers, that I'm looking for in fd.out. If a serial number doesn't appear in fd.out, I want to be alerted to it. – Classified Aug 07 '23 at 20:47
  • Do both files contain only the unique IDs? Does each line contain a single string, and you just want to find the lines in uniq_val.out that don't appear in fd.out? – aviro Aug 08 '23 at 07:29
  • Please edit your question to state (and show with sample input/output) whether you want to match strings or regexps, and whether you want full-word, full-line, partial or some other kind of matching. See "How do I find the text that matches a pattern?" for why that matters; otherwise you're likely to get a solution that works for some set of sample input but fails later on your real-world data. – Ed Morton Aug 14 '23 at 15:37

2 Answers


I'd suggest the following different approach:

grep -f <(grep -o -f uniq_val.out fd.out) -v uniq_val.out

That is, the inner grep uses uniq_val.out as a pattern file and prints only the matched parts of fd.out; the outer grep then inverts the match (-v) and prints the entries of uniq_val.out that were not among them.
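To make the two steps easier to follow, here is a minimal sketch that runs them separately on made-up sample data. The scratch directory, the contents of fd.out and the found.txt intermediate file are assumptions for illustration, and -F and -x are additions that assume the values are literal strings rather than regular expressions. A temporary file stands in for the process substitution so the sketch also runs in a plain sh.

```shell
# Scratch directory with sample data (the contents are made up for illustration).
tmpdir=$(mktemp -d)
printf '%s\n' abc123 def456 ghi789 jkl101112 mno131415 > "$tmpdir/uniq_val.out"
printf '%s\n' 'row 1 abc123 ok' 'row 2 def456 ok' 'row 3 mno131415 ok' > "$tmpdir/fd.out"

# Step 1 (inner grep): print only the matched parts of fd.out, one per line.
# -F treats each line of uniq_val.out as a literal string (an assumption).
grep -oF -f "$tmpdir/uniq_val.out" "$tmpdir/fd.out" > "$tmpdir/found.txt"

# Step 2 (outer grep): print the entries of uniq_val.out that are NOT in found.txt.
# -x restricts the comparison to whole lines.
missing=$(grep -vxF -f "$tmpdir/found.txt" "$tmpdir/uniq_val.out")
printf '%s\n' "$missing"

rm -rf "$tmpdir"
```

With this sample data the sketch should print ghi789 and jkl101112, the two values that never occur in fd.out.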

To the best of my knowledge, -o is not POSIX, though, and neither is the <(...) process substitution, which needs a shell such as bash, ksh or zsh.


EDIT following example files in question:

If both files really contain just one string per line and you only need whole-line matches, swap the roles so that fd.out becomes the pattern file, and use -x for whole-line matches:

grep -vx -f fd.out uniq_val.out

This is POSIX-compliant.
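A small sketch of why -x matters here, using made-up data in which fd.out contains abc12, a prefix of abc123, but not abc123 itself. The sample values and the added -F (literal strings instead of regular expressions) are assumptions, not part of the original question.

```shell
# Scratch directory with sample data (the values are made up for illustration).
tmpdir=$(mktemp -d)
printf '%s\n' abc123 def456 ghi789 > "$tmpdir/uniq_val.out"
printf '%s\n' abc12 def456 > "$tmpdir/fd.out"

# Without -x, the pattern abc12 also matches the line abc123,
# so abc123 is wrongly treated as present in fd.out.
without_x=$(grep -vF -f "$tmpdir/fd.out" "$tmpdir/uniq_val.out")

# With -x, only identical lines count as a match,
# so abc123 is correctly reported as missing.
with_x=$(grep -vxF -f "$tmpdir/fd.out" "$tmpdir/uniq_val.out")

printf 'without -x:\n%s\n' "$without_x"
printf 'with -x:\n%s\n' "$with_x"

rm -rf "$tmpdir"
```

Without -x only ghi789 is reported; with -x both abc123 and ghi789 are, which is what you want when one ID can be a substring of another.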

FelixJN

GNU coreutils provides tools for handling sets of (sorted) strings. In your case I'd suggest abandoning grep and the for loop and using comm instead:

$ comm -23 uniq_val.out fd.out
ghi789
jkl101112

man comm:

comm - compare two sorted files line by line. The options -2 and -3 suppress columns 2 and 3, so comm prints only the lines unique to FILE1.

Other useful tools for sets of strings and tables are tr, sort and uniq for preparing data, and join, cut and paste for simple operations. These tools are simpler than general-purpose ones such as sed, grep and awk, not to mention perl and python.
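Since comm expects sorted input, here is a minimal sketch of the "prepare data" step for files that are not sorted yet. The sample data, the scratch directory and the *.sorted file names are assumptions for illustration; LC_ALL=C is set so that sort and comm agree on the ordering.

```shell
# Scratch directory with unsorted sample data (made up for illustration).
tmpdir=$(mktemp -d)
printf '%s\n' mno131415 abc123 jkl101112 def456 ghi789 > "$tmpdir/uniq_val.out"
printf '%s\n' mno131415 def456 abc123 > "$tmpdir/fd.out"

# Sort both files (and drop duplicate lines) using the same collation order.
LC_ALL=C sort -u "$tmpdir/uniq_val.out" > "$tmpdir/uniq_val.sorted"
LC_ALL=C sort -u "$tmpdir/fd.out" > "$tmpdir/fd.sorted"

# -2 and -3 suppress the second and third columns, leaving only the lines
# that appear in the first file and not in the second.
missing=$(LC_ALL=C comm -23 "$tmpdir/uniq_val.sorted" "$tmpdir/fd.sorted")
printf '%s\n' "$missing"

rm -rf "$tmpdir"
```

Like grep -x, comm compares whole lines, so this only fits the case where each line of both files holds exactly one ID.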

legolegs