
I want to compare file1 with file2.

file1 contains a set of pathnames and file2 contains a set of words.

If any pathname contains any of the words in the second file, it should be commented out by prepending the line with //.

file1:

xxx/AAA.tmp.v
xxx/BBB.tmp.v
xxx/CCC.tmp.v

file2:

BBB
CCC
FFF

Desired output in a new file:

xxx/AAA.tmp.v
// xxx/BBB.tmp.v
// xxx/CCC.tmp.v
Kusalananda
  • 333,661
  • is the data to be matched from file2 always in the form of /data. in file1? solution would be easier if that is the case... and it is better to add your efforts made and point out where you got stuck... – Sundeep Sep 26 '17 at 04:50
  • @Sundeep: your script runs without syntax errors and the result is exactly what I expected, thanks a lot. awk 'NR==FNR{a[$0]; next} {for(line in a)if($0 ~ line)$0="// "$0} 1' file2 file1 – Khanh Nguyen Sep 26 '17 at 06:37

2 Answers


You could use awk.

awk -F"[/.]" 'NR==FNR{seen[$0];next} 
    ($2 in seen){print "// "$0; next}1
' file2 file1
  • -F"[/.]" sets the field separator to either a slash (/) or a dot (.), so xxx/BBB.tmp.v splits into the fields xxx, BBB, tmp and v.
  • NR==FNR is true only while the first input file (here file2) is being read: NR is the overall record number and FNR is the record number within the current file.
  • seen[$0];next runs while the condition above is true: it stores the entire line of file2 as a key in the array seen, then next skips the remaining rules and moves on to the next input line.
  • ($2 in seen){print "// "$0; next}1 applies only to the second input file (here file1). If the second field $2 is a key in seen, the entire line is printed with // prepended and next moves on to the next line. Otherwise the trailing 1, an always-true condition, triggers awk's default action and prints the line unchanged. Note that this is an exact match on the second field, so it assumes the word is the whole path component between the first slash and the first dot.
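If a word may appear anywhere in the pathname rather than exactly in the second field, a substring test is one alternative. The following is only a minimal sketch under that assumption, not the method of the answer above: it uses awk's index() so the words are compared literally (regex metacharacters in file2 are not interpreted) and stops at the first hit so the // prefix is added only once. The function name comment_matches and the temporary directory are hypothetical, introduced for illustration.

```shell
# Hypothetical helper: comment out every line of the path list ($2) that
# contains any word from the word list ($1) as a literal substring.
comment_matches() {
    awk '
        NR == FNR { if ($0 != "") words[$0]; next }  # first file: collect non-empty words
        {
            for (w in words)
                if (index($0, w)) {                  # literal substring test, no regex
                    $0 = "// " $0
                    break                            # prefix once, even if several words match
                }
            print
        }
    ' "$1" "$2"
}

# Sample data from the question, written to a scratch directory
tmpdir=$(mktemp -d)
printf '%s\n' BBB CCC FFF > "$tmpdir/file2"
printf '%s\n' xxx/AAA.tmp.v xxx/BBB.tmp.v xxx/CCC.tmp.v > "$tmpdir/file1"

comment_matches "$tmpdir/file2" "$tmpdir/file1" > "$tmpdir/file3"
cat "$tmpdir/file3"
```

As with any NR==FNR script, this assumes file2 is not empty; with an empty first file, the lines of file1 would be treated as the word list.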
αғsнιη
  • 41,407

With sed you can do it like this:

sed '/tmp/!{H;d;}
G;s/$/\
/;s_.*\(..*\).*\n\1\n_// &_
P;d' file2 file1

You collect the patterns of file2 in the hold space. For each line of file1 you then append that collection to the pattern space and use a backreference to check whether any of the patterns occurs in the line.

For a more detailed explanation see this question and answer.

Note that I used the string tmp as an indication that we are in file1; you may need to adapt that to your actual case. The odd-looking substitution that starts on the second line of the script appends a newline to the end of the pattern space, so we know that each pattern is surrounded by newlines.
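To see what the backreference is matched against, it can help to dump the pattern space right after the collection has been appended. The sketch below reuses the first two steps of the script and replaces the final substitution with sed's l command, which prints the pattern space with newlines shown as \n and the end marked by $. The function name show_pattern_space, the scratch directory, and the single-line file1 are assumptions made for illustration.

```shell
# Sample data: the word list from the question and a single path
workdir=$(mktemp -d)
printf '%s\n' BBB CCC FFF > "$workdir/file2"
printf '%s\n' xxx/BBB.tmp.v > "$workdir/file1"

# Hypothetical helper: same collection steps as the answer, then `l`
# instead of the substitution. -n suppresses the normal output, so only
# the dump of the pattern space is shown.
show_pattern_space() {
    sed -n '/tmp/!{H;d;}
G;s/$/\
/;l' "$1" "$2"
}

show_pattern_space "$workdir/file2" "$workdir/file1"
```

This should print xxx/BBB.tmp.v\n\nBBB\nCCC\nFFF\n$, which shows the path first, followed by each word enclosed in newlines. The substitution in the answer looks for a string inside the path that reappears further on as \nword\n.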

Philippos
  • 13,453