I have a file, file1, that looks like this:

8189157 7423700 7227471
8189199 7533538 7574425
8189200 7533538 7574425
8189273 7241932 7538338
8191298 6353935 6770138
8191332 7427024 7756709
8192601 6353935 7378544
8192680 7533538 7574348
8193100 6678109 7755961
8193158 6678109 7367734
8193159 6678109 7367734
8193176 7427024 7377679
8193180 7427024 7377679
8193227 6678109 7347206
8207305 7427024 7575134
8207315 6353935 7767680
8207316 6353935 7767680
8207317 6353935 7767680
8207371 6678109 7793130
8209083 7533538 7426859
8212702 7268724 7367752
8212704 7268724 7367752
8212718 7753798 7575212
8212719 7753798 7575212

I want to extract all the rows from file1 that share a value with file2:

7753798
6353935
7423700

The result should be a third file like this:

8212718 7753798 7575212
8212719 7753798 7575212
8207315 6353935 7767680
8207316 6353935 7767680
8207317 6353935 7767680
8191298 6353935 6770138
8192601 6353935 7378544
8189157 7423700 7227471

Any suggestions, please? Note that the real file1 is huge.

Thanks

zara

2 Answers


The -f option of grep is what you need: it reads the search patterns from a file, one per line.

$ grep -f file_with_patterns file_to_scan > result_file
White Owl
    You'd probably want to add -F (--fixed-strings) and in general -w (whole word) or in the case of single-column data -x (whole line) to avoid substring matches - although that may not matter in this case since the values all appear to have the same number of characters. – steeldriver Apr 30 '22 at 12:02
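
The suggestion in the comment above can be sketched as follows. This is only an illustration: the file names sample_file1, sample_file2 and sample_result are hypothetical, and the sample rows are a small subset of the data in the question. Note that -w matches the value as a whole word in any column, not just the second one.

```shell
# Small sample inputs (hypothetical file names, rows taken from the question).
cat > sample_file1 <<'EOF'
8189157 7423700 7227471
8189199 7533538 7574425
8191298 6353935 6770138
8212718 7753798 7575212
EOF
cat > sample_file2 <<'EOF'
7753798
6353935
7423700
EOF

# -F: treat each pattern as a fixed string rather than a regex
# -w: only match whole words, so substrings of longer numbers are ignored
# -f: read the patterns from sample_file2
grep -Fwf sample_file2 sample_file1 > sample_result
cat sample_result
```

The matching rows come out in the order they appear in the scanned file, not in the order of the patterns.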

You can get the result with a loop like this:

while read -r FIRST
do
    # Append every line of a.txt that contains the current pattern
    cat a.txt | grep "$FIRST" >> result.txt
done < b.txt

Let's say your files are a.txt and b.txt and you want the result in result.txt.

The while loop reads b.txt line by line, storing each line in the FIRST variable. For each value, grep searches a.txt for matching lines and appends them to result.txt.

DecPK
    This will be very, very slow since you need to reprocess the file for every search pattern. But using the shell for this is a bad idea in general; please see "Why is using a shell loop to process text considered bad practice?". Also, grep can take a file name as an argument, so there is no need to cat data to it: just use grep "$FIRST" a.txt. Finally, avoid using CAPS for variable names in shell scripts. By convention, global environment variables are capitalized, and if your own variable names are also in caps, you can get weird bugs because of naming collisions. – terdon Apr 30 '22 at 12:51
  • @terdon Thanks for your suggestion. I've just started shell scripting. I'll remember from now on. – DecPK Apr 30 '22 at 15:04
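
As an illustration of the single-pass idea raised in the comments, here is a minimal sketch using awk. This is an alternative that nobody in the thread posted, and it assumes the values to match always sit in the second column of the big file. The file names sample_file1, sample_file2 and sample_result are hypothetical, and the last sample row is made up to show that a value in the third column is not matched.

```shell
# Small sample inputs (hypothetical file names).
cat > sample_file1 <<'EOF'
8189157 7423700 7227471
8189199 7533538 7574425
8191298 6353935 6770138
8212718 7753798 7575212
8300000 7000000 6353935
EOF
cat > sample_file2 <<'EOF'
7753798
6353935
7423700
EOF

# First file (NR == FNR): remember each value as an array key.
# Second file: print the row only if its second column is one of those keys.
awk 'NR == FNR { keys[$1]; next } $2 in keys' sample_file2 sample_file1 > sample_result
cat sample_result
```

Each file is read only once, and only the short list of values is held in memory, which may matter when file1 is huge.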