How to extract lines from a text file that contain strings from a list in another file, in the order of the search list?

Question

File 1: sourcefile.txt

Hello, It's the beginning of the sentence. 
it is the beginpoint of my career.
The end is always far.
We can start our beginpoint anytime we want.
The time we utilise to make our life good should be more.
This text doesn't mean anything.
I am writing this to include my three points:
beginpoint
time
end

File 2: strings.txt

beginpoint
end
time

Required output:

it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time

I used

grep -w -F -f strings.txt sourcefile.txt > outputfile.txt

I got output:

it is the beginpoint of my career.
The end is always far.
We can start our beginpoint anytime we want.
The time we utilise to make our life good should be more.
beginpoint
time
end

So the lines are the ones I need, but I want them grouped by search term, in the order of the search list, rather than in the order they appear in the source file.

Repost of How to extract lines from a textfile that contains string from a list in another file? — AdminBee, Jun 15 '20 at 13:25

score 0 · Accepted Answer · answered Jun 15 '20 at 08:33

One way is to call grep once for each line of strings.txt:

$ while IFS= read -r line; do grep -wF "$line" sourcefile.txt; done < strings.txt
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time

If strings.txt is long, this can be slow; see Why is using a shell loop to process text considered bad practice?
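As a rough sketch of how the per-string grep calls could be avoided, the following does the same grouping in a single awk process. The function names grep_in_list_order and has_word are made up for this illustration, and the ASCII-only word test `[A-Za-z0-9_]` is an assumption that only approximates what grep -w treats as a word character. It also keeps the whole source file in memory, so it is meant for files of moderate size.

```shell
# Hypothetical helper: $1 is the list of strings, $2 is the file to search.
grep_in_list_order() {
    awk '
        # Whole-word, fixed-string test, roughly what grep -wF checks.
        function has_word(line, str,    start, pos, before, after) {
            if (str == "") return 0
            start = 1
            while ((pos = index(substr(line, start), str)) > 0) {
                pos += start - 1            # position within the full line
                before = (pos == 1) ? "" : substr(line, pos - 1, 1)
                after  = substr(line, pos + length(str), 1)
                if (before !~ /[A-Za-z0-9_]/ && after !~ /[A-Za-z0-9_]/) return 1
                start = pos + 1             # try the next occurrence
            }
            return 0
        }
        NR == FNR { strs[++n] = $0; next }  # first file: strings, in list order
        { lines[++m] = $0 }                 # second file: remember every line
        END {
            # Outer loop follows the list order, inner loop the file order.
            for (i = 1; i <= n; i++)
                for (j = 1; j <= m; j++)
                    if (has_word(lines[j], strs[i])) print lines[j]
        }
    ' "$1" "$2"
}
```

Called as grep_in_list_order strings.txt sourcefile.txt, this should print the same seven lines as the loop above. Like the loop, it prints a line once for every listed string it contains.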

With sed, if it supports the e flag (a GNU extension):

$ sed 's/.*/grep -wF '"'&'"' sourcefile.txt/e' strings.txt
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
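To see what the e flag is being asked to run, the same substitution can be applied without it, so that sed prints the generated commands instead of executing them. This is only an illustration; show_grep_commands is a hypothetical name, and the sed script is written in double quotes here purely to make the inner single quotes easier to read.

```shell
# Hypothetical helper: print the grep command that the substitution builds
# for each line of the list file given as $1, without running anything.
show_grep_commands() {
    sed "s/.*/grep -wF '&' sourcefile.txt/" "$1"
}
```

Each line of strings.txt becomes one command of the form grep -wF 'string' sourcefile.txt. Because the string is pasted between single quotes, a string that itself contains a single quote would break the generated command, so the sed approach is best kept to lists you control.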

Ed Morton · Answer 2 · 2020-06-15T13:20:43.863

Assuming, as in your example, that the strings in your list don't contain spaces:

$ awk -F'[^[:alnum:]_]+' '
    NR==FNR { strs[$0]; next }
    { for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0 }
' file2 file1 | sort -k1,1 -k2,2n | cut -d' ' -f3-
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time

The above prints not only each line that contains a matching string but also the string that was matched and the number of the line it was matched on. The line number retains the relative order of the lines after sorting; it would not be necessary with a stable sort, such as GNU sort with -s. The output is then sorted, and finally the adornments added in the first step are removed. Here it is step by step:

$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1
beginpoint 2 it is the beginpoint of my career.
end 3 The end is always far.
beginpoint 4 We can start our beginpoint anytime we want.
time 5 The time we utilise to make our life good should be more.
beginpoint 8 beginpoint
time 9 time
end 10 end

$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1 | sort -k1,1 -k2,2n
beginpoint 2 it is the beginpoint of my career.
beginpoint 4 We can start our beginpoint anytime we want.
beginpoint 8 beginpoint
end 3 The end is always far.
end 10 end
time 5 The time we utilise to make our life good should be more.
time 9 time

$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1 |
    sort -k1,1 -k2,2n | cut -d' ' -f3-
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
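One thing to check with the pipeline above: sort -k1,1 orders the groups by the collation order of the strings, which for this example (beginpoint, end, time) happens to coincide with their order in file2. A minimal sketch of a variant that decorates each line with the string's line number in the list instead, so the groups follow the list order whatever the strings are, might look like this. The name grep_by_list_position is hypothetical, and `[^A-Za-z0-9_]+` is used as an ASCII-only stand-in for the field separator above.

```shell
# Hypothetical helper: $1 is the list of strings, $2 is the file to search.
grep_by_list_position() {
    awk -F'[^A-Za-z0-9_]+' '
        # First file: remember the line number of each string in the list.
        NR == FNR { pos[$0] = FNR; next }
        # Second file: prefix each match with list position and line number.
        {
            for (str in pos)
                for (i = 1; i <= NF; i++)
                    if ($i == str) print pos[str], FNR, $0
        }
    ' "$1" "$2" |
        sort -k1,1n -k2,2n |   # by list position, then by source line number
        cut -d' ' -f3-         # strip the two numeric prefixes
}
```

Both sort keys are numeric, so the result does not depend on the locale's collation of the strings. As with the original, the strings are assumed not to contain spaces or other characters that the field separator would split on.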
