I'm trying to make a script that looks through each line of one file and if a line fails to match anywhere in any line of another text file then remove that line from the original file.
An example of and input and output desired from this script would be:
example input: file 1 (groups file),
hello
hi hello
hi
great
interesting
file 2:
this is a hi you see
this is great don't ya think
sometimes hello is a good expansion of its more commonly used shortening hi
interesting how brilliant coding can be just wish i could get the hang of it
Example script output - file 1 changed to:
hello
hi
great
interesting
So its removed hi hello
, because its not present in the second file
here is the script, it seems to work to the point of making the variables.
#take first line from stability.contigs.groups
echo | head -n1 ~/test_folder/stability.contigs.groups > ~/test_folder/ErrorFix.txt
#remove the last 5 character
sed -i -r '$ s/.{5}$//' ~/test_folder/ErrorFix.txt
#find match of the word string in errorfix.txt in stability.trim.contigs.fasta if not found then delete the line containing the string in stability.contigs.groups
STRING=$(cat ~/test_folder/MothurErrorFix.txt)
FILE=~/test_folder/stability.trim.contigs.fasta
if [ ! -z $(grep "$STRING" "$FILE") ]
then
perl -e 's/.*\$VAR\s*\n//' ~/test_folder/stability.contigs.groups
fi