remove lines in file 1 from file 2

Question

I have a text file containing lines that I would like to delete from my original text.

For example:

Original Text

11
111111111111111111,111111111,11
12,12
99999999999999999,19,1999,199

Text containing the lines to be removed

12,12
99999999999999999,19,1999,199

Expected Output

11
111111111111111111,111111111,11

So what is the best solution for this case?

Seems like a case for grep -xvfF to me – Jeff Schaller Dec 04 '17 at 03:23
@JeffSchaller grep -Fxf :( – αԋɱҽԃ αмєяιcαη Dec 04 '17 at 03:27

score 1 · Answer 1 · edited Jan 06 '18 at 20:08

I got the desired result using this awk one-liner:

$ cat file1
11
111111111111111111,111111111,11
12,12
99999999999999999,19,1999,199

$ cat file2
12,12
99999999999999999,19,1999,199

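The following command prints the lines of file1 that do not appear in file2 (it does not modify file1 itself). It compares whole lines ($0), so lines containing spaces are handled correctly:

awk 'NR==FNR {a[$0]; next} !($0 in a)' file2 file1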

Output:

11
111111111111111111,111111111,11
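
Comparing on $0 (the whole line) rather than $1 (the first whitespace-separated field) only makes a difference once lines contain spaces, which the sample data does not. The following is a minimal sketch of that case, not part of the original answer; the temporary directory, the file contents, and the array name seen are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: whole-line comparison with awk, on hypothetical data with spaces.
workdir="$(mktemp -d)"
printf '%s\n' 'keep me' 'drop me' 'keep too' > "${workdir}/file1"
printf '%s\n' 'drop me' > "${workdir}/file2"

# First pass (NR==FNR is true only while reading file2):
#   remember every line of file2 as an array key.
# Second pass (file1): print only the lines that were not remembered.
awk_result="$(awk 'NR==FNR { seen[$0]; next } !($0 in seen)' \
    "${workdir}/file2" "${workdir}/file1")"
printf '%s\n' "${awk_result}"

rm -r "${workdir}"
```

With $1 in place of $0, this input would be reduced to its first words and printed as two lines reading "keep".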

answered Dec 04 '17 at 05:34 by Praveen Kumar BS · edited Jan 06 '18 at 20:08 by grg

igal · Accepted Answer · 2017-12-04T03:44:43.863

Here's a one-liner using grep:

grep -Fxv -f file2.txt file1.txt

This command outputs the lines in file1.txt that are not in file2.txt, in the order in which they appear in file1.txt.
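
For reference, the flags mean: -F treats each pattern as a fixed string instead of a regular expression, -x requires a pattern to match the whole line, -v inverts the match, and -f reads the patterns from a file. The pattern file passed to -f is the list of lines to remove, and the last argument is the original text. Here is a small sketch of why -x matters, using made-up file names and data:

```shell
#!/usr/bin/env bash
# Sketch: effect of -x when filtering with grep, on hypothetical data.
workdir="$(mktemp -d)"
printf '%s\n' '11' '12,12' '112,12' > "${workdir}/original.txt"
printf '%s\n' '12,12' > "${workdir}/remove.txt"

# With -x only the exact line '12,12' is removed.
with_x="$(grep -Fxv -f "${workdir}/remove.txt" "${workdir}/original.txt")"

# Without -x, '12,12' also matches as a substring of '112,12',
# so that line is removed as well.
without_x="$(grep -Fv -f "${workdir}/remove.txt" "${workdir}/original.txt")"

printf 'with -x:\n%s\n' "${with_x}"
printf 'without -x:\n%s\n' "${without_x}"

rm -r "${workdir}"
```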

If you don't care about preserving order, then you can also use the comm command:

comm -23 <(sort file1.txt) <(sort file2.txt)

This command outputs the lines in file1.txt which are not in file2.txt, in sorted order.
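
In comm, -2 suppresses the column of lines found only in the second file and -3 suppresses the column of lines common to both, which leaves the lines found only in the first file. The sketch below is one way to try this without process substitution, by sorting into temporary files first. The file names and data are hypothetical, and LC_ALL=C is set as an assumption so that sort and comm agree on the collation order.

```shell
#!/usr/bin/env bash
# Sketch: comm -23 on pre-sorted temporary files, on hypothetical data.
workdir="$(mktemp -d)"
printf '%s\n' 'b' 'a' 'c' > "${workdir}/orig.txt"
printf '%s\n' 'c' > "${workdir}/remove.txt"

# comm expects both inputs to be sorted in the same collation order.
LC_ALL=C sort "${workdir}/orig.txt" > "${workdir}/orig.sorted"
LC_ALL=C sort "${workdir}/remove.txt" > "${workdir}/remove.sorted"

# -2: hide lines only in remove.sorted; -3: hide lines present in both.
comm_result="$(LC_ALL=C comm -23 "${workdir}/orig.sorted" "${workdir}/remove.sorted")"
printf '%s\n' "${comm_result}"

rm -r "${workdir}"
```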

You could also use a while-loop to iterate over the lines of the first file (e.g. file1.txt), check each line against the second file (e.g. file2.txt) using grep, and print the line if it isn't found. This produces the text of file1.txt with the lines from file2.txt removed. It could look something like this:

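while IFS= read -r line; do
    # -x matches whole lines only, so a line is not dropped
    # just because it is a substring of a line in file2.txt
    if ! grep -qFx -- "${line}" file2.txt; then
        printf '%s\n' "${line}"
    fi
done < file1.txt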

If you want to write the results to a file you could use output redirection, e.g.:

while IFS= read -r line; do
    if ! grep -qFx -- "${line}" file2.txt; then
        printf '%s\n' "${line}"
    fi
done < file1.txt > output.txt

The same thing goes for the grep and comm commands:

grep -Fxv -f file2.txt file1.txt > output.txt

comm -23 <(sort file1.txt) <(sort file2.txt) > output.txt

NOTE: You can't redirect the output back to file1.txt. The shell truncates the redirection target before the command gets a chance to read it, so this typically just empties file1.txt. For further discussion of this issue see, e.g., the following post:

Why does the command shuf file > file leave an empty file, but similar commands do not?

If you want to replace the original file you can just overwrite it with the output file, i.e.:

mv output.txt file1.txt
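
The filter-then-replace steps can be combined so that the original is overwritten only if the filter ran without error. The following is a hedged sketch rather than a definitive recipe: the directory, the file names, and the filtered.XXXXXX template are made up for the demonstration. It relies on grep's exit status, which is 0 when lines were selected, 1 when none were selected, and greater than 1 on error.

```shell
#!/usr/bin/env bash
# Sketch: filter into a temporary file, then replace the original.
workdir="$(mktemp -d)"
printf '%s\n' '11' '12,12' > "${workdir}/file1.txt"
printf '%s\n' '12,12' > "${workdir}/file2.txt"

# Create the temporary file next to the original so that mv
# stays on the same filesystem.
tmpfile="$(mktemp "${workdir}/filtered.XXXXXX")"

grep -Fxv -f "${workdir}/file2.txt" "${workdir}/file1.txt" > "${tmpfile}"
status=$?

# Exit status 1 only means that every line was removed; that is
# still a valid result. Anything above 1 signals a real error.
if [ "${status}" -le 1 ]; then
    mv "${tmpfile}" "${workdir}/file1.txt"
else
    rm -f "${tmpfile}"
fi

final_content="$(cat "${workdir}/file1.txt")"
printf '%s\n' "${final_content}"

rm -r "${workdir}"
```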

You could also turn this into a script. Here's a script using the while-loop:

#!/usr/bin/env bash
# removelines.sh

# Set filenames
INPUTFILE="$1"
FILTERFILE="$2"
OUTPUTFILE="$(mktemp)"

# Write the lines from INPUTFILE to OUTPUTFILE
# minus the lines from FILTERFILE
while IFS= read -r line; do
    if ! grep -qFx -- "${line}" "${FILTERFILE}"; then
        printf '%s\n' "${line}"
    fi
done < "${INPUTFILE}" > "${OUTPUTFILE}"

# Replace INPUTFILE with OUTPUTFILE
mv "${OUTPUTFILE}" "${INPUTFILE}"

And here's the same script using comm:

#!/usr/bin/env bash
# removelines.sh

# Set filenames
INPUTFILE="$1"
FILTERFILE="$2"
OUTPUTFILE="$(mktemp)"

# Write the lines from INPUTFILE to OUTPUTFILE
# minus the lines from FILTERFILE
comm -23 <(sort "${INPUTFILE}") <(sort "${FILTERFILE}") > "${OUTPUTFILE}"

# Replace INPUTFILE with OUTPUTFILE
mv "${OUTPUTFILE}" "${INPUTFILE}"

Note that I use the mktemp command to create a temporary file with a unique name for the output.

Here's what the script would look like in action:

user@host:~$ cat <<HEREDOC > file1.txt
11
111111111111111111,111111111,11
12,12
99999999999999999,19,1999,199
HEREDOC

user@host:~$ cat <<HEREDOC > file2.txt
12,12
99999999999999999,19,1999,199
HEREDOC

user@host:~$ bash removelines.sh file1.txt file2.txt

user@host:~$ cat file1.txt
11
111111111111111111,111111111,11

@αԋɱҽԃαмєяιcαη Sure thing. But check out my updated solution. I added a one-liner using comm. — igal, Dec 04 '17 at 03:32
you forgot to mention that comm need files to be sorted firstly, otherwise can be handle through comm -i <(sort -i test1.list) <(sort -i test2.list) — αԋɱҽԃ αмєяιcαη, Dec 04 '17 at 03:35
@αԋɱҽԃαмєяιcαη Yeah, thanks. I was kind of rushing. I added a third solution using only grep. — igal, Dec 04 '17 at 03:45
