Delete the matched line and one consecutive line (for loop)

Question

I found a few similar questions and solutions on this topic, but I can't get the suggested solutions to work inside a for loop.

fileB:

88569.abcrat
44689.defhom
3702.ghigop

Example of text in file named 234:

9606.jklpan
how is the weather
88569.abcrat
today is a sunny day
44689.defhom
tomorrow will be a rainy day
3702.ghigop
yesterday was a cloudy day
10116.zyxtak
i am happy to see rainbow

desired output for file named 234:

9606.jklpan
how is the weather
10116.zyxtak
i am happy to see rainbow

Then, I will need to repeat the process of searching, matching and deleting for the other files listed in fileA.

fileA:

234
123
456

I was trying:

for i in $(cat fileA); do for j in $(cat fileB); do awk "/$j/ {while (/$j/ && getline>0) ; next} 1" $i; done; done

for i in $(cat fileA); do for j in $(cat fileB); do sed -e "/**$i/$j**/ { N; d; }" $i; done; done

but neither of them has worked so far. There must be something wrong somewhere. I hope to get some help here, and perhaps a suggestion for a better command if possible.

Also, I wonder if I wrote the bold part in the second script correctly?

PS: I am a beginner in scripting. I would appreciate any help given. Thanks!

So... you want to find lines from fileB in fileA, then delete them along with the immediately following lines? — steeldriver, Aug 24 '19 at 14:46
@web I didn't understand you used two files file A and file A(234). — Prvt_Yadav, Aug 24 '19 at 17:53
why do you have two "fileA"s? what's the connection (if any) between "fileA' and "fileA (234)"? are they separate files, or two different examples of content in fileA? — cas, Aug 25 '19 at 01:04
I wanted to do "if a string in fileB finds a match in the file named 234, delete the matched line and the next consecutive line in that file". 234 is one of the files in the list from fileA. I will then need to repeat the same thing for all the files whose names are listed in fileA. — web, Aug 25 '19 at 03:37
see Why is using a shell loop to process text considered bad practice? — cas, Aug 26 '19 at 03:38

Prvt_Yadav · Answer 1 · 2019-08-24T20:34:49.947

What I understood is that you have multiple files whose names are stored in a file called fileA, and you want to print everything from each of those files except the lines matching the text stored in fileB (and the line after each match), so you can do:

while read -r file_name
do
grep -v -f <(grep -A1 -f fileB "$file_name") "$file_name"
done < fileA

It will print the content on stdout.


cas · Answer 2 · 2019-08-25T06:59:29.183

The following works if the filenames in fileA are listed exactly one per line AND the filenames don't contain any linefeed (\n) characters:

$ xargs -d'\n' <fileA \
    perl -MFile::Slurp -e '
     my @patterns=read_file(shift, {chomp=>1});
     $re = join ("|",@patterns);

     while (<>) {
       if (m/$re/o) { readline; next };
       print
     }' fileB
9606.jklpan
how is the weather
10116.zyxtak
i am happy to see rainbow

xargs is used to provide a list of filename arguments to the perl script, reading in fileA one line at a time.

The perl script starts by reading in the first filename argument on the command line (fileB) and constructing a regular expression combining each line (after chomp-ing the newline character ending each input line).

After that, it loops through each of the remaining filename arguments, skipping any lines that match and the following line -- printing the remaining lines.

Note that this script just prints the output from all input files to stdout, and does not make any attempt to differentiate output from different input files.
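For comparison, the same "read the patterns first, then skip each matching line and the one after it" logic can be sketched in awk. This is only a minimal sketch, not part of the perl answer above: the function name skip_matches is hypothetical, and it assumes the lines of fileB are fixed strings rather than regular expressions and that fileB is not empty.

```shell
# Sketch: skip every line that contains a fileB string, plus the line after it.
# Assumption: fileB holds fixed strings (not regexes) and is not empty.
skip_matches() {
  # $1 = pattern file (e.g. fileB); remaining arguments = data files
  awk '
    NR == FNR { if ($0 != "") pat[$0]; next }   # first file: collect patterns
    {
      for (p in pat)
        if (index($0, p)) {   # fixed-string substring match
          getline             # consume the following line as well
          next
        }
      print
    }
  ' "$@"
}
```

With the sample data from the question, `skip_matches fileB 234` would print only the 9606.jklpan and 10116.zyxtak pairs. As with the perl version, everything goes to stdout.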

If you wanted the output from each input file to go to a different output file (e.g. output from file 234 would go to 234.new), you could replace the entire while (<>) {...} loop with something like this:

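my $lastfn="";
while (<>) {
  # open a new output file whenever the input file changes
  if ($lastfn ne $ARGV) {
    $lastfn=$ARGV;
    open(OUTFILE,">","$ARGV.new") or die "$ARGV.new: $!";
  };

  if (m/$re/o) { readline; next; };
  print OUTFILE;
  close(OUTFILE) if eof;   # close only after the last line has been written
}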

or if you just wanted the filenames to be shown in the output:

my $lastfn="";
my $nl="";   # we dont want to print a LF before the first output filename
while (<>) {
  if ($lastfn ne $ARGV) {
    print "$nl", $ARGV,":\n";
    $nl="\n";
    $lastfn=$ARGV;
  };

  if (m/$re/o) { readline; next };
  print
}

or with the input filename prefixed to each output line:

while (<>) {
  if (m/$re/o) { readline; next };
  print "$ARGV:$_"
}

Finally, this can be done entirely within perl, without needing xargs:

$ perl -MFile::Slurp -e '
   my @patterns=read_file(shift, {chomp=>1});
   $re = join ("|",@patterns);

   my @files=read_file(shift, {chomp=>1});
   @ARGV=@files;

   while (<>) {
     if (m/$re/o) { readline; next };
     print
   }' fileB fileA

Rakesh Sharma · Answer 3 · 2019-08-25T12:36:52.687

We will approach this problem by first constructing a file of sed commands from fileB, and then applying that commands file to the files listed in fileA.

The point to note here is that we escape the regex-special characters in the contents of fileB, because each line must form valid sed syntax when it is used as an address later.

$ sed -e '
   s:[][\/.^$*]:\\&:g
   s:.*:/&/{$d;N;d;}:
' < fileB > cmds

$ < fileA xargs -d'\n' -r -l sed -f cmds
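The escaping substitution (the first s command above) can be tried on its own to see what it does to a line of fileB. A small sketch follows; the helper name escape_for_sed is made up for illustration and is not part of the answer's script.

```shell
# Sketch: backslash-escape the characters that are special in a sed regex
# address, so that a line from fileB is later matched literally.
# escape_for_sed is a hypothetical helper name.
escape_for_sed() {
  sed -e 's:[][\/.^$*]:\\&:g'
}

# The dot in the sample pattern becomes a literal dot:
printf '%s\n' '88569.abcrat' | escape_for_sed
# prints: 88569\.abcrat
```

Without this step, the unescaped dot would match any single character, so a line such as 88569xabcrat would also be treated as a match.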

Here is yet another way to approach your problem: we store the lines of fileB as keys of a hash, and then, while reading the files listed in fileA, check whether the current line is one of those keys.

$ < fileA xargs -d'\n' -r \
   perl -ne 'BEGIN { $argc = @ARGV - 1; }
       @ARGV == $argc and $h{$_}++,next;
       print,close(ARGV) if eof;
       my $n = <>;
       print $_,$n if ! exists $h{$_};
' fileB
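The hash-lookup idea can also be sketched in awk, where an array indexed by the whole line plays the role of the hash. This is a rough sketch under stated assumptions: the function name drop_exact_pairs is hypothetical, fileB is assumed to be non-empty, and a line only counts as a match when it is exactly equal to a line of fileB (substrings do not match).

```shell
# Sketch: drop each line that is exactly equal to a fileB line,
# together with the line that follows it.
# Assumption: fileB is not empty; drop_exact_pairs is a hypothetical name.
drop_exact_pairs() {
  # $1 = file of exact lines to match (e.g. fileB); remaining args = data files
  awk '
    NR == FNR { seen[$0]; next }     # first file: store its lines as array keys
    ($0 in seen) { getline; next }   # exact match: skip it and the next line
    { print }
  ' "$@"
}
```

For example, `drop_exact_pairs fileB 234` prints file 234 without the matched pairs. Because the lookup is an exact key test, its cost does not grow with the number of lines in fileB, unlike a loop over all patterns.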
