Input.txt:

    8B0C
    remove
    8B0D
    remove
    8B0E
    remove
    8B0F
    8B10
    remove
    8B14
    remove
    8B15
    remove
    8B16
    remove
    8B17
    remove
    8AC0
    8AC1
    remove
    8AC2
    remove
    8AC3
    remove
    8AE4
    8AE5
    8AE6
    remove

Desired output:

    8B0F
    8AC0
    8AE4
    8AE5

I want to print a line only if neither that line nor the next line contains 'remove'. I am using Solaris 5.10 with ksh.

don_crissti
ayrton_senna

3 Answers

With sed:

    sed '$!N;/remove/!P;D' infile

This pulls the Next line into the pattern space (`$!N`: only if not on the last line) and checks whether the pattern space matches `remove`. If it doesn't (meaning neither of the two lines in the pattern space contains the string `remove`), it Prints up to the first newline character (i.e. it prints the first line). Then it Deletes up to the first newline character and restarts the cycle. This way, there are never more than two lines in the pattern space.


It's probably easier to understand the N,P,D cycle if you add l before and after the N to look at the pattern space:

    sed 'l;$!N;l;/remove/!P;D' infile

so, using only the last six lines from your example:

    8AC3
    remove
    8AE4
    8AE5
    8AE6
    remove

the last command outputs:

    8AC3$
    8AC3\n    remove$
    remove$
    remove\n    8AE4$
    8AE4$
    8AE4\n    8AE5$
    8AE4
    8AE5$
    8AE5\n    8AE6$
    8AE5
    8AE6$
    8AE6\n    remove$
    remove$
    remove$

Here is a short explanation:

cmd        output            cmd
l     8AC3$                  N # read in the next line
l     8AC3\n    remove$      D # delete up to \n (pattern space matches so no P)
l     remove$                N # read in the next line
l     remove\n    8AE4$      D # delete up to \n (pattern space matches so no P)
l     8AE4$                  N # read in the next line
l     8AE4\n    8AE5$        # pattern space doesn't match so print up to \n
P     8AE4                   D # delete up to \n
l     8AE5$                  N # read in the next line
l     8AE5\n    8AE6$        # pattern space doesn't match so print up to \n
P     8AE5                   D # delete up to \n 
l     8AE6$                  N # read in the next line
l     8AE6\n    remove$      D # delete up to \n (pattern space matches so no P)
l     remove$                # last line so no N 
l     remove$                D # delete (pattern space matches so no P)
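
As an alternative illustration of the trace above (not the sed method itself), the same two-line window can be sketched as a plain shell loop. The function names `has_remove` and `print_kept` are hypothetical, and the sample data is the six-line excerpt used above; treat it as a sketch of the logic rather than a replacement for the one-liner.

```shell
# Hypothetical helpers, not part of the sed answer.
# has_remove: succeeds if its argument contains the string "remove".
has_remove() {
    case $1 in
        *remove*) return 0 ;;
    esac
    return 1
}

# print_kept: reads lines on stdin and prints a line only when neither
# it nor the following line contains "remove".
print_kept() {
    have_prev=0
    while IFS= read -r line || [ -n "$line" ]; do
        # Two-line window: $prev plays the role of the first line in
        # the pattern space, $line the one appended by N.
        if [ "$have_prev" -eq 1 ]; then
            has_remove "$prev" || has_remove "$line" || printf '%s\n' "$prev"
        fi
        prev=$line
        have_prev=1
    done
    # Last line: there is no next line to pull in, so test it alone.
    if [ "$have_prev" -eq 1 ]; then
        has_remove "$prev" || printf '%s\n' "$prev"
    fi
}

printf '%s\n' 8AC3 remove 8AE4 8AE5 8AE6 remove | print_kept
# prints:
# 8AE4
# 8AE5
```

The loop is far slower than sed on large files, but it makes the "current line plus next line" check explicit.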

don_crissti

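    awk '
        # Print the previous line when neither it nor the current line contains "remove".
        !/remove/ && NR > 1 && prev !~ /remove/ {print prev}
        # Remember the current line for the next comparison.
        {prev = $0}
        # The last line has no successor, so only the line itself is checked
        # (this relies on $0 still holding the last record in END).
        END {if (!/remove/) print}
    ' Input.txt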
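
A quick way to check the program against the question's data is to feed it the last six sample lines through a here-document, as sketched below. The variable name `out` is only for illustration, and the sketch assumes the `awk` on your PATH is a POSIX awk (on Solaris that may mean `nawk` or `/usr/xpg4/bin/awk` instead of the default `awk`).

```shell
# Sample data: the last six lines of the question's Input.txt.
out=$(awk '
    !/remove/ && NR > 1 && prev !~ /remove/ {print prev}
    {prev = $0}
    END {if (!/remove/) print}
' <<'EOF'
8AC3
remove
8AE4
8AE5
8AE6
remove
EOF
)
printf '%s\n' "$out"
# prints:
# 8AE4
# 8AE5
```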

glenn jackman

    gawk 'BEGIN{ RS="remove\n"; ORS="" }
          RT{ print gensub("[^\n]*\n$","",1) }; !RT{ print }' file

This approach does not read records line by line; instead, each record spans from one Record Separator (RS) to the next (or to end-of-file), with the RS being the "remove" line itself (including its trailing `\n`).

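When RT is set, gensub strips the final line of the record (the line immediately before the "remove" line) and the rest of the record is printed.
The !RT test is needed for when the last line is not an RS line.
RT, a gawk-ism, is the actual text of the current record's RS.
gensub is also a gawk-ism.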

If you need to check for a marker line that matches "remove" anywhere in the line, vs. a line which equals "remove", then just change the Record Separator to:

`RS="[^\n]*remove[^\n]*\n"`  

Output:

    8B0F
    8AC0
    8AE4
    8AE5

Peter.O