15

I have a multi line log entry format that I need to process.

The log looks something like this:

--START--
Device=B
Data=asdfasdf
Lorem=Ipsum
--END--
--START--
Device=A
Data=asdfasdf
Lorem=Ipsum
--END--
--START--
Device=B
Data=asdfasdf
--END--
--START--
Device=A
Data=asdfasdf
--END--
--START--
Device=B
Data=asdfasdf
--END--
--START--
Device=C
Data=asdfasdf
Lorem=Ipsum
--END--

I want to print everything between --START-- and --END-- if a particular pattern is matched.

e.g:

Print all entries where Device=A

--START--
Device=A
Data=asdfasdf
Lorem=Ipsum
--END--
--START--
Device=A
Data=asdfasdf
--END--

All I've been able to do so far is write:

sed -n -e '/--START--/,/--END--/p' < input

Which effectively prints the whole input, since every line falls inside some --START--/--END-- range. I think I need to add {} to collect lines with N and then print only if the condition matches.

I also think I'm completely lost.

Any idea on how to print multiple lines if a single line matches a condition?

cuonglm
Brad

4 Answers

28
$ sed -n '/--START--/{:a;N;/--END--/!ba; /Device=A/p}' file
--START--
Device=A
Data=asdfasdf
Lorem=Ipsum
--END--
--START--
Device=A
Data=asdfasdf
--END--

(The above was tested on GNU sed. It would have to be massaged to run on BSD/OSX.)

How it works:

  • /--START--/{...}

    Every time we reach a line that contains --START--, run the commands inside the braces {...}.

  • :a

    Define a label a.

  • N

    Read the next line and add it to the pattern space.

  • /--END--/!ba

    Unless the pattern space now contains --END--, jump back to label a.

  • /Device=A/p

    If we get here, the pattern space starts with --START-- and ends with --END--. If, in addition, the pattern space contains Device=A, then print (p) it.
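
On seds that are stricter than GNU sed about one-liners (BSD/macOS sed generally wants a label to end its line, and a newline or semicolon before a closing brace), one common workaround is to give each command its own -e. The sketch below assumes that behaviour and has not been checked on every BSD variant; the temporary file and its contents are made-up sample data modelled on the question.

```shell
# Hypothetical sample log, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
--START--
Device=B
Data=asdfasdf
--END--
--START--
Device=A
Data=asdfasdf
Lorem=Ipsum
--END--
EOF

# Same loop as the one-liner, one command per -e:
#   {           open the group that runs on a --START-- line
#   :a          label
#   N           append the next input line to the pattern space
#   /--END--/!ba  loop until the pattern space contains --END--
#   /Device=A/p print the collected block if it mentions Device=A
#   }           close the group
out=$(sed -n -e '/--START--/{' -e ':a' -e 'N' -e '/--END--/!ba' \
             -e '/Device=A/p' -e '}' "$log")
printf '%s\n' "$out"
rm -f "$log"
```

Only the second block should come out, because the first one never contains Device=A.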

Jeff Schaller
John1024
  • 1
    Finally tried these guys. This implementation both produces the right results and is the most concise. – Brad Sep 23 '15 at 19:18
  • 1
    You, sir, just single-handedly taught me how to think in sed: labels in C. Thank you. – Sean May 13 '16 at 05:16
  • 1
    Thanks, worked great for me. I used the trick to substitute text within a certain xml-element. sed -i -E '/<Element.*/{:a;N;/<\/Element>/!ba;s/Pattern/Replacement/}' file – Joachim Jul 10 '18 at 09:17
7

Another sed variant, this one using the hold space:

sed 'H              # append the current line to the hold space
     /--START--/h   # on START, overwrite the hold space with this line
     /--END--/!d    # unless this is END, delete pattern space and start the next cycle
     x              # on END, swap the collected block into the pattern space
     /Device=A/!d   # delete the block unless it contains "Device=A"
    ' file
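
If the device to look for changes from run to run, the same hold-space script can be written on one line and the pattern taken from a shell variable. This is only a sketch: the variable name dev and the sample input are made up for illustration, and the value is pasted into the sed script as a regular expression, so it must not contain / or other characters that are special to sed.

```shell
dev='Device=A'   # hypothetical variable holding the pattern to search for

# H;h;!d;x collect each block exactly as in the commented script;
# the final test is built from $dev ('!d' stays in single quotes so an
# interactive shell does not try history expansion on the !).
out=$(printf '%s\n' --START-- Device=B Data=x --END-- \
                    --START-- Device=A Data=y --END-- |
      sed -e 'H;/--START--/h;/--END--/!d;x' -e "/$dev/"'!d')
printf '%s\n' "$out"
```

No -n is needed here: blocks that fail the test are deleted, and whatever is left in the pattern space at the end of a cycle is printed automatically.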
Costas
1

With sed:

$ sed -e:1 -e'$!N;/--END--/{
  /Device=A/!d
  b
}' -eb1 <file
--START--
Device=A
Data=asdfasdf
Lorem=Ipsum
--END--
--START--
Device=A
Data=asdfasdf
--END--

This reads every line up to and including --END-- into the pattern space. Once --END-- is matched, the block is deleted if it does not contain Device=A; otherwise sed prints the pattern space and starts the next cycle.

With awk:

awk '
  /--START--/ {
    getline d
    if (d ~ /Device=A/) {
      p = 1
      printf "%s\n%s\n", $0, d
      next
    }
  }
  p
  /--END--/ { p = 0 }
' <file
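
Note that the awk script above only inspects the line immediately following --START--. If Device=A may appear anywhere inside a block, one possible alternative is to buffer the whole block and test it when --END-- arrives. The following is a sketch, not the answer's original method; the variable names pat, buf and inblock and the sample input are made up for illustration.

```shell
out=$(awk -v pat='Device=A' '
  /--START--/ { buf = ""; inblock = 1 }       # start collecting a new block
  inblock     { buf = buf $0 "\n" }           # append every line of the block
  /--END--/   {                               # block complete: test and print
    if (inblock && buf ~ pat) printf "%s", buf
    inblock = 0
  }
' <<'EOF'
--START--
Device=B
Data=1
--END--
--START--
Data=2
Device=A
--END--
EOF
)
printf '%s\n' "$out"
```

Here the second block is printed even though Device=A is not its first line, which the getline version would miss.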
cuonglm
  • With GNU awk you can just parse match and print each group as a record: gawk 'BEGIN{RS=ORS="--END--\n"} /Device=A/'. To match only on the first line after --START-- also set FS="\n" and test $2=="Device=A" . – dave_thompson_085 Sep 11 '15 at 06:57
0

There are already several other good answers here which demonstrate how to test for your string between --START-- and --END-- blocks, but, given your sample input, it could be that you don't really need to worry about --START-- at all.

sed -n '$!N;/^Device=A\n/,/\n--END--/P;D'

...would print everything from Device=A through to the last \newline occurring just before the next occurring --END-- in each block. It doesn't bother to worry about --START-- because it doesn't appear there's ever any need to check for it.

So for a block like:

--START--
Device=A
...stuff...
--END--

...the output would be...

Device=A
...stuff...

...but for a block like...

--START--
Device=B
Device=A
...stuff...
--END--

...it would output the same.

It works by appending the Next input line to every line which is ! not the $ last, and so pattern space looks like:

^line1\nline2$

For every pattern space which occurs between the sequences ^Device=A\n and \n--END-- the first of the two lines currently in pattern is Printed and afterward Deleted before starting the next cycle with what remains. And so the next pattern space looks like...

^line2\nline3$
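
To watch that two-line window slide over the input, the P in the command can be swapped for sed's l command, which prints the pattern space unambiguously (an embedded newline is shown as \n and the end of the pattern space as $). This is just a small illustration on a made-up three-line input, not part of the filter itself:

```shell
# Feed three lines (a, b, c) through the same $!N ... D skeleton,
# printing the pattern space with l on every cycle.
out=$(printf 'a\nb\nc\n' | sed -n '$!N;l;D')
printf '%s\n' "$out"
# a\nb$
# b\nc$
# c$
```

The last cycle holds a single line because $!N does nothing on the final input line, and D then behaves like d since no newline is left to cut at.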
mikeserv