How to perform a multi line grep

Question

How would you perform a grep for text that appears on two lines?

For example:

pbsnodes is a command I use that returns the utilization of a Linux cluster:

root$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar

I want to determine the number of procs that match nodes that are in state 'free'. So far I have been able to determine the "number of procs" and "the nodes in free state", but I want to combine them into one command that shows all free procs.

In the above example, the correct answer would be 6 (2+4).

What I have

root$ NUMBEROFNODES=`pbsnodes|grep 'state = free'|wc -l`
root$ echo $NUMBEROFNODES
2

root$ NUMBEROFPROCS=`pbsnodes |grep "procs = "|awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'`
root$ echo $NUMBEROFPROCS
14

How can I search for every line that reads 'procs = x', but only if the line above it reads 'state = free'?

Stéphane Chazelas · Answer 1 · 2013-10-01T11:43:55.590

12

If the data is always in that format, you could simply write:

awk -vRS= '$4 == "free" {n+=$7}; END {print n}'

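(An empty RS puts awk in paragraph mode: records are separated by blank lines, and a newline always acts as a field separator in addition to FS.)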

Or:

awk -vRS= '/state *= *free/ && match($0, "procs *=") {
  n += substr($0,RSTART+RLENGTH)}; END {print n}'
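
To see why $4 and $7 are the right fields in the first command, it can help to print what awk sees in paragraph mode. This is only a sketch: sample_pbsnodes is a hypothetical shell function (not part of the answer) that stands in for the real pbsnodes and reproduces the sample output from the question.

```shell
# Hypothetical stand-in for `pbsnodes`, reproducing the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# With an empty RS, each blank-line-separated block is one record and
# newlines split fields too, so a whole node is visible at once:
# $1 = node name, $4 = state value, $7 = procs value.
show_fields() {
    sample_pbsnodes | awk -v RS= '{ print $1 ": state=" $4 ", procs=" $7 }'
}

show_fields
```

This prints one line per node, which makes it easy to check that the field positions still hold if the real output has more attributes per node than the sample.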

edited Oct 01 '13 at 11:43

answered Sep 30 '13 at 21:49


score 5 · Answer 2 · answered Oct 01 '13 at 06:11

$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
$ pbsnodes | grep -A 1 free
    state = free
    procs = 2
--
    state = free
    procs = 4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}'
2
4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ 
2+4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ | bc 
6

https://en.wikipedia.org/wiki/Pipeline_(Unix)

slm · Answer 3 · 2013-10-01T15:29:47.973

4

Here's one way to do it using pcregrep.

$ pbsnodes | pcregrep -Mo 'state = free\n\s*procs = \K\d+'
2
4

Example

$ pbsnodes | \
    pcregrep -Mo 'state = free\n\s*procs = \K\d+' | \
    awk '{ sum+=$1 }; END { print sum }'
6

edited Oct 01 '13 at 15:29

answered Sep 30 '13 at 21:14


binfalse · Answer 4 · 2013-09-30T22:20:27.970

The GNU implementation of grep comes with two options to also print the lines before (-B) and after (-A) a match. Snippet from the man page:

   -A NUM, --after-context=NUM
          Print NUM lines of trailing context after matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or
          --only-matching option, this has no effect and a warning is given.

   -B NUM, --before-context=NUM
          Print NUM lines of leading context before matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or
          --only-matching option, this has no effect and a warning is given.

So in your case, you would have to grep for 'state = free' and also print the following line. Combining that with the snippets from your question, you'll arrive at something like this:

usr@srv % pbsnodes | grep -A 1 'state = free' | grep "procs = " | awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'
6

and a bit shorter:

usr@srv % pbsnodes | grep -A 1 'state = free' | awk '{ sum+=$3 } END { print sum }'
6
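
The shorter pipeline works because awk treats a non-numeric or missing $3 (the "free" on the state lines, the empty field on the "--" separators) as 0 when adding. If you would rather not rely on that, here is a sketch that only sums the procs lines. sample_pbsnodes is a hypothetical function standing in for the real pbsnodes, using the sample from the question.

```shell
# Hypothetical stand-in for `pbsnodes`, reproducing the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# Keep one line of trailing context after each free node, then let awk
# pick out only the procs lines, so the '--' group separators and the
# 'state = free' lines never reach the sum.
free_procs() {
    sample_pbsnodes |
        grep -A 1 'state = free' |
        awk '$1 == "procs" { sum += $3 } END { print sum + 0 }'
}

free_procs
```

The `sum + 0` in the END block makes awk print 0 rather than an empty line when no node is free.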

awk does pattern matching; you don't need grep: see Stephane's answer — jasonwryan, Sep 30 '13 at 22:01
Well, sed does pattern matching as well. You could also use perl, or php, or whatever language you prefer. But at least the headline of the question asked for multi line grep... ;-) — binfalse, Sep 30 '13 at 22:18

score 3 · Answer 5 · edited Sep 30 '13 at 21:59

If you have fixed-length data (fixed length referring to the number of lines in a record), in sed you can use the N command (several times), which appends the next line to the pattern space:

sed -n '/^node/{N;N;N;s/\n */;/g;p;}'

should give you output like:

node1;state = free;procs = 2;bar = foobar
node2;state = free;procs = 4;bar = foobar
node3;state = busy;procs = 8;bar = foobar

For variable record composition (e.g. with an empty separator line), you could make use of branching commands t and b, but awk is likely to get you there in a more comfortable way.
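
Once each node is on one line, the sum from the question is a short step away. A sketch under the same fixed-length assumption, with sample_pbsnodes as a hypothetical stand-in for the real pbsnodes (the awk post-processing is an addition for illustration, not part of the sed approach itself):

```shell
# Hypothetical stand-in for `pbsnodes`, reproducing the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# Join each 4-line record into one ';'-separated line with sed, then
# split on ';' in awk: field 2 is the state, field 3 is 'procs = N'.
free_procs_sed() {
    sample_pbsnodes |
        sed -n '/^node/{N;N;N;s/\n */;/g;p;}' |
        awk -F';' '$2 == "state = free" { split($3, kv, " = "); sum += kv[2] }
                   END { print sum + 0 }'
}

free_procs_sed
```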

Joseph R. · Answer 6 · 2013-09-30T21:30:14.410

Your output format is primed for Perl's paragraph slurp:

pbsnodes|perl -n00le 'BEGIN{ $sum = 0 }
                 m{
                   state \s* = \s* free \s* \n
                   \s* procs \s* = \s* ([0-9]+)
                 }x 
                    and $sum += $1;
                 END{ print $sum }'

Note

This only works because Perl's idea of a "paragraph" is a chunk of non-blank lines separated by one or more blank lines. If you didn't have blank lines between the node sections, this wouldn't have worked.


score 0 · Answer 7 · answered Oct 01 '13 at 09:16


... and here is a Perl solution that prints the names of the nodes in the free state:

pbsnodes | perl -lne 'if (/^\S+/) { $node = $& } elsif ( /state = free/ ) { print $node }'

reinierpost

494

score 0 · Answer 8 · answered Oct 01 '13 at 15:04

You may use the awk getline command:

$ pbsnodes | awk 'BEGIN { freeprocs = 0 } \
                  $1=="state" && $3=="free" { getline; freeprocs+=$3 } \
                  END { print freeprocs }'

From man awk:

   getline               Set $0 from next input record; set NF, NR, FNR.

   getline <file         Set $0 from next record of file; set NF.

   getline var           Set var from next input record; set NR, FNR.

   getline var <file     Set var from next record of file.

   command | getline [var]
                         Run command piping the output either into $0 or var, as above.

   command |& getline [var]
                         Run command as a co-process piping the output either into $0 or var, as above. Co-processes are a
                         gawk extension.
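
Plain getline returns 1 when it reads a record, 0 at end of input and -1 on error, so it is worth checking the result and confirming that the next line really is the procs line. A sketch of that more defensive variant, with sample_pbsnodes as a hypothetical stand-in for the real pbsnodes:

```shell
# Hypothetical stand-in for `pbsnodes`, reproducing the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

free_procs_getline() {
    sample_pbsnodes | awk '
        $1 == "state" && $3 == "free" {
            # Read the following line into $0; only add it to the total
            # if the read succeeded and the line is a procs line.
            if ((getline) > 0 && $1 == "procs")
                freeprocs += $3
        }
        END { print freeprocs + 0 }'
}

free_procs_getline
```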
