
Using only a shell script, how can I search a text file and list every whole block of lines that contains some text (a simple grep criterion)?

The text file has blocks of lines separated by "-----------------" (precisely, each block starts with "\n\n\n--------------------"..., a run of about 50 "-" characters).

A sample could be:

-------------------------------
Abracadabra, blablablalbalba
blablablabla, banana



-------------------------------
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text
Text, sample text, sample text, sample text


-------------------------------  
Text, sample text, sample text, sample text
banana. Sample text, sample text, sample text, sample text
Text, sample text, sample text, sample text

Let's take the word "banana" as the search criterion. The blocks listed would then be:

-------------------------------
Abracadabra, blablablalbalba
blablablabla, banana


-------------------------------
Text, sample text, sample text, sample text
banana. Sample text, sample text, sample text, sample text
Text, sample text, sample text, sample text

EDIT:

I tested the answers that suggest awk, for example: awk 'BEGIN{RS="\n------------"}/INFO/{print}', where INFO is the text searched for, but I cannot get the whole block. Here is a real sample and the result:

A REAL SAMPLE (including the first 3 new lines):




-------------------------------------------------
Diretório separado do nome o arquivo: adis, IWZLM (/home/interx/adis/src/IWZLM.SRC)
Gerando rotina em linguagem C:
(yla5 adis IWZLM -if)
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

[  OK-I ] IWZLM (adis) - Lista lay: Geracao ignorada do codigo em C.



-------------------------------------------------
Diretório separado do nome d arquivo: adis, ADISA (/home/interx/adis/src/ADISA.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISA -if)
.ERRO: Falha inesperada

Compilando o programa:
(ycomp adis ADISA -exe adis/exe/ADISA.temp.exe )
adis/exe/ADISA.temp.exe => adis/exe/ADISA

[  OK   ] ADISA (adis) - Menu A : Gerada e compilada com sucesso.



-------------------------------------------------
Diretório separado do nome o arquivo: adis, ADISD1 (/home/interx/adis/src/ADISD1.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISD1 -if)
.ATENCAO: Definicao nao localizada

Compilando o programa:
(ycomp adis ADISD1 -exe adis/exe/ADISD1.temp.exe )
adis/exe/ADISD1.temp.exe => adis/exe/ADISD1

[  OK   ] ADISD1 (adis) - Menu : Gerada e compilada com sucesso.

I cannot get the whole block, only the line containing "INFO", just like an ordinary grep, whether or not I set ORS:

$ cat file  | awk 'BEGIN{RS="\n------------"}/INFO/{print}' 
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

NOTE: This is the awk from AIX 7.1, not gawk.

Jeff Schaller
Luciano

4 Answers

awk '
{
  if (/-------------------------------------------------/) {
    if (hold ~ /INFO/) {
      print hold;
    }
    hold="";
  } else {
    hold=hold "\n" $0
  }
} 
END {
  if (hold ~ /INFO/) {
    print hold;
  }
}' file

This uses a "hold" variable (à la sed's hold space) to accumulate the lines between separators; when the next separator (or EOF) is reached, the held text is printed only if it matches the /INFO/ pattern. The separator lines themselves are not printed.
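
The same hold-buffer idea can be adapted to keep the dashed separator line with its block and to take the search pattern from a variable. What follows is a minimal sketch, assuming a separator is a line made only of five or more dashes (optionally followed by blanks). The function name print_blocks and the awk variables pat, hold and n are made up for illustration. It sticks to POSIX awk features, but it has not been tried on AIX.

```shell
# print_blocks PATTERN FILE
# Print every block, separator line included, whose text matches PATTERN.
# Hypothetical helper: the name and the separator regex are assumptions.
# PATTERN goes through -v, so backslashes in it must be doubled.
print_blocks() {
  awk -v pat="$1" '
    # Print the held block if it matches, then reset the buffer.
    function emit() {
      if (n > 0 && hold ~ pat) print hold
      hold = ""
      n = 0
    }
    # A separator line closes the previous block ...
    /^-----+[ \t]*$/ { emit() }
    # ... and every line (separator included) is appended to the current one.
    { hold = (n++ ? hold "\n" : "") $0 }
    END { emit() }
  ' "$2"
}
```

It would be called as: print_blocks INFO file. The blank lines that precede a separator end up at the tail of the previous block, so the output keeps the spacing between blocks.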

(re: the older comments, I deleted my previous inadequate awk and perl answers to clean up this answer)

Jeff Schaller

This should be quite easy with awk if you don't need all the "-" characters in the output:

awk -vRS='----' '/banana/{print}' file

Alternatively, with pcregrep:

pcregrep -M '^-+[^-]*banana[^-]*' file
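
One caveat that may explain the line-by-line output reported in the question: gawk and mawk treat an RS longer than one character as a regular expression, but POSIX leaves that case unspecified, and traditional awk implementations use only the first character of RS. An awk of that kind would reduce RS="\n------------" to a plain newline (which behaves like grep) and RS='----' to a single "-". Below is a small probe, as a sketch; the function name rs_mode is made up, and it has not been run on AIX.

```shell
# rs_mode: report how the local awk treats a multi-character RS.
# For the input "a--b-c", an awk that honours RS="--" sees 2 records,
# while an awk that keeps only the first character ("-") sees 4.
rs_mode() {
  printf 'a--b-c\n' | awk -v RS='--' '
    END {
      if (NR == 2) print "multi-char RS supported"
      else         print "only first char of RS used"
    }'
}
rs_mode
```

If the probe reports that only the first character is used, an approach that does not depend on RS, such as accumulating lines in a variable, is the safer choice.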
jimmij

If you don't mind the missing leading empty lines, here's a sed solution:

sed '/---/b end                      # if line matches pattern branch to : end
//!{H                                # if it doesn't match, append to hold space
$!d                                  # and if not on the last line, delete it
$b end                               # if it's the last line branch to : end
}
: end                                # label end
x                                    # exchange hold buffer and pattern space
/PATTERN/!d                          # if pattern space doesn't match, delete it
' <infile
don_crissti
  • You can also do this on one line like so: sed -e '/---/b end' -e '//!{H' -e '$!d;$b end' -e '}; : end' -e 'x; /PATTERN/!d' infile (I've found that many people appreciate the one line versions of complex sed commands.) – Wildcard Apr 26 '16 at 01:49

Note that when a regular expression is passed with -v, any backslash in it must be escaped (doubled). This was tested against the input provided as A REAL SAMPLE.
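
The doubling is needed because awk applies escape-sequence processing to -v assignments, so one level of backslashes is consumed before the value is used as a regular expression. Here is a short sketch of the effect, plus one way around it by passing the pattern through the environment (ENVIRON is part of POSIX awk). The function names show_v and show_env and the environment variable PAT are made up for illustration.

```shell
# show_v: print the value awk actually receives from a -v assignment.
show_v() {
  awk -v var1="$1" 'BEGIN { print var1 }'
}

# show_env: same, but via the environment, where no escape processing occurs.
show_env() {
  PAT="$1" awk 'BEGIN { print ENVIRON["PAT"] }'
}

show_v   'a\\.b'    # prints a\.b  (the doubled backslash became one)
show_env 'a\.b'     # prints a\.b  (passed through unchanged)
```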

parrsel code

#!/usr/bin/nawk -f
BEGIN{ORS=RS="\n\n\n"}   # records are separated by three consecutive newlines
$0~var1{print}           # print the record when it matches var1

Execution

## the pattern is passed as var1 and matches OK as a whole word
parrsel -v var1='\\<OK\\>' data

-------------------------------------------------
Diretório separado do nome o arquivo: adis, IWZLM (/home/interx/adis/src/IWZLM.SRC)
Gerando rotina em linguagem C:
(yla5 adis IWZLM -if)
.INFO =>Rotina BLOQUEADA (status 'M'): Geracao ignorada (use -is para ignorar checagem do status)

[  OK-I ] IWZLM (adis) - Lista lay: Geracao ignorada do codigo em C.



-------------------------------------------------
Diretório separado do nome d arquivo: adis, ADISA (/home/interx/adis/src/ADISA.SRC)
Gerando rotina em linguagem C:
(yla5 adis ADISA -if)
.ERRO: Falha inesperada

Compilando o programa:
(ycomp adis ADISA -exe adis/exe/ADISA.temp.exe )
adis/exe/ADISA.temp.exe => adis/exe/ADISA

[  OK   ] ADISA (adis) - Menu A : Gerada e compilada com sucesso.
Rui F Ribeiro