How to get lines that start and end with specific words?

Question

Here is the file strings.xml, which includes 3 sentences:

1.string name="schedulelist_nofiles">Não existe agenda de registro! Por favor,

here is missing line tecla “Add” (Adicionar) para adicionar uma. string

2.You should skip this line!

3.string name="Programme_name">GUIA DE PROGRAMA string

I use the command 'cat strings.xml | grep "string name="', but it only gets the lines below:

string name="schedulelist_nofiles">Não existe agenda de registro! Por favor,

string name="Programme_name">GUIA DE PROGRAMA string

I want to get the complete entries: like sentence 3, each one should start with 'string name' and end with "string". How can I do this?

You can try sed '/first words/,/last words/ !d' to select the lines that match this pattern. Be careful to choose the right pattern so that you don't select a whole paragraph :) – francois P, Jan 17 '18 at 07:35

Stéphane Chazelas · Answer 1 · 2018-01-17T08:33:20.397

0

With pcregrep:

pcregrep -Mo '(?s)\bstring name=.*?\bstring\b' < file.xml

-M: multi-line mode where pcregrep pulls more lines from the input as needed to satisfy the regex
-o: print the portion that matches the regexp, like in GNU grep (you can omit it, or replace it with -x, if the regexp matches full lines).
(?s): activate the s flag that causes . to also match on newline
.*?: non-greedy version of .*: any number of characters, as few as possible.
\b: word boundary.

If you don't have pcregrep, you can do the same with perl with something like:

perl -l -0777 -ne 'print for /\bstring name=.*?\bstring\b/gs' < file.xml

Though that means loading the whole file in memory.

edited Jan 17 '18 at 08:33

answered Jan 17 '18 at 07:48

I'm scripting on a Linux server where I don't have root permission, so pcregrep cannot be installed. Is there another command that can replace it? – Huang Jan 17 '18 at 08:18
Thanks very much. Of the 3 sentences above, what if I just want to get the first one? I mean I only need to capture the sentence with the string ID 'schedulelist_nofiles', without the last sentence. How can I filter it, given that the last one also contains the keywords 'string name= ... string'? – Huang Jan 17 '18 at 10:29

score 0 · Answer 2 · answered Jan 17 '18 at 08:31

With sed, assuming those string name= ... string take up full lines and that all string name=s are terminated by a string:

sed '
  /^string name=/!d
  :1
  / string$/!{
    N
    b1
  }' < strings.xml
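
To see what the script does, here is a small run under stated assumptions: strings_demo.xml and its ASCII-only contents are a hypothetical, simplified stand-in for the file in the question, and the sed script itself is unchanged.

```shell
# Hypothetical sample input (simplified ASCII stand-in for strings.xml).
cat > strings_demo.xml <<'EOF'
string name="schedulelist_nofiles">Nao existe agenda de registro! Por favor,
tecla "Add" (Adicionar) para adicionar uma. string
You should skip this line!
string name="Programme_name">GUIA DE PROGRAMA string
EOF

# Same script as above: delete lines that do not open an entry, then keep
# appending the next line (N) until the pattern space ends in " string".
entries=$(sed '
  /^string name=/!d
  :1
  / string$/!{
    N
    b1
  }' < strings_demo.xml)
printf '%s\n' "$entries"
```

With this input, the two-line schedulelist_nofiles entry and the one-line Programme_name entry should be printed, and the "You should skip this line!" line dropped. Changing the first address to /^string name="schedulelist_nofiles"/ would restrict the output to that single entry.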

score 0 · Accepted Answer · answered Jan 17 '18 at 12:56

The sed one-liners below can be used to achieve the same. I tested them and they worked fine.

Method 1

sed -n '/^string name.*string$/p'  filename

Method 1 searches for lines that begin with "string name" and end with "string", so it only prints entries that fit on a single line.

Method 2

sed -n '/^string name/p' filename | sed -n '/string$/p'

Output

string name="Programme_name">GUIA DE PROGRAMA string