I have a file that looks as follows:

line1
line2
line3
line4
line5
line6
line7
line5
line2

I want to get the lines between line1 and line5. I used awk '/line1/,/line5/' myfile, expecting the output to be:

line1
line2
line3
line4
line5

But awk reads until the last line that matches line5. I want awk to stop at the first match, not the last.

αғsнιη
kalpesh

3 Answers

I can’t reproduce your problem. Awk* does what I would expect: it prints each line from the first occurrence of line1 through the first occurrence of line5:

$ awk '/line1/,/line5/' file
line1
line2
line3
line4
line5

Is it possible that you have a hidden non-printing character somewhere within the string line5 in the fifth line of your file? This would explain why awk isn’t matching it.
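
One way to look for such a character is sed's l command, which writes each line in a visually unambiguous form: non-printing bytes appear as escape sequences and the end of each line is marked with $. The following is only a sketch; the sample data and the embedded control byte (octal 001) are made up for illustration and are not taken from the asker's file:

```shell
# Hypothetical sample: a control byte (octal 001) hides inside the first "line5".
sample=$(printf 'line1\nline2\nlin\001e5\nline6\nline5\n')

# awk does not stop at the damaged line, because /line5/ cannot match it,
# so the range runs on to the later, intact line5.
matched=$(printf '%s\n' "$sample" | awk '/line1/,/line5/')

# sed's "l" command makes the hidden byte visible as an escape sequence.
visible=$(printf '%s\n' "$sample" | sed -n 'l')
printf '%s\n' "$visible"
```

In the listing, the damaged line should stand out because it contains a backslash escape where the other lines show plain text.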


You can double-check by running the sed equivalent:

$ sed -n '/line1/,/line5/p' file
line1
line2
line3
line4
line5

The -n option tells sed not to print every line (its default behaviour), while /line1/,/line5/p tells it to print each range of lines that starts at a line matching line1 and ends at the next line matching line5.


If you want to print only the first set of lines starting with a line that matches the pattern line1 and ending with a line that matches line5, you could use:

sed -n '/line1/,$p;/line5/q' file

* I checked using gawk, the GNU implementation of awk (and Kusalananda has confirmed that awk and mawk on OpenBSD also do the right thing).

  • FWIW, OpenBSD awk and mawk also do the correct thing. – Kusalananda Jan 26 '18 at 10:38
  • If there's a non-printing character in the fifth line, it would need to be somewhere within the string line5 for it not to match. – Kusalananda Jan 26 '18 at 10:59
  • @Kusalananda Thanks for the information about OpenBSD awk and mawk. I've also incorporated your clarification; that was what I had intended but had not made sufficiently explicit. – Anthony Geoghegan Jan 26 '18 at 14:47

You could use the following awk command:

awk '/line1/{prnt=1} prnt{print} /line5/{exit}' infile

This prints from the first line that matches line1 up to the next line that matches line5, then exits immediately.
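
To see how this differs from the plain range pattern, here is a small sketch on made-up input that contains two line1 ... line5 sections. The guarded variant at the end is an addition, not part of the command above: it only exits on line5 once printing has started, which may matter if a line5 happens to appear before the first line1.

```shell
# Hypothetical input with two line1 ... line5 sections.
sample=$(printf 'line1\nline2\nline5\nline7\nline1\nline3\nline5\n')

# Range pattern: prints every section.
all_ranges=$(printf '%s\n' "$sample" | awk '/line1/,/line5/')

# Flag plus exit: prints the first section only.
first_only=$(printf '%s\n' "$sample" | awk '/line1/{prnt=1} prnt{print} /line5/{exit}')

# Guarded variant: ignore a line5 that comes before the first line1.
early=$(printf 'line5\nline1\nline2\nline5\n')
guarded=$(printf '%s\n' "$early" | awk '/line1/{prnt=1} prnt{print} prnt && /line5/{exit}')

printf '%s\n' "$first_only"
```

Without the guard, the early input would make awk exit on its very first line and print nothing.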

αғsнιη

awk '/foo/,/bar/'

prints every section of the file that starts with a line matching foo and ends with the next line that matches bar. In awk that end line can even be the start line itself, contrary to the sed equivalent, where the end pattern is only checked from the following line onward. Also note that the line1 regexp matches any line that contains line1 followed by anything, including line10.
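
The same-line difference between awk and sed can be sketched with a tiny made-up input whose first line matches both patterns (the data here is purely illustrative):

```shell
# Hypothetical input: the first line matches both the start and the end pattern.
sample=$(printf 'foo bar\nx\nbar\ny\n')

# awk: the range may start and end on the same line.
awk_out=$(printf '%s\n' "$sample" | awk '/foo/,/bar/')

# sed: the end pattern is only tested from the line after the start.
sed_out=$(printf '%s\n' "$sample" | sed -n '/foo/,/bar/p')

printf 'awk:\n%s\n' "$awk_out"
printf 'sed:\n%s\n' "$sed_out"
```

Here awk prints only the first line, while sed carries on through x and stops at the standalone bar line.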

If you wanted to match on the first section starting with a line that matches line1 and ending with the next line after that (or the same line) that matches line5, you could do:

sed '/line1/,$!d;/line5/q' < file

(Note that if there's no line that matches line5 after the first one that matches line1, it will print from the line1 line to the end of the file. Also note that sed patterns are basic regular expressions by default, while awk ones are extended ones.)
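
As a quick check of the missing-terminator case, here is a sketch on made-up input in which no line5 follows line1 (the sample lines are assumptions for illustration):

```shell
# Hypothetical input: line1 is present, but no line5 follows it.
sample=$(printf 'line0\nline1\nline2\nline3\n')

# Lines before the first line1 are deleted; with no line5 to trigger q,
# everything from line1 to the end of input is printed.
no_end=$(printf '%s\n' "$sample" | sed '/line1/,$!d;/line5/q')

printf '%s\n' "$no_end"
```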