How to print lines before and lines after the nth line

Question

The question asked here asks for some lines before and after a pattern match.

But here the objective is to take a line number and fetch some lines before and after it from a file.

E.g.:

If the line number is 6, it should print the 4 lines before that line, the line itself, and the 3 lines after it. That is, lines 2 through 9.

This will produce range of lines from n to n. This is not what I asked for. Please read edited question with example. — Ramaprakasha, May 28 '17 at 06:39
Well, you're about to find out that sometimes it's easier to do simple arithmetic yourself than write a script to do it. shrug — Satō Katsura, May 28 '17 at 07:05
"If line number is 6": what decides that it is line 6? Do you search for a word or something? The command seq 10 | grep -B4 -A3 "6" will search for a line that matches "6" and print 4 lines before and 3 lines after it. — hschou, May 28 '17 at 08:15
It was a mere example to represent the problem. That's why I mentioned that the solution should not rely on pattern matching. — Ramaprakasha, May 28 '17 at 10:53

Kusalananda · Accepted Answer · 2017-05-28T06:59:42.773

2

z=6   # focus line
x=4   # lines before
y=3   # lines after

start=$(( z - x ))
end=$(( z + y ))

Using sed:

seq 10 | sed -n "$start,${end}p"
2
3
4
5
6
7
8
9

This simply uses the print (p) command of sed with an explicit range of lines to print. The other lines are ignored using -n.

Using awk:

seq 10 | awk -v start="$start" -v end="$end" 'NR >= start { print } NR >= end { exit }'
2
3
4
5
6
7
8
9

This is similar to Stéphane Chazelas' answer, but implemented in awk: the script starts printing input lines once it reaches line number start, and exits when it reaches line number end.

Both alternatives will display a portion of the input data, starting at x lines before the line z and ending y lines after line z.
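
One edge case the arithmetic above does not cover is a focus line close to the top of the input: z - x can then be 0 or negative, and GNU sed rejects such a line address. A minimal sketch of one way to guard against that, assuming a hypothetical helper named print_around that reads standard input (the name and argument order are illustrative, not part of the answer):

```shell
# print_around LINE BEFORE AFTER   (hypothetical helper; reads standard input)
print_around() {
    z=$1 x=$2 y=$3
    start=$(( z - x ))
    end=$(( z + y ))
    # A line address of 0 or less is not valid here, so clamp to the first line
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    sed -n "${start},${end}p"
}

# Focus line 2, 4 lines before, 3 lines after: prints lines 1 to 5
seq 10 | print_around 2 4 3
```

If the input is shorter than z + y lines, sed simply stops at the end of the input, so only the lower bound needs clamping.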

edited May 28 '17 at 06:59

answered May 28 '17 at 06:35


Simpler awk: seq 10 | awk -v start="$start" -v end="$end" 'NR==start,NR==end' – Satō Katsura May 28 '17 at 07:12
For the sed solution, in the case of big files you need to add an exit condition; otherwise sed will only exit once it has read the whole file - see my benchmark. – George Vasiliou May 28 '17 at 21:51
@GeorgeVasiliou I know. That is basically Stéphane's answer though, so I don't feel I can use it... – Kusalananda May 28 '17 at 21:52
OK, fair enough. – George Vasiliou May 28 '17 at 21:53
By the way, in my little benchmark awk needed three times more time than sed! – George Vasiliou May 28 '17 at 21:55
@GeorgeVasiliou I'm not entirely surprised. awk is a high-level language while sed is a stream editor. awk has more to think about, so to speak, field splitting etc. – Kusalananda May 28 '17 at 22:02

Stéphane Chazelas · Answer 2 · 2017-05-28T06:54:25.180

With POSIX shells:

$ before=4 after=3 line=6
$ seq 10 | sed "$((line-before)),\$!d; $((line+after))q"
2
3
4
5
6
7
8
9

Translates to:

delete any line but (!) those in the range from the (line - before)th line to the end ($).
quit on the (line + after)th line.

That way we don't even bother reading past the (line + after)th line.

That means, however, that the command feeding its data to sed will be killed by a SIGPIPE if it keeps sending data shortly after that, which may or may not be desirable.
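
If the SIGPIPE is not desirable, one possible workaround (not part of the answer above) is to let something else consume the rest of the input after sed has quit. The sketch below assumes a hypothetical helper named print_around_drain that reads standard input and that line - before is at least 1:

```shell
# print_around_drain LINE BEFORE AFTER   (hypothetical helper; reads standard input)
# Assumes LINE - BEFORE is at least 1.
print_around_drain() {
    line=$1 before=$2 after=$3
    {
        # Same sed script as above: print the wanted range, then quit
        sed "$(( line - before )),\$!d; $(( line + after ))q"
        # sed has quit; read and discard whatever the producer still sends
        cat >/dev/null
    }
}

seq 100000 | print_around_drain 6 4 3
```

The trade-off is that the whole input is read again, so the speed benefit of quitting early is lost; the producer just gets to finish normally.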

George Vasiliou · Answer 3 · 2017-05-28T22:02:59.767

Just for completeness:

$ l=60;seq 100 |head -n$((l+3)) |tail -n+$((l-4))
56
57
58
59
60
61
62
63
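
The one-liner above hard-codes 4 lines before and 3 lines after. A parametrised sketch of the same head + tail idea, assuming a hypothetical helper named head_tail_around that reads standard input (the name and the clamping of the start line are additions for illustration):

```shell
# head_tail_around LINE BEFORE AFTER   (hypothetical helper; reads standard input)
head_tail_around() {
    line=$1 before=$2 after=$3
    start=$(( line - before ))
    # Clamp so that tail is never given a start line below 1
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    # head keeps everything up to the last wanted line,
    # tail then drops everything before the first wanted line
    head -n "$(( line + after ))" | tail -n "+${start}"
}

# Same result as the one-liner above: lines 56 to 63
seq 100 | head_tail_around 60 4 3
```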

Rumors and various benchmarks say that the combination of head + tail is much faster than any other tool:

$ a=1000000000
$ time seq $a |awk 'NR>=499998{print}NR >= 500004 { exit }' 
499998
499999
500000
500001
500002
500003

real    0m0.158s
user    0m0.152s
sys     0m0.004s

$ time seq $a |sed -n "499998,500003p"
499998
499999
500000
500001
500002
500003

real    1m30.249s
user    1m21.284s
sys     0m12.312s

$ time seq $a |sed "$((500000-2)),\$!d; $((500000+3))q"  # Stéphane's solution
499998
499999
500000
500001
500002
500003

real    0m0.052s
user    0m0.044s
sys     0m0.004s

$ time seq $a |head -n$((500000+3)) |tail -n+$((500000-2))
499998
499999
500000
500001
500002
500003

real    0m0.024s
user    0m0.024s
sys     0m0.004s

$ time seq $a |sed -n "499998,500003p;500004q"
499998
499999
500000
500001
500002
500003

real    0m0.056s
user    0m0.048s
sys     0m0.004s

score 0 · Answer 4 · 2017-05-28T10:56:50.783

# define line range constants
before=4
  line=6
 after=3

# set up the sed commands so that the pattern space holds $before lines
# before we reach line number $line, and $after lines after it
s='$!N'
p=`seq -s "$s"   "$before"`
a=`seq -s "$s" 0 "$after"`

N=${p//[0-9]/;}
n=${a//[0-9]/;}

# main...
seq 10 |
sed -e "
   1{ $N }
   \$d;N
   $line!D
   $n;q
"

Another method is to slurp the file and set the FS to \n so that the fields (now lines) are in @F. What remains is slicing it around the 6th line, taking 4 lines before and 3 lines after:

perl -alF\\n -0777ne '$,=$\;print @F[6-4-1..6+3-1]' yourfile
