12

I want to extract an exact line from a very big file. For example, line 8000 would be gotten like this:

command -line 8000 > output_line_8000.txt
terdon
    Many of the methods below are mentioned in this SO Q&A as well: http://stackoverflow.com/questions/6022384/bash-tool-to-get-nth-line-from-a-file – slm May 17 '14 at 12:27

6 Answers

14

There's already an answer with perl and awk. Here's a sed answer:

sed -n '8000{p;q}' file

The advantage of the q command is that sed will quit as soon as the 8000th line has been read, instead of scanning the rest of the file (the perl and awk answers were later edited to exit early as well).
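If the line number lives in a shell variable, a minimal sketch (just an illustration, not from the original answer) is to use double quotes so the variable expands inside the sed expression:

n=8000
sed -n "${n}{p;q}" file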

A pure Bash possibility (bash≥4):

mapfile -s 7999 -n 1 ary < file
printf '%s' "${ary[0]}"

This will slurp the content of file into the array ary (one line per element), but skip the first 7999 lines (-s 7999) and read only one line (-n 1).
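As a reusable sketch (the nth_line name and argument order are just for illustration, not part of any standard tool):

# Hypothetical helper: print line $1 of file $2 using mapfile (bash >= 4 only).
nth_line() {
    local -a line
    mapfile -s "$(($1 - 1))" -n 1 line < "$2"
    printf '%s' "${line[0]}"
}
nth_line 8000 file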

9

It's Saturday and I had nothing better to do, so I tested some of these for speed. It turns out that the sed, gawk and perl approaches are basically equivalent. The head & tail one is the slowest but, surprisingly, the fastest by an order of magnitude is the pure bash one:

Here are my tests:

$ for i in {1..5000000}; do echo "This is line $i" >>file; done

The above creates a file with 5 million lines, which occupies about 100M.
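As an aside, the same test file can be generated much faster with GNU seq (assuming its -f format option is available; %.0f keeps the numbers unpadded):

$ seq -f 'This is line %.0f' 5000000 > file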

$ for cmd in "sed -n '8000{p;q}' file" \
            "perl -ne 'print && exit if $. == 8000' file" \
            "awk 'FNR==8000 {print;exit}' file" 
            "head -n 8000 file | tail -n 1" \
            "mapfile -s 7999 -n 1 ary < file; printf '%s' \"${ary[0]}\"" \
            "tail -n 8001 file | head -n 1"; do 
    echo "$cmd"; for i in {1..100}; do
     (time eval "$cmd") 2>&1 | grep -oP 'real.*?m\K[\d\.]+'; done | 
        awk '{k+=$1}END{print k/100}'; 
    done
sed -n '8000{p;q}' file
0.04502
perl -ne 'print && exit if $. == 8000' file
0.04698
awk 'FNR==8000 {print;exit}' file
0.04647
head -n 8000 file | tail -n 1
0.06842
mapfile -s 7999 -n 1 ary < file; printf '%s' "This is line 8000
"
0.00137
tail -n 8001 file | head -n 1
0.0033
terdon
6

You can do it in many ways.

Using perl:

perl -nle 'print && exit if $. == 8000' file

Using awk:

awk 'FNR==8000 {print;exit}' file

Or you can use tail and head so that reading stops right after the 8000th line instead of going through the whole file:

tail -n +8000 file | head -n 1
cuonglm
4

Another version with tail and head:

head -n 8000 file | tail -n 1
4

You could use sed:

sed -n '8000p' filename

If the file is large, it'd be better to tell sed to quit so it doesn't read the rest of the file:

sed -n '8000p;8001q' filename

You can similarly avoid reading the entire file using awk or perl:

awk 'NR==8000{print;exit}' filename
perl -ne 'print if $.==8000; last if $.==8000' filename
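As a minor variant (not in the original answer), the two perl checks can be folded into a single condition:

perl -ne 'if ($. == 8000) { print; last }' filename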
devnull
1

How about this?

$ cat -n filename | grep -E "^[ \t]+8000"

Example

$ cat -n /etc/abrt/plugins/CCpp.conf  | grep -E "^[ \t]+16"
    16  #DebuginfoLocation = /var/cache/abrt-di
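Note that this pattern also matches line numbers that merely start with 8000 (80000, 80001, ...). A stricter sketch, assuming GNU grep with -P support, anchors on the tab that cat -n inserts after the line number:

$ cat -n filename | grep -P "^\s*8000\t"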
slm
mbsingh