I am trying to print every Nth line out of a file with more than 300,000 records into a new file. This has to happen every Nth record until it reaches the end of the file.
-
see also: https://unix.stackexchange.com/q/214445/117549 – Jeff Schaller Jun 04 '17 at 19:54
-
Looking at your comments, we can't understand what you need. Provide sample input and sample output. Do you need a range? From the Nth line up to EOF? – George Vasiliou Jun 04 '17 at 20:25
-
Thanks. I have 355,000 records, which are sorted, but I need to get a sample of the data (1/3, which is about 100,000), so I thought that if I retrieve the 300th of the sorted file from 1 to EOF, I should be able to get a fair sample. – Terisa Jun 04 '17 at 20:50
-
What does the word "records" mean to you? Do you mean the number of lines in a file, or a number of files? It is better to describe your problem in terms of files and lines. Avoid the word "record". Tell us how many lines your file has or how many files you need to parse. – George Vasiliou Jun 04 '17 at 21:00
-
Please explain your requirements more clearly. Against my answer you wrote: "For example for an input file with 300000 I should get 100000 records in the output." That sentence doesn't make any sense unless you had said that n=3 and that you wanted the 3rd, 6th, 9th lines. Or perhaps you wanted the 1st, 4th, 7th lines. There are multiple different solutions because the way you're asking the question is not clear. – Stephen Quan Jun 05 '17 at 02:27
4 Answers
awk 'NR % 5 == 0' input > output
This prints every fifth line.
To take the interval from a shell variable:
NUM=5
awk -v NUM="$NUM" 'NR % NUM == 0' input > output
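As a quick sanity check on how many lines to expect, the same filter can be run on generated input. This is only a sketch: the helper name every_nth is hypothetical (it is not part of the answer above), and seq stands in for the real 355,000-line file.

```shell
# every_nth N [FILE...]: print every Nth line of the input (lines N, 2N, 3N, ...).
# The function name is illustrative only; with no FILE it reads standard input.
every_nth() {
    n=$1
    shift
    awk -v n="$n" 'NR % n == 0' "$@"
}

# 355,000 generated lines stand in for the real file.
# tr strips the padding that some wc implementations add.
count=$(seq 355000 | every_nth 3 | wc -l | tr -d ' ')
echo "lines kept: $count"
```

With N=3 this keeps one third of the input, rounded down (118333 of 355000 lines), so an output of only about 1166 lines suggests that a much larger N was used.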

-
I ran this command and got only 1166 in the output. I expected 100,000. – Terisa Jun 04 '17 at 21:09
-
-
awk 'NR % 3 == 0' 350000.records > 100000-records
That will give you $((350000/3)) lines, or 116666. – Deathgrip Jun 04 '17 at 21:17
-
As commented on your "answer" below, please accept this answer as the solution. Thank you. – Deathgrip Jun 04 '17 at 21:30
-
Or every 5th line starting at the 1st using NR % 5 == 1, or every 5th line starting at the 4th using NR % 5 == 4. – northern-bradley Oct 24 '18 at 20:23
ffmpeg and other programs expect the data of a file or files piped in. Your solution helped me list all the JPGs in a dir and feed every 5th filename to cat to read the data to pipe to ffmpeg. Couldn't have done it without you! (I looked all over and tried dozens of possible solutions.) @northern-bradley your ==1 helps too if there's only 1 file in the directory.
cat $(ls *.jpg | awk 'NR % 5 == 1' -) | ffmpeg -r 15 -f image2pipe -vcodec mjpeg -i - -r 30 test.mp4
– Able Mac Jun 15 '19 at 04:15
To print every Nth line, use
sed -n '0~Np'
For example, to copy every 5th line of oldfile to newfile, do
sed -n '0~5p' oldfile > newfile
This uses sed’s first~step address form, which means “match every step’th line starting with line first.” In theory, this would print lines 0, 5, 10, 15, 20, 25, …, up to the end of the file. Of course there is no line 0, so it just prints lines 5, 10, 15, 20, 25, …; 0~5 is just a convenient alternative way of saying 5~5 (which prints every 5th line starting with line 5; i.e., lines 5, 10, 15, 20, 25, …).
For another example of this sed capability (which does not answer the question),
sed -n '2~5p' oldfile
would print lines 2, 7, 12, 17, 22, 27, …, up to the end of the file.
Note: This approach requires GNU sed, as the first~step address form is a non-portable extension. (Some old versions of GNU sed may require the 5~5 form as opposed to the 0~5 form.)
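Where GNU sed is not available, the same first~step selection can be approximated with awk. The following is a minimal sketch, not part of the original answer: step_lines is a hypothetical helper name, and it assumes only standard awk features (NR, -v, and the % operator).

```shell
# step_lines FIRST STEP [FILE...]: print lines FIRST, FIRST+STEP, FIRST+2*STEP, ...
# Hypothetical helper that mimics GNU sed's first~step address with plain awk.
# FIRST=0 behaves like FIRST=STEP, because there is no line 0.
step_lines() {
    first=$1
    step=$2
    shift 2
    awk -v first="$first" -v step="$step" \
        'NR >= first && (NR - first) % step == 0' "$@"
}

seq 30 | step_lines 2 5    # same lines as: sed -n '2~5p'
seq 30 | step_lines 0 5    # same lines as: sed -n '0~5p'
```

The NR >= first guard keeps lines before the starting line from matching, since the remainder of a negative number can otherwise be zero.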
-
I like that this uses sed, which is what I originally searched for, but to my brain @deathgrip's use of awk is clearer. – northern-bradley Oct 24 '18 at 20:17
The sed solution is about 3 times faster to run than the awk solution on my computer. I confirm it is not a standard option though. – Totor Mar 05 '21 at 23:05
-
Just a note that the answer already says that the ~ syntax requires GNU sed, and has done so since March 5, 2021 (long before you posted that comment). – G-Man Says 'Reinstate Monica' Dec 01 '22 at 20:13
Outstanding answer. I appreciate the '2~5p' addendum because I wanted to split a file into 5 parts, and I could do "every 5th line" 5 times to create them. Using a different first number each time, of course. – Mike S Jan 13 '23 at 20:58
As with sed, awk can do this too:
$ seq 1000000000 |awk 'NR==500000{print;exit}'
500000
NR is compared against the number of the line you want to print; exit then stops awk immediately instead of reading the rest of the input. In your case:
awk 'NR==Nth{print;exit}' inputfile >outputfile
where Nth is a placeholder for the line number you need to print.
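Since Nth above is a placeholder and not an awk variable, one way to avoid editing the program text each time is to pass the number in with -v. A small sketch, assuming a hypothetical helper named nth_line:

```shell
# nth_line N [FILE...]: print only line N, then stop reading.
# The helper name is illustrative; with no FILE it reads standard input.
nth_line() {
    n=$1
    shift
    awk -v n="$n" 'NR == n { print; exit }' "$@"
}

seq 1000 | nth_line 42    # prints 42
```

Note that this prints a single line, which is different from sampling every Nth line as in the other answers.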

-
Looks like the question was initially worded badly, and this answers the wrong question. – rjmunro Oct 09 '20 at 14:55