
How can I retrieve every Nth record of a large text file and write those records to a new file?

Jeff Schaller
Terisa
  • How do you define "record"? What does some sample data look like? What have you tried? https://unix.stackexchange.com/help/how-to-ask – phemmer Jun 04 '17 at 20:45
  • You've just duplicated https://unix.stackexchange.com/q/369181/117549 -- be more specific about what you want – Jeff Schaller Jun 04 '17 at 20:56
  • It sounded like, from the other question, that you want every N lines – Jeff Schaller Jun 04 '17 at 20:57
  • create a small example that could represent your problem. give sample input / sample output. – George Vasiliou Jun 04 '17 at 21:04
  • That is right, I put the new question out there before I noticed your tip. The records are 20-digit numbers, and there are more than 300,000 of them. The first 4 digits represent location and year (e.g. 0117). I need one third of them, reflecting the weight of each category (location-year). I thought I would get a fair sample if I sorted the data and wrote every 300th record to a new file. – Terisa Jun 04 '17 at 21:04
  • 0100000044116R007010 0105000045035C005005 – Terisa Jun 04 '17 at 21:05
  • This is location 01 and year 2000 – Terisa Jun 04 '17 at 21:06
  • Edit the question to include these details – Jeff Schaller Jun 04 '17 at 21:07
  • Thrig, the "duplicate" I linked to was originally a different question, as evidenced by the answers. In that Q's comments, it came to light that the OP wanted something different, and so opened this question. – Jeff Schaller Jun 04 '17 at 22:57
  • @JeffSchaller regardless, the "duplicate" has been edited so it's now the same question... I'm voting to close anyway since they are now the same. what are we going to do, revert the edits to the other question so the answers make sense? then we're back to square one with a question that would be closed as "unclear what you're asking". – strugee Jun 04 '17 at 23:28

1 Answer

NUM=5
awk -v NUM="$NUM" 'NR % NUM == 0' input > output

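Set the shell variable NUM to the desired interval (it does not need to be exported; awk receives it through -v). Replace "input" with the name of your input file and "output" with the name of your output file, which will contain every NUMth line of the input (lines NUM, 2*NUM, 3*NUM, and so on).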
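If, as the comments suggest, the goal is to sort the records first so that each location-year prefix is sampled roughly in proportion to its size, the same awk filter can sit at the end of a pipeline. This is only a sketch under assumptions: the file names records.txt and sample.txt are made up, the six records are invented sample data in the 20-character format shown in the comments, and NUM=3 is chosen just to keep the example small.

```shell
# Hypothetical sample data: six 20-character records, one per line.
printf '%s\n' \
  0105000045035C005005 \
  0100000044116R007010 \
  0200000044116R007011 \
  0117000045035C005006 \
  0217000045035C005007 \
  0117000044116R007012 > records.txt

NUM=3   # keep every 3rd record; use a larger interval for real data

# Sort so records with the same 4-digit prefix end up adjacent
# (LC_ALL=C gives a plain byte-order sort), then keep only the lines
# whose line number is a multiple of NUM.
LC_ALL=C sort records.txt | awk -v NUM="$NUM" 'NR % NUM == 0' > sample.txt

cat sample.txt
# 0117000044116R007012
# 0217000045035C005007
```

Note that a fixed interval only approximates proportional sampling: a location-year group with fewer than NUM records can be skipped entirely, depending on where it falls in the sorted order.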

Deathgrip