I have a CSV file that is 6 gigabytes, but I don't need that much data; I only need about 100 rows. How can I truncate it?
@K7AAY, sorry, I have no idea; checking would require me to download the whole thing from S3, which will take a while. – Pavel Orekhov May 29 '19 at 15:40
@K7AAY do CSV files have '\n' at the end of each line? Should I just call readline 100 times and write the result to another file? – Pavel Orekhov May 29 '19 at 15:42
Windows and DOS use carriage return and line feed ("\r\n") as a line ending, while Unix uses just line feed ("\n"). – K7AAY May 29 '19 at 15:44
2 Answers
Depending on what you want, you can:
Take the first 100 rows, as suggested by @K7AAY.
head -n100 filename.csv > file100.csv
Take the last 100 rows
tail -n100 filename.csv > file100.csv
Take a random selection of 100 rows. This requires that you have the GNU shuf program installed. It should be installable from your distribution's repositories if you're on Linux.
shuf -n100 filename.csv > file100.csv
Alternatively, if your sort supports the -R (random sort) option, you can do:
sort -R filename.csv | head -n100 > file100.csv
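If the CSV starts with a header row, the tail and shuf variants above will usually drop it (or shuffle it in among the data rows). A minimal sketch of one way to keep it, assuming the first line is the header and that no quoted field contains an embedded newline; the function name sample_with_header is made up for illustration and GNU shuf is still required:

```shell
# Hypothetical helper: copy the header line, then append N random data rows.
# $1 = input CSV, $2 = number of data rows to sample, $3 = output file
sample_with_header() {
    # The first line is assumed to be the header.
    head -n 1 "$1" > "$3"
    # Everything from line 2 onward is data; pick $2 of those lines at random.
    tail -n +2 "$1" | shuf -n "$2" >> "$3"
}
```

It could then be called as sample_with_header filename.csv 100 file100.csv, which should leave 101 lines in file100.csv (the header plus 100 rows), provided the input has at least 100 data rows.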

terdon