The question What's the best way to take a segment out of a text file? and the many others in its right sidebar are almost duplicates of this one.
The one difference is that my file was too large to fit in the available RAM plus virtual memory, so everything I tried not only produced nothing for minutes until killed, but bogged down the whole system. One attempt left me unable to do anything until the system crashed.
I could write a loop in the shell (or any other language) to read, count, and discard lines until the wanted line number is reached, but maybe a single command already exists that will do it?
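(For reference, a minimal sketch of such a single command, with the line number 60000 and the file name as placeholders. Both sed and awk stop reading as soon as the target line has been printed, so memory use stays bounded by the length of a single line:

    # print only line 60000, then quit so nothing past it is ever read
    sed -n '60000{p;q}' file.txt
    # equivalent in awk
    awk 'NR == 60000 { print; exit }' file.txt

As the comments below point out, "bounded by the length of a single line" is the catch if the file contains one enormous line.)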
After trying a few things (vim, head -X | tail -1, a GUI editor), I gave up, deleted the file, and changed the program that created it to give me only the lines needed.
Looking further, Opening files of size larger than RAM, no swap used suggests that vi should be able to do it; but if vi is the same as vim, it was definitely doing something that takes minutes, not seconds.
split comes to mind, e.g. split -n 4 file.txt should split file.txt into 4 parts. I don't really know split since I don't use it, but you could give it a try? – Jetchisel Jan 18 '20 at 01:21
split -l N-1 and then head -1 on the second file. Can't give it a try without re-creating the huge file, and I'd rather not experiment, since one attempt already crashed the system. – WGroleau Jan 18 '20 at 02:09
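(Spelled out, that suggestion looks roughly like this; the prefix part_ and the target line 60000 are placeholders:

    # chunks of 59999 lines: lines 1-59999 land in part_aa,
    # so the wanted line 60000 is the first line of part_ab
    split -l 59999 file.txt part_
    head -1 part_ab

Note that split rewrites the entire file to disk, so it needs about as much free space again and is slower than a sed/awk one-liner that quits early.)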
sed, awk, etc. will not load the whole file into memory. But they will load whole lines into memory, which may be exactly what brings your system to its knees -- overlong lines. What kind of file is that? Is it an XML by chance? – Jan 18 '20 at 02:10
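(If overlong lines are indeed the problem, one workaround -- my suggestion, not from the thread -- is to read by bytes instead of by lines, so no line ever has to fit in memory. bs=1M is GNU dd syntax, and the offsets are arbitrary placeholders:

    # copy exactly one 1 MiB block starting 100 MiB into the file,
    # then show its first 200 bytes; nothing else is read or held
    dd if=file.txt bs=1M skip=100 count=1 2>/dev/null | head -c 200

)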
Don't use vim with any huge files -- not only will it load the whole file into memory, it will create a structure out of it, which will double / triple or worse its size ;-) – Jan 18 '20 at 02:13
head -60000 file | tail -1 filled three screens with a single line of garbage that I am pretty sure was NOT in the file. (And it did that after a minute or more of no visible output.) The file was created by looping through image file names and grepping for each one in two other files, so the lines were of the form filename: path/to/image. The reason it was so big is that one of the image file names was ".jpg", which matched everything in the other two files. – WGroleau Jan 18 '20 at 02:32
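(A hedged reconstruction of the loop just described -- the file names and layout are my assumptions, not the poster's code:

    # loop over image files and look each name up in two lists
    for img in images/*.jpg; do
        name=$(basename "$img")
        # -F treats the name as a literal string rather than a regex;
        # without it, a name like ".jpg" matches any character + "jpg"
        grep -F -- "$name" list1.txt list2.txt
    done

Even with -F, a literal ".jpg" still matches every JPEG path as a substring, so skipping such degenerate names, or anchoring the match to a whole field, is what actually prevents the blow-up.)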
LC_COLLATE=C tr -sc ' -~\n' '\n' <infile >outfile. – Jan 18 '20 at 02:45
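(My reading of that tr suggestion, for anyone following along: -c complements the set, so it selects every byte outside printable ASCII and newline, and -s squeezes each run of them into a single newline, chopping one enormous garbage line into many short ones:

    # turn runs of non-printable bytes into newlines; the C collation
    # makes the range ' -~' mean exactly the printable ASCII bytes
    LC_COLLATE=C tr -sc ' -~\n' '\n' <infile >outfile

)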
grep is the guilty party. If not, then head & tail failed due to size. – WGroleau Jan 18 '20 at 05:37