
I have a huge text file, ~33 GB, and due to its size I wanted to read just the first few lines to understand how the file is organized. I tried head, but it took forever to finish. Is it because in UNIX, head needs to run through the WHOLE file before it can do anything? If so, is there a faster way to display part of such a file?

B Chen

2 Answers


This doesn't really answer your question; I suspect the reason head is slow is as given in Julie Pelletier's answer: the file doesn't contain any (or many) line feeds, so head needs to read a lot of it to find lines to show. head certainly doesn't need to read the whole file before doing anything, and it stops reading as soon as it has the requested number of lines.
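You can check this diagnosis cheaply by counting line feeds in just the first chunk of the file. This is a minimal sketch using a small stand-in file created on the spot (the file name and its contents are illustrative, not from the question); a count of 0 on your real file would mean head has to read far past that chunk to find even one complete line:

```shell
#!/bin/sh
# Create a small stand-in file with no line feeds at all.
printf 'abcdefghij' > sample

# Count line feeds in the first chunk (here the whole 10 bytes).
# tr -cd '\n' deletes everything except newlines; wc -c counts
# what survives. On a file with no newlines this prints 0.
head -c 10 sample | tr -cd '\n' | wc -c
```

On the real 33 GB file you would use something like `head -c 1048576 hugefile` for the first megabyte instead of the whole stand-in.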

To avoid slowdowns related to line feeds, or if you don't care about seeing a specific number of lines, a quick way of looking at the beginning of a file is to use dd; for example, to see the first 100 bytes of hugefile:

dd if=hugefile bs=100 count=1

Another option, given in Why does GNU head/tail read the whole file?, is to use the -c option to head:

head -c 100 hugefile
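On the same principle, dd's skip operand seeks directly to an offset on a seekable file, so you can sample the middle of a huge file without reading anything before it. A minimal sketch with a small stand-in file (the file name and byte layout are illustrative):

```shell
#!/bin/sh
# Stand-in file: three 5-byte regions, 15 bytes total.
printf 'AAAAABBBBBCCCCC' > sample

# skip is measured in units of bs, so this reads one 5-byte block
# starting at byte offset 5 (the second region), printing BBBBB.
# On a real 33 GB file the skip itself costs no reading.
dd if=sample bs=5 skip=1 count=1 2>/dev/null
```

For example, `bs=100 skip=10000000 count=1` would show 100 bytes starting roughly 1 GB into the file.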
Stephen Kitt

The only times I've seen similar cases were when the file didn't have line feeds, since head only reads the required number of lines from the file.