1

I have a big text file (>500GB). All the approaches I can find (sed, tail, and others) require writing the 500GB of content to disk. Is there any way to quickly remove the first few lines in place without writing 500GB to disk?

Jeff Schaller
1a1a11a

2 Answers

0

Use the tail command like this:

# tail -n +<first line to keep> filename

For example, to drop the first 1000 lines, start output at line 1001:

tail -n +1001 hugefile.txt > hugefile-wo-the-first-1000-lines.txt

That's all. For more information: https://es.wikipedia.org/wiki/Tail

BTW: Don't be fooled if someone tells you this does exactly the opposite of what you want; I've tested it. Note that +3 starts the output at line 3, so the first two lines are skipped:

$ tail -n +3 /tmp/test 
3
4
5

$ cat /tmp/test 
1
2
3
4
5
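
Because tail -n +K names the first line to print rather than the number of lines to skip, the count is easy to get wrong by one. A minimal sketch of a wrapper that takes the number of lines to drop; the function name skip_lines and the temporary demo file are hypothetical, not part of any standard tool:

```shell
# skip_lines N FILE: print FILE without its first N lines.
# tail -n +K starts output AT line K, so K must be N + 1.
skip_lines() {
    tail -n "+$(( $1 + 1 ))" "$2"
}

# Demo on a throwaway five-line file (hypothetical path from mktemp).
demo=$(mktemp)
printf '%s\n' 1 2 3 4 5 > "$demo"

# Drop the first two lines; lines 3 to 5 are printed.
skip_lines 2 "$demo"
```

As with the plain tail call, redirecting this output to a new file still writes all of the remaining data to disk.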
guile
0

You can use sed to delete lines in place with the -i option:

$ cat foo.txt
bar
baz
lorem
$ sed -i '1d' foo.txt
$ cat foo.txt
baz
lorem

You can also delete a range of lines; for example sed -i '1,4d' foo.txt will remove lines 1-4.

EDIT: as don_crissti pointed out in the comments, the -i option still creates a copy: sed writes the result to a temporary file and then replaces the original with it.
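
One way to see the copy for yourself is to compare the file's inode number before and after the edit. A rough sketch, assuming GNU sed (BSD/macOS sed would need sed -i '' '1d' instead); the variable names and the temporary file are made up for the demo:

```shell
# Show that sed -i replaces the file instead of modifying it in place:
# the inode number changes because a new file is renamed over the old one.
# Assumes GNU sed; the path comes from mktemp and is only for this demo.
f=$(mktemp)
printf '%s\n' bar baz lorem > "$f"

before=$(ls -i "$f" | awk '{print $1}')
sed -i '1d' "$f"
after=$(ls -i "$f" | awk '{print $1}')

echo "inode before: $before"
echo "inode after:  $after"
```

If the two numbers differ, the name now points at a newly written file, which on a 500GB input means roughly 500GB of writes.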

edaemon
  • 3
    This will also create a temporary file, write the 500GB minus a few lines to the temporary file then overwrite the original. – don_crissti Feb 16 '17 at 23:39
  • @don_crissti: does it? It's possible, I'm not 100% familiar with sed's inner workings, but the -i option in the manual says: "edit files in place". I always assumed that meant it would just modify the file without having to create a copy. – edaemon Feb 16 '17 at 23:42
  • 2
    As Don says. sed -i ... is equivalent to sed ... file >tmpfile && mv tmpfile file. Removing lines from a file in place (properly) is not possible as the length of the file changes. – Kusalananda Feb 16 '17 at 23:43
  • @Kusalananda: huh, okay. Learned something new, I guess. – edaemon Feb 16 '17 at 23:45
  • Thank you for your answer even though it didn't solve the problem. – 1a1a11a Feb 17 '17 at 00:18