Gilles beat me to it:
there is no “pointer that points to the first line of the file”.
The first line of the file (the beginning of the file)
always starts at the first byte of the file.
(There may be obscure, individual applications
that recognize such a notion,
but there’s nothing like this at the system level.)
What you already know:
Commands like
sed '1,6d' filename
sed -n '7,$p' filename
tail -n +7 filename
(and probably other variants)
will write all but the first 6 lines of filename
to the standard output.
(They all, of course, read all of the file.)
While we’re at it,
sed -n '1,6p' filename
sed '7,$d' filename
head -n 6 filename
sed '6q' filename
will write the first 6 lines of filename
to the standard output.
The first two might or might not read the entire file;
the last two probably will not.
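As a quick sanity check of the commands above, here is a minimal sketch that runs them against a throwaway ten-line file. The scratch directory and the name sample.txt are placeholders of mine, and mktemp -d is assumed to be available (it is common but not specified by POSIX).

```shell
#!/bin/sh
# Scratch file with ten numbered lines; the path is a throwaway placeholder.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$sample"

# Three ways to print everything after line 6.
rest_sed_d=$(sed '1,6d' "$sample")
rest_sed_p=$(sed -n '7,$p' "$sample")
rest_tail=$(tail -n +7 "$sample")

# The first six lines followed by the rest should reproduce the file exactly.
if { head -n 6 "$sample"; tail -n +7 "$sample"; } | cmp -s - "$sample"; then
    split_ok=yes
else
    split_ok=no
fi

echo "split reproduces original: $split_ok"
rm -rf "$tmpdir"
```

The three "rest" variables should hold identical text, which is the sense in which these commands are interchangeable here.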
Also,
command input_filename > the_same_filename
doesn’t work, because the shell truncates the file before command ever reads it,
as discussed in
Warning regarding “>”.
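To see the failure for yourself, a small sketch along these lines should do. It works on a scratch file (demo.txt is a made-up name) so that nothing of value gets clobbered, and it assumes mktemp -d is available.

```shell
#!/bin/sh
# Demonstration only: work on a scratch file in a temporary directory.
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.txt"
printf 'one\ntwo\nthree\n' > "$demo"

# The shell opens demo.txt for output and truncates it to zero bytes
# before sed starts, so sed finds nothing to read and writes nothing.
sed '1d' "$demo" > "$demo"

# $(( ... )) strips any padding that some wc implementations add.
clobbered_size=$(($(wc -c < "$demo")))
echo "size after redirecting onto the input file: $clobbered_size"
rm -rf "$tmpdir"
```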
What you might not know:
command arguments 1<> filename
will open filename
for reading and writing
without truncating (clobbering) it.
So,
sed '1,6d' filename 1<> the_same_filename
might be the first step in the solution you are looking for.
This is probably as close as you’re going to come
to removing the first M lines of a file “in place”;
it will read the file and overwrite it concurrently,
without creating another file.
If M is small enough (or, specifically,
if the number of bytes in the first M lines is small enough),
this may read each block of the file once and write each block once,
and you can’t do any better than that.
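The non-truncating behaviour of 1<> is easy to observe in isolation. The following is only a sketch on a scratch file (demo.txt is a placeholder name, and mktemp -d is assumed): two bytes are written at offset 0 and the rest of the old contents survives.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.txt"
printf 'abcdef\n' > "$demo"

# 1<> opens the file read-write at offset 0 without truncating it,
# so only the first two bytes are replaced.
printf 'XY' 1<> "$demo"

contents=$(cat "$demo")
size_after=$(($(wc -c < "$demo")))
echo "$contents"
rm -rf "$tmpdir"
```

The file is still 7 bytes long afterwards; only "ab" has been overwritten.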
Just the first step?
I created this test file:
$ cat -n foo
1 a
2 bcd
3 efghi
4 jklmnop
5 qrstuvwxy
6 z0123456789
7 ABCDEFGHIJKLM
8 Once upon a midnight dreary, while I pondered, weak and weary,
9 Over many a quaint and curious volume of forgotten lore—
10 While I nodded, nearly napping, suddenly there came a tapping,
11 As of some one gently rapping—rapping at my chamber door.
12 "'Tis some visitor," I muttered, "tapping at my chamber door—
13 Only this and nothing more."
14 The quick brown
15 fox jump over the
16 lazy dog. Once upon
17 this midnight dreary,
This file is painstakingly constructed
so that the lengths of the lines (including newlines)
are 2, 4, 6, 8, 10, 12, 14, 63, 57, 63, 58, 62, 63, 16, 18, 20,
and 22.
Note that the first six lines therefore contain 2+4+6+8+10+12=42 bytes.
The last two lines contain 20+22 bytes, which is coincidentally (!) also 42.
(The total file size is 504.)
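If you want to tabulate line lengths like these for your own file, something like the sketch below may help. It uses only the first three lines of the test file (2, 4 and 6 bytes) in a scratch file whose name is a placeholder, and it sets LC_ALL=C on the assumption that awk's length() then counts bytes rather than characters.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
# First three lines of the test file: 2, 4 and 6 bytes including newlines.
printf 'a\nbcd\nefghi\n' > "$sample"

# Length of each line in bytes, counting the trailing newline.
line_lengths=$(LC_ALL=C awk '{ print length($0) + 1 }' "$sample")

# Bytes occupied by the first two lines (that is, M = 2 here).
first_two=$(($(head -n 2 "$sample" | wc -c)))

echo "$line_lengths"
echo "first two lines: $first_two bytes"
rm -rf "$tmpdir"
```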
So,
$ ls -l foo
-rw-r--r-- 1 myusername mygroupname 504 May 18 04:25 foo
$ sed '1,6d' foo 1<> foo
$ ls -l foo
-rw-r--r-- 1 myusername mygroupname 504 May 18 04:32 foo
$ cat -n foo
1 ABCDEFGHIJKLM
2 Once upon a midnight dreary, while I pondered, weak and weary,
3 Over many a quaint and curious volume of forgotten lore—
4 While I nodded, nearly napping, suddenly there came a tapping,
5 As of some one gently rapping—rapping at my chamber door.
6 "'Tis some visitor," I muttered, "tapping at my chamber door—
7 Only this and nothing more."
8 The quick brown
9 fox jump over the
10 lazy dog. Once upon
11 this midnight dreary,
12 lazy dog. Once upon
13 this midnight dreary,
OK, good, the first six lines are gone.
The original line number 7 (“ABCDEFGHIJKLM”) is now line number 1.
But, what’s this?
The file has gone from 17 lines to 13.
It should be 11 (17−6).
And the last two lines (“lazy dog … midnight dreary”) are there twice.
This is one of the pitfalls of the 1<> operator:
if you don’t truncate the output file,
you can’t end up with a file that’s smaller than the one you started with.
Specifically, here, the output from sed '1,6d' foo is 462 bytes
(504−42, since the first six lines contain 42 bytes),
and so it overwrites the first 462 bytes of the output file,
which is also foo.
And the first 462 bytes of foo are all but the last 42 (504−462),
so the last two lines do not get overwritten.
The two copies of the last two lines (“lazy dog … midnight dreary”)
are one that’s the output from sed,
followed by one that’s left over from the original contents of the file.
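The same effect can be reproduced on a much smaller scale. This is a sketch with a 14-byte scratch file (demo.txt is a placeholder, mktemp -d is assumed): removing the first two lines produces 9 bytes of output, so the last 5 bytes of the original are left in place.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.txt"
# Four lines of 2, 3, 4 and 5 bytes: 14 bytes in total.
printf 'a\nbb\nccc\ndddd\n' > "$demo"
size_before=$(($(wc -c < "$demo")))

# Drop the first two lines (5 bytes), overwriting the file from offset 0.
# sed writes 9 bytes; the final 5 bytes of the file are never touched.
sed '1,2d' "$demo" 1<> "$demo"

size_after=$(($(wc -c < "$demo")))
contents=$(cat "$demo")
echo "size: $size_before -> $size_after"
echo "$contents"
rm -rf "$tmpdir"
```

The size stays at 14 bytes, and the last line appears twice: once from sed's output and once as the untouched tail.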
So, what next?
All we need to do now is to throw away the last 42 bytes of the file.
As it happens, this can be done
by just moving the pointer that points to the end of the file.
OK, it’s not actually a pointer; it’s an integer file size —
potAto, potAHto.
For the past 20 or 30 years,
Unix has allowed you to truncate a file to a desired size,
leaving the data up to that point untouched,
and discarding the data beyond that point.
An ancient command that will do this is
dd if=/dev/null bs=462 seek=1 of=foo 2> /dev/null
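which copies /dev/null (that is, nothing) over foo, starting at byte offset 462;
since dd truncates its output file at the seek position,
everything from that offset onward is discarded.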
Yes, it’s somewhat of a kluge.
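Here is the dd idiom applied to a small scratch file, as a sketch only (demo.txt is a placeholder name and mktemp -d is assumed). The file starts with 9 wanted bytes followed by a 5-byte stale tail.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.txt"
# 14 bytes: 9 bytes of wanted data followed by a 5-byte leftover tail.
printf 'ccc\ndddd\ndddd\n' > "$demo"

# Seek one 9-byte block into the output and copy nothing;
# dd then truncates the output file at that position.
dd if=/dev/null bs=9 seek=1 of="$demo" 2> /dev/null

size_after=$(($(wc -c < "$demo")))
contents=$(cat "$demo")
echo "size after dd: $size_after"
echo "$contents"
rm -rf "$tmpdir"
```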
A newer command that performs this function is
truncate -s 462 foo
This might not be present on all systems; it is not specified by POSIX.
So, putting it all together,
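#!/bin/sh
# Remove the first 6 lines of the file named in $1, without a second file.
filename="$1"
# Bytes occupied by the first 6 lines.
bytes_to_remove=$(sed '6q' "$filename" | wc -c)
# Current size in bytes (stat -c is GNU syntax; stat is not in POSIX).
total_size=$(stat -c '%s' "$filename")
# Overwrite the file from the start, without truncating it first.
sed '1,6d' "$filename" 1<> "$filename"
# Discard the stale bytes left over at the end.
new_size=$((total_size - bytes_to_remove))
truncate -s "$new_size" "$filename"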
We use wc -c to count the bytes in the first six lines
(produced by sed '6q'),
subtract that from the total file size,
and truncate the file to that size.
You can use any of the alternative commands
to output the first M lines or the last N−M lines
(where N is the total number of lines in the file),
and you can replace the last line with
dd if=/dev/null bs="$new_size" seek=1 of="$filename" 2> /dev/null
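For what it's worth, the same steps can be wrapped in a function that takes M as a parameter. This is a sketch under assumptions, not a hardened tool: the name drop_first_lines is made up, wc -c stands in for stat so that nothing outside POSIX is needed to get the size, and truncate is used only if it happens to be installed. It does nothing to protect against another process writing to the file at the same time.

```shell
#!/bin/sh
# drop_first_lines FILE M  (hypothetical helper name, not from the answer)
drop_first_lines() {
    file=$1
    count=$2
    [ "$count" -gt 0 ] || return 0      # nothing to remove

    # wc -c counts bytes; it stands in for the non-POSIX stat -c '%s'.
    bytes_to_remove=$(head -n "$count" "$file" | wc -c)
    total_size=$(wc -c < "$file")
    new_size=$((total_size - bytes_to_remove))

    # Slide the remaining lines to the front without truncating first.
    sed "1,${count}d" "$file" 1<> "$file"

    # Cut off the stale tail.
    if [ "$new_size" -eq 0 ]; then
        : > "$file"                      # dd cannot take bs=0
    elif command -v truncate > /dev/null 2>&1; then
        truncate -s "$new_size" "$file"
    else
        dd if=/dev/null bs="$new_size" seek=1 of="$file" 2> /dev/null
    fi
}

# Usage on a scratch file (placeholder name; mktemp -d is assumed).
tmpdir=$(mktemp -d)
printf 'a\nbb\nccc\ndddd\n' > "$tmpdir/demo.txt"
drop_first_lines "$tmpdir/demo.txt" 2
result=$(cat "$tmpdir/demo.txt")
result_size=$(($(wc -c < "$tmpdir/demo.txt")))
echo "$result"
rm -rf "$tmpdir"
```

The special case for a new size of zero is there because the dd idiom relies on a non-zero block size.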
Caveats:
I haven’t tested this on files with
- CR-LF line endings, or
- multibyte characters,
and these might be problematic.