IO redirection and the head command

Question

I was trying to quickly edit an .hgignore file from the Cygwin bash shell today, and I added a line that was a mistake. I'm not sure if this was the best way to do it, but I quickly thought of using head -1 .hgignore to remove the offending line (I had previously only had one line in the file). Sure enough, when executed it gives the first line as the only output.

But when I tried to redirect the output and rewrite the file using head -1 .hgignore > .hgignore, the file was empty. Why does this happen? If I try appending instead, head -1 .hgignore >> .hgignore, it appends correctly but this is obviously not the desired result. Why does a truncating redirect not work in this case?

Similar: Can I make cut change a file in place?, How can I make iconv replace the input file with the converted output? — Gilles 'SO- stop being evil', Jun 29 '11 at 22:18

score 12 · Answer 1 · edited Apr 13 '17 at 12:37


I think Bruce answers what's going on here with the shell pipeline.

One of my favorite little utilities is the sponge command from moreutils. It solves exactly this problem by "soaking" up all available input before it opens the target output file and writes the data. It allows you to write pipelines exactly how you expected to:

$ head -1 .hgignore | sponge .hgignore

The poor man's solution is to redirect the output to a temporary file, then, once the pipeline is done (for example, in the next command you run), move the temp file back to the original file location.

$ head -1 .hgignore > .hgignore.tmp
$ mv .hgignore{.tmp,}
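A slightly more defensive variant of the temp-file approach above can be sketched like this. It is only an illustration: the `target` and `tmp` variable names and the sample file contents are made up, and a scratch file stands in for the real .hgignore.

```shell
#!/bin/sh
# Sketch only: "target" is a scratch file standing in for .hgignore.
target=$(mktemp) || exit 1
printf 'syntax: glob\nmistake\n' > "$target"

# Create the temp file next to the target so mv stays on one filesystem.
tmp=$(mktemp "${target}.XXXXXX") || exit 1

# Replace the original only if head succeeded; otherwise it is left untouched.
head -n 1 "$target" > "$tmp" && mv "$tmp" "$target"

result=$(cat "$target")
echo "$result"
```

Chaining with `&&` means a failed head never clobbers the original. Note that the replaced file takes on the temp file's permissions, which with mktemp are typically more restrictive than the original's.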

answered Jun 29 '11 at 20:43 · Caleb

Looking at this a few years later, a thought occurred to me: couldn't we just do head -1 .hgignore | tee .hgignore? tee is in coreutils, and as a perk/side-effect, this also writes to STDOUT – voithos Mar 28 '14 at 14:51

@voithos To my knowledge tee opens and truncates the file it is writing to when it is instantiated just like everything else so it does not solve the main issue here of the race condition on reading the file contents before you truncate it with the write. – Caleb Mar 28 '14 at 17:37
You bring up a point that I wasn't aware of, actually - namely, that piped commands are started immediately, instead of sequentially. Is that accurate? I did, however, test it out and tee seems to do the desired thing. I've got version 8.13 on my machine. – voithos Mar 28 '14 at 17:47

@voithos Yes, commands in a pipeline and all the input/output channels involved are started in reverse order so the pipeline is ready to receive data when the first one starts giving it. I suspect your test is flawed because you probably used too small a chunk of data and it got the whole thing cached in a read buffer before you needed it. The tee program will truncate your files; it is not set up to double buffer them. – Caleb Mar 29 '14 at 11:32

score 11 · Accepted Answer · answered Jun 29 '11 at 18:43

When the shell gets a command line like command > file.out, the shell itself opens (and maybe creates) the file named file.out. The shell then sets file descriptor 1 (standard output) to the file descriptor it got from the open. That's how I/O redirection works: every process knows about file descriptors 0, 1 and 2.

The hard part about this is how to open file.out. Most of the time, you want file.out opened for write at offset 0 (i.e. truncated), and this is what the shell did for you. It opened .hgignore for write, truncating it, dup'ed the file descriptor to 1, then exec'ed head. Instant file clobbering.

In bash, you can run set -o noclobber to change this behavior: the shell then refuses to overwrite an existing file with >.
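To see both behaviors side by side, here is a small sketch. The scratch directory, the file name demo.txt and the variable names are invented for the demonstration; `set -o noclobber` and the `>|` override are standard shell features.

```shell
#!/bin/sh
# Sketch only: demo.txt lives in a scratch directory created just for this demo.
workdir=$(mktemp -d) || exit 1
f="$workdir/demo.txt"

printf 'first\nsecond\n' > "$f"

# The shell truncates the file before head runs, so head reads nothing.
head -n 1 "$f" > "$f"
size_after_clobber=$(wc -c < "$f")
echo "bytes left after plain redirect: $((size_after_clobber))"

# With noclobber set, '>' refuses to overwrite an existing file and
# the command is not run at all.
printf 'first\nsecond\n' > "$f"
set -o noclobber
if ! head -n 1 "$f" > "$f"; then
    echo "redirect refused; file left intact"
fi
lines_after_refusal=$(wc -l < "$f")

# '>|' overrides noclobber for a single redirection.
echo replaced >| "$f"
set +o noclobber
final_content=$(cat "$f")
```

noclobber only protects against the accident; it does not make head -1 .hgignore > .hgignore do what was intended.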

Aha, I see. I did think that the shell was truncating the file before running the command, but I didn't know why. Thanks for the explanation! — voithos, Jun 29 '11 at 20:18

Stéphane Chazelas · Answer 3 · 2013-02-19T21:38:18.063

In

head -n 1 file > file

file is truncated before head is started, but if you write it:

head -n 1 file 1<> file

it's not, because file is opened in read-write mode without truncation. However, head doesn't truncate the file when it finishes writing, so the line above is a no-op (head just rewrites the first line over itself and leaves the other ones untouched).
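A quick way to convince yourself of the no-op, as a sketch (the scratch directory and variable names are made up for the demo):

```shell
#!/bin/sh
# Sketch only: "file" is a scratch file created for this demo.
workdir=$(mktemp -d) || exit 1
file="$workdir/file"
printf 'one\ntwo\nthree\n' > "$file"
before=$(cat "$file")

# 1<> opens the file read-write without truncating it, so head
# simply writes "one" back over the first line.
head -n 1 "$file" 1<> "$file"
after=$(cat "$file")
[ "$before" = "$after" ] && echo "unchanged"

# Writing something shorter shows that the old bytes stay in place:
# "X\n" replaces "on", and the rest of the file is untouched.
printf 'X\n' 1<> "$file"
overwritten=$(cat "$file")
printf '%s\n' "$overwritten"
```

The second write leaves a stray "e" line behind, which is exactly the leftover data that an explicit truncate would have to remove.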

However, after head has returned and while the fd is still open, you can call another command that does the truncate.

For instance:

{ head -n 1 file; perl -e 'truncate STDOUT, tell STDOUT'; } 1<> file

What matters here is the truncate above; head just moves the cursor for fd 1 to just after the first line. It does rewrite the first line, which we didn't need it to do, but that's not harmful.

With a POSIX head, we could actually get away without rewriting that first line:

{ head -n 1 > /dev/null
  perl -e 'truncate STDIN, tell STDIN'
} <> file

Here, we're using the fact that head moves the cursor position in its stdin. While head would typically read its input by big chunks to improve performance, POSIX would require it (where possible) to seek back just after the first line if it had gone beyond it. Note however that not all implementations do it.

Alternatively, you can use the shell's read command instead in this case:

{ read -r dummy; perl -e 'truncate STDIN, tell STDIN'; } <> file

Stephane, do you know of a standard or coreutils command that can truncate STDIN similar to what you've accomplished using perl above? — iruvar, Aug 27 '15 at 14:11
@1_CR, no. dd can truncate at any arbitrary absolute offset in the file though. So you can determine the byte offset of the second line and truncate from there with dd bs=1 seek="$offset" of=file — Stéphane Chazelas, Aug 27 '15 at 14:36
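The dd idea from the comment above could look roughly like this. It is a sketch with assumptions: the scratch file and variable names are invented, the offset is computed with head and wc, and if=/dev/null is added here so that dd copies nothing and only truncates at the seek position (this relies on dd truncating the output file when conv=notrunc is not given).

```shell
#!/bin/sh
# Sketch only: "file" is a scratch file created for this demo.
workdir=$(mktemp -d) || exit 1
file="$workdir/file"
printf 'keep\ndrop 1\ndrop 2\n' > "$file"

# Byte length of the first line, including its newline.
offset=$(head -n 1 "$file" | wc -c)
offset=$((offset))   # strips the padding some wc implementations add

# Copy zero bytes but seek to $offset in the output; without
# conv=notrunc, dd cuts the output file off at that position.
dd if=/dev/null of="$file" bs=1 seek="$offset" 2>/dev/null

result=$(cat "$file")
printf '%s\n' "$result"
```

Unlike the perl variants, this reopens the file by name rather than truncating through an already open descriptor.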

Gilles 'SO- stop being evil' · Answer 4 · 2011-06-29T22:48:27.087


The Real Man's solution is

ed .hgignore
$d
wq

or as a one-liner

printf '%s\n' '$d' 'wq' | ed .hgignore

Or with GNU sed:

sed -i '$d' .hgignore

(No, I'm kidding. I'd use an interactive editor. vi .hgignore GddZZ)

answered Jun 29 '11 at 22:16 · edited Jun 29 '11 at 22:48

I've wondered, is there any advantage to using :wq over ZZ? – voithos Jun 29 '11 at 22:40
Also, :x which is what my fingers do automatically – glenn jackman Jun 30 '11 at 00:04
and ZQ is the same as :q! – glenn jackman Jun 30 '11 at 00:05
ZZ and :x only write if there is something to write... :w always fsyncs the file to disk regardless if it needs it. I use :xa because I use tabs. – xenoterracide Jul 10 '12 at 09:37

Zombo · Answer 5 · 2016-04-14T05:33:54.307


You can use Vim in Ex mode:

ex -sc '2,d|x' .hgignore

2, select lines 2 until end
d delete
x save and close

answered Apr 11 '16 at 02:58 · edited Apr 14 '16 at 05:33

score 0 · Answer 6 · edited May 23 '17 at 12:40


For in-place file editing you may also use the open file handle trick as shown by Jürgen Hötzel in Redirect output from sed 's/c/d/' myFile to myFile.

exec 3<.hgignore
rm .hgignore  # prevent open file from being truncated
head -1 <&3 > .hgignore

ls -l .hgignore  # note that permissions may have changed

answered Jun 30 '11 at 10:25 · dan55

And just after rm .hgignore your power fails, taking away hours of hard work. Ok, it doesn't matter for .hgignore, but why would you do something that complicated anyway? Thus my downvote: technically correct but a very bad idea. – Gilles 'SO- stop being evil' Jun 30 '11 at 20:57
@Gilles, maybe not so good an idea, but that's for instance what perl -i (for inplace editing) does, and I wouldn't be surprised if some implementations of sed -i did it as well (though latest version of GNU sed seems not to). – Stéphane Chazelas Feb 19 '13 at 20:20
