Why does `cat`ing a file into itself erase it?

Question

Possible Duplicate:
IO redirection and the head command

I just wanted to remove all but the first line of a file. I did this:

head -1 foo.txt

... and verified that I saw only the first line. Then I did:

head -1 foo.txt > foo.txt

But instead of containing only the first line, foo.txt was now empty.

Turns out that cat foo.txt > foo.txt also empties the file.

Why?

A more interesting question (yet slightly more pedantic) would be to know if the evaluation order is defined in POSIX or is it implementation-specific. — rahmu, Jun 20 '12 at 18:19
@rahmu POSIX does specify the order but it doesn't need to because if you think about it for just a moment, you'll realize that the shell is the one that does the redirections and it has to do them before running the command, since it will be too late to do it after the command has already started. Add that bit of common sense to the fact that the > operator includes truncation and this behavior becomes very logical. — jw013, Jun 20 '12 at 21:55

Levon · Accepted Answer · 2012-06-21T12:53:36.900

13

Before the shell runs a command, it sets up all of that command's input and output redirections.

So in your case, > foo.txt basically tells the shell: "open foo.txt for writing, creating it if it doesn't exist and truncating it to zero length if it does, then send all the output from this command into that file".

The problem is, as you found out, that the truncation wipes out the previous contents before head or cat ever gets to read them.

Relatedly, >> appends to an existing file instead of truncating it.
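
A small experiment makes the difference between the two operators visible. This is only a sketch, assuming a POSIX shell with mktemp available; the scratch directory and the name demo.txt are made up for the demo.

```shell
# Scratch directory and file name are hypothetical, chosen for this demo.
dir=$(mktemp -d)
printf 'line 1\nline 2\n' > "$dir/demo.txt"

# The shell truncates demo.txt while setting up ">", before head starts,
# so head reads an already-empty file and writes nothing.
head -n 1 "$dir/demo.txt" > "$dir/demo.txt"
# $(( ... )) strips the padding that some wc implementations add.
bytes_after_overwrite=$(( $(wc -c < "$dir/demo.txt") ))

# ">>" opens the file for appending, so the existing line is kept.
printf 'line 1\n' > "$dir/demo.txt"
printf 'line 2\n' >> "$dir/demo.txt"
lines_after_append=$(( $(wc -l < "$dir/demo.txt") ))

echo "bytes after '> same file': $bytes_after_overwrite"
echo "lines after '>>': $lines_after_append"
```

The first count comes out as 0 and the second as 2.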

Update:

Here's a solution using sed, handle with care:

 sed -i '2,$d' foo.txt

It will delete lines 2 to "last" in-place in file foo.txt. Best to try this out on a file you can afford to mess up first :)

This slightly modified version of the command will keep a copy of the original with the .bak extension:

 sed -i.bak '2,$d' foo.txt

You can put any sequence of characters (or a single character) directly after the -i command line switch; it is used as the suffix appended to the original file's name to form the name of the backup copy.
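
If you would rather not rely on sed -i, the other common approach is to write the output to a different file and rename it afterwards. Here is a rough sketch, assuming a POSIX shell with mktemp available; keep_first_line is a hypothetical helper name, not a standard command.

```shell
# keep_first_line FILE: hypothetical helper that cuts FILE down to its first line.
keep_first_line() {
    file=$1
    # Create the temporary file next to the target so that mv is a plain
    # rename on the same filesystem.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    # head reads the original while the output goes to a *different* file,
    # so the shell's truncation of the output never touches the input.
    if head -n 1 "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage on a scratch file (hypothetical name):
dir=$(mktemp -d)
printf 'first\nsecond\nthird\n' > "$dir/foo.txt"
keep_first_line "$dir/foo.txt"
cat "$dir/foo.txt"
```

Keep in mind that the renamed file is a new file: it gets the temporary file's permissions, and any hard links to the original still point at the old contents.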

edited Jun 21 '12 at 12:53

answered Jun 20 '12 at 17:47


Interesting. How would you remove all but the first line of a file, since head -1 foo.txt > foo.txt won't work? – Nathan Long Jun 20 '12 at 17:55
@NathanLong Off the top of my head, I'd just use a different temporary file for the output and then rename it. Or is this something you are going to do over? – Levon Jun 20 '12 at 18:00
It's not something I'll do a lot. I just wondered if I was missing some obvious, better way. – Nathan Long Jun 20 '12 at 18:02
This works: head -1 foo.txt | tee foo.txt – Nathan Long Jun 20 '12 at 18:05
@NathanLong I just posted a solution using sed that will fit your requirements too. – Levon Jun 20 '12 at 18:06
Regarding @Levon's temporary file. For 1 line of text a simple shell variable is enough: IFS= read -r < foo.txt && echo "$REPLY" > foo.txt. This is faster than the sed solution and more reliable than @NathanLong's tee workaround, which works only on small files. Regarding the sed solution, sed -i 'q' foo.txt is shorter and faster. – manatwork Jun 20 '12 at 18:22
@manatwork thanks for the additional information, always good to add to the toolbox. – Levon Jun 20 '12 at 18:35
Please don't recommend sed -i. It's not portable and probably doesn't even work correctly on links. Non-portable alternatives include cmd file | sponge file. Portable alternatives include ed (ed is a file editor; sed is a stream editor, not a file editor), which has more or less the same commands as sed, or a temporary file. – jw013 Jun 20 '12 at 21:59
why does tee work here ? from this SO discussion of tee vs. sponge it seems like tee would have the same issues as just redirecting straight to the file. – orion elenzil Dec 03 '19 at 17:12

score 3 · Answer 2 · answered Jun 20 '12 at 17:47

Because the shell that you use to invoke cat does the redirection indicated by >.

The shell (bash, zsh, ksh, dash, whatever) reads the command cat foo.txt > foo.txt. Before it runs cat, the shell has to set up the redirection indicated by > foo.txt. > means to truncate the file and start writing from the top; >> would mean to append to foo.txt.

By the time the shell actually gets cat running, the contents of foo.txt have already been discarded, so cat has nothing to read.
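
You can check that the shell, and not the command, is responsible by redirecting a command that produces no output at all. A quick sketch, assuming a POSIX shell with mktemp available; the scratch path is made up for the demo.

```shell
# Scratch directory is hypothetical, chosen for this demo.
dir=$(mktemp -d)
printf 'some content\n' > "$dir/foo.txt"

# "true" writes nothing and never opens foo.txt itself;
# the shell alone opens and truncates the file for the redirection.
true > "$dir/foo.txt"

# -s is true only if the file exists and is larger than zero bytes.
if [ -s "$dir/foo.txt" ]; then
    result="still has content"
else
    result="empty"
fi
echo "foo.txt is $result"
```

This prints "foo.txt is empty", even though true never touched the file.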
