24

Do these two commands differ in how they go about zeroing out a file? Is the latter just a shorter way of doing the former? What is happening behind the scenes?

Both

$ cat /dev/null > file.txt

$ > file.txt 

yield

-rw-r--r--  1 user  wheel  0 May 18 10:33 file.txt
KM.

5 Answers

29

cat /dev/null > file.txt is a useless use of cat.

Basically cat /dev/null simply results in cat outputting nothing. Yes it works, but it's frowned upon by many because it results in invoking an external process that is not necessary.
It's one of those things that is common simply because it's common.

Using just > file.txt will work in most Bourne-like shells, but it's not completely portable. If you want something completely portable, the following are good alternatives:

true > file.txt
: > file.txt

Both : and true output no data, and are builtins in most Bourne-like shells (whereas cat is an external utility), thus they are lighter and more 'proper'.

 

Update:

As tylerl mentioned in his comment, there is also the >| file.txt syntax.

Many shells have a setting (noclobber) that prevents them from truncating an existing file via >. When it is enabled, you must use >| instead. This is to prevent human error when you really meant to append with >>. You can turn the behavior on with set -C.
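A minimal sketch of that behavior, assuming a POSIX-style shell with noclobber support and a mktemp utility (which POSIX does not require). The names demo_file, plain_result and size_after_override are made up for this illustration:

```shell
# Sketch only: demo_file is a throwaway temp file used for illustration.
demo_file=$(mktemp)
printf 'some data\n' > "$demo_file"

set -C                      # turn noclobber on

# Try the plain redirection in a subshell: ":" is a special builtin, so a
# failed redirection may terminate a POSIX shell.
if ( : > "$demo_file" ) 2>/dev/null; then
    plain_result=truncated
else
    plain_result=refused
fi

: >| "$demo_file"           # >| truncates even with noclobber on
size_after_override=$(( $(wc -c < "$demo_file") ))

set +C                      # restore the default
echo "plain >: $plain_result; size after >|: $size_after_override"
rm -f "$demo_file"
```

The subshell is there only so that a refused redirection cannot take the calling shell down with it.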

So with this, I think the simplest, most proper, and portable method of truncating a file would be:

:>| file.txt
phemmer
  • 3
    The colon command is defined in POSIX. It is a null operation that exists to expand commandline args. – kojiro May 19 '14 at 15:26
  • @kojiro actually it's not implementation specific. POSIX very explicitly states that true is a shell builtin. – phemmer May 19 '14 at 15:57
  • 4
    LOL, "cat abuse" – KM. May 19 '14 at 16:04
  • 2
    @kojiro : is also mandated by POSIX to be built-in, and in fact is different from true in that it is considered a "special" built-in. – jw013 May 19 '14 at 16:35
  • @jw013 I think "special" in the context on that page means it's a "special utility", not a "special built-in". At least I don't see it mentioning any difference between a special built-in and a non-special built-in. – phemmer May 19 '14 at 16:38
  • @Patrick The phrase "special utility" does not appear anywhere. The link I gave explains the difference between a special built-in and a regular built-in: however, the special built-in utilities described here differ from regular built-in utilities in two respects: ... What do you mean by special utility? – jw013 May 19 '14 at 16:52
  • 2
    don't forget about noclobber. >| file is a more explicit truncate. – tylerl May 19 '14 at 18:40
  • So to clear a file, you just have to apply the right smiley :> – celtschk May 19 '14 at 19:16
  • > file doesn't work in csh, tcsh, fish or zsh (except in sh/ksh emulation) so I wouldn't say it works in most shells. It works in most shells of the Bourne and rc families. – Stéphane Chazelas May 19 '14 at 19:49
  • 1
    No, true is not required to be builtin, and traditionally it wasn't. : is built in to all the shells of the Bourne family. : is a special builtin per POSIX (so : > file will exit the shell, for instance, if file can't be opened for writing in POSIX shells) and true is not. POSIX even mentions that : may be more efficient than true on some systems. – Stéphane Chazelas May 19 '14 at 19:53
  • No, ksh doesn't have that by default. The POSIX spec is based on ksh, so if it were in ksh, it would be the case in all POSIX shells. set -o noclobber (aka set -C) and >| are POSIX, but are found in neither Bourne nor csh, tcsh, rc, es or fish, so you can't really say it's portable. – Stéphane Chazelas May 19 '14 at 20:15
  • @StephaneChazelas What do you make of 1.c in Command Search and Execution? The way I interpret it, true must bypass PATH lookup and unless a shell hardcodes a fixed path to an external true binary, the only way it can meet that requirement is to make the listed utilities built-in. – jw013 May 20 '14 at 21:26
  • 1
    @jw013, in practice, all the POSIX shells I know have true builtin anyway and like you I don't see the point of executing a hardcoded true, but think of busybox as a possible counter-example. Note that very few shells implement that section of the spec (they generally still run [ or echo for instance even if they're not found in $PATH). I find it strange that false, true are included in that list and not type. – Stéphane Chazelas May 21 '14 at 21:50
25

In terms of portability:

                      Bourne POSIX  zsh    csh/tcsh  rc/es  fish
> file                Y      Y      N(1)   N(1)      N      N
: > file              N/Y(2) Y(3)   Y      Y(4)      N(5)   N(5)
true > file           Y(5)   Y      Y      Y(5)      Y(5)   Y(5)
cat /dev/null > file  Y(5)   Y      Y(5)   Y(5)      Y(5)   Y(5)
eval > file           Y(3,8) Y(3)   Y      Y(6)      Y      Y
cp /dev/null file (7) Y(5)   Y      Y(5)   Y(5)      Y(5)   Y(5)
printf '' > file      Y(5)   Y      Y      Y(5)      Y(5)   Y

Notes:

  1. Except in sh or ksh emulation. For redirections without a command, zsh assumes a default command (a pager when only stdin is redirected, cat otherwise), which can be tuned with the NULLCMD and READNULLCMD variables. That is inspired by the similar feature in (t)csh.
  2. Redirections were initially not performed for : in UnixV7, as : was interpreted half-way between a comment leader and a null command. Later they were performed and, as for all builtins, a failed redirection exits the shell.
  3. : and eval being special built-ins, if the redirection fails, that exits the shell (bash only does that in POSIX mode).
  4. Interestingly, in (t)csh, that's defining a null label (for goto), so goto '' there would branch there. If the redirection fails, that exits the shell.
  5. No unless (N), or yes only if (Y), the corresponding command is available in $PATH (: generally isn't; true, cat, cp and printf generally are, as POSIX requires them).
  6. If the redirection fails, that exits the shell.
  7. If file is a symlink to a non-existent file, however, some cp implementations like GNU's will refuse to create it.
  8. The initial versions of the Bourne shell didn't support redirecting builtins, though.
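As a rough check of the Bourne/POSIX columns only, the idioms from the table can be run one after another against a scratch file. This is a sketch under the assumption of a Bourne-like shell (it says nothing about zsh, csh, rc or fish) and of a mktemp utility; work_dir, target, check and all_empty are hypothetical names:

```shell
# Sketch for Bourne-like shells only; work_dir is a scratch directory.
work_dir=$(mktemp -d)
target=$work_dir/file
all_empty=yes

check() {
    # $1 is a label; the file must exist and be zero bytes long.
    if [ -f "$target" ] && [ ! -s "$target" ]; then
        echo "ok:   $1"
    else
        echo "FAIL: $1"
        all_empty=no
    fi
    printf 'refill\n' > "$target"   # put data back for the next idiom
}

printf 'refill\n' > "$target"
> "$target";               check '> file'
: > "$target";             check ': > file'
true > "$target";          check 'true > file'
cat /dev/null > "$target"; check 'cat /dev/null > file'
eval > "$target";          check 'eval > file'
cp /dev/null "$target";    check 'cp /dev/null file'
printf '' > "$target";     check "printf '' > file"

rm -rf "$work_dir"
```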

In terms of legibility:

(this section is highly subjective)

  • > file. That > looks too much like a prompt or a comment. Also, the question I'll ask when reading that (and most shells will complain about the same) is: what output exactly are you redirecting?
  • : > file. : is known as the no-op command. So that reads straight away as generating an empty file. However, here again, that : can easily be missed and/or seen as a prompt.
  • true > file. What do booleans have to do with redirection or file content? What is meant here? That is the first thing that comes to my mind when I read that.
  • cat /dev/null > file. Concatenate /dev/null into file? cat being often seen as the command to dump the content of the file, that can still make sense: dump the content of the empty file into file, a bit like a convoluted way to say cp /dev/null file but still understandable.
  • cp /dev/null file. Copies the content of the empty file to file. Makes sense, though someone who doesn't know what cp does by default might think you're trying to make file a null device as well.
  • eval > file or eval '' > file. Runs nothing and redirects its output to a file. Makes sense to me. Strange that it's not a common idiom.
  • printf '' > file: explicitly prints nothing into a file. The one that makes most sense to me.

In terms of performance

The difference is going to be whether we're using a shell builtin or not. If not, a process has to be forked, the command loaded and executed.

eval is guaranteed to be a builtin in all shells. : is built in wherever it's available (Bourne/csh likes). true is a builtin in Bourne-like shells only.

printf is built in to most modern Bourne-like shells and fish.

cp and cat generally are not built-in.
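To see what a particular shell does, one can ask it how each name resolves. The sketch below assumes a POSIX shell, where command -v prints an absolute path for an external utility and the bare name for a builtin; kind_of is a hypothetical helper, and its "builtin" label loosely covers anything shell-internal (functions and keywords would be reported the same way):

```shell
# kind_of NAME: report whether the shell resolves NAME to a path on disk
# or to something internal. Hypothetical helper, for illustration only.
kind_of() {
    resolved=$(command -v "$1") || { echo missing; return; }
    case $resolved in
        /*) echo external ;;   # absolute path: found via $PATH
        *)  echo builtin ;;    # bare name: builtin (or function/keyword)
    esac
}

for cmd in : eval true printf cp cat; do
    printf '%-7s %s\n' "$cmd" "$(kind_of "$cmd")"
done
```

The output differs from shell to shell, which is the point of running it.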

Now cp /dev/null file does not invoke shell redirections, so things like:

find . -exec cp /dev/null {} \;

are going to be more efficient than:

find . -exec sh -c '> "$1"' sh {} \;

(though not necessarily than:

find . -exec sh -c 'for f do : > "$f"; done' sh {} +

).
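A self-contained way to try the batched form, assuming mktemp -d is available. The directory tree and file names are invented for the illustration, and -type f is added here so that find only hands regular files to the inner shell:

```shell
# Build a small throwaway tree (hypothetical names).
tree=$(mktemp -d)
mkdir "$tree/sub"
printf 'a\n' > "$tree/one.log"
printf 'b\n' > "$tree/sub/two.log"

# One sh invocation handles as many files as fit on its command line.
find "$tree" -type f -exec sh -c 'for f do : > "$f"; done' sh {} +

# Count regular files that still occupy more than zero blocks.
non_empty=$(find "$tree" -type f -size +0 | wc -l)
non_empty=$(( non_empty ))
echo "files still holding data: $non_empty"
rm -rf "$tree"
```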

Personally

Personally, I use : > file in Bourne-like shells, and don't use anything other than Bourne-like shells these days.

5

You might want to look at truncate, which does exactly that: truncate a file.

For example:

truncate --size 0 file.txt

This is probably slower than using true > file.txt.

My main point however is: truncate is intended for truncating files, while using > has the side effect of truncating a file.
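Since truncate is not available everywhere, a script could fall back to a redirection when it is missing. This is only a sketch: it uses the short -s option rather than the GNU-style --size, and log_copy, method and final_size are hypothetical names:

```shell
# Throwaway file standing in for the file to be emptied.
log_copy=$(mktemp)
printf 'old contents\n' > "$log_copy"

if command -v truncate >/dev/null 2>&1; then
    truncate -s 0 "$log_copy"   # dedicated tool, when installed
    method=truncate
else
    : > "$log_copy"             # fallback: truncation via redirection
    method=redirection
fi

final_size=$(( $(wc -c < "$log_copy") ))
echo "emptied via $method, size is now $final_size"
rm -f "$log_copy"
```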

Fabian
  • 2
    Truncate is nice when you want to truncate a file to something other than 0. That said, even without a shell is a strange statement: can you describe a context where truncate would be available, but neither > nor unistd C libraries would be available? – kojiro May 19 '14 at 15:30
  • Not really. There is probably a more elegant solution for every script or programming language available. – Fabian May 19 '14 at 15:41
  • 3
    truncate is a FreeBSD utility, relatively recently (2008) added to the GNU coreutils (though the --size GNU long option style is GNU specific), so it's not available in non-GNU-or-FreeBSD systems, and it's not available in older GNU systems, I wouldn't say it's portable. cp /dev/null file would work without a shell redirection and would be more portable. – Stéphane Chazelas May 19 '14 at 20:02
  • Okay, I will remove that portability comment. Though your definition of recent seems to differ. – Fabian May 19 '14 at 21:04
2

The answer depends a bit on what file.txt is, and how processes write to it!

I'll cite a common use case: you have a growing logfile called file.txt, and want to rotate it.

Therefore you copy, for example, file.txt into file.txt.save, then truncate file.txt.

In this scenario, if the file is not held open by another process (for example, a program logging to that file), then your two proposals are equivalent, and both work well (but the second is preferred, as the first, cat /dev/null > file.txt, is a Useless Use of Cat and also opens and reads /dev/null).

But the real trouble would be if the other_process is still active, and still has an open handle going to the file.txt.

Then two main cases arise, depending on how the other process opened the file:

  • If other_process opened it in the normal way, its handle still points to the former location in the file, for example offset 1200 bytes. The next write therefore starts at offset 1200, and you again have a file of 1200 bytes (plus whatever other_process wrote), with 1200 leading null characters! Not what you want, I presume.

  • If other_process opened file.txt in "append mode", then each time it writes, the pointer first seeks to the end of the file. Therefore, after you truncate the file, the next write seeks to byte 0 and you avoid the bad side effect! This is what you want (... usually!)

Note that this means that, when you truncate a file, you need to make sure that every other_process still writing to it has opened it in "append" mode. Otherwise you'll need to stop those processes and start them again, so that they start writing at the beginning of the file instead of at the former location.
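The two cases can be reproduced from a single shell by keeping a file descriptor open across the truncation. This is a sketch, assuming a Bourne-like shell and mktemp; file descriptor 3 plays the role of other_process, and the variable names are made up:

```shell
normal=$(mktemp)
append=$(mktemp)

# Case 1: descriptor opened in the normal way (>).
exec 3> "$normal"
printf '1234567890' >&3     # offset of fd 3 is now 10
: > "$normal"               # truncate behind the writer's back
printf 'abc' >&3            # written at offset 10, leaving a 10-byte hole
exec 3>&-

# Case 2: descriptor opened in append mode (>>).
exec 3>> "$append"
printf '1234567890' >&3
: > "$append"
printf 'abc' >&3            # append mode writes at the current end: offset 0
exec 3>&-

normal_size=$(( $(wc -c < "$normal") ))
append_size=$(( $(wc -c < "$append") ))
echo "normal mode: $normal_size bytes"
echo "append mode: $append_size bytes"
rm -f "$normal" "$append"
```

The normally opened file ends up 13 bytes long (10 null bytes followed by abc), while the file opened in append mode holds only the 3 bytes written after the truncation.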

References : https://stackoverflow.com/a/16720582/1841533 for a cleaner explanation, and a nice short example of difference between normal and append mode logging at https://stackoverflow.com/a/984761/1841533

  • 3
    Very little of this answer actually is relevant to or answers the question. The difference between a cat /dev/null > file and a > file is a cat /dev/null and that makes no difference to the file. – jw013 May 19 '14 at 17:05
  • @jw013: True! But I just wanted to take the question's opportunity to re-state the "what you want/not what you want" information, as it's not very well known, and could hit hard someone trying to rotate logs (a common case where you want to truncate a file). – Olivier Dulac May 20 '14 at 07:29
  • 1
    There is a time and a place for everything. Your information may be useful in some other context but it does not belong here - you should find a more appropriate place for it because nobody trying to rotate logs is going to look in this completely unrelated redirection question. Here your answer is the equivalent of a digital weed, just as an otherwise useful pumpkin plant in the middle of a cornfield would be considered a weed. – jw013 May 20 '14 at 14:53
1

I like this and use it often because it looks cleaner and not like somebody hit the return key by accident:

echo -n "" > file.txt

Should be a built-in too?

awsm
  • 3
    There are many ways to zero-out a file. I think KM. was only interested in understanding the difference between the two methods shown in the question. – drs May 19 '14 at 18:53
  • 6
    Many echo implementations don't support -n (and would output -n<SPC><NL> here). printf '' > file.txt would be more portable (at least across modern/POSIX systems). – Stéphane Chazelas May 19 '14 at 20:03