6
cat a > b

and

cp a b

If they are functionally the same for all intents and purposes, which one is faster?

Steven Lu

4 Answers

6

In terms of functionality I think they are the same.

If I had to guess which is faster, I would say cp, because copying files is its sole purpose, so it is likely to be optimized for that.

cat, by contrast, is meant to concatenate files, that is, to join several files into one stream. If only a single file is given, it simply writes that file to standard output, which is normally the console (thanks to @bahamat for reminding us). In this example the output is redirected to another file. I think this indirection would be less efficient than a direct cp.

I don't know if the difference would be noticeable for regular sized files, though it would be interesting to time these on very large files. I guess one could do repeated trials with /usr/bin/time and see if one is consistently faster/slower than the other.
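The repeated trials suggested above could be sketched roughly like this. The 16 MiB file size, the run count, and all file names are arbitrary assumptions for illustration, and the script falls back to running untimed if /usr/bin/time is not installed.

```shell
#!/bin/sh
# Sketch: time repeated cp and cat copies of the same file.
# File size (16 MiB), run count, and file names are arbitrary assumptions.
workdir=$(mktemp -d)
src="$workdir/src.bin"
runs=5

# Build a 16 MiB test file of zero bytes.
dd if=/dev/zero of="$src" bs=1048576 count=16 2>/dev/null

# Run a command under /usr/bin/time -p when available, untimed otherwise.
run_timed() {
    label=$1
    shift
    echo "== $label =="
    if [ -x /usr/bin/time ]; then
        /usr/bin/time -p "$@"
    else
        echo "(/usr/bin/time not found, running untimed)"
        "$@"
    fi
}

# Both variants go through "sh -c" so they pay the same startup cost;
# the redirection for cat has to be performed by a shell anyway.
run_timed "cp x $runs" sh -c '
    i=0
    while [ "$i" -lt "$3" ]; do cp "$1" "$2"; i=$((i + 1)); done
' sh "$src" "$workdir/copy_cp.bin" "$runs"

run_timed "cat x $runs" sh -c '
    i=0
    while [ "$i" -lt "$3" ]; do cat "$1" > "$2"; i=$((i + 1)); done
' sh "$src" "$workdir/copy_cat.bin" "$runs"

# Sanity check: both copies must match the source byte for byte.
if cmp -s "$src" "$workdir/copy_cp.bin" && cmp -s "$src" "$workdir/copy_cat.bin"; then
    result=identical
else
    result=different
fi
echo "copies are $result"
rm -rf "$workdir"
```

Expect the numbers to move around with cache state, filesystem, and disk type, so a single run in either direction does not prove much.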

Do you have any particular reason to ask about this, or is it just simple curiosity (nothing wrong with that at all, of course)?

Levon
  • I'm also wondering if there are some strange cases where cat might modify the contents as it passes it through whereas cp wouldn't. I don't know enough about the details of what goes on here. What drove me to ask the question is I saw in one of my shell scripts two tasks accomplishing the same thing but using these different methods and I started wondering what the difference was between them, and if I had that in mind when I wrote them. – Steven Lu Jun 30 '12 at 02:38
  • 3
    There is 1 important functional difference between the 2; permissions. cp will try and keep the same file permissions (mode) on the new file as are on the original. Obviously this may not be possible due to filesystem restrictions or other factors though. – phemmer Jun 30 '12 at 02:39
  • Good point about cat .. since its main purpose is to display data, perhaps it would tweak the data for display purposes ever so slightly? I guess I would use cp for copy, not cat – Levon Jun 30 '12 at 02:39
  • Won't cat stop at an EOF character? – James McLeod Jun 30 '12 at 04:15
  • 3
    cat is not to display a file to the screen. cat is short for "concatenate", which is defined as "to unite in a series or chain". The original purpose of cat was to combine multiple files into one (e.g., cat file1 file2). It just so happens that if you combine one file with nothing it prints only that one. – bahamat Jun 30 '12 at 06:13
  • 2
    @Patrick cp will keep the same permissions only if -p option is used. Otherwise it will create absolutely new file according to umask. Issue about timestamps is the same. – rush Jun 30 '12 at 08:09
  • 2
    @JamesMcLeod there is no such thing as an EOF that cat responds to when reading from a file. @Steven cat is designed to be perfectly transparent. It doesn't change anything. – jippie Jun 30 '12 at 08:11
  • Exactly. End of file is signalled out-of-band (comparing the file position against the file size). The C EOF return value is -1 and is only a special return flag for some C functions, not a value encountered in a file. – Alexios Jun 30 '12 at 10:22
  • @bahamat I knew that cat stood for concatenate (it's right there on the man page), though I don't think I knew about its original purpose to join multiple files into one - thanks for the information. – Levon Jun 30 '12 at 10:32
  • @bahamat I updated my answer with your information, thanks for helping to improve the information presented. – Levon Jun 30 '12 at 10:39
  • @rush from the man page: preserve the specified attributes (default: mode,ownership,timestamps). Note the default part. And I've even tested it: it preserves the mode without passing the argument, and no, I don't have an alias set. – phemmer Jun 30 '12 at 17:45
  • @Gilles: Yeah, in 1971, more than 2 years after UNIX was created (though, not necessarily cat). The 1st edition manual merely documented what was already in use, not the original intention of the authors (which is probably lost). – bahamat Jul 01 '12 at 23:42
  • @bahamat The 1st edition man page is the intention of the authors, the manual was written by dmr and kt. Admittedly it may not be their original intention, but rather what they concluded after some initial use. – Gilles 'SO- stop being evil' Jul 01 '12 at 23:45
  • @Gilles: exactly my point, it is "what they concluded after some initial use". Sometimes the secondary use of something is far more beneficial than the original intention. Unix itself was written to play Star Trek. I'm glad they found it had other uses too :-) – bahamat Jul 01 '12 at 23:49
4

Functionally similar, but different in the details. Essentially, they both read a chunk of data from the first file and write it to another file.

When I do an strace on Linux:

$ strace cat /etc/fstab > test.txt
...
open("/etc/fstab", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=691, ...}) = 0
fadvise64_64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "# /etc/fstab: static file system"..., 32768) = 691
write(1, "# /etc/fstab: static file system"..., 691) = 691
read(3, "", 32768)                      = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?

$ strace cp /etc/fstab test.log
...
open("/etc/fstab", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=691, ...}) = 0
open("test.log", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0644) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(3, "# /etc/fstab: static file system"..., 32768) = 691
write(4, "# /etc/fstab: static file system"..., 691) = 691
read(3, "", 32768)                      = 0
close(4)                                = 0
close(3)                                = 0
_llseek(0, 0, 0xbfaa3fb0, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
sybreon
  • Did you forget the -f in your strace of the cat-command, or where is the open for FD 1? – Nils Jun 30 '12 at 20:31
  • @Nils there is no open for FD 1, it's already open when the program is executed (programs inherit open file descriptors, and FD 1 is already opened by the shell). – phemmer Jun 30 '12 at 20:56
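A small sketch of the point about FD 1: the shell creates and opens the redirect target itself, before the command runs. Here `true` writes nothing and opens no files, yet the target exists afterwards. The file name is a made-up example.

```shell
#!/bin/sh
# Sketch: the shell, not the command, opens the redirection target.
workdir=$(mktemp -d)
out="$workdir/out.txt"   # hypothetical file name

# "true" produces no output and opens no files of its own.
true > "$out"

if [ -e "$out" ]; then
    created=yes
else
    created=no
fi
echo "redirect target created by the shell: $created"
rm -rf "$workdir"
```

This is why the strace of cat shows a write to FD 1 but no matching open.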
3

There isn't much difference: both copy the content of the old file into a new file with the same content. Both overwrite the target if it is an existing file.

Some old systems might stop copying or truncate lines if you try to copy binary files with cat, because they might choke on null characters. I don't think any unix system you're likely to encounter now has a problem there. cp is guaranteed not to have a problem.
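To check this on a given system, one could push a few awkward bytes through both commands and compare the results. This is only a sketch: the byte values (NUL, Ctrl-Z, Ctrl-D) and the file names are picked for illustration.

```shell
#!/bin/sh
# Sketch: verify that cat and cp both copy "odd" bytes unchanged.
workdir=$(mktemp -d)
src="$workdir/src.bin"

# Bytes written: 'a', NUL, 'b', 0x1A (Ctrl-Z), 'c', 0x04 (Ctrl-D), 'd', newline.
printf 'a\000b\032c\004d\n' > "$src"

cat "$src" > "$workdir/via_cat.bin"
cp "$src" "$workdir/via_cp.bin"

# cmp -s is silent and exits 0 only when the files are byte-identical.
if cmp -s "$src" "$workdir/via_cat.bin" && cmp -s "$src" "$workdir/via_cp.bin"; then
    binary_safe=yes
else
    binary_safe=no
fi
size=$(( $(wc -c < "$src") ))
echo "source is $size bytes; both copies identical: $binary_safe"
rm -rf "$workdir"
```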

cp allows you to specify a directory as the destination: the file is copied to have the same name as the original, in the new directory.

If the destination doesn't exist, cp uses the permission bits of the source file, modified by the current umask.
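A quick way to see the permission difference, assuming a umask of 022 and a source file with mode 755 (both chosen just for this sketch, as are the file names):

```shell
#!/bin/sh
# Sketch: compare the mode of a new file made by cp with one made by cat >.
workdir=$(mktemp -d)
old_umask=$(umask)
umask 022                      # assumed umask for the demo

src="$workdir/script.sh"
printf '#!/bin/sh\necho hello\n' > "$src"
chmod 755 "$src"

cp "$src" "$workdir/via_cp.sh"         # new file takes the source's mode bits
cat "$src" > "$workdir/via_cat.sh"     # new file is created by the shell

ls -l "$workdir"

if [ -x "$workdir/via_cp.sh" ]; then cp_exec=yes; else cp_exec=no; fi
if [ -x "$workdir/via_cat.sh" ]; then cat_exec=yes; else cat_exec=no; fi
echo "executable after cp: $cp_exec, after cat: $cat_exec"

umask "$old_umask"
rm -rf "$workdir"
```

With these settings the cp copy stays executable, while the file created by the redirection does not.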

You can protect against overwriting the target file when doing cat … >target by setting the noclobber option in the shell with set -C. You can protect against overwriting the target file with cp by passing the -i option (alias cp='cp -i'); cp will ask for confirmation.
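The noclobber behaviour can be tried out with something like the following sketch (file names and contents are made up). cp -i is left out because it prompts interactively.

```shell
#!/bin/sh
# Sketch: with noclobber set, "cat ... > target" refuses to overwrite.
workdir=$(mktemp -d)
target="$workdir/target.txt"
echo "original" > "$target"
echo "new" > "$workdir/new.txt"

set -C    # enable noclobber
# The braces let us discard the shell's own error message.
if { cat "$workdir/new.txt" > "$target"; } 2>/dev/null; then
    overwritten=yes
else
    overwritten=no
fi
set +C    # restore the default behaviour

content=$(cat "$target")
echo "overwritten: $overwritten, target still contains: $content"
rm -rf "$workdir"
```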

It is often useful to preserve the date of the original file. You can use cp -p for that.
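As a rough illustration of cp -p, the sketch below backdates a file and compares a plain copy with a preserved one. The date and file names are arbitrary, and the `-nt` test is an extension that bash, dash, and ksh support rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: cp -p keeps the source's modification time, plain cp does not.
workdir=$(mktemp -d)
src="$workdir/old.txt"
echo "data" > "$src"
touch -t 200001010000 "$src"   # backdate mtime to 2000-01-01 00:00 (arbitrary)

cp "$src" "$workdir/plain_copy.txt"
cp -p "$src" "$workdir/preserved_copy.txt"

# "a -nt b" is true when a has a newer modification time than b.
if [ "$workdir/plain_copy.txt" -nt "$src" ]; then
    plain_is_newer=yes
else
    plain_is_newer=no
fi
if [ "$workdir/preserved_copy.txt" -nt "$src" ] || [ "$src" -nt "$workdir/preserved_copy.txt" ]; then
    preserved_same_time=no
else
    preserved_same_time=yes
fi
echo "plain copy newer than source: $plain_is_newer"
echo "cp -p copy has the same mtime: $preserved_same_time"
rm -rf "$workdir"
```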

Performance will vary, depending on the size of the file, the filesystem, the kind of source and target disk, the operating system, etc. For raw disk copies under Linux, I found next to no difference.

0

In a single quick run on a small (4 KB) file, cat looks slightly faster than cp:

root@SHW:/tmp# time cp debug.log test1
real    0m0.021s
user    0m0.000s
sys     0m0.000s
root@SHW:/tmp# time cat debug.log > test2
real    0m0.013s
user    0m0.000s
sys     0m0.000s
root@SHW:/tmp# du -h debug.log 
4.0K    debug.log
root@SHW:/tmp# file debug.log
debug.log: ASCII text
SHW