cat a > b
and
cp a b
If they are functionally the same for all intents and purposes, which one is faster?
In terms of functionality I think they are the same.
If I had to venture a guess about which is faster, I would say cp, because copying files is its sole purpose, so it would be optimized for that.
cat, by contrast, is meant to concatenate files, that is, to join several files in series. Given a single file, it simply writes that file to standard output (thanks to @bahamat for reminding us). In this example the output is redirected to another file. I think this indirection would be less efficient than a direct cp.
I don't know if the difference would be noticeable for regular-sized files, though it would be interesting to time these on very large files. I guess one could do repeated trials with /usr/bin/time and see if one is consistently faster or slower than the other.
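A minimal sketch of such repeated trials, assuming a POSIX shell. The temporary file names, the 16 MiB size and the three-trial count are arbitrary choices for illustration, not something from the question:

```shell
#!/bin/sh
# Sketch only: file names, size and trial count are illustrative assumptions.
src=$(mktemp)
dst_cp=$(mktemp)
dst_cat=$(mktemp)
trap 'rm -f "$src" "$dst_cp" "$dst_cat"' EXIT

# Build a 16 MiB source file of zero bytes.
dd if=/dev/zero of="$src" bs=1048576 count=16 2>/dev/null

# Use /usr/bin/time when it is installed; otherwise the copies run untimed.
timer=""
if [ -x /usr/bin/time ]; then
    timer="/usr/bin/time -p"
fi

# Read the source once so that both commands start from a warm cache.
cat "$src" > /dev/null

for trial in 1 2 3; do
    echo "trial $trial: cp" >&2
    $timer cp "$src" "$dst_cp"
    echo "trial $trial: cat" >&2
    $timer cat "$src" > "$dst_cat"
done
```

The timings go to stderr. Alternating the two commands inside the loop, rather than running all cp trials first, is meant to keep caching from favoring whichever command happens to run second.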
Do you have any particular reason to ask about this, or is it just simple curiosity (nothing wrong with that at all, of course)?
cat might modify the contents as it passes it through whereas cp wouldn't. I don't know enough about the details of what goes on here. What drove me to ask the question is I saw in one of my shell scripts two tasks accomplishing the same thing but using these different methods and I started wondering what the difference was between them, and if I had that in mind when I wrote them.
– Steven Lu
Jun 30 '12 at 02:38
cp will try and keep the same file permissions (mode) on the new file as are on the original. Obviously this may not be possible due to filesystem restrictions or other factors though.
– phemmer
Jun 30 '12 at 02:39
cat .. since its main purpose is to display data, perhaps it would tweak the data for display purposes ever so slightly? I guess I would use cp for copy, not cat
– Levon
Jun 30 '12 at 02:39
cat is not to display a file to the screen. cat is short for "concatenate", which is defined as "to unite in a series or chain". The original purpose of cat was to combine multiple files into one (e.g., cat file1 file2). It just so happens that if you combine one file with nothing it prints only that one.
– bahamat
Jun 30 '12 at 06:13
cp will keep the same permissions only if the -p option is used. Otherwise it will create an entirely new file according to umask. The same goes for timestamps.
– rush
Jun 30 '12 at 08:09
cat responds to when reading from a file. @Steven cat is designed to be perfectly transparent. It doesn't change anything.
– jippie
Jun 30 '12 at 08:11
cat stood for concatenate (it's right there on the man page), though I don't think I knew about its original purpose of joining multiple files into one. Thanks for the information.
– Levon
Jun 30 '12 at 10:32
preserve the specified attributes (default: mode,ownership,timestamps). Note the default part. And I've even tested it: it preserves the mode without passing the argument, and no, I don't have an alias set.
– phemmer
Jun 30 '12 at 17:45
Functionally similar, but specifically different. Essentially, they both read a bunch of data from the first file and write it to another file.
When I do an strace on Linux:
$ strace cat /etc/fstab > test.txt
...
open("/etc/fstab", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=691, ...}) = 0
fadvise64_64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "# /etc/fstab: static file system"..., 32768) = 691
write(1, "# /etc/fstab: static file system"..., 691) = 691
read(3, "", 32768) = 0
close(3) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
$ strace cp /etc/fstab test.log
...
open("/etc/fstab", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=691, ...}) = 0
open("test.log", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0644) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(3, "# /etc/fstab: static file system"..., 32768) = 691
write(4, "# /etc/fstab: static file system"..., 691) = 691
read(3, "", 32768) = 0
close(4) = 0
close(3) = 0
_llseek(0, 0, 0xbfaa3fb0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
close(0) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
-f in your strace of the cat-command, or where is the open for FD 1?
– Nils
Jun 30 '12 at 20:31
There isn't much difference: both copy the content of the old file into a new file with the same content. Both overwrite the target if it is an existing file.
Some old systems might stop copying or truncate lines if you try to copy binary files with cat, because they might choke on null characters. I don't think any Unix system you're likely to encounter now has a problem there. cp is guaranteed not to have a problem.
cp allows you to specify a directory as the destination: the file is copied to have the same name as the original, in the new directory.
If the destination doesn't exist, cp uses the permission bits of the source file, modified by the current umask.
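To make the permission difference concrete, here is a small sketch, assuming a POSIX shell and cp. The scratch directory, file names and the 775 mode are made up for the example. A file created by the shell's > redirection starts from the default creation mode (0666) instead of the source's mode:

```shell
#!/bin/sh
# Sketch only: directory and file names are illustrative assumptions.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

umask 022
printf 'hello\n' > "$work/src"
chmod 775 "$work/src"                 # source is rwxrwxr-x

cp "$work/src" "$work/by_cp"          # new file: source mode filtered by umask
cat "$work/src" > "$work/by_cat"      # new file: 0666 filtered by umask

# Compare the mode column of the three files.
ls -l "$work/src" "$work/by_cp" "$work/by_cat"
```

With this umask, by_cp should keep the execute bits (rwxr-xr-x) while by_cat should not (rw-r--r--), even though the contents are identical.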
You can protect against overwriting the target file when doing cat … >target by setting the noclobber option in the shell with set -C. You can protect against overwriting the target file with cp by passing the -i option (alias cp='cp -i'); cp will ask for confirmation.
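A quick sketch of the noclobber behaviour, assuming a POSIX shell. The scratch directory, file names and the after_refused/after_forced variables are invented for the example. The >| operator is the standard way to override noclobber for a single redirection:

```shell
#!/bin/sh
# Sketch only: directory, file and variable names are illustrative assumptions.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

printf 'new\n' > "$work/src"
printf 'old\n' > "$work/target"

set -C    # noclobber: '>' refuses to overwrite an existing regular file
if ! cat "$work/src" > "$work/target"; then
    echo "redirection refused; target left untouched" >&2
fi
after_refused=$(cat "$work/target")

# '>|' overrides noclobber for this one redirection.
cat "$work/src" >| "$work/target"
after_forced=$(cat "$work/target")
set +C    # restore the default behaviour

echo "after refused write: $after_refused"
echo "after forced write:  $after_forced"
```

The shell itself prints the error for the refused redirection, because it opens the target before cat ever runs.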
It is often useful to preserve the date of the original file. You can use cp -p for that.
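As an illustration, assuming a POSIX shell, cp, touch and find (the scratch directory, file names and the chosen date are arbitrary), the following compares a plain copy with a cp -p copy of a file whose modification time has been set in the past:

```shell
#!/bin/sh
# Sketch only: directory and file names are illustrative assumptions.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

printf 'data\n' > "$work/src"
touch -t 200001010000 "$work/src"     # give the source an old modification time

cp "$work/src" "$work/plain"          # modification time is the time of the copy
cp -p "$work/src" "$work/preserved"   # modification time is taken from the source

# List whichever copies are newer than the source.
find "$work/plain" "$work/preserved" -newer "$work/src"
```

Only the plain copy should be listed as newer than the source; the cp -p copy carries the source's date.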
Performance will vary, depending on the size of the file, the filesystem, the kind of source and target disk, the operating system, etc. For raw disk copies under Linux, I found next to no difference.
Looks like cat is faster than cp
root@SHW:/tmp# time cp debug.log test1
real 0m0.021s
user 0m0.000s
sys 0m0.000s
root@SHW:/tmp# time cat debug.log > test2
real 0m0.013s
user 0m0.000s
sys 0m0.000s
root@SHW:/tmp# du -h debug.log
4.0K debug.log
root@SHW:/tmp# file debug.log
debug.log: ASCII text
Does time also include the > operation, or only the cat command itself?
– Bernhard
Jun 30 '12 at 11:11
cat first. Then cat is probably slower than cp. I suspect you are seeing a file being read from cache in the second command.
– Nils
Jun 30 '12 at 20:34