I had always assumed that tar was a compression utility, but I am unsure. Does it actually compress files, or is it, like an ISO file, just a file that holds other files?

-
Also on SuperUser. – allquixotic Apr 30 '14 at 17:38
4 Answers
Tar is an archiving tool (Tape ARchive): it only collects files and their metadata together and produces one file. If you want to compress that file later, you can use gzip/bzip2/xz. For convenience, tar provides arguments to compress the archive automatically for you. Check out the tar man page for more details.
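To make the archive/compress split concrete, here is a minimal sketch using Python's standard tarfile and gzip modules rather than the tar command itself. The file names and contents are made up for illustration. It builds an uncompressed archive in memory, then compresses it as a separate step.

```python
import gzip
import io
import tarfile

# Hypothetical sample data: three small, highly repetitive "files".
files = {f"file{i}.txt": b"hello tar\n" * 200 for i in range(3)}


def build_tar(members):
    """Return the bytes of an uncompressed tar archive holding `members`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buf.getvalue()


raw_size = sum(len(data) for data in files.values())
tar_bytes = build_tar(files)         # archiving only, no compression
gz_bytes = gzip.compress(tar_bytes)  # compression as a separate step

print("raw data:", raw_size, "bytes")
print("tar     :", len(tar_bytes), "bytes")
print("tar.gz  :", len(gz_bytes), "bytes")
```

With this sample data the plain archive comes out larger than the data it holds, because of headers and padding. Only the gzip step makes it smaller.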
-
A slight clarification on the answer: it is GNU tar that provides those extra compression arguments. For example, Solaris tar does not provide arguments for compression. – Tero Kilkanen Apr 29 '14 at 22:20
-
BSD tar provides an argument for compression as well, though it only accepts z and determines the compression method based on the extension, whereas GNU tar has separate zZjJ arguments for the different compression methods. – wingedsubmariner Apr 30 '14 at 00:59
-
@wingedsubmariner The BSD tar manpage doesn't say it supports -j, but it (at least on mac) does. – Kevin Apr 30 '14 at 01:13
-
@wingedsubmariner: I don't know if the BSD tar on Mac is modified by Apple or not, but it supports zZjJ as well. Even though the man page does not mention the -J flag, it actually accepts -J and outputs an xz file. – Siyuan Ren Apr 30 '14 at 02:58
-
Just read the BSD tar manpage, and it turns out I was mistaken: BSD tar uses separate zZjJ for compression just like GNU tar. However, it does automatically detect compression when decompressing, whereas GNU tar expects zZjJ then also. – wingedsubmariner Apr 30 '14 at 03:10
-
@wingedsubmariner: no; modern-ish versions of GNU tar decompress automatically without requiring the -zZjJ options. – Jonathan Leffler Apr 30 '14 at 04:02
-
@staticx: Which version of GNU tar are you running, and on which platform? – Jonathan Leffler Apr 30 '14 at 16:04
-
@JonathanLeffler: RHEL 5. tar (GNU tar) 1.23 Copyright (C) 2010 Free Software Foundation, Inc. – Engineer2021 Apr 30 '14 at 16:05
-
@JonathanLeffler: I did tar cvfz test.tar.gz test.c ; tar xvf test.tar.gz and got test.c back – Engineer2021 Apr 30 '14 at 16:07
-
@staticx: curious! GNU tar 1.26 on Ubuntu 12.04 doesn't, but I'm tolerably certain I'd have to go back further than 2010 to find a version that doesn't decompress at least some file types automatically. The gzip automatic decompression has been around a long time, AFAICR (meaning, mostly, I don't remember when it was added, but it was quite a long time ago). Periodically, new compression formats were released (.bz2, .lz, .xz, .7z) and for a while I needed to hold tar's hand with --use-compress-program=whatever as an option. The set of compression formats evolves, therefore. – Jonathan Leffler Apr 30 '14 at 16:11
-
@staticx: OK; that's consistent with 'decompresses automatically'. You do have to tell it which 'compress' to use (either by flag or possibly by file extension); that won't change. – Jonathan Leffler Apr 30 '14 at 16:12
-
@JonathanLeffler: Yes, sorry, I may have misconstrued your sentence. I thought you were implying that you had to use xvfz when in fact it will detect the file extension and try that. – Engineer2021 Apr 30 '14 at 16:12
-
@JonathanLeffler: This also works: tar cvfz test.tar ; tar xvf test.tar – Engineer2021 Apr 30 '14 at 16:19
-
@staticx: as a point of detail, it works by content rather than extension (or as well as extension). Try: tar -czf /tmp/junk.tar.bz2 *.* then file /tmp/junk.tar.bz2 and tar -tvf /tmp/junk.tar.bz2 – Jonathan Leffler Apr 30 '14 at 16:19
-
@JonathanLeffler: Right, I figured there is a header that it reads to determine the type, since relying on the .gz, .bz2, etc. is unreliable. So it will decompress automatically – Engineer2021 Apr 30 '14 at 16:20
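The "content rather than extension" point can be reproduced without GNU tar. The sketch below uses Python's tarfile module, whose "r:*" mode tries each supported compression on the data itself, so it illustrates the same idea but says nothing about how GNU tar is implemented. The file names mirror the ones in the comments and are otherwise arbitrary.

```python
import os
import tarfile
import tempfile

payload = b"int main(void) { return 0; }\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test.c")
    with open(src, "wb") as fh:
        fh.write(payload)

    # Deliberately misleading name: gzip data behind a .bz2 suffix.
    path = os.path.join(tmp, "junk.tar.bz2")
    with tarfile.open(path, "w:gz") as archive:
        archive.add(src, arcname="test.c")

    # gzip streams start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        magic = fh.read(2)

    # "r:*" picks the decompressor from the content, not the file name.
    with tarfile.open(path, "r:*") as archive:
        names = archive.getnames()

print("gzip magic bytes present:", magic == b"\x1f\x8b")
print("members:", names)
```

The archive opens fine despite the wrong suffix, because the reader identified the gzip stream from the data.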
tar produces archives; compression is a separate functionality. However, tar alone can reduce space usage when used on a large number of small files that are smaller than the filesystem's cluster size. If a filesystem uses 1 KB clusters, even a file that contains a single byte will consume 1 KB (plus an inode). A tar archive does not have this per-cluster overhead.
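A rough back-of-the-envelope model of that arithmetic is sketched below. It assumes the classic tar layout (a 512-byte header per member, data padded to 512 bytes, two zero blocks at the end, output rounded up to a 10240-byte record) and ignores inodes and other filesystem metadata. The function names and the sample workload are made up for illustration.

```python
def round_up(n, multiple):
    """Round n up to the next multiple of `multiple` (integer arithmetic)."""
    return -(-n // multiple) * multiple


def disk_usage(sizes, cluster):
    """Bytes consumed when every file is rounded up to whole clusters."""
    return sum(round_up(size, cluster) for size in sizes)


def tar_size(sizes, block=512, record=10240):
    """Approximate size of an uncompressed tar archive of the given files."""
    body = sum(block + round_up(size, block) for size in sizes)
    return round_up(body + 2 * block, record)


sizes = [100] * 1000  # hypothetical workload: 1000 files of 100 bytes each

print("data itself      :", sum(sizes))
print("on 4 KiB clusters:", disk_usage(sizes, 4096))
print("on 1 KiB clusters:", disk_usage(sizes, 1024))
print("in a tar archive :", tar_size(sizes))
```

Under these assumptions the archive is far smaller than the files on 4 KiB clusters. On 1 KiB clusters it roughly breaks even, because tar spends at least 1024 bytes (header plus one padded data block) on every non-empty file.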
BTW, an ISO file is not really "a file to hold files" - it's actually an image of an entire filesystem (one originally designed to be used on CDs) and thus its structure is considerably more complex.
-
@psusi So a file of 1 to 1023 bytes will always consume 1024 bytes, which wastes between 1023 bytes and 1 byte. – Shiplu Mokaddim May 14 '19 at 13:36
-
tar has significant alignment / block size overhead, due to its origin as a Tape Archiver. If a is an empty file, tar -cf a.tar a will create a 10240-byte file a.tar. You can use a hex editor or od to verify that most of the file is NUL (zero) bytes. – Clement Cherlin Sep 12 '22 at 15:59
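The same padding shows up when an archive is written with Python's tarfile module, which also pads its output to 10240-byte records. This is a sketch with Python's library, not the tar command, and the member name a is taken from the comment above.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as archive:
    # A TarInfo has size 0 by default, so this adds an empty member named "a".
    archive.addfile(tarfile.TarInfo("a"))

data = buf.getvalue()
header, padding = data[:512], data[512:]

print("archive size        :", len(data))
print("bytes after header  :", len(padding))
print("padding is all NULs :", padding == bytes(len(padding)))
```

Only the first 512 bytes carry the member's header. Everything after it is zero padding.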
The original UNIX tar command did not compress archives. As was mentioned in a comment, Solaris tar doesn't compress. Nor does HP-UX, nor AIX, FWIW. By convention, uncompressed archives end in .tar.
With GNU/Linux you get GNU tar. (You can install GNU tar on other UNIX systems.) By default it does not compress; however, it does compress the resulting archive with gzip (also by GNU) if you supply -z. The conventional suffix for gzipped files is .gz, so you'll often see tarballs (slang for a tar archive, usually implying it's been compressed) that end in .tar.gz. That ending implies tar was run, followed by gzip, e.g. tar cf - . | gzip -9v > archive.tar.gz
You'll also find archives ending in .tgz, e.g. tar czf archive.tgz .
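The two routes above (tar followed by gzip, versus a single tar invocation with -z) can be compared with a small sketch. It uses Python's tarfile and gzip modules as stand-ins for the command-line tools, and the member name and contents are invented for the example.

```python
import gzip
import io
import tarfile


def add_member(archive, name, data):
    """Add an in-memory file called `name` with contents `data`."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


# Two steps, in the spirit of: tar cf - . | gzip > archive.tar.gz
plain = io.BytesIO()
with tarfile.open(fileobj=plain, mode="w") as archive:
    add_member(archive, "notes.txt", b"some text\n")
two_step = gzip.compress(plain.getvalue())

# One step, in the spirit of: tar czf archive.tgz .
combined = io.BytesIO()
with tarfile.open(fileobj=combined, mode="w:gz") as archive:
    add_member(archive, "notes.txt", b"some text\n")
one_step = combined.getvalue()


def list_members(gz_bytes):
    """Undo the gzip layer, then read what remains as a plain tar archive."""
    inner = gzip.decompress(gz_bytes)
    with tarfile.open(fileobj=io.BytesIO(inner), mode="r:") as archive:
        return archive.getnames()


print(list_members(two_step))
print(list_members(one_step))
```

Both results are a gzip stream wrapped around an ordinary tar archive, which is why .tar.gz and .tgz files are interchangeable.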
Edit: www.linfo.org/tar.html reminded me that GNU tar supports much more functionality than merely compressing with gzip, and that the suffixes are more than plain conventions: they have built-in semantics. GNU tar also supports bzip2 (-j for .bz2) and old compress (-Z for .Z). Then I looked at the man page and was reminded that -a automatically selects the compression method based on the archive's suffix.
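The idea of choosing a compressor from the suffix can be sketched in a few lines. This is an illustration only: the table and the mode_for helper are hypothetical, they are not GNU tar's actual suffix list, and Python's tarfile has no support for the old .Z format.

```python
import tarfile

# Hypothetical suffix table mapping archive names to tarfile write modes.
SUFFIX_TO_MODE = {
    ".tar": "w",          # plain archive, no compression
    ".tar.gz": "w:gz",
    ".tgz": "w:gz",
    ".tar.bz2": "w:bz2",
    ".tar.xz": "w:xz",
}


def mode_for(filename):
    """Return the tarfile write mode implied by the archive's suffix."""
    for suffix, mode in SUFFIX_TO_MODE.items():
        if filename.endswith(suffix):
            return mode
    raise ValueError(f"no compression rule for {filename!r}")


for name in ("backup.tar", "backup.tgz", "backup.tar.bz2"):
    print(name, "->", mode_for(name))

# The returned mode can be passed straight to tarfile.open(name, mode_for(name)).
```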
One other nit. As the Linux tar man page says, GNU produces info pages, not man pages, so to learn all about GNU tar, run info tar

-
GNU tar still doesn't handle compression by itself; it just pipes to/from gzip, bzip2, compress and others. – ott-- Aug 06 '15 at 20:03
-
I had a look at the source. GNU tar handles compression! The implementation takes advantage of code reuse and sound UNIX user space architectural principles. "Just pipes" is understating the way compression is tightly integrated into the tool. The fact that it happens to fork helper programs is a technicality. If you want to defend "just pipes," then cite file names and line numbers and let's see which side the community takes. – tbc0 Aug 06 '15 at 21:15
-