19

The problem is that I have some database dumps that are either compressed or plain text. Nothing, such as the file extension, distinguishes them. Using zcat on the uncompressed files produces an error instead of the output.

Is there another cat-like tool that is smart enough to detect what type of input it gets?

rsk82
  • What file extension are they? Is there any way I could get some examples to play around with? – Seth May 25 '14 at 16:33
  • Just use zcat on only the compressed dumps. – mikeserv May 25 '14 at 16:36
  • Yeah, but I don't know which ones are compressed without manually checking; some of them are *.gz, some are not, and that is the problem. May I rephrase the question: how do I check whether a file is gzipped, and use that information in the next command? – rsk82 May 25 '14 at 16:39
  • @rsk82 But you just said they all have the same extension. So you mean they all have .gz but only some of those are actually compressed? The others are just plain text? – Seth May 25 '14 at 16:41
  • Well - that sounds like your problem. Whatever system you've set up that provides that kind of output needs revising. In the meantime, GNU grep can be told what to do if it encounters a binary file - and that might make a good filter for the cleanup. – mikeserv May 25 '14 at 16:45
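
The rephrased question in the comments above (how to check whether a file is gzipped and use that in the next command) could be sketched roughly as follows. This is only an illustration, not something proposed in the thread: the helper names is_gzip and show are made up, and the check only looks at the two-byte gzip magic number (1f 8b), so it does not recognise .Z or other formats.

```shell
# Hypothetical helper: succeeds if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    # od prints the first two bytes as hex; tr strips the surrounding whitespace
    magic=$(od -An -tx1 -N2 < "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Hypothetical helper: use the result of the check to pick the next command.
show() {
    if is_gzip "$1"; then
        gzip -dc < "$1"
    else
        cat < "$1"
    fi
}
```

Calling show on a dump then prints its contents either way. A plain-text file that happens to begin with those two bytes would be misdetected, which should be rare for SQL dumps.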

5 Answers

26

Just add the -f option.

$ echo foo | tee file | gzip > file.gz
$ zcat file file.gz
gzip: file: not in gzip format
foo
$ zcat -f file file.gz
foo
foo

(Use gzip -dcf instead of zcat -f if your zcat is not the GNU one (or a GNU-emulating one, as in modern BSDs) and only knows about .Z files.)

  • One correction: you need the c parameter for gzip to send the output to stdout: gzip -cdf. But zcat -f also works fine on Mac OS if you use stdin: zcat -f < file. – David Ongaro May 25 '14 at 23:58
  • Addendum: by using stdin zcat can only take one argument which kind of defeats the purpose of cat, so I think using gzip is the best solution (in fact zcat is just a hardlink to gzip but it seems to be more picky about the file extension when called as zcat in Mac OS). – David Ongaro May 26 '14 at 00:24
10

One portable, simple suggestion would be to use zgrep instead of zcat, and just use a search pattern that matches every line.

zgrep '$' some-file

Unlike zcat, zgrep will happily handle uncompressed files. From man zgrep:

zgrep - search possibly compressed files for a regular expression
godlygeek
  • What do you mean by portable? If you mean that it can be installed and used on any system, then isn't it as portable as any other? – mikeserv May 25 '14 at 17:16
  • I mean that zcat and zgrep are normally packaged together, so this ought to work anywhere where zcat was available to begin with. And it's agnostic to the particular shell being used - ought to work fine in bash, zsh, and even csh or Solaris's non-POSIX bourne /bin/sh. – godlygeek May 25 '14 at 17:22
  • Oh, cool. I noticed I had both - but I didn't know they came together. Thanks. You've got my vote. – mikeserv May 25 '14 at 17:26
  • zgrep is a script that wraps around gzip and grep. It works with uncompressed data because it passes the -f option to gzip. – Stéphane Chazelas May 25 '14 at 20:13
  • Also note that as far as portability goes, zcat was first only dealing with .Z files. GNU came up with gzip and the gz format later and its gzip/zcat handles both .Z and .gz files. You'll probably still find commercial Unices where zcat knows nothing about gz files. – Stéphane Chazelas May 25 '14 at 20:16
5

With GNU gzip you can do zcat file 2> /dev/null || cat file. This is not POSIX-standard and does not work with BSD gzip. You really should fix your system so that all gzipped files have the .gz extension (of course, plain text files may have any extension, including .gz).
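
A variant of the same fallback idea, sketched under my own assumptions rather than taken from this answer: test each file with gzip -t first and only then decide which command to run, so a failed decompression cannot leave partial output in front of the cat output. The function name zcat_or_cat is hypothetical. Files are fed through stdin so that no suffix check gets in the way.

```shell
# Hypothetical helper: decompress a file if it passes gzip's integrity test,
# otherwise print it unchanged.
zcat_or_cat() {
    for f in "$@"; do
        if gzip -t < "$f" 2>/dev/null; then
            gzip -dc < "$f"
        else
            cat < "$f"
        fi
    done
}
```

The cost is that every compressed file is read twice, and a truncated or corrupt gzip file fails the test and is printed raw, so this is best treated as a stopgap until the file naming is fixed.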

fkraiem
  • You should add an || { echo "$filename" ; cat $file ; } >&2 to cat to keep the out streams separate so the asker can more easily clean up the mess. I mean - well, I hope that's clear enough.... But this is a good answer. – mikeserv May 25 '14 at 16:58
2

To add my conclusion from the comments as an answer: I think the best, most compatible way is to use

gzip -cdf [ name ... ]

This is also how zless and zgrep do it internally.
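
If you want this under a cat-like name, a small shell function is enough. The name anycat is just an example I picked; the body is the command above with the file names passed through.

```shell
# Hypothetical wrapper name; the body is simply gzip -cdf applied to all arguments.
# -c writes to stdout, -d decompresses, -f passes non-gzip data through unchanged.
anycat() {
    gzip -cdf -- "$@"
}
```

For example, anycat file file.gz prints both files in order, like cat would, whether or not each one is compressed.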

0

An alternative in Bash:

for filename in *; do  # Replace with your actual loop
    case $filename in
    *.gz) gunzip <"$filename";;
    *)    cat "$filename";;
    esac
done
Joseph R.