200

I have created zlib-compressed data in Python, like this:

import zlib
s = '...'
z = zlib.compress(s)
with open('/tmp/data', 'wb') as f:
    f.write(z)

(or one-liner in shell: echo -n '...' | python2 -c 'import sys,zlib; sys.stdout.write(zlib.compress(sys.stdin.read()))' > /tmp/data)

Now, I want to uncompress the data in shell. Neither zcat nor uncompress works:

$ cat /tmp/data | gzip -d -
gzip: stdin: not in gzip format

$ zcat /tmp/data 
gzip: /tmp/data.gz: not in gzip format

$ cat /tmp/data | uncompress -
gzip: stdin: not in gzip format

It seems that I have created a gzip-like file, but without any headers. Unfortunately I don't see any option to uncompress such raw data in the gzip man page, and the zlib package does not contain any executable utility.

Is there a utility to uncompress raw zlib data?

Jeff Schaller
mykhal

14 Answers

211

It is also possible to decompress it using a standard shell and gzip, if you don't have, or don't want to use, openssl or other tools.
The trick is to prepend the gzip magic number and compression method to the actual data from zlib.compress:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" |cat - /tmp/data |gzip -dc >/tmp/out

Edits:
@d0sboots commented: For RAW Deflate data, you need to add 2 more null bytes:
"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"

This Q on SO gives more information about this approach. An answer there suggests that there is also an 8 byte footer.
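The difference between zlib-wrapped and raw Deflate data mentioned in the edits above can be checked from Python. This is only a minimal sketch, not part of the original answer: the sample payload is hypothetical, and it assumes Python 3.6 or later, where zlib.decompress accepts a wbits keyword.

```python
import zlib

payload = b"hello hello hello hello"  # hypothetical sample data

# zlib.compress() emits a 2-byte zlib header, the Deflate stream,
# and a 4-byte Adler-32 trailer.
z = zlib.compress(payload)
zlib_header = z[:2]
raw_deflate = z[2:-4]

# wbits=15 expects the zlib wrapper; wbits=-15 expects raw Deflate.
from_zlib = zlib.decompress(z, wbits=15)
from_raw = zlib.decompress(raw_deflate, wbits=-15)

print(zlib_header.hex(), from_zlib == payload, from_raw == payload)
```

This would explain why raw Deflate input needs two extra bytes in the fake gzip header: with zlib-wrapped input, the 2-byte zlib header fills the last two bytes of the 10-byte gzip header.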

Users @Vitali-Kushner and @mark-bessey reported success even with truncated files, so a gzip footer does not seem strictly required.
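The tolerance for truncated input has a rough equivalent in Python's zlib.decompressobj, which returns whatever can be decoded so far instead of failing. A small sketch with hypothetical sample data:

```python
import zlib

# hypothetical sample data
payload = b"".join(str(i).encode() for i in range(2000))
z = zlib.compress(payload)
truncated = z[: len(z) // 2]  # simulate a file that was cut short

d = zlib.decompressobj()
partial = d.decompress(truncated)  # returns what can be decoded so far

# d.eof stays False because the end of the stream was never reached.
print(len(partial), len(payload), d.eof)
```

The eof attribute is the only hint that the stream was incomplete, so a script relying on this should probably check it.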

@tobias-kienzler suggested this shell function:
zlibd() (printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc)

nVitius
wkpark
  • 4
    gzip doesn't work, but zlib-flate does (pdf page content stream). – Daniil Iaitskov May 16 '17 at 12:23
  • 1
    inverse operation: cat input.txt | gzip -c | tail -c +9 >compressed.gzbody to remove the first 8 bytes – milahu Apr 20 '22 at 16:31
  • Small detail about shell functions: the suggested function uses parentheses as block delimiters, which causes the unnecessary creation of a subshell. Using curly braces: zlibd() { printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc; } – SenhorLucas Feb 02 '23 at 15:51
  • 1
    TFW you find a great answer, see yourself quoted in the answer, but can't find the comment that you were quoted from anywhere... – D0SBoots Jul 09 '23 at 09:23
157
zlib-flate -uncompress < IN_FILE > OUT_FILE

I tried this and it worked for me.

zlib-flate can be found in the qpdf package (in Debian Squeeze, Fedora 23, and brew on macOS, according to comments in other answers).

(Thanks to user @tino who provided this as a comment below the OpenSSL answer. Made into a proper answer for easy access.)

Catskul
  • 5
    In contrast to the other answers, this one works on OS X. – polym Dec 29 '15 at 18:32
  • 3
    @polym, how did you get zlib-flate installed on macOS? I don't see it anywhere. – Wildcard Oct 02 '16 at 04:34
  • 7
    @Wildcard sorry for the late response. I think it came with the qpdf package that I've installed with brew as mentioned in the comment above - or look at the last sentence of this answer :). Also, qpdf is really cool, so have a look at it too if you have time! – polym Oct 16 '16 at 13:28
  • 2
    brew install qpdf, then the command listed above :-) thank you! – Fernando Gabrieli Sep 18 '19 at 15:13
  • 3
    This is really useful if you're learning how git objects are stored, using this instead of git cat-file -p works fine! – Rodrigo García Jan 21 '20 at 22:36
  • 1
    @Rodrirokr the problem of cat-file -p is that it pretty prints the content, so it can give a completely false impression of the contents e.g. an on-disk tree object is very different than what cat-file shows. – Masklinn Jul 09 '22 at 19:06
93

I have found one possible solution, using openssl:

$ openssl zlib -d < /tmp/data

or

$ openssl zlib -d -in /tmp/data

*NOTE: zlib functionality is apparently available in recent openssl versions (>= 1.0.0). OpenSSL has to be configured/built with the zlib or zlib-dynamic option; the latter is the default.

mykhal
  • 29
    On Debian Squeeze (which has OpenSSL 0.9.8) there is zlib-flate in the qpdf package. It can be used like zlib-flate -uncompress < FILE. – Tino Sep 16 '12 at 14:09
  • 24
    zlib got removed from the latest versions of OpenSSL, so this tip is very helpful @Tino – Alexandr Kurilin Dec 02 '14 at 10:59
  • 1
    Thanks. This solution provides a better experience in decompressing short input files than the answer using "gzip" ("openssl" decompressed as much as it could while "gzip" aborted printing "unexpected end of file"). – Daniel K. Sep 16 '15 at 10:01
  • 2
    @Tino this should be a separate answer – Catskul Nov 01 '15 at 03:16
  • 1
    @Tino, it is also available via the package qpdf on Fedora 23. Alexandr Kurilin, zlib is still available in 1.0.2d-fips. – maxschlepzig Nov 24 '15 at 08:37
  • 1
    There is another openssl syntax that does the same, using openssl enc:

    $ openssl enc -z -none -d < /tmp/data

    – Danny R Jun 27 '16 at 05:45
73

I recommend pigz from Mark Adler, co-author of the zlib compression library. Execute pigz to see the available flags.

You will notice:

-z --zlib Compress to zlib (.zz) instead of gzip format.

You can uncompress using the -d flag:

-d --decompress --uncompress Decompress the compressed input.

Assuming a file named 'test':

  • pigz -z test - creates a zlib compressed file named test.zz
  • pigz -d -z test.zz - converts test.zz to the decompressed test file

On OSX you can execute brew install pigz

snodnipper
12

On macOS, which is a fully POSIX-compliant UNIX (formally certified!), OpenSSL has no zlib support and there is no zlib-flate either. The first solution works, as do all the Python solutions, but the first solution requires the zlib data to be in a file, and all the other solutions force you to create a Python script.

Here's a Perl-based solution that can be used as a command-line one-liner, gets its input via a STDIN pipe, and works out of the box on a freshly installed macOS:

cat file.compressed | perl -e 'use Compress::Raw::Zlib;my $d=new Compress::Raw::Zlib::Inflate();my $o;undef $/;$d->inflate(<>,$o);print $o;'

Nicer formatted, the Perl script looks like this:

use Compress::Raw::Zlib;
my $decompressor = new Compress::Raw::Zlib::Inflate();
my $output;
undef $/;
$decompressor->inflate(<>, $output);
print $output;

Optimized version from Marco d'Itri (see comments):

cat file.compressed | perl -MCompress::Zlib -E 'undef $/;print uncompress(<>)'
Mecki
11

zlib implements the compression used by gzip, but not the file format. Instead, you should use the gzip module, which itself uses zlib.

import gzip
s = '...'
with gzip.open('/tmp/data', 'w') as f:
    f.write(s)
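Under Python 3 the same idea needs bytes rather than str. A minimal round-trip sketch, where the sample data and the temporary path are hypothetical:

```python
import gzip
import os
import tempfile

s = b"example payload"  # hypothetical sample; gzip needs bytes in Python 3

path = os.path.join(tempfile.mkdtemp(), "data.gz")  # hypothetical location
with gzip.open(path, "wb") as f:
    f.write(s)

# The file starts with the gzip magic number 1f 8b, which is what
# gzip -d and zcat look for.
with open(path, "rb") as f:
    magic = f.read(2)

with gzip.open(path, "rb") as f:
    restored = f.read()

print(magic.hex(), restored == s)
```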
l0b0
Jeremy
  • ok, but my situation is that i have tens/hundreds thousands of those files created, so.. :) –  Sep 20 '11 at 22:14
  • 1
    so... your files are incomplete. Perhaps you'll have to uncompress them with zlib and recompress them with gzip, if you don't still have the original data. – Greg Hewgill Sep 20 '11 at 22:18
  • 6
    @mykhal, why did you create ten/hundred thousands of files before checking that you could actually uncompress them? –  Sep 20 '11 at 22:19
  • 3
    harpyon, I can uncompress them; I just wonder which more or less common utility or gzip settings can be used for that, if I don't want to do it in Python again –  Sep 20 '11 at 22:47
7

The example program zpipe.c found here by Mark Adler himself (comes with the source distribution of the zlib library) is very useful for these scenarios with raw zlib data. Compile with cc -o zpipe zpipe.c -lz and to decompress: zpipe -d < raw.zlib > decompressed. It can also do the compression without the -d flag.

Henno Brandsma
6

This might do it:

import zlib
import sys

# sys.argv[0] is the script itself, so only the remaining arguments are files
for filename in sys.argv[1:]:
    with open(filename, 'rb') as compressed:
        with open(filename + '-decompressed', 'wb') as expanded:
            data = zlib.decompress(compressed.read())
            expanded.write(data)

Then run it like this:

$ python expander.py data/*
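If the shell glob is a concern because of the number of files, a directory walk avoids passing them all on the command line. The following is only a sketch: expand_tree and its suffix parameter are hypothetical names, and it assumes every file under the directory is zlib data (anything else would raise zlib.error).

```python
import os
import zlib


def expand_tree(root, suffix="-decompressed"):
    """Decompress every zlib file under root; return the paths written."""
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                continue  # skip output of an earlier run
            src = os.path.join(dirpath, name)
            with open(src, "rb") as compressed:
                data = zlib.decompress(compressed.read())
            dst = src + suffix
            with open(dst, "wb") as expanded:
                expanded.write(data)
            written.append(dst)
    return written
```

It could then be called as expand_tree("data") in place of the glob-based invocation.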
Jeremy
  • thanks, i know about zlib.decompress. probably i'd use some walk function. i'm not sure if shell would handle my huge amount of files with glob wildcard :) –  Sep 20 '11 at 22:28
  • The file that is created by expanded still checks out as "zlib compressed data" for me, using the shell file command? How is that? – K.-Michael Aye Nov 30 '18 at 00:18
  • nope doesn't work for me even with the fake header. – cybernard Jan 26 '19 at 01:02
2

You can use this to compress with zlib:

openssl enc -z -none -e < /file/to/deflate

And this to inflate:

openssl enc -z -none -d < /file/to/deflate
muru
Danny R
2

During development of eIDAS-related code, I came up with a bash script that decodes the SSO (Single Sign-On) SAMLRequest param, which is usually encoded with base64 and raw Deflate (PHP gzdeflate).

#!/bin/bash
# file decode_saml_request.sh

urldecode() { : "${*//+/ }"; echo -e "${_//%/\\x}"; }

# read the input from the file given as $1, or from stdin when no file is given
input=$(cat "${1:--}")

if [[ $input == *"SAMLRequest="* ]]; then
  # extract param SAMLRequest from URL, strip all following params
  contents=$(printf '%s\n' "$input" | awk -F 'SAMLRequest=' '{print $2}' | awk -F '&' '{print $1}')
else
  # work with raw base64 encoded string
  contents=$input
fi

# add gzip raw-deflate header bytes and gunzip (`gzip -dc` can be replaced by `gunzip`)
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - <(echo `urldecode $contents` | base64 -d) | gzip -dc

You can use it like

> decode_saml_request.sh /path/to/file_with_sso_url
# or
> echo "y00tLk5MT1VISSxJBAA%3D" | decode_saml_request.sh
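For comparison, the same decoding steps (URL-decode, base64-decode, raw inflate) can be sketched in Python. The function names and the sample payload are hypothetical; the encoder exists only to produce a value for the round trip, and wbits=-15 is what selects raw Deflate without any header.

```python
import base64
import zlib
from urllib.parse import quote, unquote


def decode_saml_request(value):
    """URL-decode, base64-decode, then inflate a raw-Deflate payload."""
    raw = base64.b64decode(unquote(value))
    return zlib.decompress(raw, wbits=-15)  # -15: raw Deflate, no header


def encode_saml_request(data):
    """Inverse helper, only here to build a test value for the sketch."""
    c = zlib.compressobj(wbits=-15)
    raw = c.compress(data) + c.flush()
    return quote(base64.b64encode(raw).decode("ascii"), safe="")


sample = b"<samlp:AuthnRequest/>"  # hypothetical payload
print(decode_saml_request(encode_saml_request(sample)) == sample)
```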

The script is also published as a gist here: https://gist.github.com/smarek/77dacb9703ac8b715b5eced5314d5085 so I may not maintain this answer, but I will maintain the source gist.

Marek Sebera
2

I have an addition to @Alex Stragies conversion for those who need a proper header and footer (an actual conversion from zlib to gzip).

It would probably be easier to use one of the above methods, however if the reader has a case like mine which requires conversion of zlib to gzip without decompression and recompression, this is the way to do it.

According to RFC 1950 and RFC 1952, a zlib file can only have a single stream or member. This is different from gzip in that:

A gzip file consists of a series of "members" (compressed data sets). ... The members simply appear one after another in the file, with no additional information before, between, or after them.

This means that while a single zlib file can always be converted to a single gzip file, the converse is not strictly true. Something to keep in mind.

zlib has both a header (2 bytes) and a footer (4 bytes) which must be removed from the data so that the gzip header and footer can be added. One way of doing that is as follows:

# Remove zlib 4 byte footer
trunc_size=$(ls -l infile.z | awk '{print $5 - 4}')
truncate -s $trunc_size infile.z

Remove zlib 2 byte header

dd bs=1M iflag=skip_bytes skip=2 if=infile.z of=tmp1.z

Now we have just raw data and may append the gzip header (from @Alex Stragies)

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - tmp1.z > tmp2.z

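The gzip footer is 8 bytes long. It consists of the CRC32 of the uncompressed file, plus the size of the uncompressed file mod 2^32, both stored in little-endian byte order (which is why the helper functions below reverse the order of the hex byte pairs, despite their names). If you don't know these but have means of getting an uncompressed file: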

generate_crcbig() {
    crc=$(crc32 $uncompressedfile)
    crcbig=$(echo "\x${crc:6:2}\x${crc:4:2}\x${crc:2:2}\x${crc:0:2}")
}

generate_lbig() {
    leng=$(ls -l $uncompressedfile | awk '{print $5}')
    lmod=$(expr $leng % 4294967296) # mod 2^32
    # pad to 8 hex digits so the substring offsets below always line up
    lhex=$(printf "%08x\n" $lmod)
    lbig=$(echo "\x${lhex:6:2}\x${lhex:4:2}\x${lhex:2:2}\x${lhex:0:2}")
}

And then the footer may be appended as such:

printf $crcbig$lbig | cat tmp2.z - > outfile.gz

Now you have a file which is in the gzip format! It can be verified with gzip -t outfile.gz and uncompressed with any application complying with gzip specifications.
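The whole conversion can also be sketched in Python, which may be easier to check than the shell steps. This is a hedged illustration rather than the method above: zlib_to_gzip is a hypothetical name, the sample data is made up, and it assumes a plain 2-byte zlib header (no preset dictionary).

```python
import gzip
import struct
import zlib


def zlib_to_gzip(z):
    """Rewrap a single zlib stream as a gzip member without recompressing."""
    # The gzip trailer needs the CRC-32 and length of the uncompressed data,
    # so the data is inflated once here, but never deflated again.
    data = zlib.decompress(z)
    raw_deflate = z[2:-4]  # drop 2-byte zlib header and 4-byte Adler-32
    header = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"
    # Both trailer fields are 32-bit little-endian values.
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF,
                          len(data) & 0xFFFFFFFF)
    return header + raw_deflate + trailer


sample = b"zlib to gzip " * 10  # hypothetical sample data
converted = zlib_to_gzip(zlib.compress(sample))
print(gzip.decompress(converted) == sample)
```

gzip.decompress verifies both the CRC-32 and the length, so a successful round trip suggests the trailer was built correctly.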

2

I get that the author doesn't want to use Python, but I believe a Python 3 one-liner is a natural choice for most Linux users, so here it is:

python3 -c 'import sys,zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))' < "$COMPRESSED_FILE_PATH"
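For large inputs, reading everything into memory at once may not be desirable. A chunked variant could look like the following sketch; inflate_stream and chunk_size are hypothetical names, and the in-memory streams stand in for real files.

```python
import io
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy zlib-compressed bytes from src to dst, decompressing in chunks."""
    d = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(d.decompress(chunk))
    dst.write(d.flush())


# In a script this could be called as:
#   inflate_stream(sys.stdin.buffer, sys.stdout.buffer)
src = io.BytesIO(zlib.compress(b"streamed " * 1000))  # hypothetical input
dst = io.BytesIO()
inflate_stream(src, dst)
print(dst.getvalue() == b"streamed " * 1000)
```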

0

The simple inflate program pufftest.c, found in contrib/puff of the zlib package by Mark Adler himself, can handle raw zlib data without header bytes and Adler-32 checksum. Compile with cc -o pufftest puff.c pufftest.c and to inflate: pufftest < raw.zlib > decompressed. Note that it can't deflate.

-5
zcat -f infile > outfile 

works for me on Fedora 25

muru