How to uncompress zlib data in UNIX?

Question

I have created zlib-compressed data in Python, like this:

import zlib
s = '...'
z = zlib.compress(s)
with open('/tmp/data', 'wb') as f:
    f.write(z)

(or one-liner in shell: echo -n '...' | python2 -c 'import sys,zlib; sys.stdout.write(zlib.compress(sys.stdin.read()))' > /tmp/data)

Now, I want to uncompress the data in shell. Neither zcat nor uncompress work:

$ cat /tmp/data | gzip -d -
gzip: stdin: not in gzip format

$ zcat /tmp/data 
gzip: /tmp/data.gz: not in gzip format

$ cat /tmp/data | uncompress -
gzip: stdin: not in gzip format

It seems that I have created a gzip-like file, but without any headers. Unfortunately, I don't see any option to uncompress such raw data in the gzip man page, and the zlib package does not contain any executable utility.

Is there a utility to uncompress raw zlib data?

There are many additional answers here: http://stackoverflow.com/questions/3178566/deflate-command-line-tool — Jack O'Connor, Jan 24 '14 at 22:40

score 211 · Accepted Answer · edited Aug 11 '21 at 09:44

It is also possible to decompress it using a standard shell script plus gzip, if you don't have, or don't want to use, openssl or other tools.
The trick is to prepend the gzip magic number and compression method to the actual data from zlib.compress:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" |cat - /tmp/data |gzip -dc >/tmp/out

Edits:
@d0sboots commented: For RAW Deflate data, you need to add 2 more null bytes:
→ "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"

This Q on SO gives more information about this approach. An answer there suggests that there is also an 8 byte footer.

Users @Vitali-Kushner and @mark-bessey reported success even with truncated files, so a gzip footer does not seem strictly required.

@tobias-kienzler suggested this function for the bashrc:
zlibd() (printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc)
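
For comparison, here is a minimal Python sketch (not part of the original answer) of how the containers discussed above, zlib, gzip and raw Deflate, map onto the wbits argument of zlib.decompress. The helper name inflate_any is made up for illustration, and the "try auto-detection, then fall back to raw Deflate" order is only an assumption about what is convenient.

```python
import zlib


def inflate_any(data: bytes) -> bytes:
    """Decompress zlib, gzip, or raw Deflate data (hypothetical helper)."""
    try:
        # wbits=47 (32 + 15) asks zlib to auto-detect a zlib or gzip header.
        return zlib.decompress(data, wbits=47)
    except zlib.error:
        # A negative wbits means raw Deflate: no header and no checksum.
        return zlib.decompress(data, wbits=-15)
```

With zlib.compress output, stripping the first 2 bytes and the last 4 bytes leaves the raw Deflate stream, which is the case where the shell trick needs the two extra null bytes.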

answered Sep 25 '12 at 03:36 by wkpark · edited Aug 11 '21 at 09:44 by nVitius

4

gzip doesn't work, but zlib-flate does (pdf page content stream). – Daniil Iaitskov May 16 '17 at 12:23
1

inverse operation: cat input.txt | gzip -c | tail -c +9 >compressed.gzbody to remove the first 8 bytes – milahu Apr 20 '22 at 16:31
Small detail about shell functions: the suggested function uses parentheses as block delimiters, which causes the unnecessary creation of a subshell. Using curly braces: zlibd() { printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc; } – SenhorLucas Feb 02 '23 at 15:51
1

TFW you find a great answer, see yourself quoted in the answer, but can't find the comment that you were quoted from anywhere... – D0SBoots Jul 09 '23 at 09:23

Catskul · Answer 2 · 2021-08-16T18:45:40.327

157

zlib-flate -uncompress < IN_FILE > OUT_FILE

I tried this and it worked for me.

zlib-flate can be found in package qpdf (in Debian Squeeze, Fedora 23, and brew on MacOS according to comments in other answers)

(Thanks to user @tino who provided this as a comment below the OpenSSL answer. Made into a proper answer for easy access.)

answered Nov 01 '15 at 03:18 by Catskul · edited Aug 16 '21 at 18:45

5

In contrast to the other answers, this one works on OS X. – polym Dec 29 '15 at 18:32
3

@polym, how did you get zlib-flate installed on macOS? I don't see it anywhere. – Wildcard Oct 02 '16 at 04:34
7

@Wildcard sorry for the late response. I think it came with the qpdf package that I've installed with brew as mentioned in the comment above - or look at the last sentence of this answer :). Also, qpdf is really cool, so have a look at it too if you have time! – polym Oct 16 '16 at 13:28
2

brew install qpdf, then the command listed above :-) thank you! – Fernando Gabrieli Sep 18 '19 at 15:13
3

This is really useful if you're learning how git objects are stored, using this instead of git cat-file -p works fine! – Rodrigo García Jan 21 '20 at 22:36
1

@Rodrirokr the problem of cat-file -p is that it pretty prints the content, so it can give a completely false impression of the contents e.g. an on-disk tree object is very different than what cat-file shows. – Masklinn Jul 09 '22 at 19:06
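
To make the git remark concrete, here is a small Python sketch of what comes out when a loose git object is inflated. It relies on the loose-object layout "<type> <size>\0<content>"; the function name parse_loose_object is hypothetical, and packed objects are not covered.

```python
import zlib


def parse_loose_object(compressed: bytes):
    """Inflate a loose git object and split it into (type, content)."""
    raw = zlib.decompress(compressed)
    # The header ends at the first NUL byte; everything after it is content.
    header, _, body = raw.partition(b"\x00")
    kind, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return kind.decode(), body
```

The input would be the bytes of a file under .git/objects/, read in binary mode.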

mykhal · Answer 3 · 2015-09-17T11:45:31.937

93

I have found one possible solution, using openssl:

$ openssl zlib -d < /tmp/data

or

$ openssl zlib -d -in /tmp/data

*NOTE: zlib functionality is apparently available in recent openssl versions >=1.0.0 (OpenSSL has to be configured/built with zlib or zlib-dynamic option, the latter is default)

answered Oct 17 '11 at 00:56 by mykhal · edited Sep 17 '15 at 11:45

29

On Debian Squeeze (which has OpenSSL 0.9.8) there is zlib-flate in the qpdf package. It can be used like zlib-flate -uncompress < FILE. – Tino Sep 16 '12 at 14:09
24

zlib got removed from the latest versions of OpenSSL so this tip is very helpful @Tino – Alexandr Kurilin Dec 02 '14 at 10:59
1

Thanks. This solution provides a better experience in decompressing short input files than the answer using "gzip" ("openssl" decompressed as much as it could while "gzip" aborted printing "unexpected end of file"). – Daniel K. Sep 16 '15 at 10:01
2

@Tino this should be a separate answer – Catskul Nov 01 '15 at 03:16
1

@Tino, it is also available via the package qpdf on Fedora 23. Alexandr Kurilin, zlib is still available in 1.0.2d-fips. – maxschlepzig Nov 24 '15 at 08:37
1

there is another openssl syntax that does the same, using openssl enc:
$ openssl enc -z -none -d < /tmp/data
– Danny R Jun 27 '16 at 05:45

score 73 · Answer 4 · answered Sep 26 '16 at 12:27

I recommend pigz from Mark Adler, co-author of the zlib compression library. Execute pigz to see the available flags.

You will notice:

-z --zlib Compress to zlib (.zz) instead of gzip format.

You can uncompress using the -d flag:

-d --decompress --uncompress Decompress the compressed input.

Assuming a file named 'test':

pigz -z test - creates a zlib compressed file named test.zz
pigz -d -z test.zz - converts test.zz to the decompressed test file

On OSX you can execute brew install pigz

answered Sep 26 '16 at 12:27 by snodnipper

8

Good find! It looks like it can detect zlib files by itself, so unpigz test.zz will work as well. – Stéphane Chazelas Sep 26 '16 at 12:55
did not decompress my data. – cybernard Jan 26 '19 at 01:09
1

@cybernard perhaps you don't have a zlib file. check with: $>file hello.txt.zz hello.txt.zz: zlib compressed data – snodnipper Feb 01 '19 at 12:08
1

Worked well with partial files too. – Joe DF Mar 05 '20 at 16:52
Playing around with git objects, the following will work: unpigz -z < FILE – abetusk Aug 29 '22 at 15:44
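
If the file utility is not at hand, the check suggested in the comments can be approximated in Python. This sketch applies the header rules from RFC 1950 (compression method 8, window size field at most 7, and the first two bytes read as a big-endian number being a multiple of 31). The name looks_like_zlib is made up, and passing the check does not prove the rest of the stream is valid.

```python
def looks_like_zlib(data: bytes) -> bool:
    """Heuristic check of the 2-byte zlib header (hypothetical helper)."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    method_ok = (cmf & 0x0F) == 8          # CM = 8 means Deflate
    window_ok = (cmf >> 4) <= 7            # CINFO above 7 is not allowed
    check_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK makes this a multiple of 31
    return method_ok and window_ok and check_ok
```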

score 12 · Answer 5 · edited Nov 20 '23 at 15:59

macOS is a fully POSIX-compliant UNIX (formally certified!), but its OpenSSL has no zlib support and it ships no zlib-flate either. The first solution works there, as do all the Python solutions, but the first solution requires the zlib data to be in a file and all the other solutions force you to create a Python script.

Here's a Perl based solution that can be used as a command line one-liner, gets its input via STDIN pipe and that works out of the box with a freshly installed macOS:

cat file.compressed | perl -e 'use Compress::Raw::Zlib;my $d=new Compress::Raw::Zlib::Inflate();my $o;undef $/;$d->inflate(<>,$o);print $o;'

Nicer formatted, the Perl script looks like this:

use Compress::Raw::Zlib;
my $decompressor = new Compress::Raw::Zlib::Inflate();
my $output;
undef $/;
$decompressor->inflate(<>, $output);
print $output;

Optimized version from Marco d'Itri (see comments):

cat file.compressed | perl -MCompress::Zlib -E 'undef $/;print uncompress(<>)'

A shorter solution is: perl -MCompress::Zlib -E 'undef $/;print uncompress(<>)' — Marco d'Itri, Jan 11 '22 at 06:00

score 11 · Answer 6 · edited Sep 25 '12 at 15:28

zlib implements the compression used by gzip, but not the file format. Instead, you should use the gzip module, which itself uses zlib.

import gzip
s = '...'
with gzip.open('/tmp/data', 'w') as f:
    f.write(s)

answered Sep 20 '11 at 22:10 by Jeremy · edited Sep 25 '12 at 15:28 by l0b0

1

ok, but my situation is that i have tens/hundreds thousands of those files created, so.. :) – Sep 20 '11 at 22:14
1

so... your files are incomplete. Perhaps you'll have to uncompress them with zlib and recompress them with gzip, if you don't still have the original data. – Greg Hewgill Sep 20 '11 at 22:18
6

@mykhal, why did you create ten/hundred thousands of files before checking that you could actually uncompress them? – Sep 20 '11 at 22:19
3

harpyon, i can uncompress them, i just wonder which more or less common utility or gzip settings can be used for that, if i don't want to do it in python again – Sep 20 '11 at 22:47

score 7 · Answer 7 · answered Mar 06 '18 at 10:38

The example program zpipe.c found here by Mark Adler himself (comes with the source distribution of the zlib library) is very useful for these scenarios with raw zlib data. Compile with cc -o zpipe zpipe.c -lz and to decompress: zpipe -d < raw.zlib > decompressed. It can also do the compression without the -d flag.
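
For readers without a C compiler, the decompression half of zpipe can be approximated in Python. This is only a sketch of the same streaming idea, not a port of zpipe.c; the function name inflate_stream and the 16 KiB chunk size are assumptions.

```python
import zlib

CHUNK = 16384  # read size in bytes, chosen arbitrarily for this sketch


def inflate_stream(src, dst) -> None:
    """Copy zlib-compressed data from src to dst, decompressing in chunks."""
    inflater = zlib.decompressobj()
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            break
        dst.write(inflater.decompress(chunk))
    dst.write(inflater.flush())
    # eof stays False when the stream ended before its trailer was seen.
    if not inflater.eof:
        raise ValueError("compressed stream is truncated")
```

To use it as a filter, call inflate_stream(sys.stdin.buffer, sys.stdout.buffer) after importing sys.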

Jeremy · Answer 8 · 2016-10-18T02:17:06.537

6

This might do it:

import zlib
import sys

# sys.argv[0] is the script itself, so skip it
for filename in sys.argv[1:]:
    with open(filename, 'rb') as compressed:
        with open(filename + '-decompressed', 'wb') as expanded:
            data = zlib.decompress(compressed.read())
            expanded.write(data)

Then run it like this:

$ python expander.py data/*

answered Sep 20 '11 at 22:20 by Jeremy · edited Oct 18 '16 at 02:17

1

thanks, i know about zlib.decompress. probably i'd use some walk function. i'm not sure if shell would handle my huge amount of files with glob wildcard :) – Sep 20 '11 at 22:28
The file that is created by expanded still checks out as "zlib compressed data" for me, using the shell file command? How is that? – K.-Michael Aye Nov 30 '18 at 00:18
nope doesn't work for me even with the fake header. – cybernard Jan 26 '19 at 01:02

score 2 · Answer 9 · edited Oct 17 '17 at 07:31

You can use this to compress with zlib:

openssl enc -z -none -e < /file/to/deflate

And this to inflate:

openssl enc -z -none -d < /file/to/inflate

answered Jun 27 '16 at 05:48 by Danny R · edited Oct 17 '17 at 07:31 by muru

8

Gives unknown option '-z' on Ubuntu 16.04 and OpenSSL 1.0.2g 1 Mar 2016 – Tino May 22 '18 at 10:50
2

same error on Mac – K.-Michael Aye Nov 30 '18 at 00:13

score 2 · Answer 10 · answered Dec 02 '19 at 14:29

During development of eIDAS-related code, I came up with a bash script that decodes the SSO (Single Sign-On) SAMLRequest param, which is usually encoded with base64 and raw deflate (PHP gzdeflate).

#!/bin/bash
# file decode_saml_request.sh

urldecode() { : "${*//+/ }"; echo -e "${_//%/\\x}"; }

# read the input from the file given as $1, or from stdin when no file is given
contents=$(cat "${1:--}")

if [[ $contents == *"SAMLRequest="* ]]; then
  # extract param SAMLRequest from URL, strip all following params
  contents=$(echo "$contents" | awk -F 'SAMLRequest=' '{print $2}' | awk -F '&' '{print $1}')
fi
# otherwise work with the raw base64 encoded string as-is

# add gzip raw-deflate header bytes and gunzip (`gzip -dc` can be replaced by `gunzip`)
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - <(echo `urldecode $contents` | base64 -d) | gzip -dc

You can use it like

> decode_saml_request.sh /path/to/file_with_sso_url
# or
> echo "y00tLk5MT1VISSxJBAA%3D" | decode_saml_request.sh

Script is published also as gist here: https://gist.github.com/smarek/77dacb9703ac8b715b5eced5314d5085 so i may not maintain this answer but I will maintain the source gist
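
The same decoding steps (URL-decode, base64-decode, inflate raw deflate) can also be sketched in Python. This is not the author's script; the function name decode_saml_request is hypothetical, and the sketch assumes it is given only the value of the SAMLRequest parameter, not a whole URL.

```python
import base64
import zlib
from urllib.parse import unquote


def decode_saml_request(value: str) -> bytes:
    """Decode a URL-encoded, base64-encoded, raw-deflate SAMLRequest value."""
    # unquote turns %3D back into '=' and leaves '+' untouched.
    raw = base64.b64decode(unquote(value))
    # wbits=-15 selects raw Deflate, i.e. no zlib or gzip header.
    return zlib.decompress(raw, wbits=-15)
```

For example, decode_saml_request("y00tLk5MT1VISSxJBAA%3D") takes the sample string from the usage above.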

score 2 · Answer 11 · answered Aug 04 '20 at 18:33

I have an addition to @Alex Stragies conversion for those who need a proper header and footer (an actual conversion from zlib to gzip).

It would probably be easier to use one of the above methods, however if the reader has a case like mine which requires conversion of zlib to gzip without decompression and recompression, this is the way to do it.

According to RFC 1950/1952, a zlib file can only have a single stream or member. This is different from gzip in that:

A gzip file consists of a series of "members" (compressed data sets). ... The members simply appear one after another in the file, with no additional information before, between, or after them.

This means that while a single zlib file can always be converted to a single gzip file, the converse is not strictly true. Something to keep in mind.

zlib has both a header (2 bytes) and a footer (4 bytes) which must be removed from the data so that the gzip header and footer can be appended. One way of doing that is as follows:

# Remove zlib 4 byte footer
trunc_size=$(ls -l infile.z | awk '{print $5 - 4}')
truncate -s $trunc_size infile.z
# Remove zlib 2 byte header
dd bs=1M iflag=skip_bytes skip=2 if=infile.z of=tmp1.z

Now we have just raw data and may append the gzip header (from @Alex Stragies)

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - tmp1.z > tmp2.z

The gzip footer is 8 bytes long. It consists of the CRC32 of the uncompressed file, plus the size of the uncompressed file mod 2^32, both in little-endian format (least significant byte first). If you don't know these but have means of getting an uncompressed file:

generate_crcbig() {
    # crc32 prints 8 hex digits; emit them byte-reversed (least significant byte first)
    crc=$(crc32 $uncompressedfile)
    crcbig=$(echo "\x${crc:6:2}\x${crc:4:2}\x${crc:2:2}\x${crc:0:2}")
}
generate_lbig () {
    leng=$(ls -l $uncompressedfile | awk '{print $5}')
    lmod=$(expr $leng % 4294967296) # mod 2^32
    lhex=$(printf "%08x\n" $lmod) # zero-pad to 8 hex digits so the slices below line up
    lbig=$(echo "\x${lhex:6:2}\x${lhex:4:2}\x${lhex:2:2}\x${lhex:0:2}")
}

And then the footer may be appended as such:

generate_crcbig; generate_lbig
printf $crcbig$lbig | cat tmp2.z - > outfile.gz

Now you have a file which is in the gzip format! It can be verified with gzip -t outfile.gz and uncompressed with any application complying with gzip specifications.
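
The whole conversion can also be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the method above: the function name zlib_to_gzip is hypothetical, the input is assumed to be a single zlib stream without a preset dictionary, and the data is inflated once to compute the CRC32 and length, although the Deflate bytes themselves are reused without recompression.

```python
import struct
import zlib

# Minimal gzip header: magic, CM=8 (Deflate), no flags, mtime 0, XFL 0, OS 0.
GZIP_HEADER = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"


def zlib_to_gzip(zdata: bytes) -> bytes:
    """Rewrap a zlib stream as a gzip member, reusing the Deflate payload."""
    if zdata[1] & 0x20:
        raise ValueError("preset dictionary (FDICT) is not handled here")
    raw = zlib.decompress(zdata)  # needed only for the CRC32 and the length
    deflate = zdata[2:-4]         # drop 2-byte header and 4-byte Adler-32
    # The gzip trailer is CRC32 then size mod 2^32, both little-endian.
    trailer = struct.pack("<II", zlib.crc32(raw), len(raw) & 0xFFFFFFFF)
    return GZIP_HEADER + deflate + trailer
```

Writing the result to a file and running gzip -t on it is one way to check the output.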

score 2 · Answer 12 · answered Nov 26 '20 at 14:45

I get that the author doesn't want to use Python, but I believe a Python 3 one-liner is a natural choice for most Linux users, so here it is:

python3 -c 'import sys,zlib; sys.stdout.write(zlib.decompress(sys.stdin.buffer.read()).decode())' < $COMPRESSED_FILE_PATH

answered Nov 26 '20 at 14:45 by Eugene Shatsky

1

The .decode() here makes this useless for binary data, unfortunately – mystery Apr 23 '22 at 10:19
Another .buffer fixes that: python3 -c 'import sys,zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))' – D0SBoots Jul 09 '23 at 10:04

Annie Y · Answer 13 · 2020-07-07T12:14:02.617

0

The simple inflate program pufftest.c, found in contrib/puff of the zlib package by Mark Adler himself, can handle raw zlib data without header bytes and Adler-32 checksum. Compile with cc -o pufftest puff.c pufftest.c, and to inflate: pufftest < raw.zlib > decompressed. Note that it can't deflate.

answered Jul 07 '20 at 08:35 by Annie Y · edited Jul 07 '20 at 12:14

score -5 · Answer 14 · edited Oct 17 '17 at 07:31

zcat -f infile > outfile

works for me on fedora25

answered Oct 17 '17 at 07:27 by sigxcpu · edited Oct 17 '17 at 07:31 by muru

6

zcat only works with files in the gzip format. – Anthony Geoghegan Oct 17 '17 at 08:50
