203

I have created zlib-compressed data in Python, like this:

import zlib
s = b'...'  # payload to compress; zlib.compress needs bytes on Python 3
z = zlib.compress(s)
with open('/tmp/data', 'wb') as f:  # binary mode, since z is raw bytes
    f.write(z)

(or one-liner in shell: echo -n '...' | python2 -c 'import sys,zlib; sys.stdout.write(zlib.compress(sys.stdin.read()))' > /tmp/data)

Now, I want to uncompress the data in shell. Neither zcat nor uncompress work:

$ cat /tmp/data | gzip -d -
gzip: stdin: not in gzip format

$ zcat /tmp/data 
gzip: /tmp/data.gz: not in gzip format

$ cat /tmp/data | uncompress -
gzip: stdin: not in gzip format

It seems that I have created a gzip-like file, but without any headers. Unfortunately, I don't see any option to uncompress such raw data in the gzip man page, and the zlib package does not contain any executable utility.

Is there a utility to uncompress raw zlib data?

1

14 Answers

216

It is also possible to decompress it with a standard shell and gzip, if you don't have, or don't want to use, other tools.
The trick is to prepend the gzip magic number and compress method to the actual data from zlib.compress:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" |cat - /tmp/data |gzip -dc >/tmp/out

Edits:
@d0sboots commented: For RAW Deflate data, you need to add 2 more null bytes:
"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"

This Q on SO gives more information about this approach. An answer there suggests that there is also an 8 byte footer.

Users @Vitali-Kushner and @mark-bessey reported success even with truncated files, so a gzip footer does not seem strictly required.

@tobias-kienzler suggested this shell function:
zlibd() (printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc)
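
Why 8 bytes work for zlib data and 10 for raw deflate can be seen in a small Python sketch. A gzip header is 10 bytes long (RFC 1952), so with only 8 bytes prepended, the two zlib header bytes end up in the last two header fields (XFL and OS). The sample payload and the helper name gunzip_lenient are made up for this illustration, and the Adler-32 is cut off on purpose so that the decoder just sees a truncated gzip stream instead of a wrong checksum:

```python
import zlib

# Hypothetical sample payload, used only for this demonstration.
data = b"hello zlib " * 10
z = zlib.compress(data)  # 2-byte zlib header + deflate stream + 4-byte Adler-32

# ID1, ID2, CM=8 (deflate), FLG=0, MTIME=0: the same 8 bytes as in the printf above.
GZIP_PREFIX = b"\x1f\x8b\x08\x00\x00\x00\x00\x00"


def gunzip_lenient(blob):
    """Inflate a gzip-framed blob without requiring a complete trailer."""
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS selects gzip framing
    return d.decompress(blob)


# zlib stream: its two header bytes fill the XFL and OS slots of the gzip header.
# The trailing Adler-32 is dropped because it is not a gzip trailer.
from_zlib = gunzip_lenient(GZIP_PREFIX + z[:-4])

# Raw deflate: no zlib header is present, so two extra null bytes fill XFL and OS.
raw = z[2:-4]
from_raw = gunzip_lenient(GZIP_PREFIX + b"\x00\x00" + raw)

print(from_zlib == data, from_raw == data)
```

The command-line gzip is more forgiving than a strict decoder here: it writes out what it could inflate and only then complains about the trailer.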

4
  • 4
    gzip doesn't work, but zlib-flate does (pdf page content stream). Commented May 16, 2017 at 12:23
  • 1
    inverse operation: cat input.txt | gzip -c | tail -c +9 >compressed.gzbody to remove the first 8 bytes
    – milahu
    Commented Apr 20, 2022 at 16:31
  • Small detail about shell functions: the suggested function uses parentheses as block delimiters, which causes the unnecessary creation of a subshell. Using curly braces: zlibd() { printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - "$@" | gzip -dc; } Commented Feb 2, 2023 at 15:51
  • 2
    TFW you find a great answer, see yourself quoted in the answer, but can't find the comment that you were quoted from anywhere...
    – D0SBoots
    Commented Jul 9, 2023 at 9:23
159
zlib-flate -uncompress < IN_FILE > OUT_FILE

I tried this and it worked for me.

zlib-flate can be found in package qpdf (in Debian Squeeze, Fedora 23, and brew on MacOS according to comments in other answers)

(Thanks to user @tino who provided this as a comment below the OpenSSL answer. Made into a proper answer for easy access.)

6
  • 5
    In contrast to the other answers, this one works on OS X.
    – polym
    Commented Dec 29, 2015 at 18:32
  • 3
    @polym, how did you get zlib-flate installed on macOS? I don't see it anywhere.
    – Wildcard
    Commented Oct 2, 2016 at 4:34
  • 8
    @Wildcard sorry for the late response. I think it came with the qpdf package that I've installed with brew as mentioned in the comment above - or look at the last sentence of this answer :). Also, qpdf is really cool, so have a look at it too if you have time!
    – polym
    Commented Oct 16, 2016 at 13:28
  • 2
    brew install qpdf, then the command listed above :-) thank you! Commented Sep 18, 2019 at 15:13
  • 3
    This is really useful if you're learning how git objects are stored, using this instead of git cat-file -p works fine! Commented Jan 21, 2020 at 22:36
94

I have found a solution (one of the possible ones), it's using openssl:

$ openssl zlib -d < /tmp/data

or

$ openssl zlib -d -in /tmp/data

*NOTE: zlib functionality is apparently available in recent openssl versions >=1.0.0 (OpenSSL has to be configured/built with zlib or zlib-dynamic option, the latter is default)

6
  • 30
    On Debian Squeeze (which has OpenSSL 0.9.8) there is zlib-flate in the qpdf package. It can be used like zlib-flate -uncompress < FILE.
    – Tino
    Commented Sep 16, 2012 at 14:09
  • 24
    zlib got removed from the latest versions of OpenSSL so this tip is very helpful @Tino Commented Dec 2, 2014 at 10:59
  • 1
    Thanks. This solution provides a better experience in decompressing short input files than the answer using "gzip" ("openssl" decompressed as much as it could while "gzip" aborted printing "unexpected end of file").
    – Daniel K.
    Commented Sep 16, 2015 at 10:01
  • 2
    @Tino this should be a separate answer
    – Catskul
    Commented Nov 1, 2015 at 3:16
  • 1
    @Tino, it is also available via the package qpdf on Fedora 23. Alexandr Kurilin, zlib is still available in 1.0.2d-fips. Commented Nov 24, 2015 at 8:37
74

I recommend pigz from Mark Adler, co-author of the zlib compression library. Execute pigz to see the available flags.

You will notice:

-z --zlib Compress to zlib (.zz) instead of gzip format.

You can uncompress using the -d flag:

-d --decompress --uncompress Decompress the compressed input.

Assuming a file named 'test':

  • pigz -z test - creates a zlib compressed file named test.zz
  • pigz -d -z test.zz - converts test.zz to the decompressed test file

On OSX you can execute brew install pigz
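
If a tool refuses the data, it can help to check first which container the file really uses. A minimal sketch, assuming only the rules from RFC 1950 (zlib) and the gzip magic number; sniff_format is a hypothetical helper name, not part of pigz or zlib:

```python
import zlib


def sniff_format(head):
    """Classify a stream by its first two bytes (hypothetical helper)."""
    if len(head) < 2:
        return "unknown"
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    cmf, flg = head[0], head[1]
    # RFC 1950: the low nibble of CMF is 8 (deflate), the high nibble (window
    # size) is at most 7, and CMF*256 + FLG is a multiple of 31.
    if cmf & 0x0F == 8 and cmf >> 4 <= 7 and (cmf * 256 + flg) % 31 == 0:
        return "zlib"
    return "unknown"


sample = zlib.compress(b"example")
print(sniff_format(sample))
```

Raw deflate data has no header at all, so it can only be reported as "unknown" by a check like this.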

5
  • 8
    Good find! It looks like it can detect zlib files by itself, so unpigz test.zz will work as well. Commented Sep 26, 2016 at 12:55
  • did not decompress my data.
    – cybernard
    Commented Jan 26, 2019 at 1:09
  • 1
    @cybernard perhaps you don't have a zlib file. check with: $>file hello.txt.zz hello.txt.zz: zlib compressed data
    – snodnipper
    Commented Feb 1, 2019 at 12:08
  • 1
    Worked well with partial files too.
    – Joe DF
    Commented Mar 5, 2020 at 16:52
  • Playing around with git objects, the following will work: unpigz -z < FILE
    – abetusk
    Commented Aug 29, 2022 at 15:44
13

macOS is a fully POSIX-compliant UNIX (formally certified!), but its OpenSSL has no zlib support and there is no zlib-flate either. The first solution works, as do all the Python solutions, but the first solution requires the compressed data to be in a file, and all the other solutions force you to create a Python script.

Here's a Perl based solution that can be used as a command line one-liner, gets its input via STDIN pipe and that works out of the box with a freshly installed macOS:

cat file.compressed | perl -e 'use Compress::Raw::Zlib;my $d=new Compress::Raw::Zlib::Inflate();my $o;undef $/;$d->inflate(<>,$o);print $o;'

Nicer formatted, the Perl script looks like this:

use Compress::Raw::Zlib;
my $decompressor = new Compress::Raw::Zlib::Inflate();
my $output;
undef $/;
$decompressor->inflate(<>, $output);
print $output;

Optimized version from Marco d'Itri (see comments):

cat file.compressed | perl -MCompress::Zlib -E 'undef $/;print uncompress(<>)'
1
  • 3
    A shorter solution is: perl -MCompress::Zlib -E 'undef $/;print uncompress(<>)' Commented Jan 11, 2022 at 6:00
11

zlib implements the compression used by gzip, but not the file format. Instead, you should use the gzip module, which itself uses zlib.

import gzip
s = b'...'  # payload to compress; must be bytes on Python 3
with gzip.open('/tmp/data', 'wb') as f:
    f.write(s)
4
  • ok, but my situation is that i have tens/hundreds thousands of those files created, so.. :)
    – mykhal
    Commented Sep 20, 2011 at 22:14
  • 1
    so... your files are incomplete. Perhaps you'll have to uncompress them with zlib and recompress them with gzip, if you don't still have the original data. Commented Sep 20, 2011 at 22:18
  • 6
    @mykhal, why did you create ten/hundred thousands of files before checking that you could actually uncompress them?
    – Harpyon
    Commented Sep 20, 2011 at 22:19
  • 3
    harpyon, i can uncompress them, i just wonder which more or less common utility or gzip settings can be used for that, if i don't want to do it in python again
    – mykhal
    Commented Sep 20, 2011 at 22:47
7

The example program zpipe.c found here by Mark Adler himself (comes with the source distribution of the zlib library) is very useful for these scenarios with raw zlib data. Compile with cc -o zpipe zpipe.c -lz and to decompress: zpipe -d < raw.zlib > decompressed. It can also do the compression without the -d flag.

6

This might do it:

import zlib
import sys

# sys.argv[0] is the script itself, so only the remaining arguments are files
for filename in sys.argv[1:]:
    with open(filename, 'rb') as compressed:
        with open(filename + '-decompressed', 'wb') as expanded:
            data = zlib.decompress(compressed.read())
            expanded.write(data)

Then run it like this:

$ python expander.py data/*
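
If the number of files is too large for a shell wildcard, the same loop can walk the directory tree itself. A rough sketch, assuming every regular file under the directory is meant to be a zlib stream; the function name expand_tree and the '-decompressed' suffix handling are illustrative choices:

```python
import os
import zlib


def expand_tree(root, suffix="-decompressed"):
    """Decompress every zlib file under root; return the paths that failed."""
    failed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                continue  # skip output left over from an earlier run
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as compressed:
                    data = zlib.decompress(compressed.read())
            except (zlib.error, OSError):
                failed.append(path)  # not zlib data, or unreadable
                continue
            with open(path + suffix, "wb") as expanded:
                expanded.write(data)
    return failed
```

Calling expand_tree('data') would then replace the data/* wildcard, and the returned list shows which files could not be decompressed.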
3
  • thanks, i know about zlib.decompress. probably i'd use some walk function. i'm not sure if shell would handle my huge amount of files with glob wildcard :)
    – mykhal
    Commented Sep 20, 2011 at 22:28
  • The file that is created by expanded still checks out as "zlib compressed data" for me, using the shell file command? How is that? Commented Nov 30, 2018 at 0:18
  • nope doesn't work for me even with the fake header.
    – cybernard
    Commented Jan 26, 2019 at 1:02
2

You can use this to compress with zlib:

openssl enc -z -none -e < /file/to/deflate

And this to inflate (decompress):

openssl enc -z -none -d < /file/to/deflate
2
  • 8
    Gives unknown option '-z' on Ubuntu 16.04 and OpenSSL 1.0.2g 1 Mar 2016
    – Tino
    Commented May 22, 2018 at 10:50
  • 2
    same error on Mac Commented Nov 30, 2018 at 0:13
2

During development of eIDAS-related code, I came up with a bash script that decodes the SSO (Single Sign-On) SAMLRequest param, which is usually encoded with base64 and raw deflate (PHP gzdeflate).

#!/bin/bash
# file decode_saml_request.sh

urldecode() { : "${*//+/ }"; echo -e "${_//%/\\x}"; }

# read the input from the file given as $1, or from stdin when no argument is given
contents=$(cat "${1:--}")

if [[ $contents == *"SAMLRequest="* ]]; then
  # extract param SAMLRequest from URL, strip all following params
  contents=$(printf '%s\n' "$contents" | awk -F 'SAMLRequest=' '{print $2}' | awk -F '&' '{print $1}')
fi
# otherwise work with the raw base64 encoded string as-is

# add gzip raw-deflate header bytes and gunzip (`gzip -dc` can be replaced by `gunzip`)
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - <(urldecode "$contents" | base64 -d) | gzip -dc

You can use it like

> decode_saml_request.sh /path/to/file_with_sso_url
# or
> echo "y00tLk5MT1VISSxJBAA%3D" | decode_saml_request.sh

The script is also published as a gist here: https://gist.github.com/smarek/77dacb9703ac8b715b5eced5314d5085 so I may not maintain this answer, but I will maintain the source gist.
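
For comparison, the same decoding steps can be sketched in Python, where raw deflate is selected with a negative wbits value instead of a faked gzip header. The function names are hypothetical, and the encoder exists only to make the round trip checkable:

```python
import base64
import zlib
from urllib.parse import quote, unquote


def decode_saml_request(value):
    """URL-decode, base64-decode, then inflate a raw-deflate SAMLRequest value."""
    raw = base64.b64decode(unquote(value))
    # Negative wbits means raw deflate: no zlib header and no checksum.
    return zlib.decompress(raw, -zlib.MAX_WBITS)


def encode_saml_request(xml_bytes):
    """Inverse of decode_saml_request, included only for the round trip."""
    co = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    raw = co.compress(xml_bytes) + co.flush()
    return quote(base64.b64encode(raw).decode("ascii"), safe="")


sample = b"<samlp:AuthnRequest/>"  # placeholder, not a real request
print(decode_saml_request(encode_saml_request(sample)) == sample)
```

Unlike the urldecode function in the script, unquote leaves a literal + untouched, which seems the safer choice here because + is also a valid base64 character.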

2

I have an addition to @Alex Stragies conversion for those who need a proper header and footer (an actual conversion from zlib to gzip).

It would probably be easier to use one of the above methods, however if the reader has a case like mine which requires conversion of zlib to gzip without decompression and recompression, this is the way to do it.

According to RFC 1950/1952, a zlib file can only have a single stream or member. This is different from gzip in that:

A gzip file consists of a series of "members" (compressed data sets). ... The members simply appear one after another in the file, with no additional information before, between, or after them.

This means that while a single zlib file can always be converted to a single gzip file, the converse is not strictly true. Something to keep in mind.

zlib has both a header (2 bytes) and a footer (4 bytes) which must be removed from the data so that the gzip header and footer can be appended. One way of doing that is as follows:

# Remove zlib 4 byte footer
trunc_size=$(ls -l infile.z | awk '{print $5 - 4}')
truncate -s $trunc_size infile.z


# Remove zlib 2 byte header
dd bs=1M iflag=skip_bytes skip=2 if=infile.z of=tmp1.z

Now we have just raw data and may append the gzip header (from @Alex Stragies)

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00" | cat - tmp1.z > tmp2.z

The gzip footer is 8 bytes long. It consists of the CRC32 of the uncompressed file, plus the size of the uncompressed file mod 2^32, both in little-endian format (least significant byte first, which is why the hex digits are reversed pairwise below). If you don't know these but have means of getting an uncompressed file:

generate_crcbig() {
    crc=$(crc32 "$uncompressedfile")
    # reverse the byte order of the 8 hex digits (least significant byte first)
    crcbig=$(echo "\x${crc:6:2}\x${crc:4:2}\x${crc:2:2}\x${crc:0:2}")
}

generate_lbig () {
    leng=$(ls -l "$uncompressedfile" | awk '{print $5}')
    lmod=$(expr $leng % 4294967296) # mod 2^32
    lhex=$(printf "%08x" $lmod) # zero-pad to 8 hex digits so the slices below line up
    lbig=$(echo "\x${lhex:6:2}\x${lhex:4:2}\x${lhex:2:2}\x${lhex:0:2}")
}

And then, after calling both functions, the footer may be appended to the file with the gzip header (tmp2.z) as such:

printf $crcbig$lbig | cat tmp2.z - > outfile.gz

Now you have a file which is in the gzip format! It can be verified with gzip -t outfile.gz and uncompressed with any application complying with gzip specifications.
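
The same re-wrapping can be sketched in a few lines of Python, which may be easier to check than the shell arithmetic. This assumes a zlib stream without a preset dictionary (so the header is exactly 2 bytes), and zlib_to_gzip is a name chosen for this example. RFC 1952 stores both trailer fields least significant byte first, hence the "<II" format:

```python
import gzip
import struct
import zlib


def zlib_to_gzip(zdata):
    """Re-wrap a zlib stream as one gzip member, reusing the deflate payload."""
    body = zdata[2:-4]              # drop the 2-byte header and 4-byte Adler-32
    plain = zlib.decompress(zdata)  # needed only for the CRC-32 and the length
    header = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"
    trailer = struct.pack(
        "<II",
        zlib.crc32(plain) & 0xFFFFFFFF,  # CRC-32 of the uncompressed data
        len(plain) & 0xFFFFFFFF,         # uncompressed size mod 2^32
    )
    return header + body + trailer


original = b"some text " * 50  # made-up sample data
converted = zlib_to_gzip(zlib.compress(original))
print(gzip.decompress(converted) == original)
```

The data is inflated once to obtain the checksum and length, but the compressed payload itself is copied unchanged, so nothing is recompressed.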

2

I get that the author doesn't want to use Python, but I believe a Python 3 one-liner is the natural choice for most Linux users, so here it is:

python3 -c 'import sys,zlib; sys.stdout.write(zlib.decompress(sys.stdin.buffer.read()).decode())' < $COMPRESSED_FILE_PATH
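
For large inputs, reading everything into memory at once may not be desirable. A possible streaming variant, written as a sketch with a hypothetical function name and an arbitrary chunk size, which also keeps the data binary:

```python
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, inflating zlib data chunk by chunk."""
    d = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(d.decompress(chunk))
    dst.write(d.flush())
    return d.eof  # False if the input ended before the stream was complete
```

Used as a filter, it would be called as inflate_stream(sys.stdin.buffer, sys.stdout.buffer) after importing sys.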

2
  • 1
    The .decode() here makes this useless for binary data, unfortunately
    – mystery
    Commented Apr 23, 2022 at 10:19
  • Another .buffer fixes that: python3 -c 'import sys,zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))'
    – D0SBoots
    Commented Jul 9, 2023 at 10:04
0

The simple inflate program pufftest.c, found in contrib/puff of the zlib package by Mark Adler himself, can handle raw zlib data without header bytes and Adler32 checksum. Compile with cc -o pufftest puff.c pufftest.c and to inflate: pufftest < raw.zlib > decompressed. Note, it can't deflate.

-5
zcat -f infile > outfile 

works for me on fedora25

1
  • 6
    zcat only works with files in the gzip format. Commented Oct 17, 2017 at 8:50
