
This is the opposite of the issue that all my searches kept turning up answers to, where people wanted plain text but got compressed data.

I'm writing a bash script that uses curl to fetch the mailing list archive files from a Mailman mailing list (using the standard Mailman web interface on the server end).

The file (for this month) is http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz (sanitized URL).

When I save this with my browser I get, in fact, a gzipped text file, which when ungzipped contains what I expect.

When I fetch it with Curl (after previously sending the login password, getting a cookie set, and saving that cookie file to use in the request), though, what comes out on stdout (or is saved to a -o file) is the UNCOMPRESSED text.

How can I get Curl to just save the data into a file like my browser does? (Note that I am not using the --compressed flag in my Curl call; this isn't a question of the server compressing data for transmission, it's a question of downloading a file that's compressed on the server's disk, which I want to keep compressed.)

(Obviously I can hack around this by re-compressing it in my bash script. That wastes CPU resources, though, and is a problem waiting to happen in the future. Or I can leave it uncompressed, hack the name, and store it as just September.txt; that wastes disk space instead, and would likewise break if the behavior changed in the future. The problem seems to me to be that Curl is getting confused between compressed transmittal and actual compressed data.)
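For reference, a minimal sketch of the workflow being described, assuming a cookie file named cookies.txt and a Mailman-style login form; the form field names here are assumptions, not taken from the actual script:

# Log in and save the session cookie (field names are assumptions).
curl -c cookies.txt -d 'username=me@example.com' -d 'password=SECRET' \
     http://lists.example.com/private.cgi/listname-domain.com/

# Fetch the archive, reusing the saved cookie; the -o file ends up
# containing the uncompressed text, which is the problem.
curl -b cookies.txt -o 2013-September.txt.gz \
     http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz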

  • Show your HTTP response headers. (Commented Oct 1, 2013 at 5:04)
  • Are you specifying --tr-encoding? – devnull (Commented Oct 1, 2013 at 5:06)
  • How are you verifying that the file has been ungzipped? (I know that sounds like a weird question, but if the answer is "I looked at it", then with what tool?) (Or, to be less mysterious, if you looked at the file with less, try less -L) – rici (Commented Oct 1, 2013 at 5:12)
  • Also the exact command you are using. (Commented Oct 1, 2013 at 5:43)
  • possible duplicate of How to properly handle a gzipped page when using curl? – Jayan (Commented Oct 1, 2013 at 8:50)

2 Answers


Is it possible the server is decompressing the file based on headers sent (or not sent) by curl? Try the following header with curl:

--header 'Accept-Encoding: gzip,deflate'
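For example, combined with the cookie file from the question (a sketch; the cookie-file name cookies.txt is an assumption):

curl -b cookies.txt \
     --header 'Accept-Encoding: gzip,deflate' \
     -o 2013-September.txt.gz \
     http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz

With that header, the server should send the file's gzip bytes as-is rather than decompressing them before transmission.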

You can download the *.txt.gz file directly, without any decompression, by using wget instead of curl.

wget http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz
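Note that the private archive in the question requires a login, so you would still need to pass the session cookie. wget reads Netscape-format cookie files, which is the format curl writes with -c (a sketch, assuming the cookie file is named cookies.txt):

wget --load-cookies cookies.txt \
     http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz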

If curl is essential, then check out the details here.
