wget downloads garbled text; how do I get the correct file?

Question

This is a file I need to get for an assignment: pg100.txt, available at https://www.gutenberg.org/cache/epub/100/pg100.txt. I log in to a Linux machine with ssh user@machine and run:

wget https://www.gutenberg.org/cache/epub/100/pg100.txt

I get a file, but its contents are garbled. I want to know: 1) How can I get the correct text file? 2) Why is the text garbled when I download it with wget, even though it opens normally in a browser? I log in to the remote server (CentOS 7) from my Windows 10 machine via PuTTY.

I tried asking on SO, but their bot redirected me here. If this is not the right place to ask, let me know where to ask.

brainchild · Accepted Answer · 2019-10-14 13:15:13Z

Web servers provide information about the response body in the response header.

To see only the header, we can run:

$ wget --spider --server-response https://www.gutenberg.org/cache/epub/100/pg100.txt  
Spider mode enabled. Check if remote file exists.
--2019-10-14 09:13:55--  https://www.gutenberg.org/cache/epub/100/pg100.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: Apache
  Content-Location: pg100.txt.utf8.gzip
  Vary: negotiate
  TCN: choice
  Last-Modified: Sun, 01 Oct 2017 05:16:47 GMT
  X-Frame-Options: sameorigin
  X-Connection: Close
  Content-Type: text/plain; charset=utf-8
  Content-Encoding: gzip
  X-Powered-By: 1
  Content-Length: 2023394
  Date: Mon, 14 Oct 2019 13:13:55 GMT
  X-Varnish: 1859043781 1856607983
  Age: 104391
  Via: 1.1 varnish
Length: 2023394 (1.9M) [text/plain]
Remote file exists.
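The header dump is long, and only one line matters here. As a minimal sketch, the helper below (content_encoding is a hypothetical name, not part of wget) reads that kind of output on standard input and prints the value of the Content-Encoding header, if there is one:

```shell
# Hypothetical helper: print the Content-Encoding value found in
# "wget --server-response" output read from standard input.
content_encoding() {
    # Split each line at the first ": " and compare the header name
    # case-insensitively, ignoring wget's leading indentation.
    awk -F': *' 'tolower($1) ~ /^[ \t]*content-encoding$/ { print $2; exit }'
}
```

wget writes the headers to standard error, so the stream has to be redirected before piping:

wget --spider --server-response https://www.gutenberg.org/cache/epub/100/pg100.txt 2>&1 | content_encoding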

The Content-Encoding: gzip header shows that the body is compressed with gzip, which is why the saved file looks garbled. We can use gunzip to decompress it:

$ wget -O - https://www.gutenberg.org/cache/epub/100/pg100.txt | gunzip -c > pg100.txt

A modern browser does this decompression automatically when it sees the Content-Encoding header, which is why the file looks normal there.
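If the file has already been saved in compressed form, or if it is not certain whether the server will compress the response, the decompression step can be made conditional. This is only a sketch: ungzip_if_needed is a hypothetical helper name, and it assumes gzip and gunzip are installed on the machine.

```shell
# Hypothetical helper: decompress a downloaded file in place,
# but only if its contents really are a gzip stream.
ungzip_if_needed() {
    f=$1
    # "gzip -t" exits 0 only when its input is valid gzip data.
    if gzip -t < "$f" 2>/dev/null; then
        # Write to a temporary file first so a failure cannot
        # truncate the original download.
        gunzip -c < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        echo "decompressed: $f"
    else
        echo "not gzip, left unchanged: $f"
    fi
}
```

After a plain wget of the URL, running ungzip_if_needed pg100.txt should leave readable text in pg100.txt, and running it a second time leaves the file alone.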
