1

So I want to download all images from a web server, particularly JPEGs. The command I am running looks legit, and I know the website has JPEGs on it. For example:

wget -r -P C:/ -A.jpg http://somesitewithjpegs.com

It is my understanding that this command will scan the whole server recursively, looking only for JPEG images, and then download them to my C:/ drive. For some reason this is not working.

Looking at the page source, I can see that the images are not directly embedded in the page but are hosted in another directory on the server. Is this why wget is failing to download these images?

1
  • This might only scan the start page for links to JPEGs.
    – Basilevs
    Commented Dec 9, 2013 at 3:17

2 Answers

2

To answer my own question: it is true that wget can only follow links and download files directly. Since most images sit in a directory that doesn't support directory listings or has access restrictions, wget has no way of parsing the contents of that directory.

A good example of this is a WordPress site that stores images under the wp-content folder. Attempting to traverse this folder yields a 403 Forbidden error. Even though we can see the image in our browser as a linked picture, wget cannot discover it by listing the directory, because the directory allows no direct access.
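One rough way to check this yourself is to ask wget whether a URL is reachable without downloading anything. This is only a sketch: the helper name check_url and the example URLs in the comments are made up for illustration, and it assumes a wget build that supports --spider. It tests the directory and a single file separately, since a forbidden directory listing does not by itself tell you whether an individual file can be fetched.

```shell
# Report whether a URL answers successfully, without saving anything.
# wget --spider exits non-zero when the server returns an error such as 403.
check_url() {
  if wget -q --spider "$1"; then
    echo "reachable: $1"
  else
    echo "not reachable: $1"
  fi
}

# Hypothetical URLs, for illustration only:
# check_url http://somesitewithjpegs.com/wp-content/uploads/
# check_url http://somesitewithjpegs.com/wp-content/uploads/photo.jpg
```

If the directory URL is reported as not reachable but a file URL inside it is reachable, the problem is link discovery rather than access to the file itself.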

Somebody can add to this answer if I'm missing details or not explaining the process correctly.

0

Is this why wget is failing to download these images?

Answer: Maybe, most probably.

Try adding these options:

-l1 -H

The -H option tells wget to span hosts, meaning it should follow links that point away from the site (maybe the images are served from a different server). The -l1 option means to go only one level deep: that is, don't follow links on the linked site. This way you might be able to download content from a different server that hosts the image files.
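Combined with the command from the question, that could look roughly like the sketch below. The function name fetch_jpegs and the ./images directory are made up for illustration, and the URL in the usage comment is the placeholder from the question. Adding jpeg to the -A list is my own assumption, in case some files use that extension.

```shell
# Recursively fetch JPEGs starting from a URL, one level deep, spanning hosts.
#   $1 = start URL, $2 = directory to save files into
fetch_jpegs() {
  # -r  recurse         -l1 limit depth to one level
  # -H  span hosts      -A  accept only these file suffixes
  # -P  directory prefix where downloads are stored
  wget -r -l1 -H -A 'jpg,jpeg' -P "$2" "$1"
}

# Usage (placeholder URL from the question):
# fetch_jpegs http://somesitewithjpegs.com ./images
```

Keep in mind that even with -A, wget still downloads the HTML pages so it can extract links from them, and then deletes them afterwards.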

IIRC, when mirroring a complete WordPress site, you are able to access images from the wp-content folder, though.
