The command below is the proper way to download a website with all its images and CSS files so that the local copy has the same layout as the original, but I don't understand why the -K (--backup-converted) and -E (--adjust-extension) options are necessary.

After the website is updated, how do I update my downloaded copy? Do I just run the same command again?

wget -mpHkKEb -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' http://www.example.com
  • -m (--mirror) : turn on options suitable for mirroring (infinite recursive download and timestamps).

  • -p (--page-requisites) : download all files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets.

  • -H (--span-hosts): enable spanning across hosts when doing recursive retrieving.

  • -k (--convert-links) : after the download, convert the links in the document for local viewing.

  • -K (--backup-converted) : when converting a file, back up the original version with a .orig suffix. Affects the behavior of -N.

  • -E (--adjust-extension) : append a suitable extension (such as .html or .css) to the local file name when the content type calls for it.

  • -b (--background) : go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.

  • -e (--execute) : execute a command as if it were part of .wgetrc; here robots=off tells wget to ignore robots.txt.

  • -t number (--tries=number) : set number of tries to number.

  • -U (--user-agent) : identify as agent-string to the HTTP server. Some servers may ban you permanently for recursive downloading if you send the default user agent.
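
The bundled short flags are hard to read at a glance. As a rough sketch, the same command can be spelled out with the long option names from the list above. The helper name build_mirror_cmd is made up for illustration, and the function only prints the command so it can be checked before anything is downloaded.

```shell
#!/bin/sh
# Hypothetical helper: print the wget command from the question using
# long option names. Nothing is downloaded; the command is only printed.
build_mirror_cmd() {
    url=$1
    # Same User-Agent string as in the question.
    ua='Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0'
    # printf reuses the '%s ' format for every argument that follows it.
    printf '%s ' wget \
        --mirror \
        --page-requisites \
        --span-hosts \
        --convert-links \
        --backup-converted \
        --adjust-extension \
        --background \
        --tries=1 \
        --execute robots=off
    # Keep the quotes around the User-Agent so the printed line can be pasted.
    printf '%s\n' "--user-agent='$ua' $url"
}

build_mirror_cmd http://www.example.com
```

The printed line should be equivalent to the short-flag command above; dropping --background keeps wget in the foreground if you would rather watch its progress.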

1 Answer

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

I've used this in the past.

From "Make Offline Mirror of a Site using wget":

Explanation of the various flags:

  • --mirror – Makes (among other things) the download recursive.
  • --convert-links – Convert all the links (also to stuff like CSS stylesheets) to relative, so it will be suitable for offline viewing.
  • --adjust-extension – Adds suitable extensions to filenames (html or css) depending on their content-type.
  • --page-requisites – Download things like CSS style-sheets and images required to properly display the page offline.
  • --no-parent – When recursing, do not ascend to the parent directory. It is useful for restricting the download to only a portion of the site.
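
As for updating: since --mirror turns on timestamping, re-running the same command from the directory that holds the earlier download should refresh the copy rather than start over. A minimal sketch, assuming that layout; MIRROR_DIR, WGET and update_mirror are illustrative names, not wget features.

```shell
#!/bin/sh
# Sketch: refresh an existing mirror by re-running the same wget command
# from the directory that holds the earlier download.
# MIRROR_DIR, WGET and update_mirror are illustrative names, not part of wget.
MIRROR_DIR=${MIRROR_DIR:-$HOME/mirrors}
WGET=${WGET:-wget}   # set WGET=echo for a dry run that only prints the arguments

update_mirror() {
    url=$1
    mkdir -p "$MIRROR_DIR" || return 1
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$MIRROR_DIR" || exit 1
        "$WGET" --mirror --convert-links --adjust-extension \
                --page-requisites --no-parent "$url"
    )
}

# To run it for real: update_mirror http://example.org
```

According to the wget manual, files renamed by --adjust-extension are downloaded again on every run unless --convert-links and --backup-converted are both given, so adding --backup-converted (as in the question's command) may cut down on repeated transfers.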
