
How would you mass-download files from a web page and rename them using the link text (description) of each href?

The idea is that the downloaded files get descriptive names, unlike the original file names, which are anything but descriptive.

For example, given that a web page contains the following link

<a href='http://www.example.com/docs/ex160.pdf'>Advanced Foo Bar</a>

Ideally, I would like to save it as "Advanced Foo Bar.pdf", but even "Advanced Foo Bar" would be fine, as I can use a bulk renaming utility to add the .pdf extension to the hundred or so files I have to download.

I have been using the FlashGotAll extension for Firefox to download, and it works splendidly for bulk downloading, except there is no built-in renaming function.

I can also fire up Linux (or use Cygwin) and use curl or wget, if need be, for this solution.

1 Answer


Assuming that the HTML content is as clean as your example (i.e. only one href per line, not split across several lines, no mix of HREF and href, consistent quoting, etc.), you can download the page and run

prompt$ grep www.example.com the_page.html | sed "s/.*href='\([^']\+\)'>\([^<]*\)<.*/wget -O '\2.pdf' \1/" | tee files_to_download
wget -O 'Advanced Foo Bar.pdf' http://www.example.com/docs/ex160.pdf
...
prompt$

Edit files_to_download if applicable, and then download by running sh files_to_download.
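If the HTML is a bit messier (extra attributes inside the anchor tag, several tags on one line), a sketch that first isolates each anchor with grep -o before building the wget commands may be more forgiving. This is only a sketch under the same single-quoted-href assumption; page.html stands in for whatever page you downloaded, and it only generates the command list rather than downloading anything:

```shell
# Stand-in for the real downloaded page (replace with your own page.html).
cat > page.html <<'EOF'
<p><a href='http://www.example.com/docs/ex160.pdf'>Advanced Foo Bar</a></p>
EOF

# 1. grep -o pulls each <a ...>text</a> anchor out onto its own line,
#    even if the surrounding line contains other markup.
# 2. sed rewrites each anchor into one wget command, using the link
#    text as the output file name.
grep -o "<a[^>]*href='[^']*'[^>]*>[^<]*</a>" page.html \
  | sed "s/.*href='\([^']*\)'[^>]*>\([^<]*\)<.*/wget -O '\2.pdf' '\1'/" \
  > files_to_download

cat files_to_download
```

As with the one-liner above, review files_to_download before running it with sh.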

