I would like to crawl a backup site I lost access to. The site is backed up at subdomain.somesite.com, while the links in its pages point to www.subdomain.com.
This leads to the following situation:
the link http://subdomain.somesite.com/?page_id=number works, but the link in the actual HTML is http://www.subdomain.com/?page_id=number and doesn't work.
Any ideas on how to crawl it without writing a custom crawler?
I have access to www.subdomain.com, which runs on WordPress. One idea is to redirect every page matching the pattern /?page_id=number.
For example, http://www.subdomain.com/?page_id=255 would redirect to http://subdomain.somesite.com/?page_id=255
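
To make the mapping concrete, here is a minimal Python sketch of the host rewrite I have in mind. It is not a crawler and not a WordPress redirect, only the URL transformation itself. The function name rewrite_link is hypothetical, and I am assuming the backup host is subdomain.somesite.com as stated at the top.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: links in the HTML use OLD_HOST, the backup answers on NEW_HOST.
OLD_HOST = "www.subdomain.com"
NEW_HOST = "subdomain.somesite.com"


def rewrite_link(url: str) -> str:
    """Swap the host of a link that points at OLD_HOST; leave other links alone."""
    parts = urlsplit(url)
    if parts.hostname == OLD_HOST:
        # Keep scheme, path, and query (e.g. ?page_id=255); change only the host.
        parts = parts._replace(netloc=NEW_HOST)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(rewrite_link("http://www.subdomain.com/?page_id=255"))
```

Whatever does the redirect or the link conversion would need to apply this same rule to every link it follows.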