
Is there any way to clone a MediaWiki site that's been abandoned by the owner and all admins? None of the admins have been seen in 6 months and all attempts to contact any of them over the past 3-4 months have failed and the community is worried for the future of the Wiki. We have all put countless man-hours into the Wiki and to lose it now would be beyond devastating.

What would be the simplest way to go about this?

Thanks.

  • While it may be possible to scrape the data, you do not own the site or the content regardless of the effort you put in. At least not without an agreement or license. Do not do it! You can be sued. Who says the site is abandoned? You? The owner of the site may feel differently.
    – closetnoc
    Commented Dec 17, 2016 at 16:52
  • @closetnoc Being a MediaWiki site, the content is freely available to be used and even copied by anybody as long as they follow the Creative Commons licensing agreement. If anyone tries to sue anyone over copying a Wiki it would be laughed out of court assuming by some miracle it even made it that far. And yes, the site is abandoned.
    – Bob Smith
    Commented Dec 17, 2016 at 18:43
  • Just because the site uses MediaWiki does not mean that the content is not copyrighted, which was my point. If you recall, I said at least not without an agreement or license. Since I cannot know what site you are talking about, it is impossible for me to have access to the same information you have. Therefore, the default answer is not to copy a site. It is the only responsible response anyone here can give. My suggestion is to check with the site owner, which you have already tried. Otherwise, you may be out of luck legally. Cheers!!
    – closetnoc
    Commented Dec 17, 2016 at 19:41

3 Answers


Backing up a wiki without server shell access requires Python 2 (Python 3 didn't yet work the last time I did this).

From the command line, run the WikiTeam Python script dumpgenerator.py to get an XML dump, including edit histories, together with all images and their descriptions.

python dumpgenerator.py --api=http://www.abandoned.wiki/w/api.php --xml --images

Note that this XML dump is not a complete backup of the wiki database: it doesn't contain user accounts, the extensions and their configuration aren't backed up, and file types other than images aren't saved. It does, however, save enough to recreate the wiki on another server.

Full instructions are at the WikiTeam tutorial.

For restoring the wiki from the XML dump, see MediaWiki's Manual:Importing XML dumps.
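
For reference, a minimal restore sketch on the new server could look like this (assuming a fresh MediaWiki install, run from the wiki's root directory, with the XML dump and images directory produced by dumpgenerator.py at the placeholder paths shown):

# import page text and full edit histories from the XML dump
php maintenance/importDump.php /path/to/abandonedwiki-history.xml

# import the downloaded image files
php maintenance/importImages.php /path/to/images

# rebuild the recent changes table after a bulk import
php maintenance/rebuildrecentchanges.php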


You can use the API to export all the text content, with something like action=query&generator=allpages&export. Files you'll have to scrape via a script such as pywikibot. You can see which extensions are installed via Special:Version if you want to set up an identical wiki; some of the configuration settings are available via the siteinfo API, but most you'll have to guess. There is no way to bulk-clone user accounts, but you can use the MediaWikiAuth extension to transfer them when they log in.
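
A rough sketch of what those calls can look like from the command line (www.abandoned.wiki is the placeholder hostname from the answer above; gaplimit, export and siprop are standard API parameters, but check the target wiki's api.php help page):

# export the first batch of pages; the XML is embedded in the API response
curl "http://www.abandoned.wiki/w/api.php?action=query&generator=allpages&gaplimit=max&export&format=json" -o pages-batch1.json

# record the site configuration that the API exposes
curl "http://www.abandoned.wiki/w/api.php?action=query&meta=siteinfo&siprop=general|namespaces|extensions&format=json" -o siteinfo.json

Each batch response includes a gapcontinue value; repeat the first call with &gapcontinue=<that value> until it no longer appears to walk through the whole wiki. The exported XML sits inside the query.export part of the JSON and can be fed to Special:Import or importDump.php.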


MediaWiki pages can be exported in a special XML format for import into another MediaWiki installation. You can use Special:Export, which is available on most standard MediaWiki installations. At the least you can get all pages of each namespace.

IMHO this depends on the size. It works well for small MediaWikis; I never tried to get an XML dump of a huge wiki (like Wikipedia).

But it's worth trying.
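
For example, a single page together with its full history can be pulled from the command line like this (www.abandoned.wiki is a placeholder hostname; the history and action=submit parameters come from the standard Special:Export form, so verify them against your wiki's version):

# fetch one page, full revision history included, as import-ready XML
curl "http://www.abandoned.wiki/w/index.php?title=Special:Export&pages=Main_Page&history=1&action=submit" -o Main_Page.xml

Multiple titles can go in the pages parameter separated by newlines (%0A in a URL), which is how you would batch up the page lists from Special:AllPages.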

  • I'm not sure what is considered a "small" wiki, but we've got 12,000 pages and 2,000 articles. Much of it was created by fewer than 20 individuals, so I would assume that qualifies as small.
    – Bob Smith
    Commented Dec 18, 2016 at 10:32
