13
$\begingroup$

I recently fixed a broken link to a PlanetMath page, by changing http://planetmath.org/encyclopedia/RotationalInvarianceOfCrossProduct.html to http://planetmath.org/RotationalInvarianceOfCrossProduct.

Some links also end in .html but lack /encyclopedia/; see e.g. http://planetmath.org/FullyIndecomposableMatrix.html from this Answer.
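The rewrite described above can be sketched in Python. This is only a minimal illustration, assuming the only changes needed are dropping the /encyclopedia/ prefix, dropping a trailing .html, and switching to https (as suggested in the comments). The function name `rewrite_planetmath_url` is hypothetical, and pages that were renamed or deleted would still need a manual check.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_planetmath_url(url: str) -> str:
    """Hypothetical helper: map an old-style PlanetMath URL to the new scheme.

    Assumption: the new URL is the old one without the /encyclopedia/
    prefix and without the .html suffix. Renamed pages are NOT handled.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("planetmath.org", "www.planetmath.org"):
        return url  # not a PlanetMath link, leave it untouched

    path = parts.path
    if path.startswith("/encyclopedia/"):
        path = path[len("/encyclopedia"):]  # keep the leading slash
    if path.endswith(".html"):
        path = path[: -len(".html")]

    # Always emit https, keeping any query string or fragment as-is.
    return urlunsplit(("https", parts.netloc, path, parts.query, parts.fragment))


print(rewrite_planetmath_url(
    "http://planetmath.org/encyclopedia/RotationalInvarianceOfCrossProduct.html"
))
```

The result of such a rewrite should still be checked against PlanetMath's index page before editing a post, since the rule is purely textual.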

I also found links starting with planetmath.org/?op that now redirect to the homepage; I'm not sure what they were supposed to link to. I found these by searching for PlanetMath on MO. An example post there is this one.

There are also posts starting with planetmath.org/sites (as pointed out by Martin/hardmath). Some of these even link to PDF files that are no longer there.

As mentioned by Glorfindel, some posts are not fixed by the above procedure. I would guess that some pages have been renamed or deleted. It might be worth pointing out that PlanetMath has an index page which we can use to check before making the change.

Here are the posts that we can be sure have broken links:

  1. (From Martin's comment) Searching$^{1}$ with url:"*planetmath.org/encyclopedia*" gives 178 hits.

  2. Searching$^{1}$ with url:"*planetmath.org/?op*" gives a further 17 hits.

  3. Trying to find posts with .html at the end: url:"*planetmath.org/*html" gives 248 results. If the wildcard * works in the middle of the search input, this should strictly contain the results of point 1.

  • Are there other possible formats of PlanetMath links that are now broken?

  • How shall we go about fixing the links?


1. Note: searching for "PlanetMath" gives only 146 hits, but url:"*planetmath.org/*" gives 953 hits, so searching for the website name alone does not match embedded links. Also note that the wildcard * is supported in SE's built-in search.
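For anyone who wants to share these searches as links, here is a minimal Python sketch that builds the search URL for a given `url:"…"` pattern. The helper name `search_url` is hypothetical; the `/search?q=` form is the one that appears in search links on the site, and the percent-encoding is done by the standard library.

```python
from urllib.parse import urlencode


def search_url(site: str, pattern: str) -> str:
    """Hypothetical helper: build a site-search link for a url:"..." query."""
    query = f'url:"{pattern}"'
    # urlencode percent-encodes the colon, quotes, slashes and wildcards.
    return f"https://{site}/search?" + urlencode({"q": query})


for pattern in ("*planetmath.org/encyclopedia*",
                "*planetmath.org/?op*",
                "*planetmath.org/*html"):
    print(search_url("math.stackexchange.com", pattern))
```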

$\endgroup$
7
  • 8
    $\begingroup$ Some related discussion recently on MathOverflow with regard to mass linkrot-correction: meta.mathoverflow.net/questions/5124/… $\endgroup$
    – Asaf Karagila Mod
    Commented Jan 13, 2022 at 13:53
  • 9
    $\begingroup$ I think that the prudent thing is also to replace http:// with https://, if we're updating stuff. $\endgroup$
    – Asaf Karagila Mod
    Commented Jan 13, 2022 at 13:54
  • 8
    $\begingroup$ If we assume that the problematic posts are the ones containing planetmath.org/encyclopedia, I got about 180 posts. You can get them by searching for url:"*planetmath.org/encyclopedia*". I have also posted some SEDE queries here: chat.stackexchange.com/transcript/19138/2022/1/13 $\endgroup$ Commented Jan 13, 2022 at 14:22
  • 5
    $\begingroup$ Some history of broken links here to PlanetMath. I remember fixing some of these back then, after the site reorganized. I'd be glad to help. $\endgroup$
    – hardmath
    Commented Jan 13, 2022 at 15:11
  • 4
    $\begingroup$ I have asked this on Meta Stack Exchange: Can the mass-replacement tool also replace and remove? We'll see whether somebody with the knowledge of this tool will respond. $\endgroup$ Commented Jan 13, 2022 at 17:12
  • 3
    $\begingroup$ Looking at the post linked by @hardmath, I suppose that also the links containing planetmath.org/sites are broken. But in that case, I don't see any easy way to fix them. (Maybe some of them might be available in Wayback machine?) $\endgroup$ Commented Jan 14, 2022 at 7:28
  • 1
    $\begingroup$ @MartinSleziak: Since there are a couple of dozen of those and will require Wayback research, I've created a spreadsheet to keep track of fixes if possible. Per Glorfindel's approach, I'll bump no more than three a day (and be a bit churlish about posts that seem not worth the effort). $\endgroup$
    – hardmath
    Commented Jan 15, 2022 at 23:41

3 Answers

15
$\begingroup$

The ideal tool for a job like this is the mass-replacement tool, which can repair posts without bumping them, but it doesn't work in this case. I have therefore fired up a script (a derivative of the Broken Image Repairer) to repair these links on half a dozen sites across the network, including Mathematics Stack Exchange. The script runs once every three days, bumping three questions at a time, in order not to flood the homepage with trivial edits.

I just did a dry run of the script and here is a list of URLs it won't be able to replace; these will need manual action. Here is another list with replacements it will make (scroll to the right to see the actual replacement links). You'll notice that it will update other links, e.g. HTTP to HTTPS. (I see some replacements are not necessary, e.g. for books.google.com. It's good to see the results upfront...)

I'm open to objections or suggestions. Following Martin Sleziak's comment, I can have the script try to repair planetmath.org/sites links as well, e.g. https://planetmath.org/sites/default/files/texpdf/40203.pdf in this question has a Wayback Machine copy.
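As a rough illustration of what a Wayback Machine replacement looks like, the sketch below builds an archive link from an original URL. The helper name `wayback_url` is hypothetical, and the timestamp used in the example is a placeholder rather than a real capture date; it is not a description of how the repair script itself works.

```python
def wayback_url(original_url: str, timestamp: str) -> str:
    """Hypothetical helper: build a Wayback Machine link.

    `timestamp` has the form YYYYMMDDhhmmss; the value used below is a
    placeholder, not a verified capture of this page.
    """
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


original = "https://planetmath.org/sites/default/files/texpdf/40203.pdf"
archived = wayback_url(original, "20150101000000")
print(archived)

# The archived link embeds the original URL verbatim, so a search for
# planetmath.org/sites will still match a post after it has been repaired.
print("planetmath.org/sites" in archived)
```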

$\endgroup$
11
  • $\begingroup$ Excellent, thank you! Do you expect to be able to fix every such broken link, and if not can you put out a list of exceptional cases? I'd agree with fixing the /sites links too. $\endgroup$ Commented Jan 14, 2022 at 9:08
  • $\begingroup$ You're welcome! I expect the /encyclopedia fix to be highly effective, since that is just reindexed content; for /sites I'm not sure, but I can do a dry run first and check which URLs can only be fixed manually (and post them here of course). $\endgroup$
    – Glorfindel
    Commented Jan 14, 2022 at 9:16
  • $\begingroup$ Sorry, I meant to ask about cases like in your comment here where converting http://planetmath.org/encyclopedia/Rotate.html to https://planetmath.org/encyclopedia/Rotate doesn't work and the correct answer is instead https://planetmath.org/euclideantransformation. It should be easy to check for false-positives by matching with the entries as listed out in PlanetMath's index page, but I expect the fix requires a human..? $\endgroup$ Commented Jan 14, 2022 at 11:39
  • $\begingroup$ Thanks. Three a day is quite conservative for a large volume site like Math.SE, but if it works across a spectrum of StackExchange sites, I'm happy with that. $\endgroup$
    – hardmath
    Commented Jan 15, 2022 at 16:43
  • 3
    $\begingroup$ @CalvinKhor it's not that many broken links, and since Stack Exchange posts should never rely on links for crucial information, I prefer a slow pace. I'll update my answer with links (oh, the irony) to the results of a dry run. $\endgroup$
    – Glorfindel
    Commented Jan 16, 2022 at 14:26
  • $\begingroup$ bug? :) I am slowly going through the failures, and I found that there were 3 hits for http://planetmath.org/encyclopedia/ProofThatAllNormsOnFiniteVectorSpaceAreEquivalent.html but the failures list says 2 hits. Perhaps this one was the culprit link since it just includes the url directly in the post instead of formatting it properly $\endgroup$ Commented Jan 22, 2022 at 14:48
  • $\begingroup$ @CalvinKhor my script should detect those 'bare' links as well, I have no idea what happened :) $\endgroup$
    – Glorfindel
    Commented Jan 22, 2022 at 16:01
  • $\begingroup$ Similarly had two hits instead of one for http://planetmath.org/encyclopedia/CompactMetricSpacesAreSecondCountable.html, hit1 and hit2; hit2 had a bare link. If it's not too much trouble to check the script... :) Otherwise I'll stop updating you / other people until they're all done $\endgroup$ Commented Jan 26, 2022 at 5:29
  • $\begingroup$ Thanks @CalvinKhor - I'll have a look tonight, I found another example: https://math.stackexchange.com/search?q=url%3A%22planetmath.org%2Fencyclopedia%2FCondensationPoints.html%22 gives two hits while my results only show one failure. $\endgroup$
    – Glorfindel
    Commented Jan 26, 2022 at 9:56
  • $\begingroup$ Hmm ... that last search now gives only one hit. I don't think I can figure out what's wrong with the script. We'll see how many replacements it can make. $\endgroup$
    – Glorfindel
    Commented Jan 27, 2022 at 18:04
  • $\begingroup$ Sorry, I likely edited the post before you checked it. The two condensationpoints links were math.stackexchange.com/questions/59549/… and math.stackexchange.com/a/96570/80734 $\endgroup$ Commented Feb 2, 2022 at 6:52
9
$\begingroup$

I researched the twenty-four posts that Martin's planetmath.org/sites query finds. One post has three such links (the rest have one apiece), and one linked URL is shared by two posts, but two dozen is a pretty accurate estimate of the broken links.

UPDATE: All the posts in that batch are remedied. As Calvin Khor points out, Martin's query still finds one match, but this is because the Wayback archive copy of the PDF was retained in parallel with the newer equivalent HTML topic page.

In the meantime Martin Sleziak brought to my attention a different family of broken PlanetMath links, at MathOverflow as well as here. Before MathJax was a breakout success, there was a similar package, jsMath. These links have been broken for a long time, and there is some variability to their syntax. Martin's search query is:

url:"*planetmath.org/?op=getobj*"

Comments were left on these posts; all but three are fixed.

$\endgroup$
3
  • 2
    $\begingroup$ Thank you! A remark in case we edit again (perhaps automatically) in the future: there are 3 hits (i.e. not 2) still in the /sites query, because the Wayback replacement for a PDF has the whole original URL in it (in particular http://planetmath.org/sites). $\endgroup$ Commented Feb 6, 2022 at 8:45
  • $\begingroup$ @CalvinKhor: I'm done with my fixes. $\endgroup$
    – hardmath
    Commented Feb 20, 2022 at 19:58
  • 1
    $\begingroup$ great :) I suppose the next update we will have is in about 106 days when Glorfindel's script is done? It may / may not have missed some links (see my answer) so we will need to check (hopefully not many) $\endgroup$ Commented Feb 21, 2022 at 11:12
3
$\begingroup$

I'm done replacing the links in the planetmath-failures list provided by Glorfindel.

Methodology: It seems the errors arose because the URLs were changed to match the page titles. Given this, all I had to do was find the pages on the Wayback Machine and look up each title in the index. There were also some other non-PlanetMath links; all but two (#1, #2) were fixed; some comments were left on the exceptions.

Notably I found some links that were not in the list. These seem to be links that were 'bare', i.e. without any markdown or HTML formatting to include them as hyperlinks.
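To double-check counts like these by hand, one could scan post bodies with a pattern that matches PlanetMath URLs whether or not they are wrapped in markdown or HTML. The following Python sketch is only my assumption of how such a check might look (the names `PLANETMATH_URL` and `count_posts_per_url` are hypothetical); it is not the repair script's actual logic.

```python
import re
from collections import Counter

# Matches a PlanetMath URL up to whitespace or a character that usually
# closes a markdown/HTML link: ), ], <, >, or a quote.
PLANETMATH_URL = re.compile(
    r"https?://(?:www\.)?planetmath\.org/[^\s)\]<>\"']*", re.IGNORECASE
)


def count_posts_per_url(post_bodies):
    """Count, for each PlanetMath URL, how many posts contain it."""
    counts = Counter()
    for body in post_bodies:
        # Strip trailing sentence punctuation picked up by bare links.
        urls = {m.rstrip(".,;:") for m in PLANETMATH_URL.findall(body)}
        counts.update(urls)
    return counts


posts = [
    "See [this](http://planetmath.org/encyclopedia/CondensationPoints.html) for details.",
    "Proof at http://planetmath.org/encyclopedia/CondensationPoints.html, which is short.",
    'Also <a href="http://planetmath.org/encyclopedia/Rotate.html">here</a>.',
]
for url, n in count_posts_per_url(posts).items():
    print(n, url)
```

In this made-up sample, the first URL is counted in two posts: once as a markdown link and once as a bare link.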

This table details exactly what I did.

$\endgroup$