35

Professors' personal web pages hosted by their institutions are crucial sources of information in two ways:

  1. to disseminate useful and practical but not publishable information (especially in systems programming)

  2. to disseminate supplemental data, source code, and software. Google killed Google Code, and people are apprehensive about the long-term fate of GitHub since it was acquired by Microsoft. As such, a lot of people put source code in their university web pages. Now we have Zenodo and FigShare, but those are relatively new.

    • E.g., I wanted to do an experiment involving an older system (published ~10 years ago), and according to the author, its source code was deleted by their old institution when they left to teach at a different university. Reviewers will probably object that my evaluation is incomplete since it lacks the older system, but there is no extant copy of that older system to compare to!

I have three questions:

  1. Is it normal to delete these pages when the professor moves or retires? I can't find it stated anywhere in official university policy. What's the policy at your institution?

  2. What was the point of hosting that in the first place if it is going to be deleted eventually? If the purpose was to disseminate information, the fact that it will disappear so quickly defeats the purpose of having it at all. In the long run, it just leads to the creation of dead links.

  3. What are the main constraints the university is facing that cause them to stop hosting that data? If the issue is cost, how does hosting ~100 KiB each for even as many as 10,000 past and present faculty (about 1 GiB in total) become prohibitive for a university?
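
As a rough check of the arithmetic in question 3, here is a minimal sketch. The ~100 KiB per person and 10,000 people figures are only my ballpark assumptions from above, and the function name is purely illustrative:

```python
KIB = 1024        # bytes in one kibibyte
GIB = 1024 ** 3   # bytes in one gibibyte


def total_hosting_bytes(per_person_kib: float, people: int) -> int:
    """Total storage in bytes if each person's page takes per_person_kib KiB."""
    return int(per_person_kib * KIB * people)


# Assumed figures from the question: ~100 KiB each, 10,000 past and present faculty.
total = total_hosting_bytes(100, 10_000)
print(f"{total / GIB:.2f} GiB")  # prints "0.95 GiB"
```

So the estimate comes out slightly under 1 GiB, which is why I doubt that raw storage is the constraint.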


7 Answers

54

In my experience there are two common situations.

  1. The university's provision for web-hosting is centrally managed, and tied to the user's university-wide access credentials. These are themselves tied to HR's employee database. Once someone leaves the university, this automatically triggers processes that lead to their computer accounts being deactivated. A consequence of this is that faculty websites disappear. Probably nobody ever consciously made the decision that this is what should happen - it just arises naturally from the way the web hosting is implemented.

  2. Faculty websites are hosted on an ad hoc basis, using a server run by a department or similar. Typically such websites persist for longer. However, at some point the system dies and is replaced by something newer. It is left to individual users to port their website from the old system to the new system, and so websites that no longer have an active owner disappear by default. Again, this is largely driven by practicalities - the alternative would be paying someone to deal with it, and no single person in the organisation cares enough to find the money for that.

There are also a number of practical, legal and security concerns associated with hosting unmaintained content. Information may be out of date and misleading (e.g., course syllabi that are no longer correct) or legally questionable (e.g. due to changes in privacy laws). There is also the risk that such websites (ab)use components or services that have not been updated for a significant period, and are known attack vectors. All in all, from the university's corporate perspective there is little sense in retaining a site that nobody is willing to take responsibility for.

14
  • 9
    I agree with the security reasons, especially when a lot of personal pages are set up uniquely rather than through a common setup easily maintainable by the Uni. Having to check 30+ different pages, created over different decades with very different (but probably all legacy) code, and probably hosted on different parts of the infrastructure (because of the different schools/research institutes), sounds like a nightmare. Obviously you can just not check them, but then security issues will arise.
    – JackRed
    Commented Aug 8, 2023 at 13:38
  • 18
    @JackRed which only means that the web is utterly broken. HTTP used to be able to serve simple content to visitors without "security issues". Imagine if libraries had a policy of removing all books over 10 years old because they weren't typeset using the latest process or because the paper manufacturer was "no longer supporting that version".
    – hobbs
    Commented Aug 8, 2023 at 18:02
  • 17
    In my experience, academic pages like this are proper documents, not web application garbage. The idea that they're software-like and have to be maintained for security is nonsense. Commented Aug 8, 2023 at 19:38
  • 5
    @R..GitHubSTOPHELPINGICE Apart from the obvious irony of typing "web application garbage" into a web application, I've seen plenty of academic sites with interactive demos, using a variety of obsolete technologies - Java applets, Flash widgets, outdated JavaScript... And that's before you get to server-side languages - ever see a page with "cgi-bin" in the URL? That's almost certainly running some very dodgy old server code that might have remotely exploitable flaws, or might just plain not work if copied onto a newer server.
    – IMSoP
    Commented Aug 8, 2023 at 19:43
  • 12
    @IMSoP: It's garbage when you're using it in place of what should be a static site, not when you're actually doing something with it. Commented Aug 8, 2023 at 21:12
42

A university website is not an archive, and the people working at the IT department are not professional archivists. If you want to seriously think about preserving knowledge from people no longer there (or no longer interested in maintaining it), you need a different place to store it and different people to maintain it and make it accessible.

For example, in my field it is important to preserve data, and now there is an international network of national level data archives that professionally preserve and maintain accessibility of that data. This only happened after each university tried to do this on its own, with very mixed results. I remember a story of data stored on punch-cards in the university basement. Punch-cards are cardboard and mice will eat bits of cardboard. You can imagine the rest of the story.

2
  • 4
    It may be useful to run those mice-teeth-edited card decks to see if they outperform the original keypunched programs; who knows what transformative technology is lurking beside the old coal bins down there.
    – civitas
    Commented Aug 8, 2023 at 23:56
  • 8
    This - having worked in university IT. We'd preserve stuff where we can, but storage costs both time and money. A professor who'd left might get several emails to confirm we can delete, say, their home share. Then we'd archive it for a year, then, finally, delete it. We still got complaints. Data is also fundamentally useless if you don't know what is in it. Is that random python script on a professor's share important? Who knows...it might be a world changing algorithm. It might be a broken first draft. It might generate dummy data for a student exercise. We'd need to analyse it to know
    – lupe
    Commented Aug 9, 2023 at 11:43
18

Why would you not delete the webpages of people who are no longer in your employ?

  • Hosting their webpages makes it seem like they still work at your institution, but they are no longer in your employ.
  • By extension, you save the time of people looking for the professor, since they won't, e.g., compose emails to the professor's @myinstitution.edu email address and then find that they bounced.
  • You can't easily modify those webpages, because they're personal webpages.
  • Even if the source code is worth hosting, it might not be licensed for you to distribute. The professor could (and arguably should) bring it with them to wherever they are now.
  • What was the point of hosting that in the first place if it is going to be deleted eventually? In the long run we are all dead, C++/Python/whatever language the code is written in will become a historical relic, and the Sun will go nova, so what was the point of hosting anything in the first place?
5
  • 14
    A sun will go nova, but THE Sun was killed a long time ago by Oracle. I still miss it. I also hate that the one thing from Sun that I did not like, Java, survives to this day, but my favourite thing from Sun, UltraSparcs, has basically disappeared.
    – slebetman
    Commented Aug 8, 2023 at 9:29
  • 12
    Good answer, except the last paragraph is weird. The problem is, the outlook "we're all gonna die so what's the point" could be applied equally to any question under the sun. A university professor might want to use the hosting provided to them for all sorts of reasons related to their job.
    – matt_rule
    Commented Aug 8, 2023 at 12:24
  • 6
    @matt_rule that's the point, though ("What was the point of hosting that in the first place if it is going to be deleted eventually" also applies to everything that can be hosted).
    – Allure
    Commented Aug 8, 2023 at 12:34
  • 2
    @Allure It was a little unclear that the italics indicated you were directly quoting the OP and responding to that specific concern. It's definitely at least partially on me for not reading the question closely enough to recognize it immediately, but maybe using quotation marks or calling it out specifically would make it more clear?
    – Idran
    Commented Aug 8, 2023 at 14:08
  • 1
    @JustinHilyard ah, thanks, I understand now. I thought those were Allure's words
    – matt_rule
    Commented Aug 8, 2023 at 19:09
7

Personal webpages for professors have their roots in smart individuals fiddling with the Internet well before it became standard for universities and businesses to have their own massive webpages. Most were home-spun and might even have been written in plain HTML.

This is not a good archival system.

In the last 15 years, as universities have built out giant web pages and systems, someone's not likely going to get www.podunk.edu/faculty/JSmith anymore, as websites become more sophisticated.

I have never run across a webpage that has been deleted, but it is not surprising to me that some IT departments have begun to trim these random hangers-on webpages from 20 years ago and probably two or three massive system overhauls (could even have been an accident in 2018 that no one noticed erased Smith, emeritus professor's webpage last updated in 2009).

While yes, universities are repositories of knowledge, it may just not have occurred to whoever was in charge that anything like that might be lost if they cleaned up those old webpages.

"What was the point of hosting that in the first place if it is going to be deleted eventually?"

Just because something might be gone later doesn't mean there's no reason to host it now. Nobody in 2004 was there to say "Professor Smith, we are going to delete this webpage when you retire in 2020, so don't post it now."

"What are the main constraints the university is facing that cause them to stop hosting that data?"

Obviously, these aren't huge storage demands.

2
  • 3
    "I have never run across a webpage that has been deleted" -- seriously? Ok. It doesn't happen to me every day, but I have many times come across links to pages that no longer exist, including some (once-) institutionally-hosted faculty web pages. Of course, if it's been deleted, then I cannot come across the page itself, unless at the Internet Archive or similar. Commented Aug 10, 2023 at 14:29
  • @john Yep, sorry no instances come to mind. But it’s not like I’ve been taking notes. Commented Aug 11, 2023 at 7:07
3

Given that it's 2023, and that both university administrations and many older faculty didn't catch on to "the internet" until the last 10-20 years or so, there is really no long-term precedent for how to deal with the question of faculty work posted on their university web pages. The main approach has been to try to not count this as being anything official, anyway. There still is the confusion about literal publishing (= on the internet) versus "publishing" in the once-sensible, but now-archaic, meaning of "passing peer-review for a 'journal'".

I myself see this as somewhat analogous to "the problem" of what to do with books/scrolls before the idea of "library" was established... :)

A disturbing (to me) difference is the idea of the ephemeralness of stuff on the internet. Indeed, lots of stuff is not really meant to be looked at more than once, if at all, and many things are of-the-moment. The concept of long-term or even permanently-useful things does not seem to be in harmony with "the internet", as we see it currently.

On another hand, given the lack of precedent, possibly there is a (near-?) future equilibrium that none of us has the imagination to see?

Since I've written lots of mathematical stuff, most of it aiming to be helpful to other people, I do have an impulse to try to keep it available "after I'm gone". At this point, I do not see a reliable means to do that. Some conventional books, ok. Do all of us have to allocate some of our estates to server fees for our life's work? "The state" has paid for libraries in the past (though "books" are apparently less popular in some quarters than they once were), so an author of a serious book did not have to allocate money to their book's continued existence.

Another plausible attitude is that it's just as well to lose "old stuff", since an accumulation of that old stuff imposes an ever-increasing burden on younger people. :)

11
  • 3
    “ephemeralness of stuff on the internet”: yet most publishers are going online, sometimes only online... What will be the fate, in 100+ years, of papers published today? Commented Aug 8, 2023 at 1:26
  • 3
    "At this point, I do not see a reliable means to do that." Maybe submit it to Archive.org? That's basically their entire raison d'etre.
    – Idran
    Commented Aug 8, 2023 at 14:12
  • 2
    @JustinHilyard, yes, though they do have a mild constraint that documents approximate publishable papers. Probably course notes, even somewhat advanced, don't fit that. The purported distinction between "research" and "exposition" is something I don't want to have to worry about. :) Commented Aug 8, 2023 at 16:16
  • 5
    FYI, @Justin Hilyard asked "Maybe submit it to Archive.org?" (among other possibilities, saving web pages for the Wayback Machine) and the two follow-up comments are about arXiv. Commented Aug 8, 2023 at 18:26
  • 2
    @paulgarrett: The AMS’s Open Math Notes is devoted to archiving things like expository papers and advanced course notes. You might want to think about using them to host your stuff. Commented Aug 9, 2023 at 1:48
3

To add to the excellent answers already posted:

One of the functions of professors' web pages is to give prospective students and faculty a glimpse into the areas of research and publishing that the particular university department is involved in. It would be deeply disappointing to join a university and realize that the majority of the interesting and pertinent research goals being discussed belong to faculty no longer associated with the university.

(Yes, one could get some idea by checking when each page was last updated, but that is a relatively tedious and manual process, and it may be difficult to tell the difference between a faculty member who rarely updates their pages and one who left last year after being denied tenure.)

In addition, orphaned pages can no longer be edited to, for example, remove links to retracted or superseded papers; in some cases stale information can be worse than no information, for example sharing source code with known and widely exploited security holes.

3

As yet unmentioned: copyright and distribution rights.

At most universities, the professor retains the copyright to material that they produce. The university does not have the legal authority to continue distributing a professor's materials without their explicit consent. If the professor does not continue making it available, then the university cannot do so in their place.

For example, my employment contract grants my university a one-year, non-exclusive license to use all of the teaching materials I have used in a course taught at the university. This is the "hit by a bus" clause: if I were to be hit by a bus or otherwise suddenly removed from my teaching position, my department would have a one-year window to continue teaching with my materials, but after that they would need to figure something else out.
