The arXiv license is the "default" license under which most preprints are submitted to the arXiv, at least in my subject. Out of dark curiosity, I am wondering how well it does what it is meant to do, namely ensure that these preprints remain widely and freely available for the foreseeable future. Let me quote the license in full:

The URI http://arxiv.org/licenses/nonexclusive-distrib/1.0/ is used to record the fact that the submitter granted the following license to arXiv.org on submission of an article:

I grant arXiv.org a perpetual, non-exclusive license to distribute this article.

I certify that I have the right to grant this license.

I understand that submissions cannot be completely removed once accepted.

I understand that arXiv.org reserves the right to reclassify or reject any submission.

I am wondering what this entails in any of the following hypothetical scenarios:

  • Cornell University decides to sell arXiv.org off for whatever reasons (which may be far lesser reasons than bankruptcy -- e.g., someone might catch wind of the fact that a great many PDF files are not ADA-compliant; or publishers might unleash a barrage of lawsuits on Cornell for hosting what they believe are not quite preprints; or it is simply decided that continued hosting of the arXiv is too much of a cost center), and the new proprietors don't see public access as a priority. Someone with a full dump uploads it to a server in Ukraine. (Comparable cases: SSRN bought by Elsevier, although the full-dump analogy is broken here -- I don't know if anyone ended up re-hosting the papers taken down.)

  • The HTTP protocol and the WWW are superseded by something new and shiny, and the ".org" domain and the notion of a "server" lose their meaning; arXiv evolves into a service which may have a hard time arguing that it is the same arXiv.org ("a highly-automated electronic archive and distribution server") that the license was granted to. (Comparable cases: The precursor of arXiv.org was a mailing list; it is far from obvious that mailing out a preprint on an ephemeral medium like a mailing list grants any rights for future perpetual hosting on the internet. Now imagine the next step after the mailing list and the internet, whatever that may be; ignore the current social media hype, which is not a relevant development for hosting documents.)

  • Various countries block the official arXiv domains (or force arXiv to geo-fence them out), causing the creation of multiple not-quite-official mirrors, some even on the dark web (.onion) or otherwise hidden-from-view. How can these mirrors argue that the arXiv license was granted to them?

  • The arXiv team splits along a political fault-line, resulting in two different groups/servers/teams with claims to the arXiv name. Are they both allowed to host the papers?

  • The TeX and PDF formats lose their universal support, and new formats come up (or new versions, breaking backwards compatibility); the arXiv team can no longer keep up writing compatibility scripts, and volunteers end up fixing the papers and posting them on GitHub. (The compatibility nightmare is already happening to some extent -- the arXiv has its share of broken PDFs, and I recall even seeing a TeX file that did not compile until I made a tweak. So far, most of the damage has been repaired, probably with a lot of manual drudgery, but the arXiv is getting more and more papers, and the next generation of formats to be deprecated will have a much larger number of papers posted in it.)

What these scenarios all have in common is that, in a sense, the arXiv does not disappear -- it just evolves, changes its skin, reincarnates, as times change. My question is: Does the license follow it, or will the "new arXiv" be in trouble trying to prove that it still has a right to host preprints uploaded under the (standard) arXiv license back in the early 2000s?

  • 4
    PS: There is always the possibility that in 20 years, copyright will not apply to scientific work. But it is far from a certainty, whence this question. Commented Jun 19, 2017 at 13:04
  • 5
    arXiv.org is the entity, not the website itself. While Cornell hosts it, I do not know the legal standing of the entity itself. Cornell has a long history of hosting such sites (including the APS archives).
    – Jon Custer
    Commented Jun 19, 2017 at 13:42
  • 3
    These concerns are an excellent argument for choosing one of the Creative Commons licenses for your arXiv submission. They are just better in every way.
    – Boris Bukh
    Commented Jun 19, 2017 at 19:32
  • 2
    @JonCuster Often not, since many authors have since signed their rights away to some commercial publisher, after having submitted to arXiv when they had the right to do so. Also the authors might be dead/forgetful.
    – Boris Bukh
    Commented Jun 19, 2017 at 19:33
  • 3
    @JonCuster Many things were not credible before they happened. The point is not having to worry about this being one of them.
    – Boris Bukh
    Commented Jun 19, 2017 at 19:40

1 Answer

My answer addresses the broader question "How future-proof is the arXiv?"

How future-proof is anything in this day and age?

The license itself is built on the idea that the system will continue roughly as it currently does, but substantial unforeseen changes, both technological and societal (it doesn't have to be a war), can easily derail it, as you have outlined in the various scenarios.

However, I don't think there is any reason to worry (your scenarios are too pessimistic), since the arXiv serves a very good purpose to all involved parties:

  • authors can quickly disseminate their results to all interested,
  • the scientific community (in fact the world community) gets open access to (preprint versions of) many published articles (most in certain fields), and
  • the journals get a free preprint publication service which boosts their visibility and hence impact (since citations will be to the journal version, not the arXiv version - I know of no journal that disallows publication on arXiv). If a journal reverses this policy (in order to earn money from copyrights), most authors of good articles will turn away from it.

Because of these points, continued funding at Cornell should be possible for the foreseeable future.

  • 4
    This assumes that the publishers think in terms of preserving their impact factor, rather than in terms of copyright trolling. Something that might change if publishers start going bankrupt or getting bought up by hedge funds... And this is one of the more realistic scenarios: Pretty much everyone agrees the paid publishing model is not long for this world, at least in mathematics. Once the cash flow slows down, expect a fire sale... Commented Jul 7, 2017 at 18:03
  • @darijgrinberg Your scenario is not a danger, at least not in the long run. If a journal starts to copyright troll and disallow arXiv publication, authors (at least of the good papers worth reading) will stop publishing there. The journal cannot prevent arXiv access to articles published in the past.
    – Walter
    Commented Oct 29, 2020 at 13:56
  • 1
    Could the downvoters please explain how this answer can be improved and/or what is wrong with it?
    – Walter
    Commented Oct 29, 2020 at 13:58
  • 2
    You didn't actually answer the question that was asked. The question was "how resilient would this be in the face of catastrophic events", and your answer was "Oh you don't need to worry about catastrophic events. They're super-unlikely." That's... not an answer.
    – Ben Barden
    Commented Oct 29, 2020 at 15:28