5

I am a postdoc in the humanities, and I am drafting a funding application for a project that envisages the development of a research database. The project will last no more than 2-3 years, and as a postdoc I am not supposed to stay at the host institution for longer. The question is how to preserve the results of this work for posterity.

Luckily, these days there are institutional and independent repositories that allow me to store my complete dataset as a set of CSV files and in RDF/XML format with a well-documented ontology. This sounds sustainable. But I would also like to offer researchers a web interface so that the database can be consulted online. (The structure of my data is quite unique, so I cannot simply submit all my data to the online database of a large project working in my field.) I see it as a standard MySQL/PHP application, which I am planning to develop.

The question is what the most sustainable strategy for hosting it is. I have seen a number of online research databases in my field, some of which cost the funding authorities a great deal of money, that went offline after some years and are no longer available. This includes both commercially hosted databases (the worst option in my view, because someone has to pay for hosting and the domain name indefinitely) and university-hosted databases. I understand that after I leave the host university (and I am just a postdoc), no one can guarantee that they will host the project website forever. Besides, the technologies for web development are changing fast, and a web interface designed with current tools will become obsolete after some years.

What are the best practices for securing a lasting time-span for a small online research database that does not have a strong, lasting project behind it?

7
  • 3
    No offense intended. The first word of the title, 'Infinite', is impossible. Nothing is guaranteed to last forever. Short answer to the question "What are the best practices for securing a lasting time-span for a small online research database ... ?": your own web site or blog, paid for by yourself.
    – Nobody
    Commented Sep 9, 2016 at 9:00
  • For web development, I would recommend JAVA-JSP, Servlets, ... because of the high complexity and adaptability. The best option would be to have a paid domain and to mirror it on faculty servers.
    – Nikey Mike
    Commented Sep 9, 2016 at 12:03
  • My question implied that the problem with the privately paid domain/hosting is that it depends on one person (me) and it will just disappear after I (for whatever reason, we are all flesh) stop paying for it and supporting it. Is JAVA-JSP really superior to PHP in terms of life-span?
    – greenb
    Commented Sep 9, 2016 at 12:50
  • "Is JAVA-JSP really superior to PHP in terms of life-span?" is a technical question and is off-topic. As for the answer: no one knows. Maybe next year we're going to have KAVA-KSP (I just made it up), which will last longer than anything else. Who knows?
    – Nobody
    Commented Sep 9, 2016 at 14:20
  • 1
    How large is the database?
    – Davidmh
    Commented Sep 10, 2016 at 5:17

3 Answers

1

Let me offer you several strategies. Firstly, instead of or in addition to developing a LAMP-based Web application, you can publish your research database as an open data set with a clearly documented structure (schema, ontology, etc.). The benefits include a much wider range of options for long-term preservation, as well as opportunities for other researchers to reproduce, enhance and build new knowledge on top of your results: open data => reproducible research => scientific innovation. For this option, you can consider using some solid free open data repositories, such as figshare, Zenodo, CKAN-based Datahub and GitHub (see examples).

Secondly, you can consider a hybrid approach, which combines an open data set, published as mentioned above, with the open source code of a Web application that anyone could download, install and use to interface with your data set. As for hosting the open source code, GitHub is especially attractive among the above-mentioned options, as you could seamlessly host both the data and the relevant Web application code there. If you (or someone who can help you) are technical enough, you could make access to your data set even easier by providing a containerized (for example, Docker) version of your data and application (if the data set is not too large, you can even push the relevant public Docker image to Docker Hub or other services that host public images for free). Similarly, you can publish a free software appliance, that is, a virtual machine (VM); perhaps some of the above-mentioned repositories (and/or maybe others) offer to host open VMs for free.
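To illustrate the hybrid idea, here is a minimal sketch (not a definitive implementation) of a script that could be shipped next to the CSV dump, so that anyone who downloads the data can search it locally without installing a MySQL server. It uses only the Python standard library and swaps in SQLite for MySQL purely for portability. The table name `records`, the sample columns and the sample rows are hypothetical placeholders, not taken from the actual project.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote a table or column name for use in a SQLite statement."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn, csv_text, table="records"):
    """Load CSV text (first row = column names) into a table of TEXT columns."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(quote_ident(c) + " TEXT" for c in header)
    conn.execute("CREATE TABLE {} ({})".format(quote_ident(table), columns))
    marks = ", ".join("?" for _ in header)
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote_ident(table), marks), reader
    )
    conn.commit()


def search(conn, column, term, table="records"):
    """Return rows whose column contains term (SQLite LIKE ignores ASCII case)."""
    sql = "SELECT * FROM {} WHERE {} LIKE ? ORDER BY rowid".format(
        quote_ident(table), quote_ident(column)
    )
    return conn.execute(sql, ("%" + term + "%",)).fetchall()


# Hypothetical sample standing in for the real CSV dump from the repository.
SAMPLE_CSV = "id,title,year\n1,Codex A,1450\n2,Codex B,1512\n3,Charter C,1450\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, SAMPLE_CSV)
print(search(conn, "title", "codex"))
```

With a real dump, one would read the CSV text from the downloaded file and connect to a file-based database instead of `:memory:`. The same script could also serve as the entry point of a container image.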

Thirdly, you can propose the development and hosting of a Web application that would provide open access to your data set to relevant non-profit organizations working in your particular domain (in addition to some universities). If successful, the costs of developing and maintaining the database would be covered (at least for some time) by relevant scholarships, grants or similar financial vehicles. For example, for social sciences, including humanities, you can review funding opportunities at the Social Science Research Council, The Rockefeller Foundation, Carnegie Corporation, Ford Foundation, Russell Sage Foundation and many other non-profits.

2
  • 1
    Thank you. I planned to publish open data (based on an ontology from a neighboring field, with documented additions necessary to accommodate my custom data). Yet for most users in the humanities, working with a raw dataset is impractical when they just need to find some particular information in it (as they would do using a printed reference book); hence, a web interface is a must (some 20 years ago such a db would have been published as a desktop application, which is maybe worse in terms of longevity). So I like the idea of also sharing the sources and the db dump in a repository (hybrid).
    – greenb
    Commented Sep 10, 2016 at 7:52
  • @greenb You are welcome. I'm glad you found my answer helpful. Commented Sep 10, 2016 at 8:50
5

I have the impression that quite a few university libraries are very interested in preserving and making available things other than just books. So talking to your university librarian would be my first stop. Even if your university library does not have expertise in this field, they may know of such a project somewhere else that you can apply to.

As to someone else maintaining the web interface for you, I would not count on that happening. That will cost someone time and money, and unless you are rich enough (and willing) to set up a fund to preserve your project, you cannot count on someone else doing it for you. Instead, I would keep the interface as minimal as possible. It may not look pretty, but it is more likely to keep working.
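One possible reading of "as minimal as possible" is an interface with no server-side code at all: plain HTML pages generated once from the data set, which any web server or repository can keep serving. The following is only a sketch under that assumption, using the Python standard library. The function name `render_static_page` and the sample data are hypothetical.

```python
import csv
import html
import io


def render_static_page(csv_text, title):
    """Render CSV text (first row = column names) as one self-contained HTML page."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    parts = [
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8">',
        "<title>{}</title></head><body>".format(html.escape(title)),
        "<h1>{}</h1>".format(html.escape(title)),
        "<table>",
    ]
    # Escape every cell so characters like & and < in the data stay valid HTML.
    parts.append(
        "<tr>" + "".join("<th>{}</th>".format(html.escape(c)) for c in header) + "</tr>"
    )
    for row in body:
        parts.append(
            "<tr>" + "".join("<td>{}</td>".format(html.escape(c)) for c in row) + "</tr>"
        )
    parts.append("</table></body></html>")
    return "\n".join(parts)


# Hypothetical sample standing in for the real data set.
SAMPLE_CSV = "id,title,year\n1,Codex A & B,1450\n2,Charter C,1512\n"
page = render_static_page(SAMPLE_CSV, "Sample records")
print(page)
```

The resulting string can be written to an `.html` file and uploaded as is. Visitors can then use their browser's own page search, so nothing on the server side needs maintenance.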

0

If you set up a non-profit, you can apply for free webhosting for the nonprofit with some webhosting companies, for example dreamhost.

3
  • In some cases, academics are entitled to the same conditions as nonprofits (and many universities are nonprofits).
    – Davidmh
    Commented Sep 10, 2016 at 5:18
  • 1
    Correction: if you set up a non-profit based in the US. Which might be tricky to do if you don't live there. Commented Sep 10, 2016 at 6:49
  • Thank you. Similar conditions are offered by hosting providers around the world. But eventually a hosting company is just a commercial enterprise, not dedicated to preserving the data in the long term (unlike the repositories). I know one such nonprofit site hosted for free for twelve years already. It would have died a couple of times already without the interventions by the owner requested by the hosting provider on the occasions of migrating to another server or the like.
    – greenb
    Commented Sep 10, 2016 at 8:00
