45

I am writing a paper describing some of the research I have done. As part of my work I have developed an open-source library and made it available on GitHub.

How should I link to it?

Should I cite it in the bibliography or make a footnote with the link to the software?

2

5 Answers

38

Such resources, especially if they are supplementary to the paper, i.e. in some sense a part of it, should be referenced in a footnote and not in the bibliography.

Do include not only the URL but also a short description; and do try to keep that URL valid - once you publish that link, it's frozen forever.

It also helps to include the opposite citation. In the readme file of that library, include full citation information of your paper. This will allow others to gain extra information about the methods, and also give you citations if/when others build upon your work.

Some publishers also support binary attachments for supplementary information. If yours does, you should use it: prepare a package of the current stable version and upload it there. This allows for reproducibility, because the version referenced is the specific one that relates to the actual paper, and not some improvements done in 2020 that change everything. It also keeps all relevant information together with the paper at the publisher's site, which will stay valid even if that GitHub repository goes away for whatever reason.

3
  • 2
    GitHub may disappear in the future, or you may close your account. I think it is the journal's job to keep a snapshot of the code for the record, plus a link to the repository for folks to get updates.
    – Davidmh
    Commented May 7, 2014 at 10:20
  • 1
    This depends. I develop software as a mathematician, and in all the papers on which I have been a co-author, we have cited the software (it's widely used in our community). I would probably cite the software on your university page, as this would be fairly stable and around, unless you are not tenured, and then on your site, link to the latest version of the library on GitHub (see homepages.math.uic.edu/~jan/download.html for an example). The other option is to write a 'software' paper and find an appropriate journal to publish the library, at which point you can cite that paper from then on.
    – nagniemerg
    Commented May 7, 2014 at 18:23
  • 1
    @Davidmh In an ideal world it might be the journal's job, but I've definitely had code I wanted to share for an article where the journal wasn't going to or couldn't handle archiving it.
    – Fomite
    Commented May 7, 2014 at 18:57
20

In previous papers, I've used something like (source code available at github.com/fomite/brilliantwork) when describing the software methods used.

However, Figshare now allows you to directly import a repo from GitHub, which will give you a DOI you can reference as a citation. This also provides the benefit of having a "snapshot" of the repo at the time of publication, for repositories that will continue to be worked on. That, plus the ability for me and other people to cite the repo directly (and thus be able to get some traditional citation metrics to show the impact of the software), is why I'll be doing this in the future.

4
  • 5
    +1 for snapshots; I would say in general, when citing, be very sure you give a version. git provides hashes corresponding to particular commits/versions: That's what I'd include in a citation (having not used Figshare).
    – Matthew G.
    Commented May 6, 2014 at 18:40
  • @MatthewG Agreed as to the version. The Figshare upload process will actually create an entirely standalone code base that will be frozen in time.
    – Fomite
    Commented May 6, 2014 at 18:45
  • I was going to post the same thing for the same reasons.
    – David Z
    Commented May 6, 2014 at 20:53
  • 1
    An alternative view on Figshare. It doesn't mean that Figshare is bad but one does need to be careful with the fine print when uploading there.
    – E.P.
    Commented May 8, 2014 at 10:12
9

A citation is a reference to a research object. Prior to the DOI, this reference contained the information somebody would need to physically locate the research object, whether it was a book, journal article, or dissertation. Although the jury is still out on how to provide general references to digital research objects, a Git repository contains a canonical reference, the SHA1 hash of the commit.

If you would like to refer to your Git repository in such a way that it is easy to locate in the future, you should provide not only the URL where it is now, but also the short name for the repository, the lead author[s], and the SHA1 hash of the commit that you produced your results with.

Here is an example URL from GitHub that contains the commit: https://github.com/hashdist/hashstack/commit/4c72950a0f6eb9cc1cf63cd640f3e6b82c9ce9c0
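
As a rough sketch of how you might collect that hash and build such a link automatically, something like the following could work. The function names `current_commit_hash` and `commit_permalink` are made up for illustration, and the `/commit/<hash>` URL layout is assumed from the GitHub example above; other hosts may use a different scheme.

```python
import re
import subprocess

# A full SHA-1 commit hash is 40 lowercase hexadecimal characters.
SHA1_PATTERN = re.compile(r"[0-9a-f]{40}")


def current_commit_hash(repo_dir="."):
    """Return the full SHA-1 hash of the commit checked out in repo_dir.

    Requires git to be installed and repo_dir to be inside a Git repository.
    """
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def commit_permalink(repo_url, commit_hash):
    """Build a GitHub-style commit URL (hypothetical helper).

    repo_url is the repository's web address, e.g. https://github.com/owner/name
    """
    if not SHA1_PATTERN.fullmatch(commit_hash):
        raise ValueError("expected a full 40-character SHA-1 hash")
    return f"{repo_url.rstrip('/')}/commit/{commit_hash}"
```

Calling `commit_permalink("https://github.com/hashdist/hashstack", current_commit_hash())` from inside a checkout of that repository would give a link of the same form as the one above, pointing at whichever commit you ran your analysis with.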

I don't recommend uploading your code to Figshare until they've fixed their ability to accept code licenses.

Update 14 May 2014

GitHub and ZENODO have partnered together to upload code for a DOI under a flexible license. Obtaining a DOI for your code should be as simple as following the instructions on the GitHub Guide to Making Your Code Citable. Here's an overview of the instructions:

  1. Choose your repository
  2. Login to ZENODO
  3. Pick the repository you want to archive
  4. Check repository settings
  5. Create a new release
  6. Check everything has worked
  7. Mint a DOI

It's really that easy. There's no current reason not to use this as a default approach for citing your software.

1

The specifics depend on your field, and I think most areas have very loose guidelines as to what to do in these situations. The only hard-and-fast rule is that you need to get it past your reviewers and your editors, who will tell you if the style is inappropriate. Other than that, the most important thing is that you yourself are happy that the citation is getting your GitHub repo the most visibility possible. Finally, you should consider your paper from the perspective of readers that may want to use and cite your code, and who will naturally look to your writing for how to do that.

In general, I would recommend citing the code in the bibliography, as one more reference. The advantage of this is that your article's references will be listed and indexed separately, and (with luck) this will register as links pointing towards your repo, which will become more of an advantage if more people cite it. Such a citation should have

  • the name of the programme,
  • the URL of the repository, and
  • a clear indication of the version cited and its date.

An example citation is

G. A. Worth, M. H. Beck, A. Jackle, and H.-D. Meyer. The MCTDH Package, Version 8.2, (2000), University of Heidelberg, Heidelberg, Germany. H.-D. Meyer, Version 8.3 (2002), Version 8.4 (2007). See http://mctdh.uni-hd.de
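
If you want to keep those three pieces of information together and produce the citation string consistently (say, for the README and the paper), a minimal sketch might look like this. The class `SoftwareCitation`, its fields, and the output layout are my own assumptions for illustration, not a standard format; adapt the layout to whatever your journal's style requires.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoftwareCitation:
    """Hypothetical container for the fields a software citation should carry."""

    authors: List[str]  # author names, already formatted as they should appear
    title: str          # name of the programme
    version: str        # the exact version cited
    year: int           # release year of that version
    url: str            # URL of the repository or project page

    def render(self):
        """Return a plain-text citation: authors, name, version and date, URL."""
        author_list = ", ".join(self.authors)
        return (
            f"{author_list}. {self.title}, Version {self.version} "
            f"({self.year}). See {self.url}"
        )


# Placeholder values only; substitute your own project's details.
citation = SoftwareCitation(
    authors=["A. Author", "B. Author"],
    title="Example Package",
    version="1.0",
    year=2014,
    url="https://example.org/pkg",
)
print(citation.render())
```

Keeping the version and year as required fields makes it harder to forget them, which is the part of a software citation that most often goes missing.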

You should also describe the code within the text when you first cite it. The description should be complete enough that readers do not need to look anything up elsewhere to continue reading, because the paper needs to read like a single, coherent piece of work.

Alternatively, you can cite it in a footnote, indicating the name of the code and its location. This would be a good place to include the description if it is brief. Another choice is to do this in the acknowledgements, as giordano points out. However, I think these make your code less discoverable to both humans and search engines.

As I mentioned before, you should mould your citation of the code into the form in which you'd like others to cite it. It is also desirable that you include, within the pages describing the code online, a description of how you want people to cite it. Some examples are GAMESS UK, MCTDH and MOLCAS, or the ones in this question. Having such a description will strengthen your position should referees or editors not like your preferred style. You set the terms on which your code gets cited!

Finally, as others have mentioned, you should make sure that the URL you point to is stable, as you will be unable to change the link in the published paper once it goes out. This is a separate question altogether, and there are a number of ways to do this - including supplementary information to the article itself, separate repositories for academic code, and of course GitHub itself - and you should strive for the most stable solution possible. Is it likely that the repository will someday get closed or moved? If so, you should consider alternatives.

0

In addition to referencing the repository within the paper, you should also make sure that the link between the paper and repository is part of the paper metadata. Specifically, if you are depositing the paper on arXiv, you can use the latest feature that integrates with Papers with Code to record the link between your paper and its implementation.
