
I am working on a research paper that involves large amounts of data. The data was collected using Python code that I found in someone's GitHub repository. How would I go about citing the repository in APA style?


1 Answer


First of all: contact the creator of the software and ask whether they have a citable publication.

If not, or in addition, fork the repo. Use Zenodo to get a DOI for a version of the forked repository (check the license of the original repository to see whether this is allowed).

The APA-style reference would then look like:

Name (Date). Title [Type]. doi:10.5281/zenodo.XXXX

  • Name: owner of the original repo; if no real name is known, use the GitHub username
  • Date: date of last commit to the original repo before your fork
  • Title: title of repo (heading of README.md)
  • Type: suggestion: "electronic resource: python source code"
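To make the pattern concrete, here is a minimal sketch that assembles the reference string from the four fields plus the DOI. The function name and all the values are placeholders I made up for illustration; substitute the details of the actual repository, and keep the `XXXX` until Zenodo has assigned a real DOI.

```python
def format_software_reference(name, date, title, resource_type, doi):
    """Build a reference string following the Name (Date). Title [Type]. doi pattern."""
    return f"{name} ({date}). {title} [{resource_type}]. doi:{doi}"


# Placeholder values only (hypothetical author, year, and repo title).
reference = format_software_reference(
    name="Doe, J.",
    date="2016",
    title="example-scraper",
    resource_type="electronic resource: python source code",
    doi="10.5281/zenodo.XXXX",
)
print(reference)
```

This only handles the string layout; it does not check the result against the APA manual, so compare it with your journal's requirements.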

If you take this seriously, you need to archive and reference exactly the version of the repo that you used to process your data (later versions could give different results because of fixed bugs). If you have a snapshot of the software that you actually used, it would be better to archive that snapshot through Zenodo.
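One way to pin down "exactly that version" is to record a checksum of the snapshot archive alongside your data, so you can later confirm that the archived copy matches what you ran. This is only a sketch of that idea, not something Zenodo requires; the helper name `sha256_of_file` and the file name `snapshot.zip` are assumptions, and the demo uses a throwaway file in place of a real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with a throwaway file standing in for the snapshot archive.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "snapshot.zip"  # hypothetical file name
    snapshot.write_bytes(b"abc")
    checksum = sha256_of_file(snapshot)

print(checksum)
```

In practice you would point `sha256_of_file` at the archive you upload and note the digest in your methods section or README.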

Would it not be better to archive and get a DOI for your data set instead of the tool set? AFAIK Zenodo provides 50 GB of space per DOI for data sets. I would contact the Zenodo people if you exceed that.

For maximum reproducibility, you could create a new combined repo with your primary data, tool set, and secondary data. If there is just one "parent" repo (the Python code that forms the basis of your tool set), you should fork that and add your data on top of it.
