35

I'm a PhD student and use GitHub to track and share code for nearly all of my research projects, most of which are largely coding-based work in the field of remote sensing.

Our lab is discussing how best to keep track of all our software-related projects. I generally start and manage projects where I'm first author under my own GitHub account. I recently started an organization account for our lab, and I have a repository in there that is basically just a README listing everyone's projects with links to them. We also have this information on our lab website.

I'm considering the pros and cons of having projects owned by the lab account, but I am leaning somewhat away from this, because most of these projects are individually maintained by students or lab technicians and it seems fairer to have them get credit on their own accounts. But we all want the lab to get credit and for people to know certain tools were developed within and supported by our lab.

Anyway, I'm curious what other labs do, and I'd like to get a feel for best practices here.

3
  • 7
    If you want ownership to fall under the lab, then you should create an account representing your laboratory, put the repo under it, and contribute to the project there.
    – Daveguy
    Commented Aug 28, 2020 at 18:26
  • 4
    What stops the individuals from making forks and then submitting PRs to the lab repo?
    – jaskij
    Commented Aug 29, 2020 at 13:28
  • 4
    @Daveguy No need for that; GitHub supports "organizations", which are a different thing from joint user accounts. Commented Aug 30, 2020 at 13:30

4 Answers

47

It's fairly common for labs to have GitHub organizations. There are multiple reasons for this.

  1. First of all, it doesn't prevent the authors from getting credit. The commits will be shown as commits of the individual, not as commits of the organization.
  2. Labs often have a scientific "brand" in terms of what they research. If the GitHub repository is completely personal, then your work does not fully contribute towards this brand. But contributing towards the joint scientific output of the group is a major reason why you may get funding from the group's third-party funding.
  3. Perhaps most importantly, PhD students eventually leave the lab. They can continue maintaining their projects if they want to: you do not need to be an administrator of a GitHub organization to keep maintaining existing repositories. But what if the departing PhD students choose not to maintain the code? If the repository is personal, then other people in the lab cannot commit bugfixes to it. They could fork it, but once the repository link is in a scientific paper, the original repository is the official one. This is problematic if the group's brand depends on the repository because it's a central part of their overall story.
7
  • 15
    Beyond credits from commits, the students can also pin their most important repositories to their profile, regardless of who owns the repository.
    – GoodDeeds
    Commented Aug 28, 2020 at 20:21
  • 14
    In some way, trust too. I'm more likely to believe in code owned by a lab org than some individual. Commented Aug 28, 2020 at 21:04
  • 2
    Point 3 also makes the case for putting private repositories (raw data, paper drafts, theses, proposals) in a lab-controlled place. Moreover, knowing the repo will be at least lab-public might push people into making browsing/interpretation/building as easy and reproducible as possible. At least it does for me.
    – ComFreek
    Commented Aug 29, 2020 at 10:54
  • 5
    Coming from industry, it's point 3 all the way. Academia seems rather tied to whatever you do in the lab being your property; but in fact it is not, as folks would discover very quickly if they ever tried to take their work commercial. Succession management is the most important element of data and document control - your lab (or your company, in industry) will usually survive perfectly well if your work is accidentally made public; but it will almost certainly die immediately if key data and documents are lost.
    – Graham
    Commented Aug 29, 2020 at 18:43
  • 1
    @TheDoctor This is not about the experiments in papers. If the authors of a new technique introduce it in a paper and want it to be used/cited by other researchers, they will normally include a link to a website (e.g., to a GitHub page). Archival-for-reproducibility does not really help here because it prevents code maintenance. Quite often, code doesn't even compile after a few years because of changes in C/C++ compiler defaults, hence the need for maintenance, and an archived code version won't help with that. The code version used for the experiments can still be archived in the way you describe.
    – DCTLib
    Commented Aug 30, 2020 at 19:41
5

For me it boils down to the question: Does the individual student take the software with him after the PhD and build his career on it? Or does the lab base a large share of its work on this code, with multiple students contributing to and maintaining it?

An example of the single-person case is Tim Davis' UMFPACK / SuiteSparse. He wrote multiple articles about the algorithms and further improvements. When he moved from Florida to Texas, he took all the software with him.

If your concern is about the future of the software, it depends on the people. If the single developer leaves academia, the project is in danger. If the group maintaining the software cannot motivate new PhD students to pick up the task and continue maintaining it, the project will break.

2
  • 1
    This does not make sense because software can be moved to a new account. Commented Aug 30, 2020 at 21:54
  • 3
    This is an important point. In many cases, the lab will no longer have the expertise to continue developing the software after the original author leaves. Commented Aug 31, 2020 at 4:16
4

If your audience is non-programmers or infrequent programmers:

Most people will not look at GitHub accounts or care which account is associated with your code. They will look at the manual. They will look at journal articles about your code. It's possible they might look at a copyright notice.

Write a useful manual, and state at the beginning of it who should get credit for writing the software.

0
2

You don't have to answer these questions, but what country and what institution? In my country (USA) there are intellectual property laws for federally funded research (which covers a lot of, if not most, STEM research), and institutional rules resulting from those laws that make it expedient for the institution to be able to own the repo and whatever it contains. When students, postdocs, or other researchers leave, there needs to be a guarantee that GitHub ownership or access stays with the lab or institution. At my institution (UT Austin), that incentivizes many projects to go open source, which is fine under both federal law (the Bayh-Dole Act) and UT policy, although UT requires explicit paperwork. The feds don't seem to care too much, but they do like commercial products and OSS products simultaneously.

Seeing commits as credit/blame is one thing (which I guess you could show off on a CV), but that's not the real point. Bayh-Dole gives the institution the opportunity, practically the mandate, to commercialize the software, which it sometimes tries to do. But, as you can find on the internet, things get weird when it comes to commercializing software. Nevertheless, your institution may have policies in effect that demand that you declare software products created under government-funded work so that the law can be followed. We are consulted about whether we think the software is commercializable, and the law requires the "inventor" to get a cut of any sales or licenses (which I've seen happen to the substantial benefit of the author), but it's not pushed very hard if you say "there's no market for selling this as a product, and we ought to open source it on GitHub." UT trusts the author/inventor to make that call, and many NSF grants are starting to come with a mandate that any software developed under the grant be opened (I don't know how this is made to mesh with Bayh-Dole's desire to commercialize everything!).

All that makes me think that you ought to have a GitHub owner account for the lab that is controlled by the most expert git/GitHub user in the lab, with the password information shared via a good mechanism (i.e. a password safe that supports sharing). There also ought to be a process in place for changing both the GitHub owner account and any other shared passwords when that person leaves.

2
  • 2
    Storing passwords in a password manager is a good practice, but GitHub allows for organizations with multiple owners, and GitHub recommends that you have at least two. That way everyone can log in with their own private account and nobody has to share passwords. Ensuring continuity of ownership is important, however you do it. Commented Aug 30, 2020 at 1:56
  • 1
    There is no relationship between "GitHub ownership" and "intellectual property ownership." Commented Aug 30, 2020 at 21:57
