I used a fork of a GitHub repository whose paper I cited – should I also cite the fork?

Question

I want to cite a GitHub repository in a scientific publication. The repository I used is a fork of the original GitHub repository, which accompanies a paper. The two repositories are otherwise identical, but the fork has additional code for further calculations and plots.

I cited the paper to refer to the methodology. However, I did not use the original GitHub repository provided with the paper. Instead, I used the fork.

Should I cite the fork?

Welcome back, nikki! Would you mind explaining "fork" to make the question more self-contained? – user111388, Commented Feb 29, 2020 at 20:55
You can be generous with citing. It won't hurt you, and it gives credit where credit is due. – Per Alexandersson, Commented Feb 29, 2020 at 21:40
If you didn't use the additional code, you could have used the original repository. Assuming you did use the additional code, of course you should cite the fork. – chepner, Commented Mar 1, 2020 at 16:26

Wrzlprmft · Accepted Answer · 2020-02-29 10:40:02Z

22

You should cite the original paper and the GitHub fork. You are obligated to cite all sources you used, even if they overlap. Your methods should also state exactly what code you reused.

edited Feb 29, 2020 at 10:40

Wrzlprmft♦

answered Feb 29, 2020 at 10:30

Anonymous Physicist

The licence on the original repository might say all subsidiary works need to reference the original. The licence on the fork might say it needs to be referenced too.
– CJ Dennis
Commented Mar 1, 2020 at 1:49

Wrzlprmft · Accepted Answer · 2020-02-29 10:39:45Z

For simplicity, original code refers to the software provided by the original repository or paper, and additional code refers to what is only provided by the fork.

A good litmus test is this: Suppose you had written the additional code yourself. Would you mention this, what algorithms you used, etc. in the paper? Or would this code be part of your publication, e.g., to ensure reproducibility? If yes, you should also cite the fork.

For example, if the original code allows you to perform some simulations and the additional code is only about plotting and does not touch the actual simulations (and you checked this to a reasonable extent), do not cite the fork, for the same reason that you would not mention or provide your own code or a plotting library that does nothing but plot existing data. There is no reason to assume that the original does not suffice to reproduce your results. If the fork considerably helped you in preparing your plots, consider acknowledging it.

If, however, the additional code modifies the original code in a way that could affect the results, cite it. It does not matter whether the code added any functionality you used; without this citation, your work might no longer be reproducible. Remember that citing does not only give credit but also shifts blame.

Better to err on the side of giving a potential replicator of your results too much detail. – vonbrand, Commented Feb 29, 2020 at 23:49
This does not account for the licenses on both repositories. The license on the forked repository might state that accreditation is required. – Robin De Schepper, Commented Mar 1, 2020 at 20:00
@RobinDeSchepper: In my experience, licences requiring accreditation are rather unusual in science (and in general), because they prevent the inclusion of the software in larger bundles, distributions, etc. and thus do more harm than good. But yes, if that applies here, you have to account for it. – Wrzlprmft, Commented Mar 1, 2020 at 20:58

user2768 · Accepted Answer · 2020-02-29 09:34:56Z

2

Cite the most relevant version.

Could you please let me know whether I should cite the fork version or not?

You used a repository that forks the original to add code for further calculations and plots.* Given that you cite the paper's methodology, which appears in the original repository and is unchanged in the fork, I'd suggest using the original repository. That said, it doesn't really matter which you use.

If the paper is published, then cite the published version, rather than GitHub.

* I don't immediately understand why a fork would be needed for additional code. That code could have appeared in the original repository.

answered Feb 29, 2020 at 9:34

user2768

8

Perhaps the additional code was added by someone different than the owners of the original repository (neither of which is the OP), which would explain having it in a fork.
– GoodDeeds
Commented Feb 29, 2020 at 19:22
2

Could have appeared, but didn't.
– vonbrand
Commented Feb 29, 2020 at 23:49
@vonbrand Indeed. It raises questions.
– user2768
Commented Mar 2, 2020 at 8:06
