I'm an academic rather than a programmer, and I have many years' experience writing Python programs for my own use, to support my research. My latest project is likely to be useful to many others as well as me, and I'm thinking of releasing it as an open-source Python library.
However, there seem to be quite a few hurdles to clear in going from a functioning personal project to a library that can be installed and used painlessly by others. This question is about the first steps I should take in order to start working toward a public release.
Currently, I have a single git repository that contains my code that uses the library as well as the library itself, and I use git as an emergency undo button in case anything breaks. All of this works fine for a single user but is obviously not appropriate if I want to release it. Where I want to end up is that my library is in a separate repository, can be installed by others using pip, and has a stable API.
Learning to use setuptools etc. is probably not so hard once I'm at the point of wanting to publish it - my problem is in knowing how I should be working in order to get to that point.
So my question is, what are the first steps one should take in order to start preparing a Python library project for public consumption? How should I reorganise my directory structure, git repository etc. in order to start working towards a public release of the library?
More generally, I would appreciate pointers to resources that are known to be useful when attempting this for the first time, as well as to best practices and mistakes to avoid.
Some clarification: the current answers are addressing a question along the lines of "how can I make my Python library a good one for others to use?" This is useful, but it's different from the question I intended to ask.
I'm currently at the start of a long journey towards releasing my project. The core of my implementation works (and works really well), but I'm feeling overwhelmed by the amount of work ahead of me, and I'm looking for guidance on how to navigate the process. For example:
My library code is currently coupled to my own domain-specific code that uses it. It lives in a subfolder and shares the same git repository. Eventually, it will need to be made into a stand-alone library and put into its own repository, but I keep procrastinating on this because I don't know how to do it. (I know neither how to install a library in 'development mode' so that I can still edit it, nor how to keep the two git repos in sync.)
My docstrings are terse, because I know that eventually I will have to use Sphinx or some other tool. But these tools seem not to be simple to learn, so this becomes a major sub-project and I keep putting it off.
At some point I need to learn to use setuptools or some other tool to package it and track the dependencies, which are quite complex. I'm not sure whether I need to do this now or not, and the documentation is an absolute maze for a new user, so I keep deciding to do it later.
I've never had to do systematic testing, but I definitely will for this project, so I have to (i) learn enough about testing to know which methodology is right for my project; (ii) learn what tools are available for my chosen methodology; (iii) learn to use my chosen tool; (iv) implement test suites etc. for my project. This is a project in itself.
There may well be other things I have to do as well. For example, jonrsharpe posted a helpful link that mentions git-flow, tox, TravisCI, virtualenv and CookieCutter, none of which I'd heard of before. (The post is from 2013, so I also have to do some work to find out how much is still current.)
When you put this all together it's a huge amount of work, but I'm sure I can get it all done if I keep plugging away at it, and I'm not in a hurry. My problem is knowing how to break it down into manageable steps that can be done one at a time.
In other words, I'm asking which are the most important concrete steps I can take now, in order to reach a releasable product eventually. If I have a free weekend, which of these things should I focus on? Which (if any) can be done in isolation from the others, so that I can at least get one step done without needing to do the whole thing? What's the most efficient way to learn these things so that I will still have time to focus on the project itself? (Bearing in mind that all of this is essentially a hobby project, not my job.) Is there any of it that I don't actually need to do, thus saving myself a huge amount of time and effort?
All answers are greatly appreciated, but I would especially welcome answers that focus on these project management aspects, with specific reference to modern Python development.