8

I like my GitHub repositories to include .py files instead of .ipynb files, but I much prefer working in Jupyter when doing projects.

I'd ideally like to work on my project by:

  • Performing a git pull from the repository I'm working on, which contains a .py file that I've converted from my Jupyter notebook
  • Updating the Jupyter notebook with my changes before a commit
  • Downloading the notebook as a .py file and then making the commit, with the notebook file ignored by git

Is there a way to do this 'updating of the py file' without manually downloading my notebook as a .py file, taking it from my Downloads folder, dragging it into my working directory, and replacing the existing .py file before finally making a commit? It just seems very unclean.

Perhaps I should have my .ipynb in the repository as well, so that when I pull, I can be sure the notebook is up to date and matches the .py file exactly? But I would want it hidden from the public.

I just want this to be as seamless as working with plain .py files: I do the actual work in Jupyter, and the public sees the final code as .py files. I want to do it relatively fluidly, without a lot of manual dragging, dropping, and replacing of old files. Doing that kind of ruins the point of using git, as I'm no longer using version control "elegantly".

3
  • 1
    Just curious: is there anything that discourages you from committing the ipynb files instead? Two approaches to committing ipynb files are to strip the outputs (to minimize diffs) or to keep them as-is (which I prefer for reproducibility's sake when doing research). What I think people do not realise is that there are excellent, highly customizable tools (jupyterlab-git, nbdime) for creating diffs of notebooks, and that notebooks generally display well on GitHub (if kept to a reasonable size, which is good practice in the first place).
    – krassowski
    Commented May 8, 2021 at 15:18
  • 1
    @krassowski I just thought that notebooks would be less accessible for users than a simple py file.
    – sangstar
    Commented May 8, 2021 at 15:20
  • Ok. See my answer below.
    – krassowski
    Commented May 8, 2021 at 15:22

2 Answers

5

Yes. jupytext can keep .ipynb and .py files in sync for you via so-called "paired notebooks", as long as you edit them in one of the supported clients, such as Jupyter Notebook or JupyterLab.

Alternatively, you could configure a pre-commit Git hook to run the jupytext conversion from .ipynb to .py just before every commit, which is documented here. In short, you would need to install pre-commit, e.g. with:

pip install pre-commit

and then create/modify .pre-commit-config.yaml with:

repos:
-   repo: https://github.com/mwouts/jupytext
    rev: v1.11.2  # (replace with the latest stable version)
    hooks:
    - id: jupytext
      args: [--from, ipynb, --to, "py"]

It would also be possible to configure a pre-commit hook using the built-in Jupyter conversion capabilities (e.g. jupyter nbconvert --to script) if you do not want to depend on jupytext, but this will provide fewer configuration options.
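
For illustration, here is a minimal sketch of what a dependency-free conversion step could look like if a hook called a small script of your own. To be clear, this is neither jupytext nor nbconvert: it is a stand-in that reads the notebook JSON with the Python standard library, assuming the nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "source"). The function name notebook_to_script and the "# %%" cell markers are choices made for this example.

```python
import json
from pathlib import Path


def notebook_to_script(ipynb_path, py_path=None):
    """Write the code cells of a notebook to a .py file and return its path.

    Hypothetical helper for illustration; assumes an nbformat 4 notebook.
    """
    ipynb_path = Path(ipynb_path)
    # Default to a .py file next to the notebook, with the same stem.
    py_path = Path(py_path) if py_path else ipynb_path.with_suffix(".py")
    notebook = json.loads(ipynb_path.read_text(encoding="utf-8"))

    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are skipped in this sketch
        source = cell.get("source", "")
        # nbformat stores the source as a list of lines or as one string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append("# %%\n" + source.rstrip() + "\n")

    # Separate cells with a blank line.
    py_path.write_text("\n".join(chunks), encoding="utf-8")
    return py_path
```

Calling notebook_to_script("analysis.ipynb") (a hypothetical file name) would write analysis.py alongside the notebook. Unlike jupytext, this is one-way only and drops markdown cells, so it is best read as a sketch of the idea rather than a replacement.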

3
  • 1
    This way the .ipynb will still be tracked in git... is it possible to ignore .ipynb files, for example by unstaging them in a separate hook? My search on the topic left me at this (outdated) piece of documentation in jupytext: github.com/mwouts/jupytext/blob/…
    – I. Amon
    Commented Nov 24, 2021 at 19:34
  • I don't understand your comment. .ipynb will be tracked if you choose so - but you can always configure .gitignore if you don't want them.
    – krassowski
    Commented Nov 24, 2021 at 20:02
  • 4
    The problem is that pre-commit requires the .ipynb files to be tracked and won't convert them if they are .gitignored. So you would have to unstage them in a separate pre-commit hook, as far as I understand. Is there a way around that?
    – I. Amon
    Commented Nov 25, 2021 at 10:18
0

I do store Jupyter notebooks in Git. I clear all outputs, save the notebook, then commit. I don't use a pre-commit hook, as the process is simple to follow manually. We review notebook changes in pull requests when needed, which is easy to do if the outputs are cleared.

Check out this video for a demo of how I put notebooks under Git.
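
If clearing outputs by hand ever gets tedious, that step could in principle be scripted. The following is a minimal sketch using only the Python standard library, assuming an nbformat 4 notebook in which code cells carry "outputs" and "execution_count" fields. The function name clear_outputs is made up for this example, and this is not the workflow shown in the video.

```python
import json
from pathlib import Path


def clear_outputs(ipynb_path):
    """Strip outputs and execution counts from a notebook file in place.

    Hypothetical helper for illustration; assumes an nbformat 4 notebook.
    """
    ipynb_path = Path(ipynb_path)
    notebook = json.loads(ipynb_path.read_text(encoding="utf-8"))

    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the In[n] counter

    # Rewrite the file; indent=1 keeps the JSON line-oriented for diffs.
    ipynb_path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return notebook
```

Running clear_outputs("analysis.ipynb") (a hypothetical file name) before git add leaves only the cell sources and metadata in the file, which keeps diffs small in pull requests.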
