edited title
Royi

Using IPython / Jupyter Notebooks Under Version Control

Question Protected by Anshul Goyal
fixed broken link
Vadim Kotov

I have considered several options that I shall discuss below, but have yet to find a good comprehensive solution. A full solution might require some changes to IPython, or may rely on some simple external scripts. I currently use mercurial, but would like a solution that also works with git: an ideal solution would be version-control agnostic.

Fix formatting and add link to mentioned Gregory Crosswhite's solution

The notebook format is quite amenable to version control: if one wants to version control the notebook together with its outputs, this works quite well. The annoyance comes when one wants to version control only the input, excluding the cell outputs (a.k.a. "build products"), which can be large binary blobs, especially for movies and plots. In particular, I am trying to find a good workflow that:

Update: I have been playing with my modified notebook version, which optionally saves a .clean version with every save using Gregory Crosswhite's suggestions. This satisfies most of my constraints but leaves the following unresolved:

  • When the notebook is running, one can use the Cell/All Output/Clear menu option for removing the output.
  • There are some scripts for removing output, such as nbstripout.py, which removes the output but does not produce the same result as using the notebook interface. This was eventually included in the ipython/nbconvert repo, but this has been closed with a statement that the changes are now included in ipython/ipython, although the corresponding functionality seems not to have been included yet. (update) That being said, Gregory Crosswhite's solution shows that this is pretty easy to do, even without invoking ipython/nbconvert, so this approach is probably workable if it can be properly hooked in. (Attaching it to each version control system, however, does not seem like a good idea; this should somehow hook into the notebook mechanism.)
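
To make the "strip the outputs" idea concrete, here is a minimal sketch of what such a script might look like. It is not nbstripout.py and not Gregory Crosswhite's solution; it assumes the notebook is stored in the nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"), which is newer than the format in use when this question was written. The function names and the ".clean" suffix handling are hypothetical, chosen only to mirror the .clean file mentioned above.

```python
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with all code-cell outputs removed.

    Assumes the nbformat 4 layout; other format versions are returned
    unchanged apart from the copy.
    """
    # Round-tripping through JSON is a cheap deep copy for JSON-compatible data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def write_clean_copy(path: str) -> Path:
    """Write a stripped copy next to the notebook, e.g. demo.ipynb.clean.

    The ".clean" naming is an assumption made for this sketch.
    """
    source = Path(path)
    notebook = json.loads(source.read_text(encoding="utf-8"))
    target = source.with_name(source.name + ".clean")
    # Sorted keys and fixed indentation keep diffs between revisions stable.
    text = json.dumps(strip_outputs(notebook), indent=1, sort_keys=True)
    target.write_text(text + "\n", encoding="utf-8")
    return target
```

Because it only touches the JSON on disk, a function like this does not depend on any particular version control system; it could be called from a commit hook or, preferably, from whatever save mechanism the notebook server exposes.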

Added an update about what issues still remain.
mforbes