23

I have an IPython notebook that is several megabytes in size, although the code inside is only about 100 lines. I think it is that large because I load several images in it.

I would like to add this notebook to a git repository. However, I don't want to upload something that big which can easily be generated again.

Is it possible to save just the code of an IPython notebook to reduce its size?

2
  • stackoverflow.com/questions/18734739/… may be related. See the section about stripping the output.
    – cel
    Commented Jun 14, 2016 at 9:01
  • Another experimental tool that might help: recombinecm. It saves the notebook as two files, and the idea is that you put the clean code-only file in version control, and not the file with all the outputs.
    – Thomas K
    Commented Jun 14, 2016 at 17:14

4 Answers

35

You can try the following steps, which worked for me:

In the menu, select "Cell" -> "All Output" -> "Clear".

Then save the file.

This will reduce the size of your file (from MBs to KBs). It will also reduce the time needed to load the notebook the next time you open it in your browser.

As far as I understand, this clears all the output created by executing the code. A notebook file holds not only the code, images, and comments but also the output of every cell, and that stored output is what increases the size of the notebook.
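If you would rather do the same thing from a script, here is a minimal sketch. It assumes the notebook uses the nbformat 4 layout (a top-level "cells" list); the function names `clear_outputs` and `clear_outputs_in_file` are made up for this example, and it uses only the standard library rather than any Jupyter API.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs and execution counts from every code cell, in place.

    Assumes the nbformat 4 layout, where cells live in a top-level "cells" list.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(src_path: str, dst_path: str) -> None:
    """Read a notebook from src_path and write an output-free copy to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Writing to a separate `dst_path` keeps the original notebook (with its outputs) untouched, so you can commit only the stripped copy.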

4
  • 1
    This reduced mine from 200mb to a few kb. Thanks!
    – azizbro
    Commented Oct 2, 2019 at 1:29
  • 2
    In addition to this, widgets can easily add several MB of data to a notebook. Widget data can be cleared with dropdown Widgets > Clear Notebook Widget State
    – Gman
    Commented Jul 18, 2020 at 21:07
  • 1
    Thank you so much @Yogesh, I was starting to hate Jupyter because of that issue.
    – BND
    Commented Aug 8, 2020 at 6:15
  • Nothing helped me until I used @Gman's method. Images and output clearing didn't make a dent, even after it looked like there were no widgets present anymore. Widget clearing changed several notebooks from over 100MB each to 20k each.
    – thorr18
    Commented Apr 29, 2021 at 3:56
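For the widget state mentioned in the comments, a rough script equivalent is sketched below. It assumes the saved widget state sits under the "widgets" key of the notebook-level metadata, which is where I have seen it stored; `clear_widget_state` is a hypothetical helper name, not part of Jupyter.

```python
import json


def clear_widget_state(notebook: dict) -> bool:
    """Drop saved widget state from the notebook metadata.

    Assumption: the state is stored under notebook["metadata"]["widgets"].
    Returns True if something was removed, False otherwise.
    """
    metadata = notebook.get("metadata", {})
    return metadata.pop("widgets", None) is not None


def clear_widget_state_in_file(path: str) -> bool:
    """Rewrite the notebook at `path` without widget state, if it had any."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    removed = clear_widget_state(notebook)
    if removed:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(notebook, f, indent=1, ensure_ascii=False)
            f.write("\n")
    return removed
```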
2

Nowadays you can use jupytext to generate a simple script linked to the notebook, which others can rerun.

If you need to keep the images in the notebook (because, for example, you are sharing it with someone who does not want to or cannot rerun it), you might want to try reducing the size of the images.

I found this module ipynbcompress which seems to do exactly this, but so far I could not install it.
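To illustrate the "keep only the code" idea, here is a small standard-library sketch. To be clear, this is not jupytext and does not use its API; it only shows the general approach under the assumption of an nbformat 4 notebook. The `# %%` line is the cell-marker convention used by "percent" format scripts, and `notebook_to_script` is a name invented for this example.

```python
import json


def notebook_to_script(notebook: dict) -> str:
    """Concatenate the source of all code cells into one script string.

    Each cell is preceded by a "# %%" marker. Markdown cells and all
    outputs are dropped. Assumes the nbformat 4 layout.
    """
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append("# %%\n" + source.rstrip("\n") + "\n")
    return "\n".join(chunks)


def export_script(notebook_path: str, script_path: str) -> None:
    """Write the code cells of the notebook at notebook_path to script_path."""
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(script_path, "w", encoding="utf-8") as f:
        f.write(notebook_to_script(notebook))
```

Note that cells containing IPython-only syntax such as `!ls` or `%matplotlib inline` are copied verbatim, so the resulting file is not guaranteed to run under plain Python.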

1
  • This is a good option, it works like a charm. Thank you! Commented Sep 2, 2023 at 19:30
1

I ran into the exact same problem with one of my notebooks, and I solved it by displaying df.head(5) instead of the whole df. I did this instead of clearing all outputs because I still wanted to show on GitHub how my code changed the data in the columns of my df.

You can also run !ls -lh in the last cell of your notebook to check the size of the notebook file before saving. This will give you an idea of whether you need to clear outputs, replace df with df.head(), or remove images in order to reduce the size and be able to push the notebook to GitHub.
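If you want to know which cells are responsible for the size before deciding what to trim, something like the sketch below might help. It measures the serialized JSON size of each code cell's outputs, which is only an approximation of the bytes on disk; `output_sizes` is a hypothetical helper and the nbformat 4 layout is assumed.

```python
import json


def output_sizes(notebook: dict) -> list:
    """Return (cell_index, approx_output_bytes) pairs, largest first.

    The size is the length of the JSON-serialized "outputs" list of each
    code cell, so it approximates that cell's share of the file size.
    """
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") == "code":
            payload = json.dumps(cell.get("outputs", []))
            sizes.append((index, len(payload.encode("utf-8"))))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


def report(path: str, top: int = 5) -> None:
    """Print the `top` code cells with the largest stored outputs."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    for index, size in output_sizes(notebook)[:top]:
        print(f"cell {index}: about {size / 1024:.1f} KiB of output")
```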

-1

Yes, you will have big issues with GitHub, checkpoints, and Git history.

Clear the kernel's outputs, save, and then push to GitHub.

Be careful not to commit checkpoints that are loaded with images. They also have to be under the 100 MB limit. Clear them out first, or else add the checkpoint folder to .gitignore.
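As a sketch of the .gitignore part: Jupyter keeps its checkpoints in a folder named `.ipynb_checkpoints`, so the helper below appends that entry to a repository's .gitignore if it is missing. The function name `ignore_checkpoints` and the `repo_dir` parameter are assumptions made for this example.

```python
from pathlib import Path


def ignore_checkpoints(repo_dir: str, entry: str = ".ipynb_checkpoints/") -> bool:
    """Make sure `entry` is listed in <repo_dir>/.gitignore.

    Returns True if the entry was added, False if it was already present.
    """
    gitignore = Path(repo_dir) / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    if entry in existing.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(existing + prefix + entry + "\n", encoding="utf-8")
    return True
```

This only stops new checkpoint files from being added; checkpoints that are already committed stay in the Git history.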
