3

I'm wondering how to make PyCharm's VCS integration (i.e. Git) work with Jupyter Notebook files. Changing even one line of code results in three modifications being detected at commit time: [screenshot of the commit diff] Sorry if this is a duplicate, but I haven't found anything similar.

3 Answers

3

Well, I wouldn't say that the current support for versioning Jupyter Notebook files doesn't work at all. You can see in your own screenshot that your changes are detected. We don't parse the changes to pick out only the source code changes. And even if we did, many people actually want to track the output: in data science, for example, results are not always reproducible, so you may want to keep track of the output as well as the source.

That said, it could be improved by implementing https://youtrack.jetbrains.com/issue/PY-20132, which would allow committing all of the changes while showing only the source code changes. Feel free to upvote the issue and leave comments.
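
For illustration only (this is not how PyCharm works internally), here is a minimal sketch of what a "source-only" comparison could look like. It assumes the notebook is plain nbformat 4 JSON; the helper name `strip_volatile_fields` and the sample notebook are made up for this example.

```python
import copy
import json


def strip_volatile_fields(notebook: dict) -> dict:
    """Return a copy of a notebook dict without outputs or execution counts."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook, before and after re-running its only cell.
before = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
            "source": ["print(1)"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
after = copy.deepcopy(before)
after["cells"][0]["execution_count"] = 2  # re-running bumps the counter

# The raw JSON differs, but the stripped versions are identical.
raw_changed = json.dumps(before, sort_keys=True) != json.dumps(after, sort_keys=True)
source_changed = strip_volatile_fields(before) != strip_volatile_fields(after)
print(raw_changed, source_changed)  # True False
```

Note that stripping discards exactly the output that some people want to keep, so this only makes sense for a view of the diff, not necessarily for what gets committed.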

2

I use PyCharm Community Edition. I love the way PyCharm integrates with Git and how its VCS tooling shows diffs visually. However, for Jupyter notebook files, the diff is difficult to follow visually: running a cell introduces various changes.

Notebook files normally diff like text files. I use a simple method to make the diff easier to read. I created a new file type under Settings > Editor > File Types for *.ipynb files, enabled matching for all types of brackets, and added a few keywords:

Keyword 1:

"outputs"
"source"

Keyword 2:

"code"
"markdown"

This highlighting shows up in the PyCharm VCS diff and makes it easy to locate changes in code cells, markdown cells, and outputs. An example of this effect is shown in this screenshot. Now we don't need to worry about changes in the execution count or metadata.
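
The keywords work as landmarks because they are the JSON keys ("outputs", "source") and cell type values ("code", "markdown") that an .ipynb file uses for every cell. As a rough sketch of the same idea in code, the snippet below reports which of those fields changed per cell. It assumes nbformat 4 JSON and that both versions have the same cells in the same order; `changed_fields` and the two sample notebooks are hypothetical.

```python
TRACKED_FIELDS = ("cell_type", "source", "outputs", "execution_count")


def changed_fields(old_nb: dict, new_nb: dict) -> list:
    """List (cell index, changed field names) pairs for two notebook versions.

    Assumes both versions contain the same cells in the same order.
    """
    report = []
    for index, (old_cell, new_cell) in enumerate(zip(old_nb["cells"], new_nb["cells"])):
        changed = [name for name in TRACKED_FIELDS if old_cell.get(name) != new_cell.get(name)]
        if changed:
            report.append((index, changed))
    return report


old_nb = {"cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
    {"cell_type": "code", "execution_count": 1, "metadata": {},
     "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
     "source": ["print(1)"]},
]}
new_nb = {"cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
    {"cell_type": "code", "execution_count": 2, "metadata": {},
     "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
     "source": ["print(2)"]},
]}

report = changed_fields(old_nb, new_nb)
print(report)  # [(1, ['source', 'outputs', 'execution_count'])]
```

Editing and re-running one code cell touches three fields of that cell here, which is the kind of noise the highlighting helps to sort through.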

2
1

It's a bit of a hacky way to version Jupyter notebooks that I just figured out, but you can use Refactor > Convert to Python File to convert the .ipynb to a .py file and then commit the new file.

It can be reversed the same way by using Convert to Jupyter Notebook.

So I would add *.ipynb to .gitignore, version only the *.py files, and "unpack/repack" the notebook files as needed.
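
If you want to do the "unpack" step outside the IDE, here is a minimal standard-library sketch. It is not PyCharm's converter and its output will not necessarily match what Refactor > Convert to Python File produces. It assumes nbformat 4 JSON and uses the `# %%` cell-marker convention; the function names are made up for this example. A maintained tool such as `jupyter nbconvert --to script` is probably a better choice for real use.

```python
import json
from pathlib import Path


def cell_source(cell: dict) -> str:
    """Return a cell's source as one string (nbformat allows a list or a string)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def notebook_to_script(notebook: dict) -> str:
    """Render code cells as code and markdown cells as comments, dropping outputs."""
    chunks = []
    for cell in notebook.get("cells", []):
        text = cell_source(cell)
        if cell.get("cell_type") == "code":
            chunks.append("# %%\n" + text.rstrip("\n") + "\n")
        elif cell.get("cell_type") == "markdown":
            commented = "\n".join(("# " + line).rstrip() for line in text.splitlines())
            chunks.append("# %% [markdown]\n" + commented + "\n")
    return "\n".join(chunks)


def convert_file(ipynb_path: Path) -> Path:
    """Write a sibling .py file next to the given .ipynb file and return its path."""
    notebook = json.loads(ipynb_path.read_text(encoding="utf-8"))
    py_path = ipynb_path.with_suffix(".py")
    py_path.write_text(notebook_to_script(notebook), encoding="utf-8")
    return py_path


example = {"cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "Some text"]},
    {"cell_type": "code", "execution_count": 3, "metadata": {},
     "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
     "source": ["x = 1\n", "print(x)"]},
]}
script = notebook_to_script(example)
print(script)
```

Because outputs and execution counts never reach the .py file, re-running a cell without editing it leaves the committed file unchanged.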

PS: I have PyCharm 2024.1.1

1
  • Nice, and if we can automate it, we can put the commands in Git's pre-commit and post-checkout hooks. Commented May 22 at 13:47
