diff-ability of notebooks #3065

Closed
ghost opened this issue Mar 23, 2013 · 5 comments

@ghost

ghost commented Mar 23, 2013

I've had a few sessions of reading through notebooks containing lots of text
and sending PRs for fixes. There are a couple of snags I hit:

  • running cells in a notebook creates a lot of "diff noise" due
    to changes in the "prompt_number" fields.
    This necessitates individually filtering them out with "git add -p", which
    gets tedious quickly.
  • Large blocks of text are serialized as a single row in the notebook.
    I realize this is a constraint imposed by JSON, but it means that
    deleting a spurious comma results in a diff for what could be 50
    lines of text, as diff is line-based.

I can't suggest a solution that doesn't involve changing the nb format,
so I guess this is just fyi.

@minrk
Member

minrk commented Mar 23, 2013

> running cells in a notebook creates a lot of "diff noise" due to changes in the "prompt_number" fields. This necessitates individually filtering them out with "git add -p", which gets tedious quickly.

Input prompts are output information, and they definitely belong in the cell data. For those who want to strip these for git reasons, it should be easy to write a git pre-commit hook that either just strips the prompts, or does a full reset && run all && save on every changed notebook. But if you re-ran your notebook in a different order from before, the changed prompts are not noise - they are real information.
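
A rough sketch of the prompt-stripping half of such a hook, in Python. It assumes the v3 on-disk layout, in which "prompt_number" appears as a key on code cells and on some outputs. The names `strip_prompt_numbers` and `strip_notebook_text` are made up for illustration, and dropping the key outright (rather than resetting it) is a choice made here, not something IPython prescribes.

```python
import json


def strip_prompt_numbers(node):
    """Recursively drop every "prompt_number" key from parsed notebook JSON."""
    if isinstance(node, dict):
        return {
            key: strip_prompt_numbers(value)
            for key, value in node.items()
            if key != "prompt_number"
        }
    if isinstance(node, list):
        return [strip_prompt_numbers(item) for item in node]
    return node  # strings, numbers, booleans, None pass through unchanged


def strip_notebook_text(text):
    """Take notebook JSON as text and return it without prompt numbers."""
    notebook = json.loads(text)
    # indent/sort_keys are picked here only to keep the layout stable
    # between runs; they are an assumption, not a statement about IPython.
    return json.dumps(strip_prompt_numbers(notebook), indent=1, sort_keys=True) + "\n"
```

A hook would run this over each staged .ipynb file and write the result back before committing; that wiring is left out here.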

> Large blocks of text are serialized as a single row in the notebook. I realize this is a constraint imposed by JSON, but it means that deleting a spurious comma results in a diff for what could be 50 lines of text, as diff is line-based.

Where are you seeing this? It isn't true of input or markdown.

@ghost
Author

ghost commented Mar 23, 2013

Yes, the commit hook solution is what I had in mind, but that's pretty techy for drive-by contributors.

I see long lines in all markdown cells of v3 notebooks, in the 'source' field. Is there something I should be doing
differently other than hitting the "save" button?

@Carreau
Member

Carreau commented Mar 23, 2013

CodeMirror might be soft-wrapping.
You can hard-wrap manually wherever you want; those breaks would appear as separate lines in the JSON.
I don't know if there is an auto hard-wrap for CodeMirror.
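
Hard-wrapping can also be done on the text outside the editor. Below is a minimal sketch with Python's standard textwrap module. `hard_wrap` is a hypothetical helper, and the rule for which blocks to skip (headings, indented code, anything that already contains a newline) is a simplification, not a full markdown parser.

```python
import textwrap


def hard_wrap(markdown, width=79):
    """Hard-wrap single-line paragraphs; leave every other block untouched."""
    wrapped = []
    for block in markdown.split("\n\n"):
        # Only touch plain one-line paragraphs: skip headings, indented
        # code, and blocks the author has already broken by hand.
        is_plain = "\n" not in block and not block.startswith(("#", "    ", "\t"))
        if is_plain:
            block = textwrap.fill(
                block,
                width=width,
                break_long_words=False,   # keep URLs and identifiers whole
                break_on_hyphens=False,
            )
        wrapped.append(block)
    return "\n\n".join(wrapped)
```

Because a single newline inside a markdown paragraph does not change the rendered output, wrapping this way should only affect how the source is stored.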

@minrk
Member

minrk commented Mar 23, 2013

> I see long lines in all markdown cells of v3 notebooks, in the 'source' field. Is there something I should be doing
> differently other than hitting the "save" button?

Ah, I see - long lines that you type will be preserved (just like a regular text file).
But note that in markdown, a single newline is not semantic unless it is preceded by two spaces (a hard line break) or followed by another newline (a paragraph break).
This means that you can wrap your markdown source just as you probably would in a regular markdown file,
and these lines will be reflected in the JSON. In that way, there is nothing special about the notebook - it lets you make whatever decisions you want regarding text wrapping, and it is reflected in the JSON. If you want to write long lines, we will save long lines. If you break up your lines, they are broken in JSON. It's all up to you.
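
To make the effect on diffs concrete, here is a small Python sketch. It assumes the notebook writer stores a cell's source as a JSON list with one string per source line, written one list element per row; `serialize_source` and `removed_rows` are hypothetical helpers, not part of IPython.

```python
import difflib
import json


def serialize_source(text):
    """Mimic the assumed on-disk layout: one JSON string per source line."""
    return json.dumps({"source": text.splitlines(True)}, indent=1)


def removed_rows(before, after):
    """Return the rows of `before` that a line-based diff would mark as deleted."""
    old, new = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    rows = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            rows.extend(old[i1:i2])
    return rows


wrapped = "This is a long paragraph,\nhard-wrapped by hand,\nwith one typo,, here.\n"
unwrapped = wrapped.replace("\n", " ")

for label, text in (("wrapped", wrapped), ("unwrapped", unwrapped)):
    fixed = text.replace(",,", ",")
    rows = removed_rows(serialize_source(text), serialize_source(fixed))
    # Print how many rows the diff touches and how long each one is.
    print(label, [len(row) for row in rows])
```

In the wrapped case only the row holding the typo is reported; in the unwrapped case the single reported row is the entire paragraph.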

@ghost
Author

ghost commented Mar 24, 2013

good, in that case that's exactly as it should be. thanks for the tips.

@ghost ghost closed this as completed Mar 24, 2013