Getting diff (or git diff) to show inserted hunks properly

Question

Let's say I have two files. The first one has the contents:

line 1
foo
line 2

line 1
bar
line 2

And the second one has a new section inserted in the middle, so it looks like this:

line 1
foo
line 2

line 1
new text
line 2

line 1
bar
line 2

Now, when I do a "diff -u", I get output like this:

--- file1   2013-06-25 16:27:43.170231844 -0500
+++ file2   2013-06-25 16:27:59.218757056 -0500
@@ -1,7 +1,11 @@
 line 1
 foo
 line 2
 
 line 1
+new text
+line 2
+
+line 1
 bar
 line 2

This doesn't properly reflect that the middle stanza was inserted -- instead, it makes it look like the second stanza was changed, and a new one added to the end (this is because the algorithm starts at the first differing line).

Is there any way to get diff (either by itself, or using git diff) to show this output instead?

--- file1   2013-06-25 16:27:43.170231844 -0500
+++ file2   2013-06-25 16:27:59.218757056 -0500
@@ -1,7 +1,11 @@
 line 1
 foo
 line 2
+
+line 1
+new text
+line 2
 
 line 1
 bar
 line 2

This is mostly an issue when generating a patch for someone to review, where a new function gets inserted into a group of similar functions. The default behavior doesn't reflect what really changed.
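
To reproduce this locally, here is a minimal sketch. It assumes a POSIX shell with mktemp and a diff that supports -u. The names file1 and file2 are the ones from the example above; the temporary directory and the variable names workdir and out are only for illustration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# file1: two stanzas separated by a blank line.
printf '%s\n' 'line 1' 'foo' 'line 2' '' 'line 1' 'bar' 'line 2' > file1

# file2: the same content with a new stanza inserted between the two.
printf '%s\n' 'line 1' 'foo' 'line 2' '' \
              'line 1' 'new text' 'line 2' '' \
              'line 1' 'bar' 'line 2' > file2

# diff exits with status 1 when the files differ, so keep "set -e" from aborting.
out=$(diff -u file1 file2 || true)
printf '%s\n' "$out"
```

Whichever way the hunk is placed, it contains four added lines; the question is only about where those four "+" lines start.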

Try sdiff file1 file2; maybe this is what you are looking for. — g4ur4v, Commented Jun 25, 2013 at 21:58
@g4ur4v, not quite -- that still makes it look like part of section 2 was modified and part of section 3 added -- when in reality, a new section was inserted between the other two. — Derek Pressnall, Commented Jun 25, 2013 at 22:14
"new function gets inserted into a group of similar functions" is a bit of a code smell itself, except too, too common in some languages. Have you tried --unified 5 or larger values? — msw, Commented Jun 26, 2013 at 0:58
@msw, I agree about the code smell in general -- I can't recall what this original case was. However my most recent case was when inserting records into an XML database export; in this case the new records will often be similar to the surrounding records (almost identical to the example I have above). As for adding a large number to the --unified flag, that just gives more context, but doesn't change where the "+" signs appear. — Derek Pressnall, Commented Jun 26, 2013 at 18:32
XML is grossly repetitive. I've not chased down any of the links but perhaps stackoverflow.com/questions/1871076/… might be useful. I was then thinking about the longest common sub-sequence algorithm and realized it, of necessity, would generate source-ignorant diffs. This turned up msdn.microsoft.com/en-us/library/aa302294.aspx which appears to operate at a semantic level. — msw, Commented Jun 26, 2013 at 20:24

rink.attendant.6 · Accepted Answer · 2016-11-24 15:13:13Z

Git 2.9, released earlier this year, added the experimental flag --compaction-heuristic to the git diff command:

In 2.9, Git's diff engine learned a new heuristic: it tries to keep hunk boundaries at blank lines, shifting the hunk "up" whenever the bottom of the hunk matches the bottom of the preceding context, until we hit a blank line.

I don't think GitHub has it enabled for diffs on the web UI for Pull Requests and comparisons, but you can do it locally. I'd recommend using it in conjunction with --word-diff if you need that level of granularity.

More details available on the GitHub blog: https://github.com/blog/2188-git-2-9-has-been-released
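
For newer Git versions, here is a hedged sketch of how to try this on the example from the question. As far as I know, --compaction-heuristic was later superseded by --indent-heuristic (added around Git 2.11 and, I believe, on by default in recent releases), so treat the flag name as an assumption and check the release notes for your version. The --no-index option lets git diff compare two files outside a repository. Which of the equivalent hunk placements Git picks for this input may vary by version, so no expected output is shown.

```shell
#!/bin/sh
set -eu

# Throwaway directory outside any repository (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the two files from the question.
printf '%s\n' 'line 1' 'foo' 'line 2' '' 'line 1' 'bar' 'line 2' > file1
printf '%s\n' 'line 1' 'foo' 'line 2' '' \
              'line 1' 'new text' 'line 2' '' \
              'line 1' 'bar' 'line 2' > file2

# git diff --no-index exits with status 1 when the files differ,
# so "|| true" keeps "set -e" from aborting the script.
out=$(git diff --no-index --indent-heuristic file1 file2 || true)
printf '%s\n' "$out"
```

To compare against the unshifted result, the same command can be run with --no-indent-heuristic instead.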

Doesn't look like that flag exists anymore, at least on git 2.20 — user114651, Commented Sep 19, 2019 at 17:51

chirlu · Accepted Answer · 2013-06-25 22:44:22Z

The patience diff algorithm (git diff --patience) may give you more natural results, though not in all cases.

This still produced the same results in my example above. I know there is a solution somewhere, as I remember reading about it a while ago; I just can't remember what it was. – Derek Pressnall, Commented Jun 26, 2013 at 18:34

Asenar · Accepted Answer · 2014-09-03 10:26:01Z

In certain cases, git diff --word-diff (or --color-words) may give you better-looking results.
