How does git handle squash merge vs normal merge?

Question

Let's say we have a pull request merging the commits from branch A into branch B, and we can complete it with either a normal merge or a squash merge. If we first complete it with a squash merge (all the commits are combined into a single commit) and then submit another, similar PR from branch A to branch B, why does git still allow a normal merge (all the commits are kept)? The changes have already been merged into branch B by the squash merge, so why doesn't the second, normal merge cause any conflict?

"and why does it not cause any conflict" - I'm not a git expert, but I think it's because the end-result of the squashed commit merge is identical to the end-result of the non-squashed merge, so because none of the files are different in-the-end, there's no conflict. Have you tried to perform the same actions locally though? — Dai, Commented Dec 9, 2023 at 7:00
You have a good answer to this question already. I've added another which takes a slightly different approach and also addresses when it makes sense to do what you proposed. — TTT, Commented Dec 11, 2023 at 18:58

eftshift0 · Accepted Answer · 2023-12-09 18:56:11Z

When you squash, git does not keep any information (other than, perhaps, the comment) about the commits that were merged, so, unlike a real merge, git cannot know that the original branch was already merged. That's why squashing is discouraged when you are dealing with long-running branches.

In a real merge, the common ancestor of the two merged branches moves forward, but in a squash merge it does not, so later merges between the two branches will easily produce conflicts: either conflicts that were already resolved in previous squash merges, or new ones.

To explain it graphically, suppose you have this for starters:

  * GGG blah blah (other-branch)
  * FFF
  * EEE
* | DDD (main)
* | CCC
* | BBB
|/
* AAA

At this point, what is the latest common ancestor? AAA, right?

Now, suppose you do a real merge, we get something like this:

* HHH (main)
|\
| * GGG (other-branch)
| * FFF
| * EEE
* | DDD
* | CCC
* | BBB
|/
* AAA

Good. What is the latest common ancestor?

Tip of the answer: It's GGG. Make sure you digest that before moving on.
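One way to check this yourself is `git merge-base`. The sketch below rebuilds a smaller version of the graph in a throwaway repository and prints the merge base before and after the real merge. The file names and the user.name/user.email values are placeholders chosen for the demo, not part of the original example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # call the first branch main on any git version
git config user.name "Example"               # placeholder identity for the demo commits
git config user.email "example@example.com"

git commit -q --allow-empty -m AAA
aaa=$(git rev-parse HEAD)

git checkout -q -b other-branch
echo e > e.txt && git add e.txt && git commit -q -m EEE
echo g > g.txt && git add g.txt && git commit -q -m GGG
ggg=$(git rev-parse HEAD)

git checkout -q main
echo b > b.txt && git add b.txt && git commit -q -m BBB

base_before=$(git merge-base main other-branch)
git merge -q --no-edit other-branch          # HHH, a real merge with two parents
base_after=$(git merge-base main other-branch)

echo "before the merge: $(git log -1 --format=%s "$base_before")"
echo "after the merge:  $(git log -1 --format=%s "$base_after")"
```

The first line should report AAA and the second GGG, because after the merge the tip of other-branch is reachable from main.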

Now, suppose you keep on working on both branches and you end up with this:

* NNN (main)
* MMM
* LLL
| * KKK (other-branch)
| * JJJ
| * III
* | HHH
|\|
| * GGG
| * FFF
| * EEE
* | DDD
* | CCC
* | BBB
|/
* AAA

If you tried to merge again, git would need to consider the changes after the last common ancestor, which we already know is GGG, right? So, git would need to consider this for the merge:

* NNN (main)
* MMM
* LLL
| * KKK (other-branch)
| * JJJ
| * III
* | HHH
|/
* GGG

Now, let's go back to see how it would be if we had squash-merged instead. After the first squash-merge, we would get:

* HHH squash merge (main)
| * GGG (other-branch)
| * FFF
| * EEE
* | DDD
* | CCC
* | BBB
|/
* AAA

And now, what is the latest common ancestor? It's still AAA. Both branches now have a lot of common code, plus some not-so-common code that may have been adjusted during conflict resolution for the squash merge in HHH. How would it look if you had continued working on both branches?
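The same kind of check works for the squash case. This is a minimal sketch in a throwaway repository (file names and identity values are placeholders) that squash-merges other-branch and then asks git for the merge base and for the number of parents of the new commit.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

git commit -q --allow-empty -m AAA
aaa=$(git rev-parse HEAD)

git checkout -q -b other-branch
echo g > g.txt && git add g.txt && git commit -q -m GGG

git checkout -q main
echo b > b.txt && git add b.txt && git commit -q -m BBB

git merge -q --squash other-branch >/dev/null 2>&1   # stages the changes, creates no commit
git commit -q -m "HHH squash merge"

base_after=$(git merge-base main other-branch)
set -- $(git rev-list --parents -n 1 HEAD)   # prints the commit ID followed by its parent IDs
parent_count=$(($# - 1))

echo "merge base after squash: $(git log -1 --format=%s "$base_after")"
echo "parents of HHH: $parent_count"
```

The merge base should still be AAA, and HHH should have a single parent, which is why git has no record that other-branch was merged.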

* NNN (main)
* MMM
* LLL
| * KKK (other-branch)
| * JJJ
| * III
* | HHH squash merge
| * GGG
| * FFF
| * EEE
* | DDD
* | CCC
* | BBB
|/
* AAA

If you tried to merge, git would have to start over and consider the changes from AAA, not from GGG as it did before. Both branches now contain a lot of common code coming from the squash, and it's very likely that both have since touched those sections of code (which makes them different from git's POV), so you will get a bunch of conflicts. It's actually very likely you will get the same conflicts you got when you did the first squash merge (the content on each branch will be a little different from the original conflict, but it will be the same section of code), plus a few more, just for the fun of it.
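To see the difference in practice, here is a small sketch that runs the same history twice, once with a real merge and once with a squash merge, and then merges a follow-up change from other-branch. The helper name `second_merge_result`, the file names and the identity values are all made up for this illustration.

```shell
set -e

# Hypothetical helper: build a fresh repo, merge other-branch into main using
# the given mode ("real" or "squash"), change the same line again on
# other-branch, merge a second time and report "clean" or "conflict".
second_merge_result() {
  mode=$1
  repo=$(mktemp -d)
  (
    cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/main
    git config user.name "Example"
    git config user.email "example@example.com"

    echo "base" > f.txt
    git add f.txt && git commit -q -m AAA

    git checkout -q -b other-branch
    echo "other v1" > f.txt && git commit -q -am GGG

    git checkout -q main
    echo b > b.txt && git add b.txt && git commit -q -m BBB

    if [ "$mode" = squash ]; then
      git merge -q --squash other-branch >/dev/null 2>&1
      git commit -q -m "HHH squash merge"
    else
      git merge -q --no-edit other-branch >/dev/null 2>&1
    fi

    git checkout -q other-branch
    echo "other v2" > f.txt && git commit -q -am KKK   # follow-up edit of the same line

    git checkout -q main
    if git merge -q --no-edit other-branch >/dev/null 2>&1; then
      echo clean
    else
      echo conflict
    fi
  )
}

real=$(second_merge_result real)
squash=$(second_merge_result squash)
echo "second merge after a real merge:   $real"
echo "second merge after a squash merge: $squash"
```

After the real merge the base is GGG, so only other-branch has changed f.txt and the second merge is clean. After the squash merge the base is still AAA, so both sides appear to have changed the same line in different ways and git reports a conflict.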

So, all in all, it's ok to squash, but it should be done for short-lived branches, like feature branches that you work on and delete once they are merged. If you are dealing with long-running branches, make sure to use real merges, unless you would like to take a peek at what hell looks like.

Now, about there not being any conflicts: git will not produce a conflict if exactly the same change is coming from both branches being merged. If you squashed and then try to merge the real branch (without additional changes), then to git the same thing is coming from both branches, so it's ok. There are scenarios (like cherry-picking) where git complains that the operation introduces no real change, and then you need to decide what to do (skip it, or create an empty commit). I would want to test this scenario to confirm that git does not complain and lets the merge go through just like that.

TTT · Accepted Answer · 2023-12-11 18:55:34Z

...if we first perform the merge with squash merge (all the commits will be combined into only one commit) and then submit another similar PR ... why does git still allow the merge in the normal way (all the commits will be kept)?

To restate the problem: first you're merging a branch with squash, and then you're merging the same branch again with a normal merge.

After the first PR with the squash merge, you should observe that the second PR brings in a bunch of commits but with no file changes. This is why there aren't any conflicts, since you can't have conflicts if there is no change in state. The reason that it "allows" you to do it, is because when you merge you are bringing in the new commits, and sometimes it makes sense to do this even if there isn't a change in state. A common scenario where you want to do this is when you decide to cherry-pick some commits from a development branch, into a release branch so it can be deployed sooner. After that you may merge the release branch back down to the development branch to make sure it stays in sync, but since those changes are already in both branches, the merge only brings in the new commit IDs without any actual changes.
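The cherry-pick scenario can be sketched the same way. The branch names `release` and `development`, the file names and the identity values below are assumptions for the demo; the point is only that the merge back creates a commit without changing the tree.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release     # hypothetical release branch
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

git commit -q --allow-empty -m base

git checkout -q -b development               # hypothetical development branch
echo feature > feature.txt && git add feature.txt && git commit -q -m 'feature work'
echo fix > fix.txt && git add fix.txt && git commit -q -m 'urgent fix'
fix=$(git rev-parse HEAD)

git checkout -q release
git cherry-pick "$fix" >/dev/null            # release gets the fix as a new commit
picked=$(git rev-parse HEAD)

git checkout -q development
tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-edit release               # merge release back down
tree_after=$(git rev-parse 'HEAD^{tree}')
merged_parent=$(git rev-parse 'HEAD^2')      # second parent of the merge commit

echo "tree before merge: $tree_before"
echo "tree after merge:  $tree_after"
```

Both tree IDs should be identical: the merge records the cherry-picked commit as a second parent but leaves every file as it was.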

BTW, intending to squash merge followed by a non-squash merge of the same branch is pretty pointless. Instead, before you've done either option, decide if you want the granularity of the commits (regular merge) or you don't (squash merge). Then pick just one. Perhaps the only time it would make sense to perform a regular merge of an identical branch after a squash merge, would be if you already did the squash merge and regretted it, and then realized you wanted to keep the granularity of the commits. Note the reverse is not true; it would never make sense to purposely squash merge the same branch after a regular merge, since the squash merge will add literally zero value, with there being no new content and no existing commits to merge.

I did the normal merge on Github, and for "After the first PR with the squash merge, you should observe that the second PR brings in a bunch of commits but with no file changes", actually it did show me the changes again which is the same as what I saw when having the squash merge for the first time. — Jason Yu, Commented Dec 18, 2023 at 2:59
@JasonYu Ah. That makes sense because the PR shows the 3 dot diff, meaning it will show what's changed since the merge-base from the point of view of the source branch, so all the changes would still show up, even though there aren't any changes. After completing the PR you would see no changes actually occurred. If you test the merge locally you should see nothing actually changed. — TTT, Commented Dec 22, 2023 at 0:13
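A local way to compare the two views, as a rough sketch: after a squash merge, a two-dot diff compares the branch tips directly, while a three-dot diff compares the merge base with the source branch. The repository layout mirrors the small example used elsewhere on this page, and the identity values are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

touch readme.md && git add readme.md && git commit -q -m readme
git checkout -q -b other
printf 'change 1\n' > a.txt && git add a.txt && git commit -q -m 'a first'

git checkout -q main
git merge -q --squash other >/dev/null 2>&1
git commit -q -m squash

two_dot=$(git diff --name-only main other)       # tip of main vs tip of other
three_dot=$(git diff --name-only main...other)   # merge base vs tip of other

echo "two-dot diff lists:   '${two_dot}'"
echo "three-dot diff lists: '${three_dot}'"
```

The two-dot list should be empty while the three-dot list still names a.txt, which matches a PR page that shows changes even though completing the merge alters no files.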

Guildenstern · Accepted Answer · 2024-01-14 11:46:17Z

The squash merge brings in the changes. The second true merge[1] brings in no changes but connects the two histories with a parent pointer to each.

Given this state:

cd /tmp
dir=$(mktemp -d)
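cd "$dir"
git init
# Name the initial branch main regardless of the init.defaultBranch setting
git symbolic-ref HEAD refs/heads/main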
touch readme.md
git add readme.md
git commit -m readme
git checkout -b other
printf "change 1\n" >> a.txt
git add a.txt
git commit -m 'a first'
printf "change 2\n" >> a.txt
git add a.txt
git commit -m 'a second'

You now have:

     a second (other)
     a first
    /
main

You do a squash merge:

git checkout main
git merge --squash other
# --squash only stages the result and does not commit, so finalize it like this
git commit --no-edit

and get:

squash
|    a second
|    a first
|   /
main

The state of main and other are identical:

$ git diff main other
[empty]

But you can still do a true merge:

git merge --no-edit other

And you have:

merge -
squash \
|       a second
|       a first
|      /
main

Why does git allow you to do a true merge? Because you are telling it that you want to connect these two histories. And they haven’t been connected yet; the squash merge has no relation to other since it just takes the changes from other and makes a new commit, not related to other (as you can see in the diagram).

It doesn’t matter that main and other have the exact same tree;[2] the histories still need to be connected.
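If you want to confirm both halves of that statement, the following sketch repeats the steps in a throwaway repository and then inspects the merge commit. It assumes nothing beyond standard git revision syntax (`^{tree}`, `^1`, `^2`); the identity values are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

touch readme.md && git add readme.md && git commit -q -m readme
git checkout -q -b other
printf 'change 1\n' >> a.txt && git add a.txt && git commit -q -m 'a first'
printf 'change 2\n' >> a.txt && git add a.txt && git commit -q -m 'a second'

git checkout -q main
git merge -q --squash other >/dev/null 2>&1
git commit -q -m squash
git merge -q --no-edit other                 # the true merge

merge_tree=$(git rev-parse 'HEAD^{tree}')
squash_tree=$(git rev-parse 'HEAD^1^{tree}') # first parent: the squash commit
other_tree=$(git rev-parse 'other^{tree}')
second_parent=$(git rev-parse 'HEAD^2')      # second parent: the tip of other
other_tip=$(git rev-parse other)

echo "trees:  $merge_tree $squash_tree $other_tree"
echo "second parent is other: $second_parent $other_tip"
```

All three tree IDs should match, and the second parent of the merge should be the tip of other: no content changed, only the history was connected.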

The squash merge might as well have been done by a different person who came along and did a commit with the same contents on top of the initial main commit:

unrelated
|    a second
|    a first
|   /
main

Maybe this person had the same idea as you and happened to implement the same thing. And if you did a true merge then you would get the same result:

merge ---
unrelated \
|          a second
|          a first
|        /
main

What git-merge(1) does when the tree contents are the same

Say you have main and other and they have the same tree (empty diff). By default it will:

1. If other is ahead of main and main has no commits that other does not have:[3] do a fast-forward
2. If other is ahead of main and main has commits which are not reachable from other:[4] do a true merge

The merging of the contents of these two will be a no-op since there is nothing to merge. The only work that needs to be done is to make parent pointers in the case of (2).

And (2) is always the case if you first did a squash merge of other into main. Because main will have at least one commit which is not reachable from other, namely the squash merge.
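Both cases can be reproduced with empty commits, which keep every tree identical by construction. This is only a sketch: the commit messages and identity values are placeholders, and `--allow-empty` is used purely to create history without content.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

git commit -q --allow-empty -m base
git checkout -q -b other
git commit -q --allow-empty -m 'other 1'     # same tree as main, but one commit ahead

# Case (1): main has nothing that other lacks, so the merge fast-forwards
git checkout -q main
git merge -q --no-edit other
ff_main=$(git rev-parse main)
ff_other=$(git rev-parse other)

# Case (2): give each branch a commit the other cannot reach, then merge
git commit -q --allow-empty -m 'main only'
git checkout -q other
git commit -q --allow-empty -m 'other 2'
git checkout -q main
git merge -q --no-edit other
set -- $(git rev-list --parents -n 1 HEAD)   # commit ID followed by its parent IDs
parent_count=$(($# - 1))

echo "case 1: main=$ff_main other=$ff_other"
echo "case 2: merge commit has $parent_count parents"
```

In case (1) main simply moves to the commit other points at. In case (2) git creates a merge commit with two parents even though there is no content to merge.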

Notes

[1] I’ll refer to a merge which creates a new commit which points to both parents as a true merge since I think the git(1) documentation does that.
[2] The contents are identical: file readme.md and a.txt with the same file contents. That’s what we found when we did the diff.
[3] For example: main - a - b - c (other)
[4] For example: main - a - b; a - b2 (other)
