45

I merged the upstream of a large project into my local Git repo. Prior to the merge I had a small amount of history that was easy to read through, but after the merge my repo contains a massive amount of history. I don't need all of the commit history from the upstream repo.

There have been other commits made after this upstream merge that I would like to keep. How do I squash all that history that was merged from the upstream into one commit while keeping the commits made after the upstream merge?

5 Answers

40

I was able to squash several commits after multiple merges from the master branch using the strategy found here: https://stackoverflow.com/a/17141512/1388104

git checkout my-branch            # The branch you want to squash
git branch -m my-branch-old       # Change the name to something old
git checkout master               # Checkout the master branch
git checkout -b my-branch         # Create a new branch
git merge --squash my-branch-old  # Get all the changes from your old branch
git commit                        # Create one new commit

You will have to force an update if you need to push your squashed branch to a remote repository that you have previously pushed to, e.g. git push origin my-branch -f
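The steps above can be tried safely in a throwaway repository first. The following is only a sketch: the temporary directory, the `make_commit` helper, and the file and commit names are all made up for illustration, and it uses `master` as the base branch to match the commands above.

```shell
set -e

# Throwaway repository (hypothetical setup, not part of the original answer).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit base
git checkout -q -b my-branch
make_commit feature-1
make_commit feature-2
make_commit feature-3

# The steps from the answer.
git branch -m my-branch-old               # rename the current branch
git checkout -q master
git checkout -q -b my-branch              # fresh branch starting at master
git merge -q --squash my-branch-old       # stage all changes, no commit yet
git commit -q -m "Squashed my-branch"     # one commit carrying everything

# Number of commits my-branch has on top of master.
git rev-list --count master..my-branch
```

The old branch is left in place as `my-branch-old`, so nothing is lost until you delete it yourself.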

1
  • 1
    git checkout main: make sure you are on the latest commit, in case your local main is not pointing at the latest head!
    – Sion C
    Commented Feb 7, 2023 at 10:47
14

The solution I ended up using was to manually recreate the history. I did this mainly because I didn't want to spend too much time looking for an elegant solution and there wasn't that much history (around 30 commits I'd have to manually merge).

So, I created a branch before I merged the huge upstream:

git checkout -b remove-history-fix <commit ID before merge>

Then re-merged the upstream using the --squash option. A squash merge only stages the changes, so it needs an explicit commit afterwards.

git merge --squash <upstream tag>
git commit

Then manually cherry-picked the commits after the merge from the old branch (the one with the huge amount of upstream history).

git cherry-pick <commit ID>

After all those commits were merged into my remove-history-fix branch, I removed the branch with the upstream history.

git branch -D <upstream-history-branch>
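For anyone who wants to rehearse this before touching a real repository, here is a rough end-to-end sketch in a scratch repository. Everything in it is assumed for illustration: the `make_commit` helper, the `upstream` branch standing in for the real upstream tag, the `before-merge` tag standing in for the commit ID before the merge, and the commit names.

```shell
set -e

# Scratch repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit initial
git checkout -q -b upstream               # stands in for the upstream project
make_commit upstream-1
make_commit upstream-2
make_commit upstream-3
git checkout -q master
make_commit local-1
git tag before-merge                      # stands in for <commit ID before merge>
git merge -q -m "Merge upstream" upstream # the merge that floods the history
make_commit local-2                       # commits made after the merge
make_commit local-3

# Recreate the history with the upstream squashed into one commit.
git checkout -q -b remove-history-fix before-merge
git merge -q --squash upstream
git commit -q -m "Squashed upstream"
git cherry-pick master~2..master          # replay local-2 and local-3

git log --oneline
```

In this sketch the old `master` is kept so the two branches can be compared with `git diff master remove-history-fix`; on a real repository the old branch would be deleted as the last step, as described above.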
2
  • just wanted to add that after many attempts myself, this process (or the copy-paste variations of this process) is also the only way I could find to squash stuff that is no longer at the tips of branches after merge commits flood the branch timeline. Such merges are essential for branches that take a long time to PR (squashes, on the other hand, are a minor luxury). Just add your rebase -i tasks before any preliminary merge and it will save you the worrying later
    – leRobot
    Commented May 8, 2015 at 3:35
  • Worth noting that by doing a squash merge, you are not recording an actual merge, thus potentially leaving your upstream commits orphaned (for example, if you are squash-merging a hotfix branch and then delete the upstream hotfix branch with no tag).
    Commented Aug 26, 2019 at 14:25
4

A couple of options for you:

Limit Logging

Not exactly what you asked for, but possibly a good alternative, and a lot easier. This allows you to use git like normal, but hides all the stuff you don't want to see. It assumes the issue is the history cluttering up your log and not the raw storage space: I think squashing the merge in your branch won't prevent git from storing all the commits from upstream if you fetched the upstream for the merge in the first place.

In this case, you would do a normal merge, but when logging you would add --first-parent to the command.

For example, without the option I might have (assume "sample more" 1 to 3 was actually a lot more commits)

$ git log --oneline
0e151bf Merge remote-tracking branch 'origin/master' into nosquash
f578cbb sample more 3
7bc88cf sample more 2
682b412 sample more 1
fc6e1b3 Merge remote-tracking branch 'origin/master'
29ed293 More stuff
9577f30 my local change
018cb03 Another commit
a5166b1 Initial

But, if I add --first-parent it cleans up to this:

$ git log --oneline --first-parent
0e151bf Merge remote-tracking branch 'origin/master' into nosquash
fc6e1b3 Merge remote-tracking branch 'origin/master'
9577f30 my local change
018cb03 Another commit
a5166b1 Initial

Notice all of the commits from the master after I branched ("my local change" being my divergent commit) are gone. Only commits I made show up, including when I merged. If I had used better commit messages during the merge, I might even know what the batch of changes were.
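To see the effect for yourself without touching a real project, something like the following sketch should do. The repository layout, the `make_commit` helper, and the commit names are invented for the example; only `--first-parent` itself comes from the text above.

```shell
set -e

# Scratch repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit initial
git checkout -q -b upstream               # stands in for the remote's branch
make_commit upstream-1
make_commit upstream-2
make_commit upstream-3
git checkout -q master
make_commit my-local-change
git merge -q -m "Merge upstream" upstream # a normal, non-squashed merge

# Full history versus first-parent history.
full=$(git rev-list --count HEAD)
first_parent=$(git rev-list --count --first-parent HEAD)
echo "all commits: $full, first-parent only: $first_parent"

git log --oneline --first-parent
```

The upstream commits are still in the repository; `--first-parent` only changes which ones the log walks through.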

Replace History

This is what you asked for.

Taking inspiration from https://git-scm.com/book/en/v2/Git-Tools-Replace

What we'll do here is squash the remote's history, replace their history with our squashed version from our perspective, and merge the squashed version.

In my example repository, the upstream revisions I hadn't merged yet ran from 682b412 "sample more 1" to origin/master (f578cbb "sample more 3"). The range is short in this example, but pretend there are 50 or so commits in between.

The first thing I want is a local branch of the remote side:

git checkout -b squashing origin/master

Next, I want to quickly squash it:

git reset --soft 682b412~
git commit -m "Squashed upstream"

Note the tilde ~ character. That causes our branch to be at the parent of the first commit in the range we want to squash, and because we specified --soft, our index is still at the last commit in the range we want to squash. The commit line results in a single commit that consists of what was our first through last, inclusive.

At this point, the origin/master and squashing branches have identical tree contents but different histories.

Now, we tell git that when it sees references to the original commit of origin/master, to use our squashed commit instead. Using git log I can see the new "Squashed upstream" commit is 1f0bc14, so we do:

git replace f578cbb 1f0bc14

From here on, your git will use the "squashed upstream" commit.

Back on our original branch (if it was "master"):

git checkout master
git merge f578cbb

This appears to merge origin/master (f578cbb) but actually gets 1f0bc14's contents, while the merge is logged as having a parent SHA1 of f578cbb.

We no longer need the squashing branch, so you can get rid of it.

Now, let's say upstream added more features. In this simple example, on upstream's repo, a log might show this:

84f5044 new feature
f578cbb sample more 3
7bc88cf sample more 2
682b412 sample more 1
29ed293 More stuff
018cb03 Another commit
a5166b1 Initial

After we fetch upstream though, if we look at its log from our repo, we see this instead:

84f5044 new feature
f578cbb Squashed upstream
29ed293 More stuff
018cb03 Another commit
a5166b1 Initial

Note how it appears to have squashed history to us as well, and more importantly, the squashed upstream SHA1 is showing the one used in upstream's history (for them it is really the "sample more 3" commit).

So, merging continues to work like normal:

git merge origin/master

But we don't have such a cluttered log:

4a9b5b7 Merge remote-tracking branch 'origin/master' for new feature
46843b5 Merge remote-tracking branch 'origin/master'
84f5044 new feature
f578cbb Squashed upstream
fc6e1b3 Merge remote-tracking branch 'origin/master'
29ed293 More stuff
9577f30 my local change
018cb03 Another commit
a5166b1 Initial

If the "new feature" commit in upstream was similarly a large number of commits, we could repeat this process to squash that down as well.
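The replace-based steps can be rehearsed in a scratch repository along these lines. This is a sketch under assumptions: a local `upstream` branch plays the role of origin/master, and the `make_commit` helper and commit names are invented. Also keep in mind that replacement refs live under refs/replace/ in your own repository, so as far as I know other clones will not see the squashed view unless those refs are shared explicitly.

```shell
set -e

# Scratch repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit initial
git checkout -q -b upstream               # plays the role of origin/master
make_commit sample-more-1
make_commit sample-more-2
make_commit sample-more-3
git checkout -q master
make_commit my-local-change

first=$(git rev-parse upstream~2)         # first commit of the range to squash
tip=$(git rev-parse upstream)             # last commit of the range

# Squash the range on a temporary branch.
git checkout -q -b squashing upstream
git reset -q --soft "$first~"             # HEAD moves back, index keeps tip's tree
git commit -q -m "Squashed upstream"
squashed=$(git rev-parse HEAD)

# Tell git to read the squashed commit whenever it looks up the original tip.
git replace "$tip" "$squashed"

git checkout -q master
git merge -q -m "Merge upstream" "$tip"
git branch -q -D squashing                # no longer needed

# History of upstream as git now shows it, and as it really is.
with_replace=$(git rev-list --count upstream)
without_replace=$(git --no-replace-objects rev-list --count upstream)
echo "with replace: $with_replace, without: $without_replace"
```

Running git with `--no-replace-objects` is a handy way to check what the history looks like without the replacement applied.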

1

I had a similar issue. It happens when you resolve a merge conflict without having the latest commit history. The issue can be recreated when you have an old PR and many changes have been merged to main since the old PR was created. Here is how I resolved it:

  • Get the latest commit history using git fetch. Do that on both the main and the feature branch.
  • Then resolve the merge conflict.
0

There is no way to do it, as you won't be able to push back or merge again with that remote repository or any other repository of the same project. When squashing, you are changing history, which results in different SHA1 hashes between your repository and the remote one.

You'll have to live with the large history.

4
  • There's no way to force it? I will never need to push back and any merging in the future I can do the same as I had done before.
    – E-rich
    Commented Dec 26, 2012 at 17:50
  • How did you merge it this time? What was the origin of that repository?
    – Femaref
    Commented Dec 26, 2012 at 18:35
  • The origin was from scratch. I was given a tar file with the sources probably from a tag to start with, then made my changes/additions, then wanted to merge updates/fixes from the official upstream repo. The merge required a decent amount of manual merges, which is fine with me since I won't be doing that often.
    – E-rich
    Commented Dec 26, 2012 at 18:57
  • Could you update the question with the commands you used to merge from upstream?
    – Femaref
    Commented Dec 26, 2012 at 19:34
