The perks of merge and rebase
git merge and git rebase are excellent tools for integrating changes from one branch into another, but there is often confusion about which one to use when.
NOTE: This is easier to understand if you follow the commit histories in the diagrams carefully. It takes a little more time, but I am sure it will clear up any confusion about rebase and merge.
First, let’s understand how merge works. We will follow an integrated approach: this post is fairly comprehensive, and along with git rebase and git merge it also covers a few other git commands that help us in our daily work.
To begin, let’s create our own git repository to work with. Create a directory and initialize a git repository in it with the git init command, as shown in the screenshot below.
The git init command does two main things.
- First, it creates a .git directory in which it maintains all the metadata of snapshots, or commits.
- Second, it sets up a default branch named master, which points to nothing yet because there are no commits. That’s why the git branch command won’t show anything at this point. We have to use the git status command to check which branch we are on.
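As a rough sketch of the two points above, the following commands can be run in a scratch directory. The temporary directory from mktemp is only an assumption made for illustration, and the git symbolic-ref line is there only because a Git installation may be configured with a default branch name other than master.

```shell
set -e
repo="$(mktemp -d)"    # scratch directory, an assumption for this sketch
cd "$repo"
git init -q
# Point the still-unborn branch at master in case this Git installation
# is configured with another default branch name.
git symbolic-ref HEAD refs/heads/master

ls -d .git     # the metadata directory that git init created
git branch     # prints nothing, because no commit exists yet
git status     # names the current branch and reports that there are no commits
```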
Let’s add three new commits, as depicted below.
Use the command below to check the commit log.
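In case the screenshots are not at hand, here is a minimal sketch of these two steps. The file names, commit messages and the author identity are made up for illustration; git log --oneline --decorate is one common way to view a compact log, and may not be the exact command used in the original screenshot.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master    # keep the branch name used in this post
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Three small commits on master; file names and messages are made up.
for n in 1 2 3; do
    echo "change $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "master commit $n"
done

# One line per commit, with branch names and HEAD shown next to the commits.
git log --oneline --decorate
```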
We now have three commits in master. Now let’s say we have identified a bug in master and want to fix it. We generally create a new branch off master, make the changes, commit them, and merge them back. So create a new branch named bugfix from master, as shown below.
You can now see that both master and bugfix point to the same commit, and of course HEAD points there too, since we are now in bugfix (HEAD points to the latest commit of the branch that we checked out).
HEAD always points to the branch that we checked out or in other words the branch that we are in.
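A small sketch of the branch creation, assuming a throwaway repository with three made-up commits. git checkout -b is one way to create and switch to a branch in a single step; the original screenshot may have used a different command.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"
for n in 1 2 3; do
    echo "change $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "master commit $n"
done

git checkout -b bugfix     # create bugfix at the current commit and switch to it

# The newest commit is now labelled with HEAD, bugfix and master together.
git log --oneline --decorate -1
# Shows which branch HEAD currently refers to.
git symbolic-ref --short HEAD
```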
Now that we are in the bugfix branch, let’s add two more commits to it.
From the commit log we can now see that the bugfix branch is two commits ahead of master.
Note that our master branch still points to the commit we were on when we created the bugfix branch.
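To check the "two commits ahead" claim without reading the whole log, the range syntax master..bugfix can help. The sketch below assumes a scratch repository, and commit_file is a hypothetical helper defined only for this example.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: create a file named after the argument and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file m1; commit_file m2; commit_file m3   # three commits on master
git checkout -q -b bugfix
commit_file b1; commit_file b2                   # two commits on bugfix

git log --oneline master..bugfix      # commits reachable from bugfix but not from master
git rev-list --count master..bugfix   # how many commits bugfix is ahead of master
git log --oneline -1 master           # master still sits on the third commit
```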
Let’s assume that at this point one of our developers checked out a new branch named feature from master, made one commit, and merged feature into master.
People often confuse which branch to name in the git merge command. I have interviewed many candidates and asked them to write the command for merging one branch into another, and I often come across candidates who run git merge master while on the feature branch when they are supposed to merge feature into master. A quick rule of thumb: we merge the branch that is a few commits ahead into the one that is behind, and we do it from the branch that receives the changes. In our example we need to merge the feature branch into master, and feature is of course ahead of master. So we first check out master and then run git merge feature (merging feature into master). You can see the commands that we used in the figure below.
And here is how the commit history looks now.
One thing to observe in the console message is that git performed a Fast-Forward merge. What does that mean? Fast-Forward is the simplest way git can complete a merge. When it does a Fast-Forward merge, it simply moves the pointer of the branch we are in forward to the commit of the branch we are merging. In our case the master branch pointer is just moved to the commit that feature points to. This is possible because master had no commits of its own since feature branched off. The advantage of a Fast-Forward merge is that the commit history stays linear.
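The Fast-Forward case can be reproduced with a sketch like the one below. It leaves the bugfix branch out to stay short, and commit_file is again a hypothetical helper with made-up file names.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: create a file named after the argument and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file m1; commit_file m2; commit_file m3
git checkout -q -b feature     # branch off master
commit_file f1                 # feature is now one commit ahead

git checkout -q master         # be on the branch that receives the changes
git merge feature              # the console output mentions Fast-forward

git log --oneline --graph      # still a straight line, with no merge commit
```

After the merge, master and feature point to the same commit, which is all that a Fast-Forward does.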
The thing to note here is that master is no longer the same as it was when bugfix was created. It moved ahead because we merged our feature branch into it. You can see this in the diagram. Now we need to integrate the changes from the bugfix branch into master. There are two ways to do that: git rebase and git merge.
Which one is better? Well, let’s see.
Let’s first see what happens when we use git merge. For that, we first have to be in the master branch (git checkout master) and then run git merge bugfix. See what happened in the diagram below.
Notice that it asks for a commit message this time. Earlier, when we merged feature into master, it did not. Just save the message and exit the commit message editor.
This time, as you can see from the above screenshot, git performed the merge using the recursive strategy (Git 2.34 and later report the ort strategy instead). What it actually does is perform something known as a three-way merge.
What exactly does git merge do in a three-way merge?
- First, it gets the snapshot (the object that git maintains for every commit) of the most recent common ancestor of the two commits pointed to by bugfix and master. In our case it is 3514df4.
- It then combines the snapshots at the tips of master and bugfix, using the common ancestor snapshot as the base for comparison.
- Finally, it makes a new snapshot (a new commit) from the result. That is why it asks for a commit message: it is making a new commit. With Fast-Forward it didn’t ask for a commit message because it didn’t have to make any commit; it only needed to move the branch pointer. Hopefully the logic behind git asking for a commit message is clear now.
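The steps above can be observed with a sketch like this one. It rebuilds the same scenario in a scratch repository, so the commit ids will differ from the ones in this post. commit_file is a hypothetical helper, and --no-edit is used only so that the sketch runs without opening the commit message editor.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: create a file named after the argument and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file m1; commit_file m2; commit_file m3
git checkout -q -b bugfix;  commit_file b1; commit_file b2
git checkout -q master
git checkout -q -b feature; commit_file f1
git checkout -q master
git merge -q feature                       # Fast-Forward: master moves to f1

# Step 1: the most recent common ancestor of master and bugfix (commit m3).
base="$(git merge-base master bugfix)"
git log --oneline -1 "$base"

# Steps 2 and 3: combine both tips and record the result as a new commit.
git merge --no-edit bugfix

git log --oneline --graph                  # the history now forks and joins
git rev-list --parents -n 1 HEAD           # merge commit id followed by its two parent ids
```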
You can see this clearly in the above screenshot. The diagram below helps us understand it even better.
git merge is the easiest way to integrate branches. But just look at the commit history of master. It’s messy, right?
Is there any way to make it look a little cleaner? Well, we could have taken the changes that went into bugfix as patches and applied them directly on top of master. This is exactly what git rebase does.
So let’s do git rebase now. For this we have to get our git repository back to the state it was in before the three-way merge. That is a little tricky, so I would rather create a new repository with the exact same scenario. Below is the status of our new repository.
The above is the same scenario that we were in earlier. So, let’s do git rebase now. First we have to check out bugfix and then rebase it onto master, as shown in the screenshot below.
You can see from the above screenshot that the bugfix branch now looks as if it was branched off from the latest master. To arrive at this, git rebase goes through a series of steps.
- It finds the common ancestor of the two branches: the one we are in (bugfix) and the one we are rebasing onto (master). In our case the common ancestor is b42e60a.
- It gets the diff introduced by each commit of the branch we are on (bugfix) and saves these diffs to temporary files.
- It resets the current branch to the same commit as the branch we are rebasing onto. So in our case it resets the bugfix branch pointer to the commit that master points to.
- Finally, it applies each of the saved diffs in turn on the branch that was just reset. In our case that is bugfix.
So all in all, it made fresh commits on the bugfix branch after it moved (reset) the bugfix branch pointer to master.
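The effect of these steps can be checked with a sketch like the following. It only observes the result from the outside (new commit ids, new base) and does not show git’s internals. The scratch repository, the commit_file helper and the variable names are assumptions for illustration.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: create a file named after the argument and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file m1; commit_file m2; commit_file m3
git checkout -q -b bugfix;  commit_file b1; commit_file b2
git checkout -q master
git checkout -q -b feature; commit_file f1
git checkout -q master
git merge -q feature                       # master moves ahead of bugfix's base

old_tip="$(git rev-parse bugfix)"          # remember the commit id before the rebase
git log --oneline -1 "$(git merge-base master bugfix)"   # common ancestor: m3

git checkout -q bugfix
git rebase master                          # replay b1 and b2 on top of master

new_tip="$(git rev-parse bugfix)"
echo "bugfix tip before: $old_tip"
echo "bugfix tip after:  $new_tip"         # a different id: the commits are fresh
git log --oneline --graph --all            # bugfix now sits in a line on top of master
```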
As you can see, it now looks as if we branched off from the latest master. To complete the integration, we still need to merge bugfix into master.
Okay, now what happens if I merge bugfix into master using the git checkout master and git merge bugfix commands? Let’s see.
Check the above screenshot carefully. And yes, it does a Fast-Forward merge. This is the same scenario as merging the feature branch into master. It simply needed to move the master branch pointer to the commit that bugfix points to, as shown in the diagram below.
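Putting the rebase and the final merge together, a sketch of the whole flow could look like this. As before, the scratch repository and the commit_file helper are assumptions made for the example.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"            # scratch directory for the sketch
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"

# Hypothetical helper: create a file named after the argument and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file m1; commit_file m2; commit_file m3
git checkout -q -b bugfix;  commit_file b1; commit_file b2
git checkout -q master
git checkout -q -b feature; commit_file f1
git checkout -q master
git merge -q feature

git checkout -q bugfix
git rebase -q master           # bugfix is replayed on top of the latest master

git checkout -q master
git merge bugfix               # the console output mentions Fast-forward again

git log --oneline --graph      # six commits in a straight line, no merge commit
```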
Now, what did you observe between git rebase and git merge?
The history is linear with git rebase, while git merge gave us a graph-like history that looked messy. So when do we prefer rebase over merge? For situations like the one above, git rebase is of course the better choice.
But git rebase can become very complex if we don’t follow the guidelines, because when we rebase we abandon the existing commits and create new commits with the same changes. Suppose we push commits somewhere, others pull them down and base work on them, and then we rewrite those commits with git rebase and push them up again. Our collaborators will have to merge their work again, and things will get messy when we pull their work back into ours.
Now, which one to use when? It’s not really simple to answer, because every team and project is different in its own way. But we can get the best of both worlds. The best practices with rebase are:
- Rebase local changes before pushing.
- Never rebase anything that you have pushed somewhere.
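One possible way to check whether a branch has already been pushed, before deciding to rebase it, is to ask which remote-tracking branches contain its latest commit. The sketch below is only an illustration: the bare repository stands in for a shared remote, and all paths, names and variables are assumptions. Keep in mind that remote-tracking branches only reflect the last push or fetch.

```shell
set -e
work="$(mktemp -d)"                        # scratch area for the sketch
git init -q --bare "$work/origin.git"      # stand-in for a shared remote
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"      # placeholder identity
git config user.email "author@example.com"
git remote add origin "$work/origin.git"

echo base > base.txt; git add base.txt; git commit -q -m "base"
git push -q origin master

git checkout -q -b bugfix
echo fix > fix.txt; git add fix.txt; git commit -q -m "fix"

# No remote-tracking branch contains the bugfix tip yet: still local, safe to rebase.
before_push="$(git branch -r --contains bugfix)"
echo "remote branches containing bugfix before push: '$before_push'"

git push -q origin bugfix

# Now origin/bugfix contains the commit: it is public, so leave it alone.
after_push="$(git branch -r --contains bugfix)"
echo "remote branches containing bugfix after push: '$after_push'"
```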
Thanks for reading. Hope you find it helpful.