git: merge only the completed work
Sat, Aug 8, 2020 ❝The benefits of merging only completed work, forgetting in-progress development history.❞
Something that pops up occasionally in conversations on software development is how one should reintegrate feature branches into the main line of development (typically “master” or “develop”). There are a number of options for reintegrating development work that lives on a separate branch:
- Merge the branch using a merge-commit.
- Rebase feature development work on top of the main line.
- Introduce the work, squashed into a single commit.
In this article, it is assumed that a new feature is developed on a separate branch, then reintegrated when finished. This may not apply to trivial changes, such as bug fixes. This article highlights the benefits of introducing new work as a squashed single commit. Consequently, the advantages and disadvantages listed here are not exhaustive. Instead, the article considers what the benefits are of forgetting feature development history.
The merge-commit
The most straightforward and obvious solution for reintegrating a completed feature is joining two branches with a merge-commit. A merge-commit has two parents, literally joining two paths together in the commit graph. By its nature, this preserves each branch’s unique history that emerged as part of the development work, both on the main line and on the feature branch.
This approach allows every part of the original history to be found and inspected. The downside is that it also pollutes the main line of development with commits that contain all kinds of “work-in-progress” changes, such as non-compilable code, bad solution attempts, trivial mistakes, and their (uninformative) commit messages.
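To make the "two parents" idea concrete, here is a minimal sketch of a commit graph in Python. It is a toy model and not how git stores commits; the `Commit` class, the `history` helper and all commit messages are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()  # zero, one, or (for a merge-commit) two parents


def history(tip):
    """Return every commit reachable from `tip`, each listed once."""
    seen, stack, found = set(), [tip], []
    while stack:
        commit = stack.pop()
        if id(commit) in seen:
            continue
        seen.add(id(commit))
        found.append(commit)
        stack.extend(commit.parents)
    return found


# Hypothetical history: one commit on the main line, three on the feature branch.
base = Commit("initial commit")
main = Commit("main: update docs", (base,))
wip1 = Commit("wip", (base,))
wip2 = Commit("fix typo", (wip1,))
feature = Commit("feature finally works", (wip2,))

# The merge-commit joins both paths: main line first, feature branch second.
merged = Commit("Merge branch 'feature'", (main, feature))

messages = [commit.message for commit in history(merged)]
```

In this model, `messages` contains every work-in-progress message from the feature branch, which is exactly the history that a merge-commit keeps reachable from the main line.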
Rebasing
Another approach to reintegrating feature work into the main branch is to retroactively pretend that all work on the feature was performed on top of the current main line of development. This allows the individual changes to be committed directly on top of the main branch, negating the need for “reintegration” of a deviated branch. Rebasing is the procedure of re-applying each of the commits of the feature branch on top of the (current) main line. Therefore, there is no joining of branches, only a direct progression of the commit graph.
The difficulty with rebasing is threefold. First, each commit may (again) produce conflicts that need to be resolved, given that there are recent changes on the main line. Second, the individual development commits still exist, so the commit history is still polluted with “in-progress” work. Third, rebasing becomes more difficult the longer the branch has been separated from the main line.
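The replaying of commits can be sketched with a small Python model. This is an illustration under simplifying assumptions: commits carry only a message and a single parent, conflicts are not modeled, and the `Commit`, `log` and `rebase` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional["Commit"] = None


def log(tip):
    """Messages from `tip` back to the root, newest first."""
    messages = []
    while tip is not None:
        messages.append(tip.message)
        tip = tip.parent
    return messages


def rebase(feature_tip, fork_point, onto):
    """Replay the commits after `fork_point` up to `feature_tip` on top of `onto`.

    Assumes `fork_point` is an ancestor of `feature_tip`.
    """
    to_replay = []
    commit = feature_tip
    while commit is not fork_point:
        to_replay.append(commit)
        commit = commit.parent
    new_tip = onto
    for commit in reversed(to_replay):  # oldest first
        # Every replayed commit is a new commit with a new parent.
        # In real git, each of these steps may stop on a conflict.
        new_tip = Commit(commit.message, new_tip)
    return new_tip


base = Commit("initial commit")
main = Commit("main: update docs", base)
feature = Commit("feature finally works", Commit("fix typo", Commit("wip", base)))

rebased = rebase(feature, base, main)
linear_history = log(rebased)
```

The resulting `linear_history` has no merge-commit, yet it still lists "wip" and "fix typo" on top of the main line: the graph is linear, but the in-progress commits are all still there.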
Squashed-merges
The next approach, and the main topic of this article, is merging completed work as a single commit. The previous approaches both have some problems. The most prevalent are the consequences of preserving in-progress development history after development has completed.
Issues with the previous approaches to reintegration:
- you cannot write informative, valuable, quality, definitive commit messages while work is still in progress.
- you cannot extract independent changes before the work is completed, as you cannot be sure what turns out to be a necessary but independent modification.
- there is no value in work-in-progress commits. Even more, they may be counterproductive:
  - when bisecting issues, ending up in one of these in-progress commits interferes with bisection, as the quality of such a commit is not guaranteed.
  - when reading through the commit history, one will encounter work-in-progress commit messages at a ratio of n:1 to merge commits. These messages do not contain functional value. If you are lucky and an actual message exists, you will get to know the developer’s state of progress and state of mind at the time of development.
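The bisection problem mentioned in the list above can be illustrated with a simplified binary search in Python. This is a sketch and not git's actual bisect algorithm: the `bisect` function, the commit names and the test outcomes are all assumptions, and a commit that cannot be tested (for example because it does not compile) is modeled as a test result of `None`, similar in spirit to skipping a commit with `git bisect skip`.

```python
def bisect(commits, test):
    """Search for the first bad commit in `commits` (oldest first).

    Assumes commits[0] is known good and commits[-1] is known bad.
    `test(commit)` returns True (good), False (bad) or None (untestable).
    Returns the list of commits that may have introduced the problem.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try the commit nearest to the middle first, moving outwards
        # whenever a commit cannot be tested.
        for i in sorted(range(lo + 1, hi), key=lambda i: abs(i - mid)):
            result = test(commits[i])
            if result is not None:
                break
        else:
            # Nothing in between can be tested: the search cannot narrow further.
            return commits[lo + 1 : hi + 1]
        if result:
            lo = i
        else:
            hi = i
    return [commits[hi]]


# Hypothetical history with in-progress commits that do not build.
detailed = ["release 1.0", "wip", "wip 2", "does not compile", "feature done", "release 1.1"]
detailed_results = {"release 1.0": True, "wip": None, "wip 2": None,
                    "does not compile": None, "feature done": False, "release 1.1": False}

# The same work squashed into one commit that builds and can be tested.
squashed = ["release 1.0", "Add feature", "release 1.1"]
squashed_results = {"release 1.0": True, "Add feature": False, "release 1.1": False}

suspects_detailed = bisect(detailed, detailed_results.get)
suspects_squashed = bisect(squashed, squashed_results.get)
```

With the untestable in-progress commits, the search in this model can only narrow the problem down to a range of four commits; with the squashed history it points at a single commit.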
The key difference between the previous approaches and squash-merging a feature branch is that development history gets lost in the process. An arbitrary number of previous commits is squashed into a single end result; intermediate changes that were obsoleted are forgotten in the process. There are benefits, both to the feature developer and to the main line, when this information is forgotten.
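How squashing forgets obsoleted intermediate changes can be sketched as follows. This is a toy model in Python, assuming each commit is reduced to a mapping of file path to new content; real git computes the squashed result with a merge, and the `Commit` class, `squash` function and file names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    message: str
    changes: dict  # path -> new content (toy stand-in for a diff)
    parent: Optional["Commit"] = None


def squash(feature_commits, onto, message):
    """Fold the feature commits (oldest first) into one commit on top of `onto`."""
    combined = {}
    for commit in feature_commits:
        # Later changes to the same path overwrite earlier attempts.
        combined.update(commit.changes)
    return Commit(message, combined, onto)


main = Commit("initial commit", {"app.py": "v1"})

# In-progress commits on the feature branch (parents omitted for brevity).
work_in_progress = [
    Commit("wip", {"parser.py": "first attempt"}),
    Commit("try something else", {"parser.py": "second attempt"}),
    Commit("works now", {"parser.py": "final version", "test_parser.py": "tests"}),
]

squashed = squash(
    work_in_progress,
    main,
    "Add parser\n\nWritten once the work is complete, describing the final solution.",
)
```

The single resulting commit has the main line as its only parent and contains only the final state of each file; the first and second attempts at `parser.py` and their messages are gone.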
- During development:
  - the developer is not burdened with the responsibility to write “good commit messages”.
  - the developer is given the freedom to experiment, knowing that the completed work need not contain their prior trial-and-error attempts.
- During completion:
  - reflect on the completeness of your solution/implementation.
  - review the final set of modifications: consider if and how you might extract independent functional subsets.
  - write a complete, informative commit message, knowing and understanding the completed working solution. At this time it makes sense to write a commit message that details the work.
- Eradicate irrelevant work-in-progress history:
  - commit messages during development may contain incorrect/obsolete information, if any at all.
  - work-in-progress commits are of no value in the main line commit history: the changes apply to an in-progress code-base. The code may not compile or may contain errors.
  - eradicate (temporary or redundant) modifications that are irrelevant to the final solution, therefore simplifying the merging of the completed work.
  - a single commit simplifies cherry-picking in case changes need to be replicated to other branches.
- Intermediate merges of main line into the feature branch will not interfere or show up in the final squashed-merge: there are no negative consequences to keeping up to date with main line development.
- The main line commit history has a coarser granularity and only contains relevant information: the commit history becomes suitable as input for a changelog or release notes.
- Commits are not relevant as a metric of quantity and/or quality of work, or amount of effort: commits are “free” so the number of commits has no qualitative value. Furthermore, the number of lines that are touched does not represent complexity or value: a one-line bug fix could have been hell to investigate and affect your full user base, but the one line changed could never reflect that. A one-line change could also correct a grammar mistake in a comment.
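The point above about the main line history serving as input for a changelog can be sketched in a few lines of Python. This is an illustration only: the `changelog` helper and both example histories are made up, and the sketch assumes the common convention that the first line of a commit message is its subject.

```python
def changelog(commit_messages):
    """Turn commit messages (newest first) into changelog lines, one per subject line."""
    lines = []
    for message in commit_messages:
        stripped = message.strip()
        if stripped:
            lines.append("- " + stripped.splitlines()[0])
    return lines


# Hypothetical main line history when features are squash-merged.
squashed_history = [
    "Add CSV export\n\nReports can now be exported as CSV files.",
    "Add user search\n\nUsers can be searched by name or e-mail address.",
]

# Hypothetical history when in-progress commits are preserved.
detailed_history = ["Add CSV export", "oops", "fix again", "fix", "wip", "Add user search"]

release_notes = changelog(squashed_history)
noisy_notes = changelog(detailed_history)
```

Here `release_notes` reads as a usable changelog with one entry per feature, while `noisy_notes` would first need manual filtering of entries such as "wip" and "oops".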
There are plenty of advantages to squashing development history when reintegrating a feature into the main line of development.