TSM - Common branching models and branching by abstraction

Anghel Conțiu - Design Lead

Version control has become a standard part of how we work, and modern version control tools let us, the developers, use branches in whatever way suits us best. Plenty has been written about branching strategies, and while some of them are very popular, almost all of them share a problem that has to be handled at some point: merging. There is a reason people talk not only about their “branching strategy” but about their “branching and merging strategy”.

Merging can easily turn into a nightmare when the branching strategy is wrong. Take, for example, one of the most popular branching strategies, Vincent Driessen's branching model, shown in Figure 1. This model is so popular that Atlassian implemented a Maven plug-in on top of it.

Let us focus only on the two branches that are used most heavily: the feature branch and the develop branch.

The feature branch branches off from the develop branch and it is used by a developer to implement a feature.

When the feature is finished, the developer will merge his feature branch into the develop branch, so the other developers will have the commits available.

Consider the following situation:

This seems like a healthy workflow, but it hides a couple of important risks that lead to additional development and testing effort. Here are some of them:

The feature retesting problem

Dan spends a day testing Mary's shiny new feature and gives it a thumbs up: all good. What he doesn't know is that Andrew is also working on two of the files that Mary altered in order to implement her feature. Andrew merges his branch three days later and is happy that the merge produced no conflicts. Dan starts testing Andrew's feature, but he notices that Mary's feature doesn't work anymore, so he retests it and creates the corresponding tickets for Mary. Mary fixes the issues, taking Andrew's changes into account, and Dan retests both Mary's and Andrew's features.

So, Mary wasted her time doing fixes after Andrew's merge, while Dan wasted his time testing Mary's feature at least two times and Andrew's feature at least two times.

The merging problem

Features commonly take a few days to implement and unit test. Even if developers merge the develop branch into their feature branches every day, to stay up to date with their colleagues' changes and avoid major conflicts when they finally merge back into develop, there are usually at least two features being developed simultaneously on different branches, so the risk of conflicts at merge time is high. The longer the developers work in isolation on their feature branches, the higher that risk gets.

The continuous integration problem

Developers should write unit tests, component tests and integration tests. When these tests are developed in isolation on one branch while another feature and its tests are implemented on another branch, there is a high risk that tests fail at merge time and have to be reviewed. The good thing about this situation is that the developers know their tests fail; the bad thing is that they have to invest additional time fixing tests that passed before the merge.

The running of database scripts problem

Many software projects try to automate their deployment process and bring their database up to date through scripts. There is a risk of running these scripts in the wrong order, for the same reason: developers work on different branches, in isolation.
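The sketch below illustrates one way the ordering problem can show up. It assumes a hypothetical, deliberately naive script runner that only runs scripts numbered after the last one already applied; the script names and both functions are invented for illustration and do not describe any particular migration tool.

```python
def pending_scripts_naive(applied, available):
    """Hypothetical naive runner: pick only scripts that sort after the last applied one."""
    last = max(applied) if applied else ""
    return sorted(s for s in available if s > last)


def skipped_scripts(applied, available):
    """Scripts that were never applied but sort before the last applied one."""
    last = max(applied) if applied else ""
    return sorted(s for s in available if s not in applied and s < last)


# The database was already upgraded with scripts 001 and 003 from develop.
applied = ["001_create_users.sql", "003_add_orders.sql"]

# A feature branch created 002 in isolation and is merged only now,
# together with a newer script 004.
available = applied + ["002_add_email.sql", "004_add_index.sql"]

print(pending_scripts_naive(applied, available))  # ['004_add_index.sql']
print(skipped_scripts(applied, available))        # ['002_add_email.sql']
```

Under these assumptions the script written on the feature branch is silently skipped, even though the merge itself reported no conflict.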

If we look closely at the root of all these issues, we find that it is developers doing their work in isolation. An alternative that addresses this problem is the branching by abstraction technique.

Branching by abstraction

It is defined by Martin Fowler as “a technique for making a large-scale change to a software system in gradual way that allows you to release the system regularly while the change is still in-progress”.

The technique implies providing an abstraction layer above the feature being changed, so that the client code communicates only with the abstraction layer, without being aware of whether it is using the new or the old version of the feature.
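As a minimal sketch of this idea, the example below places an abstract class between the client code and two versions of a feature. All names (`PriceCalculator`, `make_calculator`, `checkout`) and the `use_new` switch are hypothetical and chosen only for illustration; they are not taken from any real project.

```python
from abc import ABC, abstractmethod


class PriceCalculator(ABC):
    """The abstraction layer: client code depends only on this type."""

    @abstractmethod
    def total(self, amounts):
        """Return the total for a list of amounts."""


class OldPriceCalculator(PriceCalculator):
    """The existing version of the feature, still in use."""

    def total(self, amounts):
        return sum(amounts)


class NewPriceCalculator(PriceCalculator):
    """The version under development, committed to the same branch."""

    def total(self, amounts):
        # Hypothetical new rule: round the total to two decimals.
        return round(sum(amounts), 2)


def make_calculator(use_new):
    """The single place that decides which version the clients receive."""
    return NewPriceCalculator() if use_new else OldPriceCalculator()


def checkout(calculator, amounts):
    """Client code: it only knows the abstraction, not the concrete version."""
    return f"Total: {calculator.total(amounts):.2f}"


print(checkout(make_calculator(use_new=False), [10.0, 5.5]))  # Total: 15.50
print(checkout(make_calculator(use_new=True), [10.0, 5.5]))   # Total: 15.50
```

Because `checkout` never names a concrete class, the unfinished version can live on the shared branch while the released system keeps using the old one.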

No matter the way branching by abstraction is used, there is a common practice

While the developers write the tests for the new version of the feature, they can be confident that their tests work with the latest version of the code, because everyone is pushing to the same branch and there are no merges that could break the tests (simply because other branches don't exist).

It is important to identify the right place for the abstraction layer and to decide how objects will be instantiated. There is room for creativity, as these decisions also depend on the context of the problem. There are multiple ways of doing it; here are two of them.

Branching by component abstraction

This applies to situations where a large component must be replaced or re-written.

An abstraction layer must be created so our code will not depend on the component anymore but on the abstraction. This might also include refactoring the component so additional unit tests can be provided at this time. The refactoring might be too costly in terms of time and resources, so this is a point where a decision has to be made in terms of which branching by abstraction strategy should be used.

The new implementation of the component will be done in a step by step manner, meaning that the features will be implemented according to client code needs. When a set of the features is ready for a client, that client can switch its wiring and migrate to use the new component. Remember that the application should build and run correctly at all times.

The implementation of the component continues with the additional features that are needed for the next client. In the end, there will be no dependency on the old component, so, it will be safe to delete it.

Figure 2: Branching by component abstraction. ClientCode is gradually migrated to use the NewComponent.
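The gradual migration described above can be sketched as follows. This is an illustration under assumptions, not a prescribed design: the messaging component, the two clients (`SignupService`, `AlertService`) and the `wire_application` function are all hypothetical names.

```python
from abc import ABC, abstractmethod


class Messaging(ABC):
    """Abstraction the clients depend on instead of the concrete component."""

    @abstractmethod
    def send_email(self, to, body): ...

    @abstractmethod
    def send_sms(self, to, body): ...


class OldMessaging(Messaging):
    """The old component, hidden behind the abstraction."""

    def send_email(self, to, body):
        return f"old-email to {to}: {body}"

    def send_sms(self, to, body):
        return f"old-sms to {to}: {body}"


class NewMessaging(Messaging):
    """The new component, implemented one feature at a time."""

    def send_email(self, to, body):
        return f"new-email to {to}: {body}"

    def send_sms(self, to, body):
        # Not needed by any migrated client yet.
        raise NotImplementedError("send_sms is not implemented in the new component")


class SignupService:
    """Client that only needs e-mail, so it can migrate first."""

    def __init__(self, messaging):
        self._messaging = messaging

    def welcome(self, user):
        return self._messaging.send_email(user, "Welcome!")


class AlertService:
    """Client that needs SMS, so it stays on the old component for now."""

    def __init__(self, messaging):
        self._messaging = messaging

    def alert(self, phone):
        return self._messaging.send_sms(phone, "Disk almost full")


def wire_application():
    """Wiring: each client is switched to the new component when it is ready."""
    signup = SignupService(NewMessaging())  # already migrated
    alerts = AlertService(OldMessaging())   # migrates once send_sms exists
    return signup, alerts


signup, alerts = wire_application()
print(signup.welcome("mary"))  # new-email to mary: Welcome!
print(alerts.alert("0700"))    # old-sms to 0700: Disk almost full
```

Once `send_sms` is implemented and `AlertService` is rewired, nothing references `OldMessaging` anymore and it can be deleted. The application builds and runs at every intermediate step.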

Branching by point of use abstraction

Consider the situation where branching by component abstraction is not the most suitable solution. One reason might be the high effort required to refactor the old component so that it works with an abstraction layer. In this kind of situation we can leave the old code as it is and adopt a different tactic.

1) The old and new versions of client class of our implementation (the point of use of our implementation) and the abstraction (the interface)

2) The Factory

3) The scope of the client class instance

4) Having the implementation deployed on different environments

Figure 3: Branching by point of use abstraction. The old and the new
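A rough sketch of how the pieces listed above might fit together is shown below. It is only one possible arrangement: the names (`ExportClient`, `ExportClientFactory`, `legacy_export`), the `APP_ENVIRONMENT` variable and the rule that only the `dev` and `test` environments receive the new version are assumptions made for illustration.

```python
import os
from abc import ABC, abstractmethod


def legacy_export(rows):
    """Stands in for the old code, which is left exactly as it is."""
    return ";".join(rows)


class ExportClient(ABC):
    """The abstraction, introduced at the point of use only."""

    @abstractmethod
    def export(self, rows): ...


class OldExportClient(ExportClient):
    """Old version of the client class: keeps calling the untouched old code."""

    def export(self, rows):
        return legacy_export(rows)


class NewExportClient(ExportClient):
    """New version of the client class: uses the new implementation."""

    def export(self, rows):
        return ",".join(rows)


class ExportClientFactory:
    """Decides which version of the client class is instantiated."""

    ENV_VAR = "APP_ENVIRONMENT"                 # hypothetical variable name
    NEW_VERSION_ENVIRONMENTS = {"dev", "test"}  # hypothetical rollout rule

    @classmethod
    def create(cls, environment=None):
        # Assumption: when no environment is given, fall back to the
        # environment variable and treat a missing value as production.
        env = environment or os.environ.get(cls.ENV_VAR, "production")
        if env in cls.NEW_VERSION_ENVIRONMENTS:
            return NewExportClient()
        return OldExportClient()


print(ExportClientFactory.create("test").export(["a", "b"]))        # a,b
print(ExportClientFactory.create("production").export(["a", "b"]))  # a;b
```

With this arrangement the same build can be deployed everywhere, and the environment alone determines whether the old or the new client class is used.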

In conclusion, here is what the developers and testers should focus on while using branching by abstraction.

The developers

The testers

As soon as the client class is deployed to production and its functionality is proven to be right

Don't forget, the system must always build and run correctly.