TSM - Common branching models and branching by abstraction

Anghel Conțiu - Design Lead

Version control has become the normal way of working, and the latest version control tools let us, the developers, use branches in whatever way suits us best. Plenty has been written about branching strategies, and while some of them are very popular, almost all of them share a problem that must be handled at some point: merging. There is a reason why people talk not only about their “branching strategy” but about their “branching and merging strategy”.

Merging can easily turn into a nightmare when the branching strategy is wrong. Take, for example, one of the most popular branching strategies, Vincent Driessen's branching model (Figure 1). This model is so popular that Atlassian implemented a Maven plug-in on top of it.

Let us focus only on two branches that should be intensely used: the feature branch and the develop branch.

The feature branch branches off from the develop branch and it is used by a developer to implement a feature.

When the feature is finished, the developer will merge his feature branch into the develop branch, so the other developers will have the commits available.

Consider the following situation:

George is working on george-feature-branch, which he branched off from the develop branch.
Andrew has been working on his own andrew-feature-branch since yesterday.
Mary finishes her feature after 6 days of work and merges her mary-feature-branch into the develop branch.
Dan, our QA engineer, starts testing Mary's changes.

This seems like a healthy workflow, but it hides several important risks that lead to additional development and testing effort. Here are some of them:

The feature retesting problem

Dan spends a day testing Mary's shiny new feature and gives it a thumbs up: all good. What he doesn't know is that Andrew is also working with two of the files that Mary altered in order to implement her feature. Andrew merges his branch 3 days later and is happy that the merge produced no conflicts. Dan starts testing Andrew's feature but notices that Mary's feature doesn't work anymore, so he retests it and creates the corresponding tickets for Mary. Mary fixes the issues, taking Andrew's changes into account, and Dan retests both Mary's and Andrew's features once more.

So Mary wasted time on fixes after Andrew's merge, while Dan wasted time testing both Mary's feature and Andrew's feature at least twice.

The merging problem

It is common for a feature to take a couple of days to implement and unit test. Developers may merge the develop branch into their feature branches every day, trying to stay up to date with their colleagues' latest changes and to avoid major conflicts when the time comes to merge into develop. Even so, two features are usually being developed simultaneously on two different branches, so the risk of conflicts at merge time is high, and the longer the developers work in isolation on their feature branches, the higher the risk gets.

The continuous integration problem

Developers should write unit tests, component tests and integration tests. When these tests are developed in isolation on one branch, while another feature and its tests are implemented on another branch, the risk of tests failing at merge time is high, and the tests have to be reviewed. The good thing about this situation is that the developers are aware that their tests fail, but they still have to invest additional time in fixing tests that worked before the merge.

The running of database scripts problem

Many software projects try to automate their deployment process and bring their database up to date through scripts. There is a risk of running these scripts in the wrong order, for the same reason: developers work on different branches, in isolation.

If we pay close attention to the root of all these issues, we find that it is developers doing their work in isolation. One way to avoid this problem is the branching by abstraction technique.

Branching by abstraction

It is defined by Martin Fowler as “a technique for making a large-scale change to a software system in gradual way that allows you to release the system regularly while the change is still in-progress”.

The technique implies providing an abstraction layer above the feature being changed, so that client code communicates only with the abstraction layer, without being aware of whether it uses the new or the old version of the feature.

No matter how branching by abstraction is used, there is a common practice:

Use an abstraction layer to allow multiple implementations to co-exist.
Gradually migrate to the new implementation.
Ensure the system builds and runs correctly at all times, so continuous delivery stays on.
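
The practice above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: PriceCalculator, LegacyPriceCalculator, DiscountingPriceCalculator and Checkout are hypothetical names, and the 10% discount rule is invented for the example. The point is that the client code depends on the abstraction alone, so both implementations can live on the same branch.

```java
// Hypothetical abstraction layer: client code depends only on this interface.
interface PriceCalculator {
    long priceInCents(int quantity, long unitPriceInCents);
}

// Existing behaviour, kept working while the change is in progress.
final class LegacyPriceCalculator implements PriceCalculator {
    @Override
    public long priceInCents(int quantity, long unitPriceInCents) {
        return quantity * unitPriceInCents;
    }
}

// New behaviour under development (assumed rule: 10% off from 10 items upward).
final class DiscountingPriceCalculator implements PriceCalculator {
    @Override
    public long priceInCents(int quantity, long unitPriceInCents) {
        long total = quantity * unitPriceInCents;
        return quantity >= 10 ? total * 90 / 100 : total;
    }
}

// Client code: it cannot tell which implementation it is talking to.
final class Checkout {
    private final PriceCalculator calculator;

    Checkout(PriceCalculator calculator) {
        this.calculator = calculator;
    }

    String receiptLine(int quantity, long unitPriceInCents) {
        return "Total: " + calculator.priceInCents(quantity, unitPriceInCents) + " cents";
    }
}
```

Constructing Checkout with one calculator or the other is the only place where the choice is visible, which is what allows a gradual migration.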

While writing the tests for the new version of the feature, the developers can be confident that their tests work with the latest version of the code, as everyone is pushing to the same branch and no merge can break the tests (simply because other branches don't exist).

It is important to find the right place for the abstraction layer and to decide how objects will be instantiated. There is room for creativity, as these decisions also depend on the context of the problem. There are multiple ways of doing it; here are two of them.

Branching by component abstraction

This applies to situations where a large component must be replaced or re-written.

An abstraction layer must be created so our code will not depend on the component anymore but on the abstraction. This might also include refactoring the component so additional unit tests can be provided at this time. The refactoring might be too costly in terms of time and resources, so this is a point where a decision has to be made in terms of which branching by abstraction strategy should be used.

The new implementation of the component will be done in a step by step manner, meaning that the features will be implemented according to client code needs. When a set of the features is ready for a client, that client can switch its wiring and migrate to use the new component. Remember that the application should build and run correctly at all times.

The implementation of the component continues with the additional features that are needed for the next client. In the end, there will be no dependency on the old component, so, it will be safe to delete it.

Figure 2: Branching by component abstraction. ClientCode is gradually migrated to use the NewComponent.
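
A minimal Java sketch of this gradual migration follows. All names (TextFormatter, OldTextFormatter, NewTextFormatter, HeaderRenderer, BodyRenderer, Wiring) and the formatting rules are assumptions made for illustration. The new component implements only what the first migrated client needs, while the second client stays wired to the old component, so the application keeps building and running.

```java
// Hypothetical abstraction extracted from the component being replaced.
interface TextFormatter {
    String formatTitle(String text);
    String formatBody(String text);
}

// The old component, now hidden behind the abstraction.
final class OldTextFormatter implements TextFormatter {
    @Override
    public String formatTitle(String text) {
        return "[" + text.trim() + "]";
    }

    @Override
    public String formatBody(String text) {
        return text.trim();
    }
}

// The new component, implemented feature by feature.
final class NewTextFormatter implements TextFormatter {
    @Override
    public String formatTitle(String text) {
        return "== " + text.trim() + " ==";
    }

    @Override
    public String formatBody(String text) {
        // Not needed by any migrated client yet.
        throw new UnsupportedOperationException("formatBody is not migrated yet");
    }
}

// First client: only needs titles, so it can already migrate.
final class HeaderRenderer {
    private final TextFormatter formatter;

    HeaderRenderer(TextFormatter formatter) {
        this.formatter = formatter;
    }

    String render(String text) {
        return formatter.formatTitle(text);
    }
}

// Second client: needs bodies, so it stays on the old component for now.
final class BodyRenderer {
    private final TextFormatter formatter;

    BodyRenderer(TextFormatter formatter) {
        this.formatter = formatter;
    }

    String render(String text) {
        return formatter.formatBody(text);
    }
}

// The wiring is the single place that records which client has migrated.
final class Wiring {
    static HeaderRenderer headerRenderer() {
        return new HeaderRenderer(new NewTextFormatter());
    }

    static BodyRenderer bodyRenderer() {
        return new BodyRenderer(new OldTextFormatter());
    }
}
```

Once formatBody is implemented in the new component, changing one line in Wiring migrates BodyRenderer, and OldTextFormatter can then be deleted.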

Branching by point of use abstraction

Consider the situation where branching by component abstraction is not the most suitable solution. Reasons for that might include the high effort required to refactor the old component so that it works with an abstraction layer. In this kind of situation we can leave the old code as it is and adopt a different tactic.

1) The old and new versions of the client class of our implementation (the point of use) and the abstraction (the interface)

The point of abstraction is set at the client class of our implementation (the point of use).
Initially, we will have the original version of the client class and a copy of it, which will become the new version. (I know, we don't want to copy the class, but the copy is temporary, until the new version is stable. Meanwhile we can switch back to the old version if something goes wrong with the new one.)
We will have a new interface (“Interface”) that both client class versions, original and new, will implement; it will probably contain the public methods of the original class.
The new client class version will get modified to use the new feature implementation.
The original version will be removed in the end.
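
As a rough Java illustration of these steps (every name and the tax arithmetic are hypothetical), the old code stays untouched, the client class is copied, and both copies implement an interface extracted from the original's public methods:

```java
// Old code, left exactly as it is: no abstraction layer is added around it.
final class OldTaxRules {
    static long taxInCents(long netInCents) {
        return netInCents * 20 / 100; // truncates fractions of a cent
    }
}

// The new feature implementation (assumed change: round half up).
final class NewTaxRules {
    static long taxInCents(long netInCents) {
        return (netInCents * 20 + 50) / 100;
    }
}

// The interface: the public methods of the original client class.
interface InvoiceService {
    long grossInCents(long netInCents);
}

// The original client class, unchanged apart from "implements".
final class OriginalInvoiceService implements InvoiceService {
    @Override
    public long grossInCents(long netInCents) {
        return netInCents + OldTaxRules.taxInCents(netInCents);
    }
}

// The temporary copy, modified to use the new implementation.
final class NewInvoiceService implements InvoiceService {
    @Override
    public long grossInCents(long netInCents) {
        return netInCents + NewTaxRules.taxInCents(netInCents);
    }
}
```

Code that needs invoices depends on InvoiceService only, so either version can be handed to it.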

2) The Factory

It will be able to instantiate the original or the new version of the client class. The switch between instantiating one version or another is performed inside the Factory.
It can get as smart as needed (e.g. it might switch between the original and the new version not only at start time but also at runtime, or it might know more about the environment where the application is deployed and act accordingly).
We will use the interface of the old (original) implementation, so the proper instance (original or new) can be injected at the point of use.
The Factory will ensure that the toggle between the old and the new client class is performed in a consistent manner and takes into consideration the same conditions.
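
One possible shape for such a Factory is sketched below in Java. The names (ReportClient, ReportClientFactory, switchTo) are assumptions, and a real project might drive the toggle from configuration instead of a method call. The toggle lives in exactly one place, so every point of use gets a consistent answer.

```java
// Interface shared by both versions of the client class (hypothetical names).
interface ReportClient {
    String describe();
}

final class OriginalReportClient implements ReportClient {
    @Override
    public String describe() {
        return "original";
    }
}

final class NewReportClient implements ReportClient {
    @Override
    public String describe() {
        return "new";
    }
}

// The only place that knows which version is currently active.
final class ReportClientFactory {
    // volatile so that a switch made at runtime is visible to all threads
    private volatile boolean useNewVersion;

    ReportClientFactory(boolean useNewVersion) {
        this.useNewVersion = useNewVersion; // choice made at start time
    }

    // Runtime switch, e.g. to fall back to the original version.
    void switchTo(boolean useNewVersion) {
        this.useNewVersion = useNewVersion;
    }

    ReportClient create() {
        return useNewVersion ? new NewReportClient() : new OriginalReportClient();
    }
}
```

Callers ask the factory for a ReportClient and never name a concrete class, so removing the original version later only touches the factory.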

3) The scope of the client class instance

Some classes might be instantiated with request scope only, while others might be Singletons. Since the Factory is responsible for instantiating the client class, it has the opportunity to establish the instance's scope.
Usually the new version should have the same scope as the original one and this will be specified inside the Factory, but other requirements can also be handled at this level.
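
A hedged Java sketch of a Factory that also controls scope is shown here. The names are hypothetical, and in a project that uses a dependency injection container the scope would more likely be declared there; the sketch only shows that the same factory can hand out either a shared instance or a fresh one per request, for whichever version is active.

```java
interface AuditClient {
    String version();
}

final class OriginalAuditClient implements AuditClient {
    @Override
    public String version() {
        return "original";
    }
}

final class NewAuditClient implements AuditClient {
    @Override
    public String version() {
        return "new";
    }
}

final class AuditClientFactory {
    private final boolean useNewVersion;
    private AuditClient shared; // lazily created singleton instance

    AuditClientFactory(boolean useNewVersion) {
        this.useNewVersion = useNewVersion;
    }

    // Singleton scope: every caller receives the same instance.
    synchronized AuditClient singletonScoped() {
        if (shared == null) {
            shared = instantiate();
        }
        return shared;
    }

    // Request scope: a fresh instance for every call.
    AuditClient requestScoped() {
        return instantiate();
    }

    // The version toggle applies identically to both scopes.
    private AuditClient instantiate() {
        return useNewVersion ? new NewAuditClient() : new OriginalAuditClient();
    }
}
```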

4) Having the implementation deployed on different environments

The Factory can be made aware of the environment and act accordingly so it will instantiate the right version.
The switch can act as a safety measure in the production environment: switching back to the old version disables the new implementation with minimal effort.
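
The environment-aware behaviour could look roughly like the Java sketch below. The Environment enum, the class names and the rule itself (non-production environments always get the new version, production gets it only while a safety switch allows it) are assumptions chosen for illustration, not a prescribed policy.

```java
// Hypothetical set of deployment environments.
enum Environment { DEVELOPMENT, TEST, PRODUCTION }

interface SearchClient {
    String version();
}

final class OriginalSearchClient implements SearchClient {
    @Override
    public String version() {
        return "original";
    }
}

final class NewSearchClient implements SearchClient {
    @Override
    public String version() {
        return "new";
    }
}

final class SearchClientFactory {
    private final Environment environment;
    // Safety switch for production; volatile so a runtime change is seen everywhere.
    private volatile boolean newVersionEnabledInProduction;

    SearchClientFactory(Environment environment, boolean newVersionEnabledInProduction) {
        this.environment = environment;
        this.newVersionEnabledInProduction = newVersionEnabledInProduction;
    }

    // Falls back to the original version in production with a single call.
    void disableNewVersionInProduction() {
        newVersionEnabledInProduction = false;
    }

    SearchClient create() {
        // Assumed rule: only production is guarded by the safety switch.
        boolean useNew = environment != Environment.PRODUCTION
                || newVersionEnabledInProduction;
        return useNew ? new NewSearchClient() : new OriginalSearchClient();
    }
}
```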

Figure 3: Branching by point of use abstraction. The old and the new

In conclusion, here is what the developers and testers should focus on when using branching by abstraction.

The developers

Will use only one branch for all their development.
Will implement the abstraction and the switching mechanism, and add unit tests.
Will set the wiring so it will be possible to instantiate the original or the new version.
Will then focus on the new version to add the new functionality.

The testers

Will test the wiring and the switching functionality.
Will test the new functionality.
Will also test the old functionality (if it got refactored in order to properly create the abstraction layer). This does not apply to branching by point of use.
Are confident that the code is not altered on some other branch (because there is no other branch), so they won't have to test again after a potential merge.

As soon as the client class is deployed to production and its functionality is proven to be right:

The switching mechanism is removed.
The original implementation is removed.

Don't forget, the system must always build and run correctly.