Friday, 21 February 2014

Defect discovery and risk mitigation in Continuous Delivery

Barry Boehm, the inventor of the spiral model of the software process and of the Constructive Cost Model (Cocomo) in 1976, argued that defects are more expensive to fix when they are found in the later stages of software development. His concept was later developed into the Cone of Uncertainty (CoE) in his next book, “Software Engineering Economics” (1981). The cone of uncertainty concept is simply represented in the diagram below:
The basic premise of the concept is that uncertainty evolves and grows as a project enters the next phase. Boehm’s spiral method and Cocomo are meant to anticipate the risk and uncertainty in a software project. For many years, software development has revolved around Boehm's premise of software development economics.
On the other hand, Laurent Bossavit, author of The Leprechauns of Software Engineering (2012), disagrees. Bossavit argues that the “underlying evidence justifying Boehm’s curve…just isn’t up to any reasonable standard of ‘research.’” (see "What does it really cost to fix a software"). Bossavit, who is also the head of Agile Institute, noticed that Boehm misinterpreted the results of his own studies. Therefore we can question the validity of the concept. However, we know that risk and uncertainty do exist in software projects, so the most important thing is to deal with them.


I suggest not arguing about the exact shape of the curve, nor about whether any scientific method was involved, but rather identifying the impact of a change and starting to identify workarounds and mitigations, especially in the case of software defects (also known as bugs). As in other fields, we cannot avoid risk in software development, but we must be prepared to encounter it. Therefore we have to be able to assess the risk, know which risks we face, and mitigate them. The major risk in software development is found in the transition from testing to deployment. To ensure a successful deployment to the production server, we have to minimize the risk of error. A code error becomes a defect in the deployment and leads to system failure in the production environment. We can minimize such failures by eliminating the defects caused by code errors.


Let us begin by assessing the risk correctly using the defect impact matrix below. In the diagram below I suggest a matrix representing the impact of a defect. Instead of representing it over time (as Barry Boehm did), I use the delivery stages on the horizontal axis.
Indeed, defects can be detected at different development and release stages. The impact differs depending on how early or late a defect is discovered. A defect is the outcome of risks that were not properly identified and assessed, so that when the risk materializes the organization is not prepared to monitor and mitigate it.
The content of the matrix is to be adapted to your organization. In fact, the one I propose below would describe an organization with rather average software delivery processes:
(Scores range from A to F, A being the best score and F the worst.)
We could implement continuous delivery practices such as canary deployment or blue/green deployment to mitigate the lowest scores, namely the following:
  • The F(1) score could be mitigated by offering the new version of the product to only a subset of your user base (i.e. 5%). This is also called Canary Deployment.
  • The F(2) score could be mitigated using an advanced production deployment strategy from the 4th or 5th level of Continuous Delivery maturity, such as Canary Deployment or Blue/Green Deployment.
Let us talk a bit about the two deployment methods mentioned above: Canary deployment and Blue/Green deployment.
Canary deployment is a deployment in which software is tested at production level by routing a subset of users to the new functionality. It is important for testing how the changes affect users in general without affecting the entire system, since the canary is exposed only to some users of the system, not all of them.
Blue/Green deployment is a deployment method that creates two identical production environments. It gives you the capability to rapidly roll back to the other environment when anything goes wrong with one of them. Blue/Green deployment enables you to switch production from one environment to the other by rerouting application requests. The following diagram explains the Blue/Green environment.


In their book, Continuous Delivery, Jez Humble and his co-author David Farley suggested a Continuous Delivery maturity model. The maturity model ranges from 0 to 5, where 5 is achieved only by industry leaders such as Netflix, Twitter, GitHub and a few more.
The matrix above could represent an organization that has a well-defined and reliable software development process. However, organizations can improve their maturity level by automating more steps and using advanced deployment and release processes, combined with Agile methodology. In the production stage, they can improve the quality of the product by using automation. Using Canary Deployment, Blue/Green Deployment, or both, we can correctly assess the risks and mitigate them, hence improving the process.
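The two strategies described above can be combined in one router. The following is an added sketch, not code from the original post: the `Router` class, the `blue`/`green` names, the 2% error threshold and the 50-request minimum are all assumptions made for illustration. A stable 5% slice of users is sent to the new release, the canary is aborted if it errors too often, and a blue/green switch happens only if the canary stayed healthy.

```python
import hashlib

CANARY_PERCENT = 5  # the post's example: expose the new version to ~5% of users

def bucket(user_id):
    """Stable 0-99 bucket so a given user always sees the same version."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100

class Router:
    """Hypothetical request router: 'live' serves users, 'idle' holds the new release."""
    def __init__(self, live="blue", idle="green", max_error_rate=0.02, min_requests=50):
        self.live, self.idle = live, idle
        self.canary_on = True
        self.max_error_rate, self.min_requests = max_error_rate, min_requests
        self.seen = self.errors = 0

    def route(self, user_id):
        if self.canary_on and bucket(user_id) < CANARY_PERCENT:
            return self.idle          # canary slice gets the new release
        return self.live

    def record(self, env, ok):
        """Feed back one request outcome; abort the canary if it errors too often."""
        if not self.canary_on or env != self.idle:
            return
        self.seen += 1
        self.errors += 0 if ok else 1
        if self.seen >= self.min_requests and self.errors / self.seen > self.max_error_rate:
            self.canary_on = False    # rollback: every user is back on 'live'

    def promote(self):
        """Blue/green switch: reroute all requests; the old environment stays up."""
        if not self.canary_on:
            raise RuntimeError("canary aborted or already promoted")
        self.live, self.idle = self.idle, self.live
        self.canary_on = False

def simulate(new_release_ok, users=10000):
    """Constructed traffic: user-0..user-N, each making one request."""
    r = Router()
    for i in range(users):
        env = r.route(f"user-{i}")
        r.record(env, ok=(env == r.live) or new_release_ok)
    if r.canary_on:
        r.promote()
    return r.live, r.seen, r.errors
```

With a broken release, only the first 50 canary requests ever reach it before all traffic returns to the old environment; with a healthy release, the switch is made after the canary completes.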
We could imagine an organization going from CD level 3 to 5 in a matter of a year and reaching the following result:
As you can see, the organization above is performing much better. Discovering defects late, even in production, is not catastrophic and has an almost insignificant impact.
Therefore, companies should not hesitate to embrace those risks and keep up with fast-paced innovation, while having a way to mitigate and remediate every single risk they may encounter. We cannot avoid risk, because risk is already there; consequently, we have to focus on the job and on progress, and be prepared to deal with the risks associated with software development.


Once again, the content of the matrix is subjective. I just want to point out the benefits of Continuous Delivery, automation of processes and Agile methodology.
Combining the three of them will allow an organization to improve drastically, keep customers happy, innovate, decrease risks and increase revenues through well-managed technologies.
