There is a continuous struggle in today's software development landscape. On one side are the developers, delivery managers, and business owners, who want to make their work (product features) available to users quickly. On the other side are the QA and DevOps engineers, who want to take time to test so that new features ship without regressions or service outages.
Both sides have valid points. Delivering features quickly seems to be a key component of capturing market share. On the other hand, a service outage causes users real pain and translates into revenue loss and, in the worst case, loss of market share or user base.
Well, how about we hire an army of amazing engineers who write flawless code with 100% unit test coverage? Then we can just push that code to production quickly, right? Wrong. Testing (unit, manual, etc.) can only show the presence of bugs, not their absence.
During a service incident, several engineers in one of my previous teams had to spend days debugging an issue caused by a single instance of case-sensitive string comparison.
Ring-based deployments
Over time, every software component accumulates interdependencies with other components in the architecture. When we deploy a sizable change to a component, or a new component, the probability of introducing bugs increases. These bugs could be caused by logical errors, faulty code, or integration issues. Given this high probability, there needs to be a way to mitigate any negative impact on the customer base while we deploy code to the production environment.
Ring-wise continuous deployment (CD) is one such method to mitigate the risk of customer impact while achieving high velocity of feature delivery.
Divide all environments where the software component is available into several rings of availability, starting from the developer's machine (Ring 0) and ending with the final production environment (Ring 3). Continually deploy and test software features from one ring to the next until they are available to all users.
The exact implementation of the rings depends on the type of software, the team, and the target user base. A sample implementation:
Ring 0 Developer's machine
Ring 1 Staging environment
Ring 2 Production preview or Beta Customers
Ring 3 Production or Real Customers
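The ring-to-ring flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Ring` class, the `promote` function, and the `deploy` and `is_healthy` callbacks are hypothetical names, not part of any real deployment tool. In practice the callbacks would wrap whatever pipeline and monitoring the team already uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Ring:
    level: int
    name: str


# The sample rings from the list above.
RINGS = [
    Ring(0, "Developer's machine"),
    Ring(1, "Staging environment"),
    Ring(2, "Production preview or beta customers"),
    Ring(3, "Production or real customers"),
]


def promote(
    build: str,
    deploy: Callable[[str, Ring], None],
    is_healthy: Callable[[str, Ring], bool],
) -> Optional[Ring]:
    """Deploy `build` ring by ring, stopping at the first unhealthy ring.

    Returns the last ring where the build was verified healthy,
    or None if it failed in Ring 0.
    """
    last_healthy = None
    for ring in RINGS:
        deploy(build, ring)  # hypothetical hook: push the build to this ring
        if not is_healthy(build, ring):  # hypothetical hook: tests, metrics, alerts
            break  # never expose the outer rings to a failing build
        last_healthy = ring
    return last_healthy
```

The point of the design is that a build reaches Ring 3 only after it has been deployed and checked in every ring before it, so a failure is contained to the smallest possible audience.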
The key point to remember is:
The infrastructure and the deployed code in a ring should represent the ring after it as closely as possible.
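One way to keep an eye on this is to compare the configuration of each ring with the ring after it and flag any differences. The sketch below assumes each ring's configuration can be exported as a flat dictionary; the `config_drift` function and the example settings are hypothetical and serve only to illustrate the idea.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # placeholder shown when a setting exists in only one ring


def config_drift(
    ring_config: Dict[str, Any],
    next_ring_config: Dict[str, Any],
) -> Dict[str, Tuple[Any, Any]]:
    """Return the settings that differ between a ring and the ring after it.

    Each entry maps a setting name to (value in this ring, value in next ring).
    """
    drift = {}
    for key in sorted(set(ring_config) | set(next_ring_config)):
        current = ring_config.get(key, MISSING)
        following = next_ring_config.get(key, MISSING)
        if current != following:
            drift[key] = (current, following)
    return drift


# Hypothetical settings for Ring 1 (staging) and Ring 2 (production preview).
staging = {"replicas": 2, "cache_enabled": True, "region": "eu"}
preview = {"replicas": 6, "cache_enabled": True}

print(config_drift(staging, preview))
```

Some differences, such as replica counts, may be intentional. The value of a report like this is that every difference becomes a visible, deliberate decision rather than an accident discovered during an outage.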
The underlying need to deploy software without causing disruptions in the production environment arises from customer-first (or user-first) thinking. Under that mindset, the tech, product, and operations teams all work towards the optimal experience for their customers, and the additional work of ring-wise deployment and testing is part of the cost of delivering it.