How Armory and Spinnaker Support High Performing Teams (part 3)
The “Accelerate” book by Nicole Forsgren, Jez Humble and Gene Kim identifies 24 key practices and capabilities that characterize high performing teams and organizations. These are the direct result of the State of DevOps research that the authors have led over the past 6 years.
The capabilities are grouped into 5 dimensions:
- Continuous delivery
- Architecture
- Product and process
- Lean management and monitoring
- Cultural
In previous posts, we highlighted how Spinnaker helps support “version control (everything-as-code)”, “trunk-based development”, “loosely coupled architectures”, and “deployment automation” capabilities.
In this post, we will discuss how Spinnaker helps high performing teams practicing:
- Team experimentation
- Customer feedback
“If a development team isn’t allowed, without authorization from some outside body, to change requirements or specifications in response to what they discover, their ability to innovate is sharply inhibited.”
Spinnaker was designed as a Continuous Delivery platform for automating the deployment of software changes at high velocity. This means that changes going live are typically small and clearly scoped, often at sub-feature level. This allows fine-grained control and experimentation over what gets released.
Teams using Spinnaker often run the CI part (build + unit tests) of their delivery pipeline in a CI tool like Jenkins or Travis CI, then kick off a Spinnaker pipeline to run other activities like system or performance testing, environment promotion and so on, until finally deploying to production.
That said, it is possible to include role-based manual gates in a Spinnaker pipeline. Such gates reduce a team’s autonomy to experiment, because they make the team depend on approval from someone outside it. However, these gates are being used more and more sparingly, as organizations understand that blocking a pipeline leads to an accumulation of changes (batching) waiting to be released, which increases both the risk of problems and the time needed to diagnose them.
Shifting from pipeline execution to pipeline design approval with certified pipelines
Leading organizations working in industries with strong compliance and risk control requirements are replacing role-based manual approvals with automated policies and criteria embedded in their delivery pipelines. Changes that fail to comply still don’t progress to production, but there is no waiting time involved. Feedback is as fast as possible, and teams start to change their default behavior by baking in security and quality at early stages of development. They know that otherwise they are shooting themselves in the foot: changes that fail the checks are not really done and will require more rework.
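To make the idea of an automated policy gate concrete, here is a minimal sketch in Python. It is not Spinnaker or Armory code: the `ChangeReport` fields, the `policy_violations` function and the 5% regression limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeReport:
    """Hypothetical summary of the automated checks run for one change."""
    tests_passed: bool
    critical_vulnerabilities: int
    code_analysis_alarms: int
    performance_regression_pct: float


def policy_violations(report, max_regression_pct=5.0):
    """Return the list of violated policies (empty means compliant).

    The 5% regression limit is an assumed example value.
    """
    violations = []
    if not report.tests_passed:
        violations.append("test suite failed")
    if report.critical_vulnerabilities > 0:
        violations.append("critical vulnerabilities detected")
    if report.code_analysis_alarms > 0:
        violations.append("code analysis raised alarms")
    if report.performance_regression_pct > max_regression_pct:
        violations.append("performance degraded beyond the allowed limit")
    return violations


def may_promote(report):
    """A change progresses only when no policy is violated."""
    return not policy_violations(report)
```

Because the gate returns the specific violations instead of a bare yes or no, a team gets immediate feedback on what to fix, with no approver in the loop.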
Armory has effectively codified this recommended practice in a feature called “certified pipelines”. We can see this as shifting the approval process from the pipeline execution to the pipeline design. Whoever is responsible for ensuring compliance and risk control (either a Change Approval Board, or CAB, or a dedicated team) can review the policies and checks in the pipeline and “certify” it as fit for deploying changes to production. When any of the steps in the pipeline is modified, the pipeline needs to be “re-certified” before it can continue deploying.
This drastically reduces the manual effort required to keep a smooth flow of changes. Consider that a pipeline design might change weekly at most, which fits well with a CAB’s typical meeting schedule. Pipelines, on the other hand, execute daily at a minimum, and up to hundreds or thousands of times per day for high performing teams. At that volume, the overwhelming majority of executions would never get past a manual approval gate in a reasonable time, even if the change matches all the criteria for deployment (for example, passing all tests, no vulnerabilities detected, no alarms on code analysis, no performance degradation, etc.).
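One way to picture design-time certification is to fingerprint the pipeline definition when it is approved and to compare fingerprints before each run. The sketch below illustrates that idea only; it is not how Armory implements certified pipelines, and `pipeline_fingerprint` and `CertificationRegistry` are hypothetical names.

```python
import hashlib
import json


def pipeline_fingerprint(pipeline):
    """Hash a pipeline definition in a key-order-independent way."""
    canonical = json.dumps(pipeline, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class CertificationRegistry:
    """Hypothetical in-memory record of certified pipeline designs."""

    def __init__(self):
        self._certified = {}  # pipeline name -> (fingerprint, approver)

    def certify(self, name, pipeline, approver):
        """Record that `approver` reviewed this exact pipeline design."""
        self._certified[name] = (pipeline_fingerprint(pipeline), approver)

    def is_certified(self, name, pipeline):
        """True only if the design is unchanged since it was certified."""
        record = self._certified.get(name)
        return record is not None and record[0] == pipeline_fingerprint(pipeline)
```

Under this scheme, any change to a stage alters the fingerprint, so the modified pipeline is treated as uncertified until it is reviewed again, while unchanged pipelines keep running without anyone's involvement.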
Controlling visibility of changes and leveraging data insights for release decisions
Once we get beyond manual approval bottlenecks, how can a team effectively experiment and trial changes without risking propagating failures to all their customers? Those failures can be technical (they will happen eventually, even if our robust test suites found no issues) or behavioral (i.e. the clients didn’t react to the software changes as we hypothesized).
Canary releases come to the rescue:
“Canary release is a technique to reduce the risk of introducing a new software version in production by slowly rolling out the change to a small subset of users before rolling it out to the entire infrastructure and making it available to everybody.”
Spinnaker supports this type of slow release of changes, exposing them to a (configurable) portion of live traffic. The goal is to analyze the impact of those changes on a small subset of users in order to decide if it should be made available to everyone or instead rolled back.
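As an illustration of exposing a configurable portion of traffic to a change, the sketch below assigns each user to a stable bucket and sends the lowest buckets to the canary. This is a generic technique, not a description of how Spinnaker splits traffic (in practice that depends on the platform and load balancer in use); `routes_to_canary` and its parameters are hypothetical.

```python
import hashlib


def routes_to_canary(user_id, canary_percent):
    """Decide deterministically whether a user is served by the canary.

    Each user ID is hashed into one of 10,000 buckets, and the lowest
    `canary_percent` percent of buckets go to the canary. The same user
    always lands in the same bucket, so their experience stays consistent
    while the canary runs.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < canary_percent * 100
```

Raising `canary_percent` only adds users to the canary group; nobody who was already seeing the new version is moved back to the old one.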
There are multiple possible factors to analyze in a canary, from infrastructure usage and run-time performance (especially for technical issues) to internal application metrics and real user monitoring (especially for behavioral impact on users). Different changes might require different combinations of data analysis.
Spinnaker can take this a step further by integrating with Kayenta, a tool developed by Netflix to support fast and safe delivery. Its “Automated Canary Analysis” (ACA) feature takes infrastructure and application metrics from a (configurable) set of sources on the one hand, and a pre-defined set of quality criteria on the other, and uses them to automatically make a rollout or rollback decision for a particular canary.
It’s also possible to define which scores should trigger a manual decision instead of an automatic one. For example, we can configure that a score below 70 means the canary has failed and is automatically rolled back (all traffic is directed to the previous software version again). A score from 70 up to 85 is inconclusive and requires manual intervention. A score of 85 or above means the canary gets rolled out automatically (all traffic will eventually get redirected to the new version, according to the chosen deployment strategy).
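The sketch below shows the general shape of such an analysis: compare each metric between the baseline and the canary, then turn the share of passing metrics into a score from 0 to 100. It is a deliberately simplified stand-in, not Kayenta's algorithm (which relies on statistical tests rather than a comparison of means); the function names and the 10% tolerance are assumptions.

```python
from statistics import mean


def metric_passes(baseline, canary, tolerance=0.10):
    """Pass if the canary's mean is within `tolerance` of the baseline's.

    Simplified stand-in for a real statistical comparison.
    """
    base = mean(baseline)
    can = mean(canary)
    if base == 0:
        return can == 0
    return abs(can - base) / abs(base) <= tolerance


def canary_score(metrics, tolerance=0.10):
    """Score a canary as the percentage of metrics that pass.

    `metrics` maps a metric name to a (baseline_samples, canary_samples) pair.
    """
    if not metrics:
        raise ValueError("at least one metric is required")
    passed = sum(
        metric_passes(baseline, canary, tolerance)
        for baseline, canary in metrics.values()
    )
    return 100.0 * passed / len(metrics)
```

For instance, a canary whose latency stays within 1% of the baseline but whose error count doubles would pass one metric out of two and score 50.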
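The three score bands in this example can be written as a small decision function. This is a sketch only: `canary_decision` is a hypothetical helper, not a Spinnaker or Kayenta API, and the handling of the exact boundary values (70 goes to manual review, 85 to rollout) is a choice made for this sketch.

```python
def canary_decision(score, marginal=70.0, passing=85.0):
    """Map a canary score (0 to 100) to an action.

    Below `marginal`: automatic rollback.
    From `marginal` up to (not including) `passing`: manual judgment.
    At or above `passing`: automatic rollout.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < marginal:
        return "rollback"
    if score < passing:
        return "manual"
    return "rollout"
```

Keeping the thresholds as parameters lets each team tune how much risk it accepts automatically and how much it wants a human to look at.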
For more technical details on using ACA, check out this whitepaper from Armory. For details on the rationale behind the tool and the decision computation algorithm see Netflix’s blog post introducing Kayenta.
Netflix’s blog says that “having detailed insight into why a canary release failed is crucial in building confidence in the system”. We would add to that “and enabling low risk team experimentation”.