Building A Better Release
Nov 19, 2021 by Marcia Knous
Security, stability and predictability are important hallmarks of what it takes to ship great software. In the fall of 2020, Armory began the process of incorporating these three key factors into a new release process. We believed there would be substantial benefits to changing our release cadence and offering our customers the chance to be on a longer release train, while still providing periodic bug fixes and patches to keep them safe and secure.
At the time I joined Armory, we were still putting together the pieces for a much-needed new release process. In this post, I summarize our journey to our new Long Term Stable (LTS) release. I will also share some of my experiences as a new employee working on this exciting project.
Planning the Journey
Armory began the journey toward an LTS release in November 2020. The idea was that each release would have a one-year life span and that a new release would ship roughly every six months. Each release would be patched periodically for bug fixes and CVEs. We envisioned that this release cadence would help customers achieve long-term consistency, infrastructure stability and better security.
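To make the cadence concrete, here is a minimal Python sketch of the arithmetic: a release every six months, each supported for twelve. The release names (`lts-1`, `lts-2`, ...) and the start date are hypothetical placeholders, not Armory's actual version numbers or ship dates.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

CADENCE_MONTHS = 6   # a new LTS release roughly every six months
SUPPORT_MONTHS = 12  # each release has a one-year life span


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, keeping the day (assumes day <= 28)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


@dataclass(frozen=True)
class LtsRelease:
    name: str        # hypothetical label, for illustration only
    released: date

    def end_of_life(self) -> date:
        return add_months(self.released, SUPPORT_MONTHS)

    def is_supported(self, on: date) -> bool:
        # Supported from the release date up to, but not including, end of life.
        return self.released <= on < self.end_of_life()


def plan_releases(first: date, count: int) -> list[LtsRelease]:
    """Lay out `count` releases spaced CADENCE_MONTHS apart."""
    return [
        LtsRelease(f"lts-{i + 1}", add_months(first, i * CADENCE_MONTHS))
        for i in range(count)
    ]


if __name__ == "__main__":
    plan = plan_releases(date(2021, 11, 1), 4)  # hypothetical start date
    today = date(2022, 8, 15)
    for release in plan:
        state = "supported" if release.is_supported(today) else "not supported"
        print(release.name, release.released, release.end_of_life(), state)
```

Under this model, once the second release ships there are always two releases in support at the same time, which gives customers a six-month overlap in which to upgrade.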
As part of this process, we thought it would be beneficial to align Armory LTS releases with Spinnaker OSS releases, so we worked with the OSS community to create a six-month cadence. This community support enables us to continue to incorporate community backports in our releases.
We also committed to collaborating with our Managed customers in the new release process by giving them the opportunity to use the release in an Early Access program. We thought this approach would yield some valuable feedback in the months leading up to the release, and our experience with the first iteration through this process has so far validated that approach. Given this, we look forward to partnering with even more Managed customers in our next release.
It’s Getting Real
In March of 2021, I joined Armory as a Release Project Manager. Prior to joining Armory I had worked as the Nightly Release Manager at Mozilla, so I was accustomed to working with my team to continually improve and streamline the release process.
I joined the Extensibility team, which was composed of four developers who were already working on all the pieces needed to support this new release process. I joined the daily standup calls to learn more about the team and how they worked. It was pretty clear that there was a fair amount of work to do, including building an API for the release process. I had a lot to learn as well, and those standups helped me integrate into the team and learn much more about the work they planned to do. I also participated in conversations with many stakeholders across the company, to learn as much as I could about both the company culture and the agile process.
While my team was busy writing awesome code, I was busy thinking about ways we could improve the release process. I eventually focused on four areas I believed would help keep the broader team on the same page: 1) communication, 2) tracking processes, 3) testing, and 4) documentation.
From the start, it was clear one area where I could make an impact was cross-company communication. To that end, I created a FAQ to answer all the release-related questions. I also helped to create a weekly release sync meeting to keep everyone on the same page. Finally, I created a release status document that I update regularly and post in our communication channel so everyone can follow along asynchronously.
The second area where I’ve spent time to help the team is how we track items in the release. Recently we decided the best way to track what was needed in each release was to link all the tickets in JIRA to the signoff ticket. That way, it would be easy to ensure that everything that was targeted for the release actually made it in.
Similarly, we started tracking release notes differently. Previously, release notes had often been put together at the last minute. Fairly recently we added a mandatory field in JIRA where developers can add more information about the feature so it can be captured earlier in the release process. This improves the overall quality of our handcrafted release notes created by our awesome Docs team, which are an important part of the final release.
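The two tracking changes above lend themselves to a simple automated check. The sketch below is only an illustration: it works on plain in-memory records rather than the real JIRA API, and the `Ticket` fields and the `"Done"` status name are assumptions, not our actual JIRA configuration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Ticket:
    key: str                 # e.g. "REL-101" (hypothetical key)
    status: str              # workflow status; "Done" is an assumed name
    release_note: str = ""   # the mandatory release-note field


def signoff_blockers(linked: list[Ticket]) -> dict[str, list[str]]:
    """Group the tickets linked to a signoff ticket by what still blocks them."""
    return {
        "not_done": [t.key for t in linked if t.status != "Done"],
        "missing_release_note": [
            t.key for t in linked if not t.release_note.strip()
        ],
    }


if __name__ == "__main__":
    linked = [
        Ticket("REL-101", "Done", "Adds a retry option to the deploy stage."),
        Ticket("REL-102", "In Progress", "Fixes a pipeline timeout."),
        Ticket("REL-103", "Done"),
    ]
    for reason, keys in signoff_blockers(linked).items():
        print(reason, keys)
```

A release is ready for signoff when both lists come back empty, which turns "did everything make it in?" into a question a script can answer.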
The third area I tackled is critically important to every release: testing. To that end, I scheduled a testing review with all of the individual service teams. The goal was to look at our integration tests and see where there was room to improve. As a result, we identified a few microservices that could benefit from additional testing, and one of our Armory teams has started working diligently on those improvements. Spinnaker is a complex product to test, and I imagine there will be future sessions like this where we will periodically review our test landscape.
The final area where I think we have made some significant improvements is documentation. This includes things like having a Release Process Runbook to consult when there are questions about doing something as part of the release process. There is quite a bit of tribal knowledge around releases, and documenting it in a centralized place can be very beneficial.
Heading in the Right Direction
Besides all of the process improvements mentioned above, our team had to make decisions along the way regarding the technical side of the release. This included things such as feature freeze dates, versioning, trying to automate release notes and CVEs, etc. We even had to take into account things like trying to make sure we didn’t have any conflicts between the legacy release process and the new release process.
By far the most impactful tool that our team developed during the new release process was Astrolabe. This project allowed us to take in data from multiple domains across the SDLC. Sources such as code repositories, build systems, artifact stores and cloud deployments can be ingested and linked together, which makes it possible to interrogate relationships across domains and to trigger off events. The project's GitHub page goes into more detail about some of the other parts of the project and how it can be used, but I feel that Astrolabe was a game changer in terms of what it can do for us in the future as we continue to automate and refine various parts of the release process. I am proud of the fact that my team had the vision to integrate a tool into the process that will be so useful down the road.
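The general idea of linking records from different domains and then asking questions across them can be sketched in a few lines of Python. This is a toy illustration only, not Astrolabe's API or data model: the `LinkGraph` class, the domain names and the identifiers are all made up for the example, and event triggering is left out.

```python
from __future__ import annotations

from collections import defaultdict, deque

# A node is a (domain, identifier) pair, e.g. ("commit", "abc123").
Node = tuple[str, str]


class LinkGraph:
    """Toy directed graph of links between records from different domains."""

    def __init__(self) -> None:
        self._edges: dict[Node, set[Node]] = defaultdict(set)

    def link(self, src: Node, dst: Node) -> None:
        """Record that `src` led to `dst` (a commit produced a build, etc.)."""
        self._edges[src].add(dst)

    def downstream(self, start: Node, domain: str) -> list[Node]:
        """Breadth-first search for every node in `domain` reachable from `start`."""
        seen = {start}
        queue = deque([start])
        found: list[Node] = []
        while queue:
            node = queue.popleft()
            for nxt in sorted(self._edges.get(node, ())):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                    if nxt[0] == domain:
                        found.append(nxt)
        return found


if __name__ == "__main__":
    graph = LinkGraph()
    graph.link(("commit", "abc123"), ("build", "build-17"))
    graph.link(("build", "build-17"), ("artifact", "svc:1.0.0"))
    graph.link(("artifact", "svc:1.0.0"), ("deployment", "staging"))
    graph.link(("artifact", "svc:1.0.0"), ("deployment", "prod"))
    # Which deployments contain this commit?
    print(graph.downstream(("commit", "abc123"), "deployment"))
```

Once the links exist, a question like "which environments are running this commit?" becomes a graph traversal instead of a manual search across several systems.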
An Engineering Perspective
The other day I was chatting with Dan Peach, a Senior Engineer at Armory and one of my former team members, about what he believed to be the significant wins in this new release process. One of the biggest wins he highlighted was that commits are integration-tested far more quickly than they were in the old release process, where it could take up to two months for a new commit from OSS to be integration-tested. Now it takes approximately one hour for a commit to be built with Armory code and integration-tested. Another point he mentioned was that we built a very user-friendly, hands-off process that allows almost anyone in the company to build a release. In fact, I have been creating the release candidates in GitHub for some of the releases myself.
Finally, Dan also mentioned that we built an API around the release process. That now allows us to do things like build a release dashboard where we can see different views of our development process. I can see many great things coming down the road as a result of being able to see the big picture of the release as it unfolds.
There is always more work to be done in the Release Management world – I am excited to continue the journey working with my team to make the LTS release process even better!