I’ll be upfront with you: I’m a sucker for a good origin story. It’s one of the reasons I spent hours engrossed in the Marvel Cinematic Universe not too long ago. Rooting for incredibly flawed individuals with an outsized sense of duty and superpowers to back it up... What’s not to love? My partner has a very different opinion on this, but I’ll spare you the details of that conversation. Instead, I’ll share an origin story that is much less polarizing, one that I heard from Andrew Backes, Armory’s Head of Engineering.
Not a good user experience
Andrew recently shared with me how the team discovered the need for the Armory Agent for Kubernetes:
“As soon as customers had 100s of Kubernetes accounts, they would start experiencing significant operational slowdown. Every time they added a new account, it would require restarting Clouddriver, and that could take at least an hour. Also, caching for large clusters would take a very long time and cause dramatically slower deployment times. These things caused both problems when adding or removing accounts as well as performance problems during deployments. It really was not a good user experience at all.”
Put another way, whenever customers performed a key operation—deploying or adding/removing an account—it required additional, manual resources. In short, customers had a painful operational scaling issue.
Reaching massive scale
So Andrew and team introduced a solution that enables Kubernetes deployments at massive scale. The solution involves offloading some of the functionality centrally performed by Clouddriver and distributing that functionality to Kubernetes clusters at the edge. Specifically, the Armory Agent for Kubernetes is a lightweight, distributed service that monitors Kubernetes clusters and streams changes back to Spinnaker’s Clouddriver service in real time.
This architecture created the ability to deploy to Kubernetes at scale by creating a complementary relationship between Clouddriver and the Armory Agent for Kubernetes:
|Spinnaker Clouddriver responsibilities||Armory Agent for Kubernetes responsibilities|
The end result is that customers can use Spinnaker to deploy to thousands of clusters and namespaces, both on-prem and in the public cloud. One reason this is now possible is that Agent workloads are cached at the edge, which helps reduce latency. Another is that performance improves dramatically because the Kubernetes Watch function is used to update Clouddriver only with event stream changes rather than with a periodic full cache rebuild.
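To make the difference between a periodic full cache rebuild and event-stream updates concrete, here is a minimal sketch in Python. It is not Armory's or Clouddriver's implementation: the `WatchEvent` class, the `full_rebuild` and `apply_event` functions, and the sample workloads are all hypothetical names invented for illustration. The only detail borrowed from Kubernetes is the set of watch event types (`ADDED`, `MODIFIED`, `DELETED`).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class WatchEvent:
    """Hypothetical stand-in for one change streamed from a cluster."""
    type: str   # "ADDED", "MODIFIED", or "DELETED"
    key: str    # e.g. "namespace/name"
    spec: dict  # simplified resource state


def full_rebuild(listing: Iterable[Tuple[str, dict]]) -> Dict[str, dict]:
    """Periodic approach: re-list every resource and rebuild the whole cache.

    The cost grows with the total number of resources, even if nothing changed.
    """
    return {key: spec for key, spec in listing}


def apply_event(cache: Dict[str, dict], event: WatchEvent) -> None:
    """Event-stream approach: touch only the entry that actually changed."""
    if event.type in ("ADDED", "MODIFIED"):
        cache[event.key] = event.spec
    elif event.type == "DELETED":
        cache.pop(event.key, None)
    else:
        raise ValueError(f"unexpected event type: {event.type}")


# Initial state: one full listing seeds the cache (made-up sample data).
cluster = [("prod/web", {"replicas": 3}), ("prod/api", {"replicas": 2})]
cache = full_rebuild(cluster)

# Afterwards, only the changes are streamed and applied.
events = [
    WatchEvent("MODIFIED", "prod/web", {"replicas": 5}),
    WatchEvent("ADDED", "prod/worker", {"replicas": 1}),
    WatchEvent("DELETED", "prod/api", {}),
]
for event in events:
    apply_event(cache, event)

print(cache)
```

In this sketch, the work done per update is proportional to the number of changes rather than to the size of the cluster, which is the general reason an event stream scales better than rebuilding the cache on a timer.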
Scale + Security + Accelerated Onboarding
As mentioned in our KubeCon 2021 press release, “Distributed Kubernetes Agent Tames Day 2 Operational Complexity,” 94% of organizations running Kubernetes in production state that it is a source of complexity for them. But what’s causing all the strife? D2iQ shares that 47% of respondents cite security concerns, 37% cite scaling issues, and 34% cite a lack of resources. As mentioned, the Armory Agent for Kubernetes was created to address a scaling issue. However, after running the agent in customer environments for the last year, we’ve confirmed that it does more than solve the scaling problem.
What we’ve learned is that the distributed nature of the agent provides a number of benefits. For instance, the Armory Agent for Kubernetes reduces security vulnerabilities by distributing service accounts across environments and clouds rather than placing them in a central location.
The Armory Agent for Kubernetes also reduces the constraint caused by a lack of resources by giving product teams control of the service accounts and permissions in their clusters. This decentralized approach—combined with automation and code templates—not only reduces bottlenecks, it also speeds up the onboarding of new clusters by as much as 3x.
If we can help with your origin story of deploying to Kubernetes at scale, please reach out and set up some time to chat with us. We’d love to hear from you!