Armory Agent for Kubernetes Simplifies K8s Complexity

Oct 11, 2021 by Armory

I’ll be upfront with you: I’m a sucker for a good origin story. It’s one of the reasons I spent hours engrossed in the Marvel Cinematic Universe not too long ago. Rooting for incredibly flawed individuals with an outsized sense of duty and the superpowers to back it up... what’s not to love? My partner has a very different opinion on this, but I’ll spare you the details of that conversation. Instead, I’ll share an origin story that is much less polarizing: one I heard from Andrew Backes, Armory’s Head of Engineering.

Not a good user experience

Andrew recently shared with me how the team discovered the need for the Armory Agent for Kubernetes:

“As soon as customers had 100s of Kubernetes accounts, they would start experiencing significant operational slow down. Every time they added a new account, it would require restarting Clouddriver, and that could take at least an hour.  Also, caching for large clusters would take a very long time and cause dramatically slower deployment times.  These things caused both problems when adding or removing accounts as well as performance problems during deployments. It really was not a good user experience at all.”

Put another way, every key operation customers performed, whether deploying or adding and removing accounts, demanded extra manual work and resources. In short, customers had a painful operational scaling problem.

Reaching massive scale

So Andrew and his team introduced a solution that enables Kubernetes deployments at massive scale. The solution offloads some of the functionality Clouddriver performs centrally and distributes it to Kubernetes clusters at the edge. Specifically, the Armory Agent for Kubernetes is a lightweight, distributed service that monitors Kubernetes clusters and streams changes back to Spinnaker’s Clouddriver service in real time.

This architecture makes it possible to deploy to Kubernetes at scale by establishing a complementary relationship between Clouddriver and the Armory Agent for Kubernetes:

Spinnaker Clouddriver responsibilities:
  • Manage the storing of infrastructure data in its cache
  • Apply logical transformations to that data and serve it to other services
  • Initiate state changes
  • Account management

Armory Agent for Kubernetes responsibilities:
  • Get data from the Kubernetes API server to Clouddriver
  • Perform operations directly against the Kubernetes API server (see the sketch below)
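
To make the agent’s side of that split concrete, here is a minimal sketch (not Armory’s code) of a direct operation against the Kubernetes API server, using the official Python client. The deployment name, namespace, and replica count are hypothetical placeholders.

```python
# Minimal sketch: performing an operation directly against the Kubernetes API
# server, similar in spirit to the agent applying a state change.
# Not Armory's code; the deployment name and namespace are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside a cluster
apps = client.AppsV1Api()

# Scale a (hypothetical) deployment by patching its replica count.
patch = {"spec": {"replicas": 3}}
apps.patch_namespaced_deployment(name="demo-app", namespace="demo", body=patch)
```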

The end result is that customers can use Spinnaker to deploy to thousands of clusters and namespaces, both on-prem and in the public cloud. One reason this is now possible is that Agent workloads are cached at the edge, which reduces latency. Another is that performance improves dramatically because the agent uses the Kubernetes watch API to send Clouddriver only the stream of changes, rather than rebuilding the full cache on a periodic schedule.
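
For readers unfamiliar with the mechanism, here is a minimal sketch (again, not Armory’s implementation) of a watch-based event stream using the official Kubernetes Python client: instead of re-listing every object on a timer, the client receives only ADDED, MODIFIED, and DELETED events as they happen.

```python
# Minimal sketch of a Kubernetes watch: stream deployment changes as events
# instead of periodically re-listing everything. Not Armory's implementation.
from kubernetes import client, config, watch

config.load_kube_config()  # use config.load_incluster_config() inside a cluster
apps = client.AppsV1Api()

w = watch.Watch()
for event in w.stream(apps.list_deployment_for_all_namespaces, timeout_seconds=60):
    obj = event["object"]
    # Each event carries only the change (ADDED / MODIFIED / DELETED) and the object.
    print(event["type"], obj.metadata.namespace, obj.metadata.name)
```

The agent works in a similar spirit: it watches cluster state where it runs and forwards just those deltas to Clouddriver, which is why a periodic full cache rebuild is no longer needed.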

Scale + Security + Accelerated Onboarding

As mentioned in our KubeCon 2021 press release, “Distributed Kubernetes Agent Tames Day 2 Operational Complexity,” 94% of organizations running Kubernetes in production say it is a source of complexity for them. But what’s causing all the strife? D2iQ reports that 47% of respondents cite security concerns, 37% cite scaling issues, and 34% cite a lack of resources. As mentioned, the Armory Agent for Kubernetes was created to address a scaling issue. However, after running the agent in customer environments for the last year, we’ve validated that it does more than solve scaling problems.

What we’ve learned is that the distributed nature of the agent provides a number of benefits. For instance, the Armory Agent for Kubernetes reduces security risk by distributing service accounts across environments and clouds rather than concentrating them in a central location.

The Armory Agent for Kubernetes also eases resource constraints by giving product teams control of the service accounts and permissions in their own clusters. This decentralized approach, combined with automation and code templates, not only reduces bottlenecks but also speeds up the onboarding of new clusters by as much as 3x.

If we can help with your origin story of deploying to Kubernetes at scale, please reach out and set up some time to chat with us. We’d love to hear from you!
