Identifying risk when executing your Kubernetes migration
Apr 17, 2020 by Fernando Freire
“Any improvements made anywhere besides the bottleneck are an illusion.” – Gene Kim, The Phoenix Project
Software continues to “eat the world”, and organizations are struggling to keep up. With the increasing pace of business and consumer expectations, enterprises in the Global 2000 are undertaking costly “digital transformations” to stay competitive.
When we speak to bank executives, they tell us, “We’re not a bank; we’re a software company that specializes in finance.” When we speak to retail executives, they tell us, “We’re not a shoe company; we’re a software company that specializes in shoes.” Digital transformation is billed as the miracle that will lift these traditional companies into the Digital Age. Many of these transformation initiatives take the form of containerization and “Kubernetification”, which appear to be silver bullets for the growing pains of building an engineering organization.
Through the 90s, a common refrain was “nobody gets fired for buying IBM”, and the same can be said of Kubernetes today. It’s the New Shiny™, and no engineering VP can be faulted for choosing it to solve their productivity woes. So if the promises of Kubernetes are so great, why are enterprises taking so long to execute their Kubernetes migrations?
Storytelling from the Trenches of the Fortune 100
Let’s look at “Hooli” to understand how this could take a wrong turn. Jennifer is a Director of Engineering who oversees several teams in Hooli’s Signature Box division. Jennifer hears that containers are the wave of the future, and assigns a few engineers to identify how they will migrate their platform. We’ll call them Allyson and Jorge.
These two engineers set out with gusto to identify the absolute best way to use containers in their organization. They test several solutions: Docker Swarm, ECS, and a couple of flavors of Kubernetes. They settle on Kubernetes because it ticks most of the boxes they are looking for:
- Deployments as a first-class member of the platform.
- Resiliency in the form of nodes and containers restarting when unhealthy.
- Deduplicating out-of-process agents such as log aggregators, metrics monitors, and security agents.
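To make those three boxes concrete, here is a minimal sketch of what they look like as Kubernetes objects, built as plain Python dictionaries. The application name, image names, port, and `/healthz` path are hypothetical placeholders, not anything from Hooli's real setup; only the object kinds and field names come from the Kubernetes `apps/v1` API.

```python
import json


def deployment(name, image, replicas=2):
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",  # deployments as a first-class object
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Resiliency: the kubelet restarts the container
                        # when this probe fails. Path and port are assumed.
                        "livenessProbe": {
                            "httpGet": {"path": "/healthz", "port": 8080},
                            "periodSeconds": 10,
                        },
                    }]
                },
            },
        },
    }


def node_agent(name, image):
    """Build a minimal DaemonSet: one agent pod per eligible node,
    instead of one agent baked into every application VM."""
    labels = {"agent": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical names and images, for illustration only.
    manifests = [
        deployment("signature-box", "registry.example.com/signature-box:1.0"),
        node_agent("log-aggregator", "registry.example.com/log-agent:1.0"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Each printed document is JSON, which `kubectl apply -f` accepts alongside YAML. The sketch covers only the happy path; it says nothing about the build, compliance, and pipeline questions that follow.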
The path forward is clear; all Allyson and Jorge have to do is get one little application migrated. One application sounds easy until you start considering all the dependencies that are required to run on Kubernetes. How do they translate their virtual machine build process to containers? How do they make sure the compliance agents running in their virtual machines function the same way in containers? How do they adjust their deployment pipeline to take advantage of what Kubernetes has to offer? A molehill becomes a mountain of little details, each requiring its own bespoke solution to work inside of Hooli’s infrastructure.
Undeterred, Allyson and Jorge set out to make one application run on Kubernetes. They toil for months to migrate it. When they finally reach the finish line, they look around the organization for the next team to migrate. To their surprise, very few teams are willing to migrate, and many more refuse outright. They refuse because Allyson and Jorge are asking them to upend their current world in favor of a muddy future with unclear value. Most teams, while willing to try new technologies, are conservative when adopting unproven tools. More often than not, their performance is measured in features delivered and system stability. Asking these teams for a wholesale migration is a recipe for disaster. No matter how much well-intentioned support Allyson and Jorge offer, the approach is riskier than those teams are willing to accept.
It ends up taking Allyson and Jorge eight months to show value. Granted, a lot of questions get answered in that span, but teams look at the timeline and shudder at the impact to their pace of delivery. Product owners certainly won’t grant more than a sprint’s worth of time to migrate an application, let alone eight months.
Misunderstanding the Risks in Migrating
Timeline aside, a migration to Kubernetes introduces new tools and processes besides Kubernetes. While the new platform fits together seamlessly, there’s often a gap in tooling to help teams migrate onto Kubernetes with the same fluidity. When Allyson and Jorge finally sign up a team to migrate, they find implicit assumptions about how teams should be using Kubernetes at Hooli. The horizon for moving onto Kubernetes moves further into the future when these assumptions start to break down. This is how your well-intentioned migration to Kubernetes goes from a few months to a few years.
Hooli’s migration is, at its core, a misunderstanding of the risks involved when moving to Kubernetes. When most organizations start their migration, they often begin with “how am I going to deploy this to my new environment?” The focus on delivery often results in security, auditing, observability, and infrastructure becoming secondary goals. The end result is a shiny new platform that is riskier to operate because it’s less feature rich. The desire for new technology to solve old problems leads to complexity because Kubernetes isn’t a silver bullet.
Another common pitfall is doing things the Kubernetes Way™. It starts with your engineering champion jumping at the opportunity to take full advantage of Kubernetes. Instead of moving to containers and Kubernetes, you’re now moving to containerd, Istio, Argo/Flux, and the list goes on. If your head is spinning, take solace in the fact that you don’t have to do it all at once, if at all. Kelsey Hightower, on Kubernetes, says, “There are no best practices, everyone is just practicing.”
Piecing It All Together
When your engineers tell you one story and the industry tells another, how do you successfully execute your Kubernetes migration? The answer is likely more boring than you hoped: define incremental milestones that mitigate risk along your path. The transition from Waterfall to Agile is instructive; rather than designing your entire transformation up front, find smaller milestones that still deliver value. When the milestones are small enough, you reduce impact to consumers, keep your engineers happy, and find a more effective organization on the other side. This is the Zen of migrating to Kubernetes.
In our next post we’ll identify concrete ways in which you can reduce the risk of your Kubernetes migration, and how Spinnaker helps you get there.