Dogfooding At Armory

Apr 18, 2022 by Dan Peach

We recently onboarded the main server component of our new software delivery platform — codenamed Borealis — onto Borealis itself. I’m going to talk a bit about Borealis, the experience of dogfooding at Armory, and what we hope to learn from it.

Borealis and Deploy Engine

The application we onboarded is called Deploy Engine. The name is uninspired and is out of place among the sea of ostensibly Grecian software systems — Kubernetes, Argo, Tekton, etc. — but is descriptive of its role within Borealis: it orchestrates software deployment strategies.

Using the Borealis CLI or API, a user can provide a high-level description of their deployment: First, I’d like 10% of production traffic to go to my new software version. Next, I’d like the deployment to wait thirty minutes and then I’d like to run a canary analysis of application metrics. Next, I’d like to increase traffic to 50% — and so on. Deploy Engine, in concert with a few other subsystems, orchestrates this kind of deployment workflow.

Dogfooding at Armory

Dogfooding at Armory

In my experience, the mental state of even the most empathetic, thoughtful, and experienced software engineer is divorced from that of a user.

I have two analogies:

Dogfooding is hard, painful work, but it’s one of the best ways to experience your software as your users will. It’s not a replacement for user feedback — we have biases and beliefs that inform how we experience our software that may be different than those of our customers — but it’s an essential part of the development lifecycle.

What we’ve learned

I’ll share two anecdotes about our experience of dogfooding Deploy Engine.

When we first tried to onboard Deploy Engine onto the Borealis platform, we hit a strange bug that we’d never seen in testing or development. I soon discovered a bug that’s a little embarrassing, a little funny, and fortunately won’t take long to describe.

Our build uses Kustomize to generate manifests for five Kubernetes resources. When we tried using Borealis to deploy for the first time, only two of the five resources were deployed. We had unit tests that ensured that this case would work, so I was mystified. After some digging, I found the answer.

Our API accepts both JSON-encoded and YAML-encoded Kubernetes manifests. We check the first 512 bytes of a manifest: if those bytes look like JSON, we interpret as JSON; if not, we interpret the manifest as YAML. Unfortunately, in the case where we decided the manifest is encoded as YAML, we only used the first 512 bytes – we threw the rest away! Oops.

Our unit tests checked that we could deploy multiple manifests, but we used unrealistically small manifests — smaller than 512 bytes. This is a bug I’m glad a customer never had to experience, and is a bug we likely wouldn’t have caught without dogfooding.

My second anecdote is also brief. Borealis has a UI that lists all recent deployments; Deploy Engine determines how those deployments are sorted. Working with the product team, we defined the sort based on our experience and what we thought would make sense to customers.

Once we started using that UI as users, we realized the sort didn’t make sense at all — it bothered us, and we decided to change it. That very same day, one of our first users asked us to update the sort on the deployments page; their suggested change was the exact change we had just implemented. We were able to go back to the customer and tell them that their fix would be ready by the end of the day.

Today all the services used by Borealis are deployed using Borealis. Additional teams throughout the company have also started using it as they’ve seen it improve. Teams across Armory are giving us feedback. Some of it is tough to hear, but we’ll catch bugs and improve the product in ways that our customers will love. I hope to give you an update on Borealis, our progress, and our dogfooding journey in a few months.

If you would like to try out Borealis, we are still accepting design partners, and I’d love to hear your feedback on it as well.

Thanks!

Dan

Share this post:

Recently Published Posts

Lambda Deployment is now supported by Armory CD-as-a-Service

Nov 28, 2023

Armory simplifies serverless deployment: Armory Continuous Deployment-as-a-Service extends its robust deployment capabilities to AWS Lambda.

Read more

New Feature: Trigger Nodes and Source Context

Sep 29, 2023

The Power of Graphs for Ingesting and Acting on Complex Orchestration Logic We’ve been having deep conversations with customers and peer thought leaders about the challenges presented by executing multi-environment continuous deployment, and have developed an appreciation for the power of using visual tools such as directed acyclic graphs (DAG) to understand and share the […]

Read more

Continuous Deployments meet Continuous Communication

Sep 7, 2023

Automation and the SDLC Automating the software development life cycle has been one of the highest priorities for teams since development became a profession. We know that automation can cut down on burnout and increase efficiency, giving back time to ourselves and our teams to dig in and bust out innovative ideas. If it’s not […]

Read more