
Enabling a Unified Data Model for your Software Development Lifecycle (SDLC)
Jun 25, 2020 by Armory
I caught up with Isaac, Armory’s CTO, and Clay, Principal Software Engineer at Armory, to discuss how Armory and the Spinnaker community are evolving and deepening Spinnaker as a Platform. In this video we discuss the work being done to create a Unified Data Model for our customers’ entire Software Development Lifecycles (SDLC) to help solve the following types of challenges we typically see in Global 2000 enterprises:
- Compliance and security bottlenecks due to fractured data models across multiple systems of record in the SDLC, ranging from Jira & ServiceNow on the left, to APMs, Centralized Logging & Alerting on the right of the SDLC
- How a Unified Data Model that ingests the siloed data sitting across SDLC tooling today will solve Compliance & Security auditing challenges companies are facing, especially as they move workloads out of data centers and into multiple cloud target types and providers
- How this Unified Data Model will also enable new views that reduce Mean Time to Resolution (MTTR) and improve developer efficiency by helping enterprises answer questions like “are my teams shipping code at an optimal pace while also being secure and meeting internal standards?”
Here’s a transcript of the video:
DROdio: Okay, we’re recording. This is DROdio, CEO of Armory, doing a “Between a Shield and a Fern” CEO interview here with Isaac and Clay. So Clay, let’s have you introduce yourself first.
Clay McCoy: I’m Clay McCoy, and I’m a principal engineer at Armory. And I’ve been here since October of 2019, mostly working on the plugin framework, but I’ve worked on Spinnaker for quite a while. I was at Netflix and worked on Asgard which is a precursor to Spinnaker and helped start Spinnaker and open source it with a great team there, and then moved on to do some other things. I’ve also been at Pivotal and worked on Spinnaker at Pivotal, and now I’m here at Armory.
DROdio: I can’t wait to dig into all of the history of Spinnaker and then talk about where we’re taking it from a data perspective which is what this talk is going to be about here. Super excited for that. All right, Isaac if you could also mention we’ve got some awesome news to share as well, so introduce yourself and share this news here.
Isaac Mosquera: Yeah, great. So I’m Isaac Mosquera, I’m the CTO, and one of the co-founders here with you, Daniel, at Armory, and I’ve been here, obviously, since the beginning. And the good news that we have to share with everybody today is that I just joined the Steering Committee for Spinnaker, which is the top-most committee for the community, and I’m super excited and happy and fortunate to be on that committee with a bunch of other folks from Netflix and Google, to help really take Spinnaker and the community to that next level, and I think what we’re going to talk about today shows some of the visionary thinking around the project and where we want it to head.
DROdio: Awesome, awesome. Okay. All right, so let’s actually start by just looking at some history so let’s just rewind to the origins of Spinnaker — Clay you talked about that a bit, but also maybe we can talk about how Spinnaker exists inside of Netflix and how that’s different from what’s in open-source; what some of the problems and gaps are and then we can get into this data conversation which is so exciting.
Clay McCoy: Okay, so, like we were talking about before this call, Netflix didn’t open-source everything. They have several tools that have remained internal, and add-ons to Spinnaker that integrate those tools that store data about the world. Other teams own and specialize in these tools, and Spinnaker gets to benefit from that information. Things like Astrid, an internal tool they have that handles dependency management: what dependencies your project has, understanding all those relationships, and then being able to correlate that with what’s in deployment, that sort of thing. And without that, and even with it, there are still some gaps. Spinnaker’s got all of this information about your software delivery flowing through it, but it’s not necessarily the best steward of that all the time. Sometimes it can be hard to extract that information and relate things to answer questions like, “when you’ve got something deployed, what pipeline built this server group? What artifact is in it? What built that artifact, and what commit did that ultimately come from?” Those are questions that people have when things start to go wrong. And it’s nice to have that at your fingertips.
DROdio: And so when I speak to Global 2000 CIOs, very consistently, they have very little understanding of what their SDLC looks like inside their own company — so even beyond just the deployment piece, but across the entire SDLC there’s all these vendors with all this data that is just siloed in each system of record, and we were just talking recently about how Armory is now at a point where we can start to execute on some of the broader vision of unlocking some of this SDLC data, and really using Spinnaker to be a core component of that. So maybe Isaac, you can talk a bit about what is coming, what is Armory involved in, and what are some of the timeframes around what some of this can look like, and then we can go into some more detail about it.
Isaac Mosquera: Yeah, the really big push for Spinnaker in the next 12 to 18 months, and this is something that Clay is actively working on, is really taking Spinnaker and turning it into a true platform. Or I should say, it is already a platform, but a more extensible platform that allows you to build on top of it, so that people will start perceiving Spinnaker not as a CD tool but as a platform that takes you 80% of the way there, and then you build that last 20% on top of it. Historically, a lot of the tools that we’ve used have been things like Jenkins, which was made for CI, which gets you 20% of the way there, and you’re supposed to build 80% on your own: you have to have your own data model, you have to have your own event bus, and you have to think architecturally, way in advance, about where you think your organization is going to go.
And so Spinnaker, from the very beginning, and I think Clay saw this because of what happened with Asgard: everybody forked Asgard to meet their customers’ needs; there was a fork at IBM; there’s a fork at Nike. And one of the main tenets of Spinnaker was that we would prevent forking and instead try to bring people into the community and extend Spinnaker for your use case, right? And if you think about it like that, every enterprise is actually very unique in its processes. Each component may be similar, like there may be a similar component in terms of security or compliance, like many people use JIRA or many people use ServiceNow, but how and when that gets applied is different for everybody. So being able to plug and play all of these things is really important, and to be able to build your own software delivery tool yourself out of these components that we’re building.
And so what you’ll see coming out of Spinnaker, which I’m super excited about, is actually stripping away a lot of the specific end use cases and turning those into Plugins that are supported by people like Atlassian, by Amazon, who’s heavily contributing to the community, and by people like Gremlin and Pulumi. That ecosystem will grow; it’s already growing. Then you’ll have a very stable core that is just about the core abstractions, with everybody else contributing onto that platform, and you as an enterprise will be able to take these plugins and mix and match based on your company’s needs, and probably more realistically, based on each business unit’s needs. Maybe you have a certain business unit that uses AWS and a security tool, and maybe you have some other business unit that uses GCP and a different security tool; all of that can be housed under this Spinnaker platform, and then you can have this Unified Data Model that Clay has been talking about, so you can build new views that we as a community have never thought of and that are very specific to your company. I’ve seen that now. Airbnb has presented on this, and there are other companies who have presented on how they took the core data model, which needs to be improved, as Clay mentioned, and built their own views, and we can totally see what people are going to be doing with this in the future. It’s pretty exciting.
DROdio: Okay, so we’re moving from this “fat core” to this “lean core and fat ecosystem” with vendors creating plugins, which is already happening and it’s amazing. I’m actually going to do another interview specifically about all the plugins. All these plugins are throwing off data. Let’s dig into that unified data model, that graph database for SDLC data, and let’s just talk a little bit more about how that will work and what it will enable. Let’s say that I’m on a central infrastructure team or automation team, or I’m a VP of Engineering, or maybe I’m C-level, a CTO or CIO: what’s going to become possible with all of this that I can’t do now?
Clay McCoy: So, right now we’re thinking of this in terms of a graph database (not to dive too deep into the implementation), and an API on top of that; it’s probably GraphQL. And what that lets us do is, basically, as a user, you won’t know that there’s Neo4j or whatever underneath; you’ll know what the API is, and you’ll be able to pick up this graph of SDLC data at whatever point you want, take whatever’s hanging off of that via relationships that you’re interested in, filter it down based on attributes, and see exactly the data that you want. So you’ll be able to ask some really interesting questions, like some of the basic things we were talking about earlier, like “what happened to produce this?” Sure, you can go look at a pipeline, and there are ways to tease out a lot of that information, but this also covers knowing what commit is running, or even historical data about what was running last week when we had an outage. But the cool thing about the graph database and GraphQL is the ability to ask unanticipated questions, so we will empower people to ask questions that we don’t even necessarily know in advance, like, “take a dependency that is a problem and ask what is deployed right now that uses this dependency.” Those are the kinds of things that can come out of this. You don’t have to think of the questions you want to ask about the data beforehand, write code, and have an endpoint that specializes in answering each question.
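A minimal sketch of the kind of traversal Clay describes, assuming a tiny in-memory graph in place of a real graph database and GraphQL layer. The `SdlcGraph` class, the relationship names (`RUNS`, `DEPLOYED_BY`, `BUILT_FROM`), and every identifier below are hypothetical and chosen only for illustration; they are not Spinnaker's or Armory's actual data model.

```python
class SdlcGraph:
    """Hypothetical in-memory stand-in for a graph database of SDLC data."""

    def __init__(self):
        self.nodes = {}   # node id -> attribute dict
        self.edges = {}   # node id -> list of (relationship, target id)

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, source, relationship, target):
        self.edges.setdefault(source, []).append((relationship, target))

    def follow(self, start, *relationships):
        """Start at one node and walk a chain of relationships, in order."""
        frontier = [start]
        for wanted in relationships:
            frontier = [
                target
                for node in frontier
                for (rel, target) in self.edges.get(node, [])
                if rel == wanted
            ]
        return frontier


# Illustrative data only: one server group, the pipeline that deployed it,
# the artifact it runs, and the commit that artifact was built from.
graph = SdlcGraph()
graph.add_node("sg:payments-v042", "server_group", environment="prod")
graph.add_node("pipeline:deploy-payments", "pipeline")
graph.add_node("artifact:payments-1.4.2", "artifact")
graph.add_node("commit:a1b2c3d", "commit", author="alice")
graph.add_edge("sg:payments-v042", "DEPLOYED_BY", "pipeline:deploy-payments")
graph.add_edge("sg:payments-v042", "RUNS", "artifact:payments-1.4.2")
graph.add_edge("artifact:payments-1.4.2", "BUILT_FROM", "commit:a1b2c3d")

# "What commit is running in this server group?"
running_commits = graph.follow("sg:payments-v042", "RUNS", "BUILT_FROM")
# "What pipeline produced this server group?"
pipelines = graph.follow("sg:payments-v042", "DEPLOYED_BY")
```

The point of the sketch is that `follow` is generic: a new question is a new chain of relationships, not a new purpose-built endpoint.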
DROdio: I want to dive a little bit deeper into these use cases, because if I’m an executive I may be making assumptions about what exists in the world today, and I may not fully understand, even within my own company, what some of the challenges are. Especially if I’m moving to the cloud and out of data centers, I don’t have as much control over my infrastructure as I did when it was in data centers, and so now there’s infrastructure running in AWS or wherever it is. Can you actually go a little bit deeper on that use case, Clay? So for example, understanding what’s running on which servers in production, and then going a bit deeper into how that’s going to be valuable for these companies.
Clay McCoy: All right. Well, in that case specifically, imagine that you learn of some kind of vulnerability in a certain library, and right now you want to know, “hey, are we using that? Is that running anywhere in production? Do we need to think about not only fixing that in the long term, but also get an idea of what’s going on right now?” It’s the ability to ask questions about your data that are pressing right now but that you hadn’t thought through beforehand.
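The vulnerability question can be sketched as a reverse walk over the same kind of graph: start at the affected library and follow edges backwards until you reach whatever is deployed. This is a minimal sketch under stated assumptions; the edge list, node attributes, and the `deployed_users_of` helper are all hypothetical, and a real system would run this as a query against its graph store rather than in application code.

```python
from collections import defaultdict, deque

# Illustrative data only: (source, relationship, target) triples.
EDGES = [
    ("sg:checkout-v012", "RUNS", "artifact:checkout-2.3.0"),
    ("sg:checkout-v013", "RUNS", "artifact:checkout-2.4.0"),
    ("sg:search-v007", "RUNS", "artifact:search-1.9.1"),
    ("artifact:checkout-2.3.0", "DEPENDS_ON", "lib:http-client-1.0"),
    ("artifact:checkout-2.4.0", "DEPENDS_ON", "lib:http-client-1.0"),
    ("lib:http-client-1.0", "DEPENDS_ON", "lib:parser-0.9"),  # transitive
    ("artifact:search-1.9.1", "DEPENDS_ON", "lib:json-2.1"),
]

NODES = {
    "sg:checkout-v012": {"kind": "server_group", "environment": "prod"},
    "sg:checkout-v013": {"kind": "server_group", "environment": "staging"},
    "sg:search-v007": {"kind": "server_group", "environment": "prod"},
}


def deployed_users_of(library, edges, nodes, environment="prod"):
    """Return server groups in `environment` that directly or transitively use `library`."""
    # Index edges by target so we can walk from the library back to its users.
    reverse = defaultdict(list)
    for source, _relationship, target in edges:
        reverse[target].append(source)

    # Breadth-first search over reversed edges.
    seen, queue = {library}, deque([library])
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    # Keep only server groups running in the requested environment.
    return sorted(
        node
        for node in seen
        if nodes.get(node, {}).get("kind") == "server_group"
        and nodes[node].get("environment") == environment
    )


affected = deployed_users_of("lib:parser-0.9", EDGES, NODES)
```

Because the walk is transitive, the checkout service is flagged even though it only depends on the vulnerable parser through its HTTP client.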
Isaac Mosquera: And here are a few Enterprise Compliance use cases. What I observe at many of these enterprises that we work with, and you’re talking about the Fortune 100 with big, very complex, hairy problems and thousands or tens of thousands of software engineers, is that you’re not going to have a homogeneous stack of infrastructure. You’re going to have some people using VMs, you’re going to have some people using Cloud Functions, you’re going to have some people using Kubernetes, and some people using PCF, all under the same enterprise. There is just no way to control it. And in fact, you probably shouldn’t control it, because you want your business units doing what’s best for them. And what’s best for them may not be best for everybody else. Some may want to use Kubernetes, some may want to use Lambda, and I think that’s okay. You want to enable them, because what you as a CTO or a CIO really care about is meeting a set of compliance and performance standards, right? You as a CTO don’t care about Kubernetes; what you care about is, “are my engineers productive? And are they shipping code at a fast enough pace, while at the same time being secure and meeting either our legal compliance rules or the internal standards that we have, which may be around having code reviews, or maybe around having security reviews a day before things go into production?”
Some of those rules are commonly shared, public things, like PCI compliance. Those rules are common across every single enterprise that adopts PCI compliance, and they have to apply those rules. Those rules have nothing to do with Lambda or Kubernetes or whatever. So what ends up happening at these organizations is that, because you have this fractured data model (the Lambda team doing their own data model, the Kubernetes team doing their own data model, everybody having their own data model), compliance and security are treated as an afterthought, and so in many cases they don’t even get automated at all. And it becomes such a pain for you and your developers. I would say, when I am talking to CIOs and CTOs and ask them “what is the biggest bottleneck?” it is always compliance and security.
Now, why is that? It’s because you have this fractured data model, and enforcing compliance becomes almost impossible at that point. It becomes a very big to-do; a lot of arguments are being had; things are being thrown over email. It’s not automated. But if you put everything under one data model, as Clay was suggesting, everything is flowing through the same thing. You can apply that one compliance rule or standard in one place; you don’t need to have it everywhere, think about it and re-implement it a bunch of times, or never actually get to it.
And so that, to me, is the power of what Spinnaker provides. Yes, do we support multi-cloud; can you be under one tool? Yes, you can be, that’s great, that helps you and that reduces the number of tools. But what it really enables is a place for you to have centralized controls; without them you may put your company at risk or compromise your security or compliance. Now back to what Clay was saying: questions that you’ve never thought of. How compliance works is that a compliance officer or compliance agent will come into your organization and ask you questions that you’ve never thought about, which you need to be able to answer to show that you’re meeting compliance. They’ll ask you, “can you tell me who deployed this, what, where and when,” and they’ll get into the nuances of those questions, and you’d better have a data model that can answer those questions. Doing what Clay is suggesting, having a unified database with GraphQL or some sort of system on top that allows you to query the data in ways that you never thought about, not only answers that compliance question; it also enables a whole other area of productivity around things like Mean Time to Resolution. We can start creating views that are specific to your organization, that we as a community will never think of but that you think are important for you. You will create these, and you’ll be able to debug things and fix things faster than you ever could before. You’ll be able to answer these questions around “who’s deploying what, where and when?” and “What was the git commit; who was involved?” and “What groups are these people involved with?” because you shouldn’t necessarily have the person who wrote the code deploy the code; you need somebody else involved in that, and that person needs to be from a different group.
We can prove all of these things, not only with this common database, but we can actually even start preventing them with policy being enforced before they even happen. So we can prove that they shouldn’t be able to happen and that they didn’t happen. All together under one house, and that’s what you’re getting the benefit from. If you don’t do that you’re just in this fragmented hell of processes and code and you never really get to that vision of automated SDLC.
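The separation-of-duties rule Isaac mentions (the person who wrote the code should not deploy it, and the deployer should come from a different group) can be sketched as a single check applied in one place. This is an illustrative sketch only: the `Deployment` record, the `group_of` mapping, and the function names are assumptions made for the example, not Spinnaker's or Armory's policy API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record joining deploy data with source-control data."""
    application: str
    environment: str
    commit_author: str
    deployed_by: str


def separation_of_duties_violations(deployment, group_of):
    """Return human-readable violations; an empty list means compliant.

    `group_of` maps a user name to a group name. Users missing from the
    mapping are treated conservatively: two unknown users count as being
    in the same group.
    """
    violations = []
    if deployment.deployed_by == deployment.commit_author:
        violations.append("author deployed their own commit")
    elif group_of.get(deployment.deployed_by) == group_of.get(deployment.commit_author):
        violations.append("deployer is in the same group as the author")
    return violations


def allow(deployment, group_of):
    """Preventive use: call before a deploy and block it on any violation."""
    return not separation_of_duties_violations(deployment, group_of)


# Illustrative data only.
GROUPS = {"alice": "payments-dev", "bob": "payments-dev", "carol": "release-eng"}

self_deploy = Deployment("payments", "prod", commit_author="alice", deployed_by="alice")
same_group = Deployment("payments", "prod", commit_author="alice", deployed_by="bob")
compliant = Deployment("payments", "prod", commit_author="alice", deployed_by="carol")
```

The same function serves both purposes described above: run over historical deployments it produces audit evidence, and called through `allow` before a deploy it prevents the violation from happening.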
DROdio: And I think the amazing thing about this is that when we go into these enterprises we see they’re using all of the vendors, all of the tools; they’re not using just one tool for one thing. Different teams, like you’re saying, Isaac, are using different things, and all that data is siloed in those systems. Being able to create this unified data model is really about the entire SDLC. It’s not just about software delivery; it’s what we mean when we say “Collaborate from Code to Cloud,” and it’s about unifying all of this. That’s what gets me really excited, and it sounds like that’s what I hear the two of you saying as well.
Isaac Mosquera: Yeah, I mean I’m really excited about that. And what I’m really excited about is to see, as we create this platform, what enterprises will build with it, right? Because we are moving away from giving you the answers and toward letting you construct your own answer that is right for your business. If you look at a lot of the tools in the SDLC, they’re giving you this opinionated workflow and it only works in that way. Well guess what, you know, JPMorgan Chase is going to be very different from every other bank, and they’re going to need to have their own opinions serialized into that process, and they’re going to need to be able to take data and morph it in a way that makes sense for their engineers and their people, and I think that’s where we as a platform are going to shine versus every other tool that you see in the infrastructure space, because we’re not a tool; we’re more of a Platform.
DROdio: Alright, so let’s close this out by talking about timing. This is really exciting.
Clay McCoy: A little bit about how we’re going to use the same Plugin system: you can choose what data you want to see and what questions you want to ask about that data, or make your own data ingesters and relationships that you can ask questions about. So it’s open-ended for that customizability. You can answer a lot of questions with just the data that comes from Spinnaker, but as you combine that with something that’s pulling data out of Artifactory, and Jenkins, and JIRA, or whatever services you use, the types of questions you can ask just get more and more powerful.
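One way to picture this is as a set of pluggable sources that each contribute edges to a shared graph, so that adding a source extends the questions you can answer. The sketch below is purely illustrative: the `Ingester` protocol and the three example classes are hypothetical, return hard-coded data, and are not the Spinnaker plugin API or any real Jenkins or JIRA client.

```python
from typing import Iterable, Protocol, Tuple

Edge = Tuple[str, str, str]  # (source id, relationship, target id)


class Ingester(Protocol):
    """Hypothetical contract: anything that can emit edges for the graph."""
    def edges(self) -> Iterable[Edge]: ...


class SpinnakerIngester:
    def edges(self):
        yield ("sg:orders-v003", "RUNS", "artifact:orders-1.2.0")


class JenkinsIngester:
    def edges(self):
        yield ("artifact:orders-1.2.0", "BUILT_BY", "build:orders#88")
        yield ("build:orders#88", "BUILT_FROM", "commit:9f8e7d6")


class JiraIngester:
    def edges(self):
        yield ("commit:9f8e7d6", "IMPLEMENTS", "ticket:ORD-101")


def merge(ingesters):
    """Combine the edges from every ingester into one adjacency map."""
    graph = {}
    for ingester in ingesters:
        for source, relationship, target in ingester.edges():
            graph.setdefault(source, []).append((relationship, target))
    return graph


def reachable(graph, start):
    """Return every node id reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _relationship, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


spinnaker_only = merge([SpinnakerIngester()])
combined = merge([SpinnakerIngester(), JenkinsIngester(), JiraIngester()])
```

With only the Spinnaker source, a server group leads no further than its artifact; with the other two sources merged in, the same starting point reaches the build, the commit, and the ticket.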
DROdio: Yeah, that’s incredible. All right, so, timing: Let’s talk a little bit about when people can expect to actually be able to leverage some of this, so we can set some high-level expectations.
Isaac Mosquera: Yeah, Clay, why don’t you answer this? You’re already working on the Plugin system now; we have an ecosystem already starting here. Now it’s about iterating and getting that escape velocity with the ecosystem itself.
Clay McCoy: We, the team, have been doing a lot of work on the Plugin system in collaboration with Netflix, and we’re writing plugins; we have customer and internal plugins that we’re writing. And now, my team is doing a lot of work to make sure that the Plugin system is powerful enough to do what everybody wants it to do. So, this data initiative is kind of a new thing that we’re just getting started on. We’ve got a lot of the tools to build on top of, and it’s just a matter of starting to pull in SDLC data and relating it and seeing where that goes. It’s pretty early; I’m hoping to be demoing some of this internally in the next sprint or two. And we’ll see how it goes from there.
DROdio: Well, if you’re watching this and you’re excited about this: if you are a leader at a large company and you’re excited about what’s coming, we’d be happy to talk to you. Or if you’re interested in joining the Spinnaker community and really starting to get involved in this, you can go to join.spinnaker.io. There’s a Slack workspace with just about 10,000 people in there, including Isaac and Clay and a bunch of the Armory team, so we welcome involvement from people that are excited about what you hear on this call and want to help build the future together.
Any last comments, thoughts? Anything else that you would add to that?
Isaac Mosquera: If you are a CIO or CTO or senior leader at one of these large organizations, we run collaborative sessions to help come up with a vision for your SDLC. We actually don’t really talk about Spinnaker very much. We really start at a very high strategic level and think about processes and think about what is the best process for your organization. And then we layer on Spinnaker at the end to see if it makes sense, but I’d be very happy to work with any engineering leaders to help you craft that vision, and to really think about something two to three years ahead; not just a tool that will solve your most immediate pain, but a bigger vision with more strategy.
DROdio: Yeah, these are incredible half-day sessions, and we’ve consistently had execs at Global 2000s tell us it was some of the best time that they’ve spent in an afternoon; really to learn and understand what’s happening inside their own companies, and regardless of whether or not Spinnaker and Armory are the answer, just that understanding and intelligence from us bringing all the best practices and knowledge that we have from working with so many other companies has proven to be really valuable. So, let us know if you’d like to do that; we really enjoy those sessions. So thanks for offering that up, Isaac. Anything else, Clay? Anything else you want to mention?
Clay McCoy: I guess a similar message out there from like an engineering slant: If there’s anybody that is interested in adding behavior to Spinnaker; is interested in writing a plugin, and the documentation isn’t getting you where you want, reach out to us on Slack or anywhere and we’d be happy to help you get started with that.
DROdio: Awesome. Okay. All right. Well, thank you for the time gentlemen. Really, really excited to see how this matures and appreciate the expertise.