DevOpsConference talks

The application of Agile and DevSecOps to space systems | Robin Yeman

In an era where we expect to see more growth in space in the next 5 years than we have in the last 50 years, it's critical that we can adapt to changing needs and deliver faster while ensuring we are safe and secure. Space exploration, with its inherent complexities and high-stakes missions, has long been synonymous with meticulous planning and painstaking precision. However, as the demand for more ambitious missions, commercial space activities, and enhanced safety standards continues to grow, the need for accelerated development and deployment processes becomes evident. The application of industrial DevOps to space exploration represents a monumental leap. This presentation unveils the transformative potential of applying Agile and DevOps methodologies to the development of large-scale, safety-critical cyber-physical systems in space exploration. Industrial DevOps pulls best practices from multiple bodies of knowledge across the digital engineering value stream, including systems thinking, model-based engineering, Agile, DevOps, cyber, machine learning, and more. We need to use all of the tools in our toolbox to reach the moon first and then leap to Mars.

About the speaker: Robin Yeman has over twenty-five years of expertise in software engineering, with a focus on digital engineering, DevSecOps, and Agile, building large, complex solutions across multiple domains from submarines to satellites. She advocates for continuous learning and holds multiple certifications, including SAFe Fellow, SPCT (Candidate), CEC, PMP, PMI-ACP, and CSEP. She is a Systems Engineering PhD candidate at Colorado State, researching best practices to deliver complex safety-critical solutions using Agile and DevSecOps.

Transcript

Hello. Wow, this is me. I spent most of my career, 26 years, at Lockheed Martin building a whole variety of things, everything from submarines to spacecraft, I tell people. And now, here at Carnegie Mellon, I work for the Software Engineering Institute, where my job is to support different space customers. So I work a lot with the Space Force, NASA, NGA, etc., and my goal is to enable them to build faster, deliver faster, and get capabilities quicker.

All right, so I have seen huge benefits with Agile and DevOps when you talk about applying them to software, right? Have you guys seen that? We've seen some huge benefits. So I was like, what happens if I look at that at the system level, applying it to the entire system? Because what I see is that we have people implementing, let's say, Agile for software, but then we're still doing a lot of traditional development for hardware, and the two don't always come together like they should.

And it makes sense. We've got some challenges, right? Long lead times: if I'm having to work with a supply chain, I have a long lead time. Very expensive test equipment, one of those things that people like to push off till later because, you know, if you delay it, it gets better, right? Many dependencies, next level. Very complex risk management. Once we connected everything, the attack surface was just, you know, out of this world. And then safety, reliability, and regulatory requirements. I break those up because regulatory requirements are a lot more about the policies and less about what's specific to safety, like hazard analysis and so on.

So let's talk about that long lead time. When I'm building satellites, one of the things I need to do is procure a whole range of hardware, right? And we even saw it during COVID: the car manufacturers could not get chips fast enough, which caused them huge delays in delivery.
You know, some of these companies have overcome some of these long lead times by doing things like 3D printing their entire rocket, right? Relativity Space is known for that. But most of us have to deal with those long lead times.

Very expensive test equipment: it costs millions of dollars to build one of these thermal vacuum chambers, right? These aren't just floating around. However, I keep making the case to bring it in earlier, because we're still going to spend the money, and if I can validate things much earlier in the life cycle, then I can actually buy down risk.

Dependencies. When we're looking at dependencies, we have systems of systems. A vehicle itself, a satellite, is a system of systems. But let's think about Planet Labs: that's a constellation of satellites, the next level of system of systems. If we think of large communication systems that have to communicate between ground, space, aircraft, and radar, these dependencies get way beyond anything that I can mentally hold in my head. I was just talking to Steve in the back room, going, "Hey, tell me about these nested value streams," because they're horrible in my world.

We have a lot of issues in space: things like the actual kinetics, solar interference, frequency crowding, or signal degradation. We also have contested space, right? This isn't something that we thought of decades ago. Jamming is a real issue, as are spoofing and denial of service. I can honestly tell you that 20 years ago, when we were building things like SBIRS, we weren't thinking about somebody being able to actually hack the satellite. Meanwhile, in Las Vegas, Black Hat held their conference and held a challenge called Hack-A-Sat. How many of us are going to be able to handle it if GPS goes down? That's how I got here today. I had to walk from my hotel.

Extensive attack surface: once we
connected everything, which seemed like a very good idea and is still a very good idea, there are multiple places to compromise these systems, and there are people who do nothing but look for ways to compromise them, with things that I haven't even considered yet. I know it sounds bad. We're going to get to the good news soon, but we did have to walk through some of the challenges.

High-stakes safety and reliability. These are just the NASA standards, the NASA 8000-series safety standards associated with anything that's going to be launched into space. It was interesting, because I was talking to some folks from NASA, and I only say that because it makes me sound cool, right? I'm not from NASA, so I'm really not cool. But I was chatting with them, and they were completely frustrated because SpaceX has made it so that they have to go faster than they ever have before, and they're very uncomfortable about it. They used to only have to deal with 10 or 15 launches a year, and now they're averaging 120 or 130. Because they still have to certify everything, their entire process has had to completely change. Personally, I think this is okay. They're not happy about it, but I was like, we actually do need to figure out how to move faster.

And then we have these serious regulatory and compliance hurdles. Most of the regulation and compliance is there to ensure safety, but the bureaucracy of it is completely separate. I am working on my dissertation, and someday I plan on graduating; it's taken a while. It's pretty interesting that the biggest complaint about applying things like DevOps to cyber-physical systems, by an order of two, according to my research, is about the impact on regulatory compliance, not safety. So it's just getting through the paperwork that's actually the bigger problem, which I thought was a little odd. These are just some examples.

Now, as I said, I've had the
opportunity to build a lot of different systems and see how things can go wrong in a lot of different ways. So last year my co-author and I published a book, and we called it Industrial DevOps. I don't know if it's a good title; it kind of stuck because we had started writing papers. It really is the application of DevOps, Agile, systems thinking, model-based systems engineering, and shift left. It's basically bringing all of the tools in your toolbox to optimize delivery of these large-scale, safety-critical systems, because one of the priorities is to get to Mars first, right? That's critical. Whoever gets to Mars first is really going to have the opportunity to do a number of things, like colonize first. There are also these things called the Lagrange points, and whoever gets to the Lagrange points first will control the flow of the supply chain, which is huge. I'm from the US, so it's kind of like the Wild Wild West: whoever planted their flag first made the rules. That's currently how that system works for space too.

So I have some ideas. All is not lost. I think that we can do this, and this is what we would call industrial DevOps. I'm going to walk you through some of these principles and tell you why I think they will help at the system level.

One: we already know that we need to organize around the flow of value. There was a great talk by Atlassian earlier, the first time I've ever heard DevOps related to coal mining. Amazing. And it's interesting to me that even then, in the '40s, cross-functional teams were the way to optimize delivery. Large organizations such as Lockheed Martin, where I worked, are organized like this: we've got program management, systems engineering; we have a lot of silos. It's pretty interesting, because not only do we have these silos of information, we've got silos of communication. If I'm in program management, I'll
probably talk about things like, especially if I'm on the leading edge, lean thinking and program management practices. If I'm in systems engineering, I'm talking about systems thinking, also very good. If I'm in systems design, I'm talking a lot about design thinking, which has a language unto itself: double diamond, diverge and converge. In hardware engineering, you hear a lot about rapid prototyping or revving. We come into software, and we do talk about things like Agile and DevOps. Then we get to test, and you've probably heard this before, but we've got shift-left thinking. And then we get to operations, and you hear people talk about ITIL, the IT Infrastructure Library. Not only do we have silos, we have a completely different language set to communicate with each other. Each one of those areas is trying to optimize the flow of delivery, but we can't even communicate. I don't know what you're telling me, because I'm using sprints and you're talking about revving your next board. So that's a problem, and that's one of the things we want to address with these cross-functional teams.

Next, multiple horizons of planning. How many of you have used Agile or DevOps in your development? And have any of you had, let's say, one- or two-week sprints? So how does that work if I'm building a system that's going to take years to get out? Well, I have to connect them. I don't want the integrated master schedule that tells you what Robin's going to do in the last week of October 2027, which I've seen before, just so you know, and which we know isn't true. But I've got to plan beyond two weeks. Here you can see Artemis, and I think again it's a language thing, because we already do this. Artemis 2 was set to launch this year. They had delays and problems because of actual tactical things that happened on the ground, so they updated it to 2026. But here's my goal: to get to Artemis 5, to get to Mars. And I have a plan, but it's
going to adapt and adjust based on empirical data from what is occurring on the ground. I refer to this as empirical planning over predictive planning. What I mean by that is I could have a 10-year master schedule, and believe me, Lockheed has many. But the difference between empirical planning and a master schedule is this: I'm going to use the data from each of these time phases. So let's say, for my day, I plan to complete 10 things today and I complete eight. If I have 10 of those days, I'm going to be 80% complete. Now I can either use that data to inform my next horizon of planning, or I can ignore it and beat people until they get back on plan. I can tell you exactly how that works: not well. So what I want to do is use the data that I'm getting from each horizon of planning. My sprint will inform my quarter, my quarter will inform my year, and my year will inform my five-year plan. We have to look at it with empirical data, as opposed to "this is my predictive schedule."

Data-driven decisions. We have some great tools at our disposal, and I've seen a huge amount of progress in this area. It's not that simulators and emulators are brand new, but the technology is getting better every single day. We can simulate what's going on in the cyber-physical system. We can leverage things like digital shadows, which are lower-fidelity digital twins: I don't have much data in there yet, and it's not integrated yet. I can leverage digital twins, which have two-way telemetry, so that I can use the things that are happening on the ground and the things that are happening in my spacecraft to continue to evolve the system. And 3D printers, or additive manufacturing, allow me to get data rapidly, so I can start asking: what does that breadboard look like? What does the brassboard look like? Can I apply this software to it? How does
that make a change? So we have the ability to use a lot of these tools in our toolbox to make informed decisions. Not rocket science, even though I guess technically it's for rockets.

Architecting for change and speed. We already know this: modularity is critical. Now, does this mean that I can automatically change everything to a microservice? I can tell you that the Defense Innovation Board at one point told me yes. I said, well, there are things like latency when I'm working on real-time embedded systems; I'm just throwing it out there. So I actually have to make a choice. Here you can see SmartSat. This is from Lockheed, and it's a software-defined satellite, as modular as it gets. I'm seeing a lot of this across the board: software-defined vehicles, software-defined satellites, software-defined communication systems. Even at the most modular, there's still a trade-off, but if I can modularize the capability, I can make changes to it with the least amount of impact. Right now I can have a satellite up there looking at weather, and I can change that out for another mission just by uploading new software. So you can really look at it as the hardware being the platform, and we're configuring as much as we can with software. It gives us the maximum modularity.

Iterate and manage queues. What we're looking at here is not just iterating on my software; I want to iterate on my design. I pulled this bottom picture from Scaled Agile, but you could have gotten the same picture or something similar from The Principles of Product Development Flow. One of the key things here is that it's not just iterating on the software. As I got a chance to interview different engineers around Lockheed Martin, I found that they were doing it anyway; we just didn't know we were all doing it. Talking to the hardware folks, they're like, yeah, we iterate those boards until we actually get one
that we're ready to actually apply. We iterate the hardware, we iterate the models; we just don't do it together. So if we do it together, we can optimize the flow of value.

I have a model. It's in a tool called Innoslate, which uses Lifecycle Modeling Language, a little different from SysML, although it does connect to SysML. I modeled a midsize LEO satellite in my Innoslate tool; basically the build, not the satellite itself, although I do have a requirements model. I built one using the NASA approach, Phase A through Phase D (they have a Phase E, but that's a disposal phase), which looks very much like your traditional waterfall. Then I modeled the second one using a series of MVPs: if I were king for a day, how would I build it? Right now it runs, and my Monte Carlo analysis shows that the midsize LEO satellite takes 6.2 years to build. Based on past experience, that does not seem out of the ballpark; that's pretty close. And the exact same satellite takes two years through the series of MVPs. There's still work to do with the model, but it does run. And again, I don't think I proved anything amazing: smaller batch sizes, regular feedback, rapid iteration. All of these things have been proven before, which makes it really hard to graduate, by the way, because they keep telling me I have to invent something new, and the more research I do, the more I find that it's there. We just don't follow it.

Cadence and synchronization. When I'm building or manufacturing, I want to remove all variance, or most variance. You guys agree with that, right? In product development, the goal is to remove bad variance and exploit good variance in the areas of innovation. Well, that's really hard; removing all variance is way easier. So what I consider is that, while I do want the system to integrate regularly and those things, I think the other thing that cadence
and synchronization does is remove noise from the system, so I have the opportunity to pick out the elements of variance that are good and remove those that are not. Here you can see I've got my avionics team, my environmental controls team, and thermal protection. There would be more teams, by the way, but this makes for easy pictures and easy math. They're not all going to integrate every time, but we do want a heartbeat. It's just like what the Atlassian folks showed, but for hardware I have to be more cognizant of it, because, as we talked about, I've got lead times and I've got supply chains. I can't make this argument enough, and I hear it at every conference, so again, I know I haven't invented it by any stretch of the imagination. But every time I would try to get a program director at Lockheed Martin to build that environment, that test equipment, early, they'd be like, no, no, we'll do that when the system's done, because it magically comes together. You've seen that, right? I know, I love magic. It's awesome.

All right, so this is a Carnegie Mellon example, and it's pretty small. I keep trying to get them to let me build a CubeSat; I think that would be really fun, especially with this continuous integration. But we're starting small. Here we have a continuous integration pipeline that both updates and validates the software as well as the firmware. Now, you can say the FPGA is still very much software oriented, but take that to the next step: adding that to the CubeSat, we could start showing that regular delivery. The reason I say this is that at Lockheed I had the opportunity to see systems come together a lot, so I could follow it, but most of my teams would only see their own slice. They never actually saw the whole system come together. I think it's really hard to explain the impact of waiting too late to integrate, and the impact of these silos, if you can't see it. So to me, if you had
something really small like a CubeSat, you could show updates to your requirements, let's say in Jira or DOORS, however you want, on day one. On day two you could show updates to your model: do I have to update the CAD model? Do I update my Cameo models? On day three I could update my digital twin to see what that should look like. On day four, make sure that I've updated any automated tests. On day five, actually implement the updates to the system, whether it be firmware, software, or hardware. I think on something so small, people would really be able to wrap their brains around it. With these really big systems, it's hard.

Shift left. I got a chance to interview some of the Formula 1 folks. You think that space is cool, but Formula 1 is super cool. They talked about their test environment; they have a full digital twin. They actually told me that if the fastest car on day one did not evolve every day of the Formula 1 race, it would lose. They evolve not just the software but the hardware, etc., for these cars throughout the entire race. Again, very expensive test equipment and very expensive environments, but environments you're going to have to buy anyway. So the earlier you do it in the life cycle, the bigger the impact it's going to have on your system. You guys agree with that? Awesome.

Then this growth mindset thing: how do we continue to learn? And I like this; this comes from SpaceX. We could have gotten something out. I can honestly tell you Lockheed would never, ever tweet something like this with a launch. People would be fired, right? If something exploded, people would be fired, which basically says your psychological safety is not quite what it should be, and you've got to get it right every time. Now here's what I would argue: if you have to be right every time, you're never going to make a change. We're back to just manufacturing the same ideas. So while I'm not
saying that everythi