AI Native development principles and practices | Patrick Debois
At The Future of Software conference in London, Patrick Debois shared how mastering the latest AI tools and understanding the underlying principles and patterns can help you navigate the space of AI Native development. These patterns include "Producer to manager," "Implementation to intent," "Delivery to discovery," and "Content to knowledge." About the speaker: Patrick Debois is a renowned expert in DevOps, DevSecOps, and their intersection with AI. He is a principal product engineer at Humans and code, and a co-author of the DevOps Handbook. Patrick's work focuses on helping engineering teams become more productive with AI tooling and on delivering AI-powered products with engineering rigour and good practices. He takes a pragmatic approach to his work, grounded in reality and experience.
Transcript
Thank you! I'm not sure we'll go back to the roots of DevOps, to be honest. I want to go out to the edges and the new things; that's probably more fitting for me. The intro said 'principles and patterns'. I haven't gotten to principles yet, so I'm just going to talk about patterns. Why the horses? The mist is there; we don't know yet what AI is going to give us. That's one way of looking at it. The other way could be the four horses of the apocalypse, so it could be that as well. And also because unicorns don't exist, and we're trying to make everything work right now.

All right, let's kick it off. Development, Copilot... There was an earlier presentation on this, and that's what most people think about: "Tab completion, very easy." But we'll move quite quickly. Then the chat came, and we started copying and pasting. "Oh, great. Suggestions." But why are we copying and pasting? Why can't we just say 'Apply' and be done? Okay, we got more confident doing this. Why have only one completion? Why not complete multiple lines at the same time? I'm trying to build up the whole picture here. Now it's predicting where my next cursor should be. I'm editing here; "Just go there, start your work there."

It started understanding our code bases. That was a little bit of a funny thing. Everybody was talking about fine-tuning, and we kind of went off that track and said, "Let's look at our local code bases." It was funny, because the big companies asking for fine-tuning were the first ones to say, "Not on our code." So this was kind of the solution: doing it locally, keeping it locally. And we went from one file, or tab completion, to multiple files now being suggested. It's taking more things into account, like the terminal feedback; it's not just your code completion anymore. It looks at our browser too. Also cool, right? It basically understands our loop: code, open the browser, take everything into account. Generating tests for test coverage, yes.
Although it's a little bit tricky, having it write the tests for its own code: 'Who watches the watchers?' But that aside, it is definitely helping us with that as well. And then came reasoning models. Instead of just doing the completion and the code, it started thinking more. Obviously it's not really thinking, but it breaks down our problems, reasons about what we should do, and then makes the suggestions. And one of the tools, I don't know if you've come across it, Devin, made a big splash saying, "We can just do that continuous loop." It actually looks at the terminal, opens the browser, does that kind of stuff. So we've come a long way from the simple tab completion most people know from Copilot to where we are now with the tooling. And then, why have one Devin? You can have multiple Devins working at the same time on your code base: maybe on this directory, maybe on that part. Good as well.

My main point is that it's not about the LLM anymore. It has expanded from the LLM to RAG, to the local indexing, to the functions it's calling. And it just goes on and on. That technology is very interesting, but we're not going to talk about it today. It just gives you a sense that when people discuss "Model A is better than model B", that's great, but there are a lot of other factors that go into making this work.

And there is a belief, or a hope, much like we had with autonomous vehicles, that one day it will drive for us, completely autonomously. Now, I don't know... In some cases it succeeds quite well; in other cases we're still not there yet. But look at the money we've invested in it. I do think that generative AI is actually different from the traditional AI it is built on. Generative AI is inherently prone to errors, so we should treat it differently. I'll come back to that later. And now we see new technology bringing new technology roles.
None of us were the experts, because nobody was an expert before in that new technology, which is really great. And obviously that's why I'm talking to you, because now I can say I'm also an expert in this. It's been mentioned that DevSecOps is near and dear to my heart, but I looked for a way to learn more, and for me that was engineering with AI. That's partly why I started thinking about that space and the technologies maturing. So I thought it was a good time for this.

We've seen this morning that it's more of the same, right? Oh, cloud-native. But how do we do VLANs? Now it's called a VPC. And it's the same thing with AI-native. We don't know exactly how it's going to play out, but it's going to be more than slapping AI on the surface. As we've mentioned a few times, we need to put it at the centre and rethink what we do. And we talked about the LLM-phant. The real question is: are we going to be out of a job? A lot of people worry, including me, for my children studying in IT, working in IT. What's their future? What will they be doing? Challenging.

Now, I started looking for some clues on what this could mean for us, and I found this visualisation quite interesting. Imagine all the things you do as a software engineer, or whatever role you have in the company. There are certain things that, when a new technology hits, will be enhanced by it, so you can do a better job, maybe faster, and so on. There are going to be new tasks you can do that you've never done before. And some pieces of work are just going to become redundant. So the unbundling of tasks is what we're heading for. A lot of people talk about code generation. Yes, that's one piece of the job a good software engineer does. So that might be replaced, but maybe we'll move into other parts. And this talk is a little bit about where we are moving next.
If these things are happening, and they are meeting the expectation of getting more autonomous and getting better, what is happening to us? I distilled it into four AI-Native Dev patterns, and we are going to dive into each of them. One, it's been mentioned before: we're going to express more intent rather than worry about the implementation. We're going to produce less code and do more reviewing, becoming a manager of the code. We're going to look at what is being created, and hopefully get some knowledge out of that. And if we save a lot of time, maybe we can spend it on discovering what is actually the thing we need to build. Those are the four patterns we'll go through.

The first one: from producer to manager. Writing code with AI, yes, you can push more, but the pull requests get longer, so the review time per PR just goes up. And the cognitive load actually rises, because now we haven't gone through the process of thinking it through and creating it ourselves, but we still have to jump in at the moment of reviewing. We've all done PR reviews for our peers, and that takes quite some time. The more code is produced this way, the more the cognitive load is going to rise there as well.

Now, I want to show you a few examples where tools are starting to help us. Yes, we have the diff view. What is it, red and green? I can't even see; I'm colour-blind. There is a chat view, which is very verbose; I don't have time to read all that stuff. And then I found a very simple thing: Cursor actually describes what is being changed, replacing that piece of code with just a text explanation. It's a lot simpler to review the code that way than with all of the previous views. It's just an example of tools maturing and looking for ways to help us deal with these problems that we see, of more code. Sometimes it's multiple files; I already mentioned this. So they take us through one step at a time and have us review the pieces.
That's another way of reducing cognitive load: breaking it down into smaller things that we have to review and give feedback on. Sometimes making a diagram can help reduce cognitive load too, because looking at something in code is totally different from looking at it in a diagram. You can immediately spot the difference, or what happened, or what's not connected. And this brings up one of the ideas called moldable development, where your IDE starts adapting to the task at hand. So for a review, whatever the problem is, your IDE might not just be about your code but about the problem you're trying to review. This is one way of reducing cognitive load to get there.

Now, sometimes tools already take the next step: "Let's auto-commit. Skip the review. Let's do it." Okay, if you're certain enough. But you still want to be able to do the review, and we want a safe space. So what you see, the further they get into generation and agents, is that they start creating checkpoints: "Do you want to roll back?" "Are you happy? Do you want to do this?" That is another way of dealing with a situation where you might have been uncertain, but you do it in a safe environment and you still get some part of that review.

This is an interesting one. This tool actually says, "AI cannot touch this file." This is more about who gets to do what, and then doing the review. "This is still my manual work that I want to do." It brings up a whole new world, like IAM on AWS. That was already complex; imagine now specifying what the AIs can do to our systems.

And it was mentioned before on the panel: if we blindly accept the code being generated, that's risky. There's been a study showing that people accept more code on weekends. Why would that be? "We don't have time for this. Accept." That blind automation creates another problem. The post-mortems are going to be interesting: we don't know who pushed the button.
And who's responsible, still? So that understanding of the code is also going to move towards understanding things in production. And maybe we, as users, as humans, have to clean up.

Now, the way I framed it in my head, putting my DevOps history hat on: early on we had infrastructure as code and CI/CD. That was all about getting it to work. Look at the industry that came after that: it was dealing with the failures. Can we understand the failure? Can we detect the failure? Can we design for when the failure happens? And lo and behold, the systems got so good that we weren't good anymore at dealing with failure. We had to train ourselves. So all the time saved went to spending time when the problem occurred. You can't escape this. Now, you might be willing to take that risk as a company. Good for you; that's a choice you make. But what we're doing right now is shifting complexity around in a certain way.

So, the next pattern: we're going to specify more of what we want instead of worrying about the implementation. What are a few signs I saw? Yes, the obligatory vibe coding: we just accept what it does; we tell it in a chat prompt, and we keep doing this continuously. Great for prototyping. When the AI gets really good, maybe we don't have to worry about it, but still. You see this appearing.

A lot of people started not just typing prompts; they started building Markdown files with their specifications. Then, similar to typing them in a prompt, you add the specification of what you want: the directory structure, the type of language, the conventions that you use. And it just starts generating from that. So there's a trend of reusable prompts, and then moving them into a file. You see the tools adapting to this by having '.ai', or whatever the flavour is, where I can specify this so it gets reused. And you can reuse this in your own project.
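To make that concrete: such a reusable specification file might look something like the sketch below. Every filename and detail here is made up for illustration; the talk only notes that tools pick up Markdown specs from a project-level folder such as '.ai'.

```markdown
<!-- .ai/conventions.md — hypothetical reusable spec the assistant reads -->
# Project conventions

## Stack
- Language: TypeScript (strict mode)
- Framework: Express for the API, Vitest for tests

## Directory structure
- `src/` — application code, one module per domain concept
- `tests/` — mirrors `src/` one-to-one

## Conventions
- Prefer small pure functions; no default exports
- Every new endpoint ships with a test in the same PR
```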
You can reuse that across your team. You can reuse it across your organisation. Those are all specifications that used to live somewhere, in a wiki or wherever, that you can now bring into the process. And I talked about the reasoning models: we can now take those specifications, break them down into smaller tasks, and then it does the implementation for us. Then we probably get into the review cycle, which was the first pattern we have to deal with. So we went from implementation towards intent-based coding, and all of the tools are heading that way. "Specify what you want, we'll take care of the rest; but when it fails, you're still on your own." That's where we are.

And so the interfaces are starting to change: the products are not in the code space anymore; they just ask you to manage product requirements documents. Ironically enough, between product and platform engineering, maybe there's something there that would do more of the product side, and we're actually working on that. For those who lived through the times of 4GLs, a long time ago, everything was promised by just defining the specifications, and the automation would do the rest. You know, specifications are messy. Bring everybody into the room, start collaborating, have great discussions, and you might disagree. "I think we should build more performance." "There is no budget for this." You see conflicting views. But I'd rather spend my time there, aligning on the specifications, than arguing over the linter and that kind of stuff. So that's moving towards requirements, and it becomes specification-centric: I feed it in, it will do it. We're not there yet; I'm not saying we're there yet, but with the breakdown, the technology is trying to get us there as well. Now, maybe there is an in-between state where the prompt holds part of the specifications.
For example, when I build a project, I iterate on it, and I add things to the Markdown file step by step. In my experience, it has never completed the job when I gave it the whole specification at once. So there is something about the iterative process that actually makes this work right now. We don't feed in all the specifications; you do it in chunks, similar to what we did with the task reviewing: smaller bits and pieces, building it up. You do need to have that test harness we talked about, because otherwise it will change things and you will not notice.

And this is probably the farthest you can stretch this: if AI is generating all that stuff and we're just giving specifications, why are people still buying SaaS software? Because you could take your specification and build the product in-house. So it's a tricky business. But it all comes down, first, to the belief: "Are we going to get that far?" "Are we going to get that good?" And there are a lot of theories. With the amount of money we're throwing at it, it must become good. Again, it will have flaws, and we'll have to deal with them. But it shows you, if you think ahead, what people are almost willing to believe we're getting to. They might be right; I can't tell you. But that's where we're heading.

The next one. So, we dealt with the reviewing and the generation. We went a step earlier and said, "Now we can actually tell it what to do." But now we have to figure out: what do we actually need to tell it to do? So instead of focusing on the delivery, "just do this thing from the specification", we can run more experiments, because creating things is getting cheaper. We have multiple options to choose from. The simplest example you might have seen is image generation: it's not generating one image, it's generating five. Pick one. Generate five pieces of code; pick the best one. That is the pattern here.
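That "generate five, pick the best one" step is essentially best-of-n sampling with a scoring function. A minimal sketch in Python, where `fake_generate` and `fake_score` are hypothetical stand-ins; a real setup would call a model for each candidate and score candidates by running the test suite, a linter, or a review pass:

```python
import random
from typing import Callable

def best_of_n(generate: Callable[[], str],
              score: Callable[[str], float],
              n: int = 5) -> str:
    """Generate n candidates and keep the highest-scoring one."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Stand-in generator and scorer, for illustration only.
def fake_generate() -> str:
    return random.choice(["draft", "longer draft", "longest draft yet"])

def fake_score(candidate: str) -> float:
    return float(len(candidate))  # pretend longer drafts are better

if __name__ == "__main__":
    random.seed(42)
    print(best_of_n(fake_generate, fake_score, n=5))
```

The design choice is that generation and judging are separate callables, so the same loop works whether the judge is a test run, a heuristic, or a human picking from a list.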
A product owner already knows to collect product ideas from a lot of places. We could do this for a product, but we could also do it for engineering: all the ideas that the engineers have. We can build some of that research-automation capability and use AI to help us come up with different strategies or ideas. We can use that same visual technique to find gaps: "We've never thought about structuring things that way." There is a discovery process, and a visualisation aspect, happening there as well.

There is a story about a company selling fashion clothing that does contracting with legal. They feed in all the information the other party provides and have the AI ask a bunch of questions. And that verification, "when the AI can't answer a question, there must be something wrong in the contract", that kind of verification and discovery is quite useful.

We can do complete designs. Because discovering what a good design is, what people actually want, means generating a few of them. Lovable is one of the typical examples: it's becoming so easy that you don't have to be a coder, and you can get working prototypes in front of your customers way faster for end-user feedback: "What do you want?" "Is this a good idea or not?"

And this one got me thinking. At the edge, when we've pushed everything out to production and the end user doesn't like it, could we generate the code and see if they like that instead? Not just a feature-flag toggle, but actually changing the behaviour and seeing whether it's something the customer wants. So the discovery can be done before the coding, but it can definitely also be done in the production system. As more of our time frees up, maybe we can spend it, unbundling our tasks, on that kind of work.

The last one: going from content to knowledge. 'Content' here is quite a liberal term, and naming these patterns has been hard, so bear with me on this.
You've already seen that the more content we have, the more we can feed into the context of those systems, and the better the answers become. If you have bad content, that's challenging, because it just repeats your errors. So knowing what good looks like is important here. It could be your documentation, but it could also be the guidelines we work with. Or consider that how people build software in the Python ecosystem is different from how they build it in Node.js. All that industry content we can turn into knowledge as well.

We can take production information. Or think of documentation that becomes outdated when we change things in the code. One example was taking OpenTelemetry information and feeding it back: what should we do with it? Imagine you are in your IDE, changing the code, and it tells you, "That little piece of code you're changing is used by a million users. Are you really sure?" That is connecting two dots. Now, it's interesting: when I talk to the code-generation tools, they say, "We're in the silo of code generation." When I talk to the Ops people, they are in the Ops silo. So we'll still have to connect the dots somehow to get that feedback. But that's going to be the interesting part.

Writing incident responses, learning from what has happened, definitely documentation, and then turning it into knowledge, so we don't repeat the same mistakes when we're coding or when it happens again. But it can also be something seemingly trivial or 'not useful'. We have an existing code base, and somebody new is being onboarded. Let's take that code base and turn it into lessons. Because we have information about the prompts we built; we keep track of that stuff. We have the product requirements. So if somebody asks, "Why are we doing this again?", well, here's all the knowledge you need. Onboarding new devs: what are your standards?
That turns it into knowledge as well. Even keeping track of, "The feature is not like this now, but in the past it was, and we changed it for this reason": that is knowledge, and it's often lost, that understanding of what the product reasoning was. And then, maybe in two years, somebody tries the same thing again. That is another piece of knowledge we can capture. I find this particularly interesting in Devin,