Beyond DevOps: The rise of full-stack platform engineering | Peter Birkholm-Buch
In this talk, Peter explores the evolution of DevOps and introduces the concept of full-stack platform engineering. While DevOps promised to bridge the gap between development and operations, it often resulted in Ops teams scripting infrastructure without applying key software engineering principles. Platform engineering takes this further by treating infrastructure as a product, raising the abstraction level, and enabling developers with scalable, reusable solutions. Peter will explain how they made this shift at Carlsberg and why the future of infrastructure lies in full-stack platform engineers who can think like developers and build for the future. About the speaker: Peter Birkholm-Buch is a senior technology leader with over 35 years of experience in software engineering, architecture, and cloud computing. As the Head of Software Engineering at Carlsberg, Peter leads a global team of 75 engineers across Portugal, Malaysia, and the Philippines. His expertise lies in driving innovation through cloud-based solutions, platform engineering, and the alignment of DevOps practices. Peter is passionate about improving Developer Experience and ensuring compliance and governance across large-scale engineering teams. He has a unique ability to blend strategy with hands-on technical leadership, making him a sought-after speaker and thought leader in the industry.
Transcript
Thank you. I hope the crowd is here for the content - because I didn't bring any beer. [audience laughs] And definitely, we're in the wrong room. We should be in next door. Thank you for showing up. I'm Peter. I run software engineering in a department called Growth Products. We changed names quite a few times over the past few years. I have Elina with me here in the front row, she used to be on my team as well. We do primarily e-commerce stuff. So, we sell beer. But we do a bunch of other stuff as well. And over the time, we've managed to intake a lot of different products, - all on different kinds of technologies, done by different people over time. And we struggled trying to integrate all of that in a clever way. So, this is part of that. Before we do anything, I want to show you guys - a short one-minute snippet of the video that we did with GitHub, - which explains a little bit about Carlsberg and why we are here - and how we do software. [calm music] This may look like tasty beer. And it is. But it's so much more than that. It's hundreds of years of legacy, - science, - and mastery. Started in 1847, the Carlsberg Group is one of the leading brewers in the world. Carlsberg invented the pH scale. They discovered how to purify brewing yeast and shared it with the world. And they're committed to a zero carbon footprint in the years ahead. But today, this beer also represents thousands of lines of code, - a digital transformation, - security embedded in the workflow, - and the world's most widely adopted AI developer tool. Because this is Carlsberg beer. And they use GitHub. [music ends] [audience laughs] Not trying to make a plug for GitHub, - but it was what kind of started everything. Before we go ahead, is there any developers in the room? Thank God. Dark mode. [audience laughs] Yeah, it's all downhill from here. [audience laughs] This is the story of how we got started - about five years ago to where we are today. 
So, some of the things I'll tell you, you're probably going to go, - "Oh God, we're already doing that. And why didn't you do that sooner?" But remember, this is where we were five years ago. So, bear with me. A few things happen in parallel, and then in the end, - we'll get to the full stack platform developer, - which I think is a nice play on the full stack developer. But anyway. So, in the beginning, we had this really horrible house - that was built on every conceivable technology known to man - that you will find everywhere in every company - if they've done something for a little bit of time. We had a bunch of different things for source code. Think of one, we had it. We had a bunch of different things for CI/CD, - and we had a bunch of different things for storing artifacts. And it forced our developers to get out of their flow all the time. So, imagine that you're doing something, - I don't know, writing, and you need a pair of scissors or something. So, you have to pick up your paper and go to another desk, the scissor desk, - do your paper cutting and then come back to your writing desk. That's what it was like to be a developer. They had to jump around all these - different platforms and hoops and remember all their passwords and everything. And it was a major hassle. So, right about at that time, - around five years ago, GitHub Actions came out. I think they stole it from Azure DevOps. Does anyone agree? And we thought, hmm, this looks pretty cool. Why don't we try and see if we can consolidate everything on GitHub? So, that's what we did. So, we turned GitHub into our software development platform, - and we got rid of a lot of other stuff. We got rid of GitLab, Bitbucket, - ADO, Jenkins, Nexus, SonarQube, a bunch of stuff. We use packages for container images. And it just made the lives of our developers a lot easier, - because now they had to worry only about a single platform. 
And it totally streamlined our workflow for developers, - so they no longer had to jump around hoops everywhere. So, if you were in some of the talks earlier this morning - about value stream and how many tools you could actually fit on a slide, - we were there. Today, we have GitHub. That's our software development platform. And I think that makes it a lot easier for us to be both a developer - and, as it turned out, also to be a DevOps person. Can anyone recognise this scenario where you have a ton of everything, - and you're trying to streamline it to make it easier? Yeah, hands up. Yeah, yeah, we know who you are. Don't worry, we've all been there. So, now we have GitHub. And then, Copilot came out. And I was watching the GitHub Universe. I remember that. I can't decide if the most important moment - of my life was having my kids born - or watching the GitHub Copilot developer demo at Universe - because that really blew my mind. And I thought, holy... We have to have that. And I got access to the preview, and I started fooling around with it, - and I thought this is going to just blow my mind and blow everyone's mind. So, we started rolling out as soon as we could. It was fascinating to watch how GitHub Copilot - impacted developer experience in my team. Because developer experience is about productivity, impact and satisfaction. Probably a bunch of other things as well. But Copilot impacts developers' productivity - by making it a lot easier to understand other people's code. If there's one thing we hate as developers is to get - someone else's code under our fingers, and we're like, what is this? Why did they do that? And Copilot can explain it for you, - and it's a lot faster to get started. Documentation. [exhales] The favourite part of every developer's life. You just ask Copilot to create the documentation - and insert it into the code, boom, you're done. That's a lot of saved keystrokes. Make your life longer. Also, scaffolding. 
Every time you're trying to build something new, - you have to build a lot of scaffolding stuff to get going. Copilot will do that for you so you can get to the fun part a lot faster. It helps you learn. If you're trying to learn a new technology, it holds your hand. So, rather than having to go back to documentation, - it will provide you with suggestions, - and it makes it a lot easier and a lot faster to learn new stuff. And that's really impactful. They say that it saves you 50% of the time. I've looked at the GitHub Copilot metrics, - and it turns out that about 20% of the lines of code - suggested by Copilot, at least in our tenant, - is what ends up being accepted. It kind of resonates with my understanding anyway. So, if you ask for something, and you get 100 lines of code, - you end up accepting 20. But remember, it's a lot easier and a lot faster - to review and reduce than it is to write. So, you're kind of rolling downhill rather than having to go uphill. So, does 20% accepted lines of code - in the end translate into 50% saved time? I don't know. Possibly. We don't, you know... Where are you? I saw you over there, the Atlassian guy. There you are. Yeah, so we don't measure those metrics. We could, but we don't because we trust our people, right? So, we don't go, you know, [audience laughs] how many stories? How many lines of code? How long did it take? Time registration. We don't do that. So, we just go, "Are you happy?" And they go, "Yes, cool. I trust you." So, looking at satisfaction from our developers, - if you go look at the movie, there's a guy, João, - and he says, "I really love this. It makes my life a lot easier." So, I'll just have to trust my developers - when they say, "Yes, it really saves 50% of our time." So, getting Copilot into our software development platform was very impactful. The next step was security. I have always been kind of a conscious guy. 
I mean, I've never had, at least to my knowledge, - I never had viruses or malware or anything on my laptop. And if anyone comes to me and says, "You can't be a local admin", - I'm going to shoot you on the spot. But you always, as a developer, you get this, you're unsecure, - you're writing stuff, you don't need to know the admin password. The advent of GitHub Advanced Security, where we could inject security - into the developer workflow was extremely impactful for us as well, - because rather than having security be an afterthought, - you know, you write something, you deploy something, - and then the security team, they put all kinds of tools on your code, - and they find all kinds of issues, and they stick it in a backlog. And what happens to security issues in a backlog? They sink to the bottom, and they're never found again. So, what we do is we scan the code while you're doing it, - and then we block you from being able to merge it. So, if you're developing, you have security issues in your code, - you're faced with it right there. You cannot go home from work unless you fix your security vulnerabilities - because we're blocking your pull requests. And that turned out to be extremely impactful. And we eradicated thousands and thousands and thousands - of security vulnerabilities in no time because it turns out that - people really want to merge those pull requests and get on with their jobs. So, that was really cool. So, I highly recommend you guys looking at something like this, - integrating security into your developer workflow - rather than having it on as an afterthought. One thing we are struggling with a bit, and possibly everyone, - is there is this tool fragmentation going on within DevSecOps - where there is a bunch of extra tools being added on, - and they create all these signals. 
And how do you then avoid having, you know, signal death - where you as a developer are now being tasked - with consolidating and aggregating all these signals from everywhere - and then deciding which one is actually important? Which one do I have to fix? And is this even a security vulnerability - in some environment somewhere no one cares about? So, a bit of a warning, I guess, for me is that - if there's any security people in the room, - make sure that security is baked into the developer workflow, - and it's not bolted on as an afterthought. That makes lives a lot easier for developers. But let's get to the interesting part. Originally, DevOps, our DevOps team was, - you know, they were really the Ops guys, - we just gave them a new T-shirt, and now they were the DevOps guys. They were still doing this stuff, you know. They were still really, you know, give me a Word document that describes - how to deploy this that's on a USB stick to my very safe and secure environment - that sits now in the cloud and not in the basement. And, oh, by the way, you still need a jump machine to access it - because everything is anonymous, - and we use network endpoints to make it safe. And then, our teams started scaling. And we had these poor guys in our DevOps team - creating infrastructure by hand, and they simply ran out of time, - and they became the most significant bottleneck in our entire organisation. And that, obviously, is not something that scales very well. And so, I remember again, there's a bit of history here. So, Terraform came about, - and we started looking at it as like, hmm, this is really cool. Maybe our developers can do Terraform. [audience chuckles] No, because Terraform is really like the assembly language for infrastructure. If you ask people who are used to type-safe, - you know, compiled languages, - to suddenly do scripting in Terraform, they're going to run away fast. So, that didn't really work out either. 
So, we were looking at how can we make our DevOps people really go away? How can we create this golden path for developers - where we, on one hand, want them to be able - to create infrastructure on their own, - but we want them to use our Terraform script, - and we want to make sure that they do it in a way - so that when they build infrastructure, it follows the guidelines, - you know, compliance and policy and security - and everything that we have imposed on us, - and that we, obviously, want to make sure that we are secure and safe. So, we started thinking, how can we productize this? So, we created Gaia. Gaia is the god of the earth, by the way, in ancient Greek mythology. And we created modules in Terraform. And then, developers could create infrastructure - by simply just filling out... in the end, - we arrived at the point where you only have to fill out basically the name - of a Lambda if you're trying to create a serverless function in AWS - or the same in Azure, we abstracted everything away. So, the only thing you had to know was the name of the thing you want - and how big you want it, and then the modules do the rest. So, developers no longer had to think or care about, - how do I get to my thing? How do I set up an API gateway in front of this? What's the network connection? All of that stuff was abstracted away - and done in modules in Terraform. And that turned our developers into really infrastructure creators. They could just do pull requests, - and then the DevOps guys would approve those pull requests. And then, we had workflows that would deploy the code, the infrastructure. So, they went from being totally overworked and overloaded - to just sitting in the corner going, "Hmm, what do we do now? Because the developers are now doing our job." But that's what it was all about. 
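[Illustration, not part of the talk.] The abstraction described above, where a developer supplies only a name and a size and the module fills in everything else, can be sketched in a few lines. This is a minimal Python sketch under stated assumptions, not Gaia's actual code, which the talk says is written as Terraform modules. The names `FunctionRequest`, `expand`, `SIZES` and `PLATFORM_DEFAULTS`, and every default value, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical size presets (memory in MB); the talk gives no real values.
SIZES = {"small": 128, "medium": 512, "large": 1024}

# Hypothetical platform-owned settings that the developer never has to fill in.
PLATFORM_DEFAULTS = {
    "api_gateway": True,         # gateway placed in front of the function
    "private_networking": True,  # network wiring decided by the platform team
    "encryption_at_rest": True,  # compliance baseline applied to every request
    "dashboard": True,           # monitoring dashboard created alongside
}


@dataclass(frozen=True)
class FunctionRequest:
    """Everything the developer fills out: a name and a size."""
    name: str
    size: str = "small"


def expand(request: FunctionRequest) -> dict:
    """Turn a two-field request into a full, policy-compliant configuration."""
    if not request.name or not request.name.replace("-", "").isalnum():
        raise ValueError(f"invalid name: {request.name!r}")
    if request.size not in SIZES:
        raise ValueError(f"unknown size: {request.size!r}")
    return {
        "name": request.name,
        "memory_mb": SIZES[request.size],
        **PLATFORM_DEFAULTS,
    }
```

A developer would call something like `expand(FunctionRequest("checkout-api", "medium"))` and submit the result through a pull request. The point of the design is that the policy decisions live in one place owned by the platform team, so a request cannot opt out of them.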
That was how we created this golden path - of developers suddenly being able to create the stuff that they needed - without having to talk to DevOps people or Ops people or anyone else. And they didn't have to understand any of the difficult intricacies - of creating infrastructure in a secure, safe way. Because at the end of the day, we all want that. But if you're a developer, then everything that's above - the operating system, at least that's how I think, - If someone comes to me and says, "What's the CIDR block for this?", - I'm going to die. I don't want to need to know that information. I just want to create something, - and then have it talk to another thing over HTTP and using DNS. I don't care about the rest. Under the surface, the operating system and down, I don't care about that. And we accomplished this using this Terraform approach. If you have any questions, actually, just jump in. I'm going to repeat the question. Is GitHub Advanced Security AI driven? It is now. So, it does a couple of things. So, we use something called CodeQL for static code analysis. So, it uses pattern recognition to find out - if you have unsafe practices in your code. I don't think it's AI driven. I think it's more like static code analysis, really. We use something called Dependabot that scans your dependencies - for security vulnerabilities in your dependencies. That now has an AI feature called Autofix. So, if you have a security vulnerability in a dependency, - it will create a pull request that fixes that security vulnerability for you. So, fixing security vulnerabilities using Dependabot - ends up being approve PR, approve PR, approve PR, - rather than having to go in and fix it on your own. We also extended CodeQL to use third-party tools. So, when you do a pull request, - we will build your code as if we're deploying it. And if there are any binary artifacts that we create, - we will scan those for security vulnerabilities. 
And if there are any security vulnerabilities, - we're going to block your PR. So, that allows us to always be sure that there are - no security vulnerabilities at the time of writing, so to speak. So, that's the level of AI that we have in it, - the Autofix from GitHub. So, the size of my team, - we scaled to 150, and right now, we're 75. But I see no reason why you couldn't scale this to any size - because we're scaling on the back of GitHub, so... I mean, if it doesn't work for them, then it works for no one, I guess. So, the question is, did we receive any pushback - from the businesses if we're blocking pull requests from being merged? Yes and no. The trick is that you don't start coding and then merge after two weeks. Hopefully, you merge at the end of the day. So, you know well ahead of the final day when you're supposed to be done - if there's something lurking, if there is an issue. Have we had issues where, like, the night before - a critical security vulnerability came out - that would actually block us from deploying? Yes. Do we then circumvent it and deploy it anyway? Depends. [audience chuckles] If it's like there's a huge server-side something going on - in the credit card payment or something, - then we're probably not going to deploy it. If it's something that affects something in a non-internet facing service, - yes, we're going to go ahead and deploy it. We've not had any issues with the business - interfering with this whatsoever because we're actually fixing it - way before they get visibility into it. So, we're kind of taking security out of the loop, so to speak. The rise of full-stack platform engineering, - I'm trying to make fun of full-stack developers here - because I've always kind of said they don't exist. Because there is no way for anyone to know everything. 
But I think what we managed to do here with our DevOps team is that we, - because they're now thinking of this, - of the way that they build infrastructure as a product, - so they're now thinking of the developers as their customers. So, they're now, okay, how can we enhance this? How can we turn this into a product where developers are our customers, - and they don't have to worry about anything? So, then they started gold plating, right? So, now when you're rolling out infrastructure, - we create dashboards automatically in Datadog. So, if you were rolling out code, - we have dashboards created automatically for you. We have documentation created automatically for you. We have API specifications rolled out to our API platform automatically. It's pretty cool. And it's basically the product, our DevOps guys, - they don't have anything left to do other than to maintain this thing, - because the developers do all the work on their own. So, I think that's how we went - from assembly language to really creating products for our developers. It's how we allowed our DevOps guys to stop being DevOps guys, - to start being more of full-stack platform engineering type people. They're creating products for our developers to use. They're not Ops people anymore. So, I think that is how we use this Gaia product. I'll put out a blog post. I've had many requests - to detail what it is. I'll write something up. We used it to elevate our DevOps people from being, - go create this piece of infrastructure for me and do it yesterday, - to being people who are actually now building - an infrastructure deployment product, - and they're now product people rather than DevOps people. And that is how we found out - that full-stack platform engineers - are as rare and valuable as dinosaur unicorns. And we all know that dinosaur unicorns are pretty special. [applause] Thank you. [outro music] [music ends]