10 things we hate about platform engineering | Emma Dahl Jeppesen and Dan Grøndahl
Platform engineering is the latest blockbuster in the tech world. Still, like any major production, it comes with its share of potential pitfalls and dramatic twists when stepping into this new frontier. Join Dan Grøndahl and Emma Dahl Jeppesen from VELUX to hear their biggest pain points in platform engineering, explained through analogies from popular movies. This talk isn’t focused on the usual tools or technologies; you won’t find a single line of YAML here. Instead, we focus on the fundamental misunderstandings and missteps that can derail even the most well-funded platform initiatives. From the executive suites’ elusive buy-ins to the gritty realities of inadequate developer experiences, we dive into the top ten common errors that many encounter but few talk about openly. So grab your popcorn and expect to be entertained, maybe even learn a thing or two!
About the speakers:
Emma is a Platform Advocate for VELUX, formerly a Product Manager within the DevEx and platform engineering space. Initially working in the media industry, she fell in love with the product, organizational, and cultural aspects of software engineering when she randomly ended up as a Product Owner for a DevOps team. Emma is passionate about helping developers achieve their goals, gathering insights, and ensuring user-centricity in platform teams.
Dan is the Platform Owner for the CCoE team in VELUX and is responsible for building great internal developer products for VELUX’s software engineers. An experienced DevOps engineer, he’s spent the last 5 years as a consultant, guiding many organizations in their DevOps, platform engineering, and DevEx efforts. Dan’s focus is always bettering the organizations and people around him, often emphasizing cultural and human aspects.
Transcript
[Dan Grøndahl:] Thank you for having us. Yeah, catchy title, right? 10 Things, or almost 10 Things, We Hate About Platform Engineering. We thought that it was the best way for us to express our pet peeves, our continuous ranting, and the hard-won experience that we have gained in the platform engineering world. And now, we often have to deal with short deadlines, right? And deadlines are important. So, sometimes we need to de-scope a bit to retain value. And that is what we did here, with the silent blessing from you, which is why it's called Almost 10 Things That We Hate About Platform Engineering.
I recently started at Velux in a role as the platform owner for a lot of Kubernetes platforms. As Mark mentioned, I've previously been with Eficode as well. So, yeah, in that role, we try to build platforms for our colleagues, our developers, and that's kind of it.
[Emma Jeppesen:] Yeah, so my name is Emma, and I'm the platform advocate for Velux. And what I do is that I drive adoption for the platform, and I do that by championing insights and discovery efforts on our platforms. So, kind of like an advocate and a product person in one, you might say. And yeah, this is our purpose at Velux. If you don't know, we make rooftop windows, probably the best in the world. That's at least what we say. But yeah, we transform spaces using daylight and fresh air. And then, when we started at Velux, we had a similar question.
[Dan:] Yes. So, Windows needs Kubernetes. [audience chuckles] Of course, we all know that Kubernetes definitely does not need Windows, but that's for a different talk, right? But when we joined, it was like, do we run clusters inside our windows? What is that? But the reason is, of course, that we are providing a platform, a Kubernetes platform, several of them, to the factories where we actually produce windows and stuff like that. And we actually have a platform team on top of us that runs an industrial IoT platform.
And then, we also provide the platform for, as I mentioned, our developers, in order for them to build web shops and to help the installers of the rooftop windows make that happen.
And just to get us on the same page, what is a platform? I love this quote from Evan Bottcher from Thoughtworks. There are some keywords in it. One is that it's actually self-service. That is something where the developers can do whatever they need without us as a platform team interacting with it. We have product as a main term here. We actually need to have product thinking in terms of what we do as a platform. And it's optional for developers to use. We don't mandate that our developers use the platform that we provide. So, golden paths instead of golden cages. And we actually also strive for reduced coordination, so that we don't necessarily need to be involved in whatever the developers do with it.
[Emma:] Yeah, and so just to get everyone on the same page, because I think a lot of us come from the DevOps world. At least that's where I come from. And so, this new thing called platform engineering, what is it really, right? I like to look at this white paper, as they call it, from the Platforms Working Group at the Cloud Native Computing Foundation. They've been really nice doing this work of defining it for us so we all know what it is. And the thing is, we all know these capabilities, right? We've worked with these for a long time doing DevOps. But the new thing here is the platform interfaces that you see on top. This is what the product and application teams kind of engage with. And it's what kind of hides the complexity that sometimes is within these capabilities that you see underneath. And so, we try to make things more user-friendly. And maybe you've been working in a DevOps team before, and maybe you're now a platform engineer.
Maybe you've been rebranded because some visionary leader came and said, "Hmm, platform engineering is the new shit, right?" So, Dan, I have a question for you that you might elaborate a little bit on. What about DevOps?
[Dan:] Yes, that brings us to the first movie analogy. Dead Alive, which back in the past was sometimes called Braindead, I guess. DevOps is dead. Long live platform engineering, right? I get it, I get it. I see it as a marketing stunt. Oftentimes it's easier to stand on the shoulders of giants if you kill them first, right? And when we ventured into this, I recall that, as a consultant, I was like, are we back again with silos then? How do we collaborate between teams? Are we back to tickets again? I oftentimes see platform teams being purely operational. Is that what we are striving for? And I get it that the term DevOps team is maybe a little overloaded. Sometimes it's just tooling. But what I believe is that, of course, platforms provide this: application teams don't necessarily need to build, run and manage the entire stack, from the chip design to the button at the top of the UI, right?
But I think that there are still some really valuable principles in DevOps. And of course, I'm talking about CALMS. We still need to have psychological safety in the teams that we're in. That's a necessity and an enabler for everything else that we do. If we as platform teams need to be capable of scaling the business, you could say, with the members that we have on our team, of course, we need automation for that. Otherwise, we cannot support 10 times the number that we are on the team. We still need to have some way to think about flow. When we are in the process of actually delivering, we need to make that fast, such that we can get the feedback back again to the team for the changes that we make. Measurement is a huge part of it.
As a platform team with product thinking, we need to actually measure that what we provide is something that is valuable to our users, both in terms of having them express what they feel, what they need, what they desire, but of course, also in terms of measuring the behaviours, how they interact with our platform. And of course, knowledge sharing is also a huge part of it. Collaboration is a huge part of it. Otherwise, we risk doing stuff that is not important. So, DevOps is not dead in my head. We still need some of these principles.
Next movie analogy, Titanic. Oftentimes, I think a team's journey risks resembling the maiden voyage of Titanic. And what do I mean by that? Well, as platform teams, we often forget that we actually are in rough and sometimes icy waters. Many of us come from traditional organisations that still rely on old, traditional methods like Waterfall. And most of our colleagues might not necessarily understand what platform engineering is all about. And that can create some kind of friction when practices clash with established ways of working. It's actually like how Agile disrupts traditional methods. And platform engineering, in my head, builds on top of good, sound Agile principles. Now, we have just added product thinking into the mix. And I think, if we don't recognise these tensions or spot the icebergs, we're bound to crash into them. So, we need to figure out when we need to navigate around them and when we should try to see if we can actually melt them.
So, just as Titanic was believed to be unsinkable, we can sometimes be too confident that we have management on board. But oftentimes, organisational waters change. The leader who supported the mission, who might be a visionary, might leave, the global strategy might shift, and then, we may still hear the musicians playing, now in a minor key, as the deck tilts and water begins to flood the stern, right? So, will we make it?
If we cannot demonstrate business value, we risk ending up like Jack and Rose here. Some of us will probably not make it. Some of us will, at best, stay afloat on a raft, organisational driftwood, with far less power and potential than the platform could offer. So, we need to be capable of translating tech into business. Of course, it should be the business that drives the efforts that we do. So, when we say that we have improved our CI/CD pipelines 10 times over, how does the business relate to that? Maybe it's because we are then better at shortening time to market. Maybe we are better able to protect revenue by pivoting when new market opportunities arise. Maybe when we do the security efforts, it might help us avoid costs. And I get it, it's not necessarily that everyone inside a platform team needs to be capable of translating or thinking about that. But the more the better. I believe that we are better off understanding why we're here. It is, in the end, to support the business, right?
[Emma:] So, I chose Kevin here for my next movie reference. Maybe you know already by now what this is about. But picture this: you've built a platform, it's filled with awesome capabilities, maybe you did some sort of an abstraction on top of it. It's got all the shiniest new tech. And perhaps, yeah, now that you've built it, they will come, right? That's at least what you think. But the issue now is you have no users. So, you're like Kevin here, all home alone. It was probably fun at first, being alone. You could do whatever you wanted. You could eat ice cream and stay up late and watch TV. But suddenly, the lack of users becomes a little empty. And maybe you're trying to set traps, so management doesn't realise that you've created absolutely no value yet. So, what went wrong here? Well, you probably waited a little too long before onboarding and talking with your users.
Typically, this happens when you haven't really thought of what the thinnest viable platform is. And instead, it became the thickest one. And so, it's starting to look a little bit like this. Yeah, your tech landscape now looks like the CNCF ecosystem, right? It became, like I said, the thickest viable platform. You ended up with a monster, but no users. So, this is what I suggest you do instead. You start with the need. Start with actually talking to your users, figuring out what they are having problems with and what opportunities might be out there, desires or problems, right? And then, pretotype. If you don't know what pretotyping is, it's way smaller than a prototype. With a prototype, you're trying to get something to work, like functionally, right? With pretotyping, you're probably just trying to validate an idea by doing the smallest little thing you could do and then showing that to users. It could be a drawing, it could be a GIF, it could be whatever visualises what you're trying to solve. And then, you go and validate it, ask them if they like it, if they would use it. And once you have that, you can build it with confidence. And actually, maybe try building the abstraction first, and then the capabilities underneath it after.
But yeah, next reference, right? Alice. So, Alice here, she finds herself in a strange, chaotic world with no clear direction of where she's headed. And in a way, she's kind of like a lot of platform teams that I've seen out there, wandering aimlessly around, through unpredictable challenges, without a clear plan or strategy to guide them. And it's something I've seen a lot, platform teams without roadmaps. But the point is, though, where we're going, we do need roadmaps. We do need direction, we do need vision, we do need some overarching strategy so we know we're not going in seven different directions in a team, right? And without a roadmap, it's not just a lack of direction.
It's a lack of a future. So, if you're out there, and you're feeling like Alice here, kind of lost in a maze, don't worry, I've got a few actionable tips for you. What I would suggest is that you start by creating a vision, and then you create a roadmap after, right? You want to start with figuring out what it is that we're doing. Who are we solving things for? What is our product even? And then, you can make a roadmap. It doesn't have to be down to small features or really descriptive, just something so we all know we're on the same path. And then, we align, and we commit as a team. Once we're all committed, that's when I've really seen teams make real magic.
[Dan:] So, this could have been about catching and placing big legacy creatures in modern environments. But that's a whole different story. This is about our priorities and how we work with them, actually. And that is why we have this quote here, that your scientists were so preoccupied with whether they could, they didn't stop to think if they should. So, sometimes I have experienced this, that someone asks for a feature, a solution, maybe in an email or a Slack message. Unfortunately, we have Teams at Velux. Or by the watercooler. Lots of meetings, discussion, but only on the solution, right? We haven't actually touched upon what we're trying to achieve with this. And maybe also just with a narrow scope of users. For example, just the requester or the team that needs this. So, we end up with some vague idea of where we are heading with this one. But that's cool, right? We're engineers. Let's get started with coding. That's what we do. Solve problems. So, now we have this huge task. It may be a lot bigger than we initially thought. So, eventually, we ask our colleagues and the team, our peers, to grab a cup of coffee because we're ready for a huge PR now. But [chuckles] at least I spared you the unit tests, right?
And then, ta-da, maybe we did something, maybe it was right, maybe, well, we don't know. That is stupid. We don't have any processes, no documentation, no actual connection to why we did this. And then, we end up with this. Yeah, but we're agile, right? We work agile. We don't necessarily need to go into some larger analysis of whatever that problem is. So, I believe that we have accidentally come to value endless DMs and meetings, pseudocode-like scripts and YAML, so much coordination all the time, and constantly changing direction. That is, we hate the items on the right so much that it's okay for us to suck at the items on the left.
So, this is something that product teams actually do. We start with an opportunity. An opportunity might be a need, a problem, a desire. It could just be a desire. It doesn't necessarily need to be a problem every time, right? And then, we diverge into thinking about the problem in various ways. How is that? Is that something that is related to other teams that also have this problem? Figuring out where that problem is. Up until we kind of figure out, okay, now we are a bit clearer, and then we can start to converge into an opportunity statement. So, this is where we, again, have the opportunity to figure out what possible solutions we have for this. And as Emma mentioned, we need to make something that is minimal, I would say, viable, of course, that just maybe does this, and then we can iterate upon it. And then, of course, deliver it. And yes, these might be considered phases, but it doesn't necessarily need to mean that we have a long process, a waterfall process, behind that.
And this is what I think about it. I think, as engineers, we are really, really great at solutions. We are great at doing the solutions, but sometimes some of us need to be more focused on and interested in the problem, right? Asking questions. Why, why, why, why, why should we do this?
[Emma:] So, in one of our other points, I mentioned that you need to create a vision, to know your scope, right? You need to know: where's my responsibility? But it's also important for another reason. You want to know: what's not my responsibility? Who am I not serving, right? And if you don't, your organisation may end up in this common pitfall. One platform to rule them all. And see, the thing is, while we often don't acknowledge this, not everything can be solved with our precious Kubernetes. Not everything should be solved with our precious Kubernetes. And the idea that one single platform team is going to save a whole organisation, a whole bunch of engineers with all kinds of different needs, it's just not viable, in my opinion, at least. You need to know what your value stream is. Who are you serving? And also, similarly, who are you not serving?
I've had these before. I know a lot of you have had these. The snowflake developers, right? The ones that have that really, really, really specific need that no one else has. And you're struggling, struggling to get them onto the Kubernetes path. And that's why I kind of see them as the hobbits. You know, the hobbits there in the Shire, content in their own world, resisting the journey into cloud native territory. But just as the hobbits didn't seek out the wars of Middle-earth by themselves, right, you don't need to force these snowflake developers onto your platform as if you're Gandalf. After all, not every team is meant to be part of the epic quest for cloud native perfection. Let them remain in their comfort zone. Let them remain in the Shire, right? And either at some point they'll venture out on their own, or the business will send another fellowship to address their very unique needs.
So, that's it for the movie references. And so, I think we need to finish up with something here then. We love platform engineering. Of course we do. We think it's the best way forward.
We think it's way better than what we've been doing before, if you do it right. And listen, it's really new. We're all making the same mistakes. We're all falling into the same pitfalls. And that's why we've been ranting about them here on stage. So, hopefully, you can see yourself in some of them and maybe not do the same as we've done. So, really, come speak to us in the hall afterwards if you want to talk more about these common pitfalls, or maybe if you just recognise some of the things that we've been talking about here on stage. And I think what's left to say now, Dan, is thank you all for listening.
[Dan:] Thank you so much for coming. [applause]