“Most enterprises will fail to scale DevOps initiatives, if shared self-service platform approaches are not adopted.”
Gartner, 2022 Research Roundup for DevOps
To adopt the best DevOps practices for your business, you must utilize a platform that encourages excellent Developer Experience (DX). Without this, your DevOps initiatives may fail, resulting in a lack of productivity. Focus on attracting and retaining the best developers by sparing them time-consuming work that distracts from building software.
With the rising popularity of microservice architecture and Kubernetes, software practices have become increasingly complex. Platform engineering emerges as a valuable practice to simplify development and create a conducive environment for innovation and enjoyment. It allows organizations to focus on software development while providing developers with the necessary cognitive space.
PART 1: What is platform engineering?
Platform engineering establishes an internal developer platform (IDP), which supports teams within cloud-native enterprises. The discipline encompasses the platform's design, implementation, and operation throughout the service lifecycle. It has recently gained significant popularity due to the growing demand for improved developer and operations practices.
If approached correctly, platform engineering can provide “golden paths” or “paved roads” that developers can quickly adopt. These terms are used interchangeably, and they address the same need: platforms should offer an easier way for developers to build, ship, and run their software. The platform provides automation and different levels of abstraction to support the software development process according to a company’s governance.
What people say about platform engineering
Evan Bottcher states: “A digital platform is a foundation of self-service APIs, tools, services, knowledge and support which are arranged as a compelling internal product. Autonomous digital platform delivery teams can make use of the platform to deliver product features at a higher pace, with reduced coordination.”
Understanding internal developer platforms (IDPs)
An IDP is an added layer that simplifies operations and allows developers to help themselves with existing technology and tools. IDPs, in general, can be defined using three keywords.
Internal: This term emphasizes that the platform is unique to each business and can’t be purchased as a pre-built product. It’s tailored to the specific needs and processes of the company, including its rules and preferred tools. The platform guides developers, providing them with best practices and resources to facilitate their work and help them get started easily.
Developer: This term highlights that developers are the primary users of the platform. Self-service is crucial here, as developers are empowered to create value by quickly implementing new features without relying on IT teams for resource provisioning. Templates are essential in enabling developers to create sample or basic versions of products rapidly.
Platform: In this context, "platform" refers to the infrastructure and collection of software tools provided. The platform encompasses different functionalities related to the software delivery pipeline, such as resource provisioning, various types of software testing, and deployment of artifacts to target environments. The platform includes a user-facing developer portal built upon shared abstractions in cloud-native software, like containers, declarative configuration, and control loops.
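To make "declarative configuration" and "control loops" more concrete, here is a minimal sketch in Python. It is not how any particular tool works; the function `reconcile` and the service names are hypothetical and exist only for illustration. The idea is that a developer declares the desired state, and a loop repeatedly compares it with the actual state and closes the gap.

```python
# Minimal sketch of a control loop over declarative configuration.
# All names here are hypothetical and chosen for illustration only.


def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Move `actual` towards `desired` and return a log of the actions taken.

    Both dictionaries map a service name to its replica count.
    """
    actions = []
    # Create or resize every service that is declared.
    for name, replicas in sorted(desired.items()):
        current = actual.get(name)
        if current is None:
            actions.append(f"create {name} with {replicas} replicas")
        elif current != replicas:
            actions.append(f"scale {name} from {current} to {replicas}")
        actual[name] = replicas
    # Remove every service that is no longer declared.
    for name in sorted(set(actual) - set(desired)):
        actions.append(f"delete {name}")
        del actual[name]
    return actions


desired_state = {"web": 3, "api": 2}     # what the developer declares
actual_state = {"web": 1, "legacy": 1}   # what is currently running

first_pass = reconcile(desired_state, actual_state)
second_pass = reconcile(desired_state, actual_state)

print(first_pass)
print(second_pass)  # empty: nothing left to do once the states match
```

Running the loop a second time produces no actions, which is the property that makes this style of automation safe to repeat.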
Read the blog "Internal developer platforms: What they are and why you need one" to learn more.
Platform engineering vs. DevOps
Some say “DevOps is dead, long live platform engineering!”, but the two approaches are actually complementary. Platform engineering leverages DevOps practices while reducing cognitive load, aiming to enable self-service capabilities for developers.
Application development becomes increasingly complex as DevOps continues to expand, with developers needing to learn new digital platform tools, manage infrastructure, and prioritize operational tasks while coding to develop new features. These demands reduce productivity, increase burnout, and lead to job weariness.
Platform engineers are crucial in simplifying standard DevOps processes
Imagine a business that creates web applications using a consistent structure with a database, a backend that provides RESTful APIs, and a web-based front end. While modern software tools and templates have been adopted, the development process still heavily relies on manual efforts. DevOps engineers are responsible for creating Dockerfiles, writing Terraform scripts, setting up project-specific build pipelines, and managing environment updates.
They collaborate with developers to address their requirements while handling monitoring, alerts, and meeting Service Level Agreements (SLAs). But depending on the DevOps team alone for these critical tasks creates a bottleneck, leading to increased developer lead times and significant stress on DevOps engineers.
Platform engineers simplify this by using an IDP to automate tasks, starting with self-service. Developers don't need to set up Git repositories manually: they can ask the IDP to create one, set up user groups, and automatically integrate the right CI/CD template.
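As a rough illustration of such a self-service request, the sketch below turns a single request into a provisioning plan. Everything in it is an assumption made for the example: `ProjectRequest`, `plan_provisioning`, the template paths, and the naming scheme are hypothetical and don't refer to any real IDP product or API.

```python
from dataclasses import dataclass

# Hypothetical mapping from project kind to a CI/CD template maintained
# by the platform team.
CI_TEMPLATES = {
    "backend": "ci/rest-api.yml",
    "frontend": "ci/web-frontend.yml",
}


@dataclass(frozen=True)
class ProjectRequest:
    """What a developer submits through the (hypothetical) self-service portal."""
    name: str
    team: str
    kind: str  # "backend" or "frontend"


def plan_provisioning(request: ProjectRequest) -> list[str]:
    """Return the steps the platform would carry out for this request."""
    if request.kind not in CI_TEMPLATES:
        raise ValueError(f"unknown project kind: {request.kind!r}")
    return [
        f"create repository {request.team}/{request.name}",
        f"create user group {request.team}-developers",
        f"attach pipeline template {CI_TEMPLATES[request.kind]}",
    ]


plan = plan_provisioning(ProjectRequest(name="orders", team="shop", kind="backend"))
for step in plan:
    print(step)
```

The developer only states what they need; the standardized steps and the choice of template stay with the platform.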
Platform engineering doesn’t replace DevOps practices–it builds on them
It offers teams an easy way of getting projects started through standardized patterns. These patterns are built into an IDP with self-service capabilities. Teams start adding value immediately instead of spending weeks on project setup and problem-solving.
Self-service allows developers to be autonomous yet compliant without being overloaded. Because of this, platform engineers can focus on more significant architectural challenges, improve current features, and adjust the system to changing needs.
Platform engineering vs. Site Reliability Engineering
Google pioneered Site Reliability Engineering (SRE), which focuses on operating and improving software applications at a large scale. Despite sounding alike, platform engineering and site reliability engineering are different. SRE is mostly about operations: running a service and ensuring it’s continuously available and up-to-date. However, SRE also provides a model for service management that can be applied to IDPs. The approach discussed in the book “Site Reliability Engineering: How Google Runs Production Systems” is particularly relevant in this context.
Reliability in platform engineering
A software application can’t offer a higher Service Level Agreement (SLA) than the lower layers in the stack do. For an app to have a 99.9% availability guarantee, all the infrastructure components it depends on must offer at least that level. SLAs therefore matter between a platform team and the development teams who use the platform: the platform's SLA is a promise that gives development teams an expected level of reliability.
A Service Level Objective (SLO) is a target service level, measured by a Service Level Indicator (SLI). Choosing the right SLOs is challenging but vital for the platform team's performance measurement and business success. SLOs help platform teams balance innovation and reliability using an error budget, that is, the amount of unreliability the SLO allows; the margin between the measured SLI and the SLO shows how much of that budget remains. SLOs and error budgets should therefore be published to set expectations for the platform's stakeholders.
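A short calculation shows why the lower layers cap what an application can promise. The sketch assumes that the application needs every layer to be up and that the layers fail independently, which is a simplification; the three layers and their figures are made up for the example.

```python
import math


def serial_availability(availabilities: list[float]) -> float:
    """Availability of a service that needs every listed layer to be up,
    assuming the layers fail independently of each other."""
    return math.prod(availabilities)


# Hypothetical stack: network, cluster, and database, each at 99.9%.
layers = [0.999, 0.999, 0.999]
combined = serial_availability(layers)

print(f"{combined:.4%}")
```

Under these assumptions the combined figure is lower than that of any single layer, so an application built on this stack could not credibly promise 99.9% itself.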
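The relationship between SLO, SLI, and error budget can be sketched with simple arithmetic. The example assumes a request-based SLI (the share of successful requests in a measurement window); the function names and the figures are hypothetical.

```python
def error_budget(slo: float, total_requests: int) -> int:
    """Number of failed requests the SLO tolerates in the window."""
    return round((1 - slo) * total_requests)


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget that is still unspent (negative if overspent)."""
    return 1 - failed_requests / error_budget(slo, total_requests)


slo = 0.999          # target: 99.9% of requests succeed
total = 1_000_000    # requests served in the window
failed = 250         # requests that failed in the window

sli = (total - failed) / total   # measured service level
budget = error_budget(slo, total)
remaining = budget_remaining(slo, total, failed)

print(sli, budget, remaining)
```

A team with most of its budget left can afford riskier changes, while a team that has spent it would typically prioritize reliability work.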
Platform teams and incident management
Platform teams are essential for the reliability of software applications running on their platform and infrastructure. During outages or incidents, they must also take responsibility for problems in the components they own.
Team interaction modes
SRE teams work closely with development teams, and their interaction changes as the application progresses. An SRE team may be a mix of enabling and operations teams. It provides coaching on scaling and building reliable services up to a certain point. The SRE team should take full responsibility for the reliability of one or more digital platform services. This differs from a platform team, which is expected to deliver self-service interfaces for development teams to consume. The platform team needs a product mindset and a close feedback loop with development teams to build the right things.
PART 2: Why you need platform engineering in your business
Software development continues to gain traction in today's business landscape, regardless of the sector. Companies must prioritize investing in a well-designed IDP to achieve higher productivity, efficiency, and motivation.
Here’s a list of how having an IDP can help your business:
Speed up time to value
It's not just about speedily deploying software; it's about the broader effect a new software solution has on user experience and business growth. A robust platform acts as a catalyst and can shield developers from the intricate details of infrastructure management, enabling them to focus on crafting features and functionalities that matter most to users and the business.
Increase cost efficiency
A unified platform also supports strategic financial management. Centralizing infrastructure and tools through a platform increases cost transparency and empowers service owners. With this visibility, teams can confidently gauge and balance their expenditures, aligning them with the revenue and business value they bring. This convergence of cost awareness and business-driven decision-making fosters a culture where IT investments focus on value generation as much as on cost containment.
Improve Developer Experience and attract new talent
Attracting and retaining talented developers is key to any software-driven organization. By investing in platforms to unify and streamline software delivery, an organization lays down clear, safe tracks for its developers to race on. A platform provides a welcoming environment to start projects and the safety to experiment. Thanks to a robust tech stack, the platform team can enable developers to produce meaningful work without friction.
Ensure compliance and security
With the digital landscape being as vulnerable as it is, tighter security and privacy are no longer just options; they’re necessities. Guardrails like Automated Audit Trails, Policy as Code, Secured Self-Service, and Consistent Environment Configurations can be integrated into your platform. A successful platform integrates compliance and security into the workflow seamlessly. That way, developers can focus on innovation without the burden of security and regulations.
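To give a feel for what Policy as Code can look like, here is a minimal sketch that checks a deployment description against a few rules before it is accepted. The rules, the manifest fields, and the function `check_deployment` are hypothetical examples rather than the behavior of any specific policy engine, and the image-tag check is deliberately simplified.

```python
def check_deployment(manifest: dict) -> list[str]:
    """Return the policy violations found in a (hypothetical) deployment manifest."""
    violations = []

    # Rule 1: the image must carry an explicit tag other than "latest".
    _, separator, tag = manifest.get("image", "").rpartition(":")
    if not separator or "/" in tag or tag == "latest":
        violations.append("image must be pinned to a specific tag")

    # Rule 2: containers must explicitly opt out of running as root.
    if manifest.get("run_as_root", True):
        violations.append("container must not run as root")

    # Rule 3: every deployment needs an owner for audit purposes.
    if "owner" not in manifest.get("labels", {}):
        violations.append("an 'owner' label is required")

    return violations


compliant = {
    "image": "registry.example.com/shop/orders:1.4.2",
    "run_as_root": False,
    "labels": {"owner": "shop"},
}
non_compliant = {"image": "orders:latest"}

print(check_deployment(compliant))
print(check_deployment(non_compliant))
```

Because the rules live in code, they can be versioned, reviewed, and run automatically in the delivery pipeline instead of being checked by hand.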
Check out the podcast episode "Platform engineering done right" about the capabilities of an IDP at the Portuguese bank Millennium BCP.
The challenge of cognitive load in platform engineering
Engineering teams are often taxed by complex tasks that strain their cognitive capacity. They can effectively address this issue by leveraging an IDP daily. An IDP allows engineering teams to streamline processes and concentrate on essential tasks.
A key benefit of IDPs is alignment. IDPs ensure consistent practices across diverse teams and systems. Consequently, they play a vital role in easing cognitive load and giving developers a consistent experience. For instance, a company's security practices can be standardized and automated by the platform team, thereby freeing up cognitive resources for other purposes. By following these measures, your business can create a thriving culture around IDPs built upon shared processes that maximize overall team productivity.
PART 3: Creating the right IDP for your teams
IDPs offer self-service capabilities for developers, enabling them to focus on their most important work. Many businesses know the benefits of adopting IDPs, but finding the right talent and strategy to build a compelling platform is challenging.
How to get started with platform engineering
There are many factors to take into account. Technical teams must collaborate with organizational stakeholders to obtain the necessary digital platform support and resources for the project. Getting executive help is crucial, as building an IDP alone requires significant resources and isn’t just about technical challenges.
This undertaking can be costly, particularly for larger organizations. So here are some steps a CTO could take to get started:
1) Get management buy-in and let them own the greater vision
2) Establish a team and let them define their charter and mission
3) Discover opportunities that provide good value to application development teams
4) Establish measurements for success
5) Prioritize, focus, and deliver (software and documents)
6) Capture feedback and outcome
7) Communicate achievements and challenges to sponsors and stakeholders
8) Revisit team charter (mission)
9) Iterate from step 3
Hear how TV2, a Danish national TV station and streaming platform, got started with platform engineering.
Tip: Improve adoption with a name
Naming your IDP and making it a product will help your teams feel more committed and comfortable with building the platform. It’s a good idea to keep the platform name distinct from the platform team's name, which should represent the team's identity and ethos.
Don’t build on assumptions—experiment and improve
Falling into the rabbit hole of building on foundational layers is common. Before we welcome our first team into the platform, we want everything in place. After all, we want developers to have a great experience, so we need to “have it finished” for them to run their services.
But things are never done when it comes to software. From the start, we need to work on incremental improvements with actual developers as early as possible. Otherwise, we build on assumptions, and they tend to fail us. After all, we haven’t proven the right to play as a platform team.
Perhaps this anti-pattern comes from the word “platform”. We believe it should support the entire software delivery, security, quality, monitoring, and operations upfront. It’s okay to think about the bigger picture, but start out small and build the Thinnest Viable Platform iteratively (a concept from the book Team Topologies). So, figure out the smallest possible use case that provides value to developers and build just that. Measure the outcome and learn what to build next.
If you’re scared of going into production (as we all are) and just need to do a little more work, resist this urge and reach out to your most open colleagues who want you to succeed and offer their help to test the platform.
To learn more about the challenges and pitfalls of Platform Engineering, check out the talk on “Platform Engineering is Hard, and We are Doing it Wrong” at DevOpsDays Denmark 2023.
Tip: Always document your developments
Identify specific capabilities and prioritize addressing the minimal requirements at the developers' level. Start by documenting these developments, as it will prove essential in the future when dealing with complex APIs.
Documenting your IDP
Begin by creating a documentation plan and anticipate what matters based on the needs of developers, depending on the challenges they’re currently facing. If you don’t, you will likely face the issues listed below:
- Developers will need help getting started on using your platform and its services.
- You will need to explain how certain concepts and ideas work repeatedly.
- Developers will remark that they tried following your documentation but failed at one of the steps because they didn’t have the correct version of specific software installed.
- Developers will approach you for troubleshooting when they face roadblocks rather than referring to the documentation.
- You will constantly direct the same people to the same documentation, with little result every time.
The goal is to give your users a clear overview of the platform and how it can benefit them. Save the detailed technical documentation for later. Developers are looking for platforms that are easy to understand and use. By keeping their needs in mind and providing good documentation, you can attract more developers to adopt your platform.
Check out our blog post on better documentation for better platform developer experience for more insights.
Tip: Introducing your platform to new users
Consider creating a “Getting started guide”, which is the perfect opportunity to set expectations and give users a clear overview of your platform's core features and benefits.
Your guide should be concise and cover the basics of what your platform offers. Focus on getting users excited about your platform rather than bogging them down with too many details. The guide is helpful for developers new to your platform, as it provides an accessible introduction to your product.
Choosing your platform engineering tools
Crossplane? Argo CD? GitLab? Let’s rewind a little here. Each organization will need different tools that work for them.
Technology and tools are essential for a platform to deliver value to its users. Some choices must be made early, while others can be made along the way. The infrastructure the platform must run on should, of course, also be decided at an early stage. Should it be built on a cloud platform, on-prem, or a hybrid?
As with all other critical decisions, this one should be made with the right stakeholders in mind. That’s why you should involve shared services (finance, procurement, security, etc.) when deciding.
Here are a couple of things to remember:
- Avoid tight coupling: Prefer open-source software and tools that offer a good range of third-party integrations over a one-size-fits-all approach. This will save you from vendor (or technology) lock-in in the future.
- Accept that things will change: Whatever your decision, you’ll likely be looking to migrate in 2-3 years.
Other tool choices are easier and should be made only when you’re ready to adopt them. Avoid making upfront commitments to tools you don’t plan on using in the near future. Because tools and technologies change rapidly, follow the Lean practice of delaying commitment until the last responsible moment (as described in the book Lean Software Development).
It’s also worth considering how many innovation tokens you can afford to spend on new tools. Platform teams are not excluded from being exposed to cognitive overload. As boring as it sounds, sometimes it’s better to start with tools you’re already experienced with and have available and then gradually investigate if they are still fit for purpose.
PART 4: Measuring success with a platform approach
Most options to assess software development focus on local productivity, quality, and consistency in the delivery process. When it comes to DevOps, there are well-established metrics thanks to the DevOps Research and Assessment (DORA) program:
- Deployment frequency
- Lead time for changes
- Mean time to recover
- Change failure rate
These metrics allow us to capture the impact of DevOps culture and practices in quantitative terms related to IT performance and quality. Lately, attention has shifted towards evaluating how developers feel about and value their work by taking an approach to measuring success that centers on Developer Experience (DevEx).
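The four metrics listed above can be derived from basic deployment records. The sketch below is one possible way to compute them, not an official DORA implementation: the `Deployment` record, its fields, and the choice to measure recovery from deployment time to restoration time are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute the four metrics for a non-empty list of deployments."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    failures = [d for d in deployments if d.failed]
    recoveries = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(hours(d.deployed_at - d.committed_at) for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "time_to_recover_hours": mean(recoveries) if recoveries else 0.0,
    }


start = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
history = [
    Deployment(start, start + 2 * hour),
    Deployment(start + day, start + day + 4 * hour, failed=True,
               restored_at=start + day + 5 * hour),
    Deployment(start + 2 * day, start + 2 * day + 6 * hour),
    Deployment(start + 3 * day, start + 3 * day + 4 * hour),
]

metrics = dora_metrics(history, period_days=4)
print(metrics)
```

In practice, the records would come from the platform's pipeline and incident data rather than a hand-written list.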
Measuring Developer Experience
Measuring DevEx is a daunting task, though, as the concept itself is very broad. It’s helpful to start with a framework that identifies the three core dimensions of DevEx: feedback loops, cognitive load, and flow state. These dimensions emphasize that DevEx is not only a technical matter of tooling or workflows but also, and primarily, a matter of developers’ perceptions.
Developers are human, after all, and no single quantitative metric can accurately capture DevEx. For this reason, collecting qualitative feedback from developers through surveys is imperative. This kind of feedback describes the perceived complexity and satisfaction with deploying code and using a platform, among other aspects. For example:
- Easy-to-use tools help developers streamline development and deployment quickly, without hassle.
- Self-service capabilities support developer independence and control.
- High-quality documentation lets developers navigate complex infrastructures.
- The platform enables collecting diverse metrics related to DevEx.
- The fast setup of a development environment fosters productivity and creativity in realizing minimum viable products.
Ultimately, platform engineering is an effort that goes beyond improving productivity and speeding up innovation. By improving DevEx, a platform makes developers happier, which improves retention, talent acquisition, and success in realizing business goals.
Watch Nicole Forsgren’s talk "Why even DevOp—making our days better" for more information.