
The $8 Trillion question: Open source, risk, and resilience

Open source powers nearly everything we build today—but what is it actually worth? Pinja and Stefan discuss Harvard’s $8 trillion estimate, how widely open source is used, and why central registries are under pressure. They touch on security risks, supply chain attacks, and what the EU Cyber Resilience Act means for companies using OSS. Finally, they look ahead at how AI and coding assistants might reshape open source development.

[Pinja] (0:03 - 0:23)

If you're making money out of it or if you're a company and you're using open source software, you need to be aware of your obligations. Welcome to the DevOps Sauna, the podcast where we deep dive into the world of DevOps, platform engineering, security, and more as we explore the future of development.

 

[Stefan] (0:23 - 0:32)

Join us as we dive into the heart of DevOps one story at a time. Whether you're a seasoned practitioner or only starting your DevOps journey, we're happy to welcome you into the DevOps Sauna.

 

[Pinja] (0:38 - 0:45)

Hey, and welcome back to the DevOps Sauna. I am joined again by my co-host, Stefan. Hey, Stefan.

 

[Stefan] (0:46 - 0:47)

Hey, Pinja. Good to be back.

 

[Pinja] (0:48 - 0:49)

It's good to be back.

 

[Stefan] (0:49 - 0:50)

So how's things going in Finland?

 

[Pinja] (0:50 - 1:01)

We are getting ready for December. The darkest month of the year called November is almost done. Let's see.

 

Christmas is almost here. How about Denmark?

 

[Stefan] (1:01 - 1:11)

I was just thinking, are you starting to wear long-sleeve shirts and put the shorts back into the storage room? Because the story is always that you're wearing t-shirts and shorts all year long.

 

[Pinja] (1:12 - 1:21)

Yeah. I just saw a guy last week, and it was below freezing and he was in his shorts and a t-shirt at a bus stop. I think he was making some kind of statement with it.

 

[Stefan] (1:22 - 1:30)

Well, he can join Denmark with all of the rain we have. It's commonly Danish weather, soggy, moist, cold, windy. Nice place to be.

 

[Pinja] (1:30 - 1:42)

It's always nice to start with the weather, but what we actually thought we would bring up today is open source, the importance of it, and how widely it has spread so far.

 

[Stefan] (1:43 - 1:55)

Yeah. We could probably draw a weather forecast for open source if we wanted, but let's not do that. It's going to be way more fun talking about why and how and monetary looks on open source as well.

 

[Pinja] (1:55 - 2:27)

Yeah. Let's do that. Because the reason why we actually were prompted to do this, we saw an article by Harvard Business School, and this was from last year.

 

They detailed how much it would cost to rebuild all the software. I know when I say all the software, I actually mean everything from the past X many years that has been built on open source. To rebuild it without open source, the number was a staggering $8 trillion.

 

When I say that, I feel like I'm Dr. Evil from the Austin Powers movies and I'm talking about monopoly money.

 

[Stefan] (2:28 - 2:52)

We've touched upon previous sales of different companies for $1 billion, and all of a sudden we're in the trillions. It's just crazy. But when you look at the whole landscape, it's everything from this tiny, small library running in JavaScript to Kubernetes running in a lot of servers these days, and even desktop software like LibreOffice or Audacity or whatever we have.

 

There is so much open source software out there.

 

[Pinja] (2:52 - 3:22)

In this episode today, our aim is to talk a little bit about why we even use open source software. How wide is the spread at the moment? Let's touch upon the security perspective.

 

We're not going into that much detail with the security today, but also we want to talk a little bit about the future of open source software and a little bit about AI in that context. But Stefan, how would you describe it? Why would I even use open source?

 

What's the gist here?

 

[Stefan] (3:22 - 4:19)

How would you not? The whole internet is running on open source. Back in the good old days, as every story begins.

 

There were some guys building this network and all of a sudden we're spreading it out and all of a sudden you have servers running everywhere, running on maybe Linux, FreeBSD or whatever, and they sort of support the internet. And then software comes into the picture and we start building a lot of software and we see something like, there's something missing here. I'm going to build that small bit, and all of a sudden you have a library that is being downloaded billions of times a week.

 

It comes out of something missing in the market or me wanting to prove my professional worth of building a really well-written software library. I have not done that. I think I have maybe 100 downloads a year of one of my open source projects, but it's still open source.

 

It's out there and it made the world a bit better than what I actually saw. It's about filling these small gaps all of the time.

 

[Pinja] (4:19 - 4:55)

And if we think of the history of software development, it has always been very, very community-driven. There's always been a social element to it. So sharing what you have created, maybe doing it for a greater good.

 

That sounds very altruistic, by the way. But that's what it is. It's about like, hey, look at this.

 

I made a cool thing. Would you like to see? Can I try it?

 

Yes, you may. It's all about this. And now with open source being the primary source of basically all software in the world that we're using at the moment, it's kind of staggering to actually put a number on it if we were to take away all of open source software.

 

[Stefan] (4:56 - 5:33)

Yeah. I think it was back in 2014 I had a talk about what motivates people to go into open source, but I couldn't find any actual monetary value on it. I could find a lot of research why you would do it, and to some degree it was kind of funny because all of the research was done by economists.

 

And actually quite a lot in Denmark. So I don't know why that happened because I don't see Denmark as the frontrunner of open source in the world. But apparently when we try to make money, we want to figure out how we do open source and how we can actually bring stuff in and give it a monetary value for ourselves.

 

So I guess we forgot the altruism in Denmark sometimes, which is sort of funny.

 

[Pinja] (5:33 - 5:54)

Yeah. But maybe what I've heard from many people who work as developers, there is the kind of inherent need to make things better, to kind of, as I say, fill the gaps. There's something missing, really building that mastery in it and then sharing that with the world.

 

Like having, for example, GitHub and everybody's repositories available, the ones who actually made them available.

 

[Stefan] (5:54 - 7:07)

Yeah. That is super nice. And I think GitHub came up at the right time for everything because it became the social coding platform when it sort of surfaced.

 

I was part of a project, it was called Code 52. And the 52 was actually because we went into a new open source project every week, either helping maintain something or creating something that was missing or just like goofing around with something. It was a quite strenuous year for the, I think there were four or five sharing the lead positions in the project.

 

Quite hilarious. And some of the projects really helped out and some of them died the week after. But it was a cool time.

 

And it sort of made this like a big open space where you collaborate and all of a sudden you figure like, oh, I'm sitting here in Denmark. The other guy is sitting in Australia, the last guy in the US. And we actually just like to make stuff work without having calls and worrying about time zones.

 

We learn how to communicate through the code or through issues or something like that. In that part, it was actually really helpful to do open source for my personal career because I figured out like, all right, I don't have to call people. I don't have to sit in a meeting with them.

 

Asynchronous work does actually work in practice sometimes, if you're in a culture where the setting is mature enough for it.

 

[Pinja] (7:08 - 7:33)

Yeah. And if we really think about how widespread it is: for a long time, it was actually quite hard to estimate how widely open source is spread, because the usage was not exactly tracked so much, and it's been difficult to measure because it's free.

 

But we have some indicative numbers based on the downloads of, let's say, packages. Stefan, you collected the data for the packages. So how widely can we say something is downloaded per week at the moment?

 

[Stefan] (7:34 - 8:14)

Yeah, I just grabbed some of the numbers to sort of give an indication because when I try to find the actual numbers, it's actually quite hard. If we take something like images being pulled from Docker Hub, I think it says like 1 billion plus downloads of Nginx, which is used more or less everywhere. But it does actually have 11 million pulls a week.

 

And like Postgres is 16 million pulls per week. That's a lot of downloads. And if you look even further into NuGet packages, overall, they download 5 billion packages a week for all of the .NET projects.

 

When you look at Python, it's 22 billion packages a week. Just imagine the bandwidth and everything to transfer those 22 billion packages.
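
[Editor's note] To put the weekly figures quoted above into perspective, here is a minimal sketch that converts them into average requests per second. The numbers are the approximate ones spoken in the episode, not measured values, and the dictionary labels and the function name `per_second` are chosen for illustration only.

```python
# Weekly download figures as quoted in the episode (approximate, spoken from memory).
WEEKLY_DOWNLOADS = {
    "nginx image pulls (Docker Hub)": 11_000_000,
    "postgres image pulls (Docker Hub)": 16_000_000,
    "NuGet packages": 5_000_000_000,
    "Python packages (PyPI)": 22_000_000_000,
}

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def per_second(weekly_count: int) -> float:
    """Average number of requests per second implied by a weekly total."""
    return weekly_count / SECONDS_PER_WEEK


if __name__ == "__main__":
    for name, weekly in WEEKLY_DOWNLOADS.items():
        print(f"{name}: ~{per_second(weekly):,.0f} per second on average")
```

These are averages only. Real traffic is bursty, so peak load on a registry is likely to be well above these values.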

 

[Pinja] (8:15 - 8:15)

That's true.

 

[Stefan] (8:16 - 8:20)

Like, I have served a lot of traffic out of a CDN and, well, this is a lot.

 

[Pinja] (8:20 - 8:47)

It is a lot. And of course, we don't have statistics on which companies, organizations, or independent developers are using these packages. So we don't know, for example, for these 22 billion packages a week or 11 million pulls per week, how many users of the end product sit behind one specific pull or one specific package being used.

 

[Stefan] (8:47 - 9:58)

It's actually a bit tricky, because the more modern we get as an organization, where we pull the packages and then build, the bigger these numbers will be. A lot of companies don't actually set up a local proxy, where you only grab a package once from the public repository into your local cache, which means we just keep hammering all of these public repositories. And it is actually an issue that is sort of brewing in the background somewhere.

 

I think there were 15 organizations actually coming out and saying like, we cannot support all of this traffic anymore because you keep hammering everything. You don't set up proxies. You don't make sure that you only fetch it once to your local hub.

 

We need to do something because we cannot keep up with the monetary cost of running all of this as like an internet backbone or whatever we would call it. So there are some dark skies in the future we need to solve. And I actually think PyPI, which hosts all of the Python packages, was one of those at the top of the list saying like, this is not sustainable for us anymore.

 

NuGet, not a problem. Microsoft is running that. NPM, run by Microsoft these days as well.

 

They have a ton of money. They make their money elsewhere, so they'll probably figure it out. But as soon as you move into the sort of other communities, then it starts being an issue.
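
[Editor's note] The local proxy idea Stefan describes can be sketched as a pull-through cache: the first request for a package goes to the public registry, and every later request is served from local storage. This is a minimal illustration under stated assumptions, not how any particular registry or proxy product works. The class `PullThroughCache`, the `fetch_upstream` callable, and `fake_registry` are hypothetical names, and the "registry" here returns dummy bytes so the example runs without network access.

```python
from pathlib import Path
from typing import Callable
import tempfile


class PullThroughCache:
    """Serve packages from a local directory, going upstream only on a cache miss."""

    def __init__(self, cache_dir: Path, fetch_upstream: Callable[[str], bytes]):
        self.cache_dir = cache_dir
        # Hypothetical hook: in a real setup this would download from the public registry.
        self.fetch_upstream = fetch_upstream
        self.upstream_requests = 0
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, name: str, version: str) -> bytes:
        cached = self.cache_dir / f"{name}-{version}.pkg"
        if cached.exists():
            # Cache hit: no traffic reaches the public registry.
            return cached.read_bytes()
        # Cache miss: fetch once from upstream, then store locally for later builds.
        self.upstream_requests += 1
        data = self.fetch_upstream(f"{name}/{version}")
        cached.write_bytes(data)
        return data


def fake_registry(key: str) -> bytes:
    """Stand-in for a real registry download; returns dummy bytes."""
    return f"contents of {key}".encode()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = PullThroughCache(Path(tmp), fake_registry)
        for _ in range(100):  # e.g. 100 CI builds asking for the same dependency
            cache.get("example-lib", "1.2.3")
        print(f"upstream requests: {cache.upstream_requests}")
```

In this sketch, 100 builds that need the same package version result in a single upstream request. In practice, teams would normally use an existing repository manager or registry mirror rather than writing their own cache.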

 

[Pinja] (9:58 - 11:04)

There are some, not maybe bottlenecks, but some of these packages are actually relying on individual contributors and creators. And there is a huge risk in, for example, the support. Are we getting any updates and that kind of stuff?

 

But let's touch upon the value, the valuation function for a moment, because the $8 trillion is extremely high. And in the study that Harvard Business School made, they divided this into two different valuations. So there was the demand value, so what the companies would avoid in costs.

 

So let's say a company has been using NuGet packages and, as I say, they get a package once, or maybe they update it every once in a while, but they get it once and then they reuse that package. How much are they avoiding in costs because they're using this open source package rather than building it in-house? So it's kind of an open source version of buy versus build by themselves, right?

 

[Stefan] (11:04 - 11:36)

Yeah. As far as I recall, the factor they had was that if you wanted to build this on your own, you would actually be spending three and a half times as much money on it, because you have to pay for the right employees, infrastructure, hardware, operating systems if you're not running a free operating system, plus housing and electricity and everything. It just builds up.

 

But I think the balance of how much demand there was and how much supply value there was, it was sort of like a 50-50 out of the 8 trillion.

 

[Pinja] (11:36 - 12:30)

So that's the first side. How much cost would we avoid? And then the second one is the supply value.

 

What would it cost to create everything from scratch? So that, along with, of course, estimations of the packages and the pulls, is how they came up with the $8 trillion valuation for this.

 

And I'm intrigued by this, as we mentioned a little bit about some of the risks of open source at the moment. Security has always been, of course, a concern for many people when it comes to open source. And on this podcast, a couple of months ago, we covered the NPM package supply chain attacks. So 20 very widely used NPM packages that were collectively downloaded over 2 billion times per week had an issue.

 

And malware was spread through them that was attacking Bitcoin wallets at the time. So that was one of the things.

 

And these things have been just increasing in the past couple of years.

 

[Stefan] (12:30 - 14:11)

I think we're seeing attackers leaning into the open source market because it's easier to get into an open source project and infect it, or whatever term we would use for it. Because we saw something like the SolarWinds attack, where people spent a lot of time to get into a company, get into the code base, and put some malware in there. And it just took ages.

 

Now you can sort of like lean into an open source project. You can push something in and it's distributed the day after. So I think that they just shifted their playing field.

 

And like no matter what, if it's commercial or open source, there'll still be attacks tomorrow. We're not going to be done any day. Every week there's a new group, there's a new technique and so on.

 

And when I talk to some people I know who are pen testers, they go out as a red team trying to squeeze into these companies. And as they say, we're not using advanced tactics at all. We know some really advanced stuff, but often we find something that is super simple and we get in.

 

Which means in general, even though in many cases for them it's commercial software, you could probably see the same thing in open source. We need to raise the, what do you call it, the bottom level of everything. We need to make sure that the baseline is high enough that it's tricky to get in.

 

It's the same when you read about security. What you're doing is sort of like to make sure that you push the attacker into where you want him to be. So always push him to where he may be able to deface your doc site, but he cannot change the code or something like that.

 

Something you can recover easily. Which comes back to the whole discussion of risk and how you manage risk. I'm pretty sure some of our colleagues could go on for hours on global risk and compliance.

 

That is a very big field and a lot of people misunderstand it as well.

 

[Pinja] (14:11 - 14:35)

It's true. And there is a report from last year that said that since November 2023, so basically two years ago now, there have been over half a million new malicious packages. And it's been across Java, JavaScript, Python, and .NET. And there was the Shai-Hulud malware that was infecting, was it, 500 npm packages and leaking secrets on GitHub?

 

[Stefan] (14:35 - 14:48)

Yeah. And it started pushing. It scraped your local machine for secrets and pushed them to GitHub.

 

Really nice service. Secrets as a service. It's what everybody wants, especially when it's your own secrets that are getting out.

 

[Pinja] (14:48 - 15:00)

No, nobody wants that. But open source, of course, is an attack surface. At the same time, so would be your home-built software anyway.

 

[Stefan] (15:00 - 16:05)

Exactly. The old saying for open source is it's more secure because everybody can see the source code. But I think we're reaching a level where the complexity of open source is at such a high level that it's not easy for everyone to see through it and understand it.

 

It makes sense to use some big, open algorithms where everybody can see and verify, all right, everything is right here. There are no backdoors in this encryption algorithm or something like that, which is a really big discussion in politics as well, with backdoors in encryption. But with an open encryption algorithm, scholars sitting in universities can verify that everything is good, everything is okay, the math holds up.

 

All right, so we can use this because the input and the output are actually the important bits of it. But the algorithm is known and it runs well, and we're all happy. And we've seen that happen many times in security.

 

Use open algorithms. Don't invent your own because you're not going to be as good as 500, 700 PhDs working in cryptography or something like that. Really embrace it, but understand it as well.

 

[Pinja] (16:05 - 16:06)

That's true.

 

[Stefan] (16:06 - 16:07)

And maybe commit back.

 

[Pinja] (16:08 - 16:27)

That's more important. And that's what keeps the open source community running. Because we know that it's not going anywhere.

 

But let's do a little bit of future prediction here. You said very briefly here that it is now getting more complex than ever, but what else is happening? Do you have any predictions, Stefan?

 

What is the future for open source looking like?

 

[Stefan] (16:27 - 18:20)

We already touched a bit upon it. The central registries, they're under pressure these days. We need to come up with a model where we can actually support them.

 

I think when we look at successful projects running, we can take SSL certificates. Let's Encrypt. You can use it for free.

 

And they have quite a lot of backing with money to make sure it runs. They probably have a profit now these days, but I don't know the details of their economy. But we need to figure out a model like that for all of these central repositories that are being used.

 

And we need to make sure that we actually commit back when we spot something. All companies should, to some degree, have an open source office, or at least a policy for how you adopt open source, how you can contribute back. I know we have a few open source projects here as well in the company.

 

They're listed, and everybody can actually jump in and do stuff about it. But it does require effort to sit down and write a good open source policy. I used to work for a company where we had our own small program office, and there were three people running it.

 

One that sat with a strategic view on technology in the company, and then there was one representative from each of the different business units. And we actually sat together and talked about, all right, how do we need to change this policy? Have we gotten any feedback where what's written in the employee contract goes against what we actually state in our open source policy?

 

Can we actually get them to change the contracts for all employees? And we actually got an update for the employee contract stating you can actually do open source, you can even do it in work time if it's agreed with your manager and X, Y, and Z. So we need to have better contracts that allow for open source, because usually in an employee contract we state that the IP you create belongs to the company, period.

 

And that means in theory you can't really do open source, because if you create something, it cannot be open source all of a sudden. And then you get this fight of different licenses and so on, and you should be aware of those as well. So it's super hard to do well.

 

[Pinja] (18:21 - 18:41)

That's true. And we've always been talking about legislation, especially in the context of AI, but we also have the EU Cyber Resilience Act, which entered into force last year in December. The main obligations introduced in the Act will apply from December next year, if I remember correctly.

 

[Stefan] (18:41 - 18:45)

Yes, it's 26 or 27, so there's still some time to fix it.

 

[Pinja] (18:45 - 19:03)

Yeah, 26, 27. I think it might be two years from now. So there are some things coming out of that, but it depends on the details, to be honest.

 

I think some actors are exempted, if I remember correctly, those non-commercial OSS developers.

 

[Stefan] (19:03 - 19:04)

Like my small projects.

 

[Pinja] (19:05 - 19:05)

Exactly.

 

[Stefan] (19:05 - 20:22)

If you do something regular open source and you don't make any money on it, it can even be a fairly big project. Of course, you will feel more reliant to do something about it, make sure it runs well and there's reporting guidelines and everything. But when you look into the details of it, because there was a lot of fear when it came out that, oh, shoot, now everybody has to be responsible for it, you can actually be sued and everything.

 

And luckily, I can't remember the organization, they do a big open source conference every year. They actually had a guy coming in with a lawyer next to him like, all right, what does it actually mean for all of these open source projects? Are you a maintainer?

 

Are you a contributor? What happens to X, Y and Z? If you're a maintainer of a project, well, if you make money on it, you will have to abide by the Cyber Resilience Act.

 

If you don't make money off it, keep an eye on it, but you can't be sued. If you're a regular maintainer, or a regular contributor, don't worry. So it differs between the maintainer and the contributor, because if you're a maintainer, you're running the project.

 

If you start making money off it, and there are a lot of ways to support open source with money these days, then it might apply to you. But it really focuses on physical and software products as a whole.

 

It's not trying to kill off open source.

 

[Pinja] (20:22 - 21:20)

Oh, definitely not. It's just that if you're a contributor making money out of it, or if you're a company and you're making money out of your software product, and you're using open source software, you need to be aware of your obligations. One thing to touch upon is, of course, AI.

 

It wouldn't be a DevOps Sauna podcast episode if we didn't talk about AI, but we promise this will be very brief. This is hypothesizing, of course. We know that many open source software packages and projects are created with the help of coding assistants, so there will be a change.

 

There already has been a change in this sense, but maybe this is a topic for another day, I think, in more detail. This is me thinking out loud as a non-developer, but maybe, I don't know, there might be a shift in using open source versus your own coding assistants in the future, depending on the development of the coding assistants.

 

[Stefan] (21:21 - 22:03)

Interesting. I haven't actually thought about using your local coding assistant. You can probably do it, but as soon as you start bringing home the models to your own machine, then it sets quite a lot of requirements for your machine or your small corporate server farm.

 

It's a discussion we have a lot with some of our customers, because they want to bring AI in-house. But when you start talking to them, what you can actually get is a model that might be, let's say, 25%, maybe even only 10% as good as the online one, because if you go to the big vendors, they run massive data centers with a billion GPUs, and you cannot afford that on a corporate scale. So you'll never get the same good results.

 

[Pinja] (22:03 - 22:35)

That is definitely not something that is going to happen next year. This is one of those things, well, artificial intelligence takes our jobs. Definitely not.

 

Definitely not, since we're only talking about LLMs nowadays, but maybe this is just me hypothesizing. But if we think about how the whole industry positions itself around open source as a last thing today, because open source is considered a core enabler, so it's not just a convenience or a cost reducer, but as we know, it is a must, basically, for companies building software.

 

[Stefan] (22:35 - 24:45)

It's a foundational building block these days. A lot of companies run maybe a Linux server to run their software. Even .NET folk that came out of the whole Microsoft stack are running their software on Linux servers these days. It's just what you use. If we go to a lot of companies, we see Kubernetes running, and Kubernetes is open source. It was created as Borg for running Google, then they open sourced it as Kubernetes.

 

Today, people are running all of their AI workloads on Kubernetes. Open source will still be here, and it's going to be here for a lot of years. Even though we see some tables turning every now and then when commercial offerings are popping up as a branch out of open source, but usually people get quite aggressive when they see somebody trying to branch out of open source and make a commercial offering.

 

Let's see what happens. I think most of the articles I read were like, how can you actually make open source sustainable? They are talking about having a commercial track and an open source track, so you get some maybe richer features, some more usability when you use the commercial option, but you can do everything on your own in the open source version.

 

I know several products that are built like that. You can get your core set up by open source, or you could buy the enriched offering in a commercial manner. That's really going to be interesting.

 

One thing I don't actually know how people fix these days, like when we look at the legislations with critical infrastructure, are you doing something extra to make sure open source is working? I guess not, because you need to know all of your current vulnerabilities in the software. You need to know vulnerabilities in your operating system.

 

It doesn't matter if it's open source or not, you need to be able to patch it. I guess there are two parallel tracks running in the companies where they run critical infrastructure, because some of it might not even have a commercial offering on the side, so you have to use open source. You might be running on something that is maintained by Bill and Bob sitting in a basement in the US somewhere, and what happens if they go outside and a tree falls on top of them?

 

Will the project survive? Will you be able to run your business on it? Well, that's a business decision to take.

 

Do we actually dare doing that?

 

[Pinja] (24:45 - 25:13)

It is a resilience risk, and that is always related to these things. I think it's fair to say that the $8 trillion valuation might not be very far off.

 

Of course, it's an estimate on the valuation. We don't have exact numbers, but if we think of a ballpark, if we think of something that is a core enabler and a strategic part of strategic infrastructure for software at the moment, I think it might be a fair valuation there.

 

[Stefan] (25:14 - 25:42)

I actually think it's a fair number. It's a number people have been looking for for years. When I read the report, you can probably find a lot more details to support the numbers, but they gave it a pretty good effort in figuring out a global estimate for all of this.

 

And I love that they took the demand value and the supply value. If we have to create everything on our own, what are we actually getting from all of the open source stuff we're bringing in? So, a pretty decent report.

 

[Pinja] (25:42 - 25:56)

Yeah. And that's all we have time for today. Maybe we gave you some food for thought on how we position ourselves with open source and how you might be looking at it a little differently next time you pull that package.

 

Stefan, thank you so much for this conversation.

 

[Stefan] (25:56 - 25:57)

Thank you.

 

[Pinja] (25:57 - 26:07)

All right. Thank you, everybody, for joining, and we'll see you in the sauna next time. We'll now tell you a little bit about who we are.

 

[Stefan] (26:07 - 26:12)

I'm Stefan Poulsen. I work as a solution architect with focus on DevOps, platform engineering, and AI.

 

[Pinja] (26:13 - 26:17)

I'm Pinja Kujala. I specialize in agile and portfolio management topics at Eficode.

 

[Stefan] (26:18 - 26:20)

Thanks for tuning in. We'll catch you next time.

 

[Pinja] (26:20 - 26:28)

And remember, if you like what you hear, please like, rate, and subscribe on your favorite podcast platform. It means the world to us.

 
