The ethics of AI: data, privacy, and responsibility

In this episode of DevOps Sauna, Pinja and Darren dive into one of the most pressing topics of our time: the ethics of AI. From copyright infringement and data usage to environmental impact and corporate responsibility, they unravel the hidden costs behind today’s AI boom.
They explore how big tech handles data, the consequences of AI monopolies, and why regulation is a double-edged sword. Most importantly, they discuss what individuals, companies, and the industry at large can do to ensure AI is used ethically and responsibly.
If you’ve ever wondered whether AI can be ethical—or how you can use it more responsibly—this conversation is for you.
[Pinja] (0:06 - 0:13)
Do we have an ethics board? And if not, why don't you have it? If you do, actually, does it work?
[Darren] (0:17 - 0:25)
Welcome to the DevOps Sauna, the podcast where we deep dive into the world of DevOps, platform engineering, security, and more as we explore the future of development.
[Pinja] (0:25 - 0:54)
Join us as we dive into the heart of DevOps, one story at a time. Whether you're a seasoned practitioner or only starting your DevOps journey, we're happy to welcome you into the DevOps Sauna. Hello, and welcome to the DevOps Sauna.
I'm joined by my co-host, Darren. Hey, Darren.
[Darren] (0:55 - 0:56)
Hey, Pinja. How are things going?
[Pinja] (0:56 - 1:02)
Things are quite okay. Summer is almost over. It's the end of August.
It's looking good. How about you?
[Darren] (1:02 - 1:12)
Yeah, pretty good. I think we had this discussion last time of whether August was summer or autumn, and I don't think we've come to an agreement. To me, it's still summer, so.
[Pinja] (1:12 - 1:48)
Yeah, I think to myself, I live with the school cycle. So whenever school starts, it's autumn for me. It doesn't matter.
It's nice and breezy outside. I think it's autumn weather. But regardless of that, let's talk about our favorite subject of this year, AI.
And we have a topic in mind today, which is the ethics of AI. There has been a lot of discussion around this; especially when we go to our customers, we talk about the ethical use of AI and what kind of rules you should have in place. But today we're trying to cover a little bit more about what the large companies are doing, the environmental impact of AI, and the privacy aspects.
[Darren] (1:48 - 2:07)
It comes down to the Arthur Conan Doyle quote from Sherlock Holmes: "Data! Data! Data! I can't make bricks without clay." And I feel like the first thing you should discuss when talking about the ethics of AI is to try and get people to understand the quantities of data that are actually being processed here.
[Pinja] (2:08 - 2:32)
I think it became more clear to me when I heard how many hours it took to build OpenAI's models. Really, what is the amount of data we have floating around, or not so much floating around as being used to train the models? And of course, when we have that much data being processed, we have the data centers that are behind all of our GenAI tools at the moment.
[Darren] (2:32 - 3:21)
Yeah, I'm just thinking the most concrete example we have of that is when Meta was recently in the news for their download of 82 terabytes of books. Now, 82 terabytes is, again, one of those sizes that people don't really understand. I actually just want to do the math on it, just quickly typing the math.
So yeah, we're talking about potentially 80 million books. Well, yeah, I think that's right. But let's say the exact math is not important.
We're looking at 82 terabytes of downloaded books put into their Llama system to train it. This is the size of every library in the world, presumably.
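To put Darren's back-of-the-envelope figure in concrete terms, here is a quick sketch of the math; the roughly one-megabyte-per-book average is our assumption, not a number from the episode:

```python
# How many books fit in 82 terabytes? Assumes ~1 MB per book,
# a rough average for a plain-text or EPUB file (assumption).

TERABYTE = 10**12          # decimal terabyte, in bytes
AVG_BOOK_BYTES = 10**6     # ~1 MB per book (assumed)

total_bytes = 82 * TERABYTE
books = total_bytes // AVG_BOOK_BYTES
print(f"{books:,} books")  # -> 82,000,000 books, close to Darren's 80 million
```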
[Pinja] (3:21 - 3:49)
We thought the Library of Alexandria back in the day was big. This is something completely different. And when we're talking about companies like Meta, we're talking about large companies, and the rules of how to train your, not dragon, but your LLM are not extremely clear at the moment.
And I'm not sure if the rules are actually made for somebody else at the moment. And shall I be a little pointed here and say that rules are for the lesser entities?
[Darren] (3:49 - 4:31)
I think the rules are actually relatively clear, but the large companies have just elected to ignore them. They've just decided that the rules don't apply. Like the 82 terabytes of books that Meta downloaded were downloaded illegally.
And I was actually researching this for a presentation I did before summer, and I came across the case of United States v. Aaron Swartz, who downloaded 80 gigabytes of data from MIT servers, and how he was threatened with a million-dollar fine and, I think, something like 35 years of jail time for downloading one one-thousandth of what Meta have just downloaded. So it's kind of frightening.
[Pinja] (4:32 - 5:08)
The difference here is that Aaron Swartz was a private individual, right? So we're not talking about an entity having these infringements. And OpenAI is being constantly sued by other entities for copyright infringement.
We already covered the case of DeepSeek, who just literally ripped off OpenAI when they were training their model. Do we base everything on plagiarism right now? Because OpenAI got really hurt about this and they were really upset.
And then we don't even know what OpenAI uses to train their own models with.
[Darren] (5:08 - 5:42)
Yeah. With OpenAI, we have a general idea of the model, but it's based on, again, stealing people's work: scraping things like DeviantArt, and that's how it's putting out images.
And then, yeah, DeepSeek: oh no, the American plagiarism machine has been plagiarized by the Chinese plagiarism machine. This is what LLMs are. This is what AI is currently.
They are plagiarism machines. I think the ethical dilemmas are clear and we should acknowledge them. I just don't think it's fair to sweep them under the rug like a lot of people do.
[Pinja] (5:42 - 6:21)
It is clear that the big companies are ruling this game, and even nations such as the US, China, and Russia are getting into it. We briefly talked about this before, about how this is perhaps where the new Cold War is happening at the moment. It's no longer in space as it was before.
And now we actually are talking about who rules the data. And so many people are using LLMs and GenAI in their daily life. So for example, spreading misinformation is a critical thing here.
But if we think of, for example, we already talked about OpenAI, we talked about DeepSeek, and then there's Grok and God knows what is happening with that at the moment.
[Darren] (6:22 - 6:31)
Yeah, the last thing I heard about Grok was it having some new avatar that was deeply disturbing in a number of ways. Let's not go down that road.
[Pinja] (6:31 - 6:45)
Let's not. But if we think of how most of the AI tools are being sold at the moment, then if we come up with an analogy for this, they're just wrappers and frameworks. And this is a way to interact with one of the big vendors, basically, isn't it?
[Darren] (6:45 - 7:20)
Basically, every product. It seems to be very rare that people are applying their own learning. They're just wrapping a prompt around your prompt and passing it to ChatGPT most of the time. That's the most common shape for an AI product.
And I'm not going to say I think there's anything wrong with that. Prompt engineering is a big step forward, and that's going to be ruling the AI space over the next couple of years. So having experts generate these prompts or format your prompts in such a way that you get the output you want, it's a valid use case and it's a valid business case.
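For anyone curious what that wrapper pattern looks like in code, here is a minimal sketch, assuming the OpenAI Python client; the system prompt, model name, and function name are purely illustrative, not taken from any real product:

```python
# A minimal "wrapper product": the only real logic is an expert system
# prompt wrapped around the user's prompt before forwarding it to a
# big vendor's model. Assumes the OpenAI Python client is installed
# and OPENAI_API_KEY is set in the environment.

from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a release-notes assistant. Answer tersely, in bullet "
    "points, and only use the commit messages you are given."
)

def wrapped_product(user_prompt: str) -> str:
    # The entire "product": prepend the expert prompt, pass it on.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```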
[Pinja] (7:20 - 7:58)
It is. And just to be clear, let's have a disclaimer here: Darren and I are not, per se, against AI and technological advancement. But let's talk about the ethics of all of this.
That's the whole point of this discussion. And we might point out here that the initial development of AI tools is not exactly open, and it does not seem extremely ethical at the moment.
We know that all the big players are currently at the center of various legal actions, as mentioned before regarding copyright infringements, with Meta now downloading 82 terabytes of books without actually having the rights to do so.
[Darren] (7:58 - 8:24)
When it comes to the question of whether AI is ethical, my answer leans toward no. But that doesn't mean you can't use AI ethically. That's, I think, the key nuance here.
The problem is no one's having this conversation. No one's talking about ethics. The ethics have been left behind in the need for rapid adoption and the need to stay ahead of a curve that's constantly accelerating.
[Pinja] (8:24 - 8:52)
It is. And as I say, it is possible for a single consumer and user of GenAI tools to use them ethically, even if the methods by which they were trained in the background might not be that ethical.
And the problem at the moment, one of the things that we need to talk about is AI concentration. We have Nvidia, which is now the first company ever to reach $4 trillion in valuation, and they pretty much have a monopoly in this area.
[Darren] (8:53 - 9:25)
I was actually reading about this. We discussed it last week on our news episode. And I was reading how it's because Nvidia relies on the only plants in the world that can make these, uh, single-digit-nanometer circuits.
They're so infinitesimally small that no one else can do it. And it's great for innovation, but I honestly don't know how they haven't been hit by anti-monopoly laws. I don't actually know what laws those would be; maybe the EU could hit them with something, but then it would just mean the EU not getting the cards to keep up with everyone else.
It's an interesting angle.
[Pinja] (9:26 - 9:51)
Yeah. So we're basically in, maybe it's too harsh to use the Cold War reference again, but it's a little bit of a standoff, isn't it? That's why we want to regulate it.
We want to make sure that it's fair, but for example, as you say, if the EU takes any regulatory action against companies like Nvidia, what are we going to get in the end? So we need to make sure that we stay in the game, but at the same time, we need to make sure that things are fair.
[Darren] (9:51 - 10:04)
Yeah. Before we move on to guidance on what we think should be done, I also think there's an aspect of AI that people aren't really considering. Can we talk about the environmental aspects?
Because I don't see that being discussed enough.
[Pinja] (10:04 - 10:37)
No, that is not being discussed enough. There's a funny anecdote from a couple of months ago where OpenAI said that it costs them millions of dollars when people say "please" to ChatGPT. We can talk about the humanization of AI all we want, but at the same time, this cost goes toward the data centers and their energy consumption.
It takes a lot of energy to run the data centers, and it takes a lot of energy to train the models themselves. So we're talking about carbon emissions, we're talking about water usage, and we're also talking about e-waste here.
[Darren] (10:37 - 10:54)
Yeah. And I think it was last week that you mentioned the idea that AI feels weightless because, you know, you just interact with it. It's a chatbot interface.
It's designed to have a conversation. The power going into it is just not discussed. It's not visible or transparent anywhere.
[Pinja] (10:54 - 11:02)
It's the same as with Wi-Fi nowadays, or the 5G network that my phone is using, because it's over the air, right? It's weightless in that sense.
[Darren] (11:02 - 11:03)
Yeah, it's just there.
[Pinja] (11:03 - 11:32)
And yeah, it exists. So what is e-waste, for example? That's one of the questions somebody might be asking.
But there is a lot of waste actually being created, and energy goes into, number one, even building the data centers. Many European countries are getting data centers nowadays, and of course you need to build them.
You perhaps need to clear some forests just to get started. Everything goes into that.
[Darren] (11:32 - 12:23)
Let's say we want to say please. That ups the power requirements. We already know that Microsoft did sign a deal to reopen a nuclear reactor for their AI work in the US.
So it's like, it also doesn't help. There's a bit of shame on my side because of the British government. They recently stated that to save water, people should consider deleting their old emails and photos.
And it was just like, storage doesn't make the difference. It's compute that makes the difference. And the fact that governments don't know that at this point just means that when they do say, oh, we have to be careful about water usage in AI, people are just going to go, you have no idea what you're talking about, and continue to ignore them.
So we have some difficulties in many directions when it comes to environmental aspects. And that's something we discussed with Anne Currie when we had her on. Which city was it? Was it London?
[Pinja] (12:24 - 12:34)
It was London, at The Future of Software conference powered by Eficode in March. We had a panel about the sustainability of AI, and Anne Currie was one of the guests on that panel.
[Darren] (12:35 - 12:52)
And she's one of the few people talking about green technology with any kind of actual understanding at the moment, unlike the governments who just say things for the sake of saying them. But back to our subject: we've said that AI can be used ethically. Shall we discuss how?
[Pinja] (12:52 - 13:13)
Yeah. So what can be done in terms of ethics when we know that the models behind these tools might not be so ethical? We have a couple of aspects we want to discuss here:
personal responsibility, what a company can do if it is using, for example, GenAI development tools, and what the industry should do.
[Darren] (13:13 - 13:28)
Yeah. So we're tackling some big issues. Let's talk about a person first, because it's mostly people listening to this podcast.
So let's hope that they want to do something about responsibly using AI. I know a lot of the people who listen will. So what can we do as a person?
[Pinja] (13:29 - 14:08)
Let's start with the privacy aspects, I guess. This is maybe not discussed enough yet, but we can start with respecting the privacy of various people. It can go quite far: there was an example a couple of months ago where somebody posted online a prompt they had used on ChatGPT, uploading a holiday photo of their kids, faces visible and recognizable, to create a coloring book page.
It sounds very innocent, doesn't it? And the results were kind of cool. You can then give the photo that is now a coloring book page to your kid, but you just provided ChatGPT with a photo of your kid's face.
[Darren] (14:08 - 14:58)
And the same can be said for just about any discussion. If you're asking an AI to summarize emails, you have to understand that you're not getting the person's original intent. You're not respecting their privacy.
And again, there's a lot of this email summarizing. I'm not going to say it's bad. You just have to understand what you're getting from AI and what you're taking away from human interaction when you use it.
If someone sends you six A4 sheets in response to a simple question, feel free to summarize it. But you have to understand that in doing so you are, first off, passing this person's information to an AI model. And I don't know how clear GDPR is on that, because that could be another person's private data.
So maybe a legal expert can weigh in on that.
[Pinja] (14:59 - 15:15)
Yeah, well, if we think of the GDPR and its scope, it applies if there is an identifier for the person. So we're talking about a name, a birthday, hopefully not a social security number. Please don't put social security numbers into ChatGPT. Stuff like that.
A phone number, for a private person.
[Darren] (15:15 - 15:21)
Email. I think even IP addresses have been discussed as personally identifiable. Correct, exactly.
[Pinja] (15:22 - 15:34)
So anything like that. If you do use ChatGPT and other tools to summarize things, please make sure that you do not leave any personal identifiers in the data that you're processing there.
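To make that advice concrete, here is a rough sketch of a first-pass redaction step before text goes to a summarizer. The patterns are our own illustration: regexes catch obvious identifiers like emails, phone numbers, and IP addresses, but names and free-text clues slip through, so this is a starting point, not GDPR compliance by itself:

```python
# Best-effort redaction of common personal identifiers before the
# text is handed to an LLM. A first pass only; human review is still
# needed, since names and context clues are not caught by regexes.

import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),  # IPs can be personal data too
}

def redact(text: str) -> str:
    # Replace each match with a labeled placeholder, e.g. [EMAIL REDACTED].
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

email_body = "Hi, it's Anna (anna@example.com, +358 40 123 4567), about the invoice..."
print(redact(email_body))
# -> Hi, it's Anna ([EMAIL REDACTED], [PHONE REDACTED]), about the invoice...
```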
[Darren] (15:34 - 16:03)
I think we could also talk about the current project being spearheaded from Eficode's side, at least by Henri Terho, which is the MAISA project, a human-AI hybrid project. And honestly, it's a cool project; go look it up. But if we can summarize it in one sentence, I think it needs to be that AI should be used to enhance people and not replace people.
And I think if there's one takeaway from this whole podcast discussion, it should be that sentence, enhance and not replace.
[Pinja] (16:03 - 16:16)
So, talking about replacing people with AI, let's move on to the company side. We had a debate in our show notes here, the outline Darren and I always make, and we were thinking: what can a company do? And I think the note just says that, with a question mark.
[Darren] (16:17 - 17:08)
Yeah. And then there are my notes, which said, I have no idea, but maybe stop being evil, which I'm pretty sure Pinja was hoping I wouldn't say on the recording. But yeah, there's a concept that I think has kind of been lost in 2025 in a lot of cases, which is unfortunate, and that's corporate social responsibility.
And I think companies and corporations need to understand that they are part of an ecosystem. They exist as part of an ecosystem within society. And their goal, obviously, as a business, is to make money, and there's nothing wrong with that.
But the task is to at least be sustainable within that environment and not try to suck out everything you possibly can without putting anything back. And I think that needs to be the focus of a lot of companies pushing AI going forward.
[Pinja] (17:08 - 17:51)
Yeah. And one thing that you, or anyone responsible for these things at any company, can do is to ask: do we have an ethics board? And if not, why not?
If you do, does it actually work? Do you think it's functional? And if not, why?
So we're talking about social responsibility here. And many companies are required to have social responsibility policies, right? So is your use of AI part of that policy nowadays?
Because the use of GenAI tools is so widespread nowadays. You might not even know that your employees are using ChatGPT in the background, even though you're not paying for the licenses as a company. It might be that they're still using it and paying for the license themselves.
[Darren] (17:51 - 18:02)
There are all kinds of attack surface discussions to be had about AI, and I think this corporate ethics is a cornerstone. But okay, finally, what do we think the AI industry can do about this?
[Pinja] (18:03 - 18:29)
Yeah, you can hear us laugh about this. We researched this, and our thinking is: not actually a whole lot at the moment.
If we think of the premise we just built here, of how the LLMs have been trained, one of the prerequisites for AI is currently, unfortunately, plagiarism, which we mentioned before. And I'm not sure there's a solution, at least an easy one, without actually dismantling the whole thing we have built up over the past couple of years.
[Darren] (18:30 - 19:29)
Yeah. And I'm just thinking, there are so many artists who have been materially damaged by AI that they may be seeking reparations based on that damage. But the idea of paying all of these people for their individual contributions seems impossible.
And I'm not saying that because I don't think it should be attempted. I think someone smarter than me on this subject needs to approach it. But it's so difficult.
One of the training grounds for early LLMs was Reddit. As has been pointed out before, Reddit was the ultimate training ground because it had everything split into posts and responses with upvotes and downvotes, so you knew which ones were good and which were bad.
It was cultivated training data. And how do you compensate people for that? You can't really.
But when it comes to ripping off DeviantArt and other image storage places, that's where it becomes a lot more morally questionable. And I don't have an answer there.
[Pinja] (19:29 - 20:00)
There was a trend a couple of months ago where people would go in and ask, hey, could you please recreate my profile picture in the style of Studio Ghibli? We're talking about the Japanese studio that made, for example, the movie My Neighbor Totoro, and they have a very distinguishable art style that they make their movies in. And I think one of the discussions that sparked out of this was:
are you actually allowed to use the model and train it on this kind of art style that somebody created?
[Darren] (20:00 - 20:17)
But the problem is the discussion is happening too late, because people are already using the model for that. The discussion should have happened at the inception of AI, but no brakes were put on. Everyone decided full speed ahead.
And now copyright doesn't really mean much anymore. That's not legal advice, by the way.
[Pinja] (20:18 - 20:46)
We have a lot of disclaimers in this episode, by the way. But one thing we can discuss, or maybe ask somebody who knows more about this, is whether the model sizes could be a little smaller, so that they would not use as much energy.
And maybe look into, for example, if you're using ChatGPT or any other LLM in your daily life: do you actually need the largest model to tell you which lunch place to go to?
[Darren] (20:49 - 21:23)
Or even restrictions not on model size, but on the capacity used. I've actually been playing around with these auto-subtitling models, and the smaller models do reasonably well. But then you get up into these massive models, and even with a pretty powerful PC, they'll be working for days.
And it's like, do I really need that? I can just go through and edit the subtitles manually in the meantime. It's relatively simple.
So, yeah, I think occasionally we're using bazookas when we could just kick down a door.
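One way to act on this right-sizing idea is a simple router that sends easy prompts to a small model and reserves the large one for genuinely hard tasks. This is a sketch under assumptions: the model names are just examples of a small/large tier, and a real router would use a classifier rather than this deliberately crude length heuristic:

```python
# Route prompts by rough complexity: small model by default, large
# model only when the prompt looks genuinely hard. Assumes the OpenAI
# Python client; the model names and threshold are illustrative.

from openai import OpenAI

client = OpenAI()

SMALL_MODEL = "gpt-4o-mini"  # cheaper, less energy per query
LARGE_MODEL = "gpt-4o"       # reserved for hard problems

def ask(prompt: str) -> str:
    # Crude heuristic: very long or many-line prompts go to the big model.
    hard = len(prompt) > 2000 or prompt.count("\n") > 20
    model = LARGE_MODEL if hard else SMALL_MODEL
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Which of these three lunch places should I pick today?"))
```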
[Pinja] (21:23 - 21:53)
One aspect is also the energy source being used for the data centers, and how competitive the prices for renewable energy are in the countries where the data centers sit. That's maybe not for the AI industry but for the governments to decide and think about: if we make it more lucrative to use renewable energy sources to power the data centers, that might actually help with some of the environmental impact coming from them.
[Darren] (21:53 - 21:56)
Or maybe not make it more lucrative, but mandate it.
[Pinja] (21:56 - 21:57)
Even better.
[Darren] (21:57 - 22:04)
So that if you are building AI, just require the energy to be renewable. I think that would be a fair step too.
[Pinja] (22:04 - 22:33)
And that actually gets us to the next point, one of the last ones we have here: what the industry, and maybe the governments, can do about the regulation we already discussed briefly. It's a kind of double-edged sword, and we have to tread very lightly and carefully here, because, as mentioned, we want some kind of controlled manner in how the AI industry works, but at the same time, do we want to put a whole continent, for example Europe, behind in the development of AI tools?
[Darren] (22:33 - 22:46)
I think at this point... We've been kind of negative for quite a bit of this discussion, for obvious reasons, and I'm not sure we should end it there, because I don't think either of us is saying don't use AI.
[Pinja] (22:46 - 23:05)
Absolutely not. And maybe the last point we want to make is that we as users have to be morally responsible when we use AI tools, and keep morals and ethics in mind when we use them. All right.
On that note, I think that's all the time we have for today. Thank you for joining us in the DevOps Sauna. Thank you, Darren, for joining me.
[Darren] (23:05 - 23:06)
It's a pleasure, as always.
[Pinja] (23:06 - 23:09)
All right. Thank you, everybody. It was fun.
See you next time.
[Darren] (23:12 - 23:15)
We'll now tell you a little bit about who we are.
[Pinja] (23:15 - 23:20)
I'm Pinja Kujala. I specialize in agile and portfolio management topics at Eficode.
[Darren] (23:20 - 23:23)
I'm Darren Richardson, security consultant at Eficode.
[Pinja] (23:23 - 23:25)
Thanks for tuning in. We'll catch you next time.
[Darren] (23:25 - 23:31)
And remember, if you like what you hear, please like, rate, and subscribe on your favorite podcast platform. It means the world to us.