Responsible innovation in AI: How you can shape the future | Hugging Face
In this talk, Emily Witko, Head of Culture at Hugging Face, discusses the importance of responsible innovation in AI. The presentation explores how ethical, inclusive, and collaborative approaches—spanning engineers, ethicists, policymakers, and designers—shape the future of AI. Through real-world examples, Emily highlights the risks of overlooking non-technical perspectives and provides practical strategies to promote accessibility, challenge assumptions, and connect technology with human values.
Transcript
I'm Emily. Thank you to our friends at Eficode for putting together this beautiful day today. I am excited to talk to you about responsible innovation in AI. I also want to point out that while I do focus on AI, a lot of the information can be applied to many different types of innovation.

I'm curious, before we start: how many folks in this room identify as non-engineers? Do we have any non-software developers in the room? Okay, that was kind of interesting, because people were waiting to look around first [Laughs], and only then identify. Good. I'm excited you're here. I am also not an engineer. Excited to raise my hand there. I'll tell you a little bit more about my background in a few slides. Hopefully everyone can get something out of this talk.

I like to think about responsible AI and responsible innovation as something much larger than a technical challenge. We all can play a hand in responsible innovation. It's a shared project. We have our engineers involved. We have the policy team involved, ethics teams, on and on and on. We're all here to do a good job and shape the future of AI. This is essentially what we'll talk about today. I want to define responsibility and then talk you through my five pillars. I've been trying for weeks to come up with a better word than pillar, because I felt that sounds a little hoity-toity. If anyone has any advice after the talk, I'd love to hear it. Then we'll share some examples of folks that maybe didn't do a great job, and some folks that did.

Just generally, you know what I look like, but I like to share a photo of myself because then you get to see my chicken. This is Francis. I am a non-engineer, as I mentioned. I have worked at several startups across several different industries. The thread throughout all of that is that I am people-obsessed. I got my start at Hugging Face working very specifically on diversity, inclusion, equity and belonging. Anyone who works in the AI or machine learning space knows that this is quite a challenge. My role has since expanded to be much more of an employee experience, culture, recruiting type role. I get to give talks like this to awesome people like you, which is exciting.

In case anybody is not familiar with Hugging Face, this is our logo. You will now never forget that's the name of this emoji. We are an open source platform where people build and share AI tools together. Are folks familiar with Hugging Face? Have people used it in the past? I see some shaking and some hands raising. Great. I swear I will not focus all of my examples on Hugging Face. But we will pop up every now and again, because we are doing a good job at some things, so I would like to share that. [Laughs]

I also want to point out, before we begin, that Hugging Face is fully decentralized. Frankly, when I started at Hugging Face, I didn't have a great understanding of what that actually meant. But in short, it means that every employee at the company is a leader. Our CEO does not set goals for the company that then cascade down through each of the teams. It's actually sort of flipped on its head. Each of our employees is responsible for setting their own goals, and often for coming up with their own projects that they believe will be impactful. Our CEO's job is to sell the company. That's important to point out as we get into some of the examples.
We will see that moving forward, because decisions, contributions and innovation really happen at a faster pace across this global community, because there's no single leadership chain.

So, responsibility. I was hoping to be able to give you a quick, easy definition, but I can't. [Laughs] We're going to talk about the five things that make up responsible innovation.

The first one is ethics. The question of ethics is not what we can do with this technology. It's what we should do with this technology. As we've seen over the last ten years in our own lived experiences, technology moves faster than the law. Ethics helps to fill that gap while regulation catches up. I'm going to spend more time on this bias bullet point in a couple of slides, but often people talk about how bias is a data problem. I want to expand that. While that is absolutely true, it's also a moral one, because people make decisions about who gets represented in the data and who gets excluded or harmed. And non-engineers bring perspectives like psychology, policy and design to responsible innovation that help keep AI aligned with our human needs. We need AI literacy in order to make sure that happens.

I want to ground each of my pillars (again, think of a better word) in an academic or grounding principle. The first one I'll start with is one of the first papers that came out really talking about what responsible AI could look like. Floridi and Cowls, in 2019, laid out five principles of AI ethics. It's important to go through them fairly quickly. I'm throwing a bunch of words at you, but I think some of these should be pretty straightforward. I swear I didn't just copy my five pillars off of the fact that they used five principles. [Laughs]

The first one is beneficence, which has been difficult for me to pronounce. Did I say that properly, beneficence? Thank you. I see people shaking their heads. AI should essentially do good, right? We're trying to do good with our innovation. It should also not do bad; that's non-maleficence, and I think that's the correct way to say it. These are fairly straightforward. Then there's autonomy: we should be supporting human agency, not working to replace it. Then we also have justice: we promote fairness and avoid discrimination. And then explicability: we want to be transparent about what we've created, how and why we've created it, and all of those things.

I want to take a step back to the non-maleficence piece quickly and point out that there is an environmental aspect here that I'm sure everybody in this room is aware of. Sustainability falls under ethical responsibility. And I want to include a nod to my climate researchers and the folks who are focused on the impacts of AI on our environment. I am not an expert in this space and I won't touch on it much more for the rest of this conversation, but I want to make sure to acknowledge that this is a huge issue. Doing good means designing AI that's not just good for people; it's also good for the planet. Thank you to folks who are focused in that space.

Okay, let's talk about ethics going wrong. I don't know if folks remember DHH, David Heinemeier Hansson, and Steve Wozniak, who helped create Apple. They had this conversation about an Apple Card that they both applied for back in 2019. I have an Apple Card. I'm sure people in this room have Apple Cards. Back when these two men signed up for an Apple Card and also signed their wives up, they got much, much higher credit limits than their wives did.
They also have giant social media followings. And so people noticed, and people noticed pretty quickly. Unfortunately, the way that Apple responded was to say: it's not our fault. Goldman Sachs issues the cards. It's not us. We don't know what's going on. Their response was pretty limited and also confusing, because they didn't take any responsibility. The card was branded as theirs, but they completely deflected, and they drew a lot of criticism for it.

The New York State Department of Financial Services actually investigated Goldman Sachs and the Apple Card. Eventually they decided that gender wasn't an explicit input in the data. The whole thing was incredibly opaque; they weren't answering any questions about it, all of those things. But they could tell that the "algorithm" was encoding gender biases through proxies like income patterns, employment histories and spending behaviours, which all led to disparate outcomes. Imagine your wife goes on parental leave for a year: spending habits change, income changes. All of a sudden her credit limit is much lower, even though you didn't encode gender into your data set. If no one can explain how a decision that harms people came to be, that's not good. That's maleficence.

In the end, no charges were pressed against Apple or Goldman Sachs. They added manual review processes for contested cases. As that tweet showed, maybe it's not included there, but essentially people tried to get customer service and didn't get any. They increased the transparency of their credit decisioning. They said that they conducted internal reviews. Not great.

I said I'd circle back around to data a little bit. It's a human problem, not just an engineering problem. People think if we fix the data set, we'll fix the bias. But bias exists because we humans decide what to collect. We decide what labels to apply, what counts as normal, and who is considered the user. Bias isn't just about messy or missing data. It's about who shaped the data in the first place.

Let's talk about some data biases quickly. They go in order, from the creation of data to when it's completed. It starts with intent: what problem are we solving with our data? Essentially you have to ask, who is this data accurate for? Or, whose values is this data in line with? Then when we talk about representation, we talk about who's visible and who's invisible in our data. Lastly, we talk about the impact. Once we've answered those questions, we should know who is being impacted. What are the real-world impacts, like the Apple Card example? Folks were getting much lower credit limits. People are denied loans. People get misidentified. All of those big, scary topics that we talk about.

So, Hugging Face: what do we do well? We talk a lot about making sure that everything has a bias. As we know, we're all human, we all have some biases. We use model cards, and I'll show you an example in a second, to bring those biases to the surface, so that we're not hiding behind opaque data. We also have inclusive data sets that align with some of the five principles from Floridi and Cowls, to reduce some of the bias toward the Western English language, the kind I speak. We also focus on smaller models and things that hopefully will bring down the environmental cost of AI.

If anyone is familiar with Bloom, it was a project we started back in 2021 with thousands of volunteer researchers around the world. It is a large language model that can output text in 46 languages and 13 programming languages.
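Aside for readers: below is a minimal sketch, not part of the talk, of how you could pull up a model card like Bloom's programmatically with the huggingface_hub Python library and check its stated license, languages and free-text limitations before using the model. The repo id and the fields printed here are illustrative assumptions.

```python
# Minimal sketch: read a model card before using the model.
# Assumes `huggingface_hub` is installed (pip install huggingface_hub)
# and that "bigscience/bloom" is the repo hosting BLOOM's card.
from huggingface_hub import ModelCard

card = ModelCard.load("bigscience/bloom")

# The card's YAML header is parsed into structured metadata
# (license, languages, tags, and so on).
metadata = card.data.to_dict()
print("License:", metadata.get("license"))
print("Languages:", metadata.get("language"))

# The free-text body holds the human-written sections: who created
# and funded the model, intended and out-of-scope uses, risks and limitations.
body = card.text.lower()
for section in ("intended use", "risks and limitations"):
    if section in body:
        print(f"The card has a section on '{section}'; read it before deploying.")
```

The point of the sketch is simply that the context about who built a model, what it is for and where it breaks travels with the model itself, one function call away.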
It was the BigScience collaboration, a fully, fully global project. It was essentially the first multi-language, open-source large language model built in public, for people. The examples here are directly from its model card on the Hugging Face website. You can see we have pulled out information that you can read even before you start using it. You know who created it and who funded it, which is also very important. Then you can look behind the scenes, check out the infrastructure, and learn a little more about the training data. You can also see how we recommend that you use the model, as well as ways we recommend you maybe don't use the model, where it might not be applicable. Then we also highlight some risks and limitations. That list is longer, but I just included a couple there.

I pulled out a few of these examples because I think it's important to show that, hopefully no offense to anyone sitting in this room, these model cards weren't written by engineers. It isn't the engineer's job. Of course they were helped along by the engineers, but we need other folks to translate the content into understandable, digestible and applicable guidelines on the model cards. That brings us to how ethics doesn't just live in the engineering department; it's shaped by lots of people. The UX researcher makes people visible. The policy advisor, like I just said, turns policy into practice. The people who do the hiring are creating teams that have certain values. Communications decides how we talk about our innovations.

Moving on to the next pillar now. If ethics is about intent, governance is about accountability. Again: who decides, and how? Essentially, governance is the system of rules, roles and accountability that ensures AI is used responsibly. Accountability means someone must own the decisions, like Apple didn't, not just the algorithm. Rules clarify how the models are trained, evaluated and used. Then there's decentralized governance, like at Hugging Face. It means sharing decision power within the community, so we end up with an equity of voice. Open source, in general, is a good framework for good governance, because there's transparency, accountability and participation. That's straightforward if folks have been involved in open source projects in the past. People can see how things are made, they hold each other accountable for what they change, and then they actively take part in what's coming next. Like we do. [Laughs]

Are there any Americans in the room? I'm curious. A couple of hands. The Americans may be familiar with this example. This is pretty recent. In 2024, a couple of changes were made to legal aid in New York City. The government decided to launch a chatbot without doing a lot of what we talked about already. Then there's the mayor of New York City, who's essentially a joke. If anyone is familiar with Eric Adams, I would recommend looking up one or two things about him. When reporters asked about it, he essentially doubled down. He said mistakes are just what happens when you innovate. They kept this up. Just a couple of these examples make it really, really clear who we are discriminating against. Who is invisible in these examples? We're taking away tips, which in the United States is what gives servers, waitresses and baristas a livable wage. I was a bartender for many years. I made four dollars an hour on my paycheck. The top two are even more egregious. Section 8 refers to people who get rental assistance from the government.
These examples are saying you can actually turn people away who are getting assistance in order to have a home. But again, it's just what happens when you innovate. Here are a couple of examples of how good governance happens every day, including how it didn't happen with that chatbot. We need legal folks to review the answers that a chatbot might give, which clearly did not happen in the New York example. We need folks who understand the policy to review what the output might be. We need to build systems for how we might fix things if something goes wrong. Community managers and customer experience folks can give feedback quickly when something goes wrong, turn it off for a minute, fix it and bring it back online. The chatbot is still active to this day in that form, in case anyone wants to give it a Google.

Even if we have good governance, risk still shows up. That brings us to the third pillar. We need to anticipate harm before it happens. I wrote a note about this; I'm not sure where I put it. I was at dinner last night with some folks from Eficode and some of the speakers, which was lovely. We were laughing because one of the conversations turned to risk management. We might have been bouncing off the AWS outage a little bit. We were talking about how hope is not a strong pillar of risk management. It made me laugh, because we've probably all worked for companies where we were just hopeful that everything would be okay. [Laughs]

Let's talk a little more intensively about what risk is: what could go wrong, and who bears the cost? I just saw a sign of the cross happen in the audience. Risk management in AI mitigates potential harms, uncertainties and unintended consequences. It's about anticipating outcomes, not just responding after the fact. Anticipation is cheaper, if you need something to tell your boss that might help. Especially cheaper than a bias scandal, for example. Different risks require different responses, and because you don't want to lose the public's trust, risk is super important.

The European Union has a risk pyramid where they've assigned different levels of risk. At the bottom, there are usually things like customer service chatbots, or places where you have easy human intervention, which are essentially permitted with no restrictions. As you go up the pyramid, you can see that they get more and more restrictive. Very quickly: the top, unacceptable risk, generally boils down to AI systems that do social scoring. I'm not sure how familiar everybody in the room is with that term, but it's essentially assigning people a number based on what the system defines your trustworthiness as: your behavior, your purchases, your social activity. It's usually used to grant or restrict access. It is a pretty dangerous topic of conversation, and the EU has decided that it is unacceptable. The high-risk category covers some of the things that we've talked about already, for example credit limits. It's where things can significantly affect your safety or your rights. Algorithms for hiring or college admissions, anything that can really impact your life, fall into that category.

I'll give a quick example from the United States. I'm embarrassed, and everybody knows our incarceration rate is very high. This is a picture of two inmates who I believe were in South Carolina. Bernard Parker is on the left and Dylan Fugett is on the right. Some states in the United States have used, and still use, a program called COMPAS.
Just looking at this photo, you can maybe guess the topic I'm about to talk about. COMPAS is the Correctional Offender Management Profiling for Alternative Sanctions. I know we love our acronyms in engineering. That's a bad one; it did not need to be shoehorned in there. Essentially it was used to predict how likely somebody is to reoffend. I think we lost a little of this slide in the Google to PowerPoint transition. What it essentially says is that Bernard Parker, on the left, was assigned a number that made it look like he was very likely to reoffend, and he did not get parole. When he did get parole several years later, he, to this day, has not reoffended. The opposite was true for our friend Dylan on the right, who was paroled and immediately reoffended. What we found was that this was the pattern. White folks were labeled as low risk. Non-white folks were labeled as high risk. Probably not hugely surprising, but obviously hugely problematic. And it was opaque. You couldn't argue with the data because you couldn't see it.

Who's doing it a little better? Hugging Face, sometimes. [Laughs] We do risk management through openness. I'm harping on this point a little bit: we treat risk as a feedback loop. Everything is open and accessible. Even our internal communications are fully open on Slack. It's not that direct messages are closed off, but by company policy you're not supposed to direct message anyone; all of our conversations are meant to happen in public channels. That way, anyone can chime in about something they think is important. We have this very open culture in order to mitigate risk. If you in the back corner are going to recognize something that I'm not, we want to hear about it, early and quickly.

We'll talk more about OpenAI later, its good and bad sides. Lilian Weng, if folks aren't familiar with her, you should look her up. She's awesome. She's the head of their preparedness team. She came up with some of the earliest policies and procedures at OpenAI to mitigate risk. They actually have a centralized risk governance strategy, which is the opposite of Hugging Face's, but they still do really good work. Both make risk everyone's responsibility, whether that's the community flagging issues or a dedicated team that tests edge cases. Both strategies are concerned with risk, which I think is great.

Again, who do we think is on this slide, besides engineers, obviously? We've got legal, always very important. We've also got design. Whenever I think about design, and how designers test how real people use and misuse systems, does this pop into anybody's head? It's the video of the woman who was QAing a fake program with that kid's toy, putting the blocks in the different shapes. She said, okay, that's a square, put it in the square, and it went in the circle instead, on and on and on. We need real people testing the systems, because they're going to misuse them. Diversity folks, or folks who hadn't even thought about diversity, can surface patterns of inequity and disproportionate impact. Super important. Customer support folks can spot early warning signs and mitigate some of this harm in the wild.

I mentioned earlier that I'm a people person. I'm going to try to not spend too long on this section, but my next pillar is essentially human impact. So, it's my favorite. We'll try to get through it quickly, because responsible innovation exists through all of us every day.
I'm going to start with a really silly example. It is: how many chicken nuggets do you think you could eat? Stop! [Women on the video laugh] [Emily chuckles] My friend in the sound booth asked if I wanted to replay that video over and over. I said I thought once was enough. What I really like about it is that a couple of years ago, and it may have happened here too, McDonald's tried to take away their drive-through people and replace them with an audio AI system. This group of people, and it's just their laughs, we've all laughed like that, it's just uncontrollable and you could feel it: they could not get the machine to stop adding chicken nuggets to their order. [Laughs] By the time a human showed up, they had 18,000 chicken nuggets in their order. [Laughs] I love the humanity that it shows. Clearly there's a disconnect between the system and the human experience. It's joyful, which I really like, and it's a good example of how these systems exist in real life.

When we talk about human impact, the questions are: who is affected, and how do we repair harm? I'll talk about repairing harm a lot. We've talked about most of these bullet points already. The question, somewhat similar to ethics, is not just can we build it, but does this contribute to the world that we actually want to live in? Do I want to eat 1,800 chicken nuggets today? Sort of unclear. I could probably do 1,400.

We'll talk about accountability. The AI Now Institute wrote this paper about accountability in practice. It's maybe not worth going into too much detail here, but if folks are interested, I would recommend looking it up. It essentially challenges the idea that we can fix systems with technical audits alone. We need more of a structural change. We can't just decide that something has bias; we have to do something about that bias. It needs to be built into laws and have enforceable consequences. The core idea is that responsibility is not about patching bias after the fact. It's about designing systems where, if there's bias, if there's harm created, there are real consequences.

Are folks familiar with the concept of restorative justice at all? It's really fascinating. While I was coming up with the next couple of slides, I realized that I should maybe focus my next talk on restorative AI ethics and restorative justice. It's really, truly wonderful. I recommend digging in a little deeper if this is something that you're excited about. As a quick personal restorative justice example: I was working at a company back in 2020 in the U.S., at the height of our Black Lives Matter protests. It was a highly politically charged environment. The company I worked for decided to have some open conversations about what was happening in the world, which I don't think is a bad idea. I'm just not sure they did it very well. There was no moderator, and the conversation was fully open. That meant one of the conversations got fairly heated, as you might expect. A colleague said something inappropriate and was immediately terminated. I swear, within the hour, that colleague was terminated. Many of us were really disappointed, not because we agreed with what he said, but because there was no attempt at a conversation, no attempt at repair. The most dangerous thing about that is that all of a sudden the thing that he said could become even truer to him. Obviously it's like: I'm going to get fired for saying this? Then of course I'm right.
That moment really stayed with me, because it showed how quickly we move to punishment instead of understanding or conversation. Sometimes that happens in AI too. Instead of focusing on compliance or crisis management, we should be talking about how to repair relationships and rebuild trust. I think it's a really interesting topic of conversation. Here are a couple of examples of people doing some restorative AI work. This paper is one that I fully endorse. If you're interested, I would recommend reading it. It's pretty new.

I want to quickly shout out Denmark. You have a law proposed here which I think is actually pretty cool. I don't know exactly how it will work, so don't ask me. Individuals would be given essentially copyright-like ownership over their likeness, voice and image. If someone uses it inappropriately, you have actual legal recourse, whether that means getting it removed or getting paid a little. It's kind of neat, because Denmark is not just saying, let's stop deepfakes. Denmark is saying: it's your voice, your likeness, your humanity, and it's being used without your consent. You have the right to reclaim it. It's actually a very human-centric law. It's a shift from prevention to repair. We know that this will happen; how do we restore things when it does? I'll say it again, because I think it's important: it acknowledges that your identity is not just data. Your identity is you. It's your personhood. I will be curious to see how that plays out here in Denmark.

This is a good segue into where it can go wrong. We're all familiar with the fact that sometimes models can suck up artistic work, written work, things whose creators have not provided their consent. Then all of a sudden we see these models recreate some very distinctive styles. That's not a great situation for an artist whose livelihood this is. Consent and credit are core to being creative, and obviously this is where trust might break down for creative folks. The other example of removing humanity that I have is about the gig economy. Uber Eats, or I guess any sort of rideshare app, all have these closed systems where they assign work to people based on data we have no visibility into. We're literally just making guesses. Sometimes it has to do with people taking breaks. Sometimes it has to do with how fast they do deliveries. What we know is that there's no human manager. It's essentially an algorithm that's deciding what work you get and why you get that work. It makes the human part of it invisible. Relying on an algorithm that no one can see to decide whether I get to make money is not caring for people.

Here are a couple of very quick examples of things going right, except that I do have to call out that Spawning is currently offline, I apologize. Their tool became so popular so quickly that they had to pull it down, because it was overrunning their servers. Essentially, they had a tool that asked: have I been trained? Artists could put in their artwork and see if it had actually been sucked up into any big training data sets. Hopefully it will be back very soon. Mozilla does really good work in this space. They've asked for volunteers around the world to represent voices that we don't normally see in AI models: voices, accents, those sorts of things. AI that learns from the people. There's also the Partnership on AI, which is working to protect the folks who are unseen in building these systems.
Data labelers, content moderators: the folks who get excluded, or who don't get access to psychological support even though they're looking at some traumatic material. All of these initiatives center human rights. Just like always, I've spent too much time on the human impact section. It should be clear to you by now that we are looking at more than engineers.

The last bit is openness. I will bring us to the OECD core principles, which fully promote transparency and accessibility in AI systems. If you are doing open source work, you are essentially already being transparent, which is great. UNESCO said something similar when it defined AI literacy as a public good. It is really fascinating, because if you don't understand AI, you're not going to have a voice in it. I'm sure we've all had conversations with people in our lives where you tell them you work in AI, and the reaction is a surprised 'oh'. AI is going to ruin the world, because that's the only conversation they've ever had about it. The more we can increase AI literacy, the more we can achieve responsible innovation.

OpenAI, I said we'd get back to it. Not open anymore, right? They used to be open source. They realized they could make a little more money if they closed things down. You know, they're doing great work. It's just not transparent anymore. Then LAION ran into a problem with having open data. I've been advocating for open data the whole time, but their data set contained copyrighted images. It even contained people's medical images, which was crazy. We need consent in order for this to work.

We are an open source company. We provide transparency, participatory development, community literacy and education. Anyone doing open source work, you are also working toward this goal, and I really thank you for it. I do think it's really important. The thing that we're not doing well, I said I would get to it, is that this is our website. What is this? I told my dad I was going to go work for a place called Hugging Face and told him to go to the website. This is what my dad sees. He's a smart guy. He's like, all right, I hope they pay you. [Laughs] This is the bottom of the website. If you are not an engineer, or even if you are, this is gobbledygook. We are not making ourselves available to people who would be interested in doing good work, simply because this is an opaque and meaningless website. You can tell my boss I said that. Again, a lot of this openness section is actually focused on AI literacy, because I'm mad about our website. [Laughs]

We need people who can translate very technical terms and details into human language. That's sort of the bottom line. Responsible AI is cultural. We all have a piece, a part to play. That's how we keep the future human. If you remember nothing else from today, remember that it's not about perfection. I hope you can take bits and pieces from what we talked about and apply them in your work, because it's just about being a participant. It's just about being involved. Thank you.