Demystifying AI or: How I learned to stop worrying and love the LLM | Henri Terho
At The Future of Software conference in London, Henri Terho surveyed what is currently happening in the AI field and showed how to peek behind the curtain: what is actually going on, where AI fits, and where it does not. In his talk, Henri notes how AI has changed since its inception and how organizations today use it in their day-to-day operations. About the speaker: Henri Terho is a Senior AI Consultant at Eficode with broad experience spanning regulated industries such as automotive, banking, and aerospace, alongside a deep commitment to open-source collaboration. He has played a key role in fostering community-driven innovation, having served as chairman of the Tampere Entrepreneurship Society and co-founded Tampere Tribe to support local startup culture.
Transcript
Hello, everybody. I'm going to talk a little bit about demystifying AI and what it actually is, and, with the nice Dr Strangelove quote, how I learned to stop worrying and love the LLM. All of us are talking about AI and LLMs and all of that, so I'm going to open up that box a little bit. Warning: there will be a little bit of math, and there will be a test at the end, of course, as there is for any university lecture.

So, a little bit of what we're going to do: who am I, a little bit about AI, a little bit about the types of AI and machine learning and how they actually work, and what all of this means to you.

Going into who am I: I'm actually a biologist, not an IT guy. I jumped from the field of biology to IT because I was sitting in a closet at the university doing data cleaning, data processing and all of this stuff, and I decided that nobody was ever going to pay me enough to do this data janitor work. And here I am, still doing the same stuff 15 years later, basically in the IT sector, just not only on the biology side. I've worked a lot with AI tools and around AI and testing. And I think one key point with AI now is that because we can generate so much code, testing is actually going to become the number one thing. It has long been the second thing, the first one we take savings from: hey, let's not test as much. But now that it's so easy to generate code, testing is going to matter a lot.

And now we can get to the AI part of it as well. AI everywhere. Everybody's talking about it, every single moment. How did we get here? How is everything going to change? I don't think everything is suddenly going to change; this change has already been going on for the last 60 years, maybe. And I think this is one of the points I'm trying to make. This is pre-packaged food from Finland. It's leverlåda, liver casserole, pretty much. This was my first AI project, 15 years ago: building AI-optimised recipes for creating liver casserole at scale in these huge factories. Fifteen years ago, and exactly the same kinds of problems with neural nets, optimising recipes and all of that. So this is not new. It has been going on for a long time, but now it feels like it's all exploding right now.

And as I said, neural networks are 70 years old. We invented all of this technology 70 years ago. Why is it happening now? We're using technology that is really old, and everything is changing at a really fast pace right now. And I think it's because, at the base of it, it's just math. How these things work is not really anything super complex, or let's say complicated, but it gets complex because of the size of these things. All of this evolution that's been happening with the math and the vectors, that's all it is: matrix multiplication. AI does nothing else than matrix multiplication. That's it. And for some reason, that gives us really good and powerful results on predicting text, predicting recipes, predicting a lot of other things, just doing this. And this is one of the key points I want to make: it's just math, and it just works. But you have to understand that it's just math. It doesn't do any kind of reasoning, it doesn't think, it doesn't do anything like that.
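A toy sketch of what "it's just matrix multiplication" means in practice: a single neural-network layer is a matrix multiply, a bias, and a simple non-linearity, repeated many times. All the numbers and sizes here are invented purely for illustration.

```python
import numpy as np

# One "layer" of a neural network: multiply by a weight matrix, add a bias,
# apply a simple non-linearity. Deep learning is just many of these stacked.
rng = np.random.default_rng(0)

x = rng.normal(size=4)          # an input vector (e.g. features of a recipe)
W = rng.normal(size=(3, 4))     # learned weights
b = rng.normal(size=3)          # learned biases

hidden = np.maximum(0, W @ x + b)   # matrix multiply, add bias, ReLU
print(hidden)
```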
And with these caveats, the reason why we think it's exploding now is that we, as humans, are really bad at forecasting exponential change. Most of what happens in society is exponential, but when we think about, oh, when is AI going to change our way of life, we think in a linear way: maybe in 10 years. When is fusion power going to come? In the next 15 years. When is something happening? We always think about it in a linear way. But suddenly it explodes, and we get all the benefits of it. That's because the change is typically exponential: you build on top of change, and build on top of that, and keep going.

And this is also reflected in a lot of the technology behind AI. These charts are from GitHub, showing different libraries. This is what Kelsey was talking about a little bit in the morning: how many stars his libraries have gotten. Yeah, I got 4,000 stars. This is the history of many of the libraries used in machine learning and AI, and you can see the same kind of exponential growth here. Looking at some of the earliest ones, like MongoDB databases used for AI: slow growth, also a little bit exponential. Then you jump into Spark, you jump into all of these libraries, and you see that they're growing faster and faster. And then we get to Transformers. This is what basically all of our AI is now based on, this kind of architecture and these libraries. And it's just skyrocketing, because it's built on top of all of these other changes as well.

And this is the architecture. This is what all of our AI runs on: this stack of boxes. And it's been working like this for 60 years. I think the biggest reason this is happening now is a soup of things. More compute: we built more and more computers, Nvidia is selling like hotcakes, pretty much sold out for two or three years on everything, and everybody wants more compute. Then the architecture: we invented things like Kubernetes, technologies that let us build at scale, take something off the shelf and suddenly deploy it to a thousand machines. Then the open source community building on top of other ideas and sharing them fast. And boring, continuous work since the 60s.

And we've ended up here, in this very Google-esque, one-text-box user interface. We keep coming back to the basics of having just one text box as the user interface, and then it just works: asking questions about what's going on, what's going to happen, what we're actually going to get. Help me study, tell me a fun fact. A very simple user experience. It's not like some big company's boxed product with millions of different views. Very simple. And it's all because of the LLM in the background.

And you can get a lot of these LLMs. There's a lot of discussion now: can I use DeepSeek, can I use Azure, can I use OpenAI's ChatGPT? Who can I use? What are the limitations there? Which is best? I don't think the model matters that much, because all the models are already commoditised. There are communities where you can swap between multiple different models interchangeably. What does matter is the infrastructure around those models. And this is something we need to take into consideration when thinking about where you actually run your AI. Are you taking the whole application from OpenAI? Are you taking the whole application from Azure?
Are you taking the whole application from DeepSeek in China? Because all of that data is going to go somewhere, and your data lives somewhere. And data is, at its core, how we trained all these AI models.

A little bit of history. Most of the data used to train the English-speaking LLMs is pretty much stolen from the internet, Reddit being, I think, the number one source of most of the natural language data they got. If you think about it, Reddit data is pretty much prime material for training AI. You have contextual data: all of the discussions relate to some topic, here some gaming topic or something else, I don't even remember, hopefully it's not too offensive. Then you get plus or minus context: upvotes and downvotes. Is this relevant to the topic? And a huge amount of textual data. This is also the number one reason why Reddit realised they'd been giving this away for free. Their APIs were open until, I think, two years ago, when they realised, hey, we can actually get a lot of money by just selling access to all of this conversational data, and they closed it down. But all of this is based on a lot of this open source, well, not open source, but open data that's been collected from multiple places. And this basically tells you what the LLM has learned. So we basically trained all of your businesses, those of you using OpenAI's ChatGPT, on Reddit data. That might explain some of the curiosities you get from it.

And I'm going to talk a little bit about the differences, to go a bit deeper into the tech side: what is supervised learning and what is unsupervised learning? This is something that has happened in the math field and in a lot of other fields as well. The question here is: are we teaching the AI model what we want it to predict, or how we understand the world? Classical models typically do the first one, supervised learning. We tell it that this point in the data is an accident: here you can see a traffic accident happening. With these new AI models, you just give the model a huge amount of data, and you don't predict the accidents. You predict the traffic, and from there you can figure out the patterns, where the risks are, where you typically have more accidents on average. But you cannot predict a single accident from all of that data. I'm jumping around a little bit here, but I want to make these points so that when you use an AI, you understand that it doesn't think. It has just been taught different things, it infers from those, and you always get the best average result. Which just happens to be quite good, which maybe tells something about our businesses and the way we do anything, if the average result is quite good.

And this data is then fed into something called a generative pretrained transformer, which is the technological core of an LLM. It's basically just splitting words into tokens. Have any of you heard of tokens? I'm just gauging how technical we are here. A couple of you, okay. So, the idea with all of the LLMs is that they're just glorified text prediction, like on your mobile. That's all an LLM is. They break down the text you give them, like your question: can you tell me what the temperature is typically like in London?
It breaks it down into tokens, the small bits of words, and then feeds them one token at a time to the AI engine. And the only thing the engine does is try to predict the next token, over and over and over again. It's nothing more than a glorified text prediction engine. What's behind that is how we've trained it with all the data: we teach it relations between words. This is what you sometimes hear as, hey, we can embed your data into the AI. That's what we're talking about: teaching it the relevance between different words.

For example, and this is totally arbitrary, just for the sake of example, take the cat. Each word gets a relevance value between minus one and one. Cat is 0.6 on "living being", according to the AI, for some reason. House is minus 0.8, probably because it still has a little bit of a relation, since people also live in houses. These values are assigned to the different words, the different tokens. And based on these, we can make predictions, or you can map them. For example, if you look at the bottom of the image, you have man and woman, and king and queen. The words are totally different, and their positions are different, but the relationship between man and woman and between king and queen in that picture is pretty much the same. And you can do this with a lot of the data that's been taught to LLMs. You can do this for interesting word pairs. You could take, say, the king of Sweden and the president of Finland, and you'd probably get a similar kind of relationship, and you could map from one to another. Something else that happens: if you take, for example, Mussolini and Churchill, you'd probably get a similar kind of relationship between them. You can predict that they somehow relate to each other, that they relate to World War II and the same time period, so for the AI there's a similar kind of difference as in the lower image.

And this is what it's all based on. You have huge matrices, over a billion rows, over a billion-by-billion tables, pretty much, of these different embeddings with different weights and biases. And that's it. Then we just run matrix multiplication over all of these values: what is relevant to what, and predict the next token. That's all these very great AI systems do. Understanding this gives you the baseline: you cannot expect it to give you out-of-the-box answers, because it's always based on the probability of what humans have actually put into the system. And it's actually quite simple for you to do as well. If any of you are coders, it's simple to train your own; it just requires a lot of resources. You can find tutorials on building your own AI. And again, we come back to the text box.
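A toy sketch of those two ideas, the token-by-token prediction loop and the king/queen vector arithmetic. The tiny vocabulary and every number here are invented for illustration; real models use tens of thousands of tokens and vectors with thousands of dimensions.

```python
import numpy as np

# --- (1) Next-token prediction: the whole loop an LLM runs ---
vocab = ["the", "temperature", "in", "london", "is", "mild", "rainy", "."]
token_id = {tok: i for i, tok in enumerate(vocab)}

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))   # one small vector per token
W_out = rng.normal(size=(4, len(vocab)))        # maps a vector back to token scores

def predict_next(tokens):
    # Average the token vectors (a crude stand-in for the transformer's mixing),
    # multiply by a weight matrix, and pick the highest-scoring next token.
    context = embeddings[[token_id[t] for t in tokens]].mean(axis=0)
    scores = context @ W_out
    return vocab[int(np.argmax(scores))]

prompt = ["the", "temperature", "in", "london", "is"]
for _ in range(3):
    prompt.append(predict_next(prompt))   # generate one token at a time
print(" ".join(prompt))

# --- (2) Relationships as vector differences: king - man + woman ~ queen ---
word_vec = {
    "man":   np.array([0.9, 0.1]),
    "woman": np.array([0.9, 0.9]),
    "king":  np.array([0.1, 0.1]),
    "queen": np.array([0.1, 0.9]),
}
guess = word_vec["king"] - word_vec["man"] + word_vec["woman"]
closest = min(word_vec, key=lambda w: np.linalg.norm(word_vec[w] - guess))
print(closest)  # "queen" with these made-up coordinates
```

With these made-up two-dimensional coordinates, king minus man plus woman lands exactly on queen; the slide shows the same picture with real, high-dimensional embeddings.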
This is also why we have a lot of the racism and a lot of these biased relevances in these AI models: the model has learned them from the data we've given it. It's kind of like a child who doesn't have any filter on its output; it's just the relevances from the data. It's kind of the same problem as this. [chuckles] If I tell you not to think of a pink elephant, you always have to think about it first, and then start processing: do I actually want to say it out loud? You have to understand the context around it. But with these unfiltered LLMs, that doesn't happen. Do not think of a pink elephant. Okay, I will not think of a pink elephant. I've already thought about it.

And this is from psychology. So, jumping straight from math to psychology, this is what happens when an LLM talks. This is how humans pretty much think and operate: we have some data at the bottom, something we've learned. We select some of that data based on our context, our assumptions, our values. We paraphrase it somehow, because we don't remember all of it, we've forgotten parts of it, and we keep some subset of knowledge that we use. Then we probably give a name to what's happening, think about and explain it, and then decide what to do. In each of these steps, some kind of filtering happens. And we do this daily in pretty much every single interaction we have. For example, I could be interacting with pretty much anybody, say Kalle here. I have a background with Kalle. I can say to him, let's go have lunch now, and that gets interpreted nicely. But if I said that to you, the experience would be totally different, because we are processing from two different sides of the discussion. This is the filtering that we do and AI doesn't.

And this is something that will help you a lot with prompting, because there is no filtering. If you want to get the best bang for your buck from any AI you're using, be it ChatGPT or anything else, or on Azure, you really have to think about how to prompt. But we also prompt other humans daily. It's about how we express what we want to get done, how I form my questions, how I explain what needs to be done. It's not just about writing a question in there, but really thinking about what needs to be done. And this, I think, is going to be the big thing in all of software engineering. The future of software engineering is pretty much going to be: how can I eloquently say what I want, in a way that the AI system understands, with all the different weights and biases and context that I have in my head and that cannot fit into one sentence for ChatGPT.

Summarising a lot of this: because it's just math, trained the way it was trained, there are a lot of hidden biases. We don't know all the data; there's so much data that's been used to train these big models, and we don't know what's in there. So AI is really bad when you ask it questions about ethical considerations and bias, because the bias is already built in. It already has a lot of bias, and asking these kinds of questions just doesn't work. Handling complex logic and custom algorithms: this has more to do with the fact that you cannot ask it to calculate 1 plus 1 or 2 plus 3. Or maybe those you get right, because if you think about it, it's just probability, and you'll find a lot of data on the internet saying that 1 plus 1 is 2, so it will probably get that right. But if you ask it 377 times 2 divided by 6, and then let's just add pi for fun, it won't give you the right answer, because that exact example probably doesn't exist anywhere in the data.
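For comparison, ordinary code gets that exact calculation right every time, deterministically, while a language model has to guess the digits token by token from similar-looking text it has seen.

```python
import math

# The arithmetic example above, computed deterministically rather than
# predicted token by token.
print(377 * 2 / 6 + math.pi)  # roughly 128.8083, exact to machine precision
```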
So math and things like that don't work. Rapid changes: because we've used a lot of data from the internet, there's a lot of history behind it. You cannot rapidly change what the model outputs without training a new model, and that's why we see this cycle of getting a new model every half a year. You can train something on top of it, but you cannot change the whole model immediately. Or maybe you do some crude filtering on top, for example: every time Trump is mentioned, just drop the whole thing. You can do that, but you cannot change the underlying model easily. Security policies: asking it to limit itself, hey, you cannot tell this to these and these people, again doesn't work, because then we have to build something on top of the AI to limit who gets what kind of answers. And the last one, I think, is controversial: creative and innovative thinking. We can generate a lot of images, we can generate art, music, all of this. But it's always based, again, on what data it has learned from and what we have given it. Of course, then it becomes: hey, I've listened to music my whole life and I want to make a new song. Is my song also based on everything I've heard during my life? Is it just the same thing or not? A very difficult, philosophical question.

And exactly, this comes down to how you address the AI. A lot of what we see people try first: hey, can you please fix my code? Okay, what piece of code do you mean? What do you want fixed? What's a positive end result? What do I actually want to fix? So, when talking to these systems, expand a lot on what you actually want to get done: here's a snippet of my code that doesn't work, and so on. Very expansive ways of talking to it. It's the same as talking to humans. Kalle, fix my code: he doesn't know any context about what's actually happening. But explain a lot more: hey, it's in this repository, I have this problem, here are the error messages, can you help me with this? Because LLMs are trained on human data, they respond to pretty much the same cues as humans. So the same things also work when working with your colleagues: explaining, expanding, and asking nicely, actually. Asking nicely works with LLMs, too; you statistically get better answers by asking nicely, which is fun. And exactly: start simple, and then expand on that. Avoid imprecision, and define how you want the output.
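A sketch of that prompting advice: the same request phrased vaguely versus with context, a concrete artefact, and a defined output format. The repository name, error message, and code snippet here are invented for the example.

```python
# A vague prompt gives the model nothing to work with.
vague_prompt = "Fix my code."

# An expanded prompt carries the context, the artefact, and the desired output.
expanded_prompt = """You are reviewing a Python service in the `billing-api` repository.
The function below raises `KeyError: 'currency'` when an invoice has no currency field.

def invoice_total(invoice):
    return invoice["amount"] * RATES[invoice["currency"]]

Please:
1. Explain the likely cause in one or two sentences.
2. Propose a fixed version that defaults to EUR when the field is missing.
3. Suggest one unit test that would have caught this.
"""

# Both are just text sent to the same model; only the second one contains the
# context and constraints the model cannot infer on its own.
```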