Everyone gets the logic of the cloud: scale up the resources as demand increases, and only pay for what you use. Yet one in five to one in three euros spent on cloud is wasted. Why is that, and what can we do about it? As cloud becomes the de facto service delivery method, security becomes ever more important. One viable approach is Zero Trust: secure everything based on trusted identities. Or, said another way: trust no one, provide least privilege, and assume breach. Marc Cluet from HashiCorp joins Andy Allred and Eugen Kaparulin from Eficode.

Marc (0:10): 

I always say that every single objection hides a requirement, right? So, if somebody objects to it, it’s normally because there’s a hidden requirement that has not been discovered yet, and it’s important to dig into it and understand exactly the reason behind it.

Lauri (0:29): 

Hello and welcome to the DevOps Sonar podcast. The DEVOPS Conference is coming again on March 8th and 9th, and of course you are invited to join our event. To build excitement for the DevOps event of the year, we have invited exciting people to join our podcast and share a bit of the backstory to the themes we will be covering in the event. And how could we talk about DevOps without talking about cloud? This is why we invited Marc Cluet from HashiCorp. Marc is joined by Andy Allred, Lead DevOps Consultant at Eficode, and Eugen Kaparulin, Senior Consultant at Eficode. Our two topics are Cloud Waste and Zero Trust. Let’s tune in.

Lauri (01:14): 

Thank you very much, Marc, for joining. Thanks Andy and Eugen for joining this podcast.

Andy (01:19): 

Thank you

Marc (01:20): 

Pleasure. Thank you. Happy to be here. 

Eugen (01:22): 

My pleasure.

Lauri (01:23): 

So, this is an interesting conversation. We have two topics, but I’m expecting us to end up intertwining them in one way or another towards the end, and it remains to be seen how. We are talking about Cloud Waste and Zero Trust. Now, before I ask the first question, let me just recap the definition of Cloud Waste, and I am reading this verbatim from HashiCorp: “Research shows that anywhere between 20-35% of cloud costs are completely wasted. That’s a minimum of $5 million wasted every day on idle and/or overprovisioned resources. As organizations take advantage of the benefits of the cloud, that waste will only increase.” And it’s really interesting that when we talk about the benefits of cloud, this is something that often goes unnoticed. So, let me ask you, Marc: what is the reason companies are wasting 20 to 35% of their cloud budget on unused resources?

Marc (02:25): 

So, for me there are several reasons. The first one would be that as companies get into cloud, they come from a model where they had one, two, three different flavors of server, because that was how a lot of companies operated, and they try to transpose that model literally into the cloud. The problem with that is that we oversize the computing that we need, and that’s the usual thing because, first of all, you don’t fully trust the cloud yet: it’s a virtual instance, it doesn’t work exactly like anything else you have used. And second of all, you always try to oversize, just to make sure. The problem is that people oversize, but then they don’t go back to size it right. It’s working, right, so why touch it? The second part of that for me would be the flexibility of the instances themselves. There’s a connection between the amount of bandwidth, CPU, and memory that you get with a given instance. Sometimes that’s a limiter, in the sense that you need that exact memory size for your application, but you are oversizing the CPU, or you are oversizing the network, and in that sense you are wasting that resource, because there are better ways to manage it. That could be breaking that application down into more micro applications, or finding the right instance, because these are generic instances, but there are also CPU-optimized instances or memory-optimized instances. Those for me, I would say, are the main factors.
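
To make Marc’s right-sizing point concrete, here is a minimal sketch of how one might hunt for oversized or idle instances. It assumes AWS with the boto3 library and credentials already configured; the 10% CPU threshold and the 14-day window are illustrative choices, not recommendations.

```python
# Flag running EC2 instances whose average daily CPU stayed under 10%
# for the past two weeks: candidates for downsizing or shutdown.
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

now = datetime.now(timezone.utc)
reservations = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)["Reservations"]

for reservation in reservations:
    for instance in reservation["Instances"]:
        datapoints = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance["InstanceId"]}],
            StartTime=now - timedelta(days=14),
            EndTime=now,
            Period=86400,          # one datapoint per day
            Statistics=["Average"],
        )["Datapoints"]
        if datapoints and max(d["Average"] for d in datapoints) < 10.0:
            print(f"{instance['InstanceId']} ({instance['InstanceType']}) "
                  "looks oversized or idle; consider resizing")
```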

Lauri (03:55): 

What about you, Eugen? When you think about this, and about the ways you have seen it through your own lens, any thoughts?

Eugen (04:10): 

Yeah. I guess the term Cloud Waste is a little bit wrong in some sense, because for the cloud provider it is not waste at all; it brings them a huge amount of money for resources sitting idle and unused. It is a waste, of course, a waste of money, because you have resources around which are not utilized enough. As Marc said, the correct scaling of the applications is a vital factor, but often, especially in the long run, the applications are not only scaled incorrectly but also not really used correctly. Say we are preparing for a peak load, like Black Friday or something; usually the resources are over-scaled or overprovisioned just to make sure they will handle the unexpected load. But this is a bad calculation, since you should scale your applications correctly, or specify the autoscaling capabilities, rather than overprovision. Another thing: the developers, or the architects, even if they are experienced, may not see the whole big picture all the time. If we are working in the cloud, then everyone, QA, developers, testers, architects, should be aware that we are in the cloud and every resource costs, even an idle one. So there is often a missing common thread across the teams: the compute resources are for hire, they are not running in our basement, and they are not free. That would be my five cents about the definition and the way it comes to that.

Andy (06:13):

I’m also curious, or would be curious, to know how much idle time there is in on-prem data centres. There are so many servers in our on-prem data centres that are just idling all the time. Is it really more or less than what’s happening in the cloud? In the cloud you see the bill every single month, whereas on-prem you buy hardware every 5, 7, 10 years, so is it different? Whether it is or not doesn’t change the fact that we should do something about it, but I was just thinking as you were speaking earlier: is this a new problem, or just a problem that’s highlighted more?

Eugen (06:55):

I guess it is an old problem, since I saw organizations turning off test environments for the weekend not only for on-prem hardware, but also for cloud. And people were spending extra time on Fridays to shut down whole test environments in the cloud, and then yet again early in the morning on Monday to revive all those test environments. That sometimes helps, but it sometimes resulted in overkill, because more people were busy reviving the resources than the money saved over the weekend was worth in the long run. But I think it is an eternal problem. It’s just more painful now with public cloud, because the resources reflect real money.
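
The weekend shutdowns Eugen mentions are easy to automate rather than staff. Here is a rough sketch, again assuming AWS and boto3, with a hypothetical Schedule tag marking the environments that are allowed to sleep off-hours.

```python
# Stop or start every instance tagged Schedule=office-hours, intended
# to run from a scheduler instead of someone's Friday evening.
import boto3

ec2 = boto3.client("ec2")

def set_tagged_instances(action: str) -> None:
    """action is 'stop' or 'start'; applies to tagged instances only."""
    reservations = ec2.describe_instances(
        Filters=[{"Name": "tag:Schedule", "Values": ["office-hours"]}]
    )["Reservations"]
    ids = [i["InstanceId"] for r in reservations for i in r["Instances"]]
    if not ids:
        return
    if action == "stop":
        ec2.stop_instances(InstanceIds=ids)   # e.g. Friday 19:00 cron
    else:
        ec2.start_instances(InstanceIds=ids)  # e.g. Monday 06:00 cron

# set_tagged_instances("stop") on Friday evening,
# set_tagged_instances("start") on Monday morning.
```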

Marc (07:40): 

Yeah, I would also add to that that for me it is a slightly different version of the same problem, in the sense that when you are getting servers in a data center, you normally commission or buy the servers that will support the maximum amount of traffic that you will have at any point, and that can be at one single point of the year. Whereas in cloud you are encouraged against doing that, in the sense that you need to size for the right moment, for the right volume, right, and that’s what elasticity in the cloud is all about. In a data center you don’t have the privilege of throwing in more hardware, or taking hardware away, every single minute as traffic comes in.

Lauri (08:19): 

So, it sounds to me like the perfect recipe for Cloud Waste is to be very, very quick going to the cloud, and very, very slow doing anything after that. Once you are in the cloud, you should start reducing the waste, but maybe how you reduce it depends on what kind of waste you have. So maybe we first talk about the different phases of Cloud Waste, and then how to go about reducing them. What’s your take on that, Marc?

Marc (08:48): 

So for me, as we know, there are many different studies from Gartner and Accenture and other companies that try to define exactly this journey of Cloud Waste. And the one thing that is always very apparent is that whenever you are migrating from a data center to cloud, most companies, I would say 90% of them, do what is called the lift and shift, right? They don’t understand yet how cloud technologies work. They don’t understand yet what the best way to use them is, so they just literally copy what they have over to the cloud, and that results in a lot of over-provisioning as well, because they copy the exact environment with the exact sizing, and we already established that data centers are built for the maximum amount of traffic during the year, because you cannot change your hardware from one week to the next; it is very costly. So in that sense, you are creating the cloud environment for the maximum amount of traffic that you would have throughout the year, instead of throughout the day. That generates a lot of wasted resources. And then of course you get all the instances that are running idle, because if everybody is spinning up cloud instances and spinning up cloud resources, sometimes they forget to turn them off, because it is something new. And most of the time, as Eugen said, if you turn servers off and back on again, there can be all kinds of trouble. I have suffered recently from servers that you turn off for a day, and you turn them back on and you have disk issues, because the disks were so used to being hot that as soon as they went cold, they broke. So you have all kinds of issues in that sense, and that is something people are normally not used to. And all of this gets aggravated by not having someone in charge of cloud cost control, someone that controls all of the expenditure. There is a new role, called the FinOps manager, Financial Operations, to make sure that we keep things on track. And some companies, I would say, learn that the hard way. I was involved with a company around 4 years ago that didn’t have a FinOps manager. They started migrating to the cloud; they had a certain budget for the whole year, and they used up that budget in the first 3 months. As soon as they saw that, they went into panic and immediately appointed someone who had already been warning them as interim FinOps manager, until the situation was under control again.
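
Part of the FinOps watchdog role Marc describes can be scripted. Here is a sketch of a monthly spend check, assuming AWS Cost Explorer is enabled and boto3 is available; the budget figure and the dates are placeholders.

```python
# Pull last month's spend per service and compare it against a budget.
import boto3

ce = boto3.client("ce")
MONTHLY_BUDGET = 50_000.0  # hypothetical budget in USD

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2022-01-01", "End": "2022-02-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

total = 0.0
for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    total += cost
    print(f"{service:40s} {cost:12.2f} USD")

if total > MONTHLY_BUDGET:
    print(f"ALERT: spend {total:.2f} exceeds budget {MONTHLY_BUDGET:.2f}")
```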

Eugen (11:16): 

Yeah, and besides the lift and shift problems, the budget will even grow after that, especially if after the lift and shift the setup was not scaled correctly and the new cloud environment needs to scale out on demand. Then the scaling rules are usually set either too high or too low and need adaptation, and that means additional dollars in the budget being used up. And often you need to migrate this setup into a newer one, like a microservices orchestration, where you need, again, to measure and scale everything. There will always be phases where some budget explodes or tends to explode. That’s why every cloud provider basically recommends dividing projects up as small as possible: like the idea of microservices, also for the cloud projects, a development project, a production project, pre-production, test, and so on. You need to separate them, so you have an easier overview of the billing and of the resources used. That’s something I would add here.
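
As one example of “specifying the autoscaling capabilities rather than overprovisioning,” here is a sketch of a target-tracking scaling policy. It assumes boto3 and an existing EC2 Auto Scaling group; the group name web-asg and the 60% CPU target are hypothetical.

```python
# Keep the group's average CPU around 60%: scale out above the target,
# scale in below it, instead of sizing for the yearly peak.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",          # hypothetical group name
    PolicyName="keep-cpu-around-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,
    },
)
```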

Andy (12:35): 

I was involved in a project some years back, and it was the typical, you know, lift and shift to the cloud, and “we are gonna sort this out later.” And, long story short, later never came. We just lifted and shifted and then just ran and ran and ran. And the CFO was yelling, “every single month our AWS bill is ridiculous, what are we gonna do?” But he was tracking only the total amount; nobody was actually going through and checking what was contributing to that total. Everybody was just focused on “yeah, we just need to do this, we need to do that, we’ll take care of costs later,” but later never comes.

Eugen (13:19): 

It can also go the other way, especially if somebody starts reducing costs but doesn’t have much sense of the architecture, or of the cloud provider itself. They start running around and saying, “oooh, the big database instance, we need to replace the SSD disks with physical disks again because they are cheaper, or we need to change CPU types,” but on some platforms that also affects the network throughput, depending on the CPU core count and so on. And so one change sets off another, and another race begins. Basically, you need to know how to reduce costs, not just look at the big numbers and turn off equipment.

Andy (14:06): 

Exactly. The easiest way to save costs is to turn everything off. “We have all these idle servers in this other data center, let’s just turn them off.” “Oh, you mean disaster recovery?”

Eugen (14:15): 

Yeah, and it never gets tried out.

Marc (14:18):

I think, for me, and you mentioned disaster recovery, right? It’s one of those things that everybody likes to have, but nobody wants to check.

Lauri (14:25): 

Yeah. On the positive note, I remember two examples. One was a company that was basically transforming into a SaaS company, and they had a very simple emailer, as simple as: here’s the result, and that result is emailed to the customer. And simply by taking that and turning it into a serverless function, it just collapses your costs for that particular service. And collapsing is probably the right word for the magnitude of change here. The other example was even more drastic: if you adopt the right way of using serverless functions, then if the lifetime of an individual function is shorter than a certain number of seconds, then under certain conditions your cloud provider doesn’t charge you at all. So especially for startups and people who are starting out and building up, it can be really, really advantageous to look at their architecture and say, “if we do it this way, then potentially our cost is zero.” And of course it doesn’t hold in the long run, but in the short run it can be a very, very effective strategy, especially for startups.
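
Here is a sketch of the emailer-as-serverless-function idea Lauri describes, written as an AWS Lambda handler. The sender address and event fields are hypothetical; the point is that a function finishing in well under a second is billed per invocation and, at low volume, can fall entirely inside a provider’s free tier.

```python
# A hypothetical report emailer as a Lambda handler, using Amazon SES.
import boto3

ses = boto3.client("ses")

def handler(event, context):
    """Triggered per report; sends one email and exits quickly."""
    ses.send_email(
        Source="reports@example.com",  # hypothetical verified sender
        Destination={"ToAddresses": [event["customer_email"]]},
        Message={
            "Subject": {"Data": event.get("subject", "Your report")},
            "Body": {"Text": {"Data": event["report_body"]}},
        },
    )
    return {"status": "sent"}
```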

Eugen (15:43):

Yeah. If you tailor it right, it would be possible, sure. But it will always be a combination of everything. You will have a central set of your microservices, run by some Kubernetes cluster, probably. You will have some databases, running managed or inside the cluster or whatever. But surely there will be a lot of utilities or hooks or whatnot, and they can be realized with serverless functions quite easily and cost-efficiently. But there will never be one crystal-clear, single service in a cloud provider realizing all of your infrastructure. That’s why they basically invent all of these new functions, so you can combine them together nicely, being cost-efficient and computing-efficient as well. You just need to get it right, and the question is: who defines this route? The cloud provider, your architects from next door, or your security advisor in the company? So it really depends on what is right, and what is cost-efficient. That also defines what is waste and what is not waste, in that sense, because sometimes an additional cost makes sense.

Marc (17:10): 

Yeah, definitely. And I would say that for me that would be the difference between Cloud Waste and overhead, alright? Because, for example, when you get a platform as a service, or serverless, there is always an overhead. There is an overhead in cost, or there is an overhead in the maintenance or something; there is an overhead in the architecture itself. But you are buying something with that overhead: you are buying back peace of mind, or you are reducing your operational footprint. So you are recovering part of that cost, whereas Cloud Waste is 100% wasted, right, and that’s why it’s called Cloud Waste. There’s no recovering it. There is no counterargument, there’s no counterbalance in that sense.

Lauri (17:53): 

So, I think, Eugen, you had a fair question there: who defines how to start reducing it, and which way to start? But are there some common methods that you could apply to reducing Cloud Waste?

Eugen (18:06):

I guess so. I don’t know if they are commonly defined in some open libraries, but common sense says that, as I said, when an architecture is placed in the cloud, everybody touching it, even as an end-user, needs to understand that it runs in a public cloud, generating costs and possibly generating Cloud Waste as such. Everyone needs to be educated about the architecture, and communication between teams needs to be in place, saying “look, we need to avoid Cloud Waste, but we need to be efficient.” So it’s a common-sense approach, I would say.

Marc (18:50): 

Yeah, I would say for me one of the most efficient ones that I have seen work is having a common framework of approach. Things like having a common deployment framework, a common deployment pipeline where you can implement those controls: FinOps controls, security controls. In this sense you are reducing the number of outcomes that you can create out of your platform, but the advantage is that you have full control of those outcomes and whatever happens with them. And you can implement things like policy as code, or even, as I did in some companies, not give them the choice: you size things for them, you get the resources for them, so you are the layer of abstraction for the platform. And then the developers only care about “take my code, push my code, deploy it; I don’t know what happens with it, but it just works,” right? There are of course a lot of shades of gray between all those different outcomes. The important thing for me is to have clear control and governance, and a point where you keep that control, be it through a pipeline or through the cloud itself.
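
A toy illustration of the kind of control Marc describes embedding in a shared pipeline. Real setups would typically use a policy-as-code tool such as HashiCorp Sentinel or Open Policy Agent; this plain-Python sketch, with a hypothetical allowlist and plan format, just shows the principle.

```python
# A pipeline gate: reject deployments that request instance types
# outside an approved list. Allowlist and plan format are made up.
ALLOWED_INSTANCE_TYPES = {"t3.small", "t3.medium", "m5.large"}

def check_plan(plan: list[dict]) -> list[str]:
    """Return policy violations for a list of requested resources."""
    violations = []
    for resource in plan:
        instance_type = resource.get("instance_type")
        if instance_type not in ALLOWED_INSTANCE_TYPES:
            violations.append(
                f"{resource['name']}: instance type {instance_type!r} "
                "is not on the approved list"
            )
    return violations

# In the pipeline: fail the deploy if any violation is reported.
plan = [{"name": "web-1", "instance_type": "m5.24xlarge"}]
for violation in check_plan(plan):
    print("POLICY VIOLATION:", violation)
```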

Lauri (20:09): 

It’s Lauri again. Many people wonder how to combine DevOps and cloud. While DevOps practices deliver software faster and with higher quality, and give people the ability to spend more of their time working on new features, successful cloud adoption brings benefits in regard to scalability, availability, and cost-effectiveness. It also makes the adoption of new technologies, and thus innovation and experimentation, much easier. To help our customers in their journey we wrote a DevOps and cloud guide. In this guide we walk you through how DevOps and cloud support each other, the cultural shift needed, the technical practices, and the benefits to be gained from maximizing cloud usage with DevOps. You can find the link to the guide in the show notes. Now, let’s get back to our show.

Lauri (21:00): 

Before we started the podcast, we had a little chat with Andy, because we came online a little early, and we were discussing this idea that in some cases it’s the technology that steers the culture, and sometimes it’s the culture that steers the technology. What are your experiences of that: a culture created on the basis of a technology advancement, and also technology selected and adapted to serve the right culture?

Marc (21:29): 

There is this concept called confirmation bias, right? And I think it applies very, very well to this example, in the sense that normally it is a lot more important to define requirements first and then find the tool that matches. Because if you select the tool first, based on shiny, based on “this is what the cool people are doing now,” you will shape your requirements around the tool, and we know how that goes. Normally, it doesn’t end well.

Eugen (21:54): 

I remember a project I was involved in where the customer basically decided the technology. Well, the requirement was there: splitting a monolith into microservices, and the microservices should run in Docker, with Java on the backend and Node.js on the frontend, a quite simple, quite common approach. But then when it came to the cloud provider, we were thinking about AWS or GCP back in the day, and the customer said, “OK, I am against AWS. Take GCP.” Okay, let’s go to GCP. Let’s deploy it on GCP’s Kubernetes engine. “I don’t like it.” And in the end, we ended up on AWS, with plain instances, each one starting up one Docker container, with service discovery via HashiCorp’s Consul and secrets managed via Vault. So basically, we reinvented the wheel. We created our own Docker orchestration with service discovery, despite these solutions already being there, tested and polished, which brought us a lot of teething problems of course, but we stood fast. We managed to go to production in two months. We worked 12, 14 hours a day, and we had a success; the customer even managed to get through the peak of Black Friday. It was a retail company. But maybe if we had made the discussions at the beginning more effective, between the customer and the consulting company, maybe rounding off the edges and weighing this liking or disliking of technologies, we could have been more efficient, especially in the long run. And we could have saved a lot of money on changing the existing infrastructure later on. And of course, the level of stress: I mean, so many people burnt out after that project that one year later everybody had left, one after the other, not all at once, but anyway. So it’s a tough thing, and really, if I could have done it differently, what would I do? If you start something, you need to think really hard and maybe abstract away from your likings or dislikings in that sense. It depends on the use case.

Andy (24:32): 

But I think in that kind of case it’s easy to say but difficult to do. When some customer or CTO or whoever says, “I don’t like that technology”: what is it that you don’t like about it? What is it that concerns you there? What problems do you foresee happening? And often they probably can’t articulate anything; they just know, “I don’t know, it just feels wrong.” But how to get out what the resistance really is, and what they are worried about with this particular technology or solution or idea? And is that really a problem that justifies this amount of cost? Figuring that out, getting the details out, being able to say, “well, this is how much it’s going to cost to do it differently; are you sure it bothers you that much?” is a very difficult thing to do. That, academically at least, is the right answer, but in real life it is very difficult to do.

Marc (25:30): 

I always say that every single objection hides a requirement, right? So, if somebody objects to it, it’s normally because there’s a hidden requirement that has not been discovered yet, and it’s important to dig into it and understand exactly the reason behind it.

Andy (25:46): 

I love that phrase the way you said that, and I am gonna absolutely steal it. Every objection hides a requirement. It's perfect. 

Lauri (25:56):

Yeah, I don’t know why, but it happens that a lot of people in IT like Star Trek, and considering your example, Eugen, maybe some people haven’t really comprehended the definition of “boldly go where no one has gone before.” It was not originally said about the technology selection phase of an IT project; there, rather stick to what has been proven to work before. Maybe we could look a little further ahead and move to the second subject. Is there anything else you wanted to say about Cloud Waste before we go to Zero Trust? I think you said it, Eugen, that it’s always going to be there to some degree, and Marc said there is a difference between overhead and Cloud Waste, but just as a quick conclusion: is Cloud Waste inevitable, or is there a way around it?

Marc (26:47): 

I would personally say Cloud Waste is not inevitable; there are ways around it. For me, Cloud Waste is a necessity of the business, the same as, let’s say, technical debt, right? As you go along you accrue technical debt, and you also create Cloud Waste, but at some point you need to pay that back. You need to make sure you keep it as low as possible. For me, Cloud Waste can transform from something that generates an extra financial burden into something that is an insurance. I always say that 5% extra compute is insurance, just in case you get that peak of traffic and you cannot react within 5 minutes. And that’s alright, that’s healthy. It becomes unhealthy when you are talking about 30, 40, 60% wasted. That’s where you’re really throwing money away.

Eugen (27:42): 

Cloud Waste is like dust. You can’t get rid of it; dust is everywhere. You wipe it often enough and you have clean surfaces, and everybody is happy. But if you don’t clean them, the dust piles up like an earth crust, and so on and so forth. So you need to keep an eye on it. But it’s also a good indicator, like taking a white glove and checking “am I cleaning?”, in terms of cost and resource management. So it will be there, like dust, but you can turn it to your advantage and use it as an indicator, for instance.

Lauri (28:23): 

Super. Moving on to Zero Trust. So, switching gears, and I remember that Andy, this was a topic a little closer to your heart, and Eugen, you were more focused on Cloud Waste, but all the same we are going to give Marc a hard time. Zero Trust, and again when I look at the verbatim from HashiCorp, is predicated on securing everything based on trusted identities, with machine authentication and authorization, machine-to-machine access, human authentication and authorization, and human-to-machine access being the forefront categories in trust and security. Is there some particular reason why we are talking about this now? I’m thinking back 20 years: ticket-granting tickets, two-factor authentication, and all kinds of things. I remember them from back when I was a practitioner, so why now?

Andy (29:21): 

I think it’s probably a much bigger issue now for a couple of reasons. The first one that comes to mind is of course that we have a global pandemic going on and everybody’s working from home. Overnight, every company went remote, and you can’t trust that the employees are on the same network all the time. So you have to change the way you’re securing things; that didn’t really change anything, but it made it more apparent to everybody that, hey, we have this issue. The other big thing that I see contributing to this is that the number of SaaS services is just exploding. I remember, 20-25 years ago or something, it was a really big discussion whether we could take on an external interface. And now, I was just talking with a customer, and they said, “yeah, we have these microservices in-house, and then we have these services which are coming from another team, and we have this block over here; these are all external,” and there were dozens of external interfaces. Just the sheer number of connections which need to be made by modern solutions, and the way they’re distributed, is really great for microservices and cloud and what-not, but it means that you need to look at things much differently from a security, authentication, and trust point of view.

Marc (30:50): 

Yeah, and I completely agree with Andy. I would like to say that even if that one is from the HashiCorp website, I like another definition of Zero Trust better, which is the one I normally use: “trust nothing, authenticate everything, and assume breach.” That, I would say, comes from the DevSecOps movement; it’s the normal one we use for this. I would say it is not anything new; it’s just that, as it stands, we’ve always been lazy. One comparison I use is the difference between living in a very small town where you grow up and everybody knows who you are, and you try to go buy a beer when you’re 14 years old, right? The guy at the shop goes, “Oh no, don’t do that, John, we know you, and I’m going to tell your parents about this, by the way.” If you go into a big city, the shopkeeper doesn’t know you at all, doesn’t know who you are, so he needs something from a trusted authority, in that case the government, to show that you are old enough to buy that beer. And to be honest, the shopkeeper in the small town should have done some of the same, like “show me an ID,” but he got lazy because he already knew who that person was. And we did the same in a way. We had full control of the data center and the machines: “Oh yeah, I already know that machine and there’s nothing to check, it’s fine,” when it really wasn’t fine; we just assumed that it was.

Andy (32:15): 

Yeah, I was in a meeting with a customer just a few days ago, and the gentleman I was speaking with was not a native English speaker, and he said, “yes, we have no trust in our organization. We do everything with no trust.” And I said, “So you don’t trust anything?” kind of jokingly, and he said, “Oh yeah, yeah, I mean Zero Trust. But actually no, we do have know-trust, with a K, because we want to know what we’re trusting.” I thought that was an interesting perspective that came out of his mixing up no-trust and Zero Trust.

Lauri (32:58):

Yeah. So now we already have two definitions of Zero Trust on the table, and I am pretty sure that for all heated topics, definitions are a dime a dozen. But are there some main guidelines such that, whichever definition you use, you would still say you are practicing Zero Trust?

Marc (33:19): 

If you look at Zero Trust from the consequences, which is what we’re trying to avoid, right, from the consequences perspective, it’s all about blast radius. It’s all about: if something happens here, how much of it would get me in trouble? How many systems would be affected? How far could the attacker get? In that sense, all these definitions of Zero Trust, for me, are about minimizing that blast radius, minimizing the amount of damage that anybody can do. And that’s why I always say assume breach, right? Assume that you’ve already been breached. Assume that there’s already somebody there trying to do something that is not good for you. How soon can you stop them? How far can they get?

Lauri (33:58):

Hm, Andy?

Andy (33:59): 

Yeah, I was just thinking that quite often in the spy movies, the handsome guy in the tuxedo comes in the front door and gets through the lobby and through security, and after that nobody challenges him anymore, because they assume he must belong there. And that’s the same way we quite often treat our computer networks: yeah, you’re in the trusted LAN, you must be fine. But instead of that model, we should verify every connection and every transaction, instead of just “here’s the IP, yeah, it sounds OK.” That’s the kind of thinking we need: for every request that comes in, should I be trusting this or not? Of course, you don’t need to verify every single TCP packet, but every transaction or...
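
A minimal sketch of what “verify every transaction” can look like in code: each request must carry a token that is checked on its own merits, and the source network is never consulted. This uses the PyJWT library; the shared secret and the scopes claim are illustrative assumptions.

```python
# Per-request authorization: no network-based trust, only identity.
import jwt  # pip install PyJWT

SECRET = "replace-with-a-key-from-a-secrets-manager"  # hypothetical

def authorize_request(token: str, required_scope: str) -> bool:
    """Reject the request unless its token is valid and in scope."""
    try:
        claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        return False  # bad signature, expired, malformed: deny
    # Source IP is never consulted; only the presented identity is.
    return required_scope in claims.get("scopes", [])
```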

Lauri (34:52):

Before, there was the term “defense in depth”, and I think it gets at the same thing: don’t trust only the perimeter. I think we have already discussed how Zero Trust can make computing more secure, but if there’s anything else you want to say about that, you get the last words. I’m interested in the different ways for, let’s say, architects and companies who build software to go about practicing this in real life.

Marc (35:21): 

That is a very good question, and for me it goes back to the DevSecOps movement; I’m part of that community. It’s all about security, and I know that gets repeated a lot with Zero Trust, but security is something that needs to be intentional. It needs to be intentional, and it needs to have a purpose. For me that means that for any single application, for anything in the architecture, you need to have the security people sitting there with you from the beginning. If you start adding security once the application has been decided, or once the architecture has been decided, it will be a patch on top of something. You need to create that defensive architecture, that defensive application architecture as well, to make sure it is as secure as it can be. Otherwise, like I said, it would be like installing a glass door and saying, “oh yes, actually anybody can break this glass door, so let’s add this small barrier afterward; I am sure nobody will jump over it.”

Andy (36:20): 

I think, at least in my view, the top two things that people should do to move towards Zero Trust are: number one, don’t trust connections based on the network, but really authenticate the connections; and number two, no tokens or accesses should ever be defined statically anywhere. Instead, the application should fetch a token or an authorization from somewhere else and keep updating it. They should be short-lived and always fetched. So when you go to the source code, even if things are proper and you don’t commit API tokens to GitHub or keep them in your local source code, you shouldn’t see any tokens there either. Your application starts up, talks to some secrets manager, gets a short-lived token from it, and then continues on with establishing connections. If we could make those two changes, I think a lot of the security problems start going away. It doesn’t solve everything, but those are, in my view, the two changes that give us the biggest benefit in the short term.
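
Here is a sketch of the startup flow Andy describes, using HashiCorp Vault through the hvac client library. The Vault address, the AppRole auth method, and the database role name are assumptions for illustration; the point is that no long-lived credential ever appears in code or config.

```python
# Fetch a short-lived database credential from Vault at startup.
import os
import hvac  # pip install hvac

client = hvac.Client(url=os.environ["VAULT_ADDR"])
# Authenticate with a method suited to the platform; AppRole is one
# common choice, with the IDs injected by the orchestrator at runtime.
client.auth.approle.login(
    role_id=os.environ["APPROLE_ROLE_ID"],
    secret_id=os.environ["APPROLE_SECRET_ID"],
)

# Ask the database secrets engine for a dynamic credential;
# "my-app-role" is a hypothetical role with a short TTL.
creds = client.secrets.database.generate_credentials(name="my-app-role")
username = creds["data"]["username"]
password = creds["data"]["password"]
# Open the DB connection with these values, and re-fetch before the
# lease expires instead of ever persisting them anywhere.
```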

Eugen (37:28): 

But to come back to the beginning of the discussion about Zero Trust: as Lauri said, it’s nothing new. Twenty years ago it was the same, the same security concerns, the same security measures to be taken. And yeah, the world grew rapidly: social networks, people saving a lot of data in the cloud where it can leak easily, and external services, as Andy said, like having dozens of external production-related services running. How do you rotate the authentication tokens towards them? What happens if somebody leaks one because he saved it in a notepad somewhere, or pushed it to a gist? So the considerations are the same, but the blast radius became bigger because of the speed of communication, of the social networks of people. Not only social networks, but you know what I mean. The impact became much more serious, and we have to be more serious in implementing the security measures. But they are the same as 20 years ago, as 40 years ago.

Andy (38:50): 

Yeah, and 25 years ago I had to update my password every year. Then I had to update it every few months, then I needed to update it more often. But now, applications are doing things so quickly that you should update a token every so many minutes. So as things speed up, we need to speed up our security renewal process, and our security auditing process, and all these things as well. They need to follow that same break-neck speed of growth that everything else is following.

Marc (39:17): 

And it is the same as with DevOps, with automation, and with Agile. It’s all about how you cope at that speed, because you cannot make things slower again. So you need to make sure that you get it right.

Lauri (39:32): 

Okay, maybe the last question: what have been good organizational practices for Zero Trust? You already talked about shift left, which is a good one, but is there anything else that comes to your mind from an organization or culture perspective?

Marc (39:48): 

Yeah, I would say that one thing that people always forget is to make sure they use two-factor authentication for those very critical points, like the beginning of your root of trust, or the administration of your whole infrastructure, those kinds of things. Something you know, plus something you have, always. If you only do either of those alone, then you are opening yourself up. In that sense, I think that’s a good practice to follow and enforce all across.
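
A tiny sketch of the “something you know plus something you have” pattern Marc mentions, using time-based one-time passwords via the pyotp library. The user name and issuer are placeholders, and the password check (the “something you know”) is assumed to happen elsewhere.

```python
# Two-factor authentication with TOTP codes from an authenticator app.
import pyotp  # pip install pyotp

# Enrolment: generate a per-user secret once and hand it to the user's
# authenticator app (the "something you have").
secret = pyotp.random_base32()
totp = pyotp.TOTP(secret)
print("Provisioning URI:", totp.provisioning_uri(name="alice", issuer_name="example"))

# Login: after the password check has passed, verify the current
# 6-digit code as the second factor.
def second_factor_ok(user_secret: str, submitted_code: str) -> bool:
    return pyotp.TOTP(user_secret).verify(submitted_code)
```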

Andy (40:39): 

Thanks

Eugen (40:39): 

Thank you

Marc (40:39): 

Thanks

Lauri (40:41): 

Thank you for listening. As usual, we have enclosed links to the social media profiles of our guests in the show notes. Please take a look. You can also find links to the literature referred to in the podcast in the show notes, alongside other interesting educational content. If you haven’t already, please subscribe to our podcast and give us a rating on your platform. It means the world to us. Also check out our other episodes for interesting and exciting talks. Finally, before we sign off, I would like to invite you personally to The DEVOPS Conference happening online on March 8th and 9th. Participation is free of charge for attendees. You can find the link to the registration page, where else, in the show notes. Now, let’s give our guests an opportunity to introduce themselves. I say take care of yourselves, and see you at The DEVOPS Conference.

Eugen (41:31): 

My name is Eugen Kaparulin. I have been in the IT business for over 20 years. For about 15 years I was a C++ developer, mostly working on remote access solutions, and lately, since 2016 or so, I moved to the DevOps and cloud ops area, where my background helps quite a lot, especially when arguing to developer teams why the infrastructure has to be designed a certain way. I have worked mostly with GCP and also AWS, and I am currently working as a Senior Consultant for diverse customers. I don’t know if I need to say whether I have used HashiCorp products or not, but I have.

Marc (42:40):

I am Marc Cluet. I am the manager for the Solutions Engineering team at HashiCorp for EMEA. I have 25 years of experience in the industry, and I am very passionate about DevOps and about DevSecOps as well. I organize one of the biggest DevOps meetups in the world, and I also organize DevSecOps Days in the UK, so those are very close to my heart. I should also say I really like HashiCorp products, having been a practitioner of HashiCorp tools for the last 6, 7, 8 years; every new product that they’ve released, I’ve consumed, or tried to break at least.

Andy (42:53): 

Hi. I’m Andy Allred. I started my career as an electronic warfare specialist in the US Navy, working on a nuclear-powered fast-attack submarine, which is something always kind of unique; it gets people’s attention. After that I moved into telecoms and worked in telecoms mostly; well, I started on the radio side and then shifted more to the IT side of telecom companies, and worked there for quite a number of years. And then recently I have moved into the consultancy space, working as a consultant for DevOps and cloud projects.