Don't let AI Coding Agents inherit your trust

I was running my coding agent in a repo. Simple task: investigate how we can take a bunch of bash-driven ops logic and package it in a more Kubernetes-native manner. Tedious work, exactly the kind of thing you hand off. So I set it going and sat back, checking in every now and then.

Then I noticed kubectl commands being fired off. Against a production cluster. Nothing actually went wrong. But it still felt like something had, because my agent could have autonomously exfiltrated cluster secrets. It had access.

The thing is, it was kind of my fault. The agent wasn't malfunctioning. It wasn't going rogue. It was trying to gain as much context as it could for the task at hand. It was looking at an actual cluster to understand what it was working with. Totally reasonable. The problem was that I'd left the keys on the counter, and like a curious toddler, it had hands to grab them.

When you launch an AI coding agent from your terminal, it inherits your entire environment. Your SSH keys, your cloud credentials, your kubeconfig, every project directory on disk. Not because you made that decision. Just because that's what it means for a process to run as your user. The OS doesn't distinguish between you and something running as you. It has no concept of "this process is an agent and should be treated differently." Same user, same trust.
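To make the inheritance concrete, here is a minimal Python sketch. It assumes nothing about any particular agent: `DEMO_TOKEN` is a made-up variable standing in for a real credential, and the child process is just another Python interpreter. Nobody hands the child the value. It simply finds it.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a real credential sitting in your shell environment.
os.environ["DEMO_TOKEN"] = "not-a-real-secret"

# Code the child process runs: report whether it can see the variable.
child_code = "import os; print(os.environ.get('DEMO_TOKEN', '<missing>'))"

# Launch a child the way a terminal launches an agent: no env= argument,
# so the child inherits the parent's whole environment.
inherited = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True, text=True, check=True,
).stdout.strip()

# The same launch with an explicit allow-list of variables.
# The child only sees what was deliberately passed in.
ALLOWED_VARS = ("PATH", "HOME", "LANG")
explicit_env = {k: os.environ[k] for k in ALLOWED_VARS if k in os.environ}
restricted = subprocess.run(
    [sys.executable, "-c", child_code],
    env=explicit_env,
    capture_output=True, text=True, check=True,
).stdout.strip()

print("inherited: ", inherited)   # not-a-real-secret
print("restricted:", restricted)  # <missing>
```

Trimming the environment only covers variables. Files such as your kubeconfig or SSH keys are still readable by anything running as your user, which is why the rest of this post is about a boundary the process cannot step around.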

This isn't a vulnerability. It's not a bug in the agent. It's just what happens when you leave things within reach of something that doesn't know it shouldn't touch them. The toddler doesn't know the difference between a toy and your passport. It picks up what's there.

Explicit over implicit

Trust is not something that happens by default in systems engineering. It's something you decide. You decide which systems can talk to which other systems. You decide which data flows where. You decide which boundaries exist and why. Those decisions live in configuration, policy, and audit logs—places you can read them, review them, and change them.

The agent access model should work the same way. Not: "the agent inherits whatever it can reach, and we hope nothing goes wrong." But: "We have decided, explicitly and in writing, exactly what the agent is allowed to access, and everything else is unreachable by design."

This is the principle of least privilege, foundational in security for decades. It's also the principle of explicit over implicit applied to trust in agentic systems: the default is no access. Access is granted by an explicit decision, written down, and reviewable. The agent's world is defined on paper before it starts.
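As a rough illustration of "the default is no access", the check below is a minimal Python sketch of default-deny path matching. The function name `is_allowed` and the example paths are hypothetical, and this is only the decision logic: it does not handle symlinks, and real enforcement belongs in the kernel, not in application code.

```python
import posixpath
from pathlib import PurePosixPath

def is_allowed(path: str, granted_roots: list[str]) -> bool:
    """Default deny: a path is reachable only if it sits under a granted root."""
    # Normalize first so "project/../.kube" cannot sneak past the prefix check.
    target = PurePosixPath(posixpath.normpath(path))
    for root in granted_roots:
        root_path = PurePosixPath(posixpath.normpath(root))
        if target == root_path or root_path in target.parents:
            return True
    # Nothing matched, so the answer is no. There is no deny list to forget.
    return False

# Hypothetical grants for a single task.
granted = ["/home/dev/project", "/tmp/agent-scratch"]

print(is_allowed("/home/dev/project/src/main.py", granted))      # True
print(is_allowed("/home/dev/.kube/config", granted))             # False
print(is_allowed("/home/dev/project/../.kube/config", granted))  # False
print(is_allowed("/home/dev/project-secrets/key", granted))      # False
```

The point of the shape is that the function ends in `return False`. Anything not written down in `granted` is unreachable without anyone having to remember to block it.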

That explicitness is most of the value. Not because it stops determined attackers (though it raises the bar for them too), but because it forces a real question. What does this agent actually need? Not what it could want. What does it need? When you answer that and see the diff against the baseline, you find out what your own system is actually doing.

Making boundaries real

On Linux, the lock that actually holds is Landlock LSM, a kernel security module that lets unprivileged processes impose irrevocable access restrictions on themselves. No root required. No daemon. The rules are applied before the agent process ever starts and enforced by the kernel from then on. Once applied, they can be tightened further but never expanded or removed. Every child process inherits the same restrictions, all the way down. You can't prompt your way out of it.
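Whether Landlock is available depends on the running kernel. The Python sketch below only probes for it: it asks the kernel for the supported Landlock ABI version and applies no restrictions. It assumes the syscall number 444 for `landlock_create_ruleset`, which holds on common architectures such as x86_64 and arm64 but is worth checking for your platform. The helper name `landlock_abi_version` is made up for this example.

```python
import ctypes
import sys

# Assumed syscall number for landlock_create_ruleset (x86_64, arm64 and
# most other Linux architectures). Verify for your platform.
SYS_LANDLOCK_CREATE_RULESET = 444
# Flag asking the kernel to report the supported Landlock ABI version
# instead of creating a ruleset.
LANDLOCK_CREATE_RULESET_VERSION = 1

def landlock_abi_version():
    """Return the Landlock ABI version, or None if Landlock is unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    result = libc.syscall(
        ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
        ctypes.c_void_p(None),  # attr = NULL
        ctypes.c_size_t(0),     # size = 0
        ctypes.c_long(LANDLOCK_CREATE_RULESET_VERSION),
    )
    # A negative result means the kernel lacks Landlock, has it disabled,
    # or the call was blocked (for example by a container's seccomp filter).
    return result if result >= 0 else None

version = landlock_abi_version()
print("Landlock ABI version:", version if version is not None else "not available")
```

A `None` here means the boundary described in this section cannot be enforced on that machine, which is worth knowing before you rely on it.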

Tools like nono use Landlock to enforce that explicit decision. The agent only talks to the network through a proxy that the supervisor controls. It only reaches the filesystem paths the policy explicitly grants. The audit trail (what the agent did, what it tried to reach, and what got blocked) is written by the supervisor. The agent can't edit its own record. It can't omit the kubectl call. It can't tidy up after itself.

The sandbox makes the explicit policy impossible to violate. The agent can't accidentally reach for something you didn't decide to give it. More usefully: when it tries, you get a signal. The boundary isn't just a safety net. It's an observation.

Deciding what's in reach

Making the explicit decision requires actually knowing what your agent needs.

nono learn (or equivalent tooling) traces the real access patterns: paths read, paths written, hosts contacted. The output is a JSON fragment you can turn directly into a policy. You're not guessing what the agent might do. You're watching and writing down what you see.
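The step from "what I saw" to "what I grant" can be sketched in a few lines of Python. This is not nono's implementation or output format: the list of observed paths is a hypothetical input, the function name `policy_from_trace` is made up, and the only thing borrowed from this post is the `filesystem.allow` shape used in the override example.

```python
import json
import posixpath

def policy_from_trace(observed_paths):
    """Collapse observed file paths into a minimal allow-list of directories."""
    # Work at directory level and de-duplicate.
    directories = sorted(
        {posixpath.dirname(posixpath.normpath(p)) for p in observed_paths}
    )
    # Drop any directory already covered by a granted parent.
    # Sorting guarantees a parent is seen before its children.
    minimal = []
    for d in directories:
        covered = any(
            d == parent or d.startswith(parent.rstrip("/") + "/")
            for parent in minimal
        )
        if not covered:
            minimal.append(d)
    return {"filesystem": {"allow": minimal}}

# Hypothetical paths a traced session was seen reading.
observed = [
    "/home/dev/project/src/main.py",
    "/home/dev/project/src/util/io.py",
    "/home/dev/project/README.md",
    "/etc/ssl/certs/ca.pem",
]

fragment = policy_from_trace(observed)
print(json.dumps(fragment, indent=2))
```

Collapsing files into directories trades precision for readability, so the result grants slightly more than was observed. Treat it as a draft to review, not something to apply unread.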

nono profile diff default my-agent

That diff is the first time you see the access model clearly. Everything the agent needs, enumerated, before you hand it the keys. Not the keys to the whole house. A specific key for the specific rooms it actually needs.

The policy document becomes the explicit record: this agent, this task, these access boundaries. You can read it, review it, and hand it to someone else to read. It lives somewhere you can diff it, version it, audit it.

When an agent legitimately needs access to something blocked by default, you grant it explicitly:

{
  "filesystem": {
    "allow": ["$HOME/.docker"]
  },
  "policy": {
    "override_deny": ["$HOME/.docker"]
  }
}

Both fields are required by design. Removing a deny without an explicit grant is how you end up with policy that looks locked down but isn't. The two-step is a safety feature. Every exception ends up conscious, written down, and visible in the diff.
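The two-step rule is easy to check mechanically, for example in a pre-commit hook or code review script. The Python sketch below is an assumption-laden illustration, not part of nono: the function name `check_exceptions` is made up, the `DEFAULT_DENIED` set is a hypothetical stand-in for whatever your tool blocks by default, and paths are compared as exact strings.

```python
def check_exceptions(policy, default_denied):
    """Report exceptions where only one half of the two-step is present."""
    allow = set(policy.get("filesystem", {}).get("allow", []))
    override = set(policy.get("policy", {}).get("override_deny", []))
    return {
        # A deny was lifted but nothing was explicitly granted.
        "override_without_allow": sorted(override - allow),
        # A default-denied path was granted but its deny was never lifted.
        "allow_without_override": sorted((allow & set(default_denied)) - override),
    }

# Hypothetical default-deny list, for illustration only.
DEFAULT_DENIED = {"$HOME/.docker", "$HOME/.ssh", "$HOME/.aws"}

complete = {
    "filesystem": {"allow": ["$HOME/.docker"]},
    "policy": {"override_deny": ["$HOME/.docker"]},
}
half_done = {
    "filesystem": {"allow": ["$HOME/project"]},
    "policy": {"override_deny": ["$HOME/.ssh"]},
}

print(check_exceptions(complete, DEFAULT_DENIED))
# {'override_without_allow': [], 'allow_without_override': []}
print(check_exceptions(half_done, DEFAULT_DENIED))
# {'override_without_allow': ['$HOME/.ssh'], 'allow_without_override': []}
```

An empty report means every exception in the policy is a matched pair, which is exactly the property that keeps each one visible in the diff.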

What I saw

After the kubectl incident, I started running sessions with a sandbox in place. Same agent, same repos, different tasks. The difference was that now I could see what was happening at the boundary.

The volume of blocked attempts surprised me. Not kubectl against production. That's kinda obvious. It was the quieter stuff. Bash commands and read calls bumping into blocks during what looked, from the outside, like completely routine work. Dependency resolution touching paths it had no reason to be near. File scans drifting toward credential files. Nothing dramatic. Nothing that would have shown up in the output or flagged as unusual behavior. Just a steady background of ambient reach that had always been there, in every session, invisible because nothing had ever been in the way.

Not that the agent was doing something wrong. It mostly wasn't. It's that I'd had no idea what a normal session was actually reaching for, because I'd never had anything to show me. The sandbox didn't change the agent's behavior. It just made the exposure visible for the first time.

The quiet risks are worth worrying about. A read against ~/.aws/credentials during what looks like a file scan. A dependency resolution step that touches your kubeconfig. A bash command that checks an environment variable. None of these show up in the agent's output. None of them feels like an incident. But once their contents are in the agent's context window, they are potential exfiltration targets for a later prompt injection.

The boundary is the signal. Not "something dangerous happened" but "something reached further than the policy allowed." Without the boundary, you don't get that signal. You just get a process that did whatever it could reach, and no record of where it went.

Define which keys belong in reach

The kubectl incident was obvious because I happened to be watching. Most of it isn't obvious. Most of it is routine-looking commands quietly reading things they had no reason to read, in sessions where you weren't paying close attention, against a background of ambient access that was never a decision. It was just what running as your user has always meant.

But it doesn't have to stay that way. The principle is clear: explicit over implicit. The agent's access should be a conscious decision, written down, enforced at the boundary, auditable in the record.

Next time you run an AI agent from your terminal, make that decision explicit. Decide what it actually needs. Write it down. Enforce it with a sandbox. Then watch what gets blocked. The sandbox won't change what the agent is trying to do. It'll just be the first time you can see what was always within reach—and the first time you had a real choice about which keys to leave on the counter.

brew install nono

nono learn --profile default -- your-agent-command

nono profile diff default your-new-profile

# edit the policy as needed

nono run --profile your-new-profile -- your-agent-command

The keys are yours. The decision to grant them should be too.

Published: May 22, 2026

Security AI

Steffen Petersen
