Over 40,000 AI agents are currently exposed to the internet with service account credentials, broad API access, and basically no attribution logging. That’s not a theoretical problem—that’s what researchers found when they went looking. We’re deploying AI agents with the same security patterns we used for service accounts in 2015, except now those “service accounts” can read, reason, and make decisions across your entire infrastructure.

An uncomfortable truth is emerging: zero trust and enterprise AI, as currently architected, are fundamentally incompatible. And we need to talk about it.

The Core Contradiction

Zero trust is built on a simple premise: never trust, always verify, grant minimum necessary access. Every entity—user, device, service—operates within hard permission boundaries. You authenticate, you get scoped access, you’re verified continuously, and you only see what you need to see.

Now put a frontier model on top of your network. Better yet, train a model on your company’s entire corpus of work. Suddenly you have an entity with ubiquitous knowledge of your organization. Documents, code, communications, customer data, strategic plans—the AI needs access to all of it to be useful.

There is no logical zero trust boundary inside an AI model.

The model knows everything it was trained on. It can reason across domains. It connects dots between HR data and financial projections and customer support tickets in ways that might be brilliant or might be catastrophic. And our current approach to managing this? Prompts that say “please don’t share this information with unauthorized users.”

That’s policy enforcement via vibes. The model has the information. The boundary is already crossed.

Living Off the AI

This isn’t a philosophical problem we can hem and haw over. It’s an active target. Traditional living-off-the-land attacks abuse legitimate tools already in the environment—PowerShell, WMI, whatever’s lying around. AI-enabled living-off-the-land attacks abuse the legitimate AI that already has read access to everything, write access through tool use, user trust, and massive audit blind spots.

Why compromise 50 different systems when you can compromise the one thing that already has access to all of them?

And we’re not just talking about models sitting on top of data anymore. We’re giving these things tool access. MCP servers. API integrations. The ability to execute code, provision resources, modify configurations. An agent connected to your infrastructure isn’t just reading—it’s acting.

Why pop a box and do privilege escalation when you can just prompt inject the AI to create you a new domain admin account? Better yet, why not have the AI give itself or a subagent elevated privileges, then execute your actual objective with those new permissions? Vibecode a new ransomware variant, deploy it across the environment, delete the conversation logs that show what happened. The AI does it all. Legitimately. Using its authorized access.

Traditional privilege escalation leaves artifacts—exploits, lateral movement, credential dumping. This? This is the AI using its normal capabilities. It’s supposed to have API access to Active Directory, or Okta, or Entra. It’s supposed to be able to provision resources. It’s supposed to execute administrative tasks.

The attack surface isn’t “find a vulnerability.” It’s “convince the AI to misuse its legitimate access.” And we’re making it easy. Service accounts with static credentials. No MFA. No continuous verification. Attribution that stops at “the agent did it” without any record of who asked the agent to do what, or why. No audit trail that distinguishes “agent created admin account because the oncall engineer needed one” from “agent created admin account because an attacker told it to.”

The Questions We’re Not Asking

I don’t have the answers. I don’t think anyone does yet. But here are some of the questions we need to be working on:

Identity: What does identity even mean when the actor is an AI agent? How do we attribute actions across multi-agent chains? If a user invokes a swarm and five tool calls later an agent accesses sensitive data, does the user’s original permission scope still apply? What’s the session boundary for an agent working on a task over three days?
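One way to start on the multi-agent attribution question is to carry the originating user’s identity and scope through every delegation hop, so the fifth tool call still knows who asked and what they were entitled to. A hedged sketch of that idea, with invented names throughout:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DelegationContext:
    """Immutable context threaded through an agent chain."""
    principal: str          # the human who started the chain
    scopes: frozenset       # permissions granted at invocation time
    chain: tuple            # every agent that has handled the request

    def delegate(self, agent_id: str) -> "DelegationContext":
        # A subagent inherits the context; scopes never widen on delegation,
        # and the chain records the hop for later attribution.
        return DelegationContext(self.principal, self.scopes,
                                 self.chain + (agent_id,))

    def can(self, scope: str) -> bool:
        return scope in self.scopes

ctx = DelegationContext("alice", frozenset({"read:tickets"}), ("orchestrator",))
sub = ctx.delegate("summarizer-agent")
# sub.principal is still "alice"; sub.chain records both agents.
```

This doesn’t answer the session-boundary question, but it makes the user’s original permission scope an explicit object that either does or does not apply at hop five, rather than something implicit that quietly disappears.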

Access Control: Can we build AI systems that internalize access control as core behavior rather than as external guardrails? What does “minimum necessary access” mean for a model that reasons across domains? How do you scope permissions for something that needs to understand context from everywhere?
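Until models can internalize access control, the practical stopgap is enforcing it at the tool boundary: the agent’s effective permission on any call is the intersection of what the agent holds and what the invoking user holds. A sketch of that gate, assuming a simple string-based permission model:

```python
def authorize_tool_call(tool: str, agent_perms: set, user_perms: set) -> bool:
    """Allow a tool call only if both the agent and the human behind the
    request hold the permission. An external guardrail, not internalized
    behavior -- but at least it's enforced, not prompted."""
    return tool in (agent_perms & user_perms)

agent_perms = {"read:tickets", "read:hr", "provision:vm"}
user_perms = {"read:tickets"}
# authorize_tool_call("read:hr", agent_perms, user_perms) -> False:
# the agent could read HR data, but this user couldn't, so the call is denied.
```

It’s a blunt instrument, and it doesn’t solve the “needs context from everywhere” problem, but it turns “minimum necessary access” from a prompt suggestion into a per-request check.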

Audit & Attribution: How do we audit what an AI knows versus what it will reveal? When an agent exposes data it shouldn’t, how do we investigate—was it malicious prompt injection, an unintentional error, or a poorly scoped but legitimate request? How do we chain together events across multiple agents for incident response?

Continuous Verification: What does “never trust, always verify” look like for an autonomous agent operating over extended periods? When does trust expire? What triggers re-verification? Traditional session timeouts don’t map to “AI agent working on long-running infrastructure task.”
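One plausible shape for continuous verification: trust expires on elapsed time or on any sensitive action, whichever comes first, forcing a step-up check mid-task instead of a one-time gate at session start. A sketch under those assumptions; the sensitive-action list and timeout are illustrative, not prescriptive:

```python
import time

SENSITIVE = {"create_account", "modify_iam", "delete_logs"}
MAX_TRUST_SECONDS = 15 * 60  # hypothetical: re-verify at least every 15 minutes

class AgentSession:
    def __init__(self):
        self.last_verified = time.monotonic()

    def needs_reverification(self, action: str) -> bool:
        # Trust expires on elapsed time OR on any sensitive action,
        # regardless of how recently the session was verified.
        aged_out = time.monotonic() - self.last_verified > MAX_TRUST_SECONDS
        return aged_out or action in SENSITIVE

    def reverify(self):
        # Placeholder for whatever re-verification means in your environment:
        # step-up auth from the invoking human, policy re-evaluation, etc.
        self.last_verified = time.monotonic()
```

That still leaves the three-day infrastructure task unsolved, but it at least makes “when does trust expire?” a policy decision with a knob, rather than a question nobody is answering.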

Threat Model: If enterprise AI is the ultimate living-off-the-land target, what does red teaming against AI-native infrastructure actually look like? What are the attack patterns we should be testing for?

Where This Goes

Each of these questions probably deserves its own research, its own tooling, its own framework. The identity problem alone—how we’re currently treating agents like service accounts and why that’s a disaster—needs a deep dive.

But we have to start by acknowledging the problem exists. Zero trust assumes bounded access. Enterprise AI is unbounded by design. Those two things can’t coexist without fundamental architectural changes to how we build, deploy, and secure AI systems.