For years, defenders have battled “living off the land” attacks—where adversaries progress using the tools already present on compromised systems (PowerShell, WMI, and the like). Then came “living off the cloud,” as threat actors hid in plain sight behind ubiquitous cloud services for malware delivery and data exfiltration. We’re now entering the next phase: living off the AI.
Organizations are rapidly adopting AI assistants, agents, and the emerging Model Context Protocol (MCP) ecosystem to stay competitive. Attackers have noticed. Let’s look at how different MCPs and AI agents can be targeted and how, in practice, enterprise AI becomes part of the attacker’s playbook. (MCP is an open source framework for LLMs and AI agents to securely connect with external systems.)
From “jailbreaks” to zero‑knowledge threat actors
Model behavior can be shifted by context, not just code. In a previous article I wrote about using an “immersive world” technique to persuade a model that generating malware was beneficial. The result was a working password‑stealing malware targeting Chrome—a proof point that the barrier to entry has collapsed. We called this the age of the zero‑knowledge threat actor. With access to AI tools, someone with minimal expertise can assemble credible offensive capabilities.
That democratization changes the risk calculus. When the same AI stack that accelerates your workforce also wields things like code execution, file system access, search across internal knowledge bases, ticketing, or payments, then any lapse in control turns into real business impact.
What “living off the AI” looks like
Unlike smash‑and‑grab malware, these campaigns piggyback on your sanctioned AI workflows, identities, and connectors.
Prompt‑born tool abuse: A document or webpage with hidden instructions causes an agent to invoke connected tools, querying your internal RAG index for secrets, scheduling tasks, or exfiltrating snippets to an external URL—without tripping traditional EDR.
MCP and agent “tooljacking”: Poorly permissioned tools let an agent read more data than it needs (broad filesystem mounts, full‑tenant search, unrestricted web fetch). An attacker nudges the agent to chain tools in ways the designer didn’t anticipate.
Memory and retrieval poisoning: If an agent learns from prior chats or a shared vector store, an attacker can seed malicious “facts” that reshape future actions—altering decisions, suppressing warnings, or inserting endpoints used for data exfiltration that look routine.
Cloud camo, AI edition: As with living off the cloud, attackers route communications via popular SaaS platforms. Now the AI itself is the dispatcher, posting to collaboration channels, updating tickets, or syncing “summaries” that quietly include sensitive data.
Abuse of vibe coding: AI-driven web development platforms such as Lovable and to a lesser extent via vulnerabilities in Base44, Netlify and Vercel, are increasingly being abused by cybercriminals to design, host, and launch convincing phishing sites.
Why this is accelerating now
- Adoption outpaces hardening: Business pressure deploys agents before privileges, guardrails, and observability mature.
- Tool surfaces proliferate: MCP and agent frameworks make it easy to add connectors; every new tool is a new trust boundary.
- Social engineering scales: Natural‑language interfaces make it trivial to plant instructions where users or agents will read them.
What to do about it (a practical baseline)
Treat agents as privileged users with automation superpowers. Then apply the same zero‑trust rigor you would to a high‑risk service account.
1. Minimize and isolate tool scope
- Narrow each tool’s permissions to the least privilege needed; prefer read‑only by default.
- Enforce explicit allow‑lists for network egress (domains, protocols) at the tool level.
- Separate high‑risk tools (file writes, external HTTP, code execution) behind additional policy prompts and human approval.
2. Harden prompts, context, and retrieval
- Version and protect system prompts; block runtime modification except through change control.
- Sanitize retrieved content (strip or fence system‑level instructions) and label provenance.
- Partition vector stores by team and sensitivity; avoid global “search everything” defaults.
3. Validate and constrain tool inputs/outputs
- Use strict schemas and server‑side validation; reject requests that exceed bounds (paths, sizes, destinations).
- Redact secrets before model exposure; never pass raw credentials into LLM context.
4. Add real guardrails beyond model policies
- Policy enforcement outside the model: rate limits, DLP, egress control, and signature checks on downloads/executables.
- Require step‑up approval for irreversible actions (payments, mass data exports, permission changes).
5. Observe and detect like an adversary would
- Centralize agent logs: prompts, tool calls, parameters, destinations, and data volumes.
- Create detections for unusual tool chaining, off‑hours bulk retrieval, first‑time access to sensitive stores, and outbound calls to non‑approved domains.
- Replay “red team” prompts (including immersive‑world scenarios) as regression tests on every agent update.
6. Educate the humans in the loop
- Train users to spot prompt injection and suspicious data sources in documents, tickets, and web results.
- Make it easy to report and quarantine questionable agent behavior.
What good looks like
Teams that succeed make AI security boring: agents have crisp scopes; high‑risk actions need explicit consent; every tool call is observable; and detections catch weird behavior quickly. In that world, an attacker can still try to live off your AI, sure, but they’ll find themselves fenced in, logged, rate‑limited, and ultimately blocked.
The clean takeaway
Living off the AI isn’t a hypothetical but a natural continuation of the tradecraft we’ve all been defending against, now mapped onto assistants, agents, and MCP. Model behavior can easily be steered (up to generating a real Chrome credential stealer) and today’s agent ecosystems can be targeted. The answer isn’t to slow down AI adoption or progress; it’s to professionalize it. In short, treat AI like production software with sensitive privileges. Design for least privilege, enforce with infrastructure, and verify continuously with adversarial testing. If we do that, AI becomes not a liability, but a durable advantage.

