    Yutani Loop: Building an Agentic Malware PoC to Understand Tomorrow's Threats

    This post accompanies one of our talks at Black Hat Europe 2025. For broader context on AI-powered threats, see Agentic AI in Malware: Current Capabilities vs. Hype.

    Most AI-powered malware today follows a simple pattern: hardcoded prompts query a language model, which returns commands to execute. The malware doesn't think — it follows a script, with AI filling in the blanks.

    Agentic malware is different. It decides what to do, not just how to do it. To understand these emerging threats, we built Yutani Loop — a proof-of-concept agentic PowerShell swarm.

    What Makes Malware "Agentic"?

    Agentic threats go beyond automation. They exhibit:

    • Autonomy: The malware plans and adapts its strategy toward a goal, rather than executing a fixed sequence of steps.
    • Self-learning: The malware improves over time, for instance learning what's worth stealing or how to bypass obstacles.
    • Behavior mimicry: By observing its environment, it can imitate normal user activity to blend in.
    • Evasion: It can mutate code, impersonate processes, or remain dormant to avoid detection.

    This isn't science fiction. The building blocks already exist. Yutani Loop demonstrates how they fit together.

    Architecture: A Swarm of Specialized Agents

    Yutani Loop separates planning from execution using a multi-agent architecture. Each agent has a distinct role:

    [Figure: Yutani Loop swarm architecture]

    Orchestrator Agent

    The orchestrator agent is the brain of the operation. It receives the overall goal (e.g., "exfiltrate sensitive documents"), understands the attack kill chain and MITRE ATT&CK techniques via its system prompt, and creates a strategy. It breaks complex objectives into subtasks and delegates them.

    Research Agent

    The research agent receives subtasks from the orchestrator and identifies the best method to accomplish each one. It draws on knowledge of common techniques, tools, and procedures.

    Verification Agent

    An optional quality-control layer. It validates outputs using a second AI model before execution, catching obvious errors or hallucinations.

    Tool Agent

    The tool agent generates PowerShell commands for the chosen method and executes them directly in memory. Results and error codes flow back to the orchestrator for follow-up or correction.
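    The division of labor among the agents above reduces to a plan-and-delegate loop. The following is an illustrative Python model of that loop, not the PoC's actual PowerShell code; the `Subtask` fields, status values, and agent callbacks are all hypothetical stand-ins for the real LLM calls:

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    description: str
    status: str = "pending"   # pending | done | failed (illustrative states)
    result: object = None

@dataclass
class Orchestrator:
    goal: str
    subtasks: list = field(default_factory=list)

    def plan(self):
        # In the PoC, an LLM decomposes the goal using the orchestrator's
        # system prompt; a fixed three-step split stands in for that call here.
        self.subtasks = [Subtask(f"step {i} of: {self.goal}") for i in (1, 2, 3)]

    def delegate(self, research_agent, tool_agent):
        for task in self.subtasks:
            method = research_agent(task.description)  # research agent picks a technique
            task.result = tool_agent(method)           # tool agent generates and runs code
            task.status = "done" if task.result else "failed"
```

    The point of the sketch is the separation of concerns: the orchestrator never generates code itself, which is exactly what spreads the observable behavior across processes.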

    Communication and Learning

    The agents communicate via inter-process communication (IPC), spreading activity across multiple processes. This complicates behavioral detection — no single process exhibits the full attack pattern.
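    A minimal sketch of what such inter-agent messages might look like, assuming a simple JSON framing; the field names are invented for illustration and are not the PoC's actual wire format:

```python
import json

def make_msg(sender, recipient, msg_type, payload):
    """Serialize one agent-to-agent message as a JSON string."""
    return json.dumps({"from": sender, "to": recipient,
                       "type": msg_type, "payload": payload})

def parse_msg(raw):
    """Deserialize and validate a message; reject anything malformed."""
    msg = json.loads(raw)
    missing = {"from", "to", "type", "payload"} - msg.keys()
    if missing:
        raise ValueError(f"malformed message, missing fields: {missing}")
    return msg
```

    From a detection standpoint, exactly this kind of structured traffic between otherwise-unrelated processes is itself a signal, a point revisited below.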

    This architecture also enables basic learning. When a tool agent is terminated by security software, the swarm notices. The orchestrator can adapt: try a different technique, switch to a stealthier approach, or abort that subtask entirely.
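    That feedback loop boils down to a simple pattern: try techniques in order of preference and fall back when one is blocked. A hedged Python sketch, with the callback and technique names as placeholders:

```python
def run_with_fallback(techniques, execute):
    """Attempt techniques in preference order. `execute` returns True on
    success and False when the attempt was blocked (e.g. the tool agent
    was terminated by security software). Returns the technique that
    worked, or None to signal the subtask should be abandoned."""
    for technique in techniques:
        if execute(technique):
            return technique
    return None
```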

    Related Research

    Yutani Loop isn't unique — it's the logical next step, and others are exploring similar ideas: Unit42's theoretical agentic attack framework, CMU & Anthropic's INCALMO module for orchestrating agentic attacks, and AI Voodoo's agent research — to name a few.

    What We Learned

    Building and testing Yutani Loop revealed hard truths about what works — and what doesn't — in agentic malware.

    What Worked

    Swarm architectures outperform single agents. Distributing tasks across specialized agents produces better results than one monolithic agent trying to do everything.

    Prompt precision is critical. Vague prompts lead to cascading errors. The orchestrator's system prompt required careful tuning to produce reliable strategies.

    Dynamic prompt generation beats hardcoding. Rather than embedding fixed prompts, having the agent generate and encrypt its own prompts (based on the target environment) improved consistency and reduced static signatures.

    What Did Not Work

    External dependencies create chaos. Agents frequently tried to download third-party tools to accomplish tasks, leading to dependency nightmares and broken workflows.

    Code generation remains unreliable. Even at temperature 0.2, roughly 20% of generated PowerShell code was non-functional across the models we tested, Grok 4 among them. A second verification model helped, but only marginally.

    Stopping criteria are unreliable. Agents often "over-try," endlessly pursuing impossible goals — like searching for a Bitcoin wallet that doesn't exist on the target system.
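    One mitigation we found necessary is an explicit attempt budget, since the models themselves rarely conclude that a goal is impossible. A sketch of such a guard; the names and the budget value are illustrative:

```python
def bounded_search(candidates, found, max_attempts=5):
    """Check candidates until `found` succeeds, but give up after
    max_attempts — a hard stop for goals that may not be achievable
    at all (e.g. a wallet file that simply isn't on the system)."""
    for attempts, candidate in enumerate(candidates, start=1):
        if found(candidate):
            return candidate
        if attempts >= max_attempts:
            return None   # budget exhausted: stop over-trying
    return None
```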

    We avoided using the Model Context Protocol (MCP) to keep the PoC lightweight, but as MCP servers grow in popularity, attackers could easily abuse linked tools for stealthier attacks.

    EDR Evasion

    The PoC could identify EDR products and devise strategies to bypass them based on published research. However, most documented techniques no longer worked — vendors had already patched the vulnerabilities. In one case, the agent tried to disable EDR via a vulnerable driver, but couldn't find one that wasn't already blacklisted. This shows that knowing a technique in concept does not mean it can be applied successfully.

    Staying Off-Grid

    When the PoC tried a different persistence method on each run (Registry Run keys, scheduled tasks, execution-chain modifications), the frequent changes themselves triggered security alerts. To fix this, we had the process generate a new natural-language prompt specific to the first method it chose, then store it (encrypted) within the process. This removed hardcoded prompts from the sample and kept the chosen method consistent across runs. We also did not limit the sample to English; it chose from several common languages when generating the prompt.

    Real-time Learning Remains Hard

    Feeding enough behavioral data to an external agent introduces latency and exposure. If security tools terminate the malware, it can't report back or retry — a significant limitation for autonomous propagation. Analyzing EDR log files and alerts can help if they are accessible.

    Implications for Defenders

    Yutani Loop demonstrates that agentic malware is buildable today. But it also reveals significant limitations that defenders can exploit:

    • Behavioral detection still works. The underlying actions — process creation, registry modification, network connections — remain detectable regardless of how they're orchestrated.
    • Multi-process activity creates patterns. Swarm architectures spread activity across processes, but the IPC communication and coordinated behavior create their own signatures.
    • AI-generated code has tells. The 20% failure rate, combined with distinctive coding patterns, offers detection opportunities.
    • Published techniques have short shelf lives. AI agents rely on documented knowledge. Keeping defenses updated faster than public disclosure neutralizes much of their capability.
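    The second point above can be operationalized: correlate events by IPC linkage rather than per process, and flag groups whose combined behavior crosses a threshold no single member does. A simplified sketch in which the event shape and action names are invented:

```python
from collections import defaultdict

# Actions that are individually common but suspicious in combination.
SUSPICIOUS_COMBO = {"process_create", "registry_write", "network_connect"}

def flag_ipc_groups(events):
    """events: iterable of (pid, ipc_group_id, action) tuples.
    Returns group ids whose members *collectively* performed every
    action in SUSPICIOUS_COMBO, even if no single pid did."""
    actions = defaultdict(set)
    for _pid, group, action in events:
        actions[group].add(action)
    return sorted(g for g, acts in actions.items() if SUSPICIOUS_COMBO <= acts)
```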

    What's Next

    Looking ahead, we expect the risks to increase as:

    • Local corporate AI models become common targets for hijacking
    • AI applications with insecure APIs proliferate
    • Attackers automate their attacks with cloud-hosted AI models, tunneling commands through small malware payloads
    • Prompt injection techniques mature
    • Configuration files (like `.cursorrules`) become attack vectors

    Agentic malware isn't here in force yet. But the gap between proof-of-concept and real-world deployment is shrinking. Understanding these threats now — their capabilities *and* their limitations — gives defenders time to prepare.