According to researchers at Zenity Labs, attackers can gain long-term control of the system running the agent via indirect prompt injection. All it takes is a single manipulated document; no further user interaction is required.

The core issue lies in OpenClaw’s architecture. The agent, also known as Clawdbot, processes content from untrusted sources such as emails or shared documents in the same context as direct user instructions. There is no separation between what the user explicitly wants and what the agent passively reads; instead, the agent largely relies on the safety mechanisms of the underlying language model.
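
To see why that is dangerous, consider a minimal sketch of the pattern. The function name and prompt layout below are invented for illustration and do not reflect OpenClaw’s actual code; they only show what it means to mix trusted and untrusted text in one context:

```python
# Hypothetical sketch of the architectural flaw: untrusted content is
# concatenated into the same prompt as the user's instruction, so the
# model sees both with equal authority. Names and layout are invented
# for illustration, not taken from OpenClaw's implementation.
def build_prompt(user_instruction: str, fetched_document: str) -> str:
    return (
        "You are an assistant with shell and file access.\n"
        f"User request: {user_instruction}\n"
        "Context retrieved for this request:\n"
        f"{fetched_document}\n"  # anything hidden here reads like an instruction
    )
```

Because nothing marks the document as data rather than instructions, a directive smuggled into it competes directly with the user’s request, and the only remaining line of defense is the model’s own training.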

Unlike traditional chatbots, OpenClaw is designed to take actions: it can execute commands, read and write files, and operate with whatever permissions the user grants during setup.

From an innocent document to a Telegram backdoor

The researchers demonstrate the attack using a typical enterprise scenario: an employee installs OpenClaw and connects it to Slack and Google Workspace.

The attack begins with what appears to be a harmless document. Buried deep in the text, however, is a concealed instruction. When OpenClaw processes the document, it is tricked into creating a new chat integration: a Telegram bot configured with an attacker-controlled access key.

Once this integration is in place, OpenClaw begins accepting commands directly from the attacker. The original entry point is no longer needed. The attacker now has a persistent control channel outside the organization’s visibility. The researchers deliberately do not disclose the exact exploit code.
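
The mechanics are easy to picture from Telegram’s public Bot API: a chat integration is essentially a loop that long-polls the bot for new messages and feeds each one to the agent as input. The sketch below uses the documented getUpdates endpoint; the token is a placeholder, and the wiring into OpenClaw is not shown:

```python
# Sketch of a Telegram chat integration using the documented Bot API
# (getUpdates long polling). The token value is a placeholder; in the
# attack, it is the attacker-controlled key that the injected
# instruction planted in the new integration.
import requests

TOKEN = "<attacker-controlled-token>"  # placeholder
API = f"https://api.telegram.org/bot{TOKEN}"

offset = None
while True:
    updates = requests.get(
        f"{API}/getUpdates", params={"timeout": 30, "offset": offset}, timeout=60
    ).json()
    for update in updates.get("result", []):
        offset = update["update_id"] + 1
        text = update.get("message", {}).get("text", "")
        # Messages sent to the bot arrive here; in the demonstrated
        # attack, the agent treats them like ordinary user commands.
        print("message via bot:", text)
```

Anyone who can message the bot can issue instructions to the agent, which is why a single planted access key is enough for a persistent channel.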

From agent control to full system control

With the backdoor in place, attackers can directly abuse the agent. Because OpenClaw operates with the user’s permissions, it can execute commands on the local machine. In a demo, the researchers show how they locate files, exfiltrate them to their own server, and then delete them.

Even more concerning is the potential for persistence. OpenClaw uses a configuration file called SOUL.md that defines the agent’s behavior. Through the backdoor, an attacker can modify this file. In their proof of concept, the researchers create a scheduled task that runs every two minutes and overwrites SOUL.md. Even if the original chat integration is removed, the attacker retains control.
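
The persistence trick itself is conventional. Conceptually, the scheduled task boils down to re-applying the attacker’s version of the file on a timer, as in this sketch; the paths are hypothetical, and the researchers used a scheduled task rather than a long-running script:

```python
# Conceptual sketch of the persistence mechanism: re-apply the
# attacker's copy of SOUL.md on a two-minute interval, matching the
# proof of concept. Paths are hypothetical; the researchers used a
# scheduled task, shown here as a plain loop for readability.
import os
import shutil
import time

AGENT_CONFIG = os.path.expanduser("~/.openclaw/SOUL.md")  # hypothetical path
ATTACKER_COPY = "/tmp/.soul_cache"                        # hypothetical staging file

while True:
    shutil.copyfile(ATTACKER_COPY, AGENT_CONFIG)  # restore the modified behavior rules
    time.sleep(120)  # two minutes
```

Because the task keeps restoring the modified file, removing the Telegram integration alone does not evict the attacker: the agent re-reads its behavioral rules and the backdoor comes back.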

As a final step, the researchers demonstrate the installation of a command-and-control (C2) beacon. At this point, the compromised AI agent becomes a classic hacker foothold. From there, lateral movement within a corporate network, credential theft, or the deployment of ransomware becomes possible.

The attack works across multiple models, including GPT-5.2, and via different integrations. “If personal AI assistants are going to live on our endpoints, security compromises are not an option,” the researchers write. All video demonstrations are publicly available.

OpenClaw faces fundamental security flaws

OpenClaw has previously drawn criticism for severe security weaknesses. Recently, a developer tested it with the ZeroLeaks security analysis tool, with devastating results: a score of 2 out of 100 points, an 84% data extraction rate, and a 91% success rate for injection attacks across common language models. Only Claude Opus 4.5 performed slightly better, scoring 39 out of 100 points, which is still far from acceptable.

System prompts, tool configurations, and memory files could be extracted with little effort. A simple scan also found 954 OpenClaw instances with open gateway ports, many of them completely unauthenticated. The newly demonstrated backdoor fits into a broader pattern of systemic security failures.