New research from Microsoft reveals a novel attack vector targeting AI agents that act on a user's behalf. By poisoning a tool description within the Model Context Protocol (MCP), an attacker can trick these agents into exfiltrating sensitive company data without breaking any apparent rules.

The attack's subtlety lies in its evasion of standard defenses. Because every step of the compromised agent's operation appears routine, default monitoring systems may fail to trigger alerts. This makes the technique particularly dangerous for organizations deploying AI agents in automated workflows.

At a technical level, the exploit leverages the MCP's trust in tool descriptions. Attackers inject malicious instructions into what the agent reads as benign metadata, redirecting its behavior to silently transfer data to an external endpoint. Microsoft did not release specific indicators of compromise, though it noted the activity resembles legitimate tool calls.

No patch or software update has been issued, as the attack targets the configuration layer rather than a code vulnerability. Microsoft recommends that organizations using AI agents enforce strict validation of tool descriptions, implement anomaly detection for outbound data flows, and review agent permissions regularly.

The research, conducted by the Microsoft Incident Response team, underscores a growing class of threats targeting the trust models underpinning AI-driven automation. As enterprises expand their reliance on AI agents, supply-chain-style poisoning of configuration inputs may become a recurring vector.