Early adopters of Anthropic's and OpenAI's latest cyber-capable AI models report that the systems still demand substantial human guidance to function effectively. Palo Alto Networks told Axios it uncovered 75 bugs using the Mythos and GPT-5.5 models together, compared with the 5-10 bugs its teams typically find without AI assistance.

This marks a critical phase in AI-powered cybersecurity, shifting focus from fully autonomous hacking to how humans direct, validate, and operationalize increasingly powerful systems. Major companies and governments worldwide have been eager to test these models to prepare for when similar capabilities reach attackers.

When it unveiled Mythos Preview, Anthropic cautioned that the model was powerful enough to discover tens of thousands of bugs across nearly every operating system. Third-party testing indicates OpenAI's GPT-5.5-Cyber matches Mythos in bug discovery and exploit-writing capabilities.

The findings suggest that even next-generation AI cybersecurity tools are not set-and-forget solutions. Effective deployment will likely hinge on human expertise to steer these models toward meaningful vulnerabilities and verify their outputs, rather than relying on full autonomy.

Some experts argue that this dependence on human expertise could limit how well defenders scale against automated, AI-driven attacks, potentially creating a bottleneck in incident response. The balance between autonomy and oversight remains a central challenge for the industry.