OWASP Agentic AI Taxonomy in Action: From Theory to Tools

As the work of OWASP’s Agentic Security Initiative progresses, we’re really excited to see how it is already shaping the security practices of developers, red teamers, and AI product teams. Our Agentic AI – Threats and Mitigations taxonomy is more than a framework—it’s becoming a foundation for tooling, testing, and community learning.

Alongside the OWASP Top 10 for LLM Applications, the Agentic AI taxonomy is being adopted by leading tools that embed security into the real workflows of AI builders. In this post, we want to spotlight three tools that are integrating these taxonomies into practice—and contributing back by helping us run developer hackathons, improve coverage, and gather practical insights. These include – in random order:

Pensar logo

PENSAR: Empowering Secure AI Agent Development

PENSAR provides developers with a focused platform for building and testing autonomous agents. Pensar generates SAST-style findings and remediations that are aligned with the Agentic AI threat & Mitigations. These agentic findings are especially valuable, helping surface real-world risks such as unverified tool use, memory mismanagement, or unintended cascading behaviors.

What stands out is how PENSAR integrates the OWASP taxonomies within the development lifecycle, not just as a post-hoc check. This empowers developers to adopt secure-by-design practices as they build, and contributes to the shared OWASP knowledge base with practical feedback from real-world usage.

SplxAI logo

SPLX.AI Agentic Radar: Code-to-Execution Threat Coverage

SPLX.AI’s Agentic Radar offers one of the most comprehensive AI security testing pipelines available, combining static code analysis, runtime red teaming, and agentic workflow visualization. Their radar system is a unique product offering that generates a graph showing what tools an agent is connected to. This helps teams monitor potential issues with Excessive Agency (OWASP Top Ten for LLMs #6) as they build. SPLX.AI’s support for both the OWASP Top 10 for LLMs and the Agentic AI taxonomy in their products brings our work to life across the full AI system lifecycle.

As announced here, SPLX.AI is experiencing rapid growth in adoption among CISOs and AI security leads. It’s unique in enabling both governance alignment and technical assurance, bringing OWASP-based risk detection to code, config, and behavior in live systems.

ai&me logo AI&ME: Low-Code Testing + Real-Time Defense

AI&ME lowers the barrier to entry for secure AI development by offering a low-code interface to test systems. Whether you’re designing prompts, deploying agents, or securing inference APIs, AI&ME enables targeted testing across the lifecycle. It uses uses ASI Threats & Mitigations and LLM Top 10 to  test an AI system as a whole using an application endpoint, rather than just test the model in use. Combined with their API and GitHub workflow support, the low-code approach helps shift-left AI assurance by making it easy enough to become an embedded ‘habbit’ for teams.

Their automatic AI Firewall  enforcse real-time protections. This makes it not just a testing tool, but a live defense mechanism—perfect for teams who want fast, actionable, and customizable OWASP-aligned security.

🧪 Grounding Standards in Real Systems, Not Opinions

As maintainers of the OWASP Agentic AI Threats and Mitigations, we believe standards must be built on evidence. Every threat in our taxonomy includes a sample implementation in our GitHub repo, and these tools are already helping to expand and validate that library.

By integrating testing and feedback loops, tools like SPLX.AI, Pensar, and AI&ME are not just consumers of OWASP—they help our community become co-creators, enabling them to iterate through practical, high-impact security guidance and provide back concrete feedback.

🧭 The Road to the OWASP Top 10 for Agentic AI

This tooling ecosystem is laying the foundation for the forthcoming OWASP Top 10 for Agentic Applications. As systems evolve from single models to orchestras of agents with access to tools, APIs, and memory, new risk patterns emerge—from over-delegation to invisible cascades.

We are actively using the feedback from these tools and their communities to shape a Top 10 that is actionable, evidence-based, and aligned to real-world threats. Your feedback and contributions make this possible.

🔊 Join Us at DEFCON & Black Hat

If you want to get involved, learn more, or help shape what’s next, come see us at these upcoming events:

🎤 DEF CON 33 GenAI Village – Live demos and discussion with OWASP contributors:

🎤 Black Hat USA 2025 Briefing – Deep dive into Agentic AI threats, mitigations, and assurance tooling:

Whether you’re a developer, red-teamer, researcher, or policymaker, we welcome contributions and feedback. Visit https://genai.owasp.org/initiatives/#agenticinitiative  and help us make Agentic AI security real, actionable, and effective.

Let’s build a secure future for Agentic AI— together.

 

Allie Howe is the OWASP GenAI Security Project, ASI Lead for Code Samples & Developer Engagement, and the vCISO at Growth Cyber.

John Sotiropoulos is an OWASP GenAI Security Project Board member and ASI Co-lead. He is also the Head of AI Security at Kainos.

 

Scroll to Top