OWASP Vendor Evaluation Criteria for AI Red Teaming Providers & Tooling v1.0

About

Vendor Evaluation Criteria for AI Red Teaming Providers & Tooling is a practical guide for organizations assessing vendors that offer AI red teaming services or automated testing tools. Developed under the OWASP GenAI Security Project, the document outlines clear criteria for evaluating both simple GenAI systems (such as chatbots and RAG applications) and advanced systems (including tool-calling agents, MCP architectures, and multi-agent workflows). It helps decision-makers distinguish substantive adversarial testing from superficial “jailbreak-only” offerings by identifying green flags and red flags and by addressing realistic threat models, evaluation rigor, tooling quality, and governance considerations. The goal is to enable confident vendor selection that reduces real-world AI risk.