OWASP Gen AI Incident & Exploit Round-up, Q2’25

OWASP Gen AI Incident & Exploit Round-up, Q2 (Mar-Jun) 2025

About the Round-up

This is not an exhaustive list, but a semi-regular blog where we aim to track and share insights on recent exploits involving or targeting Generative AI. Our goal is to provide a clear summary of each reported incident, including its impact, a breakdown of the attack, relevant vulnerabilities from the OWASP Top 10 for LLM and Generative AI, potential mitigations, and any urgent actions to take.

Information comes from online sources and direct reports to the project. This list includes both documented exploits and research efforts uncovering vulnerabilities.

Sharing an Exploit or Incident with Us

We will continue to monitor and crowd source for exploits or incidents, if you have one you want to share which we should include in an upcoming round-up, just complete this Google Form. or if you are already on the OWASP slack account you can share on the #team-genai-threat-round-up channel.

Q2 GenAI Exploits Round-up

Exploit 1: GPT-4.1 Jailbreak via Tool Poisoning. 1

Exploit 2: Deepfake Voice Scam Targets Banking Systems. 1

Exploit 3: Prompt Injection in ChatGPT Leads to Data Leaks. 1

Exploit 4: DeepSeek Data Breach. 1

Exploit 5: AI-Generated Deepfake Music Takedown by Sony Music. 1

Exploit 6: NVIDIA TensorRT-LLM Python Executor Vulnerability (CVE-2025-23254) 1

Exploit 7: CAIN – Targeted LLM Prompt Hijacking. 1

Exploit 8: AI-Generated Vishing Attacks via ViKing. 1

Exploit 9: AI-Powered Credential Stuffing and Automated Scanning. 1

Exploit 10: DeepSeek Data Breach and Unauthorized Data Transfer 1

Exploit 11. McDonald’s AI Hiring Bot Breach. 1

Exploit 12. AI‑Driven Rubio Impersonation Campaign. 1

Exploit 13. Indirect Prompt‑Injection Malware (“Skynet”) 1

Exploit 14. M365 Copilot Zero‑Click AI Command Injection. 1

 

Exploit 1: GPT-4.1 Jailbreak via Tool Poisoning

Description   :
In April 2025, attackers exploited GPT-4.1’s tool integration by embedding malicious instructions within tool descriptions. This “tool poisoning” led the AI to execute unauthorized actions, including data exfiltration, without user awareness.

Incident Summary:

  • Incident Name: GPT‑4.1 Echo Chamber / Tool Poisoning Jailbreak
  • Date & Location: April – June 2025
  • Affected Organizations: OpenAI GPT‑4.1 deployments, any platform using Model Context Protocol (MCP)-based tools
  • Attack Type: Indirect prompt injection via tool-poisoned context
  • System Impacted: GPT‑4.1 LLM and its tool-integration layer (e.g. MCP tool descriptions)

Impact Assessment  :
The exploit allowed unauthorized data access and potential exfiltration in applications using GPT-4.1. While financial losses are unreported, the breach undermined trust in AI integrations and highlighted vulnerabilities in tool descriptions.

Attack Breakdown:
Attackers created tool descriptions embedded with hidden malicious prompts and integrated these tools into applications utilizing GPT-4.1. When GPT-4.1 interacted with these tools, it unintentionally carried out harmful actions as instructed by the concealed prompts. This exploitation led to the unauthorized access and transmission of sensitive data, resulting in data exfiltration without user consent.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM01: Prompt Injection
  • LLM02: Indirect Prompt Injection
  • LLM05: Model Output Trust
  • LLM07: Insecure Plugin Design
  • LLM08: Insecure Third‑Party Integration
  • LLM09: Overreliance on AI-Generated Content

Potential Mitigations:

  • Implement strict validation and sanitization of tool descriptions.
  • Establish permissions and access controls for tool integrations.
  • Monitor AI behavior for anomalies during tool execution.
  • Educate developers on secure integration practices.

Call to Action:

IT and cybersecurity teams should audit AI tool integrations and monitor for unusual activity. Developers must validate third-party tools and ensure descriptions are free of hidden prompts. End users should stay alert to unexpected AI behaviors and report any anomalies. Policymakers need to establish clear guidelines to ensure secure and trustworthy AI tool integrations.

Source References:

Exploit 2: Deepfake Voice Scam Targets Banking Systems

Description   :
In March 2025, scammers used AI-generated deepfake voices to impersonate bank customers, bypassing voice authentication systems. This led to unauthorized access to accounts and significant financial theft, exposing vulnerabilities in voice-based security.

Incident Summary:

  • Incident Name: Deepfake Voice Banking Scam
  • Date & Location: March 2025, Hong Kong
  • Affected Organizations: Multiple banks employing voice authentication
  • Attack Type: AI-Generated Deepfake Voice Impersonation
  • System Impacted: Banking voice authentication system

Impact Assessment  :
The attack resulted in unauthorized transactions totaling approximately $25 million. It highlighted the inadequacy of voice authentication against sophisticated AI-generated impersonations.

Attack Breakdown:
Scammers collected voice samples from public sources and used AI tools to generate highly realistic voice clones. These deepfakes were then employed to bypass voice recognition systems used by banks for authentication. As a result, unauthorized transactions were carried out, with funds being transferred without the account holders’ consent.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM09: Overreliance on AI-Generated Content
  • LLM07: Insecure Plugin Design
  • LLM04: Training Data Poisoning

Potential Mitigations:

  • Implement multi-factor authentication combining voice with other verification methods.
  • Deploy deepfake detection algorithms in authentication systems.
  • Limit the use of voice authentication for high-risk transactions.
  • Regularly update and train staff on emerging AI threats.

Call to Action:

IT and cybersecurity teams should enhance authentication systems with multi-modal verification to counter AI threats. Banking institutions must upgrade security protocols to detect AI-generated inputs. Customers are urged to stay vigilant and report any suspicious account activity immediately. Regulators need to establish standards for authentication methods that are resistant to AI-driven attacks.

Source References:

Exploit 3: Prompt Injection in ChatGPT Leads to Data Leaks

Description   :
In March 2025, attackers exploited a prompt injection vulnerability in ChatGPT, causing it to disclose sensitive user data. By embedding malicious prompts in user inputs, they manipulated the AI to bypass its safety mechanisms and leak confidential information.

Incident Summary:

  • Incident Name: ChatGPT Prompt Injection Data Leak
  • Date & Location: March 2025, Global
  • Affected Organizations: Users and organizations utilizing ChatGPT
  • Attack Type: Prompt Injection
  • System Impacted: ChatGPT’s conversational AI platform

Impact Assessment  :
The breach led to unauthorized access to user data, undermining trust in AI platforms. While specific financial losses are unreported, the incident emphasized the need for robust input validation in AI systems.

Attack Breakdown:

Attackers crafted inputs containing hidden malicious prompts that manipulated ChatGPT into overriding its safety protocols. As a result, the AI unintentionally disclosed sensitive information from prior interactions. This exploit quickly spread across multiple instances, amplifying the breach and raising serious concerns about AI input handling and data security.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM01: Prompt Injection
  • LLM09: Overreliance on AI-Generated Content
  • LLM03: Training Data Leakage

Potential Mitigations:

  • Implement strict input validation and sanitization processes.
  • Enhance AI models with context-aware filters to detect and block malicious prompts.
  • Regularly update AI safety protocols and conduct security audits.
  • Educate users on safe interaction practices with AI systems.

Call to Action:

AI developers must enhance model defenses against prompt injection attacks, while organizations should monitor AI interactions for anomalies and data leaks. Users are advised to avoid sharing sensitive information with AI platforms. Regulatory bodies need to establish clear guidelines for secure input handling and data protection.

Source References:

Exploit 4: DeepSeek Data Breach

Description:
In late January and early March 2025, DeepSeek’s cloud database was exposed due to misconfiguration, leaking chat logs, API keys, and user metadata (approx. 1M+ records), prompting regulatory scrutiny in South Korea, Italy, and the U.S.

Incident Summary:

  • Incident Name: DeepSeek Cloud DB Exposure
  • Date & Location: Jan 29 – Mar 3, 2025 (Global)
  • Affected Organizations: DeepSeek (DeepSeek-R1 chatbot)
  • Attack Type: Cloud misconfiguration / data exposure
  • System Impacted: Cloud database storing user logs/API credentials

Impact Assessment :

Over one million chat records, API keys, and user data were exposed for at least an hour. Regulators in South Korea and Italy launched investigations; the U.S. Department of Commerce restricted government use. Long-term reputational and compliance damages likely.

Attack Breakdown :

A publicly accessible storage endpoint was misconfigured (no auth), enabling anyone to download sensitive logs. Security firm Wiz Research notified DeepSeek, which secured access within an hour. However, the incident triggered data privacy investigations and regulatory bans across several countries.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM08: Insecure Third‑Party Integration – misuse of cloud without secure controls
  • LLM09: Overreliance on AI Content – assumed internal data systems would self-protect

Potential Mitigations:

  • Technical: Enforce access controls, encryption at rest, infrastructure audit tools
  • Policy: Regular config reviews, organizational cloud security policies
  • Education: DevOps training for secure cloud deployment

Call to Action :

AI system operators must bake security into deployment pipelines, enabling automated detection of misconfigured environments. Cross-training between AI/development and cloud-security teams is critical to prevent data leakage incidents.

Source References:

Exploit 5: AI-Generated Deepfake Music Takedown by Sony Music

Description:

In March 2025, Sony Music removed over 75,000 AI-generated deepfake tracks from streaming platforms. These unauthorized recordings mimicked artists like Beyoncé and Harry Styles, highlighting the escalating issue of AI-generated content infringing on intellectual property rights.

Incident Summary:

  • Incident Name: Sony Music Deepfake Takedown
  • Date & Location: March 2025; Global
  • Affected Organizations: Sony Music Entertainment, streaming platforms, artists including Beyoncé and Harry Styles
  • Attack Type: Unauthorized AI-generated deepfake music
  • System Impacted: Digital music streaming platforms, artists’ intellectual property

Impact Assessment:

The proliferation of deepfake tracks poses significant financial and reputational risks to artists and record labels. Sony’s removal of 75,000 tracks underscores the challenge of protecting intellectual property in the age of generative AI.

Attack Breakdown:

Developers trained AI models on existing music catalogs, often without permission, to replicate the vocal and musical styles of popular artists. These models were used to generate deepfake tracks, which were then uploaded to streaming platforms under misleading names. Listeners, thinking the content was genuine, streamed or bought the tracks, diverting revenue from real artists and labels. Sony Music identified and issued takedown requests for over 75,000 such tracks, a costly and time-consuming effort

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM04: Training Data Poisoning – Unauthorized use of copyrighted material to train AI models.
  • LLM09: Overreliance on AI-Generated Content – Consumers and platforms accepting AI-generated content without adequate verification.

Potential Mitigations:

  • Robust Content Verification: Implement AI-driven tools to detect and flag deepfake content on streaming platforms.
  • Licensing Frameworks: Establish clear licensing agreements for the use of artists’ works in AI training datasets.
  • Regulatory Measures: Governments to enact laws that protect against unauthorized use of likenesses and works in AI-generated content.
  • Industry Collaboration: Record labels, streaming services, and AI developers to collaborate on standards and practices to prevent misuse.

Call to Action:

IT and cybersecurity teams should deploy detection algorithms to identify and remove unauthorized AI-generated content. Streaming platforms must improve vetting processes and act quickly on takedown requests. Artists and rights holders need to monitor platforms and assert their rights through legal means. Policymakers must establish and enforce laws that address the growing challenges of AI-driven content creation and distribution.

Source References:

Exploit 6: NVIDIA TensorRT-LLM Python Executor Vulnerability (CVE-2025-23254)

Description:

In April 2025, a critical vulnerability (CVE-2025-23254) was identified in NVIDIA’s TensorRT-LLM framework. The flaw allows attackers with local access to execute arbitrary code, access sensitive information, and tamper with data due to insecure deserialization in the Python executor component.

Incident Summary:

  • Incident Name: TensorRT-LLM Python Executor Deserialization Vulnerability
  • Date & Location: Disclosed on April 29, 2025; Global impact
  • Affected Organizations: Users and organizations deploying NVIDIA TensorRT-LLM versions prior to 0.18.2
  • Attack Type: Deserialization of Untrusted Data
  • System Impacted: TensorRT-LLM’s Python executor component across Windows, Linux, and macOS platforms

Impact Assessment:

The vulnerability scores a high 8.8 on the CVSS v3.1 scale. Exploitation can lead to unauthorized code execution, information disclosure, and data tampering, posing significant risks to AI model integrity and system security.

Attack Breakdown:

The Python executor in TensorRT-LLM relies on Python’s pickle module for inter-process communication, creating a vulnerability. An attacker with local access can craft malicious serialized data, which the system deserializes without proper validation. This flaw enables arbitrary code execution, potentially compromising AI models, leaking sensitive data, and allowing data manipulation.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM07: Insecure Plugin Design – Use of unsafe deserialization methods without validation.
  • LLM09: Overreliance on AI-Generated Content – Trusting serialized data without verification.

Potential Mitigations:

  • Update Framework: Upgrade TensorRT-LLM to version 0.18.2 or later, which includes HMAC encryption for IPC.
  • Secure Serialization: Avoid using pickle for IPC; prefer safer serialization methods like JSON with strict schema validation.
  • Access Controls: Implement strict access controls to limit local access to the TensorRT-LLM server.
  • Monitoring: Deploy monitoring tools to detect unusual activities or unauthorized access attempts.

Call to Action:

In response to emerging vulnerabilities in TensorRT-LLM, a coordinated effort across all stakeholders is essential to uphold cybersecurity. IT and cybersecurity teams must promptly assess their systems for outdated or vulnerable versions and implement necessary updates without delay. Developers should proactively review and refactor their codebases, replacing insecure serialization methods and validating all data inputs to prevent exploitation. End users play a vital role by staying informed about software updates and promptly applying patches to safeguard their systems. Meanwhile, policymakers are encouraged to advocate for secure coding standards and enforce routine security audits in AI development frameworks to foster a more resilient digital infrastructure.

Source References:

 

Exploit 7: CAIN – Targeted LLM Prompt Hijacking

Description:
CAIN manipulates LLM system prompts to produce malicious answers to specific questions while appearing benign otherwise.

Incident Summary:

  • Incident Name: CAIN Prompt Hijacking
  • Date & Location: May 2025, Global
  • Affected Organizations: LLM providers, AI application developers\
  • Attack Type: Prompt injection
  • System Impacted: Large Language Models

Impact Assessment:
Enables large-scale information manipulation, potentially influencing public opinion or disseminating misinformation.

Attack Breakdown:

  • Initial vector: Injection of malicious system prompts targeting specific questions
  • Execution details: LLMs provide harmful answers to targeted prompts while remaining benign otherwise
  • Final impact: Spread of misinformation, erosion of trust in AI systems

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM01: Prompt Injection

Potential Mitigations:

  • Technical defenses: Implement robust prompt validation and monitoring systems
  • Policy improvements: Develop guidelines for secure prompt engineering
  • User education: Educate users on the potential for manipulated AI responses

Call to Action:

IT and cybersecurity teams should continuously monitor AI outputs for unusual behavior, while affected organizations must audit their models for prompt injection vulnerabilities. End users and employees are encouraged to verify critical information across multiple sources to avoid misinformation. Regulators and policymakers must step in to establish clear standards for AI prompt security, ensuring safer and more reliable AI deployments across sectors.

Source References:

Exploit 8: AI-Generated Vishing Attacks via ViKing

Description:
ViKing utilizes AI to conduct realistic voice phishing attacks, deceiving individuals into disclosing sensitive information.

Incident Summary:

  • Incident Name: ViKing AI Vishing
  • Date & Location: April 2025, Global
  • Affected Organizations: Financial institutions, individuals
  • Attack Type: AI-generated vishing
  • System Impacted: Voice communication channels

Impact Assessment:

High success rate in deceiving individuals, leading to unauthorized disclosure of sensitive information.

Attack Breakdown:

Automated AI-generated phone calls served as the initial attack vector, using realistic conversations that mimicked trusted entities. These convincing interactions led to unauthorized access to personal and financial information, posing serious risks to individuals and organizations.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM09: Overreliance on AI Content

Potential Mitigations:

  • Technical defenses: Implement voice authentication mechanisms
  • Policy improvements: Establish protocols for verifying caller identities
  • User education: Train individuals to recognize and report suspicious calls

Call to Action:

IT and cybersecurity teams should deploy AI-driven call monitoring systems to detect and prevent fraudulent activity. Affected organizations need to strengthen customer verification methods to reduce risks. End users and employees must stay alert and avoid sharing sensitive information during unsolicited calls. Meanwhile, regulators and policymakers should create clear regulations to address threats posed by AI-generated communications.

Source References:

Exploit 9: AI-Powered Credential Stuffing and Automated Scanning

Description:
Cybercriminals leverage AI to conduct large-scale automated scanning and credential stuffing attacks, compromising numerous systems.

Incident Summary:

  • Incident Name: AI-Driven Credential Attacks
  • Date & Location: May 2025, Global
  • Affected Organizations: Various industries
  • Attack Type: Credential stuffing, automated scanning
  • System Impacted: Authentication systems, web applications:

Impact Assessment:

Significant increase in compromised accounts, leading to unauthorized access and potential data breaches.

Attack Breakdown:

Attackers are using AI to automate scanning and credential testing, rapidly deploying stolen credentials across multiple platforms. This technique enables quick identification of valid logins, leading to unauthorized access to user accounts and increasing the risk of data theft.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM09: Overreliance on AI Content

Potential Mitigations:

  • Technical defenses: Implement multi-factor authentication and rate limiting
  • Policy improvements: Regularly update and enforce strong password policies
  • User education: Educate users on creating strong, unique passwords

Call to Action:

IT and cybersecurity teams should monitor for unusual login patterns to detect potential breaches early. Affected organizations need to perform regular security audits to identify and fix vulnerabilities. End users and employees are encouraged to use password managers to maintain strong, unique credentials. Regulators and policymakers should promote standards that support secure authentication practices across digital platforms.

Source References:

Exploit 10: DeepSeek Data Breach and Unauthorized Data Transfer

Description:

Chinese AI startup DeepSeek transferred South Korean user data and AI prompts to overseas servers without consent, violating global privacy regulations and raising national security concerns.

Incident Summary:

  • Incident Name: DeepSeek Data Breach
  • Date & Location: April 2025, South Korea
  • Affected Organizations: DeepSeek, South Korean users
  • Attack Type: Data exfiltration
  • System Impacted: User data storage and transfer mechanism:

Impact Assessment:

Unauthorized data transfers led to regulatory blocks, financial penalties, and enforcements under GDPR and South Korea’s PIPC. Broader trust and reputational damage ensued globally.

Attack Breakdown:

Unauthorized data transfer mechanisms within DeepSeek enabled the unconsented transmission of personal data including AI prompts and device information to Beijing Volcano Engine Technology Co. Ltd. This breach led to the exposure of sensitive user information, triggering regulatory scrutiny and resulting in the suspension of app downloads in South Korea.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM09: Overreliance on AI Content

Potential Mitigations:

  • Technical defenses: Implement strict data access controls and monitoring
  • Policy improvements: Revise privacy policies to ensure compliance with data protection laws
  • User education: Inform users about data collection practices and obtain explicit consent:

Call to Action:

IT and cybersecurity teams must audit data transfer mechanisms and enforce strong encryption practices. Affected organizations should follow regulatory guidance and delete any improperly transferred data. End users and employees need to stay informed about app permissions and data sharing policies. Regulators and policymakers must enhance oversight of cross-border data transfers and ensure strict compliance to protect user privacy.

Source References:

    1. https://koreajoongangdaily.joins.com/news/2025-04-24/national/socialAffairs/DeepSeek-ordered-to-revise-personal-data-policies-delete-mishandled-user-information/2292939?
    2. https://www.reuters.com/technology/south-korea-agency-says-deepseek-transferred-user-info-prompts-without-consent-2025-04-24/?  

Exploit 11. McDonald’s AI Hiring Bot Breach

Description :

Security researchers easily breached McDonald’s AI hiring chatbot “Olivia,” guessing a simple password (“123456”) and gaining admin access on June 30, 2025, exposing personal details from up to 64 million job applicants.

Incident Summary:

  • Incident Name: “Olivia” AI Hiring Bot Breach
  • Date & Location: June 30, 2025 (global)
  • Affected Organizations: McDonald’s, Paradox.ai
  • Attack Type: Credential guessing on AI system
  • System Impacted: AI hiring chatbot backend (McHire.com)

Impact Assessment :

Exposed millions of personal records including names, emails, phone numbers across decades. Promoted Paradox.ai to launch a bug bounty program. McDonald’s responded swiftly, though damage to trust and compliance metrics remains.

Attack Breakdown :

Researchers found an unprotected staff login endpoint on McHire.com. By guessing “123456”, they accessed the admin console, enumerated applicant IDs, and retrieved millions of user records. The exploit stemmed from weak credential hygiene at vendor Paradox.ai, exploitable within 30 minutes.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM07: Insufficient Authentication – trivial password allowed full access
  • LLM08: Insecure Third‑Party Integration – vendor system lacked standard security controls

Potential Mitigations:

  • Technical: Enforce strong credential policies, MFA, automated log monitoring
  • Policy: Vendor risk assessments, contractually mandated security audits
  • Education: Train developers on secure AI deployment and credential hygiene

Call to Action:

IT teams should audit authentication on all AI‑powered endpoints, enforce strong passwords and MFA, and conduct third‑party vendor reviews. McDonald’s and similar orgs must integrate AI‑specific security into procurement and risk frameworks. Regulators should require reporting thresholds for AI system breaches.

Source References:

Exploit 12. AI‑Driven Rubio Impersonation Campaign

Description:

In mid‑June 2025, an actor used AI‑generated voice/text to impersonate Secretary of State Marco Rubio in Signal messages to diplomats and US officials. Aimed to extract sensitive details or account access.

Incident Summary:

  • Incident Name: Rubio AI Impersonation Scam
  • Date & Location: Mid‑June to early July 2025 (USA)
  • Affected Organizations:S. State Dept, foreign ministries, other officials
  • Attack Type: AI‑powered deepfake voice & text impersonation
  • System Impacted: Signal messaging app

Impact Assessment:

Raised alarm over realistic AI impersonation targeting geopolitical actors. Though no confirmed breaches, resulted in State Dept cybersecurity alerts and FBI warnings. This case eroded trust in messaging platforms for official communications.

Attack Breakdown:

Actor created a bogus Signal handle (“Marco.Rubio@state.gov”), used AI to clone Rubio’s voice and linguistic patterns. Voicemails/texts sent to several high-profile targets encouraged replies. While engagement is unknown, the campaign prompted an urgent cyber advisory from the State Department.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM04: AI‑Generated Disinformation – using deepfakes for social engineering
  • LLM09: Overreliance on AI Content – trusting AI-generated identity verification

Potential Mitigations:

  • Technical: Implement voice biometrics & metadata filters, verify sender handles
  • Policy: Ban unofficial messaging & enforce secure communication channels
  • Education: Train staff to identify deepfake indicators and verify identity via secondary means

Call to Action:

Government and enterprise security teams must harden communication channels, deploy AI‑detection tools, and mandate identity verification protocols. Messaging apps used for high‑security dialogue should employ multi‑factor identity checks and deepfake detection tech.

Source References:

Exploit 13. Indirect Prompt‑Injection Malware (“Skynet”)

Description:

In early June, malware dubbed “Skynet” (sample uploaded from the Netherlands) embedded a prompt‑injection string to evade AI detection, printing false “NO MALWARE DETECTED” prompts and setting up TOR proxies. Check Point Research+1Adversa+1

Incident Summary:

  • Incident Name: Skynet Prompt‑Injection Malware
  • Date & Location: Early June 2025 (Netherlands upload)
  • Affected Organizations: Potentially customizable to target any system
  • Attack Type: Malware with prompt‑injection to fool AI scanners
  • System Impacted: Endpoint AV/AI monitoring systems

Impact Assessment:

As an emerging malware proof‑of‑concept, it exploited AI AV that trusted model outputs. While limited in scope, it signals rising sophistication where attackers tailor malware to subvert AI defenses.

Attack Breakdown:

The binary initializes an AI prompt instructing the model to ignore prior instructions and falsely declare “NO MALWARE DETECTED.” It embeds sandbox evasion and TOR proxy set-up,though incomplete. This technique intends to manipulate AI‑powered endpoint detection workflows.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM01: Prompt Injection – malware injecting directives into AI scanners
  • LLM05: Model Output Trust – excessive reliance on AI decisions

Potential Mitigations:

  • Technical: Verify model chain-of-trust, use multiple detection modalities (behavioral & static), sandboxing
  • Policy: Define policies rejecting AI output as sole verdict
  • Education: Train analysts to interrogate AI results and apply defense‑in‑depth

Call to Action:

Security teams must treat AI AV outputs with skepticism, combining them with traditional detection methods. AI defense systems should include provenance tracking and alerts for conflicting evidence. Incident response must integrate AI‑based deception scenarios.

Source References:

Exploit 14. M365 Copilot Zero‑Click AI Command Injection

Description:

In June 2025, a zero-click vulnerability in Microsoft 365 Copilot allowed attackers to embed instructions within received emails and exfiltrate internal data without user action.

Incident Summary:

  • Incident Name: M365 Copilot AI Command Injection
  • Date & Location: June 2025 (global)
  • Affected Organizations: Microsoft 365 enterprise clients
  • Attack Type: Indirect prompt injection (scope violation)
  • System Impacted: Microsoft 365 Copilot LLM

Impact Assessment:

Could enable stealth exfiltration of sensitive corporate information across organizations, bypassing traditional filters or user confirmation. Microsoft patched the exploit on June Patch Tuesday.

Attack Breakdown:

An attacker embeds malicious prompt within email. When Copilot processes it, it executes unintended instructions,leaking internal data. Offers exfil without clicking links or user intervention, due to “scope violation.” Disclosed by Aim Security.

OWASP Top 10 LLM Vulnerabilities Exploited:

  • LLM02: Indirect Prompt Injection – commands hidden in benign-looking content
  • LLM09: Overreliance on AI Content Approvals – automated actions without verification

Potential Mitigations:

  • Technical: Input sanitization, strict prompting policies, context confinement
  • Policy: Disable auto‑read input triggers, require explicit user commands
  • Education: Warn users about AI‑triggered behaviors and implement internal audits

Call to Action:

Enterprises using Copilot or similar services should ensure updates are applied and audit systems for LLM‑triggered process integrity. Security teams must log AI interactions and monitor for anomalous data flows. Policies should prevent AI from executing actions without oversight.

Source References:

Scroll to Top