LLM02: Insecure Output Handling

Insecure Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems. Because LLM-generated content can be controlled by prompt input, passing it downstream unchecked is similar to giving users indirect access to additional functionality.

Insecure Output Handling differs from Overreliance in that it deals with LLM-generated outputs before they are passed downstream, whereas Overreliance focuses on broader concerns around overdependence on the accuracy and appropriateness of LLM outputs.

Successful exploitation of an Insecure Output Handling vulnerability can result in XSS and CSRF in web browsers as well as SSRF, privilege escalation, or remote code execution on backend systems. The following conditions can increase the impact of this vulnerability:

  • The application grants the LLM privileges beyond what is intended for end users, enabling escalation of privileges or remote code execution.
  • The application is vulnerable to indirect prompt injection attacks, which could allow an attacker to gain privileged access to a target user’s environment.
  • Third-party plugins do not adequately validate inputs.

Common Examples of Vulnerability

  1. LLM output is entered directly into a system shell or similar function such as exec or eval, resulting in remote code execution (see the sketch after this list).
  2. JavaScript or Markdown is generated by the LLM and returned to a user. The code is then interpreted by the browser, resulting in XSS.
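
To make example 1 concrete, here is a minimal sketch in Python (the function and action names are hypothetical) contrasting execution of model output verbatim with dispatch through an allow-list, so the model can only choose among pre-approved operations:

```python
import subprocess

# VULNERABLE: the model's raw output is executed verbatim, so a prompt
# injection that makes the model emit malicious code becomes remote
# code execution.
def run_llm_action_unsafe(llm_output: str) -> None:
    eval(llm_output)

# SAFER: the model may only *select* a pre-approved action by name;
# it never supplies code or arguments of its own.
ALLOWED_ACTIONS = {
    "list_files": ["ls", "-l"],
    "disk_usage": ["df", "-h"],
}

def run_llm_action_safe(llm_output: str) -> str:
    action = llm_output.strip()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"rejected unexpected model output: {action!r}")
    result = subprocess.run(
        ALLOWED_ACTIONS[action], capture_output=True, text=True, check=True
    )
    return result.stdout
```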

How to Prevent

  1. Treat the model as any other user, adopting a zero-trust approach, and apply proper input validation on responses coming from the model to backend functions (the first sketch after this list illustrates a schema check).
  2. Follow the OWASP ASVS (Application Security Verification Standard) guidelines to ensure effective input validation and sanitization.
  3. Encode model output back to users to mitigate undesired code execution by JavaScript or Markdown. OWASP ASVS provides detailed guidance on output encoding (the second sketch after this list shows HTML encoding).
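
Applying point 1 concretely: if the application expects the model to return a small structured command, the response can be parsed and checked against that expected shape before it ever reaches a backend function. A minimal sketch, assuming a hypothetical two-field JSON schema:

```python
import json

# Hypothetical expected shape of a structured model response.
EXPECTED_FIELDS = {"action": str, "target": str}

def parse_model_response(raw: str) -> dict:
    """Treat the model as an untrusted user: parse its response and
    reject anything that does not match the expected shape."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict) or set(data) != set(EXPECTED_FIELDS):
        raise ValueError("response fields do not match the expected schema")
    for field, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} must be {expected_type.__name__}")
    return data
```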
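For point 3, output encoding can be as simple as HTML-escaping the model's text before it is placed into a page (render_chat_message is a hypothetical helper; templating engines with autoescaping enabled do this automatically):

```python
import html

def render_chat_message(llm_output: str) -> str:
    # Encoding turns any markup the model emits into inert text, so the
    # browser displays it instead of interpreting it.
    return f'<div class="chat-message">{html.escape(llm_output)}</div>'

# A script-bearing payload is neutralized:
print(render_chat_message("<img src=x onerror=alert(1)>"))
# -> <div class="chat-message">&lt;img src=x onerror=alert(1)&gt;</div>
```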

Example Attack Scenarios

  1. An application utilizes an LLM plugin to generate responses for a chatbot feature. The plugin also offers a number of administrative functions accessible to another, privileged LLM. The general-purpose LLM directly passes its response to the plugin without proper output validation, causing the plugin to shut down for maintenance.
  2. A user utilizes a website-summarizer tool powered by an LLM to generate a concise summary of an article. The website includes a prompt injection instructing the LLM to capture sensitive content from the website or from the user’s conversation. The LLM can then encode the sensitive data and send it, without any output validation or filtering, to an attacker-controlled server.
  3. An LLM allows users to craft SQL queries for a backend database through a chat-like feature. A user requests a query to delete all database tables. If the query crafted by the LLM is not scrutinized, all database tables will be deleted (the sketch after these scenarios shows one way to gate such queries).
  4. A web app uses an LLM to generate content from user text prompts without output sanitization. An attacker could submit a crafted prompt causing the LLM to return an unsanitized JavaScript payload, leading to XSS when rendered in a victim’s browser. Insufficient sanitization of the LLM’s output enabled this attack.
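
Scenario 3 can be blunted by refusing to forward anything but a single read-only statement to the database. A minimal sketch (the gate is illustrative, shown with SQLite; it is not a complete SQL firewall and should be paired with a least-privilege database account):

```python
import sqlite3

def execute_llm_query(conn: sqlite3.Connection, llm_sql: str) -> list:
    # Allow exactly one statement, and only a read-only SELECT.
    statement = llm_sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("stacked statements are not allowed")
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(statement).fetchall()
```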
