XML External Entity (XXE)

An XML External Entity (XXE) vulnerability allows an attacker to interfere with an application's processing of XML data, potentially leading to information disclosure, server-side request forgery (SSRF), denial of service, or, in some configurations, remote code execution.

The vulnerability arises when an application parses XML input that references external entities and the XML parser is configured to resolve those references. An attacker who controls the input can inject a DTD (Document Type Definition) containing external entity declarations, causing the server to disclose sensitive files or send requests to other systems. In some configurations the attack can also exhaust server resources or be chained into code execution. The consequences range from data exfiltration and denial of service to, in the worst case, full system compromise, which makes XXE a critical threat in application security.

What is an XML external entity?

An XML external entity is a custom entity whose value is loaded from outside the DTD in which it is declared. The declaration uses the SYSTEM or PUBLIC keyword to reference content at an external URI, such as a file on the local filesystem or a remote URL. When an XML parser that resolves external entities encounters a reference to such an entity, it fetches the referenced content and includes it in the document during parsing.
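To make the difference between internal and external entities concrete, the following sketch lists the entity declarations a parser sees in a small document. It uses Python's standard `xml.parsers.expat` module only to inspect the declarations, not to resolve them. The entity names `greeting` and `xxe`, the root element `foo`, and the helper `list_entity_declarations` are illustrative choices, not part of any standard.

```python
import xml.parsers.expat

# Illustrative document: one internal entity and one external entity.
DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY greeting "hello">
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&greeting;</foo>"""


def list_entity_declarations(data: bytes) -> list:
    """Return one dict per entity declared in the document's DTD."""
    declarations = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        declarations.append({
            "name": name,
            # External entities carry a system identifier (a URI)
            # instead of a literal value.
            "external": system_id is not None,
            "system_id": system_id,
            "value": value,
        })

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(data, True)
    return declarations


if __name__ == "__main__":
    for decl in list_entity_declarations(DOCUMENT):
        print(decl)
```

The internal entity has a literal value, while the external one has only a URI. That URI is what a parser with external entity resolution enabled would try to fetch.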

In a legitimate context, external entities can be useful for modularizing XML content. However, when user-controlled XML input is parsed by a server without restrictions, attackers can craft malicious payloads that abuse external entity resolution. This class of vulnerability is cataloged as CWE-611: Improper Restriction of XML External Entity Reference and is recognized as a significant application security risk by organizations such as OWASP.

Why are XML external entities a security risk?

XXE vulnerabilities pose a critical security risk because they can be exploited to achieve several dangerous outcomes:

  • Sensitive file disclosure: An attacker can read any file that the application's process is permitted to read. For example, a payload like <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo> can retrieve the contents of /etc/passwd on a Linux server.
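  • Server-Side Request Forgery (SSRF): By referencing internal network resources, an attacker can force the server to make requests to internal systems. For instance, a payload like <!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://internal-host.example/admin">]><foo>&xxe;</foo> (the URL here is a hypothetical placeholder) can access internal administration panels not exposed to the public internet.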
  • Denial of Service (DoS): Deeply nested or extremely large entity expansions (known as "Billion Laughs" attacks) can consume server memory and crash the application. This variant relies on internal entities, so disabling external entities alone does not prevent it.
  • Remote Code Execution: In some configurations, XXE can be chained with other vulnerabilities or protocol handlers (such as the expect:// wrapper in PHP, when the expect extension is installed) to execute arbitrary commands on the server.
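The arithmetic behind a "Billion Laughs" payload shows why it is dangerous. The sketch below computes the expanded size without performing the expansion, and builds the small document that would trigger it. The entity names `lol0` to `lolN` and both helper functions are illustrative; the layout follows the commonly cited form of the attack (ten levels of entities, each referencing the level below ten times).

```python
def expanded_length(base_text: str, levels: int, refs_per_level: int) -> int:
    """Length in characters of the top-level entity after full expansion.

    Level 0 holds `base_text`; each higher level references the level
    below it `refs_per_level` times, so the size grows geometrically.
    """
    return len(base_text) * refs_per_level ** levels


def build_payload(base_text: str, levels: int, refs_per_level: int) -> str:
    """Build the (small) document that declares the nested entities."""
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE lolz [",
        f'  <!ENTITY lol0 "{base_text}">',
    ]
    for level in range(1, levels + 1):
        refs = f"&lol{level - 1};" * refs_per_level
        lines.append(f'  <!ENTITY lol{level} "{refs}">')
    lines.append("]>")
    lines.append(f"<lolz>&lol{levels};</lolz>")
    return "\n".join(lines)


if __name__ == "__main__":
    payload = build_payload("lol", levels=9, refs_per_level=10)
    print(len(payload), "characters sent by the attacker")
    print(expanded_length("lol", 9, 10), "characters after expansion")
```

With these parameters a document of under 1,000 characters expands to 3,000,000,000 characters, which is why parsers need a limit on entity expansion or must reject DTDs outright.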

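These risks are documented by OWASP (XXE was a dedicated category, A4, in the 2017 Top 10 and was folded into Security Misconfiguration in the 2021 edition) and extensively covered by PortSwigger's Web Security Academy.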

How to prevent XML external entity attacks?

Preventing XXE attacks requires a defense-in-depth approach focused on secure XML parsing configuration:

  • Disable external entity resolution: Configure the XML parser to disallow DTDs and external entity processing entirely. This is the most effective mitigation.
  • Use less complex data formats: Where possible, use JSON or other data formats that do not support entity references, reducing the attack surface.
  • Input validation and sanitization: Validate and sanitize all XML input before parsing, rejecting any documents that contain DTD declarations.
  • Apply the principle of least privilege: Ensure the application runs with minimal permissions so that even if an XXE vulnerability is exploited, the impact is limited.
  • Keep libraries updated: Regularly update XML parsing libraries and frameworks, as newer versions often include secure defaults against XXE.
  • Use Web Application Firewalls (WAFs): Deploy WAF rules to detect and block common XXE payloads as an additional layer of defense.
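As one possible way to apply the first and third measures in Python, the sketch below rejects any document that contains a DOCTYPE declaration before building a tree from it. It uses only the standard library. The `DTDForbidden` exception and the `parse_untrusted` helper are hypothetical names introduced here, and the two-pass design is a simplification. The third-party `defusedxml` package offers a maintained version of the same idea.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML, rejecting any document that declares a DTD."""

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    # The exception raised in the handler propagates out of Parse().
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = forbid_doctype
    checker.Parse(data, True)

    # Second pass: build the tree. Without a DTD there are no custom
    # entity declarations left to resolve or expand.
    return ET.fromstring(data)


if __name__ == "__main__":
    print(parse_untrusted(b"<foo>ok</foo>").text)
    try:
        parse_untrusted(
            b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            b"<foo>&xxe;</foo>"
        )
    except DTDForbidden as exc:
        print("rejected:", exc)
```

Rejecting the DOCTYPE outright blocks both external entities and nested-entity expansion, at the cost of refusing documents that legitimately rely on a DTD.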

The OWASP XXE Prevention Cheat Sheet provides language-specific configuration guidance for securely disabling external entity processing.

When should XML parsing be secured?

XML parsing should be secured in every scenario where an application processes XML data, regardless of whether the input originates from trusted or untrusted sources. Specific situations that demand heightened attention include:

  • API endpoints that accept XML payloads (SOAP services, REST APIs with XML content types)
  • File upload functionality that processes XML-based formats such as DOCX, XLSX, SVG, or XHTML
  • Configuration parsing where XML files are read from external or user-supplied sources
  • Single Sign-On (SSO) implementations using SAML, which relies heavily on XML processing
  • Third-party integrations that exchange data in XML format
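File uploads are easy to overlook because formats such as DOCX and XLSX are ZIP archives whose members are XML documents. The following sketch, using only the Python standard library, lists the XML parts of an uploaded archive that contain a DOCTYPE declaration so that the upload can be rejected before further processing. The helper names and the `XML_SUFFIXES` tuple are assumptions made for illustration. The sketch also does not guard against oversized or deeply compressed archives, which a production check would need to handle.

```python
import io
import zipfile
import xml.parsers.expat

# Assumption: which archive members are treated as XML parts.
XML_SUFFIXES = (".xml", ".rels")


class _DoctypeFound(Exception):
    """Internal signal used to stop parsing at the first DOCTYPE."""


def contains_doctype(data: bytes) -> bool:
    """Return True if the XML document declares a DTD."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise _DoctypeFound

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(data, True)
    except _DoctypeFound:
        return True
    return False


def parts_with_doctype(archive_bytes: bytes) -> list:
    """List the XML members of a ZIP-based document that contain a DTD."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.filename.lower().endswith(XML_SUFFIXES):
                if contains_doctype(archive.read(info)):
                    flagged.append(info.filename)
    return flagged
```

An upload handler could call `parts_with_doctype` on the raw bytes and refuse the file if the returned list is non-empty.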

Security measures should be implemented during the design and development phase and continuously validated through security testing and code reviews.

Which XML parsers are vulnerable to XXE?

Many XML parsers have historically resolved external entities by default, and some still do unless explicitly configured otherwise. Common parsers and their default behaviors include:

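| Parser / Language | Vulnerable by default? | Mitigation |
| --- | --- | --- |
| `javax.xml.parsers` (Java) | Yes | Set `FEATURE_SECURE_PROCESSING` and disable DTDs (the `http://apache.org/xml/features/disallow-doctype-decl` feature) |
| `libxml2` (C/Python) | Yes in older versions; 2.9 and later do not load external entities unless `XML_PARSE_NOENT` is set | Do not set the `XML_PARSE_NOENT` flag; use `defusedxml` in Python |
| `System.Xml` (.NET) | Varies by version | Set `XmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit` |
| `SimpleXML` / `DOMDocument` (PHP) | Yes when built against libxml2 older than 2.9 | Avoid the `LIBXML_NOENT` flag; call `libxml_disable_entity_loader(true)` on PHP versions before 8.0 (the function is deprecated as of PHP 8.0) |
| `Nokogiri` (Ruby) | Partially (safe by default in recent versions) | Use the `NONET` option, keep the library updated |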

As recommended by the SANS Institute and other security organizations, developers should always consult the documentation of their specific XML parser and explicitly disable external entity processing rather than relying on defaults.
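As an illustration of configuring a parser explicitly rather than trusting its defaults, the sketch below uses Python's standard `xml.sax` module and turns off both external general entities and external parameter entities. Python's SAX parser has not processed external general entities by default since version 3.7.1, so these calls mainly document the intent and protect against older runtimes. The `TextCollector` class and the `parse_text_without_external_entities` helper are hypothetical names introduced for this example.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Collect all character data found in the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text_without_external_entities(data: bytes) -> str:
    """Return the document's text with external entities left unresolved."""
    parser = xml.sax.make_parser()
    # Explicitly refuse to load external general and parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)

    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks)


if __name__ == "__main__":
    payload = (
        b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        b"<foo>a&xxe;b</foo>"
    )
    # The external entity reference is skipped; no file is read.
    print(parse_text_without_external_entities(payload))
```

This configuration skips external entity references but still accepts DTDs, so it does not by itself limit nested internal entity expansion.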