XML EXTERNAL ENTITY (XXE)
XXE is a parser-level failure that enables local file reads, SSRF to internal control planes, denial-of-service via entity expansion, and in some stacks, code execution through XInclude/XSLT or language-specific wrappers. Treat every XML input as untrusted until the parser is proven hardened.
- File disclosure: read server files and configuration
- SSRF: reach metadata services, internal admin panels, service ports
- DoS: entity expansion (billion laughs), external resource amplification
- Injection surfaces: REST/SOAP/SAML/XML-RPC, file uploads (SVG, Office), PDF generators, build/report pipelines, config importers
- Transclusion: XInclude and XSLT document() loading external resources
1. Inventory all XML consumers: endpoints, upload parsers, background jobs, CLI tools, converters, and third-party SDKs.
2. Start with capability probes: does the parser accept DOCTYPE? resolve external entities? allow network access? support XInclude/XSLT?
3. Establish a quiet oracle (error shape, length/ETag diffs, OAST callbacks), then escalate to targeted file/SSRF payloads.
4. Validate per-channel parity: the same parser options must hold across REST, SOAP, SAML, file uploads, and background jobs.
- File uploads: SVG/MathML, Office (docx/xlsx/ods/odt), XML-based archives, Android/iOS plist, project config imports
- Protocols: SOAP/XML-RPC/WebDAV/SAML (ACS endpoints), RSS/Atom feeds, server-side renderers and converters
- Hidden paths: "xml", "upload", "import", "transform", "xslt", "xsl", "xinclude" parameters; processing-instruction headers
- Minimal DOCTYPE: attempt a harmless internal entity to detect acceptance without causing side effects
- External fetch test: point to an OAST URL to confirm egress; prefer DNS first, then HTTP
- XInclude probe: add xi:include to see if transclusion is enabled
- XSLT probe: xml-stylesheet PI or transform endpoints that accept stylesheets
- Inline disclosure of entity content in the HTTP response, transformed output, or error pages
- Coerce parser errors that leak path fragments or file content via interpolated messages
- Blind XXE via parameter entities and external DTDs; confirm with DNS/HTTP callbacks
- Encode data into request paths/parameters to exfiltrate small secrets (hostnames, tokens)
- Fetch slow or unroutable resources to produce measurable latency differences (connect vs read timeouts)
]>
&xxe;
]>
&xxe;
]>
&xxe;
]>
&xxe;
%dtd;]>
evil.dtd:
">
%e; %exfil;
- Use parameter entities in the DTD subset to define secondary entities that exfiltrate content; works even when general entities are sanitized in the XML tree
- Effective where entity resolution is blocked but XInclude remains enabled in the pipeline
- XSLT processors can fetch external resources via document():
- Targets: transform endpoints, reporting engines (XSLT/Jasper/FOP), xml-stylesheet PI consumers
- Java: jar:, netdoc:
- PHP: php://filter, expect:// (when module enabled)
- Gopher: craft raw requests to Redis/FCGI when client allows non-HTTP schemes
- UTF-16/UTF-7 declarations, mixed newlines, CDATA and comments to evade naive filters
- PUBLIC vs SYSTEM, mixed case , internal vs external subsets, multi-DOCTYPE edge handling
- If network blocked but filesystem readable, pivot to local file disclosure; if files blocked but network open, pivot to SSRF/OAST
]>
&xxe;
- Assertions are XML-signed, but upstream XML parsers prior to signature verification may still process entities/XInclude; test ACS endpoints with minimal probes
- Inline SVG and server-side SVG→PNG/PDF renderers process XML; attempt local file reads via entities/XInclude
- OOXML (docx/xlsx/pptx) are ZIPs containing XML; insert payloads into document.xml, rels, or drawing XML and repackage
1. Provide a minimal payload proving parser capability (DOCTYPE/XInclude/XSLT).
2. Demonstrate controlled access (file path or internal URL) with reproducible evidence.
3. Confirm blind channels with OAST and correlate to the triggering request.
4. Show cross-channel consistency (e.g., same behavior in upload and SOAP paths).
5. Bound impact: exact files/data reached or internal targets proven.
- DOCTYPE accepted but entities not resolved and no transclusion reachable
- Filters or sandboxes that emit entity strings literally (no IO performed)
- Mocks/stubs that simulate success without network/file access
- XML processed only client-side (no server parse)
- Disclosure of credentials/keys/configs, code, and environment secrets
- Access to cloud metadata/token services and internal admin panels
- Denial of service via entity expansion or slow external resources
- Code execution via XSLT/expect:// in insecure stacks
1. Prefer OAST first; it is the quietest confirmation in production-like paths.
2. When content is sanitized, use error-based and length/ETag diffs.
3. Probe XInclude/XSLT; they often remain enabled after entity resolution is disabled.
4. Aim SSRF at internal well-known ports (kubelet, Docker, Redis, metadata) before public hosts.
5. In uploads, repackage OOXML/SVG rather than standalone XML; many apps parse these implicitly.
6. Keep payloads minimal; avoid noisy billion-laughs unless specifically testing DoS.
7. Test background processors separately; they often use different parser settings.
8. Validate parser options in code/config; do not rely on WAFs to block DOCTYPE.
9. Combine with path traversal and deserialization where XML touches downstream systems.
10. Document exact parser behavior per stack; defenses must match real libraries and flags.
XXE is eliminated by hardening parsers: forbid DOCTYPE, disable external entity resolution, and disable network access for XML processors and transformers across every code path.